Galen, George and others who might have SM BTL interest.
While looking at MPI_Iprobe performance I found what I think is an issue. If an application using the SM BTL does a small message send (<= 256 bytes) followed by an MPI_Iprobe, the mca_btl_sm_component function that is eventually called from opal_progress will receive an ACK message from its earlier send and then return. The net effect is that the real message, which sits behind the ACK on the fifo, doesn't get read until a second MPI_Iprobe is made. It seems to me that mca_btl_sm_component should read all ACK messages from a particular fifo until it either finds a real send fragment or the fifo is empty. Otherwise, we are forcing calls like MPI_Iprobe to not return messages that are really there. I am not sure about IB, but I know the TCP BTL does not show this issue (which doesn't surprise me, since I imagine that BTL relies on TCP to handle this type of protocol work).
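To make the proposed change concrete, here is a minimal standalone sketch in C of the two behaviors. This is not the actual Open MPI code: fifo_t, frag_type_t, progress_once, and progress_drain_acks are all hypothetical stand-ins for the real SM BTL fifo and progress machinery.

    /* Simplified model of the proposed fix. All names here are
     * hypothetical stand-ins, not the actual Open MPI SM BTL API. */
    #include <stdio.h>
    #include <stddef.h>

    typedef enum { FRAG_NONE, FRAG_ACK, FRAG_SEND } frag_type_t;

    typedef struct {
        frag_type_t items[16];
        size_t head, tail;
    } fifo_t;

    static void fifo_push(fifo_t *f, frag_type_t t)
    {
        f->items[f->tail++ % 16] = t;
    }

    static frag_type_t fifo_pop(fifo_t *f)
    {
        if (f->head == f->tail) return FRAG_NONE;  /* fifo empty */
        return f->items[f->head++ % 16];
    }

    /* Current behavior (as described above): process exactly one
     * fragment per progress call, so an ACK "shadows" the real
     * message queued behind it. */
    static frag_type_t progress_once(fifo_t *f)
    {
        return fifo_pop(f);
    }

    /* Proposed behavior: keep consuming ACK fragments until we hit a
     * real send fragment or the fifo runs dry, so one MPI_Iprobe-driven
     * progress call can see a message sitting behind an ACK. */
    static frag_type_t progress_drain_acks(fifo_t *f)
    {
        frag_type_t t;
        while ((t = fifo_pop(f)) == FRAG_ACK) {
            /* ACK bookkeeping (e.g. returning the sent fragment to
             * the free list) would happen here; then check the next
             * item on the fifo. */
        }
        return t;  /* FRAG_SEND if a real message was found, else FRAG_NONE */
    }

    int main(void)
    {
        fifo_t f = { .head = 0, .tail = 0 };
        fifo_push(&f, FRAG_ACK);   /* ACK from our earlier small send */
        fifo_push(&f, FRAG_SEND);  /* the real incoming message */

        /* One-fragment-per-call progress sees only the ACK; a second
         * call (i.e. a second MPI_Iprobe) is needed for the message. */
        printf("first progress_once:  %d (1 = ACK)\n", progress_once(&f));
        printf("second progress_once: %d (2 = SEND)\n", progress_once(&f));

        /* The draining version reaches the real message in one call. */
        fifo_t g = { .head = 0, .tail = 0 };
        fifo_push(&g, FRAG_ACK);
        fifo_push(&g, FRAG_SEND);
        printf("progress_drain_acks:  %d (2 = SEND)\n", progress_drain_acks(&g));
        return 0;
    }

The draining loop is the part I'm proposing; everything else is scaffolding to make the example self-contained.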
Before I go munging with the code, I wanted to make sure I am not overlooking something here. One concern: if I change the code to drain all the ACK messages, will that disrupt performance elsewhere?
--td