All,

I have noticed an issue in the 1.2 branch when mpi_preconnect_all=1. The one-way communication pattern (each pair of ranks either sends or receives, but not both) may not fully establish connections with peers. For example, in a 3-process MPI job where rank 0 does no MPI communication after MPI_Init(), the other ranks' connection attempts are never progressed (I have seen this with both tcp and udapl). The preconnect pattern has changed slightly in the trunk, but it is essentially still one-way communication, either a send or a receive with each rank. So although the issue I see in the 1.2 branch does not appear in the trunk, I wonder if it will show up again.

An alternative to the preconnect pattern that comes to mind would be to have every rank perform both a send and a receive with every other rank, so that connections are fully established in both directions.
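
To make the idea concrete, here is a rough sketch of what I mean at the MPI level (illustrative only, not the actual preconnect code in the 1.2 branch or the trunk): every rank exchanges a zero-byte message with every other rank, so both sides of each connection get driven to completion.

    /* Illustrative sketch of a symmetric preconnect, not the real
     * internal implementation.  Each rank does a zero-byte
     * MPI_Sendrecv with every other rank so that every pair of
     * peers both sends and receives. */
    int rank, size, peer;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    for (peer = 0; peer < size; peer++) {
        if (peer == rank) continue;
        MPI_Sendrecv(NULL, 0, MPI_BYTE, peer, 0,
                     NULL, 0, MPI_BYTE, peer, 0,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    }

This is still O(N^2) messages overall, like the current preconnect, but no rank can be left with a half-established connection just because its peer never initiates traffic.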

Does anyone have thoughts or comments on this, or reasons not to have all ranks send to and receive from all others?

-DON
