Jeff Squyres wrote:
Galen Shipman and I talked about this a bit and suggest the following:
- During the connection dance (probably for both the udapl and openib
BTLs), whichever peer ends up being the connection initiator can send
its pending fragment immediately. Don't forget the race condition where
two peers simultaneously decide to initiate: the OMPI code already
handles this case properly, but make sure you modify the side that ends
up being the actual initiator. (Steve is right that there will always
be a pending fragment, because OMPI doesn't make a connection until the
first send.)
- The other peer (the receiver of the connection) must wait to send its
pending fragment(s) until it receives the first frag from the connection
initiator. This can be accomplished either with another flag on the
OMPI module struct or perhaps making it part of the connection protocol
(i.e., don't transition the endpoint to be CONNECTED until the first
fragment is received). Either option can be used to queue up
fragments on the receiver until the first fragment is received from the
initiator. I'd have to look deeper into the code, but I'm *guessing* that
it might be best to use the already-existing state flag (i.e., checking
for CONNECTED) because then you won't be introducing any more
conditionals in the critical path.
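
To make the second option concrete, here is a minimal sketch of the
state-flag variant described above. All names (ep_state,
EP_WAIT_FIRST_FRAG, ep_send, and so on) are hypothetical and are not
the actual OMPI BTL structures or functions; fragments are plain ints
and the "wire" is just an array. The point it tries to show is that
the send path still has a single check for CONNECTED, and the
accepting side only reaches CONNECTED once the initiator's first
fragment arrives.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical endpoint states; not the real OMPI definitions. */
enum ep_state { EP_CLOSED, EP_CONNECTING, EP_WAIT_FIRST_FRAG, EP_CONNECTED };

#define MAX_FRAGS 16

struct endpoint {
    enum ep_state state;
    int pending[MAX_FRAGS];   /* fragments queued until CONNECTED */
    size_t npending;
    int sent[MAX_FRAGS];      /* stand-in for fragments put on the wire */
    size_t nsent;
};

/* Move every queued fragment onto the (simulated) wire, in order. */
void flush_pending(struct endpoint *ep)
{
    for (size_t i = 0; i < ep->npending; i++) {
        assert(ep->nsent < MAX_FRAGS);
        ep->sent[ep->nsent++] = ep->pending[i];
    }
    ep->npending = 0;
}

/* Send path: the only conditional is the existing CONNECTED check. */
void ep_send(struct endpoint *ep, int frag)
{
    if (ep->state == EP_CONNECTED) {
        assert(ep->nsent < MAX_FRAGS);
        ep->sent[ep->nsent++] = frag;
    } else {
        assert(ep->npending < MAX_FRAGS);
        ep->pending[ep->npending++] = frag;
    }
}

/* Called once the transport reports that the connection is up. */
void ep_on_established(struct endpoint *ep, bool is_initiator)
{
    if (is_initiator) {
        /* Initiator may send its pending fragment(s) right away. */
        ep->state = EP_CONNECTED;
        flush_pending(ep);
    } else {
        /* Accepting side keeps queueing until it hears from the peer. */
        ep->state = EP_WAIT_FIRST_FRAG;
    }
}

/* Called for every fragment received from the peer. */
void ep_on_recv(struct endpoint *ep)
{
    if (ep->state == EP_WAIT_FIRST_FRAG) {
        ep->state = EP_CONNECTED;
        flush_pending(ep);
    }
}
```

In this sketch the extra state lives only in the connection-setup and
receive paths, so ep_send() is unchanged from a plain "queue unless
CONNECTED" design. Whether the real receive path can absorb that check
cheaply is something the actual code would have to confirm.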
A different approach you might want to consider is to have --two--
connections per <src,dst> pair of ranks at the btl level. If A wants to
send to B, it does so through the A --> B connection, and if B wants to
send to A, it does so through the B --> A connection. To some extent,
this is the approach taken by IPoIB-CM (I am not familiar enough with
the RFC to understand the reasoning, but I am quite sure this was the
approach in the initial implementation). At first thought it might not
seem very elegant, but once you work through the details (as applied to
the ompi environment) you might even find it nice.
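
A rough sketch of what I mean, with made-up names (peer_link,
link_send, etc. are purely illustrative, not btl code): each side keeps
a tx connection that it opened itself and an rx connection that the
peer opened. Sends only ever go out on tx, so each connection has
exactly one possible initiator, and the initiator is by definition the
first to send on it.

```c
#include <assert.h>
#include <stdbool.h>

/* One direction of traffic; hypothetical stand-in for a real connection. */
struct conn {
    bool up;     /* connection established */
    int frags;   /* fragments carried so far */
};

/* Per-peer state: two one-way connections instead of one shared one. */
struct peer_link {
    struct conn tx;  /* opened by us, carries only our sends */
    struct conn rx;  /* opened by the peer, carries only its sends */
};

/* Send one fragment; opens tx lazily on the first send.
 * Returns true if this call had to open the connection. */
bool link_send(struct peer_link *l)
{
    bool opened = false;
    if (!l->tx.up) {
        l->tx.up = true;   /* we are always the initiator of our own tx */
        opened = true;
    }
    l->tx.frags++;
    return opened;
}

/* The peer opened its tx connection towards us. */
void link_accept(struct peer_link *l)
{
    l->rx.up = true;
}

/* A fragment arrived from the peer; it can only arrive on rx. */
void link_recv(struct peer_link *l)
{
    assert(l->rx.up);
    l->rx.frags++;
}
```

The cost is twice the connections (and their resources) per pair of
ranks that talk in both directions.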
Or.