On Dec 11, 2006, at 6:06 AM, Eric Barton wrote:

John,

Does that sound roughly right?  Anything else I should be taking
into account?

The guiding principles for completion are...

1. If you return success from lnd_send or lnd_recv, you must call
   lnet_finalize() within finite time.

2. You may only call lnet_finalize() when there is no longer any
   chance that the underlying network can touch (read or write) the
   payload buffer.

3. The completion status on sends isn't critical.  Lustre only really
   needs to know that sending is over; knowing whether the send was
   good or not is really just icing on the cake (e.g. so that it
   doesn't have to wait for a full timeout for an RPC reply if sending
   the request failed).

4. The completion status on receives is completely critical.  You may
   only return success if the sink buffer has been filled correctly.

                Cheers,
                        Eric

Two other comments:

1) Do not hold any locks when calling any lnet_ functions.

2) Make sure you are _completely_ done with your buffer before calling lnet_finalize(). I ran into a race condition where I called lnet_finalize() and only then placed the rx or tx descriptor on my idle queue. :-)
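
A minimal sketch of the ordering those two comments suggest, in C. It is not real LND code: struct msg, struct desc, idle_lock, stub_finalize() and complete_desc() are all hypothetical stand-ins, and stub_finalize() only takes the place of lnet_finalize() without copying its actual signature. The idea is to copy out what the final call needs, put the descriptor back on the idle queue, drop the lock, and only then finalize.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the real LNet types. */
struct msg {
    int finalized;
    int status;
};

struct desc {
    struct msg  *msg;   /* message this rx/tx descriptor is carrying */
    struct desc *next;  /* idle-queue link */
};

static pthread_mutex_t idle_lock = PTHREAD_MUTEX_INITIALIZER;
static struct desc *idle_head;

/* Bookkeeping for the sketch only: lets the stub notice a held lock. */
static int lock_held;
static int finalize_called_with_lock;

/* Stub standing in for lnet_finalize(); it just records what it saw. */
static void stub_finalize(struct msg *m, int status)
{
    if (lock_held)
        finalize_called_with_lock = 1;
    m->finalized = 1;
    m->status = status;
}

/* Recycle the descriptor first, then finalize with no locks held. */
static void complete_desc(struct desc *d, int status)
{
    struct msg *m = d->msg;  /* private copy of what finalize needs */

    assert(m != NULL);
    d->msg = NULL;           /* descriptor no longer refers to the message */

    pthread_mutex_lock(&idle_lock);
    lock_held = 1;
    d->next = idle_head;     /* push onto the idle queue */
    idle_head = d;
    lock_held = 0;
    pthread_mutex_unlock(&idle_lock);

    /* d may already belong to another thread; do not touch it again. */
    stub_finalize(m, status);
}
```

Calling complete_desc(&d, rc) leaves d on the idle queue with d.msg cleared before the stub ever runs, so nothing that happens during or after the final call can collide with the descriptor being reused.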

_______________________________________________
Lustre-devel mailing list
Lustre-devel@clusterfs.com
https://mail.clusterfs.com/mailman/listinfo/lustre-devel
