On Dec 11, 2006, at 6:06 AM, Eric Barton wrote:
John,
Does that sound roughly right? Anything else I should be taking
into account?
The guiding principles for completion are...
1. If you return success from lnd_send or lnd_recv, you must call
lnet_finalize() within finite time.
2. You may only call lnet_finalize() when there is no longer any
chance that the underlying network can touch (read or write) the
payload buffer.
3. The completion status on sends isn't critical. Lustre only really
needs to know that sending is over; knowing whether the send was
good or not is really just icing on the cake (e.g. so that it
doesn't have to wait for a full timeout for an RPC reply if sending
the request failed).
4. The completion status on receives is completely critical. You may
only return success if the sink buffer has been filled correctly.
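
As a rough illustration of rules 1, 2 and 4, here is a minimal sketch of a
receive completion path. Everything named demo_* is a hypothetical stand-in,
not the real LNET types or the real lnet_finalize() signature, and dma_active
is an assumed flag meaning "the network may still touch the sink buffer".

```c
#include <stddef.h>
#include <errno.h>

/* Hypothetical stand-in for an LNET message; not the real type. */
struct demo_msg {
    int finalized;   /* how many times completion was reported */
    int status;      /* status passed at completion */
};

/* Stand-in for lnet_finalize(): just records the completion. */
static void demo_finalize(struct demo_msg *msg, int status)
{
    msg->finalized++;
    msg->status = status;
}

/* Hypothetical receive descriptor. */
struct demo_rx {
    struct demo_msg *msg;
    size_t wanted;      /* bytes the sink buffer expects */
    size_t received;    /* bytes the network actually delivered */
    int    dma_active;  /* nonzero while the network may touch the buffer */
};

/*
 * Called when the driver thinks the receive is over.
 * Returns 1 if completion was reported, 0 if it had to be deferred.
 */
static int demo_rx_done(struct demo_rx *rx)
{
    int status;

    /* Rule 2: never finalize while the network can still touch the buffer.
     * Rule 1 still applies: the caller must retry within finite time. */
    if (rx->dma_active)
        return 0;

    /* Rule 4: success only if the sink buffer was filled correctly. */
    status = (rx->received == rx->wanted) ? 0 : -EIO;

    demo_finalize(rx->msg, status);
    return 1;
}
```

A short receive is reported as -EIO here; the choice of error code is an
assumption of the sketch, the point is only that it must not be success.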
Cheers,
Eric
Two other comments:
1) Do not hold any locks when calling any lnet_ functions.
2) Make sure you are _completely_ done with your buffer before
calling lnet_finalize(). I ran into a race condition where I called
lnet_finalize() then placed the rx or tx descriptor on my idle
queue. :-)
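
One ordering that respects both comments, as a minimal sketch only. The
sketch_* names, the integer "lock" and the idle-queue flag are hypothetical
stand-ins for a driver's own lock and queue, and sketch_finalize() is a mock,
not the real lnet_finalize() signature.

```c
#include <stddef.h>

/* Hypothetical stand-in for an LNET message. */
struct sketch_msg {
    int finalized;
    int status;
};

/* Hypothetical tx/rx descriptor owned by the driver. */
struct sketch_desc {
    struct sketch_msg *msg;  /* message whose payload this descriptor uses */
    int on_idle_queue;       /* 1 once the descriptor has been recycled */
};

/* Hypothetical driver state; lock_held models the driver's private lock. */
struct sketch_lnd {
    int lock_held;
    int finalize_with_lock;  /* counts violations of comment 1 */
};

static void sketch_lock(struct sketch_lnd *lnd)   { lnd->lock_held = 1; }
static void sketch_unlock(struct sketch_lnd *lnd) { lnd->lock_held = 0; }

/* Mock of lnet_finalize(): records the call and whether a lock was held. */
static void sketch_finalize(struct sketch_lnd *lnd,
                            struct sketch_msg *msg, int status)
{
    if (lnd->lock_held)
        lnd->finalize_with_lock++;
    msg->finalized++;
    msg->status = status;
}

static void sketch_tx_done(struct sketch_lnd *lnd,
                           struct sketch_desc *tx, int status)
{
    /* Save everything finalize needs before giving the descriptor away. */
    struct sketch_msg *msg = tx->msg;

    sketch_lock(lnd);
    tx->msg = NULL;          /* descriptor is completely done with the buffer */
    tx->on_idle_queue = 1;   /* recycle the descriptor first */
    sketch_unlock(lnd);

    /* Comment 1: no locks held. Comment 2: this is the very last step. */
    sketch_finalize(lnd, msg, status);
}
```

The descriptor is not touched after sketch_finalize() returns, which is the
ordering the second comment asks for.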