Scott Atchley wrote:
> Hi all,
>
> Other than an operation timeout, does PVFS2 call BMI_method_cancel()?

I can only think of one other scenario. On the client side we typically pre-post receives (for the server ack) before sending a request. If the request send fails then the client will probably call the BMI cancel method to clean up the receive that was posted. Look for job_bmi_cancel() calls in sys-io.sm for examples.
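As a rough illustration of that pre-post-then-cancel pattern, here is a minimal sketch in C. It does not use the real BMI or job interfaces: struct mock_net, mock_post_recv(), mock_post_send(), mock_cancel_recv() and send_request() are all hypothetical stand-ins invented for this example, and the real state machine in sys-io.sm is considerably more involved.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for a network layer; this is NOT the BMI API. */
enum op_state { OP_FREE, OP_POSTED, OP_CANCELLED };

struct mock_net {
    enum op_state recv_state; /* state of the single pre-posted receive */
    bool send_should_fail;    /* knob to simulate a failing send */
};

/* Post the receive that will eventually hold the server ack. */
static int mock_post_recv(struct mock_net *net)
{
    assert(net->recv_state != OP_POSTED); /* only one receive in this sketch */
    net->recv_state = OP_POSTED;
    return 0;
}

/* Send the request; fails when the knob is set. */
static int mock_post_send(struct mock_net *net)
{
    return net->send_should_fail ? -1 : 0;
}

/* Cancel the receive if it is still outstanding. */
static void mock_cancel_recv(struct mock_net *net)
{
    if (net->recv_state == OP_POSTED)
        net->recv_state = OP_CANCELLED;
}

/*
 * Pre-post the ack receive, then send the request. If the send fails,
 * no ack will ever arrive, so the orphaned receive is cancelled.
 */
static int send_request(struct mock_net *net)
{
    int ret = mock_post_recv(net);
    if (ret != 0)
        return ret;

    ret = mock_post_send(net);
    if (ret != 0)
        mock_cancel_recv(net);
    return ret;
}
```

The point of the sketch is only the ordering: the cancel happens on the send-failure path, independent of any timeout.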

> When PVFS2 calls BMI_method_cancel() on a single operation, does it imply that PVFS2 will cancel all outstanding operations for that peer?

No, although depending on the method and what state the operation is in, the method may have to do that to implement cancel. For example, there are cases in TCP where, if the socket has bad data in it, there isn't much choice but to drop the connection, which will probably whack some unrelated operations along the way.

This is fine, as long as the error code reported includes the BMI type mask so that the higher levels in the I/O stack recognize those secondary failed operations as network errors and retry them.
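To make the idea concrete, here is a small sketch in C of how a class bit in the error code lets a caller tell network failures apart from other errors and decide whether to retry. The constants and helpers (ERR_CLASS_NET, ERR_HOSTUNREACH, ERR_NOMEM, tag_network(), is_network_error(), should_retry()) are hypothetical and chosen for illustration; they are not the actual PVFS2 error encoding or retry logic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical encoding: one bit marks "this error came from the network layer". */
#define ERR_CLASS_NET   (1u << 7)
#define ERR_HOSTUNREACH 5u /* hypothetical base code: peer unreachable */
#define ERR_NOMEM       3u /* hypothetical base code: local allocation failure */

/* The method tags a raw code before reporting it upward. */
static uint32_t tag_network(uint32_t code)
{
    assert((code & ERR_CLASS_NET) == 0); /* must not already be tagged */
    return code | ERR_CLASS_NET;
}

/* Higher levels only look at the class bit, not the specific code. */
static bool is_network_error(uint32_t code)
{
    return (code & ERR_CLASS_NET) != 0;
}

/* Retry only network-class failures, and only up to a bounded number of attempts. */
static bool should_retry(uint32_t code, int attempts, int max_attempts)
{
    return is_network_error(code) && attempts < max_attempts;
}
```

Under this scheme, an operation that fails only as collateral damage of another cancel (for example, one reported as host unreachable) is still retried as long as the method tagged its code, while a local error such as an allocation failure is not.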

> I ask because we are adding a function to MX that can bulk cancel all outstanding requests (sends and matched receives) for a specific peer. The requests return with a host unreachable status. If that would be useful, would you want access to such a call?

It is probably up to your implementation as to whether it is helpful or not. I don't think you necessarily need to cancel all of the operations unless that eases your implementation.

-Phil
_______________________________________________
Pvfs2-developers mailing list
[email protected]
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers
