[EMAIL PROTECTED] wrote on Thu, 21 Dec 2006 16:26 -0500:
> On Dec 21, 2006, at 3:59 PM, Pete Wyckoff wrote:
> >[EMAIL PROTECTED] wrote on Thu, 21 Dec 2006 15:50 -0500:
> >>Client posts a receive with op_id 5, bmi tag 1 and length 32808
> >>Client posts an unexpected send with op_id 7, bmi tag 1 and length 24
[..]
> >>Server receives unexpected recv with bmi tag 1 and length 24
> >>Server posts an expected send with op_id 79, bmi tag 1 and length 816
[..]
> >>On the Client:
> >>[E 15:40:10.538206] job_time_mgr_expire: job time out: cancelling bmi
> >>operation, job_id: 4.
> >>[E 15:40:10.538421] job_time_mgr_expire: job time out: cancelling bmi
> >>operation, job_id: 6.
[..]
> >>On the Server:
> >>[E 12/21 15:40] job_time_mgr_expire: job time out: cancelling bmi
> >>operation, job_id: 78.
[..]
> I did not think the op_ids would match, but bmi_mx does not see the
> timed out ops in any post_send or post_recv functions. Are these
> operations passing through bmi_mx (possibly via other BMI_meth_*
> functions) or are these unrelated to bmi_mx?
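Jobs are assigned IDs, and BMI operations are assigned IDs too. Both
come from the same number space, but they identify different things.
A job may require a few BMI operations to go to completion, and
perhaps a few disk operations. The timeout messages print job ids,
not BMI op ids, which is why the numbers do not match what bmi_mx
sees. Job id 78 seems to require BMI id 79, for instance, and on the
client job ids 4 and 6 line up with op_ids 5 and 7.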
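
To make the shared number space concrete, here is a minimal sketch in
Python. It is not the PVFS2 code: IdAllocator, Job, post_bmi_op and
find_job_for_op are hypothetical names, and the single counter is only
an assumption about how one number space could be shared. It shows why
a job id and the id of the BMI operation it posts end up adjacent but
never equal.

```python
import itertools


class IdAllocator:
    """Hypothetical allocator: one counter feeds both jobs and BMI ops."""

    def __init__(self, start=1):
        self._counter = itertools.count(start)

    def allocate(self):
        return next(self._counter)


class Job:
    """Hypothetical job that owns the BMI operations it posts."""

    def __init__(self, ids):
        self._ids = ids
        self.job_id = ids.allocate()   # the job takes the next free id
        self.bmi_op_ids = []

    def post_bmi_op(self):
        # The BMI operation draws from the same counter as the job did.
        op_id = self._ids.allocate()
        self.bmi_op_ids.append(op_id)
        return op_id


def find_job_for_op(jobs, op_id):
    """Map a BMI op id back to the job that posted it, or None."""
    for job in jobs:
        if op_id in job.bmi_op_ids:
            return job
    return None
```

Starting the allocator at 78 and posting one operation gives job id 78
with BMI op id 79, the pairing seen in the server log. Looking up 78
as a BMI op id finds nothing, which is the mismatch described above.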
> Also, the client posts a receive with bmi tag 1 for a length of 32808
> but the server posts a send with bmi tag 1 and a length of 816. Is
> that normal?
The protocol fixes the maximum size of the response at 32k-ish, so
the client posts a receive big enough for the largest possible
reply. The server rarely needs to construct a response of the
maximum size, though. It's normal.
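
A small Python sketch of that pattern, under stated assumptions:
PostedRecv and complete are hypothetical names, not BMI calls, and
the only numbers taken from the thread are the lengths 32808 and 816
and the tag 1. The receiver posts a buffer of the maximum size, and
the matching send may fill any prefix of it.

```python
MAX_RESPONSE_SIZE = 32808  # length of the receive the client posted


class PostedRecv:
    """Hypothetical posted receive: a tag plus a maximum-size buffer."""

    def __init__(self, tag, max_len):
        self.tag = tag
        self.buffer = bytearray(max_len)
        self.actual_len = None  # unknown until a message arrives

    def complete(self, tag, payload):
        # A message matches only if its tag equals the posted tag.
        if tag != self.tag:
            raise ValueError("tag mismatch")
        # A shorter message is fine; a longer one cannot fit.
        if len(payload) > len(self.buffer):
            raise ValueError("message larger than posted receive")
        self.buffer[:len(payload)] = payload
        self.actual_len = len(payload)
        return self.actual_len
```

Posting PostedRecv(1, MAX_RESPONSE_SIZE) and completing it with an
816-byte payload on tag 1 succeeds and reports an actual length of
816, while the buffer stays 32808 bytes long.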
-- Pete
_______________________________________________
Pvfs2-developers mailing list
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers