Re: [Pvfs2-developers] BMI questions

Scott Atchley Fri, 01 Dec 2006 05:10:43 -0800

On Dec 1, 2006, at 4:33 AM, Sam Lang wrote:

Your example above is currently how writes work. The client sends an unexpected message to the server (a control message for the IO, file info, size of the IO, etc.), which posts an expected receive, and then sends an expected back to the client. The client posts a receive for the expected before sending the unexpected. After the receive of the expected message at the client completes (this is a 'ready for IO' message from the server), it posts a send of the actual IO (this will be up to FlowBufferSize). Once that send completes, it posts another one, and assumes that the server has already posted another receive (based on the size of the entire IO). Once all the IO has completed at the server (including pushing the data to disk), the server sends a response ack message, which the client posted a receive for before doing any of the actual IO.

Ok.
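
For reference, the write sequence described above can be laid out as a short Python sketch. This is only an illustration of the ordering in the text, not PVFS2 or BMI code: the function names, the step labels, and the 256 KB buffer size used in the usage note are all hypothetical, and the sketch assumes a write of at least one byte. It also linearizes steps that can overlap on a real network.

```python
def chunk_sizes(total_bytes, flow_buffer_size):
    """Split one IO into sends of at most flow_buffer_size bytes each."""
    full, rest = divmod(total_bytes, flow_buffer_size)
    return [flow_buffer_size] * full + ([rest] if rest else [])


def write_sequence(total_bytes, flow_buffer_size):
    """List (actor, action, what) steps for one write, in the order described."""
    steps = [
        # The client pre-posts its receives. The text only says the ack
        # receive is posted before any IO; posting it first is an assumption.
        ("client", "post_recv", "ready"),
        ("client", "post_recv", "ack"),
        # Unexpected control message: file info, size of the IO, etc.
        ("client", "send_unexpected", "io_request"),
        ("server", "post_recv", "io_0"),
        ("server", "send_expected", "ready"),
    ]
    sizes = chunk_sizes(total_bytes, flow_buffer_size)
    for i, _size in enumerate(sizes):
        steps.append(("client", "post_send", f"io_{i}"))
        steps.append(("server", "recv_done", f"io_{i}"))
        if i + 1 < len(sizes):
            # The server posts the next receive only after the previous one completes.
            steps.append(("server", "post_recv", f"io_{i + 1}"))
    # Sent once all IO has completed at the server, including the disk write.
    steps.append(("server", "send_expected", "ack"))
    return steps
```

Calling write_sequence(600 * 1024, 256 * 1024) yields three IO sends (two full buffers and one remainder) followed by the final ack.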

It looks like the flow code on the server doesn't actually post the next recv of IO (IO2) until the first recv has completed (IO1), so it's possible that the client posts (and starts) the next send before the server posts the next receive, although it's probably unlikely.

If IO operations are always > 32 KB, I would agree. But if any are <= 32 KB, MX will buffer them on the send side and complete immediately. The client could then post another send even if MX is in the middle of delivering the first one. I can override this behavior (use mx_issend()) or use credits for control flow.
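
The concern above can be illustrated with a toy worst-case model in Python. It is not MX or BMI code. It assumes that the server keeps exactly one receive posted at a time, that sends at or below the 32 KB threshold complete as soon as they are buffered, that larger or synchronous sends complete only once delivered, and that all buffered sends reach the server before it gets a chance to post another receive. The function and parameter names are hypothetical.

```python
EAGER_LIMIT = 32 * 1024  # bytes; the threshold quoted in the message above


def count_early_arrivals(chunk_sizes, synchronous=False):
    """Count sends that reach the server before a receive is posted for them."""
    posted_recvs = 1  # assumption: the server has one receive posted at a time
    in_flight = 0     # sends the client has posted but that are not yet delivered
    early = 0
    for size in chunk_sizes:
        in_flight += 1
        if size <= EAGER_LIMIT and not synchronous:
            # Buffered on the send side: the send completes immediately,
            # so the client goes straight on to post the next one.
            continue
        # Large or synchronous send: the client waits for delivery, so
        # everything in flight arrives now. Only posted receives can match.
        early += max(0, in_flight - posted_recvs)
        in_flight = 0
    # Whatever is still buffered arrives as one final burst.
    early += max(0, in_flight - posted_recvs)
    return early
```

In this model four 16 KB sends give three early arrivals, while the same four sends with synchronous=True (standing in for the mx_issend() option) give none, as do four 256 KB sends.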

Each BMI receive uses a separate buffer (up to a max of 8 buffers).

Does this mean that at most, the client will post 8 IO sends per operation?

Every time a bmi recv completes, two things happen: the associated trove write is posted, and a new bmi recv is posted. So over time, bmi receives will get posted at the server before bmi sends get posted at the client, but the second and maybe third bmi receives may be posted after the bmi sends at the client.
To answer your specific questions:
The same bmi tag is passed to each of the post_send and post_recv calls for the entire IO operation.

I can live with this as long as only one receive is posted at a time using a specific tag.

As to hitting resource limits, the client doesn't post the next send until the previous send has completed. I think with enough IO operations from different clients happening concurrently, it may be possible to run into the resource issues you speak of, but I need to verify that.


Definitely.

Yes, it always posts a receive for an expected message. For most expected messages the receive is guaranteed to be posted before the peer posts the send. That doesn't appear to be guaranteed in the IO case though, as I mentioned above.
Hope this helps.

-sam

Tremendously. In one of the diagrams above, you seem to indicate that the server will post receives for unexpected messages. Is this the case? If so, does it simply use BMI_method_post_recv()? With what tag, etc.?

From the IB code, it looks like the server does not post an unexpected, but relies on the BMI method to receive the message and put it in a queue, and then return it when BMI_method_test_unexpected() is called. Am I reading this wrong?
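
The reading described above, where the method queues unexpected messages itself and hands them over on a test call, can be sketched as a toy Python model. This is not the IB method's code and it does not confirm that reading. The class, its method names, and their signatures are hypothetical stand-ins.

```python
from collections import deque


class ToyMethod:
    """Toy BMI-like method that queues unexpected messages internally."""

    def __init__(self):
        # Arrivals nobody posted a receive for, in arrival order.
        self._unexpected = deque()

    def deliver_unexpected(self, peer, tag, payload):
        """Called by the (simulated) network when an unexpected message arrives."""
        self._unexpected.append((peer, tag, payload))

    def test_unexpected(self, max_count):
        """Return up to max_count queued unexpected messages, oldest first."""
        out = []
        while self._unexpected and len(out) < max_count:
            out.append(self._unexpected.popleft())
        return out
```

In this model the server never posts a receive for the control message. It only polls test_unexpected() and gets back whatever has arrived since the last call.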


Scott
_______________________________________________
Pvfs2-developers mailing list
[email protected]
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers
