On 2010-08-22, at 11:58, burlen wrote:
Andreas Dilger wrote:
>> Currently, 1MB is the largest bulk IO size, and is the typical size used by 
>> clients for all IO.
> 
> Is my understanding correct?
> 
> A single RPC request will initiate an RDMA transfer of at most 
> "max_pages_per_rpc" pages, where the page unit is the Lustre page size, 65536. 
> Each RDMA transfer is executed in 1MB chunks.  On a given client, if there are 
> more than "max_pages_per_rpc" pages of data available to transfer, multiple 
> RPCs are issued and multiple RDMAs are initiated.

No, max_pages_per_rpc is scaled down proportionately on systems with a large 
PAGE_SIZE, so that a single RPC still carries at most 1MB.  This is because a 
node doesn't know what the PAGE_SIZE of its peer is.
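
As a rough illustration of that scaling, here is a minimal sketch. It is not 
Lustre source code: the function name and constant are hypothetical, and the 
only inputs are the 1MB bulk size and the page sizes mentioned in this thread.

```python
# Illustrative only: not Lustre code. All names here are hypothetical.
BULK_IO_BYTES = 1 << 20  # 1MB maximum bulk IO size, as stated in this thread


def pages_per_rpc(page_size: int, bulk_bytes: int = BULK_IO_BYTES) -> int:
    """Number of pages that fit in one bulk RPC for a given PAGE_SIZE."""
    if page_size <= 0 or bulk_bytes % page_size != 0:
        raise ValueError("page_size must be positive and divide the bulk size evenly")
    return bulk_bytes // page_size


# Compare a 4KB-page node with a 64KB-page node.
for size in (4096, 65536):
    print(f"PAGE_SIZE={size}: {pages_per_rpc(size)} pages per RPC")
```

With 4KB pages this gives 256 pages per RPC and with 64KB pages it gives 16, 
so the amount of data per RPC stays at 1MB either way.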

There is a patch in bugzilla that does what you propose - submit larger IO 
request RPCs, and do multiple 1MB RDMA xfers per request.  However, this showed 
performance _loss_ in some cases (in particular shared-file IO), and the reason 
for this regression was never diagnosed.

> Would it be correct to say: The purpose of the "max_pages_per_rpc" parameter 
> is to enable the servers to even out the individual progress of concurrent 
> clients with a lot of data to move and more fairly apportion the available 
> bandwidth amongst concurrently writing clients?

Yes, partly.  The more important factor is max_rpcs_in_flight, which limits the 
number of requests that a client can submit to each server at one time.
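
To make the effect of that limit concrete, here is a small sketch of the 
arithmetic. It is not Lustre code: the function names and the example value 
of 8 are assumptions chosen for illustration, and it assumes every RPC carries 
the full 1MB discussed above.

```python
# Illustrative only: not Lustre code. All names and example values are hypothetical.
RPC_BYTES = 1 << 20  # 1MB bulk IO size per RPC, as stated in this thread


def max_in_flight_bytes(max_rpcs_in_flight: int, rpc_bytes: int = RPC_BYTES) -> int:
    """Upper bound on bulk data one client can have outstanding to one server."""
    if max_rpcs_in_flight < 1:
        raise ValueError("max_rpcs_in_flight must be at least 1")
    return max_rpcs_in_flight * rpc_bytes


def rpcs_needed(data_bytes: int, rpc_bytes: int = RPC_BYTES) -> int:
    """RPCs required to move data_bytes, rounding the last partial RPC up."""
    return -(-data_bytes // rpc_bytes)  # ceiling division


example_limit = 8  # example value only, not a tuning recommendation
print(max_in_flight_bytes(example_limit))  # bytes outstanding to one server at most
print(rpcs_needed(10 * RPC_BYTES + 1))     # one extra RPC for the trailing byte
```

Under these assumptions, a client with more data than the in-flight bound has 
to wait for earlier requests to complete before submitting more, which is what 
lets a server interleave progress from several clients.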

There was a research paper on adjusting max_rpcs_in_flight dynamically that 
showed performance improvements when few clients are active, and we'd like to 
include that code in Lustre when it is ready.

Cheers, Andreas
--
Andreas Dilger
Lustre Technical Lead
Oracle Corporation Canada Inc.

_______________________________________________
Lustre-discuss mailing list
[email protected]
http://lists.lustre.org/mailman/listinfo/lustre-discuss
