Hi, thanks for the previous help. I have some questions about Lustre RPCs and the sequence of events that occurs during large concurrent write() calls involving many processes and a large amount of data per process. I understand there is a credit-based flow control mechanism, but after reading the "networking & io protocol" white paper I'm still a little unclear on how it works in general.
Is it true that a write() RPC transfers data in chunks of at least 1MB and at most (max_pages_per_rpc*page_size) bytes, where page_size=2^16? If so, can I use these bounds to estimate the number of RPCs issued per MB of data written?

About how many concurrent incoming write() RPCs per OSS service thread can a single server handle before it stops responding to incoming RPCs?

What happens to an RPC when the server is too busy to handle it? Is it even issued by the client? Does the client have to poll and/or resend the RPC?

Does polling for flow control credits add significant network or server congestion? Is it likely that a large number of RPCs and flow control credit requests will induce enough network congestion that clients' RPCs time out? How does the client handle such a timeout?

Burlen
