> On May 19, 2015, at 1:44 PM, Schneider, David A. <[email protected]> 
> wrote:
> 
> Thanks for the suggestion! When I had each rank run on a separate compute 
> node/host, I saw parallel performance (4 seconds for the 6GB of writing). 
> When I ran the MPI job on one host (the hosts have 12 cores, by default we 
> pack ranks onto as few hosts as possible), things happened serially, each 
> rank finished about 2 seconds after a different rank.

Hmm. That does seem like there is some bottleneck on the client side that is 
limiting the throughput from a single client.  Here are some things you could 
look into (although they might require more tinkering than you have permission 
to do):

1) Based on your output from "lctl list_nids", it looks like you are running 
IP-over-IB.  Can you configure the clients to use RDMA?  (They would have NIDs 
like x.x.x.x@o2ib.)
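A quick way to check is on the client itself. This is a sketch, assuming a typical modprobe-based LNet setup; the exact interface name (ib0) and config file path may differ on your systems:

```shell
# Show the NIDs this client is using. A NID ending in @tcp means LNet is
# running TCP over the IB interface (IP-over-IB); @o2ib means native RDMA.
lctl list_nids

# Switching to RDMA would typically mean changing the LNet module options,
# e.g. in /etc/modprobe.d/lustre.conf, from something like:
#   options lnet networks=tcp0(ib0)
# to:
#   options lnet networks=o2ib0(ib0)
# followed by unmounting, reloading the lustre/lnet modules, and remounting.
```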

2) Do you have the option of trying a newer client version?  Earlier Lustre 
versions used a single-threaded ptlrpcd to manage network traffic, but newer 
versions have a multi-threaded implementation.  You may need to check 
compatibility with the Lustre version running on the servers, though.
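To see what you are running today, something like the following should work on the client (a sketch; the ptlrpcd thread naming is an assumption that may vary slightly between releases):

```shell
# Report the installed Lustre client version:
lctl get_param version

# Rough check for the multi-threaded ptlrpcd: newer clients start several
# ptlrpcd kernel threads rather than a single one.
ps ax | grep '\[ptlrpcd'
```

Run the same version check on the servers so you can confirm the client/server combination is supported before upgrading.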

3) Do you have checksums disabled?  Try running "lctl get_param 
osc.*.checksums".  If the values are "1", then checksums are enabled, which can 
slow down performance.  You could try setting the value to "0" to see if that 
helps.
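Concretely, on the client that would look like this (run as root; the set_param change is not persistent and reverts on remount):

```shell
# Check whether checksums are enabled (1) or disabled (0) for each OSC:
lctl get_param osc.*.checksums

# Temporarily disable checksums on all OSCs to see if write throughput improves:
lctl set_param osc.*.checksums=0
```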

--
Rick Mohr
Senior HPC System Administrator
National Institute for Computational Sciences
http://www.nics.tennessee.edu

_______________________________________________
lustre-discuss mailing list
[email protected]
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
