Hi Andreas - Thanks for the advice. I will gather additional CPU stats and see what shows up.
However, CPU does not seem to be the limiting factor in the slower-than-expected large-file buffered reads. My machines have dual quad-core 2.66GHz processors, and gross CPU usage hovers around 50% when I'm running 16 "dd" read jobs.

A suspected client-side caching problem shows up when I simply run the same "dd" read job twice. The first run gets the expected throughput of 60 megabytes per second or so. The second run, however, reports about 2 gigabytes per second, which is well above the top speed of my 10Gb NIC and only possible, I think, if the entire file was cached on the client during the first run. So if I run 16 "dd" jobs, each trying to cache an entire large file on the client, that could explain the unexpectedly slow aggregate throughput. I would have thought that setting a low value for "max_cached_mb" would solve this, but it made no difference.

A further indication that client-side caching is at the root of the slowdown: when I run my single "dd" job twice but drop the client-side cache after the first run (via "/proc/sys/vm/drop_caches"), I get the expected 60 megabytes per second or so on both runs.

Until I learn how to overcome this problem, I'll see if I can obtain my required concurrent multi-large-file read speed by carefully striping the files over a few boxes.

Again, thanks for your help. I'll appreciate any other suggestions you might have, or any ideas for other diagnostics we might run.

Rick

On 8/4/09, Andreas Dilger <[email protected]> wrote:
>
> On Aug 04, 2009 10:30 -0400, Rick Rothstein wrote:
> > I'm new to Lustre (v1.8.0.1), and I've verified that
> > I can get about 1000-megabytes-per-second aggregate throughput
> > for large file sequential reads using direct-I/O.
> > (only limited by the speed of my 10gb NIC with TCP offload engine).
> >
> > the above direct-I/O "dd" tests achieve about a 1000-megabyte-per-second
> > aggregate throughput, but when I try the same tests with normal buffered
> > I/O, (by just running "dd" without "iflag=direct"), the runs
> > only get about a 550-megabyte-per-second aggregate throughput.
> >
> > I suspect that this slowdown may have something to do with
> > client-side-caching, but normal buffered reads have not speeded up,
> > even after I've tried such adjustments as:
>
> Note that there is a significant CPU overhead on the client when using
> buffered IO, simply due to CPU usage from copying the data between
> userspace and the kernel. Having multiple cores on the client (one
> per dd process) allows distributing this copy overhead between cores.
>
> You could also run "oprofile" to see if there is anything else of
> interest that is consuming a lot of CPU.
>
> Cheers, Andreas
> --
> Andreas Dilger
> Sr. Staff Engineer, Lustre Group
> Sun Microsystems of Canada, Inc.
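P.S. For anyone who wants to reproduce the two-run read test, here is a sketch of what I'm running. The file below is a small local scratch file standing in for my large files; on a real client you would point TESTFILE at a file on your Lustre mount, and dropping caches needs root:

```shell
# Sketch of the two-run buffered-read test described above.
# TESTFILE is a local stand-in; substitute a large file on the Lustre
# mount to see the real numbers.
TESTFILE=$(mktemp /tmp/readtest.XXXXXX)

# Create a 64MB test file (the real files are much larger).
dd if=/dev/zero of="$TESTFILE" bs=1M count=64 2>/dev/null

# Run 1: buffered read; the data lands in the client page cache.
dd if="$TESTFILE" of=/dev/null bs=1M 2>&1 | tail -n 1

# Drop the page cache between runs (requires root, so commented out):
# sync; echo 3 > /proc/sys/vm/drop_caches

# Run 2: without the drop above, this read is served from cache and
# reports an inflated throughput.
dd if="$TESTFILE" of=/dev/null bs=1M 2>&1 | tail -n 1

# For comparison, direct I/O bypasses the page cache entirely (this
# may fail on filesystems without O_DIRECT support, e.g. tmpfs):
dd if="$TESTFILE" of=/dev/null bs=1M iflag=direct 2>&1 | tail -n 1

rm -f "$TESTFILE"
```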
_______________________________________________
Lustre-discuss mailing list
[email protected]
http://lists.lustre.org/mailman/listinfo/lustre-discuss
