I wonder if there's some type of fault in the I/O path that is increasing the latency of individual I/Os? Something like this could drive up the load average, especially considering the number of OST kernel threads: on Linux, every thread blocked in uninterruptible I/O wait counts toward the load.

paul
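One rough way to check this on an OSS is to count the threads sitting in uninterruptible sleep (state "D") and group them by name. The sketch below is only an illustration, not something taken from the original report: the helper names `parse_stat` and `blocked_threads` are made up here, and it assumes a Linux /proc layout and Python 3 on the server.

```python
import os
from collections import Counter


def parse_stat(line):
    """Return (comm, state) from the contents of a /proc/<pid>/stat file."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or parentheses, so split on the LAST ')' in the line.
    left, right = line.rsplit(")", 1)
    comm = left.split("(", 1)[1]
    state = right.split()[0]
    return comm, state


def blocked_threads(proc="/proc"):
    """Count threads in uninterruptible sleep ('D'), grouped by thread name."""
    counts = Counter()
    for pid in filter(str.isdigit, os.listdir(proc)):
        task_dir = os.path.join(proc, pid, "task")
        try:
            tids = os.listdir(task_dir)
        except OSError:
            continue  # process exited while we were scanning
        for tid in tids:
            try:
                with open(os.path.join(task_dir, tid, "stat")) as f:
                    comm, state = parse_stat(f.read())
            except (OSError, IndexError, ValueError):
                continue  # thread vanished or the line was unparsable
            if state == "D":
                counts[comm] += 1
    return counts


if __name__ == "__main__":
    # Print the ten thread names with the most blocked instances.
    for name, n in blocked_threads().most_common(10):
        print(f"{n:5d}  {name}")
```

If most of the "D" entries turn out to be OST service threads while the back-end storage looks idle, that would point toward per-I/O latency rather than CPU contention as the source of the load. A single snapshot can mislead, so it is worth running this several times under load.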
John White wrote:
> Hello Folks,
> A while back (say 3 weeks ago) we started noticing extremely high loads
> (load avg around 300 at times) on our OSSs when in production and serving IO.
> This cluster was, at the time, on 1.8.2 (we have since upgraded to 1.8.4 but
> the problem remains). The load increases fairly predictably as clients
> generate IO but even 2 clients can produce a load avg above 5.00. An
> identical file system of ours does not exhibit this behavior (sticks below
> load avg 1.00 under even the heaviest IO load). I've looked around bugzilla
> and haven't found anything. We've disabled heartbeat on the off-chance that
> was generating the load (it's not), we've attempted using a different client
> transport (o2ib->tcp), this did not solve the issue. There doesn't appear to
> be any specific non-kernel thread causing the high-load. The only info in
> dmesg/syslog pertains to sporadic client evictions or sporadic slow setattr
> due to heavy IO load (we've since tuned the number of OST threads). We're
> basically out of ideas to try.
>
> As reference, this is a 1 MDS/4 OSS cluster backed by a DDN 9900 couplet (15
> tiers, 1:1 lun mapping) running the lustre.org rpm build kernel for 1.8.4.
> The MDS/OSSs are Dell R710s and the MDT is a Dell MD1000. Is this a common
> problem or should a bug be filed? Any info available upon request. Thanks
> for your time.
> ----------------
> John White
> High Performance Computing Services (HPCS)
> (510) 486-7307
> One Cyclotron Rd, MS: 50B-3209C
> Lawrence Berkeley National Lab
> Berkeley, CA 94720
>
> _______________________________________________
> Lustre-discuss mailing list
> [email protected]
> http://lists.lustre.org/mailman/listinfo/lustre-discuss
