On Fri, 2009-08-28 at 15:00 -0400, Scott Atchley wrote:
> Lustre: 4227:0:(filter_io_26.c:641:filter_commitrw_write()) lustre-OST0000: slow i_mutex 30s
> Lustre: 4222:0:(lustre_fsfilt.h:320:fsfilt_commit_wait()) lustre-OST0000: slow journal start 30s
> Lustre: 4222:0:(filter_io_26.c:724:filter_commitrw_write()) lustre-OST0000: slow commitrw commit 30s
> Lustre: 4242:0:(filter_io_26.c:706:filter_commitrw_write()) lustre-OST0000: slow direct_io 30s
> Lustre: 4242:0:(filter_io_26.c:706:filter_commitrw_write()) Skipped 4 previous similar messages
>
> Should I be concerned or is this normal?

It means that I/Os are completing more slowly than Lustre would like which, as you can guess, means you are hammering the disk(s) too hard. Try reducing the number of OST threads.

Ideally you want those messages to go away even when you are pushing the OSTs to capacity. That means just enough OST threads to push the disks to capacity, but no more. So measure, reduce, measure. If the throughput is the same or better after reducing, reduce further and measure again. Repeat until you have found the sweet spot.

obdfilter-survey in the iokit automates this for you, running many tests at different thread counts and letting you see where the sweet spot is without all the iterating.

b.
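
For anyone who wants to script the measure/reduce/measure loop by hand, here is a rough sketch in Python. It is not part of Lustre or the iokit. The names find_sweet_spot, measure and tolerance are made up for illustration, as is the choice to halve the thread count each step, and measure stands in for whatever benchmark you run at a given OST thread count (it should return throughput, e.g. in MB/s). The sample numbers are invented.

```python
from typing import Callable


def find_sweet_spot(measure: Callable[[int], float],
                    start: int,
                    minimum: int = 1,
                    tolerance: float = 0.02) -> int:
    """Halve the thread count while throughput stays the same or better.

    measure(threads) is a caller-supplied (hypothetical) benchmark hook.
    tolerance is the fractional drop still treated as "the same",
    an assumption to absorb run-to-run noise.
    """
    best_threads = start
    best_rate = measure(start)
    threads = start // 2
    while threads >= minimum:
        rate = measure(threads)
        if rate < best_rate * (1 - tolerance):
            break  # throughput dropped: the previous count was the sweet spot
        best_threads = threads
        best_rate = max(best_rate, rate)  # keep the best rate as the reference
        threads //= 2
    return best_threads


# Invented throughput figures (MB/s) keyed by thread count, for illustration only.
sample = {512: 300.0, 256: 310.0, 128: 320.0, 64: 320.0, 32: 250.0}
print(find_sweet_spot(sample.__getitem__, start=512))
```

With the invented figures above, the loop keeps reducing until the drop at 32 threads and settles on 64. On a real system each measure call would be a full benchmark run with the OSS reconfigured in between, which is exactly the iterating obdfilter-survey saves you.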
_______________________________________________
Lustre-discuss mailing list
[email protected]
http://lists.lustre.org/mailman/listinfo/lustre-discuss
