Thank you Andreas,

After investigating a little further, I found that one of our users' processes creates a large number of small files and does I/O on them. This appears to be the cause of the "fewer stripes" log message. My investigation of the csum problem is still in progress.
Thanks again.

On Wed, Jul 1, 2009 at 12:12 AM, Andreas Dilger <[email protected]> wrote:

> On Jun 29, 2009 18:35 +0300, Ender Güler wrote:
> > We have lustre 1.6.5.1 installation on RHEL 5.1. The interconnect is
> > infiniband. I came across the errors like following, on mds:
> >
> > Lustre: 21241:0:(lov_qos.c:427:qos_shrink_lsm()) using fewer stripes for
> > object 103514695: old 8 new 6
>
> This can happen if some of your OSTs are not responsive to precreate
> requests. It appears you are using a wide striping by default, which
> is good if you have lots of clients reading/writing from the same file
> on a regular basis, but is not recommended if clients normally read/write
> from a single file OR the bandwidth of a single OST can handle the needs
> of a single client.
>
> > And here is the errors regarding to checksum, on one of the ost's:
> > LustreError: 12397:0:(ost_handler.c:1225:ost_brw_write()) client csum
> > 41d0fa49, original server csum e388fa92, server csum now e388fa92
>
> This looks like you are having network problems, or possibly you are
> using mmap IO? The data is arriving at the server is different than
> the data that was originally checksummed by the client. This can happen
> in some cases if the client is doing repeated mmap writes to the same
> part of the file.
>
> Cheers, Andreas
> --
> Andreas Dilger
> Sr. Staff Engineer, Lustre Group
> Sun Microsystems of Canada, Inc.
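
For anyone following the thread, here is a rough sketch of the mmap case described in the quoted reply: the client checksums a page, the page is modified before it is sent, and the server then computes a different checksum. This is only an illustration, not Lustre code. The function names client_send and server_verify are made up, and zlib.crc32 is used as a stand-in for whatever checksum the client and OST actually negotiate.

```python
import zlib


def client_send(page: bytearray, modify_in_flight: bool):
    """Hypothetical client: checksum the page first, then put it on the wire."""
    client_csum = zlib.crc32(page)  # checksum taken before the data is sent
    if modify_in_flight:
        # Simulates a second mmap write landing on the same page
        # after the checksum was computed but before the send.
        page[0] ^= 0xFF
    wire_data = bytes(page)  # what actually reaches the server
    return client_csum, wire_data


def server_verify(client_csum: int, wire_data: bytes) -> bool:
    """Hypothetical server: recompute the checksum over the received data."""
    return zlib.crc32(wire_data) == client_csum


# Page untouched between checksum and send: checksums agree.
csum, data = client_send(bytearray(b"A" * 4096), modify_in_flight=False)
print(server_verify(csum, data))  # True

# Page rewritten between checksum and send: the server sees a mismatch.
csum, data = client_send(bytearray(b"A" * 4096), modify_in_flight=True)
print(server_verify(csum, data))  # False
```

In the second case the server's own checksum is stable across repeated computation, which would match the "original server csum" and "server csum now" values being identical in the log while the client value differs.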
_______________________________________________
Lustre-discuss mailing list
[email protected]
http://lists.lustre.org/mailman/listinfo/lustre-discuss
