My best guess (and please correct me if I'm wrong) is that those messages 
appear because the underlying block devices are slow to respond to I/O 
requests. It looks like you're using DRBD. What's your interconnect? 

On Jan 24, 2010, at 9:42 PM, Lex wrote:

> Hi list 
> 
> I have one OSS with the following hardware: 
> 
> CPU Intel(R) Xeon E5420 2.5 GHz
> Chipset Intel 5000P 
> 8GB RAM 
> 
> On this OSS we use 2 RAID-5 arrays as OSTs (each has 4 x 1.5 TB hard 
> drives on an Adaptec 5805 RAID controller) 
> 
> It worked quite smoothly before, but about 2 weeks ago I started seeing 
> many warnings (at least I think they are warnings) in /var/log/messages 
> like this: 
> 
> Jan 25 08:41:23 OST6 kernel: Lustre: 
> 9587:0:(filter_io_26.c:706:filter_commitrw_write()) lustre-OST0006: slow 
> direct_io 35s
> Jan 25 08:41:34 OST6 kernel: Lustre: 
> 9608:0:(filter_io_26.c:706:filter_commitrw_write()) lustre-OST0006: slow 
> direct_io 41s
> Jan 25 08:41:34 OST6 kernel: Lustre: 
> 9608:0:(filter_io_26.c:706:filter_commitrw_write()) Skipped 2 previous 
> similar messages
> Jan 25 08:41:35 OST6 kernel: Lustre: 
> 9645:0:(filter_io_26.c:706:filter_commitrw_write()) lustre-OST0006: slow 
> direct_io 43s
> Jan 25 08:58:10 OST6 kernel: Lustre: 
> 9646:0:(filter_io_26.c:706:filter_commitrw_write()) lustre-OST0006: slow 
> direct_io 31s
> Jan 25 08:59:39 OST6 kernel: Lustre: 
> 9609:0:(filter_io_26.c:706:filter_commitrw_write()) lustre-OST0006: slow 
> direct_io 30s
> Jan 25 09:01:05 OST6 kernel: Lustre: 
> 9587:0:(filter_io_26.c:706:filter_commitrw_write()) lustre-OST0006: slow 
> direct_io 33s
> Jan 25 09:03:23 OST6 kernel: Lustre: 
> 9633:0:(filter_io_26.c:706:filter_commitrw_write()) lustre-OST0006: slow 
> direct_io 32s
> Jan 25 09:11:25 OST6 kernel: Lustre: 
> 9585:0:(filter_io_26.c:706:filter_commitrw_write()) lustre-OST0006: slow 
> direct_io 36s
> 
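
To get a feel for how long the stalls are, something like the following could 
summarize the slow direct_io warnings. This is only a sketch: the regular 
expression is my assumption based on the lines quoted above, the function name 
is made up, and the sample text is copied from the first few log entries.

```python
import re

# Sample entries copied from the log excerpt above. The mail client wrapped
# each syslog line, so whitespace is collapsed before matching.
SAMPLE = """
Jan 25 08:41:23 OST6 kernel: Lustre:
9587:0:(filter_io_26.c:706:filter_commitrw_write()) lustre-OST0006: slow
direct_io 35s
Jan 25 08:41:34 OST6 kernel: Lustre:
9608:0:(filter_io_26.c:706:filter_commitrw_write()) lustre-OST0006: slow
direct_io 41s
Jan 25 08:41:34 OST6 kernel: Lustre:
9608:0:(filter_io_26.c:706:filter_commitrw_write()) Skipped 2 previous
similar messages
Jan 25 08:41:35 OST6 kernel: Lustre:
9645:0:(filter_io_26.c:706:filter_commitrw_write()) lustre-OST0006: slow
direct_io 43s
"""

# Assumed message shape: "<target>: slow direct_io <seconds>s"
SLOW_IO = re.compile(r"(\S+): slow direct_io (\d+)s")


def slow_io_delays(text):
    """Return {target name: [delay in seconds, ...]} for slow direct_io warnings."""
    delays = {}
    flat = " ".join(text.split())  # undo the line wrapping
    for target, secs in SLOW_IO.findall(flat):
        delays.setdefault(target, []).append(int(secs))
    return delays


if __name__ == "__main__":
    for target, secs in slow_io_delays(SAMPLE).items():
        print(target, "count:", len(secs), "max:", max(secs), "s")
```

Note that "Skipped N previous similar messages" entries are not counted, so 
the real number of slow requests is higher than what this reports.
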
> I googled around and found that it may be caused by a problem with 
> oss_num_threads, so I brought it down to 64 (the formula I found in the 
> 1.8 manual, thread_number = RAM * CPU cores / 128 MB, gives 256): 
> 
> options ost oss_num_threads=64
> 
> It still didn't help. 
> 
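
For what it's worth, the formula as quoted does come out to 256 if I assume 
8 GB of RAM and 4 cores (a single quad-core E5420; the core count is my 
assumption, not something stated above). A quick check, with a hypothetical 
helper name:

```python
def oss_threads(ram_mb, cpu_cores, mb_per_thread=128):
    """Thread count per the formula quoted above: RAM * CPU cores / 128 MB."""
    return ram_mb * cpu_cores // mb_per_thread


if __name__ == "__main__":
    # Assumption: 8 GB RAM and 4 cores (one quad-core Xeon E5420).
    print(oss_threads(8 * 1024, 4))  # 256
```

So 64 is a quarter of what the formula suggests for this machine.
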
> I thought it was only a harmless warning, but I may be wrong: our 
> performance has dropped quite heavily (there may be another reason, but 
> for now I only suspect the slow direct_io problem). 
> 
> iostat -m 1 1
> Linux 2.6.18-92.1.17.el5_lustre.1.8.0custom (OST6)      01/25/2010
> 
> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>            0.01    0.02    2.86   25.01    0.00   72.10
> 
> Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
> sda               1.30         0.01         0.00      11386       3469
> sdb               1.30         0.01         0.00      11531       3469
> sdc             131.50        12.40         0.26   11793218     249934
> sdd             178.46        18.00         0.26   17124065     250334
> md2               3.33         0.02         0.00      22915       2634
> md1               0.00         0.00         0.00          0          0
> md0               0.00         0.00         0.00          0          0
> drbd3           480.10        12.39         0.26   11789047     249639
> drbd6           565.85        14.89         0.26   14168452     249211
> 
> 
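
Two things about these numbers. With a count of 1, iostat prints only its 
first report, which is averaged since boot, so it may not reflect what happens 
during the stalls. Also, dividing throughput by tps gives a rough average 
transfer size per device. A small sketch of that arithmetic, using the rows 
above (the function name is mine):

```python
# (device, tps, MB_read/s, MB_wrtn/s) copied from the iostat report above.
ROWS = [
    ("sdc", 131.50, 12.40, 0.26),
    ("sdd", 178.46, 18.00, 0.26),
    ("drbd3", 480.10, 12.39, 0.26),
    ("drbd6", 565.85, 14.89, 0.26),
]


def avg_transfer_kb(tps, mb_read_s, mb_wrtn_s):
    """Rough average size of one transfer in KB: total MB/s divided by tps."""
    if tps == 0:
        return 0.0
    return (mb_read_s + mb_wrtn_s) * 1024 / tps


if __name__ == "__main__":
    for dev, tps, rd, wr in ROWS:
        print(f"{dev:6s} {avg_transfer_kb(tps, rd, wr):7.1f} KB per transfer")
```

By this estimate the drbd devices move noticeably smaller transfers (under 
30 KB) than sdc and sdd (around 100 KB), though since-boot averages can hide a 
lot. Running something like "iostat -m 5" while the warnings appear would give 
per-interval numbers.
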
> So, could anyone please tell me whether this warning affects our system 
> performance? If it does, please give me a solution or advice to resolve 
> it. 
> 
> Best regards 
> 
> _______________________________________________
> Lustre-discuss mailing list
> [email protected]
> http://lists.lustre.org/mailman/listinfo/lustre-discuss
