On Saturday, 5 December 2009 04:59:55, Andreas Dilger wrote:
>
> We've had problems in the past with 3ware controllers at other sites
> in the past - the performance is not as good as expected, since they
> rely heavily on readahead to get good performance.

True. But we had serious problems (including data loss) with Adaptec controllers before. Their performance is superb, but usability and maintenance are a nightmare in a Linux environment.

>
> That said:
>
> > Dec 4 12:42:56 sadosrd24 LustreError: 4744:0:(ost_handler.c:
> > 882:ost_brw_read()) @@@ timeout on bulk PUT after 100+0s
> > r...@ffff81007efa7e00 x7869690/t0 o3->eb2e7e64-c1d9-
> > d1f6-8f9d-1ba9629ff...@net_0x20000c0a8106f_uuid:0/0 lens 384/336 e 0
> > to 0 dl 1259926976 ref 1 fl Interpret:/0/0 rc 0/0
>
> This means that the IO didn't complete before the timeout. This could
> be because the OST IO is so slow that no RPC can complete before the
> timeout, or because there is packet loss.

In our case a misconfigured D-Link switch caused the problems, so packet loss turned out to be the explanation for us.

>
> Some things to try:
> - reduce the number of OSS threads via module parameter:
>   option ost oss_num_threads=N
> - increase the lustre timeout (details in the manual)

Thank you very much for your help; together with Bernd Schubert, you pointed us in the right direction.

Best Regards
Heiko
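
P.S. For anyone chasing the same message: a minimal Python sketch for tallying "timeout on bulk" lines per host in a syslog file, which may help show whether the timeouts cluster on one server. It matches only the message text visible in the log line quoted above. The function names are made up for illustration, the host position assumes the classic "Mon DD HH:MM:SS host" syslog layout, and the two numbers in "100+0s" are returned separately because their exact meaning is not explained in this thread.

```python
import re
from collections import Counter

# Matches only the message text seen in the quoted log line.
BULK_TIMEOUT = re.compile(r"timeout on bulk (\w+) after (\d+)\+(\d+)s")


def parse_bulk_timeout(line):
    """Return (host, direction, first, second) for a bulk-timeout line, else None."""
    match = BULK_TIMEOUT.search(line)
    if match is None:
        return None
    fields = line.split()
    # Assumption: classic syslog layout "Mon DD HH:MM:SS host ...".
    host = fields[3] if len(fields) > 3 else "unknown"
    direction, first, second = match.groups()
    return host, direction, int(first), int(second)


def count_by_host(lines):
    """Count bulk-timeout messages per reporting host."""
    hits = (parse_bulk_timeout(line) for line in lines)
    return Counter(hit[0] for hit in hits if hit is not None)


if __name__ == "__main__":
    sample = (
        "Dec 4 12:42:56 sadosrd24 LustreError: 4744:0:(ost_handler.c:"
        "882:ost_brw_read()) @@@ timeout on bulk PUT after 100+0s"
    )
    print(parse_bulk_timeout(sample))
    print(count_by_host([sample, "Dec 4 12:43:00 sadosrd24 kernel: unrelated"]))
```

To run it over a real log, pass the open file object to count_by_host(), since it accepts any iterable of lines.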
