[Lustre-discuss] osc_brw_redo_request error on clients

James Robnett Wed, 09 Feb 2011 13:36:09 -0800

I have a fairly simple Lustre environment that consists of a single MDS and
2 OSSs, each with 4 OSTs.  The servers and clients are all running Lustre
1.8.5 under RHEL 5.5, with RPMs downloaded from lustre.


Normally I've had no problems, but recently multiple clients have been
reporting the following error:

LustreError: 3935:0:(osc_request.c:1629:osc_brw_redo_request()) @@@ redo
for recoverable error  req@ffff8101ae084000 x1358858531428366/t60136289752
o4->[email protected]@o2ib:6/4 lens 448/608 e 0 to 1 dl
1297285890 ref 2 fl Interpret:R/0/0 rc 0/0

which in turn appears to cause a premature EOF in our user software.

There are no corresponding errors on the servers.
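
For anyone trying to read the request line above, here is a minimal Python
sketch that splits it into named fields.  The field names (xid, transno,
opcode, timed_out, deadline and so on) are assumptions based on a reading of
how Lustre prints a request, not something stated in this post, and the
pattern is only fitted to the one message shown here.  The helper name
parse_req is hypothetical.

```python
import re
from datetime import datetime, timezone

# Message copied as it appears in the post (wrapped over four lines).
SAMPLE = """LustreError: 3935:0:(osc_request.c:1629:osc_brw_redo_request()) @@@ redo
for recoverable error  req@ffff8101ae084000 x1358858531428366/t60136289752
o4->[email protected]@o2ib:6/4 lens 448/608 e 0 to 1 dl
1297285890 ref 2 fl Interpret:R/0/0 rc 0/0"""

# Field names are assumptions; the layout is fitted to the sample only and
# may differ between Lustre versions.
REQ_RE = re.compile(
    r"x(?P<xid>\d+)/t(?P<transno>\d+)\s+"
    r"o(?P<opcode>\d+)->(?P<target>.+?):(?P<req_portal>\d+)/(?P<rep_portal>\d+)\s+"
    r"lens\s+(?P<req_len>\d+)/(?P<rep_len>\d+)\s+"
    r"e\s+(?P<early>\d+)\s+to\s+(?P<timed_out>\d+)\s+"
    r"dl\s+(?P<deadline>\d+)\s+ref\s+(?P<ref>\d+)\s+"
    r"fl\s+(?P<flags>\S+)\s+rc\s+(?P<rc>-?\d+)/(?P<reply_rc>-?\d+)"
)

INT_FIELDS = (
    "xid", "transno", "opcode", "req_portal", "rep_portal", "req_len",
    "rep_len", "early", "timed_out", "deadline", "ref", "rc", "reply_rc",
)


def parse_req(message):
    """Return a dict of fields from one request line, or None if no match."""
    # Undo the line wrapping so the pattern sees a single line.
    text = " ".join(message.split())
    match = REQ_RE.search(text)
    if match is None:
        return None
    fields = match.groupdict()
    for key in INT_FIELDS:
        fields[key] = int(fields[key])
    # "dl" is treated here as a deadline in seconds since the Unix epoch.
    fields["deadline_utc"] = datetime.fromtimestamp(
        fields["deadline"], tz=timezone.utc
    )
    return fields


if __name__ == "__main__":
    parsed = parse_req(SAMPLE)
    for name in ("opcode", "timed_out", "req_len", "rep_len", "flags"):
        print(name, "=", parsed[name])
    print("deadline_utc =", parsed["deadline_utc"].isoformat())
```

If that reading of the fields is right, "o4" would be a bulk write to an OST
and "to 1" would mean the request was flagged as timed out, which fits the
suspicion of a timeout.  Running the same parser over all client logs and
grouping by target could show whether one OST or one network path dominates.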

I seem to see this error only on clients connected via QDR InfiniBand,
though that may be a false lead.  In addition, the problem seems more
prevalent under load.  Lastly, it seems to be getting worse, almost as
if there's some garbage collection issue on the clients.

I've done some searching and don't see any reports involving that routine.
It seems like a timeout of some sort.  Any hints as to what problem this
error indicates?

james

_______________________________________________
Lustre-discuss mailing list
[email protected]
http://lists.lustre.org/mailman/listinfo/lustre-discuss
