Yes, it turns out it's bug 14379. I applied the provided patches and everything works fine now. Thanks for the follow-up!

-Aaron

On Dec 12, 2007, at 11:23 AM, Oleg Drokin wrote:

> Hello!
>
> On Dec 11, 2007, at 6:51 PM, Aaron S. Knister wrote:
>
>> This is the strangest problem I have seen. I have a Lustre
>> filesystem mounted on a Linux server, and it's being exported over
>> NFS to various Alpha systems. The Alphas mount it just fine;
>> however, under heavy load the NFS server stops responding, as does
>> the Lustre mount on the export server. The weird thing is that if I
>> mount the NFS export on another NFS server and run the same
>> benchmark (bonnie), everything is fine. The Lustre mount on the
>> export server can take a real pounding (I've seen it push
>> 300MB/sec), so I don't know why NFS is crashing it.
>>
>> On the NFS export server I see these messages:
>>
>> Lustre: 4224:0:(o2iblnd_cb.c:412:kiblnd_handle_rx()) PUT_NACK from [EMAIL PROTECTED]
>> LustreError: 4400:0:(client.c:969:ptlrpc_expire_one_request()) @@@ timeout (sent at 1197415542, 100s ago) [EMAIL PROTECTED] x38827/t0 o36->[EMAIL PROTECTED]@o2ib:12 lens 14256/672 ref 1 fl Rpc:/0/0 rc 0/-22
>> Lustre: data-MDT0000-mdc-ffff81082d702000: Connection to service data-MDT0000 via nid [EMAIL PROTECTED] was lost; in progress operations using this service will wait for recovery to complete.
>
> Any messages on the MDS at this time?
>
> Bye,
>     Oleg

Aaron Knister
Associate Systems Administrator/Web Designer
Center for Research on Environment and Water

(301) 595-7001
[EMAIL PROTECTED]

_______________________________________________
Lustre-discuss mailing list
[email protected]
https://mail.clusterfs.com/mailman/listinfo/lustre-discuss
