Yes, it turns out it's bug 14379. I applied the provided patches and
everything works fine now. Thanks for the follow-up!

-Aaron

On Dec 12, 2007, at 11:23 AM, Oleg Drokin wrote:

> Hello!
>
> On Dec 11, 2007, at 6:51 PM, Aaron S. Knister wrote:
>
>> This is the strangest problem I have seen. I have a Lustre
>> filesystem mounted on a Linux server, and it's being exported to
>> various Alpha systems. The Alphas mount it just fine; however, under
>> heavy load the NFS server stops responding, as does the Lustre
>> mount on the export server. The weird thing is that if I mount the
>> NFS export on another NFS server and run the same benchmark
>> (bonnie), everything is fine. The Lustre mount on the export server
>> can take a real pounding (I've seen it push 300MB/sec), so I don't
>> know why NFS is crashing it.
>> On the NFS export server I see these messages:
>> Lustre: 4224:0:(o2iblnd_cb.c:412:kiblnd_handle_rx()) PUT_NACK from [EMAIL PROTECTED]
>> LustreError: 4400:0:(client.c:969:ptlrpc_expire_one_request()) @@@ timeout (sent at 1197415542, 100s ago)  [EMAIL PROTECTED] x38827/t0 o36->[EMAIL PROTECTED]@o2ib:12 lens 14256/672 ref 1 fl Rpc:/0/0 rc 0/-22
>> Lustre: data-MDT0000-mdc-ffff81082d702000: Connection to service data-MDT0000 via nid [EMAIL PROTECTED] was lost; in progress operations using this service will wait for recovery to complete.
>
> Any messages on the MDS at this time?
>
> Bye,
>    Oleg

Aaron Knister
Associate Systems Administrator/Web Designer
Center for Research on Environment and Water

(301) 595-7001
[EMAIL PROTECTED]



_______________________________________________
Lustre-discuss mailing list
[email protected]
https://mail.clusterfs.com/mailman/listinfo/lustre-discuss
