Hi,

On 2010-11-16, at 7:25 PM, Arne Brutschy wrote:

> Hello,
> 
>> From the log, we can see that either your MGS node was not yet ready
>> for connections, or there is a network error between the client and
>> the MGS node.
> 
> No errors on either the server or the client. What else can it be?
> Maybe the switch is bad; I can see RX errors on most of its interfaces.

The switch could well be the culprit: the error message shows that the
client failed to send its request to the MGS, and the network send status
was -113 (-EHOSTUNREACH, no route to host).
I suggest you re-examine your network, starting with the switch ports
that report RX errors.
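
For reference, the two numeric codes in your log can be decoded as shown
below. This is only a small sketch: the table is hard-coded from the Linux
errno values (113 and 108), and decode_status is a name made up for
illustration, not part of Lustre.

```python
# Linux errno values (asm-generic/errno.h) for the two codes in the quoted log.
# Hard-coded (an assumption for illustration) so the result does not depend
# on the platform that runs this script.
LINUX_ERRNO = {
    108: ("ESHUTDOWN", "Cannot send after transport endpoint shutdown"),
    113: ("EHOSTUNREACH", "No route to host"),
}


def decode_status(status):
    """Return (name, message) for a kernel-style negative status such as -113."""
    return LINUX_ERRNO.get(abs(status), ("UNKNOWN", "not in this table"))


if __name__ == "__main__":
    for status in (-113, -108):
        name, message = decode_status(status)
        print(f"{status}: {name} ({message})")
```

The -113 comes from LNet when the PUT to the MGS cannot be sent; the -108
is what mount.lustre then reports, which is why its message matches the
ESHUTDOWN text.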

> 
>> Were you rebooting the MGS at the moment?
> 
> No. It's something that happens regularly.
> 
>> Since you said there are no errors on the interface, you need to check
>> the LNet connection and also verify that the MGS/MDT are up and running.
> 
> As far as I can tell, everything seems to be set up correctly. I have
> quite a simple setup (single network, single GbE interface).
> 
> Thanks
> Arne
> 
>> On 2010-11-15, at 11:32 PM, Arne Brutschy wrote:
>> 
>>> Hi all,
>>> 
>>> I am mounting Lustre through an fstab entry. This fails quite often,
>>> and the nodes end up without the Lustre mount. Even when I log in, it
>>> takes 2-3 tries to get it to mount. This is what I get:
>>> 
>>>       mount /lustre
>>>       mount.lustre: mount 10.1....@tcp0:/lustre at /lustre failed: Cannot send after transport endpoint shutdown
>>> 
>>> This is /var/log/messages:
>>> 
>>>       Nov 15 16:27:43 compute-1-10 kernel: LustreError: 2124:0:(lib-move.c:2441:LNetPut()) Error sending PUT to 12345-10.1....@tcp: -113
>>>       Nov 15 16:27:43 compute-1-10 kernel: LustreError: 2124:0:(events.c:66:request_out_callback()) @@@ type 4, status -113  r...@d73d7c00 x1352468062535684/t0 o250->[email protected]@tcp_0:26/25 lens 368/584 e 0 to 1 dl 1289834868 ref 2 fl Rpc:N/0/0 rc 0/0
>>>       Nov 15 16:27:43 compute-1-10 kernel: LustreError: 29069:0:(client.c:858:ptlrpc_import_delay_req()) @@@ IMP_INVALID  r...@d73d7800 x1352468062535685/t0 o101->[email protected]@tcp_0:26/25 lens 296/544 e 0 to 1 dl 0 ref 1 fl Rpc:/0/0 rc 0/0
>>>       Nov 15 16:27:43 compute-1-10 kernel: LustreError: 15c-8: mgc10.1....@tcp: The configuration from log 'lustre-client' failed (-108). This may be the result of communication errors between this node and the MGS, a bad configuration, or other errors. See the syslog for more information.
>>>       Nov 15 16:27:43 compute-1-10 kernel: LustreError: 29069:0:(llite_lib.c:1176:ll_fill_super()) Unable to process log: -108
>>>       Nov 15 16:27:43 compute-1-10 kernel: LustreError: 29069:0:(obd_mount.c:2045:lustre_fill_super()) Unable to mount  (-108)
>>> 
>>> I have no errors on the interface, so I assume this is a timing problem.
>>> Can I improve this through some timeout setting?
>>> 
>>> Cheers,
>>> Arne
>>> 
>>> _______________________________________________
>>> Lustre-discuss mailing list
>>> [email protected]
>>> http://lists.lustre.org/mailman/listinfo/lustre-discuss
>> 
> 
> -- 
> Arne Brutschy
> Ph.D. Student                    Email    arne.brutschy(AT)ulb.ac.be
> IRIDIA CP 194/6                  Web      iridia.ulb.ac.be/~abrutschy
> Universite' Libre de Bruxelles   Tel      +32 2 650 2273
> Avenue Franklin Roosevelt 50     Fax      +32 2 650 2715
> 1050 Bruxelles, Belgium          (Fax at IRIDIA secretary)
> 
