Hi,

Looks like a connection timeout to 10.148.0.106@o2ib, likely temporary, as 
the client appears to have reconnected and recovered within seconds.
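
If you want to check this over a longer stretch of the log, here is a rough
sketch (not an official Lustre tool) that pulls out which targets were
reported lost, restored, or evicting the client, and decodes the return
codes seen in your excerpt. The function name and regexes are my own
assumptions, based only on the message wording in the log you posted. Keep
in mind that Lustre suppresses repeats ("Skipped N previous similar
messages"), so a target missing from the restored set has not necessarily
stayed disconnected.

```python
import re
from collections import Counter

# Standard Linux errno values for the return codes in the quoted log.
ERRNO_NAMES = {5: "EIO", 11: "EAGAIN", 16: "EBUSY"}

# Patterns assumed from the message wording in the posted log excerpt.
LOST = re.compile(r"Connection to service (\S+) via nid")
RESTORED = re.compile(r"Connection restored to service (\S+) using nid")
EVICTED = re.compile(r"evicted by ([^\s;]+)")
RC = re.compile(r"(?:rc|failed with|couldn't unlock|ldlm_cli_cancel_list:) (-\d+)")


def summarize(log_text):
    """Summarize lost/restored/evicting targets and error codes in a log."""
    # Drop mail quote markers and rejoin wrapped lines so messages that a
    # mail client split across lines still match.
    text = " ".join(line.lstrip("> ").strip() for line in log_text.splitlines())
    lost = set(LOST.findall(text))
    restored = set(RESTORED.findall(text))
    return {
        "lost": lost,
        "restored": restored,
        # Only a hint: suppressed messages may hide a restore.
        "not_restored": lost - restored,
        "evicted_by": set(EVICTED.findall(text)),
        "errors": Counter(ERRNO_NAMES.get(-int(rc), rc) for rc in RC.findall(text)),
    }
```

You would call it as summarize(open(path).read()) with the path to your
syslog file (the location depends on your distribution). An eviction or a
non-empty "not_restored" set is worth a closer look on the server side.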

What other issues are you experiencing?

-cf


On 09/29/2011 10:39 PM, Ashok nulguda wrote:
> Dear All,
>
> I am seeing Lustre errors on my HPC cluster, as shown below. Can anyone 
> help me resolve this problem?
> Thanks in advance.
> Sep 30 08:40:23 service0 kernel: [343138.837222] Lustre: 
> 8300:0:(client.c:1476:ptlrpc_expire_one_request()) Skipped 1 previous 
> similar message
> Sep 30 08:40:23 service0 kernel: [343138.837233] Lustre: 
> lustre-OST0008-osc-ffff880b272cf800: Connection to service 
> lustre-OST0008 via nid 10.148.0.106@o2ib was lost; in progress 
> operations using this service will wait for recovery to complete.
> Sep 30 08:40:24 service0 kernel: [343139.837260] Lustre: 
> 8300:0:(client.c:1476:ptlrpc_expire_one_request()) @@@ Request 
> x1380984193067288 sent from lustre-OST0006-osc-ffff880b272cf800 to NID 
> 10.148.0.106@o2ib 7s ago has timed out (7s prior to deadline).
> Sep 30 08:40:24 service0 kernel: [343139.837263]   
> req@ffff880a5f800c00 x1380984193067288/t0 
> o3->[email protected]@o2ib:6/4 lens 448/592 e 0 to 1 dl 
> 1317352224 ref 2 fl Rpc:/0/0 rc 0/0
> Sep 30 08:40:24 service0 kernel: [343139.837269] Lustre: 
> 8300:0:(client.c:1476:ptlrpc_expire_one_request()) Skipped 38 previous 
> similar messages
> Sep 30 08:40:24 service0 kernel: [343140.129284] LustreError: 
> 9983:0:(ldlm_request.c:1025:ldlm_cli_cancel_req()) Got rc -11 from 
> cancel RPC: canceling anyway
> Sep 30 08:40:24 service0 kernel: [343140.129290] LustreError: 
> 9983:0:(ldlm_request.c:1025:ldlm_cli_cancel_req()) Skipped 1 previous 
> similar message
> Sep 30 08:40:24 service0 kernel: [343140.129295] LustreError: 
> 9983:0:(ldlm_request.c:1587:ldlm_cli_cancel_list()) 
> ldlm_cli_cancel_list: -11
> Sep 30 08:40:24 service0 kernel: [343140.129299] LustreError: 
> 9983:0:(ldlm_request.c:1587:ldlm_cli_cancel_list()) Skipped 1 previous 
> similar message
> Sep 30 08:40:25 service0 kernel: [343140.837308] Lustre: 
> 8300:0:(client.c:1476:ptlrpc_expire_one_request()) @@@ Request 
> x1380984193067299 sent from lustre-OST0010-osc-ffff880b272cf800 to NID 
> 10.148.0.106@o2ib 7s ago has timed out (7s prior to deadline).
> Sep 30 08:40:25 service0 kernel: [343140.837311]   
> req@ffff880a557c4400 x1380984193067299/t0 
> o3->[email protected]@o2ib:6/4 lens 448/592 e 0 to 1 dl 
> 1317352225 ref 2 fl Rpc:/0/0 rc 0/0
> Sep 30 08:40:25 service0 kernel: [343140.837316] Lustre: 
> 8300:0:(client.c:1476:ptlrpc_expire_one_request()) Skipped 4 previous 
> similar messages
> Sep 30 08:40:26 service0 kernel: [343141.245365] LustreError: 
> 30978:0:(ldlm_request.c:1025:ldlm_cli_cancel_req()) Got rc -11 from 
> cancel RPC: canceling anyway
> Sep 30 08:40:26 service0 kernel: [343141.245371] LustreError: 
> 22729:0:(ldlm_request.c:1587:ldlm_cli_cancel_list()) 
> ldlm_cli_cancel_list: -11
> Sep 30 08:40:26 service0 kernel: [343141.245378] LustreError: 
> 30978:0:(ldlm_request.c:1025:ldlm_cli_cancel_req()) Skipped 1 previous 
> similar message
> Sep 30 08:40:33 service0 kernel: [343148.245683] Lustre: 
> 22725:0:(client.c:1476:ptlrpc_expire_one_request()) @@@ Request 
> x1380984193067302 sent from lustre-OST0004-osc-ffff880b272cf800 to NID 
> 10.148.0.106@o2ib 14s ago has timed out (14s prior to deadline).
> Sep 30 08:40:33 service0 kernel: [343148.245686]   
> req@ffff8805c879e800 x1380984193067302/t0 
> o103->[email protected]@o2ib:17/18 lens 296/384 e 0 to 
> 1 dl 1317352233 ref 1 fl Rpc:N/0/0 rc 0/0
> Sep 30 08:40:33 service0 kernel: [343148.245692] Lustre: 
> 22725:0:(client.c:1476:ptlrpc_expire_one_request()) Skipped 2 previous 
> similar messages
> Sep 30 08:40:33 service0 kernel: [343148.245708] LustreError: 
> 22725:0:(ldlm_request.c:1025:ldlm_cli_cancel_req()) Got rc -11 from 
> cancel RPC: canceling anyway
> Sep 30 08:40:33 service0 kernel: [343148.245714] LustreError: 
> 22725:0:(ldlm_request.c:1587:ldlm_cli_cancel_list()) 
> ldlm_cli_cancel_list: -11
> Sep 30 08:40:33 service0 kernel: [343148.245717] LustreError: 
> 22725:0:(ldlm_request.c:1587:ldlm_cli_cancel_list()) Skipped 1 
> previous similar message
> Sep 30 08:40:36 service0 kernel: [343151.548005] LustreError: 11-0: an 
> error occurred while communicating with 10.148.0.106@o2ib. The 
> ost_connect operation failed with -16
> Sep 30 08:40:36 service0 kernel: [343151.548008] LustreError: Skipped 
> 1 previous similar message
> Sep 30 08:40:36 service0 kernel: [343151.548024] LustreError: 167-0: 
> This client was evicted by lustre-OST000b; in progress operations 
> using this service will fail.
> Sep 30 08:40:36 service0 kernel: [343151.548250] LustreError: 
> 30452:0:(llite_mmap.c:210:ll_tree_unlock()) couldn't unlock -5
> Sep 30 08:40:36 service0 kernel: [343151.550210] LustreError: 
> 8300:0:(client.c:858:ptlrpc_import_delay_req()) @@@ IMP_INVALID  
> req@ffff88049528c400 x1380984193067406/t0 
> o3->[email protected]@o2ib:6/4 lens 448/592 e 0 to 1 dl 
> 0 ref 2 fl Rpc:/0/0 rc 0/0
> Sep 30 08:40:36 service0 kernel: [343151.594742] Lustre: 
> lustre-OST0000-osc-ffff880b272cf800: Connection restored to service 
> lustre-OST0000 using nid 10.148.0.106@o2ib.
> Sep 30 08:40:36 service0 kernel: [343151.837203] Lustre: 
> lustre-OST0006-osc-ffff880b272cf800: Connection restored to service 
> lustre-OST0006 using nid 10.148.0.106@o2ib.
> Sep 30 08:40:37 service0 kernel: [343152.842631] Lustre: 
> lustre-OST0003-osc-ffff880b272cf800: Connection restored to service 
> lustre-OST0003 using nid 10.148.0.106@o2ib.
> Sep 30 08:40:37 service0 kernel: [343152.842636] Lustre: Skipped 3 
> previous similar messages
>
>
> Thanks and Regards
> Ashok
>
> -- 
> *Ashok Nulguda
> *
> *TATA ELXSI LTD*
> *Mb : +91 9689945767
> *
> *Email :[email protected] <mailto:[email protected]>*
>
>
> _______________________________________________
> Lustre-discuss mailing list
> [email protected]
> http://lists.lustre.org/mailman/listinfo/lustre-discuss