Hi,

Can anybody read these messages and give me a hint where to look for the
problem? I hit this LBUG rather easily, triggered by either
(o2iblnd_cb.c:1068:kiblnd_tx_complete()) ASSERTION(tx->tx_sending > 0) failed
or
(o2iblnd_cb.c:171:kiblnd_get_idle_tx()) ASSERTION(tx->tx_sending == 0) failed

I'm using Lustre 1.6.1 as downloaded, on top of RHEL4U5 with o2ib, and I get
this a few times per day while writing huge files with "dd".

Any hint on where to look further would be very welcome! More of the log
surrounding the error message is below.

Best regards,
Erich




Lustre: necd3-OST0000-osc-0000010080cd5800: Connection restored to service 
necd3-OST0000 using nid [EMAIL PROTECTED]
LustreError: 6819:0:(events.c:134:client_bulk_callback()) event type 0, status 
-5, desc 00000100156ba000
LustreError: 6820:0:(events.c:134:client_bulk_callback()) event type 0, status 
-5, desc 000001001f1f4000
LustreError: 6819:0:(events.c:134:client_bulk_callback()) event type 0, status 
-5, desc 00000100a10b0000
LustreError: 6819:0:(events.c:134:client_bulk_callback()) event type 0, status 
-5, desc 000001002e3fa000
LustreError: 6820:0:(events.c:134:client_bulk_callback()) event type 0, status 
-5, desc 0000010066604000
LustreError: 6819:0:(events.c:134:client_bulk_callback()) event type 0, status 
-5, desc 0000010070022000
LustreError: 6820:0:(events.c:55:request_out_callback()) @@@ type 4, status -5  
[EMAIL PROTECTED] x1000806/t0 o400->[EMAIL PROTECTED]@o2ib_0:26 lens 128/128 
ref 2 fl Rpc:N/0/0 rc 0/-22
LustreError: 6820:0:(events.c:55:request_out_callback()) Skipped 6 previous 
similar messages
LustreError: 6820:0:(o2iblnd_cb.c:1068:kiblnd_tx_complete()) 
ASSERTION(tx->tx_sending > 0) failed
LustreError: 6819:0:(o2iblnd_cb.c:1068:kiblnd_tx_complete()) 
ASSERTION(tx->tx_sending > 0) failed
LustreError: 6819:0:(tracefile.c:433:libcfs_assertion_failed()) LBUG
Lustre: 6819:0:(linux-debug.c:168:libcfs_debug_dumpstack()) showing stack for 
process 6819
kiblnd_sd_00  R  running task       0  6819      1          6824  6820 (L-TLB)
0000000000000000 0000000000000000 ffffffffa028d43d 0000000000000005
       ffffff000006c5a0 0000000000000000 0000000000000005 ffffffffa0288894
       0000000000000000 0000000000000000
Call Trace:<ffffffffa0288894>{:libcfs:libcfs_assertion_failed+84}
       <ffffffffa0404d53>{:ko2iblnd:kiblnd_tx_complete+67}
       <0>LustreError: 6820:0:(tracefile.c:433:libcfs_assertion_failed()) LBUG
kiblnd_sd_01  R  running task       0  6820      1          6819  6821 (L-TLB)
0000000000000000 0000000000000000 ffffffffa028d43d 0000000000000005
       ffffff000006c6d0 0000000000000000 0000000000000005 ffffffffa0288894
       <ffffffff80133741>{__wake_up+54} 
<ffffffffa0409e60>{:ko2iblnd:kiblnd_scheduler+736}
       0000000000000000 0000000000000000
Call Trace:<ffffffffa0288894>{:libcfs:libcfs_assertion_failed+84}
       <ffffffffa0404d53>{:ko2iblnd:kiblnd_tx_complete+67}
       <ffffffff8013369a>{default_wake_function+0} 
<ffffffff80110de3>{child_rip+8}
       <ffffffffa0409b80>{:ko2iblnd:kiblnd_scheduler+0} 
<ffffffff80133741>{__wake_up+54}<ffffffff80110ddb>{child_rip+0}

 <3>LustreError: 6824:0:(client.c:962:ptlrpc_expire_one_request()) @@@ network 
error (sent at 1187792190, 0s ago)  [EMAIL PROTECTED] x1000806/t0 o400->[EMAIL 
PROTECTED]@o2ib_0:26 lens 128/128 ref 1 fl Rpc:N/0/0 rc 0/-22
<ffffffffa0409e60>{:ko2iblnd:kiblnd_scheduler+736}
       <3>LustreError: 6824:0:(client.c:962:ptlrpc_expire_one_request()) 
Skipped 8 previous similar messages
LustreError: 166-1: [EMAIL PROTECTED]: Connection to service MGS via nid [EMAIL 
PROTECTED] was lost; in progress operations using this service will fail.
<ffffffff8013369a>{default_wake_function+0} <1>LustreError: dumping log to 
/tmp/lustre-log.1187792190.6819
<ffffffff80110de3>{child_rip+8}
       <ffffffffa0409b80>{:ko2iblnd:kiblnd_scheduler+0} 
<ffffffff80110ddb>{child_rip+0}

LustreError: dumping log to /tmp/lustre-log.1187792190.6820
LustreError: 2697:0:(events.c:55:request_out_callback()) @@@ type 4, status 
-113  [EMAIL PROTECTED] x1000808/t0 o400->[EMAIL PROTECTED]@o2ib:28 lens 
128/128 ref 2 fl Rpc:N/0/0 rc 0/-22
LustreError: 2697:0:(events.c:55:request_out_callback()) Skipped 1 previous 
similar message
LustreError: 6823:0:(o2iblnd_cb.c:2843:kiblnd_check_conns()) Timed out RDMA 
with [EMAIL PROTECTED]
Lustre: necd3-OST0000-osc-0000010080cd5800: Connection to service necd3-OST0000 
via nid [EMAIL PROTECTED] was lost; in progress operations using this service 
will wait for recovery to complete.
Lustre: Skipped 3 previous similar messages
LustreError: 22354:0:(client.c:520:ptlrpc_import_delay_req()) @@@ IMP_INVALID  
[EMAIL PROTECTED] x1000867/t0 o101->[EMAIL PROTECTED]@o2ib_0:26 lens 232/240 
ref 1 fl Rpc:/0/0 rc 0/0
LustreError: 6823:0:(o2iblnd_cb.c:2843:kiblnd_check_conns()) Timed out RDMA 
with [EMAIL PROTECTED]
LustreError: 6823:0:(o2iblnd_cb.c:2843:kiblnd_check_conns()) Timed out RDMA 
with [EMAIL PROTECTED]
LustreError: 6823:0:(events.c:55:request_out_callback()) @@@ type 4, status 
-103  [EMAIL PROTECTED] x1000850/t0 o400->[EMAIL PROTECTED]@o2ib:28 lens 
128/128 ref 2 fl Rpc:N/0/0 rc 0/-22
LustreError: 6823:0:(events.c:55:request_out_callback()) Skipped 1 previous 
similar message
LustreError: 22354:0:(client.c:520:ptlrpc_import_delay_req()) @@@ IMP_INVALID  
[EMAIL PROTECTED] x1000871/t0 o101->[EMAIL PROTECTED]@o2ib_0:26 lens 232/240 
ref 1 fl Rpc:/0/0 rc 0/0
LustreError: 22354:0:(client.c:520:ptlrpc_import_delay_req()) Skipped 3 
previous similar messages
LustreError: 6825:0:(client.c:962:ptlrpc_expire_one_request()) @@@ timeout 
(sent at 1187792290, 100s ago)  [EMAIL PROTECTED] x1000856/t0 o250->[EMAIL 
PROTECTED]@o2ib_0:26 lens 304/328 ref 2 fl Rpc:/0/0 rc 0/-22
LustreError: 6825:0:(client.c:962:ptlrpc_expire_one_request()) Skipped 26 
previous similar messages
LustreError: 6823:0:(o2iblnd_cb.c:2843:kiblnd_check_conns()) Timed out RDMA 
with [EMAIL PROTECTED]
LustreError: 6823:0:(o2iblnd_cb.c:2843:kiblnd_check_conns()) Skipped 1 previous 
similar message
LustreError: 22354:0:(client.c:520:ptlrpc_import_delay_req()) @@@ IMP_INVALID  
[EMAIL PROTECTED] x1000886/t0 o101->[EMAIL PROTECTED]@o2ib_0:26 lens 232/240 
ref 1 fl Rpc:/0/0 rc 0/0
LustreError: 22354:0:(client.c:520:ptlrpc_import_delay_req()) Skipped 3 
previous similar messages
LustreError: 22354:0:(client.c:520:ptlrpc_import_delay_req()) @@@ IMP_INVALID  
[EMAIL PROTECTED] x1000890/t0 o101->[EMAIL PROTECTED]@o2ib_0:26 lens 232/240 
ref 1 fl Rpc:/0/0 rc 0/0
LustreError: 22354:0:(client.c:520:ptlrpc_import_delay_req()) Skipped 3 
previous similar messages
LustreError: 22354:0:(client.c:520:ptlrpc_import_delay_req()) @@@ IMP_INVALID  
[EMAIL PROTECTED] x1000905/t0 o101->[EMAIL PROTECTED]@o2ib_0:26 lens 232/240 
ref 1 fl Rpc:/0/0 rc 0/0
LustreError: 22354:0:(client.c:520:ptlrpc_import_delay_req()) Skipped 3 
previous similar messages
LustreError: 22354:0:(client.c:520:ptlrpc_import_delay_req()) @@@ IMP_INVALID  
[EMAIL PROTECTED] x1000913/t0 o101->[EMAIL PROTECTED]@o2ib_0:26 lens 232/240 
ref 1 fl Rpc:/0/0 rc 0/0
LustreError: 22354:0:(client.c:520:ptlrpc_import_delay_req()) Skipped 3 
previous similar messages
LustreError: 22354:0:(client.c:520:ptlrpc_import_delay_req()) @@@ IMP_INVALID  
[EMAIL PROTECTED] x1000924/t0 o101->[EMAIL PROTECTED]@o2ib_0:26 lens 232/240 
ref 1 fl Rpc:/0/0 rc 0/0
LustreError: 22354:0:(client.c:520:ptlrpc_import_delay_req()) Skipped 3 
previous similar messages
LustreError: 6825:0:(client.c:962:ptlrpc_expire_one_request()) @@@ timeout 
(sent at 1187792641, 100s ago)  [EMAIL PROTECTED] x1000917/t0 o38->[EMAIL 
PROTECTED]@o2ib:12 lens 304/328 ref 2 fl Rpc:/0/0 rc 0/-22
LustreError: 6825:0:(client.c:962:ptlrpc_expire_one_request()) Skipped 63 
previous similar messages
LustreError: 22354:0:(client.c:520:ptlrpc_import_delay_req()) @@@ IMP_INVALID  
[EMAIL PROTECTED] x1000939/t0 o101->[EMAIL PROTECTED]@o2ib_0:26 lens 232/240 
ref 1 fl Rpc:/0/0 rc 0/0
LustreError: 22354:0:(client.c:520:ptlrpc_import_delay_req()) Skipped 3 
previous similar messages



