Hi,
We are testing Lustre 1.6b5 on an InfiniBand cluster running RHEL 4. We use the
binaries provided by CFS, with OFED 1.1 as the IB stack.
Several times, some of the clients have hung during the fs mount or when an OST
was added (see the log below).
Error:
LustreError: 1776:0:(o2iblnd_cb.c:2314:kiblnd_rejected()) [EMAIL PROTECTED]
rejected: reason 8, size 148
From OFED:
enum ib_cm_rej_reason {
        ...
        IB_CM_REJ_INVALID_SERVICE_ID = 8,
        ...
};
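For reference, a minimal sketch of how the numeric reason in the "rejected: reason N" message could be turned back into a name. The table is only a subset of the InfiniBand CM reject codes, written as plain integers to match what the log prints. The helper name rej_reason_name is made up for illustration and is not part of OFED or Lustre.

```c
#include <stddef.h>

/* Subset of InfiniBand CM reject reason codes, as plain integers
 * (the same form the Lustre log line prints them in). */
struct rej_reason {
    int code;
    const char *name;
};

static const struct rej_reason rej_reasons[] = {
    { 1,  "IB_CM_REJ_NO_QP" },
    { 3,  "IB_CM_REJ_NO_RESOURCES" },
    { 4,  "IB_CM_REJ_TIMEOUT" },
    { 5,  "IB_CM_REJ_UNSUPPORTED" },
    { 8,  "IB_CM_REJ_INVALID_SERVICE_ID" },
    { 10, "IB_CM_REJ_STALE_CONN" },
    { 28, "IB_CM_REJ_CONSUMER_DEFINED" },
};

/* Hypothetical helper: look up the name for a reject reason code.
 * Returns "unknown" for codes that are not in the subset above. */
const char *rej_reason_name(int code)
{
    size_t i;

    for (i = 0; i < sizeof(rej_reasons) / sizeof(rej_reasons[0]); i++) {
        if (rej_reasons[i].code == code)
            return rej_reasons[i].name;
    }
    return "unknown";
}
```

Calling rej_reason_name(8) returns "IB_CM_REJ_INVALID_SERVICE_ID", the reason quoted above.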
Once we start an IPoIB ping to the corresponding OST, the client
continues. After that it is quite stable.
Any idea how this could be fixed?
Thanks,
Mirko
Lustre: mount data:
Lustre: profile: testfs-client
Lustre: device: [EMAIL PROTECTED]:/testfs
Lustre: flags: 2
Lustre: 0 UP mgc [EMAIL PROTECTED] 438411f9-d2cc-f576-9a5d-bc927badfa60 5
Lustre: 1 UP lov testfs-clilov-0000010075688000
7255c262-21e0-f804-91dd-2e8008cc166a 3
Lustre: 2 UP mdc testfs-MDT0000-mdc-0000010075688000
7255c262-21e0-f804-91dd-2e8008cc166a 4
Lustre: 3 UP osc testfs-OST0000-osc-0000010075688000
7255c262-21e0-f804-91dd-2e8008cc166a 4
Lustre: 4 UP osc testfs-OST0001-osc-0000010075688000
7255c262-21e0-f804-91dd-2e8008cc166a 4
Lustre: mount [EMAIL PROTECTED]:/testfs complete
Lustre: client 0000010075688000 umount complete
Lustre: mount data:
Lustre: profile: testfs-client
Lustre: device: [EMAIL PROTECTED]:/testfs
Lustre: flags: 2
Lustre: 0 UP mgc [EMAIL PROTECTED] caf868ce-f8dc-8c83-ecd4-caf4a75378f2 5
Lustre: 1 UP lov testfs-clilov-000001007eaba800
3119a81c-5954-8b92-edab-5e38c9f7743d 3
Lustre: 2 UP mdc testfs-MDT0000-mdc-000001007eaba800
3119a81c-5954-8b92-edab-5e38c9f7743d 4
Lustre: 3 UP osc testfs-OST0000-osc-000001007eaba800
3119a81c-5954-8b92-edab-5e38c9f7743d 4
Lustre: 4 UP osc testfs-OST0001-osc-000001007eaba800
3119a81c-5954-8b92-edab-5e38c9f7743d 4
Lustre: 5 UP osc testfs-OST0002-osc-000001007eaba800
3119a81c-5954-8b92-edab-5e38c9f7743d 4
Lustre: 6 UP osc testfs-OST0003-osc-000001007eaba800
3119a81c-5954-8b92-edab-5e38c9f7743d 4
Lustre: 7 UP osc testfs-OST0004-osc-000001007eaba800
3119a81c-5954-8b92-edab-5e38c9f7743d 4
Lustre: 8 UP osc testfs-OST0005-osc-000001007eaba800
3119a81c-5954-8b92-edab-5e38c9f7743d 4
Lustre: 9 UP osc testfs-OST0006-osc-000001007eaba800
3119a81c-5954-8b92-edab-5e38c9f7743d 4
Lustre: 10 UP osc testfs-OST0007-osc-000001007eaba800
3119a81c-5954-8b92-edab-5e38c9f7743d 4
Lustre: mount [EMAIL PROTECTED]:/testfs complete
LustreError: 1776:0:(o2iblnd_cb.c:2314:kiblnd_rejected()) [EMAIL PROTECTED]
rejected: reason 8, size 148
LustreError: 1776:0:(o2iblnd_cb.c:1935:kiblnd_peer_connect_failed()) Deleting
messages for [EMAIL PROTECTED]: connection failed
LustreError: 1776:0:(events.c:51:request_out_callback()) @@@ type 4, status -113
LustreError: 1776:0:(events.c:51:request_out_callback()) Skipped 1 previous
similar message
LustreError: 5909:0:(client.c:950:ptlrpc_expire_one_request()) @@@ timeout
(sent at 1166521780, 0s ago)
LustreError: 5909:0:(client.c:950:ptlrpc_expire_one_request()) Skipped 1
previous similar message
Lustre: 5909:0:(peer.c:238:lnet_debug_peer()) [EMAIL PROTECTED] 2
up 8 8 8 8 6 0
Lustre: 5909:0:(peer.c:238:lnet_debug_peer()) [EMAIL PROTECTED] 2
up 8 8 8 8 6 0
LustreError: 5171:0:(o2iblnd_cb.c:455:kiblnd_rx_complete()) Rx from [EMAIL
PROTECTED] failed: 5
LustreError: 1776:0:(o2iblnd_cb.c:2314:kiblnd_rejected()) [EMAIL PROTECTED]
rejected: reason 8, size 148
LustreError: 1776:0:(o2iblnd_cb.c:1935:kiblnd_peer_connect_failed()) Deleting
messages for [EMAIL PROTECTED]: connection failed
LustreError: 1776:0:(events.c:51:request_out_callback()) @@@ type 4, status -113
LustreError: 1776:0:(events.c:51:request_out_callback()) Skipped 3 previous
similar messages
LustreError: 5909:0:(client.c:950:ptlrpc_expire_one_request()) @@@ timeout
(sent at 1166521805, 0s ago)
LustreError: 5909:0:(client.c:950:ptlrpc_expire_one_request()) Skipped 3
previous similar messages
Lustre: 5909:0:(peer.c:238:lnet_debug_peer()) [EMAIL PROTECTED] 2
up 8 8 8 8 6 0
Lustre: 5909:0:(peer.c:238:lnet_debug_peer()) [EMAIL PROTECTED] 2
up 8 8 8 8 6 0
LustreError: 5171:0:(o2iblnd_cb.c:455:kiblnd_rx_complete()) Rx from [EMAIL
PROTECTED] failed: 5
LustreError: 5171:0:(o2iblnd_cb.c:455:kiblnd_rx_complete()) Skipped 5 previous
similar messages
LustreError: 1775:0:(o2iblnd_cb.c:2314:kiblnd_rejected()) [EMAIL PROTECTED]
rejected: reason 8, size 148
LustreError: 1775:0:(o2iblnd_cb.c:1935:kiblnd_peer_connect_failed()) Deleting
messages for [EMAIL PROTECTED]: connection failed
LustreError: 1775:0:(events.c:51:request_out_callback()) @@@ type 4, status -113
LustreError: 1775:0:(events.c:51:request_out_callback()) Skipped 3 previous
similar messages
LustreError: 5909:0:(client.c:950:ptlrpc_expire_one_request()) @@@ timeout
(sent at 1166521830, 0s ago)
LustreError: 5909:0:(client.c:950:ptlrpc_expire_one_request()) Skipped 3
previous similar messages
Lustre: 5909:0:(peer.c:238:lnet_debug_peer()) [EMAIL PROTECTED] 2
up 8 8 8 8 6 0
Lustre: 5909:0:(peer.c:238:lnet_debug_peer()) [EMAIL PROTECTED] 2
up 8 8 8 8 6 0
LustreError: 5170:0:(o2iblnd_cb.c:455:kiblnd_rx_complete()) Rx from [EMAIL
PROTECTED] failed: 5
LustreError: 5170:0:(o2iblnd_cb.c:455:kiblnd_rx_complete()) Skipped 1 previous
similar message
LustreError: 1775:0:(o2iblnd_cb.c:2314:kiblnd_rejected()) [EMAIL PROTECTED]
rejected: reason 8, size 148
LustreError: 1775:0:(o2iblnd_cb.c:1935:kiblnd_peer_connect_failed()) Deleting
messages for [EMAIL PROTECTED]: connection failed
LustreError: 1775:0:(events.c:51:request_out_callback()) @@@ type 4, status -113
LustreError: 1775:0:(events.c:51:request_out_callback()) Skipped 3 previous
similar messages
LustreError: 5909:0:(client.c:950:ptlrpc_expire_one_request()) @@@ timeout
(sent at 1166521855, 0s ago)
LustreError: 5909:0:(client.c:950:ptlrpc_expire_one_request()) Skipped 3
previous similar messages
Lustre: 5909:0:(peer.c:238:lnet_debug_peer()) [EMAIL PROTECTED] 2
up 8 8 8 8 6 0
Lustre: 5909:0:(peer.c:238:lnet_debug_peer()) [EMAIL PROTECTED] 2
up 8 8 8 8 6 0
LustreError: 5171:0:(o2iblnd_cb.c:455:kiblnd_rx_complete()) Rx from [EMAIL
PROTECTED] failed: 5
LustreError: 5171:0:(o2iblnd_cb.c:455:kiblnd_rx_complete()) Skipped 15 previous
similar messages
LustreError: 1776:0:(o2iblnd_cb.c:2314:kiblnd_rejected()) [EMAIL PROTECTED]
rejected: reason 8, size 148
LustreError: 1776:0:(o2iblnd_cb.c:1935:kiblnd_peer_connect_failed()) Deleting
messages for [EMAIL PROTECTED]: connection failed
LustreError: 1776:0:(events.c:51:request_out_callback()) @@@ type 4, status -113
LustreError: 1776:0:(events.c:51:request_out_callback()) Skipped 3 previous
similar messages
LustreError: 5909:0:(client.c:950:ptlrpc_expire_one_request()) @@@ timeout
(sent at 1166521880, 0s ago)
LustreError: 5909:0:(client.c:950:ptlrpc_expire_one_request()) Skipped 3
previous similar messages
Lustre: 5909:0:(peer.c:238:lnet_debug_peer()) [EMAIL PROTECTED] 2
up 8 8 8 8 6 0
Lustre: 5909:0:(peer.c:238:lnet_debug_peer()) [EMAIL PROTECTED] 2
up 8 8 8 8 6 0
LustreError: 5171:0:(o2iblnd_cb.c:455:kiblnd_rx_complete()) Rx from [EMAIL
PROTECTED] failed: 5
LustreError: 5171:0:(o2iblnd_cb.c:455:kiblnd_rx_complete()) Skipped 15 previous
similar messages
LustreError: 1776:0:(o2iblnd_cb.c:2314:kiblnd_rejected()) [EMAIL PROTECTED]
rejected: reason 8, size 148
LustreError: 1776:0:(o2iblnd_cb.c:1935:kiblnd_peer_connect_failed()) Deleting
messages for [EMAIL PROTECTED]: connection failed
LustreError: 1776:0:(events.c:51:request_out_callback()) @@@ type 4, status -113
LustreError: 1776:0:(events.c:51:request_out_callback()) Skipped 3 previous
similar messages
LustreError: 5909:0:(client.c:950:ptlrpc_expire_one_request()) @@@ timeout
(sent at 1166521905, 0s ago)
LustreError: 5909:0:(client.c:950:ptlrpc_expire_one_request()) Skipped 3
previous similar messages
Lustre: 5909:0:(peer.c:238:lnet_debug_peer()) [EMAIL PROTECTED] 2
up 8 8 8 8 6 0
Lustre: 5909:0:(peer.c:238:lnet_debug_peer()) [EMAIL PROTECTED] 2
up 8 8 8 8 6 0
LustreError: 5171:0:(o2iblnd_cb.c:455:kiblnd_rx_complete()) Rx from [EMAIL
PROTECTED] failed: 5
LustreError: 5171:0:(o2iblnd_cb.c:455:kiblnd_rx_complete()) Skipped 17 previous
similar messages
LustreError: 1775:0:(o2iblnd_cb.c:2314:kiblnd_rejected()) [EMAIL PROTECTED]
rejected: reason 8, size 148
LustreError: 1775:0:(o2iblnd_cb.c:1935:kiblnd_peer_connect_failed()) Deleting
messages for [EMAIL PROTECTED]: connection failed
LustreError: 1775:0:(events.c:51:request_out_callback()) @@@ type 4, status -113
LustreError: 1775:0:(events.c:51:request_out_callback()) Skipped 3 previous
similar messages
LustreError: 5909:0:(client.c:950:ptlrpc_expire_one_request()) @@@ timeout
(sent at 1166521930, 0s ago)
LustreError: 5909:0:(client.c:950:ptlrpc_expire_one_request()) Skipped 3
previous similar messages
Lustre: 5909:0:(peer.c:238:lnet_debug_peer()) [EMAIL PROTECTED] 2
up 8 8 8 8 6 0
Lustre: 5909:0:(peer.c:238:lnet_debug_peer()) [EMAIL PROTECTED] 2
up 8 8 8 8 6 0
LustreError: 5170:0:(o2iblnd_cb.c:455:kiblnd_rx_complete()) Rx from [EMAIL
PROTECTED] failed: 5
LustreError: 5170:0:(o2iblnd_cb.c:455:kiblnd_rx_complete()) Skipped 1 previous
similar message
LustreError: 1775:0:(o2iblnd_cb.c:2314:kiblnd_rejected()) [EMAIL PROTECTED]
rejected: reason 8, size 148
LustreError: 1775:0:(o2iblnd_cb.c:1935:kiblnd_peer_connect_failed()) Deleting
messages for [EMAIL PROTECTED]: connection failed
LustreError: 1775:0:(events.c:51:request_out_callback()) @@@ type 4, status -113
LustreError: 1775:0:(events.c:51:request_out_callback()) Skipped 3 previous
similar messages
LustreError: 5909:0:(client.c:950:ptlrpc_expire_one_request()) @@@ timeout
(sent at 1166521955, 0s ago)
LustreError: 5909:0:(client.c:950:ptlrpc_expire_one_request()) Skipped 3
previous similar messages
Lustre: 5909:0:(peer.c:238:lnet_debug_peer()) [EMAIL PROTECTED] 2
up 8 8 8 8 6 0
Lustre: 5909:0:(peer.c:238:lnet_debug_peer()) [EMAIL PROTECTED] 2
up 8 8 8 8 6 0
LustreError: 5171:0:(o2iblnd_cb.c:455:kiblnd_rx_complete()) Rx from [EMAIL
PROTECTED] failed: 5
LustreError: 5171:0:(o2iblnd_cb.c:455:kiblnd_rx_complete()) Skipped 15 previous
similar messages
LustreError: 1776:0:(o2iblnd_cb.c:2314:kiblnd_rejected()) [EMAIL PROTECTED]
rejected: reason 8, size 148
LustreError: 1776:0:(o2iblnd_cb.c:1935:kiblnd_peer_connect_failed()) Deleting
messages for [EMAIL PROTECTED]: connection failed
LustreError: 1776:0:(events.c:51:request_out_callback()) @@@ type 4, status -113
LustreError: 1776:0:(events.c:51:request_out_callback()) Skipped 3 previous
similar messages
LustreError: 5909:0:(client.c:950:ptlrpc_expire_one_request()) @@@ timeout
(sent at 1166521980, 0s ago)
LustreError: 5909:0:(client.c:950:ptlrpc_expire_one_request()) Skipped 3
previous similar messages
Lustre: 5909:0:(peer.c:238:lnet_debug_peer()) [EMAIL PROTECTED] 2
up 8 8 8 8 6 0
Lustre: 5909:0:(peer.c:238:lnet_debug_peer()) [EMAIL PROTECTED] 2
up 8 8 8 8 6 0
LustreError: 5171:0:(o2iblnd_cb.c:455:kiblnd_rx_complete()) Rx from [EMAIL
PROTECTED] failed: 5
LustreError: 5171:0:(o2iblnd_cb.c:455:kiblnd_rx_complete()) Skipped 15 previous
similar messages
LustreError: 1776:0:(o2iblnd_cb.c:2314:kiblnd_rejected()) [EMAIL PROTECTED]
rejected: reason 8, size 148
LustreError: 1776:0:(o2iblnd_cb.c:1935:kiblnd_peer_connect_failed()) Deleting
messages for [EMAIL PROTECTED]: connection failed
LustreError: 1776:0:(events.c:51:request_out_callback()) @@@ type 4, status -113
LustreError: 1776:0:(events.c:51:request_out_callback()) Skipped 3 previous
similar messages
LustreError: 5909:0:(client.c:950:ptlrpc_expire_one_request()) @@@ timeout
(sent at 1166522005, 0s ago)
LustreError: 5909:0:(client.c:950:ptlrpc_expire_one_request()) Skipped 3
previous similar messages
Lustre: 5909:0:(peer.c:238:lnet_debug_peer()) [EMAIL PROTECTED] 2
up 8 8 8 8 6 0
Lustre: 5909:0:(peer.c:238:lnet_debug_peer()) [EMAIL PROTECTED] 2
up 8 8 8 8 6 0
LustreError: 5170:0:(o2iblnd_cb.c:455:kiblnd_rx_complete()) Rx from [EMAIL
PROTECTED] failed: 5
LustreError: 5170:0:(o2iblnd_cb.c:455:kiblnd_rx_complete()) Skipped 15 previous
similar messages
LustreError: 1775:0:(o2iblnd_cb.c:2314:kiblnd_rejected()) [EMAIL PROTECTED]
rejected: reason 8, size 148
LustreError: 1775:0:(o2iblnd_cb.c:1935:kiblnd_peer_connect_failed()) Deleting
messages for [EMAIL PROTECTED]: connection failed
LustreError: 1775:0:(events.c:51:request_out_callback()) @@@ type 4, status -113
LustreError: 1775:0:(events.c:51:request_out_callback()) Skipped 3 previous
similar messages
LustreError: 5909:0:(client.c:950:ptlrpc_expire_one_request()) @@@ timeout
(sent at 1166522030, 0s ago)
LustreError: 5909:0:(client.c:950:ptlrpc_expire_one_request()) Skipped 3
previous similar messages
Lustre: 5909:0:(peer.c:238:lnet_debug_peer()) [EMAIL PROTECTED] 2
up 8 8 8 8 6 0
Lustre: 5909:0:(peer.c:238:lnet_debug_peer()) [EMAIL PROTECTED] 2
up 8 8 8 8 6 0
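The "status -113" values in the log above look like negated Linux errno codes; that is an assumption, not something the log states. On Linux, errno 113 is EHOSTUNREACH ("No route to host"). A small sketch for pulling the value out of a log line and describing it follows. The helper names parse_status_errno and describe_status are hypothetical, and the text returned by strerror() depends on the platform.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: extract N from "status -N" in a log line.
 * Returns the positive errno value, or 0 if no negative status is found. */
int parse_status_errno(const char *line)
{
    const char *p = strstr(line, "status ");
    long v;

    if (p == NULL)
        return 0;
    v = strtol(p + strlen("status "), NULL, 10);
    return v < 0 ? (int)-v : 0;
}

/* Hypothetical helper: write a short description of the status into buf,
 * assuming the status is a negated errno value. */
void describe_status(const char *line, char *buf, size_t len)
{
    int e = parse_status_errno(line);

    if (e == 0)
        snprintf(buf, len, "no negative status found");
    else
        snprintf(buf, len, "errno %d: %s", e, strerror(e));
}
```

Passing one of the request_out_callback() lines to describe_status() gives a string starting with "errno 113:", followed by the platform's message for that code.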
_______________________________________________
Lustre-discuss mailing list
[email protected]
https://mail.clusterfs.com/mailman/listinfo/lustre-discuss