Hi,

we are experiencing some weird behaviour on one of our Lustre clients.

First some information about our environment:
- Lustre 1.8.1.1 CentOS52 (Kernel 2.6.18-92.1.13.el5)
- self-compiled patchless clients with quotas enabled (quotas not used at the 
moment)
- Infiniband interconnect, OFED 1.3.3
- OSS: 2x Sun Fire X4540 with 48TB, with official sun kernel 
2.6.18-128.7.1.el5_lustre.1.8.1.1, 4
OSTs on
each of them, OFED 1.3.3
- MDS: 2x Sun Fire X4100 M2 + 1x StorEdge 3320 with heartbeat failover, 
official sun kernel
2.6.18-128.7.1.el5_lustre.1.8.1.1, OFED 1.4.2
- Lustre mounted on /hpcscr
- OSS1 is 192.168.60....@o2ib, hostname is hpc9oss1
- OSS2 is 192.168.60....@o2ib, hostname is hpc9oss2
- MDS1 is 192.168.60....@o2ib, hostname is hpc9mds1 (active)
- MDS2 is 192.168.60....@o2ib, hostname is hpc9mds2 (standby, was active for a 
short time while
mds1 was lifted from 1.6.7.1 to 1.8.1.1)
- problematic client is 192.168.60....@o2ib, hostname hpc9master02
- no problems on the Infiniband

Problem:
Users report a slow Lustre filesystem on this particular machine 
(hpc9master02). Running "find
/hpcscr -ls" gets stuck after some time and most of the time it continues after 
some seconds, but
sometimes it takes several minutes and sometimes I get errors (one "I/O error", 
then several "Cannot
send after transport endpoint shutdown") and the find terminates. The IB error 
counters do not change
during this test.

This is the client syslog during a find run that aborted with the I/O error 
(see below for MDS and
OSS syslogs):
The output is shortened and commented, I can provide the complete logs if 
necessary.
--------------------------------------------
Feb  2 15:59:02 hpc9master02 kernel: Lustre: 
5947:0:(o2iblnd_cb.c:459:kiblnd_rx_complete()) Rx from
192.168.60....@o2ib failed: 5
Feb  2 15:59:02 hpc9master02 last message repeated 11 times
Feb  2 15:59:02 hpc9master02 kernel: Lustre: 
5946:0:(o2iblnd_cb.c:459:kiblnd_rx_complete()) Rx from
192.168.60....@o2ib failed: 5
Feb  2 15:59:02 hpc9master02 kernel: Lustre: 
5947:0:(o2iblnd_cb.c:459:kiblnd_rx_complete()) Rx from
192.168.60....@o2ib failed: 5
Feb  2 15:59:02 hpc9master02 kernel: Lustre: 
5947:0:(o2iblnd_cb.c:459:kiblnd_rx_complete()) Rx from
192.168.60....@o2ib failed: 5
Feb  2 15:59:02 hpc9master02 kernel: Lustre: 
5946:0:(o2iblnd_cb.c:459:kiblnd_rx_complete()) Rx from
192.168.60....@o2ib failed: 5
Feb  2 15:59:02 hpc9master02 kernel: Lustre: 
5947:0:(o2iblnd_cb.c:459:kiblnd_rx_complete()) Rx from
192.168.60....@o2ib failed: 5
Feb  2 15:59:02 hpc9master02 kernel: Lustre: 
5946:0:(o2iblnd_cb.c:459:kiblnd_rx_complete()) Rx from
192.168.60....@o2ib failed: 5
Feb  2 15:59:02 hpc9master02 kernel: Lustre: 
5157:0:(o2iblnd_cb.c:2384:kiblnd_reconnect())
192.168.60....@o2ib: retrying (stale), 12, 12, queue_dep: 8, max_frag: 256, 
msg_size: 4096

Then for 192.168.60....@o2ib:
Feb  2 15:59:02 hpc9master02 kernel: Lustre: 
5947:0:(o2iblnd_cb.c:459:kiblnd_rx_complete()) Rx from
192.168.60....@o2ib failed: 5
Feb  2 15:59:03 hpc9master02 kernel: Lustre: 
5157:0:(o2iblnd_cb.c:2384:kiblnd_reconnect())
192.168.60....@o2ib: retrying (stale), 12, 12, queue_dep: 8, max_frag: 256, 
msg_size: 4096

Then for 192.168.60....@o2ib:
Feb  2 15:59:03 hpc9master02 kernel: Lustre: 
5947:0:(o2iblnd_cb.c:459:kiblnd_rx_complete()) Rx from
192.168.60....@o2ib failed: 5
Feb  2 15:59:03 hpc9master02 kernel: Lustre: 
5942:0:(o2iblnd_cb.c:459:kiblnd_rx_complete()) Rx from
192.168.60....@o2ib failed: 5
Feb  2 15:59:03 hpc9master02 kernel: Lustre: 
5942:0:(o2iblnd_cb.c:459:kiblnd_rx_complete()) Rx from
192.168.60....@o2ib failed: 5
Feb  2 15:59:03 hpc9master02 kernel: Lustre: 
5157:0:(o2iblnd_cb.c:2384:kiblnd_reconnect())
192.168.60....@o2ib: retrying (stale), 12, 12, queue_dep: 8, max_frag: 256, 
msg_size: 4096

Again for 192.168.60....@o2ib:
Feb  2 15:59:03 hpc9master02 kernel: Lustre: 
5947:0:(o2iblnd_cb.c:459:kiblnd_rx_complete()) Rx from
192.168.60....@o2ib failed: 5
Feb  2 15:59:03 hpc9master02 kernel: Lustre: 
5157:0:(o2iblnd_cb.c:2384:kiblnd_reconnect())
192.168.60....@o2ib: retrying (stale), 12, 12, queue_dep: 8, max_frag: 256, 
msg_size: 4096

And then the connection gets closed:
Feb  2 16:00:17 hpc9master02 kernel: Lustre: 
5151:0:(o2iblnd_cb.c:1785:kiblnd_close_conn_locked())
Closing conn to 192.168.60....@o2ib: error 0(waiting)
Feb  2 16:00:17 hpc9master02 kernel: LustreError: 
5945:0:(events.c:66:request_out_callback()) @@@
type 4, status -5  r...@ffff811f83619000 x1326214018219145/t0
o101->[email protected]@o2ib:12/10 lens 512/1232 e 0 t
o 1 dl 1265122824 ref 2 fl Rpc:P/0/0 rc 0/0

Afterwards it all starts over again until these messages appear:
Feb  2 16:02:54 hpc9master02 kernel: Lustre: 
29726:0:(client.c:1383:ptlrpc_expire_one_request()) @@@
Request x1326214018348828 sent from cfd1-MDT0000-mdc-ffff811026fc1400 to NID 
192.168.60....@o2ib 7s
ago has timed out (limit 7s).
Feb  2 16:02:54 hpc9master02 kernel:   r...@ffff811ec8cb2c00 
x1326214018348828/t0
o101->[email protected]@o2ib:12/10 lens 568/1232 e 0 to 1 dl 
1265122974 ref 1 fl
Rpc:/0/0 rc 0/0
Feb  2 16:02:54 hpc9master02 kernel: Lustre: 
29726:0:(client.c:1383:ptlrpc_expire_one_request())
Skipped 5 previous similar messages
Feb  2 16:02:54 hpc9master02 kernel: Lustre: cfd1-MDT0000-mdc-ffff811026fc1400: 
Connection to
service cfd1-MDT0000 via nid 192.168.60....@o2ib was lost; in progress 
operations using this service
will wait for recovery to complete.

Followed again by lots of RX failed messages...
--------------------------------------------

This is what the mds reports:
--------------------------------------------
Feb  2 15:58:12 hpc9mds1 kernel: Lustre: 
5008:0:(o2iblnd_cb.c:1785:kiblnd_close_conn_locked())
Closing conn to 192.168.60....@o2ib: error -116
Feb  2 15:59:02 hpc9mds1 last message repeated 4 times
Feb  2 15:59:27 hpc9mds1 last message repeated 3 times
Feb  2 15:59:27 hpc9mds1 kernel: Lustre: 
5008:0:(o2iblnd_cb.c:2232:kiblnd_passive_connect()) Conn
race 192.168.60....@o2ib
Feb  2 15:59:27 hpc9mds1 kernel: Lustre: 
5008:0:(o2iblnd_cb.c:2232:kiblnd_passive_connect()) Skipped
1 previous similar message
Feb  2 15:59:27 hpc9mds1 kernel: Lustre: 
5008:0:(o2iblnd_cb.c:1785:kiblnd_close_conn_locked())
Closing conn to 192.168.60....@o2ib: error -116
Feb  2 16:00:17 hpc9mds1 last message repeated 3 times
Feb  2 16:00:42 hpc9mds1 last message repeated 3 times
Feb  2 16:01:07 hpc9mds1 kernel: Lustre: 
5007:0:(o2iblnd_cb.c:1785:kiblnd_close_conn_locked())
Closing conn to 192.168.60....@o2ib: error -116
Feb  2 16:01:07 hpc9mds1 kernel: Lustre: 
5008:0:(o2iblnd_cb.c:1785:kiblnd_close_conn_locked())
Closing conn to 192.168.60....@o2ib: error -116
Feb  2 16:01:32 hpc9mds1 last message repeated 3 times
Feb  2 16:01:39 hpc9mds1 kernel: Lustre: 
8664:0:(ldlm_lib.c:541:target_handle_reconnect())
cfd1-MDT0000: b80f9710-f288-9c2a-ee70-16c25f8ca97d reconnecting
Feb  2 16:01:39 hpc9mds1 kernel: Lustre: 
8664:0:(ldlm_lib.c:541:target_handle_reconnect()) Skipped 2
previous similar messages
Feb  2 16:01:57 hpc9mds1 kernel: Lustre: 
5008:0:(o2iblnd_cb.c:1785:kiblnd_close_conn_locked())
Closing conn to 192.168.60....@o2ib: error -116
Feb  2 16:02:47 hpc9mds1 last message repeated 4 times
Feb  2 16:02:47 hpc9mds1 kernel: Lustre: 
5008:0:(o2iblnd_cb.c:2232:kiblnd_passive_connect()) Conn
race 192.168.60....@o2ib
Feb  2 16:02:47 hpc9mds1 kernel: Lustre: 
5008:0:(o2iblnd_cb.c:2232:kiblnd_passive_connect()) Skipped
1 previous similar message
Feb  2 16:02:54 hpc9mds1 kernel: Lustre: 
5008:0:(o2iblnd_cb.c:1785:kiblnd_close_conn_locked())
Closing conn to 192.168.60....@o2ib: error -116
Feb  2 16:03:37 hpc9mds1 last message repeated 3 times
Feb  2 16:04:52 hpc9mds1 last message repeated 6 times
Feb  2 16:06:07 hpc9mds1 last message repeated 8 times
Feb  2 16:07:22 hpc9mds1 last message repeated 7 times
Feb  2 16:07:22 hpc9mds1 kernel: Lustre: 
5008:0:(o2iblnd_cb.c:2220:kiblnd_passive_connect()) Conn
stale 192.168.60....@o2ib [old ver: 12, new ver: 12]
Feb  2 16:07:22 hpc9mds1 kernel: Lustre: 
5008:0:(o2iblnd_cb.c:2220:kiblnd_passive_connect()) Skipped
62 previous similar messages
Feb  2 16:07:29 hpc9mds1 kernel: Lustre: 
5008:0:(o2iblnd_cb.c:1785:kiblnd_close_conn_locked())
Closing conn to 192.168.60....@o2ib: error -116
Feb  2 16:07:47 hpc9mds1 kernel: Lustre: 
5008:0:(o2iblnd_cb.c:1785:kiblnd_close_conn_locked())
Closing conn to 192.168.60....@o2ib: error -116(waiting)
Feb  2 16:07:47 hpc9mds1 kernel: LustreError: 
5781:0:(events.c:367:server_bulk_callback()) event
type 4, status -103, desc ffff81010df46d80
Feb  2 16:07:53 hpc9mds1 kernel: LustreError: 
25108:0:(handler.c:161:mds_sendpage()) @@@ bulk
failed: timeout 0(4096), evicting 
b80f9710-f288-9c2a-ee70-16c25f8ca...@net_0x50000c0a83ce2_uuid
Feb  2 16:07:53 hpc9mds1 kernel:   r...@ffff8100a7e2f000 x1326214018567382/t0
o37->b80f9710-f288-9c2a-ee70-16c25f8ca...@net_0x50000c0a83ce2_uuid:0/0 lens 
408/360 e 1 to 0 dl
1265123298 ref 1 fl Interpret:/0/0 rc 0/0
Feb  2 16:07:54 hpc9mds1 kernel: Lustre: 
5007:0:(o2iblnd_cb.c:1785:kiblnd_close_conn_locked())
Closing conn to 192.168.60....@o2ib: error -116
Feb  2 16:08:03 hpc9mds1 kernel: BUG: soft lockup - CPU#1 stuck for 10s! 
[ll_mdt_rdpg_12:25108]
Feb  2 16:08:03 hpc9mds1 kernel: CPU 1:
Feb  2 16:08:03 hpc9mds1 kernel: Modules linked in: mds(U) fsfilt_ldiskfs(U) 
mgs(U) mgc(U)
ldiskfs(U) crc16(U) nfsd(U) exportfs(U) lockd(U) nfs_acl(U) auth_rpcgss(U) 
ipmi_devintf(U)
ipmi_si(U) ipmi_msghandler(U) hidp(U) rfcomm(U
) l2cap(U) bluetooth(U) sunrpc(U) lustre(U) lov(U) mdc(U) lquota(U) osc(U) 
ksocklnd(U) ko2iblnd(U)
ptlrpc(U) obdclass(U) lnet(U) lvfs(U) libcfs(U) rdma_ucm(U) qlgc_vnic(U) 
ib_sdp(U) rdma_cm(U)
iw_cm(U) ib_addr(U) ib_ipoib(U) ipo
ib_helper(U) ib_cm(U) ib_sa(U) ipv6(U) xfrm_nalgo(U) crypto_api(U) ib_uverbs(U) 
ib_umad(U)
iw_cxgb3(U) cxgb3(U) ib_ipath(U) ib_mthca(U) mptctl(U) dm_multipath(U) 
scsi_dh(U) video(U)
backlight(U) sbs(U) i2c_ec(U) button(U) batter
y(U) asus_acpi(U) acpi_memhotplug(U) ac(U) parport_pc(U) lp(U) parport(U) 
mlx4_ib(U) ib_mad(U)
ib_core(U) joydev(U) mlx4_core(U) i2c_nforce2(U) sr_mod(U) pcspkr(U) ide_cd(U) 
serio_raw(U)
i2c_core(U) shpchp(U) e1000(U) k8temp(U)
forcedeth(U) hwmon(U) cdrom(U) sg(U) edac_mc(U) dm_snapshot(U) dm_zero(U) 
dm_mirror(U) dm_log(U)
dm_mod(U) usb_stora
Feb  2 16:08:03 hpc9mds1 kernel: e(U) mptfc(U) scsi_transport_fc(U) mptspi(U) 
scsi_transport_spi(U)
mptsas(U) mptscsih(U) mptbase(U) scsi_transport_sas(U) sd_mod(U) scsi_mod(U) 
ext3(U) jbd(U)
uhci_hcd(U) ohci_hcd(U) ehci_hcd(U)
Feb  2 16:08:03 hpc9mds1 kernel: Pid: 25108, comm: ll_mdt_rdpg_12 Tainted: G
2.6.18-128.7.1.el5_lustre.1.8.1.1 #1
Feb  2 16:08:03 hpc9mds1 kernel: RIP: 0010:[<ffffffff8006de08>]  
[<ffffffff8006de08>]
do_gettimeoffset_hpet+0x7/0x28
Feb  2 16:08:03 hpc9mds1 kernel: RSP: 0018:ffff8102050555b8  EFLAGS: 00000206
Feb  2 16:08:03 hpc9mds1 kernel: RAX: 0000000051f252e8 RBX: ffff8102050555f0 
RCX: 0000000000000383
Feb  2 16:08:03 hpc9mds1 kernel: RDX: 0000000000000040 RSI: 0000000000010000 
RDI: ffff8102050555f0
Feb  2 16:08:03 hpc9mds1 kernel: RBP: 6666666638313031 R08: 0000000000000000 
R09: 0000000000000383
Feb  2 16:08:03 hpc9mds1 kernel: R10: 0000000000000000 R11: 0000000000000000 
R12: 6463313434363031
Feb  2 16:08:03 hpc9mds1 kernel: R13: 0000000000000000 R14: 00000000ffffffff 
R15: ffff81010f780080
Feb  2 16:08:03 hpc9mds1 kernel: FS:  00002b3759c5c220(0000) 
GS:ffff810103f99440(0000)
knlGS:00000000f7f9d6c0
Feb  2 16:08:03 hpc9mds1 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 
000000008005003b
Feb  2 16:08:03 hpc9mds1 kernel: CR2: 00002b3ef0e64240 CR3: 0000000000201000 
CR4: 00000000000006e0
Feb  2 16:08:03 hpc9mds1 kernel:
Feb  2 16:08:03 hpc9mds1 kernel: Call Trace:
Feb  2 16:08:03 hpc9mds1 kernel:  [<ffffffff8006dc92>] do_gettimeofday+0x40/0x8f
Feb  2 16:08:03 hpc9mds1 kernel:  [<ffffffff88636164>] 
:libcfs:set_ptldebug_header+0x34/0xa0
Feb  2 16:08:03 hpc9mds1 kernel:  [<ffffffff8863e300>] 
:libcfs:libcfs_debug_vmsg2+0x70/0x990
Feb  2 16:08:03 hpc9mds1 kernel:  [<ffffffff8878a7ef>] 
:ptlrpc:ptlrpc_set_destroy+0x37f/0x390
Feb  2 16:08:03 hpc9mds1 kernel:  [<ffffffff887555e0>] 
:ptlrpc:ldlm_cancel_locks_for_export_cb+0x0/0xc0
Feb  2 16:08:03 hpc9mds1 kernel:  [<ffffffff887555e0>] 
:ptlrpc:ldlm_cancel_locks_for_export_cb+0x0/0xc0
Feb  2 16:08:03 hpc9mds1 kernel:  [<ffffffff887588d5>] 
:ptlrpc:ldlm_resource_putref+0x395/0x3a0
Feb  2 16:08:03 hpc9mds1 kernel:  [<ffffffff887555e0>] 
:ptlrpc:ldlm_cancel_locks_for_export_cb+0x0/0xc0
Feb  2 16:08:03 hpc9mds1 kernel:  [<ffffffff887578f9>] 
:ptlrpc:ldlm_resource_getref+0xc9/0xe0
Feb  2 16:08:03 hpc9mds1 kernel:  [<ffffffff88755601>] 
:ptlrpc:ldlm_cancel_locks_for_export_cb+0x21/0xc0
Feb  2 16:08:03 hpc9mds1 kernel:  [<ffffffff886d4b3b>] 
:obdclass:lustre_hash_for_each_empty+0x1cb/0x2b0
Feb  2 16:08:03 hpc9mds1 kernel:  [<ffffffff88bf05e1>] 
:mds:mds_disconnect+0x121/0xe40
Feb  2 16:08:03 hpc9mds1 kernel:  [<ffffffff8003d382>] lock_timer_base+0x1b/0x3c
Feb  2 16:08:03 hpc9mds1 kernel:  [<ffffffff8004aa63>] 
try_to_del_timer_sync+0x51/0x5a
Feb  2 16:08:03 hpc9mds1 kernel:  [<ffffffff886d7994>] 
:obdclass:class_fail_export+0x384/0x4c0
Feb  2 16:08:03 hpc9mds1 kernel:  [<ffffffff88bea311>] 
:mds:mds_readpage+0x1591/0x18c0
Feb  2 16:08:03 hpc9mds1 kernel:  [<ffffffff8001a1b8>] vsnprintf+0x559/0x59e
Feb  2 16:08:03 hpc9mds1 kernel:  [<ffffffff8008a3f3>] 
default_wake_function+0x0/0xe
Feb  2 16:08:03 hpc9mds1 kernel:  [<ffffffff88bed732>] 
:mds:mds_handle+0x2172/0x4cd0
Feb  2 16:08:03 hpc9mds1 kernel:  [<ffffffff8863b5d8>] 
:libcfs:libcfs_ip_addr2str+0x38/0x40
Feb  2 16:08:03 hpc9mds1 kernel:  [<ffffffff8863b9ee>] 
:libcfs:libcfs_nid2str+0xbe/0x110
Feb  2 16:08:04 hpc9mds1 kernel:  [<ffffffff887a2357>] 
:ptlrpc:ptlrpc_server_handle_request+0xa97/0x1170
Feb  2 16:08:04 hpc9mds1 kernel:  [<ffffffff8008881d>] 
__wake_up_common+0x3e/0x68
Feb  2 16:08:04 hpc9mds1 kernel:  [<ffffffff887a5e08>] 
:ptlrpc:ptlrpc_main+0x1218/0x13e0
Feb  2 16:08:04 hpc9mds1 kernel:  [<ffffffff8008a3f3>] 
default_wake_function+0x0/0xe
Feb  2 16:08:04 hpc9mds1 kernel:  [<ffffffff800b48f2>] 
audit_syscall_exit+0x33c/0x357
Feb  2 16:08:04 hpc9mds1 kernel:  [<ffffffff8005dfb1>] child_rip+0xa/0x11
Feb  2 16:08:04 hpc9mds1 kernel:  [<ffffffff887a4bf0>] 
:ptlrpc:ptlrpc_main+0x0/0x13e0
Feb  2 16:08:04 hpc9mds1 kernel:  [<ffffffff8005dfa7>] child_rip+0x0/0x11
Feb  2 16:08:04 hpc9mds1 kernel:
Feb  2 16:08:12 hpc9mds1 kernel: Lustre: 
5007:0:(o2iblnd_cb.c:1785:kiblnd_close_conn_locked())
Closing conn to 192.168.60....@o2ib: error -116
Feb  2 16:08:13 hpc9mds1 kernel: BUG: soft lockup - CPU#1 stuck for 10s! 
[ll_mdt_rdpg_12:25108]
Feb  2 16:08:13 hpc9mds1 kernel: CPU 1:
Feb  2 16:08:13 hpc9mds1 kernel: Modules linked in: mds(U) fsfilt_ldiskfs(U) 
mgs(U) mgc(U)
ldiskfs(U) crc16(U) nfsd(U) exportfs(U) lockd(U) nfs_acl(U) auth_rpcgss(U) 
ipmi_devintf(U)
ipmi_si(U) ipmi_msghandler(U) hidp(U) rfcomm(U
) l2cap(U) bluetooth(U) sunrpc(U) lustre(U) lov(U) mdc(U) lquota(U) osc(U) 
ksocklnd(U) ko2iblnd(U)
ptlrpc(U) obdclass(U) lnet(U) lvfs(U) libcfs(U) rdma_ucm(U) qlgc_vnic(U) 
ib_sdp(U) rdma_cm(U)
iw_cm(U) ib_addr(U) ib_ipoib(U) ipo
ib_helper(U) ib_cm(U) ib_sa(U) ipv6(U) xfrm_nalgo(U) crypto_api(U) ib_uverbs(U) 
ib_umad(U)
iw_cxgb3(U) cxgb3(U) ib_ipath(U) ib_mthca(U) mptctl(U) dm_multipath(U) 
scsi_dh(U) video(U)
backlight(U) sbs(U) i2c_ec(U) button(U) batter
y(U) asus_acpi(U) acpi_memhotplug(U) ac(U) parport_pc(U) lp(U) parport(U) 
mlx4_ib(U) ib_mad(U)
ib_core(U) joydev(U) mlx4_core(U) i2c_nforce2(U) sr_mod(U) pcspkr(U) ide_cd(U) 
serio_raw(U)
i2c_core(U) shpchp(U) e1000(U) k8temp(U)
forcedeth(U) hwmon(U) cdrom(U) sg(U) edac_mc(U) dm_snapshot(U) dm_zero(U) 
dm_mirror(U) dm_log(U)
dm_mod(U) usb_stora
Feb  2 16:08:13 hpc9mds1 kernel: e(U) mptfc(U) scsi_transport_fc(U) mptspi(U) 
scsi_transport_spi(U)
mptsas(U) mptscsih(U) mptbase(U) scsi_transport_sas(U) sd_mod(U) scsi_mod(U) 
ext3(U) jbd(U)
uhci_hcd(U) ohci_hcd(U) ehci_hcd(U)
Feb  2 16:08:13 hpc9mds1 kernel: Pid: 25108, comm: ll_mdt_rdpg_12 Tainted: G
2.6.18-128.7.1.el5_lustre.1.8.1.1 #1
Feb  2 16:08:13 hpc9mds1 kernel: RIP: 0010:[<ffffffff80064ba8>]  
[<ffffffff80064ba8>]
_spin_unlock_irqrestore+0x8/0x9
Feb  2 16:08:13 hpc9mds1 kernel: RSP: 0018:ffff810205055578  EFLAGS: 00000293
Feb  2 16:08:13 hpc9mds1 kernel: RAX: 000000000000037e RBX: ffff81021f8382a0 
RCX: 0000000000000001
Feb  2 16:08:13 hpc9mds1 kernel: RDX: 000000000000032e RSI: 0000000000000293 
RDI: ffff81010f780080
Feb  2 16:08:13 hpc9mds1 kernel: RBP: ffff810205055610 R08: 00000000ffffffff 
R09: 0000000000000020
Feb  2 16:08:13 hpc9mds1 kernel: R10: 0000000000000000 R11: 0000000000000000 
R12: 0000000000000000
Feb  2 16:08:13 hpc9mds1 kernel: R13: ffffffff8001a1b8 R14: ffff8100d9680000 
R15: 0000000000000000
Feb  2 16:08:13 hpc9mds1 kernel: FS:  00002b3759c5c220(0000) 
GS:ffff810103f99440(0000)
knlGS:00000000f7f9d6c0
Feb  2 16:08:13 hpc9mds1 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 
000000008005003b
Feb  2 16:08:13 hpc9mds1 kernel: CR2: 00002b3ef0e64240 CR3: 0000000000201000 
CR4: 00000000000006e0
Feb  2 16:08:13 hpc9mds1 kernel:
Feb  2 16:08:13 hpc9mds1 kernel: Call Trace:
Feb  2 16:08:13 hpc9mds1 kernel:  [<ffffffff8863e978>] 
:libcfs:libcfs_debug_vmsg2+0x6e8/0x990
Feb  2 16:08:13 hpc9mds1 kernel:  [<ffffffff887555e0>] 
:ptlrpc:ldlm_cancel_locks_for_export_cb+0x0/0xc0
Feb  2 16:08:13 hpc9mds1 kernel:  [<ffffffff88753935>] 
:ptlrpc:ldlm_lock_put+0x3c5/0x3d0
Feb  2 16:08:13 hpc9mds1 kernel:  [<ffffffff887555e0>] 
:ptlrpc:ldlm_cancel_locks_for_export_cb+0x0/0xc0
Feb  2 16:08:13 hpc9mds1 kernel:  [<ffffffff88771f31>] 
:ptlrpc:ldlm_export_lock_put+0xd1/0xe0
Feb  2 16:08:13 hpc9mds1 kernel:  [<ffffffff887555e0>] 
:ptlrpc:ldlm_cancel_locks_for_export_cb+0x0/0xc0
Feb  2 16:08:13 hpc9mds1 kernel:  [<ffffffff886d523c>] 
:obdclass:lustre_hash_del+0x33c/0x370
Feb  2 16:08:13 hpc9mds1 kernel:  [<ffffffff8875311a>] 
:ptlrpc:ldlm_lock_destroy_internal+0x1ca/0x250
Feb  2 16:08:13 hpc9mds1 kernel:  [<ffffffff88755408>] 
:ptlrpc:ldlm_lock_destroy_nolock+0x68/0xd0
Feb  2 16:08:13 hpc9mds1 kernel:  [<ffffffff88755548>] 
:ptlrpc:ldlm_lock_cancel+0xd8/0x170
Feb  2 16:08:13 hpc9mds1 kernel:  [<ffffffff8875566e>] 
:ptlrpc:ldlm_cancel_locks_for_export_cb+0x8e/0xc0
Feb  2 16:08:13 hpc9mds1 kernel:  [<ffffffff886d4b3b>] 
:obdclass:lustre_hash_for_each_empty+0x1cb/0x2b0
Feb  2 16:08:13 hpc9mds1 kernel:  [<ffffffff88bf05e1>] 
:mds:mds_disconnect+0x121/0xe40
Feb  2 16:08:13 hpc9mds1 kernel:  [<ffffffff8003d382>] lock_timer_base+0x1b/0x3c
Feb  2 16:08:13 hpc9mds1 kernel:  [<ffffffff8004aa63>] 
try_to_del_timer_sync+0x51/0x5a
Feb  2 16:08:13 hpc9mds1 kernel:  [<ffffffff886d7994>] 
:obdclass:class_fail_export+0x384/0x4c0
Feb  2 16:08:13 hpc9mds1 kernel:  [<ffffffff88bea311>] 
:mds:mds_readpage+0x1591/0x18c0
Feb  2 16:08:13 hpc9mds1 kernel:  [<ffffffff8001a1b8>] vsnprintf+0x559/0x59e
Feb  2 16:08:14 hpc9mds1 kernel:  [<ffffffff8008a3f3>] 
default_wake_function+0x0/0xe
Feb  2 16:08:14 hpc9mds1 kernel:  [<ffffffff88bed732>] 
:mds:mds_handle+0x2172/0x4cd0
Feb  2 16:08:14 hpc9mds1 kernel:  [<ffffffff8863b5d8>] 
:libcfs:libcfs_ip_addr2str+0x38/0x40
Feb  2 16:08:14 hpc9mds1 kernel:  [<ffffffff8863b9ee>] 
:libcfs:libcfs_nid2str+0xbe/0x110
Feb  2 16:08:14 hpc9mds1 kernel:  [<ffffffff887a2357>] 
:ptlrpc:ptlrpc_server_handle_request+0xa97/0x1170
Feb  2 16:08:14 hpc9mds1 kernel:  [<ffffffff8008881d>] 
__wake_up_common+0x3e/0x68
Feb  2 16:08:14 hpc9mds1 kernel:  [<ffffffff887a5e08>] 
:ptlrpc:ptlrpc_main+0x1218/0x13e0
Feb  2 16:08:14 hpc9mds1 kernel:  [<ffffffff8008a3f3>] 
default_wake_function+0x0/0xe
Feb  2 16:08:14 hpc9mds1 kernel:  [<ffffffff800b48f2>] 
audit_syscall_exit+0x33c/0x357
Feb  2 16:08:14 hpc9mds1 kernel:  [<ffffffff8005dfb1>] child_rip+0xa/0x11
Feb  2 16:08:14 hpc9mds1 kernel:  [<ffffffff887a4bf0>] 
:ptlrpc:ptlrpc_main+0x0/0x13e0
Feb  2 16:08:14 hpc9mds1 kernel:  [<ffffffff8005dfa7>] child_rip+0x0/0x11
Feb  2 16:08:14 hpc9mds1 kernel:
Feb  2 16:08:19 hpc9mds1 kernel: Lustre: 
5007:0:(o2iblnd_cb.c:1785:kiblnd_close_conn_locked())
Closing conn to 192.168.60....@o2ib: error -116
Feb  2 16:08:23 hpc9mds1 kernel: BUG: soft lockup - CPU#1 stuck for 10s! 
[ll_mdt_rdpg_12:25108]
Feb  2 16:08:23 hpc9mds1 kernel: CPU 1:
Feb  2 16:08:23 hpc9mds1 kernel: Modules linked in: mds(U) fsfilt_ldiskfs(U) 
mgs(U) mgc(U)
ldiskfs(U) crc16(U) nfsd(U) exportfs(U) lockd(U) nfs_acl(U) auth_rpcgss(U) 
ipmi_devintf(U)
ipmi_si(U) ipmi_msghandler(U) hidp(U) rfcomm(U
) l2cap(U) bluetooth(U) sunrpc(U) lustre(U) lov(U) mdc(U) lquota(U) osc(U) 
ksocklnd(U) ko2iblnd(U)
ptlrpc(U) obdclass(U) lnet(U) lvfs(U) libcfs(U) rdma_ucm(U) qlgc_vnic(U) 
ib_sdp(U) rdma_cm(U)
iw_cm(U) ib_addr(U) ib_ipoib(U) ipo
ib_helper(U) ib_cm(U) ib_sa(U) ipv6(U) xfrm_nalgo(U) crypto_api(U) ib_uverbs(U) 
ib_umad(U)
iw_cxgb3(U) cxgb3(U) ib_ipath(U) ib_mthca(U) mptctl(U) dm_multipath(U) 
scsi_dh(U) video(U)
backlight(U) sbs(U) i2c_ec(U) button(U) batter
y(U) asus_acpi(U) acpi_memhotplug(U) ac(U) parport_pc(U) lp(U) parport(U) 
mlx4_ib(U) ib_mad(U)
ib_core(U) joydev(U) mlx4_core(U) i2c_nforce2(U) sr_mod(U) pcspkr(U) ide_cd(U) 
serio_raw(U)
i2c_core(U) shpchp(U) e1000(U) k8temp(U)
forcedeth(U) hwmon(U) cdrom(U) sg(U) edac_mc(U) dm_snapshot(U) dm_zero(U) 
dm_mirror(U) dm_log(U)
dm_mod(U) usb_stora
Feb  2 16:08:23 hpc9mds1 kernel: e(U) mptfc(U) scsi_transport_fc(U) mptspi(U) 
scsi_transport_spi(U)
mptsas(U) mptscsih(U) mptbase(U) scsi_transport_sas(U) sd_mod(U) scsi_mod(U) 
ext3(U) jbd(U)
uhci_hcd(U) ohci_hcd(U) ehci_hcd(U)
Feb  2 16:08:23 hpc9mds1 kernel: Pid: 25108, comm: ll_mdt_rdpg_12 Tainted: G
2.6.18-128.7.1.el5_lustre.1.8.1.1 #1
Feb  2 16:08:23 hpc9mds1 kernel: RIP: 0010:[<ffffffff886d4b88>]  
[<ffffffff886d4b88>]
:obdclass:lustre_hash_for_each_empty+0x218/0x2b0
Feb  2 16:08:23 hpc9mds1 kernel: RSP: 0018:ffff810205055830  EFLAGS: 00000206
Feb  2 16:08:23 hpc9mds1 kernel: RAX: 000000000000ffff RBX: 0000000000004ce1 
RCX: 0000000000000001
Feb  2 16:08:23 hpc9mds1 kernel: RDX: 0000000000000c20 RSI: 0000000000000293 
RDI: ffffc20005c3be1c
Feb  2 16:08:23 hpc9mds1 kernel: RBP: ffffffff88753935 R08: 00000000ffffffff 
R09: 0000000000000020
Feb  2 16:08:23 hpc9mds1 kernel: R10: 0000000000000000 R11: 0000000000000000 
R12: ffffffff887555e0
Feb  2 16:08:23 hpc9mds1 kernel: R13: ffff8101f16b9df8 R14: ffff8101e17ed888 
R15: ffff8101f16b9dc0
Feb  2 16:08:23 hpc9mds1 kernel: FS:  00002b3759c5c220(0000) 
GS:ffff810103f99440(0000)
knlGS:00000000f7f9d6c0
Feb  2 16:08:23 hpc9mds1 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 
000000008005003b
Feb  2 16:08:23 hpc9mds1 kernel: CR2: 00002b3ef0e64240 CR3: 0000000000201000 
CR4: 00000000000006e0
Feb  2 16:08:23 hpc9mds1 kernel:
Feb  2 16:08:23 hpc9mds1 kernel: Call Trace:
Feb  2 16:08:23 hpc9mds1 kernel:  [<ffffffff88bf05e1>] 
:mds:mds_disconnect+0x121/0xe40
Feb  2 16:08:23 hpc9mds1 kernel:  [<ffffffff8003d382>] lock_timer_base+0x1b/0x3c
Feb  2 16:08:23 hpc9mds1 kernel:  [<ffffffff8004aa63>] 
try_to_del_timer_sync+0x51/0x5a
Feb  2 16:08:23 hpc9mds1 kernel:  [<ffffffff886d7994>] 
:obdclass:class_fail_export+0x384/0x4c0
Feb  2 16:08:24 hpc9mds1 kernel:  [<ffffffff88bea311>] 
:mds:mds_readpage+0x1591/0x18c0
Feb  2 16:08:24 hpc9mds1 kernel:  [<ffffffff8001a1b8>] vsnprintf+0x559/0x59e
Feb  2 16:08:24 hpc9mds1 kernel:  [<ffffffff8008a3f3>] 
default_wake_function+0x0/0xe
Feb  2 16:08:24 hpc9mds1 kernel:  [<ffffffff88bed732>] 
:mds:mds_handle+0x2172/0x4cd0
Feb  2 16:08:24 hpc9mds1 kernel:  [<ffffffff8863b5d8>] 
:libcfs:libcfs_ip_addr2str+0x38/0x40
Feb  2 16:08:24 hpc9mds1 kernel:  [<ffffffff8863b9ee>] 
:libcfs:libcfs_nid2str+0xbe/0x110
Feb  2 16:08:24 hpc9mds1 kernel:  [<ffffffff887a2357>] 
:ptlrpc:ptlrpc_server_handle_request+0xa97/0x1170
Feb  2 16:08:24 hpc9mds1 kernel:  [<ffffffff8008881d>] 
__wake_up_common+0x3e/0x68
Feb  2 16:08:24 hpc9mds1 kernel:  [<ffffffff887a5e08>] 
:ptlrpc:ptlrpc_main+0x1218/0x13e0
Feb  2 16:08:24 hpc9mds1 kernel:  [<ffffffff8008a3f3>] 
default_wake_function+0x0/0xe
Feb  2 16:08:24 hpc9mds1 kernel:  [<ffffffff800b48f2>] 
audit_syscall_exit+0x33c/0x357
Feb  2 16:08:24 hpc9mds1 kernel:  [<ffffffff8005dfb1>] child_rip+0xa/0x11
Feb  2 16:08:24 hpc9mds1 kernel:  [<ffffffff887a4bf0>] 
:ptlrpc:ptlrpc_main+0x0/0x13e0
Feb  2 16:08:24 hpc9mds1 kernel:  [<ffffffff8005dfa7>] child_rip+0x0/0x11
Feb  2 16:08:24 hpc9mds1 kernel:
Feb  2 16:08:37 hpc9mds1 kernel: Lustre: 
5007:0:(o2iblnd_cb.c:1785:kiblnd_close_conn_locked())
Closing conn to 192.168.60....@o2ib: error -116
Feb  2 16:08:44 hpc9mds1 kernel: Lustre: 
5007:0:(o2iblnd_cb.c:1785:kiblnd_close_conn_locked())
Closing conn to 192.168.60....@o2ib: error -116
Feb  2 16:08:47 hpc9mds1 kernel: BUG: soft lockup - CPU#1 stuck for 10s! 
[ll_mdt_rdpg_12:25108]
Feb  2 16:08:47 hpc9mds1 kernel: CPU 1:
Feb  2 16:08:47 hpc9mds1 kernel: Modules linked in: mds(U) fsfilt_ldiskfs(U) 
mgs(U) mgc(U)
ldiskfs(U) crc16(U) nfsd(U) exportfs(U) lockd(U) nfs_acl(U) auth_rpcgss(U) 
ipmi_devintf(U)
ipmi_si(U) ipmi_msghandler(U) hidp(U) rfcomm(U
) l2cap(U) bluetooth(U) sunrpc(U) lustre(U) lov(U) mdc(U) lquota(U) osc(U) 
ksocklnd(U) ko2iblnd(U)
ptlrpc(U) obdclass(U) lnet(U) lvfs(U) libcfs(U) rdma_ucm(U) qlgc_vnic(U) 
ib_sdp(U) rdma_cm(U)
iw_cm(U) ib_addr(U) ib_ipoib(U) ipo
ib_helper(U) ib_cm(U) ib_sa(U) ipv6(U) xfrm_nalgo(U) crypto_api(U) ib_uverbs(U) 
ib_umad(U)
iw_cxgb3(U) cxgb3(U) ib_ipath(U) ib_mthca(U) mptctl(U) dm_multipath(U) 
scsi_dh(U) video(U)
backlight(U) sbs(U) i2c_ec(U) button(U) batter
y(U) asus_acpi(U) acpi_memhotplug(U) ac(U) parport_pc(U) lp(U) parport(U) 
mlx4_ib(U) ib_mad(U)
ib_core(U) joydev(U) mlx4_core(U) i2c_nforce2(U) sr_mod(U) pcspkr(U) ide_cd(U) 
serio_raw(U)
i2c_core(U) shpchp(U) e1000(U) k8temp(U)
forcedeth(U) hwmon(U) cdrom(U) sg(U) edac_mc(U) dm_snapshot(U) dm_zero(U) 
dm_mirror(U) dm_log(U)
dm_mod(U) usb_stora
Feb  2 16:08:47 hpc9mds1 kernel: e(U) mptfc(U) scsi_transport_fc(U) mptspi(U) 
scsi_transport_spi(U)
mptsas(U) mptscsih(U) mptbase(U) scsi_transport_sas(U) sd_mod(U) scsi_mod(U) 
ext3(U) jbd(U)
uhci_hcd(U) ohci_hcd(U) ehci_hcd(U)
Feb  2 16:08:47 hpc9mds1 kernel: Pid: 25108, comm: ll_mdt_rdpg_12 Tainted: G
2.6.18-128.7.1.el5_lustre.1.8.1.1 #1
Feb  2 16:08:47 hpc9mds1 kernel: RIP: 0010:[<ffffffff886d4b88>]  
[<ffffffff886d4b88>]
:obdclass:lustre_hash_for_each_empty+0x218/0x2b0
Feb  2 16:08:47 hpc9mds1 kernel: RSP: 0018:ffff810205055830  EFLAGS: 00000206
Feb  2 16:08:47 hpc9mds1 kernel: RAX: 000000000000ffff RBX: 0000000000001d7e 
RCX: 0000000000000001
Feb  2 16:08:47 hpc9mds1 kernel: RDX: 000000000000029a RSI: 0000000000000293 
RDI: ffffc20005c0c7ec
Feb  2 16:08:47 hpc9mds1 kernel: RBP: ffffffff88753935 R08: 00000000ffffffff 
R09: 0000000000000020
Feb  2 16:08:47 hpc9mds1 kernel: R10: 0000000000000000 R11: 0000000000000000 
R12: ffffffff887555e0
Feb  2 16:08:47 hpc9mds1 kernel: R13: ffff8101f16b9df8 R14: ffff810092b5ae88 
R15: ffff8101f16b9dc0
Feb  2 16:08:47 hpc9mds1 kernel: FS:  00002b3759c5c220(0000) 
GS:ffff810103f99440(0000)
knlGS:00000000f7f9d6c0
Feb  2 16:08:47 hpc9mds1 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 
000000008005003b
Feb  2 16:08:47 hpc9mds1 kernel: CR2: 00002b3ef0e64240 CR3: 0000000000201000 
CR4: 00000000000006e0
Feb  2 16:08:47 hpc9mds1 kernel:
Feb  2 16:08:47 hpc9mds1 kernel: Call Trace:
Feb  2 16:08:47 hpc9mds1 kernel:  [<ffffffff88bf05e1>] 
:mds:mds_disconnect+0x121/0xe40
Feb  2 16:08:47 hpc9mds1 kernel:  [<ffffffff8003d382>] lock_timer_base+0x1b/0x3c
Feb  2 16:08:47 hpc9mds1 kernel:  [<ffffffff8004aa63>] 
try_to_del_timer_sync+0x51/0x5a
Feb  2 16:08:48 hpc9mds1 kernel:  [<ffffffff886d7994>] 
:obdclass:class_fail_export+0x384/0x4c0
Feb  2 16:08:48 hpc9mds1 kernel:  [<ffffffff88bea311>] 
:mds:mds_readpage+0x1591/0x18c0
Feb  2 16:08:48 hpc9mds1 kernel:  [<ffffffff8001a1b8>] vsnprintf+0x559/0x59e
Feb  2 16:08:48 hpc9mds1 kernel:  [<ffffffff8008a3f3>] 
default_wake_function+0x0/0xe
Feb  2 16:08:48 hpc9mds1 kernel:  [<ffffffff88bed732>] 
:mds:mds_handle+0x2172/0x4cd0
Feb  2 16:08:48 hpc9mds1 kernel:  [<ffffffff8863b5d8>] 
:libcfs:libcfs_ip_addr2str+0x38/0x40
Feb  2 16:08:48 hpc9mds1 kernel:  [<ffffffff8863b9ee>] 
:libcfs:libcfs_nid2str+0xbe/0x110
Feb  2 16:08:48 hpc9mds1 kernel:  [<ffffffff887a2357>] 
:ptlrpc:ptlrpc_server_handle_request+0xa97/0x1170
Feb  2 16:08:48 hpc9mds1 kernel:  [<ffffffff8008881d>] 
__wake_up_common+0x3e/0x68
Feb  2 16:08:48 hpc9mds1 kernel:  [<ffffffff887a5e08>] 
:ptlrpc:ptlrpc_main+0x1218/0x13e0
Feb  2 16:08:48 hpc9mds1 kernel:  [<ffffffff8008a3f3>] 
default_wake_function+0x0/0xe
Feb  2 16:08:48 hpc9mds1 kernel:  [<ffffffff800b48f2>] 
audit_syscall_exit+0x33c/0x357
Feb  2 16:08:48 hpc9mds1 kernel:  [<ffffffff8005dfb1>] child_rip+0xa/0x11
Feb  2 16:08:48 hpc9mds1 kernel:  [<ffffffff887a4bf0>] 
:ptlrpc:ptlrpc_main+0x0/0x13e0
Feb  2 16:08:48 hpc9mds1 kernel:  [<ffffffff8005dfa7>] child_rip+0x0/0x11
Feb  2 16:08:48 hpc9mds1 kernel:
Feb  2 16:09:01 hpc9mds1 kernel: BUG: soft lockup - CPU#1 stuck for 10s! 
[ll_mdt_rdpg_12:25108]
Feb  2 16:09:01 hpc9mds1 kernel: CPU 1:
Feb  2 16:09:01 hpc9mds1 kernel: Modules linked in: mds(U) fsfilt_ldiskfs(U) 
mgs(U) mgc(U)
ldiskfs(U) crc16(U) nfsd(U) exportfs(U) lockd(U) nfs_acl(U) auth_rpcgss(U) 
ipmi_devintf(U)
ipmi_si(U) ipmi_msghandler(U) hidp(U) rfcomm(U
) l2cap(U) bluetooth(U) sunrpc(U) lustre(U) lov(U) mdc(U) lquota(U) osc(U) 
ksocklnd(U) ko2iblnd(U)
ptlrpc(U) obdclass(U) lnet(U) lvfs(U) libcfs(U) rdma_ucm(U) qlgc_vnic(U) 
ib_sdp(U) rdma_cm(U)
iw_cm(U) ib_addr(U) ib_ipoib(U) ipo
ib_helper(U) ib_cm(U) ib_sa(U) ipv6(U) xfrm_nalgo(U) crypto_api(U) ib_uverbs(U) 
ib_umad(U)
iw_cxgb3(U) cxgb3(U) ib_ipath(U) ib_mthca(U) mptctl(U) dm_multipath(U) 
scsi_dh(U) video(U)
backlight(U) sbs(U) i2c_ec(U) button(U) batter
y(U) asus_acpi(U) acpi_memhotplug(U) ac(U) parport_pc(U) lp(U) parport(U) 
mlx4_ib(U) ib_mad(U)
ib_core(U) joydev(U) mlx4_core(U) i2c_nforce2(U) sr_mod(U) pcspkr(U) ide_cd(U) 
serio_raw(U)
i2c_core(U) shpchp(U) e1000(U) k8temp(U)
forcedeth(U) hwmon(U) cdrom(U) sg(U) edac_mc(U) dm_snapshot(U) dm_zero(U) 
dm_mirror(U) dm_log(U)
dm_mod(U) usb_stora
Feb  2 16:09:01 hpc9mds1 kernel: e(U) mptfc(U) scsi_transport_fc(U) mptspi(U) 
scsi_transport_spi(U)
mptsas(U) mptscsih(U) mptbase(U) scsi_transport_sas(U) sd_mod(U) scsi_mod(U) 
ext3(U) jbd(U)
uhci_hcd(U) ohci_hcd(U) ehci_hcd(U)
Feb  2 16:09:01 hpc9mds1 kernel: Pid: 25108, comm: ll_mdt_rdpg_12 Tainted: G
2.6.18-128.7.1.el5_lustre.1.8.1.1 #1
Feb  2 16:09:01 hpc9mds1 kernel: RIP: 0010:[<ffffffff8006de08>]  
[<ffffffff8006de08>]
do_gettimeoffset_hpet+0x7/0x28
Feb  2 16:09:01 hpc9mds1 kernel: RSP: 0018:ffff810205055548  EFLAGS: 00000202
Feb  2 16:09:01 hpc9mds1 kernel: RAX: 00000000a85c775f RBX: ffff810205055580 
RCX: 00000000000003c3
Feb  2 16:09:01 hpc9mds1 kernel: RDX: 0000000000000001 RSI: 0000000000010000 
RDI: ffff810205055580
Feb  2 16:09:01 hpc9mds1 kernel: RBP: ffffffff8006dc92 R08: 0000000000000000 
R09: 00000000000003c3
Feb  2 16:09:01 hpc9mds1 kernel: R10: 0000000000000000 R11: 0000000000000000 
R12: 0000000000000010
Feb  2 16:09:01 hpc9mds1 kernel: R13: ffff81010f780080 R14: ffff8100a9034330 
R15: 0000009a00000000
Feb  2 16:09:01 hpc9mds1 kernel: FS:  00002b3759c5c220(0000) 
GS:ffff810103f99440(0000)
knlGS:00000000f7f9d6c0
Feb  2 16:09:01 hpc9mds1 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 
000000008005003b
Feb  2 16:09:01 hpc9mds1 kernel: CR2: 00002b3ef0e64240 CR3: 0000000000201000 
CR4: 00000000000006e0
Feb  2 16:09:01 hpc9mds1 kernel:
Feb  2 16:09:01 hpc9mds1 kernel: Call Trace:
Feb  2 16:09:01 hpc9mds1 kernel:  [<ffffffff8006dc92>] do_gettimeofday+0x40/0x8f
Feb  2 16:09:01 hpc9mds1 kernel:  [<ffffffff88636164>] 
:libcfs:set_ptldebug_header+0x34/0xa0
Feb  2 16:09:01 hpc9mds1 kernel:  [<ffffffff8863e300>] 
:libcfs:libcfs_debug_vmsg2+0x70/0x990
Feb  2 16:09:01 hpc9mds1 kernel:  [<ffffffff8875086c>] 
:ptlrpc:_ldlm_lock_debug+0x57c/0x6e0
Feb  2 16:09:01 hpc9mds1 kernel:  [<ffffffff8878a7ef>] 
:ptlrpc:ptlrpc_set_destroy+0x37f/0x390
Feb  2 16:09:01 hpc9mds1 kernel:  [<ffffffff887555e0>] 
:ptlrpc:ldlm_cancel_locks_for_export_cb+0x0/0xc0
Feb  2 16:09:01 hpc9mds1 kernel:  [<ffffffff887555e0>] 
:ptlrpc:ldlm_cancel_locks_for_export_cb+0x0/0xc0
Feb  2 16:09:01 hpc9mds1 kernel:  [<ffffffff887588d5>] 
:ptlrpc:ldlm_resource_putref+0x395/0x3a0
Feb  2 16:09:01 hpc9mds1 kernel:  [<ffffffff88753703>] 
:ptlrpc:ldlm_lock_put+0x193/0x3d0
Feb  2 16:09:01 hpc9mds1 kernel:  [<ffffffff88771ecf>] 
:ptlrpc:ldlm_export_lock_put+0x6f/0xe0
Feb  2 16:09:01 hpc9mds1 kernel:  [<ffffffff8875567e>] 
:ptlrpc:ldlm_cancel_locks_for_export_cb+0x9e/0xc0
Feb  2 16:09:01 hpc9mds1 kernel:  [<ffffffff886d4b79>] 
:obdclass:lustre_hash_for_each_empty+0x209/0x2b0
Feb  2 16:09:01 hpc9mds1 kernel:  [<ffffffff88bf05e1>] 
:mds:mds_disconnect+0x121/0xe40
Feb  2 16:09:01 hpc9mds1 kernel:  [<ffffffff8003d382>] lock_timer_base+0x1b/0x3c
Feb  2 16:09:01 hpc9mds1 kernel:  [<ffffffff8004aa63>] 
try_to_del_timer_sync+0x51/0x5a
Feb  2 16:09:01 hpc9mds1 kernel:  [<ffffffff886d7994>] 
:obdclass:class_fail_export+0x384/0x4c0
Feb  2 16:09:01 hpc9mds1 kernel:  [<ffffffff88bea311>] 
:mds:mds_readpage+0x1591/0x18c0
Feb  2 16:09:01 hpc9mds1 kernel:  [<ffffffff8001a1b8>] vsnprintf+0x559/0x59e
Feb  2 16:09:02 hpc9mds1 kernel:  [<ffffffff8008a3f3>] 
default_wake_function+0x0/0xe
Feb  2 16:09:02 hpc9mds1 kernel:  [<ffffffff88bed732>] 
:mds:mds_handle+0x2172/0x4cd0
Feb  2 16:09:02 hpc9mds1 kernel:  [<ffffffff8863b5d8>] 
:libcfs:libcfs_ip_addr2str+0x38/0x40
Feb  2 16:09:02 hpc9mds1 kernel:  [<ffffffff8863b9ee>] 
:libcfs:libcfs_nid2str+0xbe/0x110
Feb  2 16:09:02 hpc9mds1 kernel:  [<ffffffff887a2357>] 
:ptlrpc:ptlrpc_server_handle_request+0xa97/0x1170
Feb  2 16:09:02 hpc9mds1 kernel:  [<ffffffff8008881d>] 
__wake_up_common+0x3e/0x68
Feb  2 16:09:02 hpc9mds1 kernel:  [<ffffffff887a5e08>] 
:ptlrpc:ptlrpc_main+0x1218/0x13e0
Feb  2 16:09:02 hpc9mds1 kernel:  [<ffffffff8008a3f3>] 
default_wake_function+0x0/0xe
Feb  2 16:09:02 hpc9mds1 kernel:  [<ffffffff800b48f2>] 
audit_syscall_exit+0x33c/0x357
Feb  2 16:09:02 hpc9mds1 kernel:  [<ffffffff8005dfb1>] child_rip+0xa/0x11
Feb  2 16:09:02 hpc9mds1 kernel:  [<ffffffff887a4bf0>] 
:ptlrpc:ptlrpc_main+0x0/0x13e0
Feb  2 16:09:02 hpc9mds1 kernel:  [<ffffffff8005dfa7>] child_rip+0x0/0x11
Feb  2 16:09:02 hpc9mds1 kernel:
Feb  2 16:09:02 hpc9mds1 kernel: Lustre: 
5007:0:(o2iblnd_cb.c:1785:kiblnd_close_conn_locked())
Closing conn to 192.168.60....@o2ib: error -116
Feb  2 16:09:09 hpc9mds1 kernel: Lustre: 
5007:0:(o2iblnd_cb.c:1785:kiblnd_close_conn_locked())
Closing conn to 192.168.60....@o2ib: error -116
Feb  2 16:09:11 hpc9mds1 kernel: BUG: soft lockup - CPU#1 stuck for 10s! 
[ll_mdt_rdpg_12:25108]
Feb  2 16:09:11 hpc9mds1 kernel: CPU 1:
Feb  2 16:09:11 hpc9mds1 kernel: Modules linked in: mds(U) fsfilt_ldiskfs(U) 
mgs(U) mgc(U)
ldiskfs(U) crc16(U) nfsd(U) exportfs(U) lockd(U) nfs_acl(U) auth_rpcgss(U) 
ipmi_devintf(U)
ipmi_si(U) ipmi_msghandler(U) hidp(U) rfcomm(U
) l2cap(U) bluetooth(U) sunrpc(U) lustre(U) lov(U) mdc(U) lquota(U) osc(U) 
ksocklnd(U) ko2iblnd(U)
ptlrpc(U) obdclass(U) lnet(U) lvfs(U) libcfs(U) rdma_ucm(U) qlgc_vnic(U) 
ib_sdp(U) rdma_cm(U)
iw_cm(U) ib_addr(U) ib_ipoib(U) ipo
ib_helper(U) ib_cm(U) ib_sa(U) ipv6(U) xfrm_nalgo(U) crypto_api(U) ib_uverbs(U) 
ib_umad(U)
iw_cxgb3(U) cxgb3(U) ib_ipath(U) ib_mthca(U) mptctl(U) dm_multipath(U) 
scsi_dh(U) video(U)
backlight(U) sbs(U) i2c_ec(U) button(U) batter
y(U) asus_acpi(U) acpi_memhotplug(U) ac(U) parport_pc(U) lp(U) parport(U) 
mlx4_ib(U) ib_mad(U)
ib_core(U) joydev(U) mlx4_core(U) i2c_nforce2(U) sr_mod(U) pcspkr(U) ide_cd(U) 
serio_raw(U)
i2c_core(U) shpchp(U) e1000(U) k8temp(U)
forcedeth(U) hwmon(U) cdrom(U) sg(U) edac_mc(U) dm_snapshot(U) dm_zero(U) 
dm_mirror(U) dm_log(U)
dm_mod(U) usb_stora
Feb  2 16:09:11 hpc9mds1 kernel: e(U) mptfc(U) scsi_transport_fc(U) mptspi(U) 
scsi_transport_spi(U)
mptsas(U) mptscsih(U) mptbase(U) scsi_transport_sas(U) sd_mod(U) scsi_mod(U) 
ext3(U) jbd(U)
uhci_hcd(U) ohci_hcd(U) ehci_hcd(U)
Feb  2 16:09:11 hpc9mds1 kernel: Pid: 25108, comm: ll_mdt_rdpg_12 Tainted: G
2.6.18-128.7.1.el5_lustre.1.8.1.1 #1
Feb  2 16:09:11 hpc9mds1 kernel: RIP: 0010:[<ffffffff80064aee>]  
[<ffffffff80064aee>]
_write_lock+0x7/0xf
Feb  2 16:09:11 hpc9mds1 kernel: RSP: 0018:ffff810205055828  EFLAGS: 00000246
Feb  2 16:09:11 hpc9mds1 kernel: RAX: 000000000000ffff RBX: 0000000000000c49 
RCX: 0000000000000001
Feb  2 16:09:11 hpc9mds1 kernel: RDX: 0000000000000ec0 RSI: 0000000000000293 
RDI: ffffc20005bfb49c
Feb  2 16:09:11 hpc9mds1 kernel: RBP: ffff8101f16b9df8 R08: 00000000ffffffff 
R09: 0000000000000020
Feb  2 16:09:11 hpc9mds1 kernel: R10: 0000000000000000 R11: 0000000000000000 
R12: ffff8100c6087e88
Feb  2 16:09:11 hpc9mds1 kernel: R13: ffff8101f16b9dc0 R14: ffff8100c6086380 
R15: ffff8100c6087e00
Feb  2 16:09:11 hpc9mds1 kernel: FS:  00002b3759c5c220(0000) 
GS:ffff810103f99440(0000)
knlGS:00000000f7f9d6c0
Feb  2 16:09:11 hpc9mds1 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 
000000008005003b
Feb  2 16:09:11 hpc9mds1 kernel: CR2: 00002b3ef0e64240 CR3: 0000000000201000 
CR4: 00000000000006e0
Feb  2 16:09:11 hpc9mds1 kernel:
Feb  2 16:09:11 hpc9mds1 kernel: Call Trace:
Feb  2 16:09:11 hpc9mds1 kernel:  [<ffffffff886d4ba7>] 
:obdclass:lustre_hash_for_each_empty+0x237/0x2b0
Feb  2 16:09:11 hpc9mds1 kernel:  [<ffffffff88bf05e1>] 
:mds:mds_disconnect+0x121/0xe40
Feb  2 16:09:11 hpc9mds1 kernel:  [<ffffffff8003d382>] lock_timer_base+0x1b/0x3c
Feb  2 16:09:11 hpc9mds1 kernel:  [<ffffffff8004aa63>] 
try_to_del_timer_sync+0x51/0x5a
Feb  2 16:09:12 hpc9mds1 kernel:  [<ffffffff886d7994>] 
:obdclass:class_fail_export+0x384/0x4c0
Feb  2 16:09:12 hpc9mds1 kernel:  [<ffffffff88bea311>] 
:mds:mds_readpage+0x1591/0x18c0
Feb  2 16:09:12 hpc9mds1 kernel:  [<ffffffff8001a1b8>] vsnprintf+0x559/0x59e
Feb  2 16:09:12 hpc9mds1 kernel:  [<ffffffff8008a3f3>] 
default_wake_function+0x0/0xe
Feb  2 16:09:12 hpc9mds1 kernel:  [<ffffffff88bed732>] 
:mds:mds_handle+0x2172/0x4cd0
Feb  2 16:09:12 hpc9mds1 kernel:  [<ffffffff8863b5d8>] 
:libcfs:libcfs_ip_addr2str+0x38/0x40
Feb  2 16:09:12 hpc9mds1 kernel:  [<ffffffff8863b9ee>] 
:libcfs:libcfs_nid2str+0xbe/0x110
Feb  2 16:09:12 hpc9mds1 kernel:  [<ffffffff887a2357>] 
:ptlrpc:ptlrpc_server_handle_request+0xa97/0x1170
Feb  2 16:09:12 hpc9mds1 kernel:  [<ffffffff8008881d>] 
__wake_up_common+0x3e/0x68
Feb  2 16:09:12 hpc9mds1 kernel:  [<ffffffff887a5e08>] 
:ptlrpc:ptlrpc_main+0x1218/0x13e0
Feb  2 16:09:12 hpc9mds1 kernel:  [<ffffffff8008a3f3>] 
default_wake_function+0x0/0xe
Feb  2 16:09:12 hpc9mds1 kernel:  [<ffffffff800b48f2>] 
audit_syscall_exit+0x33c/0x357
Feb  2 16:09:12 hpc9mds1 kernel:  [<ffffffff8005dfb1>] child_rip+0xa/0x11
Feb  2 16:09:12 hpc9mds1 kernel:  [<ffffffff887a4bf0>] 
:ptlrpc:ptlrpc_main+0x0/0x13e0
Feb  2 16:09:12 hpc9mds1 kernel:  [<ffffffff8005dfa7>] child_rip+0x0/0x11
Feb  2 16:09:12 hpc9mds1 kernel:
Feb  2 16:09:27 hpc9mds1 kernel: Lustre: 
5007:0:(o2iblnd_cb.c:1785:kiblnd_close_conn_locked())
Closing conn to 192.168.60....@o2ib: error -116
Feb  2 16:09:34 hpc9mds1 kernel: Lustre: 
5007:0:(o2iblnd_cb.c:1785:kiblnd_close_conn_locked())
Closing conn to 192.168.60....@o2ib: error -116
--------------------------------------------

The OSS1 shows this:
--------------------------------------------
Feb  2 15:58:37 hpc9oss1 kernel: Lustre: 
7938:0:(o2iblnd_cb.c:1785:kiblnd_close_conn_locked())
Closing conn to 192.168.60....@o2ib: error -116
Feb  2 15:59:02 hpc9oss1 last message repeated 3 times
Feb  2 15:59:27 hpc9oss1 kernel: Lustre: 
7936:0:(o2iblnd_cb.c:1785:kiblnd_close_conn_locked())
Closing conn to 192.168.60....@o2ib: error -116
Feb  2 15:59:59 hpc9oss1 last message repeated 3 times
Feb  2 16:01:07 hpc9oss1 last message repeated 5 times
Feb  2 16:01:07 hpc9oss1 kernel: Lustre: 
7936:0:(o2iblnd_cb.c:1785:kiblnd_close_conn_locked())
Closing conn to 192.168.60....@o2ib: error -116
Feb  2 16:01:32 hpc9oss1 kernel: Lustre: 
7941:0:(o2iblnd_cb.c:1785:kiblnd_close_conn_locked())
Closing conn to 192.168.60....@o2ib: error -116
Feb  2 16:01:39 hpc9oss1 kernel: Lustre: 
7938:0:(o2iblnd_cb.c:1785:kiblnd_close_conn_locked())
Closing conn to 192.168.60....@o2ib: error -116
Feb  2 16:02:22 hpc9oss1 last message repeated 3 times
Feb  2 16:02:22 hpc9oss1 kernel: Lustre: 
7938:0:(o2iblnd_cb.c:1785:kiblnd_close_conn_locked())
Closing conn to 192.168.60....@o2ib: error -116
Feb  2 16:02:47 hpc9oss1 kernel: Lustre: 
7941:0:(o2iblnd_cb.c:1785:kiblnd_close_conn_locked())
Closing conn to 192.168.60....@o2ib: error -116
Feb  2 16:02:48 hpc9oss1 kernel: Lustre: 
7941:0:(o2iblnd_cb.c:1785:kiblnd_close_conn_locked())
Closing conn to 192.168.60....@o2ib: error -116
Feb  2 16:03:12 hpc9oss1 kernel: Lustre: 
7938:0:(o2iblnd_cb.c:1785:kiblnd_close_conn_locked())
Closing conn to 192.168.60....@o2ib: error -116
Feb  2 16:03:38 hpc9oss1 last message repeated 3 times
Feb  2 16:04:02 hpc9oss1 kernel: Lustre: 
7937:0:(o2iblnd_cb.c:1785:kiblnd_close_conn_locked())
Closing conn to 192.168.60....@o2ib: error -116
Feb  2 16:04:04 hpc9oss1 kernel: Lustre: 
7937:0:(o2iblnd_cb.c:1785:kiblnd_close_conn_locked())
Closing conn to 192.168.60....@o2ib: error -116
Feb  2 16:04:21 hpc9oss1 kernel: Lustre: 
10054:0:(ldlm_lib.c:541:target_handle_reconnect())
cfd1-OST0002: b80f9710-f288-9c2a-ee70-16c25f8ca97d reconnecting
Feb  2 16:04:21 hpc9oss1 kernel: Lustre: 
10054:0:(ldlm_lib.c:541:target_handle_reconnect()) Skipped
1 previous similar message
Feb  2 16:04:27 hpc9oss1 kernel: Lustre: 
7937:0:(o2iblnd_cb.c:1785:kiblnd_close_conn_locked())
Closing conn to 192.168.60....@o2ib: error -116
Feb  2 16:05:17 hpc9oss1 last message repeated 4 times
Feb  2 16:06:07 hpc9oss1 last message repeated 5 times
Feb  2 16:06:32 hpc9oss1 kernel: Lustre: 
7938:0:(o2iblnd_cb.c:1785:kiblnd_close_conn_locked())
Closing conn to 192.168.60....@o2ib: error -116
Feb  2 16:06:57 hpc9oss1 last message repeated 2 times
Feb  2 16:06:57 hpc9oss1 kernel: Lustre: 
7938:0:(o2iblnd_cb.c:2232:kiblnd_passive_connect()) Conn
race 192.168.60....@o2ib
Feb  2 16:06:57 hpc9oss1 kernel: Lustre: 
7938:0:(o2iblnd_cb.c:2232:kiblnd_passive_connect()) Skipped
1 previous similar message
Feb  2 16:07:06 hpc9oss1 kernel: Lustre: 
7938:0:(o2iblnd_cb.c:1785:kiblnd_close_conn_locked())
Closing conn to 192.168.60....@o2ib: error -116
Feb  2 16:07:22 hpc9oss1 kernel: Lustre: 
7942:0:(o2iblnd_cb.c:1785:kiblnd_close_conn_locked())
Closing conn to 192.168.60....@o2ib: error -116
Feb  2 16:07:29 hpc9oss1 kernel: Lustre: 
7937:0:(o2iblnd_cb.c:1785:kiblnd_close_conn_locked())
Closing conn to 192.168.60....@o2ib: error -116
Feb  2 16:08:19 hpc9oss1 last message repeated 2 times
Feb  2 16:08:19 hpc9oss1 kernel: Lustre: 
7937:0:(o2iblnd_cb.c:2220:kiblnd_passive_connect()) Conn
stale 192.168.60....@o2ib [old ver: 12, new ver: 12]
Feb  2 16:08:19 hpc9oss1 kernel: Lustre: 
7937:0:(o2iblnd_cb.c:2220:kiblnd_passive_connect()) Skipped
46 previous similar messages
Feb  2 16:08:37 hpc9oss1 kernel: Lustre: 
7937:0:(o2iblnd_cb.c:1785:kiblnd_close_conn_locked())
Closing conn to 192.168.60....@o2ib: error -116
Feb  2 16:09:09 hpc9oss1 last message repeated 3 times
Feb  2 16:09:34 hpc9oss1 last message repeated 2 times
Feb  2 16:09:52 hpc9oss1 kernel: Lustre: 
7941:0:(o2iblnd_cb.c:1785:kiblnd_close_conn_locked())
Closing conn to 192.168.60....@o2ib: error -116
Feb  2 16:09:59 hpc9oss1 kernel: Lustre: 
7937:0:(o2iblnd_cb.c:1785:kiblnd_close_conn_locked())
Closing conn to 192.168.60....@o2ib: error -116
Feb  2 16:10:42 hpc9oss1 last message repeated 3 times
--------------------------------------------

The OSS2 shows this:
--------------------------------------------
Feb  2 15:58:37 hpc9oss2 kernel: Lustre: 
7947:0:(o2iblnd_cb.c:1785:kiblnd_close_conn_locked())
Closing conn to 192.168.60....@o2ib: error -116
Feb  2 15:58:37 hpc9oss2 kernel: Lustre: 
7947:0:(o2iblnd_cb.c:1785:kiblnd_close_conn_locked())
Closing conn to 192.168.60....@o2ib: error -116
Feb  2 15:59:02 hpc9oss2 kernel: Lustre: 
7942:0:(o2iblnd_cb.c:1785:kiblnd_close_conn_locked())
Closing conn to 192.168.60....@o2ib: error -116
Feb  2 15:59:52 hpc9oss2 last message repeated 4 times
Feb  2 16:00:42 hpc9oss2 last message repeated 5 times
Feb  2 16:01:07 hpc9oss2 kernel: Lustre: 
7948:0:(o2iblnd_cb.c:1785:kiblnd_close_conn_locked())
Closing conn to 192.168.60....@o2ib: error -116
Feb  2 16:01:07 hpc9oss2 kernel: Lustre: 
7948:0:(o2iblnd_cb.c:1785:kiblnd_close_conn_locked())
Closing conn to 192.168.60....@o2ib: error -116
Feb  2 16:01:32 hpc9oss2 kernel: Lustre: 
7944:0:(o2iblnd_cb.c:1785:kiblnd_close_conn_locked())
Closing conn to 192.168.60....@o2ib: error -116
Feb  2 16:02:04 hpc9oss2 last message repeated 3 times
Feb  2 16:02:04 hpc9oss2 kernel: Lustre: 
10237:0:(ldlm_lib.c:541:target_handle_reconnect())
cfd1-OST0007: b80f9710-f288-9c2a-ee70-16c25f8ca97d reconnecting
Feb  2 16:02:22 hpc9oss2 kernel: Lustre: 
7944:0:(o2iblnd_cb.c:1785:kiblnd_close_conn_locked())
Closing conn to 192.168.60....@o2ib: error -116
Feb  2 16:03:12 hpc9oss2 last message repeated 4 times
Feb  2 16:04:27 hpc9oss2 last message repeated 6 times
Feb  2 16:05:42 hpc9oss2 last message repeated 6 times
Feb  2 16:06:32 hpc9oss2 last message repeated 5 times
Feb  2 16:06:57 hpc9oss2 kernel: Lustre: 
7947:0:(o2iblnd_cb.c:1785:kiblnd_close_conn_locked())
Closing conn to 192.168.60....@o2ib: error -116
Feb  2 16:07:04 hpc9oss2 last message repeated 2 times
Feb  2 16:07:04 hpc9oss2 kernel: Lustre: 
10027:0:(ldlm_lib.c:541:target_handle_reconnect())
cfd1-OST0004: b80f9710-f288-9c2a-ee70-16c25f8ca97d reconnecting
Feb  2 16:07:22 hpc9oss2 kernel: Lustre: 
7944:0:(o2iblnd_cb.c:1785:kiblnd_close_conn_locked())
Closing conn to 192.168.60....@o2ib: error -116
Feb  2 16:08:07 hpc9oss2 last message repeated 3 times
Feb  2 16:08:07 hpc9oss2 kernel: Lustre: 
7944:0:(o2iblnd_cb.c:2220:kiblnd_passive_connect()) Conn
stale 192.168.60....@o2ib [old ver: 12, new ver: 12]
Feb  2 16:08:07 hpc9oss2 kernel: Lustre: 
7944:0:(o2iblnd_cb.c:2220:kiblnd_passive_connect()) Skipped
47 previous similar messages
Feb  2 16:08:12 hpc9oss2 kernel: Lustre: 
7944:0:(o2iblnd_cb.c:1785:kiblnd_close_conn_locked())
Closing conn to 192.168.60....@o2ib: error -116
Feb  2 16:09:02 hpc9oss2 last message repeated 4 times
Feb  2 16:09:09 hpc9oss2 kernel: Lustre: 
7944:0:(o2iblnd_cb.c:1785:kiblnd_close_conn_locked())
Closing conn to 192.168.60....@o2ib: error -116
Feb  2 16:09:27 hpc9oss2 kernel: Lustre: 
7943:0:(o2iblnd_cb.c:1785:kiblnd_close_conn_locked())
Closing conn to 192.168.60....@o2ib: error -116
Feb  2 16:09:59 hpc9oss2 last message repeated 3 times
--------------------------------------------

What's the cause of those "Rx from 192.168.60....@o2ib failed: 5"? Any ideas 
what's going on here?

Thanks,

Ulrich Sibiller



-- 
Vorstand/Board of Management:
Dr. Bernd Finkbeiner, Dr. Roland Niemeier, 
Dr. Arno Steitz, Dr. Ingrid Zech
Vorsitzender des Aufsichtsrats/
Chairman of the Supervisory Board:
Michel Lepert
Sitz/Registered Office: Tuebingen
Registergericht/Registration Court: Stuttgart
Registernummer/Commercial Register No.: HRB 382196 


_______________________________________________
Lustre-discuss mailing list
[email protected]
http://lists.lustre.org/mailman/listinfo/lustre-discuss

Reply via email to