I have a scratch Lustre filesystem set up on RHEL 6.2 with the Lustre 2.2 (Whamcloud) RPMs. My metadata servers seem to be plagued by this LBUG from time to time.
Any thoughts on what I should do?

[root@metal ~]#
Message from syslogd@metal at Aug 1 09:46:26 ...
kernel: LustreError: 3218:0:(mdt_open.c:1035:mdt_reconstruct_open()) ASSERTION( (!(rc < 0) || (lustre_msg_get_transno(req->rq_repmsg) == 0)) ) failed:

Message from syslogd@metal at Aug 1 09:46:26 ...
kernel:LustreError: 3218:0:(mdt_open.c:1035:mdt_reconstruct_open()) LBUG

Message from syslogd@metal at Aug 1 09:46:26 ...
kernel:Kernel panic - not syncing: LBUG@o2ib:0/0 lens 192/0 e 0 to 0 dl 1343828771 ref 1 fl Interpret:H/0/ffffffff rc 0/ 1
Aug 1 09:46:26 metal kernel: LustreError: 3218:0:(mdt_open.c:1035:mdt_reconstruct_open()) ASSERTION( (!(rc < 0) || (lustre_msg_get_transno(req->rq_repmsg) == 0)) ) failed:

Message from syslogd@metal at Aug 1 09:46:26 ...
kernel: LustreError: 3218:0:(mdt_open.c:1035:mdt_reconstruct_open()) ASSERTION( (!(rc < 0) || (lustre_msg_get_transno(req->rq_repmsg) == 0)) ) failed:

Message from syslogd@metal at Aug 1 09:46:26 ...
kernel: LustreError: 3218:0:(mdt_open.c:1035:mdt_reconstruct_open()) LBUG
Aug 1 09:46:26 metal kernel: LustreError: 3218:0:(mdt_open.c:1035:mdt_reconstruct_open()) LBUG
Aug 1 09:46:26 metal kernel: Pid: 3218, comm: mdt_24
Aug 1 09:46:26 metal kernel:
Aug 1 09:46:26 metal kernel: Call Trace:
Aug 1 09:46:26 metal kernel: [<ffffffffa0422835>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
Aug 1 09:46:26 metal kernel: [<ffffffffa0422d67>] lbug_with_loc+0x47/0xb0 [libcfs]
Aug 1 09:46:26 metal kernel: [<ffffffffa0d6e60b>] mdt_reconstruct_open+0x63b/0x8c0 [mdt]
Aug 1 09:46:26 metal kernel: [<ffffffffa0d620fb>] mdt_reconstruct+0x4b/0xb0 [mdt]
Aug 1 09:46:26 metal kernel: [<ffffffffa0d511f9>] mdt_reint_internal+0x609/0x7b0 [mdt]
Aug 1 09:46:26 metal kernel: [<ffffffffa0d515d5>] mdt_intent_reint+0x185/0x4a0 [mdt]
Aug 1 09:46:26 metal kernel: [<ffffffffa0d50361>] mdt_intent_policy+0x2d1/0x600 [mdt]
Aug 1 09:46:26 metal kernel: [<ffffffffa0668e39>] ldlm_lock_enqueue+0x2f9/0x830 [ptlrpc]
Aug 1 09:46:26 metal kernel: [<ffffffffa0689ef0>] ldlm_handle_enqueue0+0x420/0xd90 [ptlrpc]
Aug 1 09:46:26 metal kernel: [<ffffffffa0d506d6>] mdt_enqueue+0x46/0x130 [mdt]
Aug 1 09:46:26 metal kernel: [<ffffffffa0d47b9d>] mdt_handle_common+0x74d/0x1400 [mdt]
Aug 1 09:46:26 metal kernel: [<ffffffffa0d48925>] mdt_regular_handle+0x15/0x20 [mdt]
Aug 1 09:46:26 metal kernel: [<ffffffffa06b1011>] ptlrpc_server_handle_request+0x3c1/0xcb0 [ptlrpc]
Aug 1 09:46:26 metal kernel: [<ffffffffa04233ee>] ? cfs_timer_arm+0xe/0x10 [libcfs]
Aug 1 09:46:26 metal kernel: [<ffffffffa042de19>] ? lc_watchdog_touch+0x79/0x110 [libcfs]
Aug 1 09:46:26 metal kernel: [<ffffffffa06ab0e2>] ? ptlrpc_wait_event+0xb2/0x2c0 [ptlrpc]
Aug 1 09:46:26 metal kernel: [<ffffffff810519c3>] ? __wake_up+0x53/0x70
Aug 1 09:46:26 metal kernel: [<ffffffffa06b201f>] ptlrpc_main+0x71f/0x1210 [ptlrpc]
Aug 1 09:46:26 metal kernel: [<ffffffffa06b1900>] ? ptlrpc_main+0x0/0x1210 [ptlrpc]
Aug 1 09:46:26 metal kernel: [<ffffffff8100c14a>] child_rip+0xa/0x20
Aug 1 09:46:26 metal kernel: [<ffffffffa06b1900>] ? ptlrpc_main+0x0/0x1210 [ptlrpc]
Aug 1 09:46:26 metal kernel: [<ffffffffa06b1900>] ? ptlrpc_main+0x0/0x1210 [ptlrpc]
Aug 1 09:46:26 metal kernel: [<ffffffff8100c140>] ? child_rip+0x0/0x20
Aug 1 09:46:26 metal kernel:
Aug 1 09:46:26 metal kernel: Kernel panic - not syncing: LBUG

Message from syslogd@metal at Aug 1 09:46:26 ...
kernel:Kernel panic - not syncing: LBUG
Aug 1 09:46:26 metal kernel: Pid: 3218, comm: mdt_24 Not tainted 2.6.32-220.4.2.el6.x86_64 #1
Aug 1 09:46:26 metal kernel: Call Trace:
Aug 1 09:46:26 metal kernel: [<ffffffff814ec61a>] ? panic+0x78/0x143
Aug 1 09:46:26 metal kernel: [<ffffffffa0422dbb>] ? lbug_with_loc+0x9b/0xb0 [libcfs]
Aug 1 09:46:26 metal kernel: [<ffffffffa0d6e60b>] ? mdt_reconstruct_open+0x63b/0x8c0 [mdt]
Aug 1 09:46:26 metal kernel: [<ffffffffa0d620fb>] ? mdt_reconstruct+0x4b/0xb0 [mdt]
Aug 1 09:46:26 metal kernel: [<ffffffffa0d511f9>] ? mdt_reint_internal+0x609/0x7b0 [mdt]
Aug 1 09:46:26 metal kernel: [<ffffffffa0d515d5>] ? mdt_intent_reint+0x185/0x4a0 [mdt]
Aug 1 09:46:26 metal kernel: [<ffffffffa0d50361>] ? mdt_intent_policy+0x2d1/0x600 [mdt]
Aug 1 09:46:26 metal kernel: [<ffffffffa0668e39>] ? ldlm_lock_enqueue+0x2f9/0x830 [ptlrpc]
Aug 1 09:46:26 metal kernel: [<ffffffffa0689ef0>] ? ldlm_handle_enqueue0+0x420/0xd90 [ptlrpc]
Aug 1 09:46:26 metal kernel: [<ffffffffa0d506d6>] ? mdt_enqueue+0x46/0x130 [mdt]
Aug 1 09:46:26 metal kernel: [<ffffffffa0d47b9d>] ? mdt_handle_common+0x74d/0x1400 [mdt]
Aug 1 09:46:26 metal kernel: [<ffffffffa0d48925>] ? mdt_regular_handle+0x15/0x20 [mdt]
Aug 1 09:46:26 metal kernel: [<ffffffffa06b1011>] ? ptlrpc_server_handle_request+0x3c1/0xcb0 [ptlrpc]
Aug 1 09:46:26 metal kernel: [<ffffffffa04233ee>] ? cfs_timer_arm+0xe/0x10 [libcfs]
Aug 1 09:46:26 metal kernel: [<ffffffffa042de19>] ? lc_watchdog_touch+0x79/0x110 [libcfs]
Aug 1 09:46:26 metal kernel: [<ffffffffa06ab0e2>] ? ptlrpc_wait_event+0xb2/0x2c0 [ptlrpc]

_______________________________________________
Lustre-discuss mailing list
[email protected]
http://lists.lustre.org/mailman/listinfo/lustre-discuss
