Hi,

Over the last few weeks we have hit a kernel stack trace several times, and afterwards the ocfs2 filesystems stop responding on all nodes (ls produces no output).
Kern.log at node-2:
----------------------------------------------------------------------------
Oct 3 06:57:18 XXX kernel: (7178,0):dlm_drop_lockres_ref:2291 ERROR: while dropping ref on 6EDBC1B22BBB4E28AD9453CD5B2F60C3:M000000000000000007f06600000000 (master=0) got -22.
Oct 3 06:57:18 XXX kernel: (7178,0):dlm_print_one_lock_resource:50 lockres: M000000000000000007f06600000000, owner=0, state=64
Oct 3 06:57:18 XXX kernel: (7178,0):__dlm_print_one_lock_resource:82 lockres: M000000000000000007f06600000000, owner=0, state=64
Oct 3 06:57:18 XXX kernel: (7178,0):__dlm_print_one_lock_resource:84 last used: 49827182, on purge list: yes
Oct 3 06:57:18 XXX kernel: (7178,0):dlm_print_lockres_refmap:61 refmap nodes: [ ], inflight=0
Oct 3 06:57:18 XXX kernel: (7178,0):__dlm_print_one_lock_resource:86 granted queue:
Oct 3 06:57:18 XXX kernel: (7178,0):__dlm_print_one_lock_resource:101 converting queue:
Oct 3 06:57:18 XXX kernel: (7178,0):__dlm_print_one_lock_resource:116 blocked queue:
Oct 3 06:57:20 XXX kernel: ------------[ cut here ]------------
Oct 3 06:57:20 XXX kernel: kernel BUG at fs/ocfs2/dlm/dlmmaster.c:2293!
Oct 3 06:57:20 XXX kernel: invalid opcode: 0000 [#1] SMP
Oct 3 06:57:20 XXX kernel: Modules linked in: ocfs2 xt_multiport nf_conntrack_ipv4 xt_state nf_conntrack iptable_filter dm_round_robin dm_rdac ocfs2_dlmfs ocfs2_dlm ocfs2_nodemanager configfs dm_multipath dm_mod qla2xxx
Oct 3 06:57:20 XXX kernel:
Oct 3 06:57:20 XXX kernel: Pid: 7178, comm: dlm_thread Not tainted (2.6.25.5-qla2xxx-mpath-fw-cluster-hm64 #1)
Oct 3 06:57:20 XXX kernel: EIP: 0060:[<f8eebd11>] EFLAGS: 00010286 CPU: 0
Oct 3 06:57:20 XXX kernel: EIP is at dlm_drop_lockres_ref+0x1c1/0x280 [ocfs2_dlm]
Oct 3 06:57:20 XXX kernel: EAX: e79268a8 EBX: f7118600 ECX: c06a6ca4 EDX: 00000092
Oct 3 06:57:20 XXX kernel: ESI: ffffffea EDI: f5b21eff EBP: 0000001f ESP: f5b21ea4
Oct 3 06:57:20 XXX kernel: DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
Oct 3 06:57:20 XXX kernel: Process dlm_thread (pid: 7178, ti=f5b20000 task=f72ec430 task.ti=f5b20000)
Oct 3 06:57:20 XXX kernel: Stack: f8efebec 00001c0a 00000000 f8ef9cd2 000008f3 f599b940 0000001f ede9c460
Oct 3 06:57:20 XXX kernel: 00000000 ffffffea e7926880 f7118600 ede9c460 00000000 1f010000 3030304d
Oct 3 06:57:20 XXX kernel: 30303030 30303030 30303030 66373030 30363630 30303030 00303030 00000000
Oct 3 06:57:20 XXX kernel: Call Trace:
Oct 3 06:57:20 XXX kernel: [<f8edf347>] dlm_thread+0x327/0x1420 [ocfs2_dlm]
Oct 3 06:57:20 XXX kernel: [<c011beb9>] hrtick_set+0x69/0x140
Oct 3 06:57:20 XXX kernel: [<c0133180>] autoremove_wake_function+0x0/0x50
Oct 3 06:57:20 XXX kernel: [<f8edf020>] dlm_thread+0x0/0x1420 [ocfs2_dlm]
Oct 3 06:57:20 XXX kernel: [<c0132e92>] kthread+0x42/0x70
Oct 3 06:57:20 XXX kernel: [<c0132e50>] kthread+0x0/0x70
Oct 3 06:57:20 XXX kernel: [<c0103a17>] kernel_thread_helper+0x7/0x10
Oct 3 06:57:20 XXX kernel: =======================
Oct 3 06:57:20 XXX kernel: Code: d2 9c ef f8 89 54 24 08 89 44 24 14 8b 81 d8 00 00 00 c7 04 24 ec eb ef f8 89 44 24 04 e8 98 55 23 c7 8b 44 24 28 e8 3f 2c ff ff <0f> 0b eb fe 3d 00 fe ff ff 0f 95 c2 83 f8 fc 0f 95 c0 84 d0 0f
Oct 3 06:57:20 XXX kernel: EIP: [<f8eebd11>] dlm_drop_lockres_ref+0x1c1/0x280 [ocfs2_dlm] SS:ESP 0068:f5b21ea4
Oct 3 06:57:20 XXX kernel: ---[ end trace 52ed3dea72cac956 ]---
----------------------------------------------------------------------------

kern.log at node-1:
Oct 3 06:57:18 XXX kernel: (5799,1):dlm_deref_lockres_handler:2336 ERROR: 6EDBC1B22BBB4E28AD9453CD5B2F60C3:M000000000000000007f06600000000: bad lockres name

# uname -r: 2.6.25.5
# debugfs.ocfs2 -V
debugfs.ocfs2 1.4.1
# dmesg
OCFS2 Node Manager 1.5.0
OCFS2 DLM 1.5.0
OCFS2 DLMFS 1.5.0

We have 2 nodes in the cluster and the freeze was observed on both nodes. Only a reboot solves the problem.

Any help appreciated.

Christian van Barneveld
_______________________________________________
Ocfs2-users mailing list
[email protected]
http://oss.oracle.com/mailman/listinfo/ocfs2-users
