RO Holders: 1  EX Holders: 0

So node 18 wants to upgrade to EX. For that to happen, node 17 has to downgrade
from PR. But it cannot, because there is 1 RO (readonly) holder. If you are
using NFS and see an nfsd in a D state, then that would be it.

I've just released 1.4.7, in which this issue has been addressed.
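[Editorial aside: the tell-tale pattern above (a Busy lockres with a pending Convert on one node, and a Blocked lockres with a nonzero RO holder count on another) can be spotted mechanically in the fs_locks dumps quoted later in this thread. Below is a minimal, best-effort sketch of a hypothetical parser — it is not part of ocfs2-tools, and field names may differ across ocfs2-tools versions:]

```python
import re

def parse_fs_lock(text):
    """Parse one debugfs.ocfs2 'fs_locks' dump into a dict of the
    fields relevant to this hang. Best-effort: keyed to the output
    format quoted in this thread."""
    info = {}
    m = re.search(r"Lockres:\s*(\S+)\s+Mode:\s*(.+)", text)
    if m:
        info["lockres"], info["mode"] = m.group(1), m.group(2).strip()
    m = re.search(r"Flags:\s*(.+)", text)
    if m:
        info["flags"] = m.group(1).split()
    m = re.search(r"RO Holders:\s*(\d+)\s+EX Holders:\s*(\d+)", text)
    if m:
        info["ro_holders"] = int(m.group(1))
        info["ex_holders"] = int(m.group(2))
    return info

def looks_stuck(info):
    # A lockres that is Busy (convert never completed), or Blocked
    # while still holding RO references (cannot downgrade), matches
    # the deadlock described above.
    flags = set(info.get("flags", []))
    return "Busy" in flags or ("Blocked" in flags and info.get("ro_holders", 0) > 0)
```

Feeding it the www2 dump from this thread would flag the lockres as Busy; feeding it the www1 dump would flag it as Blocked with 1 RO holder — the two halves of the deadlock.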
Sunil

Brad Plant wrote:
> Hi Sunil,
>
> I managed to collect the fs_locks and dlm_locks output on both nodes this
> time. www1 is node 17 while www2 is node 18. I had to reboot www1 to fix the
> problem, but of course www1 couldn't unmount the file system, so the other
> nodes saw it as a crash.
>
> Both nodes are running 2.6.18-164.15.1.el5.centos.plusxen with the matching
> ocfs2 1.4.4-1 rpm downloaded from
> http://oss.oracle.com/projects/ocfs2/files/RedHat/RHEL5/x86_64/.
>
> Do you make anything of this?
>
> I read that there is going to be a new ocfs2 release soon. I'm sure there are
> lots of bug fixes, but are there any in there that you think might solve this
> problem?
>
> Cheers,
>
> Brad
>
>
> www2 ~ # ./scanlocks2
> /dev/xvdd3  M0000000000000000095a0300000000
>
> www2 ~ # debugfs.ocfs2 -R "fs_locks M0000000000000000095a0300000000" /dev/xvdd3 | cat
> Lockres: M0000000000000000095a0300000000  Mode: Protected Read
> Flags: Initialized Attached Busy
> RO Holders: 0  EX Holders: 0
> Pending Action: Convert  Pending Unlock Action: None
> Requested Mode: Exclusive  Blocking Mode: No Lock
> PR > Gets: 6802  Fails: 0  Waits (usec) Total: 0  Max: 0
> EX > Gets: 16340  Fails: 0  Waits (usec) Total: 12000  Max: 8000
> Disk Refreshes: 0
>
> www2 ~ # debugfs.ocfs2 -R "dlm_locks M0000000000000000095a0300000000" /dev/xvdd3 | cat
> Lockres: M0000000000000000095a0300000000  Owner: 18  State: 0x0
> Last Used: 0  ASTs Reserved: 0  Inflight: 0  Migration Pending: No
> Refs: 4  Locks: 2  On Lists: None
> Reference Map: 17
>  Lock-Queue  Node  Level  Conv  Cookie       Refs  AST  BAST  Pending-Action
>  Granted     17    PR     -1    17:62487955  2     No   No    None
>  Converting  18    PR     EX    18:6599867   2     No   No    None
>
>
> www1 ~ # ./scanlocks2
>
> www1 ~ # debugfs.ocfs2 -R "fs_locks M0000000000000000095a0300000000" /dev/xvdd3 | cat
> Lockres: M0000000000000000095a0300000000  Mode: Protected Read
> Flags: Initialized Attached Blocked Queued
> RO Holders: 1  EX Holders: 0
> Pending Action: None  Pending Unlock Action: None
> Requested Mode: Protected Read  Blocking Mode: Exclusive
> PR > Gets: 110  Fails: 3  Waits (usec) Total: 32000  Max: 12000
> EX > Gets: 0  Fails: 0  Waits (usec) Total: 0  Max: 0
> Disk Refreshes: 0
>
> www1 ~ # debugfs.ocfs2 -R "dlm_locks M0000000000000000095a0300000000" /dev/xvdd3 | cat
> Lockres: M0000000000000000095a0300000000  Owner: 18  State: 0x0
> Last Used: 0  ASTs Reserved: 0  Inflight: 0  Migration Pending: No
> Refs: 3  Locks: 1  On Lists: None
> Reference Map:
>  Lock-Queue  Node  Level  Conv  Cookie       Refs  AST  BAST  Pending-Action
>  Granted     17    PR     -1    17:62487955  2     No   No    None
>
>
> On Fri, 19 Mar 2010 08:48:39 -0700
> Sunil Mushran <sunil.mush...@oracle.com> wrote:
>
>> In findpath <lockname>, the lockname needs to be in angular brackets.
>>
>> Did you manage to trap the oops stack trace of the crash?
>>
>> So the dlm on the master says that node 250 has a PR, but the fs_locks
>> on 250 says that it has requested a PR but not gotten a reply back as yet.
>> Next time also dump the dlm_lock on 250. (The message flow is: the fs on
>> 250 talks to the dlm on 250, which talks to the dlm on the master, which
>> may have to talk to other nodes but eventually replies to the dlm on 250,
>> which then pings the fs on that node. The roundtrip takes a couple hundred
>> usecs on gige.)
>>
>> Running a mix of localflock and not is not advisable. Not the end of the
>> world, though. It depends on how flocks are being used.
>>
>> Is this a mix of virtual and physical boxes?
>>
>> Brad Plant wrote:
>>
>>> Hi Sunil,
>>>
>>> I seem to have struck this issue, although I'm not using nfs. I've got
>>> other processes stuck in the D state. It's a mail server and the
>>> processes are postfix and courier-imap.
>>> As per your instructions, I've run scanlocks2 and debugfs.ocfs2:
>>>
>>> mail1 ~ # ./scanlocks2
>>> /dev/xvdc1  M0000000000000000808bc800000000
>>>
>>> mail1 ~ # debugfs.ocfs2 -R "fs_locks -l M0000000000000000808bc800000000" /dev/xvdc1 | cat
>>> Lockres: M0000000000000000808bc800000000  Mode: Protected Read
>>> Flags: Initialized Attached Busy
>>> RO Holders: 0  EX Holders: 0
>>> Pending Action: Convert  Pending Unlock Action: None
>>> Requested Mode: Exclusive  Blocking Mode: No Lock
>>> Raw LVB: 05 00 00 00 00 00 00 01 00 00 01 99 00 00 01 99
>>>          12 1f c9 67 29 71 32 86 12 e8 e2 f6 d1 07 8c 15
>>>          12 e8 e2 f6 d1 07 8c 15 00 00 00 00 00 00 10 00
>>>          41 c0 00 05 00 00 00 00 4b b6 12 7d 00 00 00 00
>>> PR > Gets: 471598  Fails: 0  Waits (usec) Total: 64002  Max: 8000
>>> EX > Gets: 8041  Fails: 0  Waits (usec) Total: 28001  Max: 4000
>>> Disk Refreshes: 0
>>>
>>> mail1 ~ # debugfs.ocfs2 -R "dlm_locks -l M0000000000000000808bc800000000" /dev/xvdc1 | cat
>>> Lockres: M0000000000000000808bc800000000  Owner: 1  State: 0x0
>>> Last Used: 0  ASTs Reserved: 0  Inflight: 0  Migration Pending: No
>>> Refs: 4  Locks: 2  On Lists: None
>>> Reference Map: 250
>>> Raw LVB: 05 00 00 00 00 00 00 01 00 00 01 99 00 00 01 99
>>>          12 1f c9 67 29 71 32 86 12 e8 e2 f6 d1 07 8c 15
>>>          12 e8 e2 f6 d1 07 8c 15 00 00 00 00 00 00 10 00
>>>          41 c0 00 05 00 00 00 00 4b b6 12 7d 00 00 00 00
>>>  Lock-Queue  Node  Level  Conv  Cookie        Refs  AST  BAST  Pending-Action
>>>  Granted     250   PR     -1    250:10866405  2     No   No    None
>>>  Converting  1     PR     EX    1:95          2     No   No    None
>>>
>>> mail1 *is* node number 1, so this is the master node.
>>>
>>> I managed to run scanlocks2 on node 250 (backup1) and also managed to get
>>> the following:
>>>
>>> backup1 ~ # debugfs.ocfs2 -R "fs_locks -l M00000000000000007e89e400000000" /dev/xvdc1 | cat
>>> Lockres: M00000000000000007e89e400000000  Mode: Invalid
>>> Flags: Initialized Busy
>>> RO Holders: 0  EX Holders: 0
>>> Pending Action: Attach  Pending Unlock Action: None
>>> Requested Mode: Protected Read  Blocking Mode: Invalid
>>> Raw LVB: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>>          00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>>          00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>>          00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>> PR > Gets: 0  Fails: 0  Waits (usec) Total: 0  Max: 0
>>> EX > Gets: 0  Fails: 0  Waits (usec) Total: 0  Max: 0
>>> Disk Refreshes: 0
>>>
>>> A further run of scanlocks2, however, resulted in backup1 (node 250)
>>> crashing.
>>>
>>> The FS is mounted by 3 nodes: mail1, mail2 and backup1. mail1 and mail2
>>> are running the latest CentOS 5 xen kernel with NO localflocks. backup1
>>> is running a 2.6.28.10 vanilla mainline kernel (pv-ops) WITH localflocks.
>>>
>>> I had to switch backup1 to a mainline kernel with localflocks because
>>> performing backups on backup1 using rsync seemed to take a long time (3-4
>>> times longer) when using the CentOS 5 xen kernel with no localflocks. I
>>> was running all nodes on recent-ish mainline kernels, but have only
>>> recently converted most of them to CentOS 5 because of repeated ocfs2
>>> stability issues with mainline kernels.
>>>
>>> When backup1 crashed, the lock held by mail1 seemed to be released and
>>> everything went back to normal.
>>>
>>> I tried to do a debugfs.ocfs2 -R "findpath M00000000000000007e89e400000000"
>>> /dev/xvdc1 | cat, but it said "usage: locate <inode#>" despite the man
>>> page stating otherwise. -R "locate ..." said the same.
>>>
>>> I hope you're able to get some useful info from the above.
>>> If not, can you please provide the next steps that you would want me to
>>> run *in case* it happens again.
>>>
>>> Cheers,
>>>
>>> Brad
>>>
>>>
>>> On Thu, 18 Mar 2010 11:25:28 -0700
>>> Sunil Mushran <sunil.mush...@oracle.com> wrote:
>>>
>>>> I am assuming you are mounting the nfs mounts with the nordirplus
>>>> mount option. If not, that is known to deadlock an nfsd thread, leading
>>>> to what you are seeing.
>>>>
>>>> There are two possible reasons for this error. One is a dlm issue. The
>>>> other is a local deadlock like the above.
>>>>
>>>> To see if the dlm is the cause of the hang, run scanlocks2:
>>>> http://oss.oracle.com/~smushran/.dlm/scripts/scanlocks2
>>>>
>>>> This will dump the busy lock resources. Run it a few times. If
>>>> a lock resource comes up regularly, then it indicates a dlm problem.
>>>>
>>>> Then dump the fs and dlm lock state on that node:
>>>> debugfs.ocfs2 -R "fs_locks LOCKNAME" /dev/sdX
>>>> debugfs.ocfs2 -R "dlm_locks LOCKNAME" /dev/sdX
>>>>
>>>> The dlm lock will tell you the master node. Repeat the two dumps
>>>> on the master node. The dlm lock on the master node will point
>>>> to the current holder. Repeat the same on that node. Email all that
>>>> to me asap.
>>>>
>>>> michael.a.jaqu...@verizon.com wrote:
>>>>
>>>>> All,
>>>>>
>>>>> I've seen a few posts about this issue in the past, but not a
>>>>> resolution. I have a 3 node cluster sharing ocfs2 volumes to app nodes
>>>>> via nfs. On occasion, one of our db nodes will have nfs go into an
>>>>> uninterruptible sleep state. The nfs daemon is completely useless at
>>>>> this point. The db node has to be rebooted to recover. It seems that
>>>>> nfs is waiting on ocfs2_wait_for_mask. Any suggestions on a resolution
>>>>> would be appreciated.
>>>>>
>>>>> root 18387  0.0  0.0  0  0 ?  S<  Mar15  0:00 [nfsd4]
>>>>> root 18389  0.0  0.0  0  0 ?  D   Mar15  0:10 [nfsd]
>>>>> root 18390  0.0  0.0  0  0 ?  D   Mar15  0:10 [nfsd]
>>>>> root 18391  0.0  0.0  0  0 ?  D   Mar15  0:10 [nfsd]
>>>>> root 18392  0.0  0.0  0  0 ?  D   Mar15  0:13 [nfsd]
>>>>> root 18393  0.0  0.0  0  0 ?  D   Mar15  0:08 [nfsd]
>>>>> root 18394  0.0  0.0  0  0 ?  D   Mar15  0:09 [nfsd]
>>>>> root 18395  0.0  0.0  0  0 ?  D   Mar15  0:12 [nfsd]
>>>>> root 18396  0.0  0.0  0  0 ?  D   Mar15  0:13 [nfsd]
>>>>>
>>>>> 18387 nfsd4  worker_thread
>>>>> 18389 nfsd   ocfs2_wait_for_mask
>>>>> 18390 nfsd   ocfs2_wait_for_mask
>>>>> 18391 nfsd   ocfs2_wait_for_mask
>>>>> 18392 nfsd   ocfs2_wait_for_mask
>>>>> 18393 nfsd   ocfs2_wait_for_mask
>>>>> 18394 nfsd   ocfs2_wait_for_mask
>>>>> 18395 nfsd   ocfs2_wait_for_mask
>>>>> 18396 nfsd   ocfs2_wait_for_mask
>>>>>
>>>>> -Mike Jaquays

_______________________________________________
Ocfs2-users mailing list
Ocfs2-users@oss.oracle.com
http://oss.oracle.com/mailman/listinfo/ocfs2-users
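[Editorial aside: the triage loop Sunil outlines above — run scanlocks2 a few times and only chase lock resources that stay busy across runs — can be sketched as follows. This is an illustrative helper, not part of ocfs2-tools; the (device, lockres) pairs would come from parsing successive scanlocks2 runs:]

```python
from collections import Counter

def recurring_lockres(runs, min_hits=2):
    """Given successive scanlocks2 outputs, each a list of
    (device, lockres) pairs, return the lock resources that appear in
    at least `min_hits` runs -- candidates for a real dlm problem, as
    opposed to locks that were merely busy for an instant."""
    hits = Counter()
    for run in runs:
        for dev_lock in set(run):  # count each lockres once per run
            hits[dev_lock] += 1
    return [dl for dl, n in hits.items() if n >= min_hits]
```

Each recurring pair would then be fed to `debugfs.ocfs2 -R "fs_locks <name>"` and `-R "dlm_locks <name>"` on that node; the dlm dump names the master, and the master's dump points to the current holder, as described in the thread.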