RO Holders: 1  EX Holders: 0

So node 18 wants to upgrade to EX. For that to happen, node 17 has to downgrade
from PR. But it cannot, because there is 1 RO (readonly) holder. If you are
using NFS and see an nfsd in a D state, then that would be it.

I've just released 1.4.7, in which this issue has been addressed.
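[Editorial aside: the tell-tale pattern above (a Busy lockres with a pending Convert on one node, and a Blocked lockres with a nonzero RO holder count on another) can be spotted mechanically in the fs_locks dumps quoted later in this thread. Below is a minimal, best-effort sketch of a hypothetical parser — it is not part of ocfs2-tools, and field names may differ across ocfs2-tools versions:]

```python
import re

def parse_fs_lock(text):
    """Parse one debugfs.ocfs2 'fs_locks' dump into a dict of the
    fields relevant to this hang. Best-effort: keyed to the output
    format quoted in this thread."""
    info = {}
    m = re.search(r"Lockres:\s*(\S+)\s+Mode:\s*(.+)", text)
    if m:
        info["lockres"], info["mode"] = m.group(1), m.group(2).strip()
    m = re.search(r"Flags:\s*(.+)", text)
    if m:
        info["flags"] = m.group(1).split()
    m = re.search(r"RO Holders:\s*(\d+)\s+EX Holders:\s*(\d+)", text)
    if m:
        info["ro_holders"] = int(m.group(1))
        info["ex_holders"] = int(m.group(2))
    return info

def looks_stuck(info):
    # A lockres that is Busy (convert never completed), or Blocked
    # while still holding RO references (cannot downgrade), matches
    # the deadlock described above.
    flags = set(info.get("flags", []))
    return "Busy" in flags or ("Blocked" in flags and info.get("ro_holders", 0) > 0)
```

Feeding it the www2 dump from this thread would flag the lockres as Busy; feeding it the www1 dump would flag it as Blocked with 1 RO holder — the two halves of the deadlock.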
Sunil

Brad Plant wrote:
> Hi Sunil,
>
> I managed to collect the fs_locks and dlm_locks output on both nodes this
> time. www1 is node 17 while www2 is node 18. I had to reboot www1 to fix the
> problem, but of course www1 couldn't unmount the file system, so the other
> nodes saw it as a crash.
>
> Both nodes are running 2.6.18-164.15.1.el5.centos.plusxen with the matching
> ocfs2 1.4.4-1 rpm downloaded from
> http://oss.oracle.com/projects/ocfs2/files/RedHat/RHEL5/x86_64/.
>
> Do you make anything of this?
>
> I read that there is going to be a new ocfs2 release soon. I'm sure there are
> lots of bug fixes, but are there any in there that you think might solve this
> problem?
>
> Cheers,
>
> Brad
>
>
> www2 ~ # ./scanlocks2
> /dev/xvdd3  M0000000000000000095a0300000000
>
> www2 ~ # debugfs.ocfs2 -R "fs_locks M0000000000000000095a0300000000" /dev/xvdd3 | cat
> Lockres: M0000000000000000095a0300000000  Mode: Protected Read
> Flags: Initialized Attached Busy
> RO Holders: 0  EX Holders: 0
> Pending Action: Convert  Pending Unlock Action: None
> Requested Mode: Exclusive  Blocking Mode: No Lock
> PR > Gets: 6802  Fails: 0  Waits (usec) Total: 0  Max: 0
> EX > Gets: 16340  Fails: 0  Waits (usec) Total: 12000  Max: 8000
> Disk Refreshes: 0
>
> www2 ~ # debugfs.ocfs2 -R "dlm_locks M0000000000000000095a0300000000" /dev/xvdd3 | cat
> Lockres: M0000000000000000095a0300000000  Owner: 18  State: 0x0
> Last Used: 0  ASTs Reserved: 0  Inflight: 0  Migration Pending: No
> Refs: 4  Locks: 2  On Lists: None
> Reference Map: 17
>  Lock-Queue  Node  Level  Conv  Cookie       Refs  AST  BAST  Pending-Action
>  Granted     17    PR     -1    17:62487955  2     No   No    None
>  Converting  18    PR     EX    18:6599867   2     No   No    None
>
>
> www1 ~ # ./scanlocks2
>
> www1 ~ # debugfs.ocfs2 -R "fs_locks M0000000000000000095a0300000000" /dev/xvdd3 | cat
> Lockres: M0000000000000000095a0300000000  Mode: Protected Read
> Flags: Initialized Attached Blocked Queued
> RO Holders: 1  EX Holders: 0
> Pending Action: None  Pending Unlock Action: None
> Requested Mode: Protected Read  Blocking Mode: Exclusive
> PR > Gets: 110  Fails: 3  Waits (usec) Total: 32000  Max: 12000
> EX > Gets: 0  Fails: 0  Waits (usec) Total: 0  Max: 0
> Disk Refreshes: 0
>
> www1 ~ # debugfs.ocfs2 -R "dlm_locks M0000000000000000095a0300000000" /dev/xvdd3 | cat
> Lockres: M0000000000000000095a0300000000  Owner: 18  State: 0x0
> Last Used: 0  ASTs Reserved: 0  Inflight: 0  Migration Pending: No
> Refs: 3  Locks: 1  On Lists: None
> Reference Map:
>  Lock-Queue  Node  Level  Conv  Cookie       Refs  AST  BAST  Pending-Action
>  Granted     17    PR     -1    17:62487955  2     No   No    None
>
>
> On Fri, 19 Mar 2010 08:48:39 -0700
> Sunil Mushran <sunil.mush...@oracle.com> wrote:
>
>> In findpath <lockname>, the lockname needs to be in angular brackets.
>>
>> Did you manage to trap the oops stack trace of the crash?
>>
>> So the dlm on the master says that node 250 has a PR, but the fs_locks
>> on 250 says that it has requested a PR but not gotten a reply back as yet.
>> Next time also dump the dlm_lock on 250. (The message flow is: the fs on
>> 250 talks to the dlm on 250, which talks to the dlm on the master, which
>> may have to talk to other nodes but eventually replies to the dlm on 250,
>> which then pings the fs on that node. The roundtrip takes a couple hundred
>> usecs on gige.)
>>
>> Running a mix of localflock and not is not advisable. Not the end of the
>> world, though. It depends on how flocks are being used.
>>
>> Is this a mix of virtual and physical boxes?
>>
>> Brad Plant wrote:
>>
>>> Hi Sunil,
>>>
>>> I seem to have struck this issue, although I'm not using nfs. I've got
>>> other processes stuck in the D state. It's a mail server and the
>>> processes are postfix and courier-imap.
>>> As per your instructions, I've run scanlocks2 and debugfs.ocfs2:
>>>
>>> mail1 ~ # ./scanlocks2
>>> /dev/xvdc1  M0000000000000000808bc800000000
>>>
>>> mail1 ~ # debugfs.ocfs2 -R "fs_locks -l M0000000000000000808bc800000000" /dev/xvdc1 | cat
>>> Lockres: M0000000000000000808bc800000000  Mode: Protected Read
>>> Flags: Initialized Attached Busy
>>> RO Holders: 0  EX Holders: 0
>>> Pending Action: Convert  Pending Unlock Action: None
>>> Requested Mode: Exclusive  Blocking Mode: No Lock
>>> Raw LVB: 05 00 00 00 00 00 00 01 00 00 01 99 00 00 01 99
>>>          12 1f c9 67 29 71 32 86 12 e8 e2 f6 d1 07 8c 15
>>>          12 e8 e2 f6 d1 07 8c 15 00 00 00 00 00 00 10 00
>>>          41 c0 00 05 00 00 00 00 4b b6 12 7d 00 00 00 00
>>> PR > Gets: 471598  Fails: 0  Waits (usec) Total: 64002  Max: 8000
>>> EX > Gets: 8041  Fails: 0  Waits (usec) Total: 28001  Max: 4000
>>> Disk Refreshes: 0
>>>
>>> mail1 ~ # debugfs.ocfs2 -R "dlm_locks -l M0000000000000000808bc800000000" /dev/xvdc1 | cat
>>> Lockres: M0000000000000000808bc800000000  Owner: 1  State: 0x0
>>> Last Used: 0  ASTs Reserved: 0  Inflight: 0  Migration Pending: No
>>> Refs: 4  Locks: 2  On Lists: None
>>> Reference Map: 250
>>> Raw LVB: 05 00 00 00 00 00 00 01 00 00 01 99 00 00 01 99
>>>          12 1f c9 67 29 71 32 86 12 e8 e2 f6 d1 07 8c 15
>>>          12 e8 e2 f6 d1 07 8c 15 00 00 00 00 00 00 10 00
>>>          41 c0 00 05 00 00 00 00 4b b6 12 7d 00 00 00 00
>>>  Lock-Queue  Node  Level  Conv  Cookie        Refs  AST  BAST  Pending-Action
>>>  Granted     250   PR     -1    250:10866405  2     No   No    None
>>>  Converting  1     PR     EX    1:95          2     No   No    None
>>>
>>> mail1 *is* node number 1, so this is the master node.
>>>
>>> I managed to run scanlocks2 on node 250 (backup1) and also managed to get
>>> the following:
>>>
>>> backup1 ~ # debugfs.ocfs2 -R "fs_locks -l M00000000000000007e89e400000000" /dev/xvdc1 | cat
>>> Lockres: M00000000000000007e89e400000000  Mode: Invalid
>>> Flags: Initialized Busy
>>> RO Holders: 0  EX Holders: 0
>>> Pending Action: Attach  Pending Unlock Action: None
>>> Requested Mode: Protected Read  Blocking Mode: Invalid
>>> Raw LVB: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>>          00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>>          00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>>          00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>> PR > Gets: 0  Fails: 0  Waits (usec) Total: 0  Max: 0
>>> EX > Gets: 0  Fails: 0  Waits (usec) Total: 0  Max: 0
>>> Disk Refreshes: 0
>>>
>>> A further run of scanlocks2, however, resulted in backup1 (node 250)
>>> crashing.
>>>
>>> The FS is mounted by 3 nodes: mail1, mail2 and backup1. mail1 and mail2
>>> are running the latest CentOS 5 xen kernel with NO localflocks. backup1
>>> is running a 2.6.28.10 vanilla mainline kernel (pv-ops) WITH localflocks.
>>>
>>> I had to switch backup1 to a mainline kernel with localflocks because
>>> performing backups on backup1 using rsync seemed to take a long time (3-4
>>> times longer) when using the CentOS 5 xen kernel with no localflocks. I
>>> was running all nodes on recent-ish mainline kernels, but have only
>>> recently converted most of them to CentOS 5 because of repeated ocfs2
>>> stability issues with mainline kernels.
>>>
>>> When backup1 crashed, the lock held by mail1 seemed to be released and
>>> everything went back to normal.
>>>
>>> I tried to do a debugfs.ocfs2 -R "findpath M00000000000000007e89e400000000"
>>> /dev/xvdc1 | cat, but it said "usage: locate <inode#>" despite the man
>>> page stating otherwise. -R "locate ..." said the same.
>>>
>>> I hope you're able to get some useful info from the above.
>>> If not, can you please provide the next steps that you would want me to
>>> run *in case* it happens again.
>>>
>>> Cheers,
>>>
>>> Brad
>>>
>>>
>>> On Thu, 18 Mar 2010 11:25:28 -0700
>>> Sunil Mushran <sunil.mush...@oracle.com> wrote:
>>>
>>>> I am assuming you are mounting the nfs mounts with the nordirplus
>>>> mount option. If not, that is known to deadlock an nfsd thread, leading
>>>> to what you are seeing.
>>>>
>>>> There are two possible reasons for this error. One is a dlm issue. The
>>>> other is a local deadlock like the above.
>>>>
>>>> To see if the dlm is the cause of the hang, run scanlocks2:
>>>> http://oss.oracle.com/~smushran/.dlm/scripts/scanlocks2
>>>>
>>>> This will dump the busy lock resources. Run it a few times. If
>>>> a lock resource comes up regularly, then it indicates a dlm problem.
>>>>
>>>> Then dump the fs and dlm lock state on that node:
>>>> debugfs.ocfs2 -R "fs_locks LOCKNAME" /dev/sdX
>>>> debugfs.ocfs2 -R "dlm_locks LOCKNAME" /dev/sdX
>>>>
>>>> The dlm lock will tell you the master node. Repeat the two dumps
>>>> on the master node. The dlm lock on the master node will point
>>>> to the current holder. Repeat the same on that node. Email all that
>>>> to me asap.
>>>>
>>>> michael.a.jaqu...@verizon.com wrote:
>>>>
>>>>> All,
>>>>>
>>>>> I've seen a few posts about this issue in the past, but not a
>>>>> resolution. I have a 3 node cluster sharing ocfs2 volumes to app nodes
>>>>> via nfs. On occasion, one of our db nodes will have nfs go into an
>>>>> uninterruptible sleep state. The nfs daemon is completely useless at
>>>>> this point. The db node has to be rebooted to recover. It seems that
>>>>> nfs is waiting on ocfs2_wait_for_mask. Any suggestions on a resolution
>>>>> would be appreciated.
>>>>>
>>>>> root 18387  0.0  0.0  0  0 ?  S<  Mar15  0:00 [nfsd4]
>>>>> root 18389  0.0  0.0  0  0 ?  D   Mar15  0:10 [nfsd]
>>>>> root 18390  0.0  0.0  0  0 ?  D   Mar15  0:10 [nfsd]
>>>>> root 18391  0.0  0.0  0  0 ?  D   Mar15  0:10 [nfsd]
>>>>> root 18392  0.0  0.0  0  0 ?  D   Mar15  0:13 [nfsd]
>>>>> root 18393  0.0  0.0  0  0 ?  D   Mar15  0:08 [nfsd]
>>>>> root 18394  0.0  0.0  0  0 ?  D   Mar15  0:09 [nfsd]
>>>>> root 18395  0.0  0.0  0  0 ?  D   Mar15  0:12 [nfsd]
>>>>> root 18396  0.0  0.0  0  0 ?  D   Mar15  0:13 [nfsd]
>>>>>
>>>>> 18387 nfsd4  worker_thread
>>>>> 18389 nfsd   ocfs2_wait_for_mask
>>>>> 18390 nfsd   ocfs2_wait_for_mask
>>>>> 18391 nfsd   ocfs2_wait_for_mask
>>>>> 18392 nfsd   ocfs2_wait_for_mask
>>>>> 18393 nfsd   ocfs2_wait_for_mask
>>>>> 18394 nfsd   ocfs2_wait_for_mask
>>>>> 18395 nfsd   ocfs2_wait_for_mask
>>>>> 18396 nfsd   ocfs2_wait_for_mask
>>>>>
>>>>> -Mike Jaquays

_______________________________________________
Ocfs2-users mailing list
Ocfs2-users@oss.oracle.com
http://oss.oracle.com/mailman/listinfo/ocfs2-users
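[Editorial aside: the triage loop Sunil outlines above — run scanlocks2 a few times and only chase lock resources that stay busy across runs — can be sketched as follows. This is an illustrative helper, not part of ocfs2-tools; the (device, lockres) pairs would come from parsing successive scanlocks2 runs:]

```python
from collections import Counter

def recurring_lockres(runs, min_hits=2):
    """Given successive scanlocks2 outputs, each a list of
    (device, lockres) pairs, return the lock resources that appear in
    at least `min_hits` runs -- candidates for a real dlm problem, as
    opposed to locks that were merely busy for an instant."""
    hits = Counter()
    for run in runs:
        for dev_lock in set(run):  # count each lockres once per run
            hits[dev_lock] += 1
    return [dl for dl, n in hits.items() if n >= min_hits]
```

Each recurring pair would then be fed to `debugfs.ocfs2 -R "fs_locks <name>"` and `-R "dlm_locks <name>"` on that node; the dlm dump names the master, and the master's dump points to the current holder, as described in the thread.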