Hi Changwei,

> Hi Gang,
> 
> Never met such a lockup issue so far with o2cb stack applied.
> And I don't know how to debug this with fs/dlm.
I think this issue is a problem in the dlmglue layer.

> What I suspect currently is that this lock resource was once QUEUED but
> failed to be re-QUEUED since BLOCKED has been marked.
Yes. If this lockres were on the blocked_lock_list of Node3's OSB, it
would first be downconverted to NL level, letting the processes on Node1
and Node2 continue; the process on Node3 could then acquire the EX lock
again, and the whole sequence would go as we expect.
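
For reference, the path I expect the blocking AST to take on Node3 looks
roughly like the sketch below. This is a simplified sketch paraphrased from
my reading of fs/ocfs2/dlmglue.c, not the exact upstream code; the helper
lockres_or_flags() and the dc_task_lock spinlock are written from memory and
may differ slightly in your tree.

/*
 * When another node's request blocks our EX lock, the blocking AST should
 * mark the lockres BLOCKED, queue it for the downconvert thread, and wake
 * that thread, which later downconverts EX -> NL:
 *
 *   ocfs2_blocking_ast()
 *     -> set OCFS2_LOCK_BLOCKED on the lockres
 *     -> ocfs2_schedule_blocked_lock()    set OCFS2_LOCK_QUEUED, add the
 *                                         lockres to osb->blocked_lock_list
 *     -> ocfs2_wake_downconvert_thread()  dc thread later runs
 *                                         ocfs2_process_blocked_lock()
 */
static void ocfs2_schedule_blocked_lock(struct ocfs2_super *osb,
					struct ocfs2_lock_res *lockres)
{
	unsigned long flags;

	/* caller holds lockres->l_lock */
	lockres_or_flags(lockres, OCFS2_LOCK_QUEUED);

	spin_lock_irqsave(&osb->dc_task_lock, flags);
	if (list_empty(&lockres->l_blocked_list)) {
		list_add_tail(&lockres->l_blocked_list,
			      &osb->blocked_lock_list);
		osb->blocked_lock_count++;
	}
	spin_unlock_irqrestore(&osb->dc_task_lock, flags);
}

In the memory dump from my first mail below, the lockres has the blocked
flag set (the process is waiting for it to clear) but no OCFS2_LOCK_QUEUED,
l_blocked_list is empty and blocked_lock_count is 0, so it looks like this
queueing step was either skipped or undone without the lock ever being
downconverted.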

Thanks
Gang

> 
> Thanks,
> Changwei
> 
> 
> On 2018/1/24 14:30, Gang He wrote:
>> Hello Guys,
>> 
>> I met an ocfs2 hang issue when running the Multi-nodes Refcount Block Group
>> Test on a three-node cluster. This hang issue cannot be reproduced every
>> time, but it looks like there is a bug in the dlmglue layer.
>> The bug symptom looks as below,
>> Node1:
>> ocfs2te+  2403  2372  2403  0    1 Jan18 ?        00:00:00 /usr/lib64/mpi/gcc/openmpi/bin/orted --hnp-topo-sig 0N:4S:0L3:4L2:4L1:4C:4H:x86_64 -mca ess env -mca orte_ess_jobid 174456832 -mca orte_es
>> ocfs2te+  2404  2403  2404  0    2 Jan18 ?        00:00:06 /usr/local/ocfs2-test/bin/multi_reflink_test -i 1 -p 1000 -l 10485760 -n 100 -w /mnt/shared/ocfs2-discontig-bg-test-multi-nodes -s         <<== hang process
>> ocfs2te+  2404  2403  2405  0    2 Jan18 ?        00:00:00 /usr/local/ocfs2-test/bin/multi_reflink_test -i 1 -p 1000 -l 10485760 -n 100 -w /mnt/shared/ocfs2-discontig-bg-test-multi-nodes -s
>> sle15b4-nd1:/ # cat /proc/2404/stack
>> [<ffffffffc06878e5>] __ocfs2_cluster_lock.isra.37+0x2a5/0x6d0 [ocfs2]
>> [<ffffffffc0688918>] ocfs2_inode_lock_full_nested+0x1d8/0x930 [ocfs2]
>> [<ffffffffc068912d>] ocfs2_inode_lock_tracker+0xbd/0x180 [ocfs2]
>> [<ffffffffc068bfcc>] ocfs2_permission+0x3c/0xc0 [ocfs2]
>> [<ffffffff8e23def9>] __inode_permission+0x69/0xc0
>> [<ffffffff8e23ff9c>] link_path_walk+0x26c/0x4e0
>> [<ffffffff8e241c93>] path_openat+0xd3/0x1570
>> [<ffffffff8e24429c>] do_filp_open+0x8c/0xf0
>> [<ffffffff8e231586>] do_sys_open+0x1a6/0x230
>> [<ffffffff8e6acfb7>] entry_SYSCALL_64_fastpath+0x1a/0xa5
>> [<ffffffffffffffff>] 0xffffffffffffffff
>> 
>> Node2:
>> ocfs2te+ 21312 28669 21312  0    2 Jan18 pts/2    00:00:00 /usr/lib64/mpi/gcc/openmpi/bin/mpirun -mca plm_rsh_agent ssh:rsh -mca btl tcp,self --host sle15b4-nd2,sle15b4-nd1,sle15b4-nd3 /usr/local/o
>> ocfs2te+ 21312 28669 21313  0    2 Jan18 pts/2    00:00:00 /usr/lib64/mpi/gcc/openmpi/bin/mpirun -mca plm_rsh_agent ssh:rsh -mca btl tcp,self --host sle15b4-nd2,sle15b4-nd1,sle15b4-nd3 /usr/local/o
>> ocfs2te+ 21314 21312 21314  0    1 Jan18 pts/2    00:00:00 /usr/bin/ssh -x sle15b4-nd1     PATH=/usr/lib64/mpi/gcc/openmpi/bin:$PATH ; export PATH ; LD_LIBRARY_PATH=/usr/lib64/mpi/gcc/openmpi/lib64
>> ocfs2te+ 21315 21312 21315  0    1 Jan18 pts/2    00:00:00 /usr/bin/ssh -x sle15b4-nd3     PATH=/usr/lib64/mpi/gcc/openmpi/bin:$PATH ; export PATH ; LD_LIBRARY_PATH=/usr/lib64/mpi/gcc/openmpi/lib64
>> ocfs2te+ 21316 21312 21316  0    2 Jan18 pts/2    00:00:07 /usr/local/ocfs2-test/bin/multi_reflink_test -i 1 -p 1000 -l 10485760 -n 100 -w /mnt/shared/ocfs2-discontig-bg-test-multi-nodes -s <<== hang process
>> ocfs2te+ 21316 21312 21317  0    2 Jan18 pts/2    00:00:00 /usr/local/ocfs2-test/bin/multi_reflink_test -i 1 -p 1000 -l 10485760 -n 100 -w /mnt/shared/ocfs2-discontig-bg-test-multi-nodes -s
>> sle15b4-nd2:/ # cat /proc/21316/stack
>> [<ffffffffc06bc8e5>] __ocfs2_cluster_lock.isra.37+0x2a5/0x6d0 [ocfs2]
>> [<ffffffffc06bd918>] ocfs2_inode_lock_full_nested+0x1d8/0x930 [ocfs2]
>> [<ffffffffc06d6cd0>] ocfs2_mknod+0x1e0/0xeb0 [ocfs2]
>> [<ffffffffc06d7af5>] ocfs2_create+0x35/0x120 [ocfs2]
>> [<ffffffffa6242e41>] path_openat+0x1281/0x1570
>> [<ffffffffa624429c>] do_filp_open+0x8c/0xf0
>> [<ffffffffa6231586>] do_sys_open+0x1a6/0x230
>> [<ffffffffa66acfb7>] entry_SYSCALL_64_fastpath+0x1a/0xa5
>> [<ffffffffffffffff>] 0xffffffffffffffff
>> 
>> Node3:
>> ocfs2te+ 13108 13079 13108  0    1 Jan18 ?        00:00:00 /usr/lib64/mpi/gcc/openmpi/bin/orted --hnp-topo-sig 0N:4S:0L3:4L2:4L1:4C:4H:x86_64 -mca ess env -mca orte_ess_jobid 174456832 -mca orte_es
>> ocfs2te+ 13109 13108 13109  0    2 Jan18 ?        00:00:06 /usr/local/ocfs2-test/bin/multi_reflink_test -i 1 -p 1000 -l 10485760 -n 100 -w /mnt/shared/ocfs2-discontig-bg-test-multi-nodes -s <<== hang process
>> ocfs2te+ 13109 13108 13110  0    2 Jan18 ?        00:00:00 /usr/local/ocfs2-test/bin/multi_reflink_test -i 1 -p 1000 -l 10485760 -n 100 -w /mnt/shared/ocfs2-discontig-bg-test-multi-nodes -s
>> sle15b4-nd3:~ # cat /proc/13109/stack
>> [<ffffffffc08b48e5>] __ocfs2_cluster_lock.isra.37+0x2a5/0x6d0 [ocfs2]
>> [<ffffffffc08b5918>] ocfs2_inode_lock_full_nested+0x1d8/0x930 [ocfs2]
>> [<ffffffffc08d2907>] ocfs2_mv_orphaned_inode_to_new+0x77/0x7b0 [ocfs2]
>> [<ffffffffc08dd019>] ocfs2_reflink+0x309/0x430 [ocfs2]
>> [<ffffffffc08dd866>] ocfs2_reflink_ioctl+0x2a6/0x330 [ocfs2]
>> [<ffffffffc08c3537>] ocfs2_ioctl+0x237/0x640 [ocfs2]
>> [<ffffffff81247b2d>] do_vfs_ioctl+0x8d/0x5d0
>> [<ffffffff812480e4>] SyS_ioctl+0x74/0x80
>> [<ffffffff816acfb7>] entry_SYSCALL_64_fastpath+0x1a/0xa5
>> [<ffffffffffffffff>] 0xffffffffffffffff
>> 
>> These three processes in the cluster have been hung for more than one day.
>> The processes on Node1 and Node2 are hung getting the meta dlm lock (from
>> NL to PR or EX) of inode 20122639 (dentry is
>> "ocfs2-discontig-bg-test-multi-nodes").
>> Node3 holds this inode's meta dlm lock at EX level, but the process on
>> Node3 is trying to take the meta dlm lock again and is hung waiting for the
>> lockres blocked flag to clear.
>> But this lockres is not queued on Node3; the memory dump looks like this:
>>   ip_inode_lockres = {
>>      l_priv = 0xffff9edb6af75458,
>>      l_ops = 0xffffffffc0952160 <ocfs2_inode_inode_lops>,
>>      l_blocked_list = {
>>        next = 0xffff9edb6af75160, <<= NOT in block lock list of super obj
>>        prev = 0xffff9edb6af75160
>>      },
>>      l_mask_waiters = {
>>        next = 0xffffb6e140453bd0,
>>        prev = 0xffffb6e141bc3d10
>>      },
>>      l_holders = {
>>        next = 0xffff9edb6af75180,
>>        prev = 0xffff9edb6af75180
>>      },
>>      l_flags = 69,  <<== No OCFS2_LOCK_QUEUED flag
>>      l_name = "M0000000000000001330c0f00000000",
>>      l_ro_holders = 0,
>>      l_ex_holders = 0,
>>      l_level = 5 '\005',
>>      l_requested = 0 '\000',
>>      l_blocking = 5 '\005',
>>      l_type = 0 '\000',
>>      l_action = 0 '\000',
>>      l_unlock_action = 0 '\000',
>>      l_pending_gen = 4997,
>> 
>> In fact, the OSB on Node3 has an empty blocked lock list, which means this
>> lockres will never be handled:
>>    dc_work_sequence = 174228,
>>    blocked_lock_list = {   << == empty list
>>      next = 0xffff9edb991e4660,
>>      prev = 0xffff9edb991e4660
>>    },
>>    blocked_lock_count = 0,
>>    dquot_drop_list = {
>>      first = 0x0
>>    },
>>    dquot_drop_work = {
>> 
>> So these processes on each node will hang forever; it looks like there is a
>> bug in queueing the lockres.
>> Have you met this bug before? I will investigate it further.
>> 
>> Thanks a lot.
>> Gang
>> 
>> 

_______________________________________________
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel
