On 10/12/2016 06:54 PM, Eric Ren wrote:
> Hi,
> 
> On 10/12/2016 05:45 PM, Junxiao Bi wrote:
>> On 10/12/2016 05:34 PM, Eric Ren wrote:
>>> Hi Junxiao,
>>>
>>> On 10/12/2016 02:47 PM, Junxiao Bi wrote:
>>>> On 10/12/2016 10:36 AM, Eric Ren wrote:
>>>>> Hi,
>>>>>
>>>>> When backporting those patches, I found that they are already in our
>>>>> product kernel, maybe via the "stable kernel" policy, although our
>>>>> product kernel is 4.4 while the patches were merged into 4.6.
>>>>>
>>>>> It seems to be another deadlock, one that happens when running
>>>>> `chmod -R 777 /mnt/ocfs2` on multiple nodes at the same time.
>>>> Yes, but I just finished running the full ocfs2 test on linux
>>>> next-20161006 and didn't find any issues.
>>> Thanks a lot, really!
>>>
>>> 1. What's the size of your ocfs2 disk? My disk is 200G.
>> 212G
>>
>>> 2. Did you run the discontig block group test with multiple nodes, with
>>> this option:
>> Yes, but I don't know what that option is.
>>
>>>                  " -m ocfs2cts1,ocfs2cts2"
> 
> ocfs2ctsX are the host names of the cluster nodes. The discontig bg
> testcase runs in local mode without this option.
It did, 3 machines were used. At first I thought ocfs2cts1,ocfs2cts2 was the
option.

Thanks,
Junxiao.
> 
> Thanks
> Eric
> 
>>>
>>> 3. Also, I am using fs/dlm. That's one difference.
>> Yes, that deserves a look, since your issue is a cluster locking hang.
>>
>> Thanks,
>> Junxiao.
>>> Thanks,
>>> Eric
>>>
>>>> Thanks,
>>>> Junxiao.
>>>>
>>>>> Thanks,
>>>>> Eric
>>>>> On 10/12/2016 09:23 AM, Eric Ren wrote:
>>>>>> Hi Junxiao,
>>>>>>
>>>>>>> Hi Eric,
>>>>>>>
>>>>>>> On 10/11/2016 10:42 AM, Eric Ren wrote:
>>>>>>>> Hi Junxiao,
>>>>>>>>
>>>>>>>> As the subject says, the test hung on a kernel without your
>>>>>>>> patches:
>>>>>>>>
>>>>>>>> "ocfs2: revert using ocfs2_acl_chmod to avoid inode cluster lock hang"
>>>>>>>> and
>>>>>>>> "ocfs2: fix posix_acl_create deadlock"
>>>>>>>>
>>>>>>>> The stack trace is:
>>>>>>>> ```
>>>>>>>> ocfs2cts1:~ # pstree -pl 24133
>>>>>>>> discontig_runne(24133)───activate_discon(21156)───mpirun(15146)─┬─fillup_contig_b(15149)───sudo(15231)───chmod(15232)
>>>>>>>>
>>>>>>>> ocfs2cts1:~ # pgrep -a chmod
>>>>>>>> 15232 /bin/chmod -R 777 /mnt/ocfs2
>>>>>>>>
>>>>>>>> ocfs2cts1:~ # cat /proc/15232/stack
>>>>>>>> [<ffffffffa05377ef>] __ocfs2_cluster_lock.isra.39+0x1bf/0x620 [ocfs2]
>>>>>>>> [<ffffffffa053856d>] ocfs2_inode_lock_full_nested+0x12d/0x840 [ocfs2]
>>>>>>>> [<ffffffffa0538dbb>] ocfs2_inode_lock_atime+0xcb/0x170 [ocfs2]
>>>>>>>> [<ffffffffa0531e61>] ocfs2_readdir+0x41/0x1b0 [ocfs2]
>>>>>>>> [<ffffffff8120d03c>] iterate_dir+0x9c/0x110
>>>>>>>> [<ffffffff8120d453>] SyS_getdents+0x83/0xf0
>>>>>>>> [<ffffffff815e126e>] entry_SYSCALL_64_fastpath+0x12/0x6d
>>>>>>>> [<ffffffffffffffff>] 0xffffffffffffffff
>>>>>>>> ```
>>>>>>>>
>>>>>>>> Do you think this issue can be fixed by your patches?
>>>>>>> It doesn't look like it. Those two patches fix a recursive locking
>>>>>>> deadlock, but the call trace above shows no recursive locking.
>>>>>> Sorry, the call trace from another node was missing.  Here it is:
>>>>>>
>>>>>> ```
>>>>>> ocfs2cts2:~ # pstree -lp
>>>>>> sshd(4292)─┬─sshd(4745)───sshd(4753)───bash(4754)───orted(4781)───fillup_contig_b(4782)───sudo(4864)───chmod(4865)
>>>>>>
>>>>>> ocfs2cts2:~ # cat /proc/4865/stack
>>>>>> [<ffffffffa053e7ef>] __ocfs2_cluster_lock.isra.39+0x1bf/0x620 [ocfs2]
>>>>>> [<ffffffffa053f56d>] ocfs2_inode_lock_full_nested+0x12d/0x840 [ocfs2]
>>>>>> [<ffffffffa059c860>] ocfs2_iop_get_acl+0x40/0xf0 [ocfs2]
>>>>>> [<ffffffff812044e6>] generic_permission+0x166/0x1c0
>>>>>> [<ffffffffa0542aca>] ocfs2_permission+0xaa/0xd0 [ocfs2]
>>>>>> [<ffffffff81204596>] __inode_permission+0x56/0xb0
>>>>>> [<ffffffff812068fa>] link_path_walk+0x29a/0x560
>>>>>> [<ffffffff81206cbf>] path_lookupat+0x7f/0x110
>>>>>> [<ffffffff8120929c>] filename_lookup+0x9c/0x150
>>>>>> [<ffffffff811f96c3>] SyS_fchmodat+0x33/0x90
>>>>>> [<ffffffff815e126e>] entry_SYSCALL_64_fastpath+0x12/0x6d
>>>>>> [<ffffffffffffffff>] 0xffffffffffffffff
>>>>>> ```
>>>>>>
>>>>>> Thanks,
>>>>>> Eric
>>>>>>
>>>>>>
>>>>>>> Thanks,
>>>>>>> Junxiao.
>>>>>>>> I will try your patches later, but I am a little worried that the
>>>>>>>> problem may not reproduce 100% of the time,
>>>>>>>> so I am asking you to confirm ;-)
>>>>>>>>
>>>>>>>> Eric
>>>>>> _______________________________________________
>>>>>> Ocfs2-devel mailing list
>>>>>> Ocfs2-devel@oss.oracle.com
>>>>>> https://oss.oracle.com/mailman/listinfo/ocfs2-devel
>>
> 
