Re: [Ocfs2-devel] [Ocfs2-dev] BUG: deadlock with umount and ocfs2 workqueue triggered by ocfs2rec thread

2018-01-11 Thread Shichangkuo
Hi Joseph
Thanks for replying.
Umount flushes the ocfs2 workqueue in function ocfs2_truncate_log_shutdown(),
and journal recovery is one of the work items on that workqueue.
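
For illustration, a minimal sketch of the resulting lock cycle, summarizing the
two stacks quoted below (editor's sketch, not actual kernel code; in 4.14 the
workqueue is osb->ocfs2_wq):

	/* ocfs2rec worker, running ocfs2_complete_recovery() on ocfs2_wq: */
	down_read(&sb->s_umount);	/* in ocfs2_finish_quota_recovery();
					 * blocks, umount holds it for write */

	/* umount task, in ocfs2_dismount_volume(): */
	down_write(&sb->s_umount);	/* taken by the VFS before ->put_super() */
	flush_workqueue(osb->ocfs2_wq);	/* waits for the recovery work above */

Each task waits on the other, so neither can make progress.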

Thanks
Changkuo

> -----Original Message-----
> From: Joseph Qi [mailto:jiangqi...@gmail.com]
> Sent: Friday, January 12, 2018 1:51 PM
> To: shichangkuo (Cloud); z...@suse.com; j...@suse.cz
> Cc: ocfs2-devel@oss.oracle.com
> Subject: Re: [Ocfs2-devel] [Ocfs2-dev] BUG: deadlock with umount and ocfs2
> workqueue triggered by ocfs2rec thread
> 
> Hi Changkuo,
> 
> You said s_umount was acquired by umount and ocfs2rec was blocked when
> acquiring it. But you didn't describe why umount was blocked.
> 
> Thanks,
> Joseph
> 
> On 18/1/12 11:43, Shichangkuo wrote:
> > Hi all,
> >   Now we are testing ocfs2 with the 4.14 kernel, and we found a deadlock
> > between umount and the ocfs2 workqueue, triggered by the ocfs2rec thread.
> > The stacks are as follows:
> > journal recovery work:
> > [] call_rwsem_down_read_failed+0x14/0x30
> > [] ocfs2_finish_quota_recovery+0x62/0x450 [ocfs2]
> > [] ocfs2_complete_recovery+0xc1/0x440 [ocfs2]
> > [] process_one_work+0x130/0x350
> > [] worker_thread+0x46/0x3b0
> > [] kthread+0x101/0x140
> > [] ret_from_fork+0x1f/0x30
> > [] 0x
> >
> > /bin/umount:
> > [] flush_workqueue+0x104/0x3e0
> > [] ocfs2_truncate_log_shutdown+0x3b/0xc0 [ocfs2]
> > [] ocfs2_dismount_volume+0x8c/0x3d0 [ocfs2]
> > [] ocfs2_put_super+0x31/0xa0 [ocfs2]
> > [] generic_shutdown_super+0x6d/0x120
> > [] kill_block_super+0x2d/0x60
> > [] deactivate_locked_super+0x51/0x90
> > [] cleanup_mnt+0x3b/0x70
> > [] task_work_run+0x86/0xa0
> > [] exit_to_usermode_loop+0x6d/0xa9
> > [] do_syscall_64+0x11d/0x130
> > [] entry_SYSCALL64_slow_path+0x25/0x25
> > [] 0x
> >
> > Function ocfs2_finish_quota_recovery() tries to take sb->s_umount, which
> > is already held by the umount thread, so the two tasks deadlock.
> > This issue was introduced by commits
> > c3b004460d77bf3f980d877be539016f2df4df12 and
> > 5f530de63cfc6ca8571cbdf58af63fb166cc6517.
> > I think we cannot use ::s_umount, but the mutex ::dqonoff_mutex was
> > already removed.
> > Shall we add a new mutex?
> >
> > Thanks
> > Changkuo

Re: [Ocfs2-devel] [Ocfs2-dev] BUG: deadlock with umount and ocfs2 workqueue triggered by ocfs2rec thread

2018-01-11 Thread Joseph Qi
Hi Changkuo,

You said s_umount was acquired by umount and ocfs2rec was blocked when
acquiring it. But you didn't describe why umount was blocked.

Thanks,
Joseph

On 18/1/12 11:43, Shichangkuo wrote:
> Hi all,
>   Now we are testing ocfs2 with the 4.14 kernel, and we found a deadlock
> between umount and the ocfs2 workqueue, triggered by the ocfs2rec thread.
> The stacks are as follows:
> journal recovery work:
> [] call_rwsem_down_read_failed+0x14/0x30
> [] ocfs2_finish_quota_recovery+0x62/0x450 [ocfs2]
> [] ocfs2_complete_recovery+0xc1/0x440 [ocfs2]
> [] process_one_work+0x130/0x350
> [] worker_thread+0x46/0x3b0
> [] kthread+0x101/0x140
> [] ret_from_fork+0x1f/0x30
> [] 0x
> 
> /bin/umount:
> [] flush_workqueue+0x104/0x3e0
> [] ocfs2_truncate_log_shutdown+0x3b/0xc0 [ocfs2]
> [] ocfs2_dismount_volume+0x8c/0x3d0 [ocfs2]
> [] ocfs2_put_super+0x31/0xa0 [ocfs2]
> [] generic_shutdown_super+0x6d/0x120
> [] kill_block_super+0x2d/0x60
> [] deactivate_locked_super+0x51/0x90
> [] cleanup_mnt+0x3b/0x70
> [] task_work_run+0x86/0xa0
> [] exit_to_usermode_loop+0x6d/0xa9
> [] do_syscall_64+0x11d/0x130
> [] entry_SYSCALL64_slow_path+0x25/0x25
> [] 0x
>   
> Function ocfs2_finish_quota_recovery() tries to take sb->s_umount, which is
> already held by the umount thread, so the two tasks deadlock.
> This issue was introduced by commits c3b004460d77bf3f980d877be539016f2df4df12
> and 5f530de63cfc6ca8571cbdf58af63fb166cc6517.
> I think we cannot use ::s_umount, but the mutex ::dqonoff_mutex was already
> removed.
> Shall we add a new mutex?
> 
> Thanks
> Changkuo

[Ocfs2-devel] [Ocfs2-dev] BUG: deadlock with umount and ocfs2 workqueue triggered by ocfs2rec thread

2018-01-11 Thread Shichangkuo
Hi all,
  Now we are testing ocfs2 with the 4.14 kernel, and we found a deadlock
between umount and the ocfs2 workqueue, triggered by the ocfs2rec thread.
The stacks are as follows:
journal recovery work:
[] call_rwsem_down_read_failed+0x14/0x30
[] ocfs2_finish_quota_recovery+0x62/0x450 [ocfs2]
[] ocfs2_complete_recovery+0xc1/0x440 [ocfs2]
[] process_one_work+0x130/0x350
[] worker_thread+0x46/0x3b0
[] kthread+0x101/0x140
[] ret_from_fork+0x1f/0x30
[] 0x

/bin/umount:
[] flush_workqueue+0x104/0x3e0
[] ocfs2_truncate_log_shutdown+0x3b/0xc0 [ocfs2]
[] ocfs2_dismount_volume+0x8c/0x3d0 [ocfs2]
[] ocfs2_put_super+0x31/0xa0 [ocfs2]
[] generic_shutdown_super+0x6d/0x120
[] kill_block_super+0x2d/0x60
[] deactivate_locked_super+0x51/0x90
[] cleanup_mnt+0x3b/0x70
[] task_work_run+0x86/0xa0
[] exit_to_usermode_loop+0x6d/0xa9
[] do_syscall_64+0x11d/0x130
[] entry_SYSCALL64_slow_path+0x25/0x25
[] 0x
  
Function ocfs2_finish_quota_recovery() tries to take sb->s_umount, which is
already held by the umount thread, so the two tasks deadlock.
This issue was introduced by commits c3b004460d77bf3f980d877be539016f2df4df12
and 5f530de63cfc6ca8571cbdf58af63fb166cc6517.
I think we cannot use ::s_umount, but the mutex ::dqonoff_mutex was already
removed.
Shall we add a new mutex?
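
For concreteness, one possible shape of such a fix as a minimal sketch (the
mutex field, function name and body below are hypothetical, not from any
posted patch):

	/*
	 * Hypothetical: take a dedicated per-superblock mutex in the quota
	 * recovery path instead of down_read(&sb->s_umount), so the recovery
	 * work queued on the ocfs2 workqueue never waits on s_umount while
	 * umount holds it for write and flushes that workqueue.
	 */
	struct ocfs2_super {
		/* ... existing fields ... */
		struct mutex osb_quota_recovery_mutex;	/* assumed new field */
	};

	static void ocfs2_finish_quota_recovery_sketch(struct ocfs2_super *osb)
	{
		mutex_lock(&osb->osb_quota_recovery_mutex);
		/* ... replay local quota files as before ... */
		mutex_unlock(&osb->osb_quota_recovery_mutex);
	}

Note that the umount path would still need to serialize against recovery
without holding this mutex across its flush_workqueue() call, otherwise the
same wait cycle simply moves to the new lock.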

Thanks
Changkuo

Re: [Ocfs2-devel] kernel bug

2018-01-11 Thread Changwei Ge
Hi Cédric,

Sorry, I can't answer your question yet, but we are trying to.
Please be patient.

On 2018/1/11 19:52, BASSAGET Cédric wrote:
> Hi Changwei,
>
> Short question: will the stable release of kernel 4.15 fix this bug?
> Regards
>
> 2018-01-11 8:06 GMT+01:00 Changwei Ge:
>
> Hi Cédric,
>
> On the kernel mainline :)
>
> On 2018/1/11 14:50, BASSAGET Cédric wrote:
> > Hi Changwei.
> > My question may be stupid, but... can you tell me where I can find the
> > source of the latest ocfs2?
> > Every Google search points to the Oracle website, which refers to 1.6
> > versions.
> >
> > 2018-01-11 2:03 GMT+01:00 Changwei Ge:
> >
> > Hi Cédric,
> >
> > These two patches have already been picked up by Andrew and merged into
> > the -mm tree for now.
> > So you can refer to the links below for them:
> >
> > http://ozlabs.org/~akpm/mmots/broken-out/ocfs2-make-metadata-estimation-accurate-and-clear.patch
> >
> > http://ozlabs.org/~akpm/mmots/broken-out/ocfs2-try-to-reuse-extent-block-in-dealloc-without-meta_alloc.patch
> >
> > Thanks,
> > Changwei
> >
> > On 2018/1/10 22:34, BASSAGET Cédric wrote:
> > > Hi Changwei,
> > > Can you give me a ref or a link pointing to your patch?
> > > Thanks
> > >
> >  > 2018-01-10 12:57 GMT+01:00 Changwei Ge   >    > >
> > > Hi BASSAGET,
> > >
> > > We ocfs2 developers are solving a DIO crash issue which may share the
> > > same root cause as yours.
> > >
> > > You can refer to my patch set of two and backport them into your
> > > kernel to see if the issue goes away.
> > >
> > >    ocfs2: make metadata estimation accurate and clear
> > >    ocfs2: try to reuse extent block in dealloc without meta_alloc
> > >
> > > On 2018/1/10 19:48, BASSAGET 

Re: [Ocfs2-devel] [PATCH] ocfs2/xattr: assign errno to 'ret' in ocfs2_calc_xattr_init()

2018-01-11 Thread Changwei Ge
Acked-by: Changwei Ge 

On 2018/1/11 16:14, piaojun wrote:
> We need to catch the errno returned by ocfs2_xattr_get_nolock() and assign
> it to 'ret' so that it is printed and propagated to upper callers.
> 
> Signed-off-by: Jun Piao 
> Reviewed-by: Alex Chen 
> Reviewed-by: Yiwen Jiang 
> ---
>   fs/ocfs2/xattr.c | 1 +
>   1 file changed, 1 insertion(+)
> 
> diff --git a/fs/ocfs2/xattr.c b/fs/ocfs2/xattr.c
> index 5fdf269..439f567 100644
> --- a/fs/ocfs2/xattr.c
> +++ b/fs/ocfs2/xattr.c
> @@ -646,6 +646,7 @@ int ocfs2_calc_xattr_init(struct inode *dir,
>   if (S_ISDIR(mode))
>   a_size <<= 1;
>   } else if (acl_len != 0 && acl_len != -ENODATA) {
> + ret = acl_len;
>   mlog_errno(ret);
>   return ret;
>   }
> 




Re: [Ocfs2-devel] [PATCH v2 2/2] ocfs2: add trimfs lock to avoid duplicated trims in cluster

2018-01-11 Thread Changwei Ge
On 2018/1/11 15:57, Gang He wrote:
> 
> 
> 

>> On 2018/1/11 15:19, Gang He wrote:
>>>
>>>
>>>
>>
>> On 2018/1/11 12:31, Gang He wrote:
> Hi Changwei,
>
>

>> On 2018/1/11 11:33, Gang He wrote:
>>> Hi Changwei,
>>>
>>>
>>
>> On 2018/1/11 10:07, Gang He wrote:
> Hi Changwei,
>
>

>> On 2018/1/10 18:14, Gang He wrote:
>>> Hi Changwei,
>>>
>>>
>>
>> On 2018/1/10 17:05, Gang He wrote:
> Hi Changwei,
>
>

>> Hi Gang,
>>
>> On 2017/12/14 13:16, Gang He wrote:
>>> As you know, ocfs2 supports trimming the underlying disk via the
>>> fstrim command. But there is a problem: ocfs2 is a shared-disk cluster
>>> file system, and if the user configures a scheduled fstrim job on each
>>> file system node, multiple nodes will trim the shared disk
>>> simultaneously. That is very wasteful of CPU and IO, and it might also
>>> negatively affect the lifetime of poor-quality SSD devices.
>>> So we introduce a trimfs DLM lock to communicate between the nodes in
>>> this case, which makes only one fstrim command do the trimming on a
>>> shared disk across the cluster; the fstrim commands from the other
>>> nodes wait for the first fstrim to finish and return success directly,
>>> to avoid running the same trim on the shared disk again.
>>>
>>> Compared with the first version, I changed the fstrim command's
>>> returned value and behavior in the case where an fstrim command is
>>> already running on a shared disk.
>>>
>>> Signed-off-by: Gang He 
>>> ---
>>>  fs/ocfs2/alloc.c | 44 ++++++++++++++++++++++++++++++++++++++++++++
>>>  1 file changed, 44 insertions(+)
>>>
>>> diff --git a/fs/ocfs2/alloc.c b/fs/ocfs2/alloc.c
>>> index ab5105f..5c9c3e2 100644
>>> --- a/fs/ocfs2/alloc.c
>>> +++ b/fs/ocfs2/alloc.c
>>> @@ -7382,6 +7382,7 @@ int ocfs2_trim_fs(struct super_block *sb, struct fstrim_range *range)
>>> struct buffer_head *gd_bh = NULL;
>>> struct ocfs2_dinode *main_bm;
>>> struct ocfs2_group_desc *gd = NULL;
>>> +   struct ocfs2_trim_fs_info info, *pinfo = NULL;
>>
>> I think *pinfo* is not necessary.
> This pointer is necessary, since it can be NULL or non-NULL depending
> on the code logic.

This point is OK for me.

>
>>>  
>>> start = range->start >> osb->s_clustersize_bits;
>>> len = range->len >> osb->s_clustersize_bits;
>>> @@ -7419,6 +7420,42 @@ int ocfs2_trim_fs(struct super_block *sb, struct fstrim_range *range)
>>>  
>>> trace_ocfs2_trim_fs(start, len, minlen);
>>>  
>>> +   ocfs2_trim_fs_lock_res_init(osb);
>>> +   ret = ocfs2_trim_fs_lock(osb, NULL, 1);
>>
>> I don't get why you try-lock here and, if that fails, acquire the same
>> lock again later, waiting until granted.
> Please think about the use case; the patch is only meant to handle this
> case.
> When the administrator configures a scheduled fstrim task on each node,
> each node will trigger an fstrim on the shared disks concurrently.
> In this case, we should avoid a duplicated fstrim on a shared disk,
> since it wastes CPU/IO resources and sometimes affects SSD lifetime.

 I'm not worried that trimfs will affect the SSD's lifetime much, since
 the physical-to-logical address mapping table resides in RAM while the
 SSD is working, and that table won't be at a big scale. My point does
 not affect this patch; just a tip here.
>>> This depends on the SSD firmware implementation, but for secure-trim
>>> it really can affect SSD lifetime.
>>>
> Firstly, we use a try-lock to take the trimfs DLM lock, to identify
> whether there is any other node doing an fstrim on the disk.
> If not, this node is
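
For illustration, a minimal sketch of the try-lock-then-wait flow discussed
above (ocfs2_trim_fs_lock() and struct ocfs2_trim_fs_info appear in the
patch; the control flow and tf_* field uses here are simplified assumptions):

	/* Try to take the trimfs DLM lock without blocking (last arg = trylock). */
	ret = ocfs2_trim_fs_lock(osb, NULL, 1);
	if (ret < 0) {
		if (ret != -EAGAIN)
			goto out;	/* real locking error */
		/*
		 * Another node holds the lock and is trimming the shared
		 * disk: block until it finishes, then report its result
		 * instead of trimming the same disk again.
		 */
		ret = ocfs2_trim_fs_lock(osb, &info, 0);
		if (ret < 0)
			goto out;
		if (info.tf_valid && info.tf_success)
			range->len = info.tf_trimlen;
		goto out_unlock;
	}
	/* This node won the lock: perform the actual trim on the shared disk. */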

Re: [Ocfs2-devel] [PATCH] ocfs2/xattr: assign errno to 'ret' in ocfs2_calc_xattr_init()

2018-01-11 Thread Gang He
Looks good.


> We need to catch the errno returned by ocfs2_xattr_get_nolock() and assign
> it to 'ret' so that it is printed and propagated to upper callers.
> 
> Signed-off-by: Jun Piao 
> Reviewed-by: Alex Chen 
> Reviewed-by: Yiwen Jiang 
Reviewed-by: Gang He 

> ---
>  fs/ocfs2/xattr.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/fs/ocfs2/xattr.c b/fs/ocfs2/xattr.c
> index 5fdf269..439f567 100644
> --- a/fs/ocfs2/xattr.c
> +++ b/fs/ocfs2/xattr.c
> @@ -646,6 +646,7 @@ int ocfs2_calc_xattr_init(struct inode *dir,
>   if (S_ISDIR(mode))
>   a_size <<= 1;
>   } else if (acl_len != 0 && acl_len != -ENODATA) {
> + ret = acl_len;
>   mlog_errno(ret);
>   return ret;
>   }
> -- 
> 


[Ocfs2-devel] [PATCH] ocfs2/xattr: assign errno to 'ret' in ocfs2_calc_xattr_init()

2018-01-11 Thread piaojun
We need to catch the errno returned by ocfs2_xattr_get_nolock() and assign
it to 'ret' so that it is printed and propagated to upper callers.

Signed-off-by: Jun Piao 
Reviewed-by: Alex Chen 
Reviewed-by: Yiwen Jiang 
---
 fs/ocfs2/xattr.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/fs/ocfs2/xattr.c b/fs/ocfs2/xattr.c
index 5fdf269..439f567 100644
--- a/fs/ocfs2/xattr.c
+++ b/fs/ocfs2/xattr.c
@@ -646,6 +646,7 @@ int ocfs2_calc_xattr_init(struct inode *dir,
if (S_ISDIR(mode))
a_size <<= 1;
} else if (acl_len != 0 && acl_len != -ENODATA) {
+   ret = acl_len;
mlog_errno(ret);
return ret;
}
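
For context, a minimal sketch of the surrounding logic in
ocfs2_calc_xattr_init() that this hunk patches (simplified from
fs/ocfs2/xattr.c; the comment and elisions are editorial):

	acl_len = ocfs2_xattr_get_nolock(dir, dir_bh,
					 OCFS2_XATTR_INDEX_POSIX_ACL_DEFAULT,
					 "", NULL, 0);
	if (acl_len > 0) {
		a_size = ocfs2_xattr_entry_real_size(0, acl_len);
		if (S_ISDIR(mode))
			a_size <<= 1;
	} else if (acl_len != 0 && acl_len != -ENODATA) {
		/*
		 * acl_len holds a negative errno here; before this fix,
		 * 'ret' still had its earlier value, so mlog_errno()
		 * logged a stale code and callers saw the wrong result.
		 */
		ret = acl_len;
		mlog_errno(ret);
		return ret;
	}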
-- 
