Re: [PATCH] Btrfs: fix deadlock when enabling quotas due to concurrent snapshot creation

2018-11-19 Thread Filipe Manana
On Mon, Nov 19, 2018 at 11:52 AM Filipe Manana  wrote:
>
> On Mon, Nov 19, 2018 at 11:35 AM Qu Wenruo  wrote:
> >
> >
> >
> > On 2018/11/19 下午7:13, Filipe Manana wrote:
> > > On Mon, Nov 19, 2018 at 11:09 AM Qu Wenruo  wrote:
> > >>
> > >>
> > >>
> > >> On 2018/11/19 下午5:48, fdman...@kernel.org wrote:
> > >>> From: Filipe Manana 
> > >>>
> > >>> If the quota enable and snapshot creation ioctls are called concurrently
> > >>> we can get into a deadlock where the task enabling quotas will deadlock
> > >>> on the fs_info->qgroup_ioctl_lock mutex because it attempts to lock it
> > >>> twice. The following time diagram shows how this happens.
> > >>>
> > >>>CPU 0CPU 1
> > >>>
> > >>>  btrfs_ioctl()
> > >>>   btrfs_ioctl_quota_ctl()
> > >>>btrfs_quota_enable()
> > >>> mutex_lock(fs_info->qgroup_ioctl_lock)
> > >>> btrfs_start_transaction()
> > >>>
> > >>>  btrfs_ioctl()
> > >>>   btrfs_ioctl_snap_create_v2
> > >>>create_snapshot()
> > >>> --> adds snapshot to the
> > >>> list 
> > >>> pending_snapshots
> > >>> of the current
> > >>> transaction
> > >>>
> > >>> btrfs_commit_transaction()
> > >>>  create_pending_snapshots()
> > >>>create_pending_snapshot()
> > >>> qgroup_account_snapshot()
> > >>>  btrfs_qgroup_inherit()
> > >>>  mutex_lock(fs_info->qgroup_ioctl_lock)
> > >>>   --> deadlock, mutex already locked
> > >>>   by this task at
> > >>>   btrfs_quota_enable()
> > >>
> > >> The backtrace looks valid.
> > >>
> > >>>
> > >>> So fix this by adding a flag to the transaction handle that signals if 
> > >>> the
> > >>> transaction is being used for enabling quotas (only seen by the task 
> > >>> doing
> > >>> it) and do not lock the mutex qgroup_ioctl_lock at 
> > >>> btrfs_qgroup_inherit()
> > >>> if the transaction handle corresponds to the one being used to enable 
> > >>> the
> > >>> quotas.
> > >>>
> > >>> Fixes: 6426c7ad697d ("btrfs: qgroup: Fix qgroup accounting when 
> > >>> creating snapshot")
> > >>> Signed-off-by: Filipe Manana 
> > >>> ---
> > >>>  fs/btrfs/qgroup.c  | 10 --
> > >>>  fs/btrfs/transaction.h |  1 +
> > >>>  2 files changed, 9 insertions(+), 2 deletions(-)
> > >>>
> > >>> diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
> > >>> index d4917c0cddf5..3aec3bfa3d70 100644
> > >>> --- a/fs/btrfs/qgroup.c
> > >>> +++ b/fs/btrfs/qgroup.c
> > >>> @@ -908,6 +908,7 @@ int btrfs_quota_enable(struct btrfs_fs_info 
> > >>> *fs_info)
> > >>>   trans = NULL;
> > >>>   goto out;
> > >>>   }
> > >>> + trans->enabling_quotas = true;
> > >>
> > >> Should we put enabling_quotas bit into btrfs_transaction instead of
> > >> btrfs_trans_handle?
> > >
> > > Why?
> > > Only the task which is enabling quotas needs to know about it.
> >
> > But it's the btrfs_qgroup_inherit() using the trans handler to avoid
> > dead lock.
> >
> > What makes sure btrfs_qgroup_inherit() get the exactly same trans
> > handler allocated here?
>
> If it's the other task (the one creating a snapshot) that starts the
> transaction commit,
> it will have to wait for the task enabling quotas to release the
> transaction - once that task
> also calls commit_transaction(), it will skip doing the commit itself
> and wait for the snapshot
> one to finish the commit, while holding the qgroup mutex (this part I
> missed before).
> So yes we'll have to use a bit in the transaction itself instead.

That (using a flag in the transaction itself) wouldn't be good, it would allow
concurrent and unprotected access to qgroup stuff at
btrfs_qgroup_inherit() by anyone who
calls it (currently only subvolume creation). Fortunately there's a
much simpler solution in v2.

>
> >
> > >
> > >>
> > >> Isn't it possible to have different trans handle pointed to the same
> > >> transaction?
> > >
> > > Yes.
> > >
> > >>
> > >> And I'm not really sure about the naming "enabling_quotas".
> > >> What about "quota_ioctl_mutex_hold"? (Well, this also sounds awful)
> > >
> > > Too long.
> >
> > Anyway, current naming doesn't really show why we could skip mutex
> > locking. Just hope to get some name better.
>
> No name will ever show you that.
> You'll always have to see where  and how it's used, unless you want a
> name like "dont_lock_mutex_because_we_locked_it_at_btrfs...".
>
> >
> > Thanks,
> > Qu
> >
> > >
> > >
> > >>
> > >> Thanks,
> > >> Qu
> > >>
> > >>>
> > >>>   fs_info->qgroup_ulist = ulist_alloc(GFP_KERNEL);
> > >>>   if (!fs_info->qgroup_ulist) {
> > >>> @@ -2250,7 +2251,11 @@ int btrfs_qgroup_inherit(struct 
> > >>> btrfs_trans_handle 

Re: [PATCH] Btrfs: fix deadlock when enabling quotas due to concurrent snapshot creation

2018-11-19 Thread Qu Wenruo


On 2018/11/19 下午7:52, Filipe Manana wrote:
> On Mon, Nov 19, 2018 at 11:35 AM Qu Wenruo  wrote:
>>
>>
>>
>> On 2018/11/19 下午7:13, Filipe Manana wrote:
>>> On Mon, Nov 19, 2018 at 11:09 AM Qu Wenruo  wrote:



 On 2018/11/19 下午5:48, fdman...@kernel.org wrote:
> From: Filipe Manana 
>
> If the quota enable and snapshot creation ioctls are called concurrently
> we can get into a deadlock where the task enabling quotas will deadlock
> on the fs_info->qgroup_ioctl_lock mutex because it attempts to lock it
> twice. The following time diagram shows how this happens.
>
>CPU 0CPU 1
>
>  btrfs_ioctl()
>   btrfs_ioctl_quota_ctl()
>btrfs_quota_enable()
> mutex_lock(fs_info->qgroup_ioctl_lock)
> btrfs_start_transaction()
>
>  btrfs_ioctl()
>   btrfs_ioctl_snap_create_v2
>create_snapshot()
> --> adds snapshot to the
> list pending_snapshots
> of the current
> transaction
>
> btrfs_commit_transaction()
>  create_pending_snapshots()
>create_pending_snapshot()
> qgroup_account_snapshot()
>  btrfs_qgroup_inherit()
>  mutex_lock(fs_info->qgroup_ioctl_lock)
>   --> deadlock, mutex already locked
>   by this task at
>   btrfs_quota_enable()

 The backtrace looks valid.

>
> So fix this by adding a flag to the transaction handle that signals if the
> transaction is being used for enabling quotas (only seen by the task doing
> it) and do not lock the mutex qgroup_ioctl_lock at btrfs_qgroup_inherit()
> if the transaction handle corresponds to the one being used to enable the
> quotas.
>
> Fixes: 6426c7ad697d ("btrfs: qgroup: Fix qgroup accounting when creating 
> snapshot")
> Signed-off-by: Filipe Manana 
> ---
>  fs/btrfs/qgroup.c  | 10 --
>  fs/btrfs/transaction.h |  1 +
>  2 files changed, 9 insertions(+), 2 deletions(-)
>
> diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
> index d4917c0cddf5..3aec3bfa3d70 100644
> --- a/fs/btrfs/qgroup.c
> +++ b/fs/btrfs/qgroup.c
> @@ -908,6 +908,7 @@ int btrfs_quota_enable(struct btrfs_fs_info *fs_info)
>   trans = NULL;
>   goto out;
>   }
> + trans->enabling_quotas = true;

 Should we put enabling_quotas bit into btrfs_transaction instead of
 btrfs_trans_handle?
>>>
>>> Why?
>>> Only the task which is enabling quotas needs to know about it.
>>
>> But it's the btrfs_qgroup_inherit() using the trans handler to avoid
>> dead lock.
>>
>> What makes sure btrfs_qgroup_inherit() get the exactly same trans
>> handler allocated here?
> 
> If it's the other task (the one creating a snapshot) that starts the
> transaction commit,
> it will have to wait for the task enabling quotas to release the
> transaction - once that task
> also calls commit_transaction(), it will skip doing the commit itself
> and wait for the snapshot
> one to finish the commit, while holding the qgroup mutex (this part I
> missed before).
> So yes we'll have to use a bit in the transaction itself instead.
> 
>>
>>>

 Isn't it possible to have different trans handle pointed to the same
 transaction?
>>>
>>> Yes.
>>>

 And I'm not really sure about the naming "enabling_quotas".
 What about "quota_ioctl_mutex_hold"? (Well, this also sounds awful)
>>>
>>> Too long.
>>
>> Anyway, current naming doesn't really show why we could skip mutex
>> locking. Just hope to get some name better.
> 
> No name will ever show you that.
> You'll always have to see where  and how it's used, unless you want a
> name like "dont_lock_mutex_because_we_locked_it_at_btrfs...".

(Personally speaking I indeed prefer this one naming as it doesn't
exceed 80 chars yet)

Your statement makes sense, just keep current naming.

Thanks,
Qu

> 
>>
>> Thanks,
>> Qu
>>
>>>
>>>

 Thanks,
 Qu

>
>   fs_info->qgroup_ulist = ulist_alloc(GFP_KERNEL);
>   if (!fs_info->qgroup_ulist) {
> @@ -2250,7 +2251,11 @@ int btrfs_qgroup_inherit(struct btrfs_trans_handle 
> *trans, u64 srcid,
>   u32 level_size = 0;
>   u64 nums;
>
> - mutex_lock(_info->qgroup_ioctl_lock);
> + if (trans->enabling_quotas)
> + lockdep_assert_held(_info->qgroup_ioctl_lock);
> + else
> + mutex_lock(_info->qgroup_ioctl_lock);
> +
>   if (!test_bit(BTRFS_FS_QUOTA_ENABLED, 

Re: [PATCH] Btrfs: fix deadlock when enabling quotas due to concurrent snapshot creation

2018-11-19 Thread Filipe Manana
On Mon, Nov 19, 2018 at 11:35 AM Qu Wenruo  wrote:
>
>
>
> On 2018/11/19 下午7:13, Filipe Manana wrote:
> > On Mon, Nov 19, 2018 at 11:09 AM Qu Wenruo  wrote:
> >>
> >>
> >>
> >> On 2018/11/19 下午5:48, fdman...@kernel.org wrote:
> >>> From: Filipe Manana 
> >>>
> >>> If the quota enable and snapshot creation ioctls are called concurrently
> >>> we can get into a deadlock where the task enabling quotas will deadlock
> >>> on the fs_info->qgroup_ioctl_lock mutex because it attempts to lock it
> >>> twice. The following time diagram shows how this happens.
> >>>
> >>>CPU 0CPU 1
> >>>
> >>>  btrfs_ioctl()
> >>>   btrfs_ioctl_quota_ctl()
> >>>btrfs_quota_enable()
> >>> mutex_lock(fs_info->qgroup_ioctl_lock)
> >>> btrfs_start_transaction()
> >>>
> >>>  btrfs_ioctl()
> >>>   btrfs_ioctl_snap_create_v2
> >>>create_snapshot()
> >>> --> adds snapshot to the
> >>> list pending_snapshots
> >>> of the current
> >>> transaction
> >>>
> >>> btrfs_commit_transaction()
> >>>  create_pending_snapshots()
> >>>create_pending_snapshot()
> >>> qgroup_account_snapshot()
> >>>  btrfs_qgroup_inherit()
> >>>  mutex_lock(fs_info->qgroup_ioctl_lock)
> >>>   --> deadlock, mutex already locked
> >>>   by this task at
> >>>   btrfs_quota_enable()
> >>
> >> The backtrace looks valid.
> >>
> >>>
> >>> So fix this by adding a flag to the transaction handle that signals if the
> >>> transaction is being used for enabling quotas (only seen by the task doing
> >>> it) and do not lock the mutex qgroup_ioctl_lock at btrfs_qgroup_inherit()
> >>> if the transaction handle corresponds to the one being used to enable the
> >>> quotas.
> >>>
> >>> Fixes: 6426c7ad697d ("btrfs: qgroup: Fix qgroup accounting when creating 
> >>> snapshot")
> >>> Signed-off-by: Filipe Manana 
> >>> ---
> >>>  fs/btrfs/qgroup.c  | 10 --
> >>>  fs/btrfs/transaction.h |  1 +
> >>>  2 files changed, 9 insertions(+), 2 deletions(-)
> >>>
> >>> diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
> >>> index d4917c0cddf5..3aec3bfa3d70 100644
> >>> --- a/fs/btrfs/qgroup.c
> >>> +++ b/fs/btrfs/qgroup.c
> >>> @@ -908,6 +908,7 @@ int btrfs_quota_enable(struct btrfs_fs_info *fs_info)
> >>>   trans = NULL;
> >>>   goto out;
> >>>   }
> >>> + trans->enabling_quotas = true;
> >>
> >> Should we put enabling_quotas bit into btrfs_transaction instead of
> >> btrfs_trans_handle?
> >
> > Why?
> > Only the task which is enabling quotas needs to know about it.
>
> But it's the btrfs_qgroup_inherit() using the trans handler to avoid
> dead lock.
>
> What makes sure btrfs_qgroup_inherit() get the exactly same trans
> handler allocated here?

If it's the other task (the one creating a snapshot) that starts the
transaction commit,
it will have to wait for the task enabling quotas to release the
transaction - once that task
also calls commit_transaction(), it will skip doing the commit itself
and wait for the snapshot
one to finish the commit, while holding the qgroup mutex (this part I
missed before).
So yes we'll have to use a bit in the transaction itself instead.

>
> >
> >>
> >> Isn't it possible to have different trans handle pointed to the same
> >> transaction?
> >
> > Yes.
> >
> >>
> >> And I'm not really sure about the naming "enabling_quotas".
> >> What about "quota_ioctl_mutex_hold"? (Well, this also sounds awful)
> >
> > Too long.
>
> Anyway, current naming doesn't really show why we could skip mutex
> locking. Just hope to get some name better.

No name will ever show you that.
You'll always have to see where  and how it's used, unless you want a
name like "dont_lock_mutex_because_we_locked_it_at_btrfs...".

>
> Thanks,
> Qu
>
> >
> >
> >>
> >> Thanks,
> >> Qu
> >>
> >>>
> >>>   fs_info->qgroup_ulist = ulist_alloc(GFP_KERNEL);
> >>>   if (!fs_info->qgroup_ulist) {
> >>> @@ -2250,7 +2251,11 @@ int btrfs_qgroup_inherit(struct btrfs_trans_handle 
> >>> *trans, u64 srcid,
> >>>   u32 level_size = 0;
> >>>   u64 nums;
> >>>
> >>> - mutex_lock(_info->qgroup_ioctl_lock);
> >>> + if (trans->enabling_quotas)
> >>> + lockdep_assert_held(_info->qgroup_ioctl_lock);
> >>> + else
> >>> + mutex_lock(_info->qgroup_ioctl_lock);
> >>> +
> >>>   if (!test_bit(BTRFS_FS_QUOTA_ENABLED, _info->flags))
> >>>   goto out;
> >>>
> >>> @@ -2413,7 +2418,8 @@ int btrfs_qgroup_inherit(struct btrfs_trans_handle 
> >>> *trans, u64 srcid,
> >>>  unlock:
> >>>   spin_unlock(_info->qgroup_lock);
> >>>  out:
> >>> - 

Re: [PATCH] Btrfs: fix deadlock when enabling quotas due to concurrent snapshot creation

2018-11-19 Thread Qu Wenruo


On 2018/11/19 下午7:13, Filipe Manana wrote:
> On Mon, Nov 19, 2018 at 11:09 AM Qu Wenruo  wrote:
>>
>>
>>
>> On 2018/11/19 下午5:48, fdman...@kernel.org wrote:
>>> From: Filipe Manana 
>>>
>>> If the quota enable and snapshot creation ioctls are called concurrently
>>> we can get into a deadlock where the task enabling quotas will deadlock
>>> on the fs_info->qgroup_ioctl_lock mutex because it attempts to lock it
>>> twice. The following time diagram shows how this happens.
>>>
>>>CPU 0CPU 1
>>>
>>>  btrfs_ioctl()
>>>   btrfs_ioctl_quota_ctl()
>>>btrfs_quota_enable()
>>> mutex_lock(fs_info->qgroup_ioctl_lock)
>>> btrfs_start_transaction()
>>>
>>>  btrfs_ioctl()
>>>   btrfs_ioctl_snap_create_v2
>>>create_snapshot()
>>> --> adds snapshot to the
>>> list pending_snapshots
>>> of the current
>>> transaction
>>>
>>> btrfs_commit_transaction()
>>>  create_pending_snapshots()
>>>create_pending_snapshot()
>>> qgroup_account_snapshot()
>>>  btrfs_qgroup_inherit()
>>>  mutex_lock(fs_info->qgroup_ioctl_lock)
>>>   --> deadlock, mutex already locked
>>>   by this task at
>>>   btrfs_quota_enable()
>>
>> The backtrace looks valid.
>>
>>>
>>> So fix this by adding a flag to the transaction handle that signals if the
>>> transaction is being used for enabling quotas (only seen by the task doing
>>> it) and do not lock the mutex qgroup_ioctl_lock at btrfs_qgroup_inherit()
>>> if the transaction handle corresponds to the one being used to enable the
>>> quotas.
>>>
>>> Fixes: 6426c7ad697d ("btrfs: qgroup: Fix qgroup accounting when creating 
>>> snapshot")
>>> Signed-off-by: Filipe Manana 
>>> ---
>>>  fs/btrfs/qgroup.c  | 10 --
>>>  fs/btrfs/transaction.h |  1 +
>>>  2 files changed, 9 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
>>> index d4917c0cddf5..3aec3bfa3d70 100644
>>> --- a/fs/btrfs/qgroup.c
>>> +++ b/fs/btrfs/qgroup.c
>>> @@ -908,6 +908,7 @@ int btrfs_quota_enable(struct btrfs_fs_info *fs_info)
>>>   trans = NULL;
>>>   goto out;
>>>   }
>>> + trans->enabling_quotas = true;
>>
>> Should we put enabling_quotas bit into btrfs_transaction instead of
>> btrfs_trans_handle?
> 
> Why?
> Only the task which is enabling quotas needs to know about it.

But it's the btrfs_qgroup_inherit() using the trans handler to avoid
dead lock.

What makes sure btrfs_qgroup_inherit() get the exactly same trans
handler allocated here?

> 
>>
>> Isn't it possible to have different trans handle pointed to the same
>> transaction?
> 
> Yes.
> 
>>
>> And I'm not really sure about the naming "enabling_quotas".
>> What about "quota_ioctl_mutex_hold"? (Well, this also sounds awful)
> 
> Too long.

Anyway, current naming doesn't really show why we could skip mutex
locking. Just hope to get some name better.

Thanks,
Qu

> 
> 
>>
>> Thanks,
>> Qu
>>
>>>
>>>   fs_info->qgroup_ulist = ulist_alloc(GFP_KERNEL);
>>>   if (!fs_info->qgroup_ulist) {
>>> @@ -2250,7 +2251,11 @@ int btrfs_qgroup_inherit(struct btrfs_trans_handle 
>>> *trans, u64 srcid,
>>>   u32 level_size = 0;
>>>   u64 nums;
>>>
>>> - mutex_lock(_info->qgroup_ioctl_lock);
>>> + if (trans->enabling_quotas)
>>> + lockdep_assert_held(_info->qgroup_ioctl_lock);
>>> + else
>>> + mutex_lock(_info->qgroup_ioctl_lock);
>>> +
>>>   if (!test_bit(BTRFS_FS_QUOTA_ENABLED, _info->flags))
>>>   goto out;
>>>
>>> @@ -2413,7 +2418,8 @@ int btrfs_qgroup_inherit(struct btrfs_trans_handle 
>>> *trans, u64 srcid,
>>>  unlock:
>>>   spin_unlock(_info->qgroup_lock);
>>>  out:
>>> - mutex_unlock(_info->qgroup_ioctl_lock);
>>> + if (!trans->enabling_quotas)
>>> + mutex_unlock(_info->qgroup_ioctl_lock);
>>>   return ret;
>>>  }
>>>
>>> diff --git a/fs/btrfs/transaction.h b/fs/btrfs/transaction.h
>>> index 703d5116a2fc..a5553a1dee30 100644
>>> --- a/fs/btrfs/transaction.h
>>> +++ b/fs/btrfs/transaction.h
>>> @@ -122,6 +122,7 @@ struct btrfs_trans_handle {
>>>   bool reloc_reserved;
>>>   bool sync;
>>>   bool dirty;
>>> + bool enabling_quotas;
>>>   struct btrfs_root *root;
>>>   struct btrfs_fs_info *fs_info;
>>>   struct list_head new_bgs;
>>>
>>



signature.asc
Description: OpenPGP digital signature


Re: [PATCH] Btrfs: fix deadlock when enabling quotas due to concurrent snapshot creation

2018-11-19 Thread Filipe Manana
On Mon, Nov 19, 2018 at 11:09 AM Qu Wenruo  wrote:
>
>
>
> On 2018/11/19 下午5:48, fdman...@kernel.org wrote:
> > From: Filipe Manana 
> >
> > If the quota enable and snapshot creation ioctls are called concurrently
> > we can get into a deadlock where the task enabling quotas will deadlock
> > on the fs_info->qgroup_ioctl_lock mutex because it attempts to lock it
> > twice. The following time diagram shows how this happens.
> >
> >CPU 0CPU 1
> >
> >  btrfs_ioctl()
> >   btrfs_ioctl_quota_ctl()
> >btrfs_quota_enable()
> > mutex_lock(fs_info->qgroup_ioctl_lock)
> > btrfs_start_transaction()
> >
> >  btrfs_ioctl()
> >   btrfs_ioctl_snap_create_v2
> >create_snapshot()
> > --> adds snapshot to the
> > list pending_snapshots
> > of the current
> > transaction
> >
> > btrfs_commit_transaction()
> >  create_pending_snapshots()
> >create_pending_snapshot()
> > qgroup_account_snapshot()
> >  btrfs_qgroup_inherit()
> >  mutex_lock(fs_info->qgroup_ioctl_lock)
> >   --> deadlock, mutex already locked
> >   by this task at
> >   btrfs_quota_enable()
>
> The backtrace looks valid.
>
> >
> > So fix this by adding a flag to the transaction handle that signals if the
> > transaction is being used for enabling quotas (only seen by the task doing
> > it) and do not lock the mutex qgroup_ioctl_lock at btrfs_qgroup_inherit()
> > if the transaction handle corresponds to the one being used to enable the
> > quotas.
> >
> > Fixes: 6426c7ad697d ("btrfs: qgroup: Fix qgroup accounting when creating 
> > snapshot")
> > Signed-off-by: Filipe Manana 
> > ---
> >  fs/btrfs/qgroup.c  | 10 --
> >  fs/btrfs/transaction.h |  1 +
> >  2 files changed, 9 insertions(+), 2 deletions(-)
> >
> > diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
> > index d4917c0cddf5..3aec3bfa3d70 100644
> > --- a/fs/btrfs/qgroup.c
> > +++ b/fs/btrfs/qgroup.c
> > @@ -908,6 +908,7 @@ int btrfs_quota_enable(struct btrfs_fs_info *fs_info)
> >   trans = NULL;
> >   goto out;
> >   }
> > + trans->enabling_quotas = true;
>
> Should we put enabling_quotas bit into btrfs_transaction instead of
> btrfs_trans_handle?

Why?
Only the task which is enabling quotas needs to know about it.

>
> Isn't it possible to have different trans handle pointed to the same
> transaction?

Yes.

>
> And I'm not really sure about the naming "enabling_quotas".
> What about "quota_ioctl_mutex_hold"? (Well, this also sounds awful)

Too long.


>
> Thanks,
> Qu
>
> >
> >   fs_info->qgroup_ulist = ulist_alloc(GFP_KERNEL);
> >   if (!fs_info->qgroup_ulist) {
> > @@ -2250,7 +2251,11 @@ int btrfs_qgroup_inherit(struct btrfs_trans_handle 
> > *trans, u64 srcid,
> >   u32 level_size = 0;
> >   u64 nums;
> >
> > - mutex_lock(_info->qgroup_ioctl_lock);
> > + if (trans->enabling_quotas)
> > + lockdep_assert_held(_info->qgroup_ioctl_lock);
> > + else
> > + mutex_lock(_info->qgroup_ioctl_lock);
> > +
> >   if (!test_bit(BTRFS_FS_QUOTA_ENABLED, _info->flags))
> >   goto out;
> >
> > @@ -2413,7 +2418,8 @@ int btrfs_qgroup_inherit(struct btrfs_trans_handle 
> > *trans, u64 srcid,
> >  unlock:
> >   spin_unlock(_info->qgroup_lock);
> >  out:
> > - mutex_unlock(_info->qgroup_ioctl_lock);
> > + if (!trans->enabling_quotas)
> > + mutex_unlock(_info->qgroup_ioctl_lock);
> >   return ret;
> >  }
> >
> > diff --git a/fs/btrfs/transaction.h b/fs/btrfs/transaction.h
> > index 703d5116a2fc..a5553a1dee30 100644
> > --- a/fs/btrfs/transaction.h
> > +++ b/fs/btrfs/transaction.h
> > @@ -122,6 +122,7 @@ struct btrfs_trans_handle {
> >   bool reloc_reserved;
> >   bool sync;
> >   bool dirty;
> > + bool enabling_quotas;
> >   struct btrfs_root *root;
> >   struct btrfs_fs_info *fs_info;
> >   struct list_head new_bgs;
> >
>


Re: [PATCH] Btrfs: fix deadlock when enabling quotas due to concurrent snapshot creation

2018-11-19 Thread Qu Wenruo


On 2018/11/19 下午5:48, fdman...@kernel.org wrote:
> From: Filipe Manana 
> 
> If the quota enable and snapshot creation ioctls are called concurrently
> we can get into a deadlock where the task enabling quotas will deadlock
> on the fs_info->qgroup_ioctl_lock mutex because it attempts to lock it
> twice. The following time diagram shows how this happens.
> 
>CPU 0CPU 1
> 
>  btrfs_ioctl()
>   btrfs_ioctl_quota_ctl()
>btrfs_quota_enable()
> mutex_lock(fs_info->qgroup_ioctl_lock)
> btrfs_start_transaction()
> 
>  btrfs_ioctl()
>   btrfs_ioctl_snap_create_v2
>create_snapshot()
> --> adds snapshot to the
> list pending_snapshots
> of the current
> transaction
> 
> btrfs_commit_transaction()
>  create_pending_snapshots()
>create_pending_snapshot()
> qgroup_account_snapshot()
>  btrfs_qgroup_inherit()
>  mutex_lock(fs_info->qgroup_ioctl_lock)
>   --> deadlock, mutex already locked
>   by this task at
>   btrfs_quota_enable()

The backtrace looks valid.

> 
> So fix this by adding a flag to the transaction handle that signals if the
> transaction is being used for enabling quotas (only seen by the task doing
> it) and do not lock the mutex qgroup_ioctl_lock at btrfs_qgroup_inherit()
> if the transaction handle corresponds to the one being used to enable the
> quotas.
> 
> Fixes: 6426c7ad697d ("btrfs: qgroup: Fix qgroup accounting when creating 
> snapshot")
> Signed-off-by: Filipe Manana 
> ---
>  fs/btrfs/qgroup.c  | 10 --
>  fs/btrfs/transaction.h |  1 +
>  2 files changed, 9 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
> index d4917c0cddf5..3aec3bfa3d70 100644
> --- a/fs/btrfs/qgroup.c
> +++ b/fs/btrfs/qgroup.c
> @@ -908,6 +908,7 @@ int btrfs_quota_enable(struct btrfs_fs_info *fs_info)
>   trans = NULL;
>   goto out;
>   }
> + trans->enabling_quotas = true;

Should we put enabling_quotas bit into btrfs_transaction instead of
btrfs_trans_handle?

Isn't it possible to have different trans handle pointed to the same
transaction?

And I'm not really sure about the naming "enabling_quotas".
What about "quota_ioctl_mutex_hold"? (Well, this also sounds awful)

Thanks,
Qu

>  
>   fs_info->qgroup_ulist = ulist_alloc(GFP_KERNEL);
>   if (!fs_info->qgroup_ulist) {
> @@ -2250,7 +2251,11 @@ int btrfs_qgroup_inherit(struct btrfs_trans_handle 
> *trans, u64 srcid,
>   u32 level_size = 0;
>   u64 nums;
>  
> - mutex_lock(_info->qgroup_ioctl_lock);
> + if (trans->enabling_quotas)
> + lockdep_assert_held(_info->qgroup_ioctl_lock);
> + else
> + mutex_lock(_info->qgroup_ioctl_lock);
> +
>   if (!test_bit(BTRFS_FS_QUOTA_ENABLED, _info->flags))
>   goto out;
>  
> @@ -2413,7 +2418,8 @@ int btrfs_qgroup_inherit(struct btrfs_trans_handle 
> *trans, u64 srcid,
>  unlock:
>   spin_unlock(_info->qgroup_lock);
>  out:
> - mutex_unlock(_info->qgroup_ioctl_lock);
> + if (!trans->enabling_quotas)
> + mutex_unlock(_info->qgroup_ioctl_lock);
>   return ret;
>  }
>  
> diff --git a/fs/btrfs/transaction.h b/fs/btrfs/transaction.h
> index 703d5116a2fc..a5553a1dee30 100644
> --- a/fs/btrfs/transaction.h
> +++ b/fs/btrfs/transaction.h
> @@ -122,6 +122,7 @@ struct btrfs_trans_handle {
>   bool reloc_reserved;
>   bool sync;
>   bool dirty;
> + bool enabling_quotas;
>   struct btrfs_root *root;
>   struct btrfs_fs_info *fs_info;
>   struct list_head new_bgs;
> 



signature.asc
Description: OpenPGP digital signature


Re: [PATCH] Btrfs: fix deadlock when enabling quotas due to concurrent snapshot creation

2018-11-19 Thread Nikolay Borisov



On 19.11.18 г. 11:48 ч., fdman...@kernel.org wrote:
> From: Filipe Manana 
> 
> If the quota enable and snapshot creation ioctls are called concurrently
> we can get into a deadlock where the task enabling quotas will deadlock
> on the fs_info->qgroup_ioctl_lock mutex because it attempts to lock it
> twice. The following time diagram shows how this happens.
> 
>CPU 0CPU 1
> 
>  btrfs_ioctl()
>   btrfs_ioctl_quota_ctl()
>btrfs_quota_enable()
> mutex_lock(fs_info->qgroup_ioctl_lock)
> btrfs_start_transaction()
> 
>  btrfs_ioctl()
>   btrfs_ioctl_snap_create_v2
>create_snapshot()
> --> adds snapshot to the
> list pending_snapshots
> of the current
> transaction
> 
> btrfs_commit_transaction()
>  create_pending_snapshots()
>create_pending_snapshot()
> qgroup_account_snapshot()
>  btrfs_qgroup_inherit()
>  mutex_lock(fs_info->qgroup_ioctl_lock)
>   --> deadlock, mutex already locked
>   by this task at
>   btrfs_quota_enable()
> 
> So fix this by adding a flag to the transaction handle that signals if the
> transaction is being used for enabling quotas (only seen by the task doing
> it) and do not lock the mutex qgroup_ioctl_lock at btrfs_qgroup_inherit()
> if the transaction handle corresponds to the one being used to enable the
> quotas.
> 
> Fixes: 6426c7ad697d ("btrfs: qgroup: Fix qgroup accounting when creating 
> snapshot")
> Signed-off-by: Filipe Manana 
> ---
>  fs/btrfs/qgroup.c  | 10 --
>  fs/btrfs/transaction.h |  1 +
>  2 files changed, 9 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
> index d4917c0cddf5..3aec3bfa3d70 100644
> --- a/fs/btrfs/qgroup.c
> +++ b/fs/btrfs/qgroup.c
> @@ -908,6 +908,7 @@ int btrfs_quota_enable(struct btrfs_fs_info *fs_info)
>   trans = NULL;
>   goto out;
>   }
> + trans->enabling_quotas = true;
>  
>   fs_info->qgroup_ulist = ulist_alloc(GFP_KERNEL);
>   if (!fs_info->qgroup_ulist) {
> @@ -2250,7 +2251,11 @@ int btrfs_qgroup_inherit(struct btrfs_trans_handle 
> *trans, u64 srcid,
>   u32 level_size = 0;
>   u64 nums;
>  
> - mutex_lock(_info->qgroup_ioctl_lock);
> + if (trans->enabling_quotas)
> + lockdep_assert_held(_info->qgroup_ioctl_lock);
> + else
> + mutex_lock(_info->qgroup_ioctl_lock);
> +

nit: That's a bit ugly for my taste, but I don't think the 
alternative is any better: 

ASSERT((trans->enabling_quotas && 
!lockdep_assert_held(_info->qgroup_ioctl_lock)) || !trans->enabling_quotas)

if (!trans->enabling_quotas)
mutex_lock(...)


>   if (!test_bit(BTRFS_FS_QUOTA_ENABLED, _info->flags))
>   goto out;
>  
> @@ -2413,7 +2418,8 @@ int btrfs_qgroup_inherit(struct btrfs_trans_handle 
> *trans, u64 srcid,
>  unlock:
>   spin_unlock(_info->qgroup_lock);
>  out:
> - mutex_unlock(_info->qgroup_ioctl_lock);
> + if (!trans->enabling_quotas)
> + mutex_unlock(_info->qgroup_ioctl_lock);
>   return ret;
>  }
>  
> diff --git a/fs/btrfs/transaction.h b/fs/btrfs/transaction.h
> index 703d5116a2fc..a5553a1dee30 100644
> --- a/fs/btrfs/transaction.h
> +++ b/fs/btrfs/transaction.h
> @@ -122,6 +122,7 @@ struct btrfs_trans_handle {
>   bool reloc_reserved;
>   bool sync;
>   bool dirty;
> + bool enabling_quotas;
>   struct btrfs_root *root;
>   struct btrfs_fs_info *fs_info;
>   struct list_head new_bgs;
>