On Mon, Nov 19, 2018 at 11:52 AM Filipe Manana <fdman...@kernel.org> wrote:
>
> On Mon, Nov 19, 2018 at 11:35 AM Qu Wenruo <quwenruo.bt...@gmx.com> wrote:
> >
> >
> >
> > On 2018/11/19 下午7:13, Filipe Manana wrote:
> > > On Mon, Nov 19, 2018 at 11:09 AM Qu Wenruo <quwenruo.bt...@gmx.com> wrote:
> > >>
> > >>
> > >>
> > >> On 2018/11/19 下午5:48, fdman...@kernel.org wrote:
> > >>> From: Filipe Manana <fdman...@suse.com>
> > >>>
> > >>> If the quota enable and snapshot creation ioctls are called concurrently
> > >>> we can get into a deadlock where the task enabling quotas will deadlock
> > >>> on the fs_info->qgroup_ioctl_lock mutex because it attempts to lock it
> > >>> twice. The following time diagram shows how this happens.
> > >>>
> > >>>            CPU 0                                    CPU 1
> > >>>
> > >>>  btrfs_ioctl()
> > >>>   btrfs_ioctl_quota_ctl()
> > >>>    btrfs_quota_enable()
> > >>>     mutex_lock(fs_info->qgroup_ioctl_lock)
> > >>>     btrfs_start_transaction()
> > >>>
> > >>>                                              btrfs_ioctl()
> > >>>                                               btrfs_ioctl_snap_create_v2
> > >>>                                                create_snapshot()
> > >>>                                                 --> adds snapshot to the
> > >>>                                                     list 
> > >>> pending_snapshots
> > >>>                                                     of the current
> > >>>                                                     transaction
> > >>>
> > >>>     btrfs_commit_transaction()
> > >>>      create_pending_snapshots()
> > >>>        create_pending_snapshot()
> > >>>         qgroup_account_snapshot()
> > >>>          btrfs_qgroup_inherit()
> > >>>          mutex_lock(fs_info->qgroup_ioctl_lock)
> > >>>           --> deadlock, mutex already locked
> > >>>               by this task at
> > >>>               btrfs_quota_enable()
> > >>
> > >> The backtrace looks valid.
> > >>
> > >>>
> > >>> So fix this by adding a flag to the transaction handle that signals if 
> > >>> the
> > >>> transaction is being used for enabling quotas (only seen by the task 
> > >>> doing
> > >>> it) and do not lock the mutex qgroup_ioctl_lock at 
> > >>> btrfs_qgroup_inherit()
> > >>> if the transaction handle corresponds to the one being used to enable 
> > >>> the
> > >>> quotas.
> > >>>
> > >>> Fixes: 6426c7ad697d ("btrfs: qgroup: Fix qgroup accounting when 
> > >>> creating snapshot")
> > >>> Signed-off-by: Filipe Manana <fdman...@suse.com>
> > >>> ---
> > >>>  fs/btrfs/qgroup.c      | 10 ++++++++--
> > >>>  fs/btrfs/transaction.h |  1 +
> > >>>  2 files changed, 9 insertions(+), 2 deletions(-)
> > >>>
> > >>> diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
> > >>> index d4917c0cddf5..3aec3bfa3d70 100644
> > >>> --- a/fs/btrfs/qgroup.c
> > >>> +++ b/fs/btrfs/qgroup.c
> > >>> @@ -908,6 +908,7 @@ int btrfs_quota_enable(struct btrfs_fs_info 
> > >>> *fs_info)
> > >>>               trans = NULL;
> > >>>               goto out;
> > >>>       }
> > >>> +     trans->enabling_quotas = true;
> > >>
> > >> Should we put enabling_quotas bit into btrfs_transaction instead of
> > >> btrfs_trans_handle?
> > >
> > > Why?
> > > Only the task which is enabling quotas needs to know about it.
> >
> > But it's the btrfs_qgroup_inherit() using the trans handler to avoid
> > dead lock.
> >
> > What makes sure btrfs_qgroup_inherit() get the exactly same trans
> > handler allocated here?
>
> If it's the other task (the one creating a snapshot) that starts the
> transaction commit,
> it will have to wait for the task enabling quotas to release the
> transaction - once that task
> also calls commit_transaction(), it will skip doing the commit itself
> and wait for the snapshot
> one to finish the commit, while holding the qgroup mutex (this part I
> missed before).
> So yes we'll have to use a bit in the transaction itself instead.

That (using a flag in the transaction itself) wouldn't be good, it would allow
concurrent and unprotected access to qgroup stuff at
btrfs_qgroup_inherit() by anyone who
calls it (currently only subvolume creation). Fortunately there's a
much simpler solution in v2.

>
> >
> > >
> > >>
> > >> Isn't it possible to have different trans handle pointed to the same
> > >> transaction?
> > >
> > > Yes.
> > >
> > >>
> > >> And I'm not really sure about the naming "enabling_quotas".
> > >> What about "quota_ioctl_mutex_hold"? (Well, this also sounds awful)
> > >
> > > Too long.
> >
> > Anyway, current naming doesn't really show why we could skip mutex
> > locking. Just hope to get some name better.
>
> No name will ever show you that.
> You'll always have to see where  and how it's used, unless you want a
> name like "dont_lock_mutex_because_we_locked_it_at_btrfs...".
>
> >
> > Thanks,
> > Qu
> >
> > >
> > >
> > >>
> > >> Thanks,
> > >> Qu
> > >>
> > >>>
> > >>>       fs_info->qgroup_ulist = ulist_alloc(GFP_KERNEL);
> > >>>       if (!fs_info->qgroup_ulist) {
> > >>> @@ -2250,7 +2251,11 @@ int btrfs_qgroup_inherit(struct 
> > >>> btrfs_trans_handle *trans, u64 srcid,
> > >>>       u32 level_size = 0;
> > >>>       u64 nums;
> > >>>
> > >>> -     mutex_lock(&fs_info->qgroup_ioctl_lock);
> > >>> +     if (trans->enabling_quotas)
> > >>> +             lockdep_assert_held(&fs_info->qgroup_ioctl_lock);
> > >>> +     else
> > >>> +             mutex_lock(&fs_info->qgroup_ioctl_lock);
> > >>> +
> > >>>       if (!test_bit(BTRFS_FS_QUOTA_ENABLED, &fs_info->flags))
> > >>>               goto out;
> > >>>
> > >>> @@ -2413,7 +2418,8 @@ int btrfs_qgroup_inherit(struct 
> > >>> btrfs_trans_handle *trans, u64 srcid,
> > >>>  unlock:
> > >>>       spin_unlock(&fs_info->qgroup_lock);
> > >>>  out:
> > >>> -     mutex_unlock(&fs_info->qgroup_ioctl_lock);
> > >>> +     if (!trans->enabling_quotas)
> > >>> +             mutex_unlock(&fs_info->qgroup_ioctl_lock);
> > >>>       return ret;
> > >>>  }
> > >>>
> > >>> diff --git a/fs/btrfs/transaction.h b/fs/btrfs/transaction.h
> > >>> index 703d5116a2fc..a5553a1dee30 100644
> > >>> --- a/fs/btrfs/transaction.h
> > >>> +++ b/fs/btrfs/transaction.h
> > >>> @@ -122,6 +122,7 @@ struct btrfs_trans_handle {
> > >>>       bool reloc_reserved;
> > >>>       bool sync;
> > >>>       bool dirty;
> > >>> +     bool enabling_quotas;
> > >>>       struct btrfs_root *root;
> > >>>       struct btrfs_fs_info *fs_info;
> > >>>       struct list_head new_bgs;
> > >>>
> > >>
> >

Reply via email to