Re: [PATCH] Btrfs: fix deadlock when enabling quotas due to concurrent snapshot creation
On Mon, Nov 19, 2018 at 11:52 AM Filipe Manana wrote: > > On Mon, Nov 19, 2018 at 11:35 AM Qu Wenruo wrote: > > > > > > > > On 2018/11/19 下午7:13, Filipe Manana wrote: > > > On Mon, Nov 19, 2018 at 11:09 AM Qu Wenruo wrote: > > >> > > >> > > >> > > >> On 2018/11/19 下午5:48, fdman...@kernel.org wrote: > > >>> From: Filipe Manana > > >>> > > >>> If the quota enable and snapshot creation ioctls are called concurrently > > >>> we can get into a deadlock where the task enabling quotas will deadlock > > >>> on the fs_info->qgroup_ioctl_lock mutex because it attempts to lock it > > >>> twice. The following time diagram shows how this happens. > > >>> > > >>>CPU 0CPU 1 > > >>> > > >>> btrfs_ioctl() > > >>> btrfs_ioctl_quota_ctl() > > >>>btrfs_quota_enable() > > >>> mutex_lock(fs_info->qgroup_ioctl_lock) > > >>> btrfs_start_transaction() > > >>> > > >>> btrfs_ioctl() > > >>> btrfs_ioctl_snap_create_v2 > > >>>create_snapshot() > > >>> --> adds snapshot to the > > >>> list > > >>> pending_snapshots > > >>> of the current > > >>> transaction > > >>> > > >>> btrfs_commit_transaction() > > >>> create_pending_snapshots() > > >>>create_pending_snapshot() > > >>> qgroup_account_snapshot() > > >>> btrfs_qgroup_inherit() > > >>> mutex_lock(fs_info->qgroup_ioctl_lock) > > >>> --> deadlock, mutex already locked > > >>> by this task at > > >>> btrfs_quota_enable() > > >> > > >> The backtrace looks valid. > > >> > > >>> > > >>> So fix this by adding a flag to the transaction handle that signals if > > >>> the > > >>> transaction is being used for enabling quotas (only seen by the task > > >>> doing > > >>> it) and do not lock the mutex qgroup_ioctl_lock at > > >>> btrfs_qgroup_inherit() > > >>> if the transaction handle corresponds to the one being used to enable > > >>> the > > >>> quotas. > > >>> > > >>> Fixes: 6426c7ad697d ("btrfs: qgroup: Fix qgroup accounting when > > >>> creating snapshot") > > >>> Signed-off-by: Filipe Manana > > >>> --- > > >>> fs/btrfs/qgroup.c | 10 -- > > >>> fs/btrfs/transaction.h | 1 + > > >>> 2 files changed, 9 insertions(+), 2 deletions(-) > > >>> > > >>> diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c > > >>> index d4917c0cddf5..3aec3bfa3d70 100644 > > >>> --- a/fs/btrfs/qgroup.c > > >>> +++ b/fs/btrfs/qgroup.c > > >>> @@ -908,6 +908,7 @@ int btrfs_quota_enable(struct btrfs_fs_info > > >>> *fs_info) > > >>> trans = NULL; > > >>> goto out; > > >>> } > > >>> + trans->enabling_quotas = true; > > >> > > >> Should we put enabling_quotas bit into btrfs_transaction instead of > > >> btrfs_trans_handle? > > > > > > Why? > > > Only the task which is enabling quotas needs to know about it. > > > > But it's the btrfs_qgroup_inherit() using the trans handler to avoid > > dead lock. > > > > What makes sure btrfs_qgroup_inherit() get the exactly same trans > > handler allocated here? > > If it's the other task (the one creating a snapshot) that starts the > transaction commit, > it will have to wait for the task enabling quotas to release the > transaction - once that task > also calls commit_transaction(), it will skip doing the commit itself > and wait for the snapshot > one to finish the commit, while holding the qgroup mutex (this part I > missed before). > So yes we'll have to use a bit in the transaction itself instead. That (using a flag in the transaction itself) wouldn't be good, it would allow concurrent and unprotected access to qgroup stuff at btrfs_qgroup_inherit() by anyone who calls it (currently only subvolume creation). Fortunately there's a much simpler solution in v2. > > > > > > > > >> > > >> Isn't it possible to have different trans handle pointed to the same > > >> transaction? > > > > > > Yes. > > > > > >> > > >> And I'm not really sure about the naming "enabling_quotas". > > >> What about "quota_ioctl_mutex_hold"? (Well, this also sounds awful) > > > > > > Too long. > > > > Anyway, current naming doesn't really show why we could skip mutex > > locking. Just hope to get some name better. > > No name will ever show you that. > You'll always have to see where and how it's used, unless you want a > name like "dont_lock_mutex_because_we_locked_it_at_btrfs...". > > > > > Thanks, > > Qu > > > > > > > > > > >> > > >> Thanks, > > >> Qu > > >> > > >>> > > >>> fs_info->qgroup_ulist = ulist_alloc(GFP_KERNEL); > > >>> if (!fs_info->qgroup_ulist) { > > >>> @@ -2250,7 +2251,11 @@ int btrfs_qgroup_inherit(struct > > >>> btrfs_trans_handle
Re: [PATCH] Btrfs: fix deadlock when enabling quotas due to concurrent snapshot creation
On 2018/11/19 下午7:52, Filipe Manana wrote: > On Mon, Nov 19, 2018 at 11:35 AM Qu Wenruo wrote: >> >> >> >> On 2018/11/19 下午7:13, Filipe Manana wrote: >>> On Mon, Nov 19, 2018 at 11:09 AM Qu Wenruo wrote: On 2018/11/19 下午5:48, fdman...@kernel.org wrote: > From: Filipe Manana > > If the quota enable and snapshot creation ioctls are called concurrently > we can get into a deadlock where the task enabling quotas will deadlock > on the fs_info->qgroup_ioctl_lock mutex because it attempts to lock it > twice. The following time diagram shows how this happens. > >CPU 0CPU 1 > > btrfs_ioctl() > btrfs_ioctl_quota_ctl() >btrfs_quota_enable() > mutex_lock(fs_info->qgroup_ioctl_lock) > btrfs_start_transaction() > > btrfs_ioctl() > btrfs_ioctl_snap_create_v2 >create_snapshot() > --> adds snapshot to the > list pending_snapshots > of the current > transaction > > btrfs_commit_transaction() > create_pending_snapshots() >create_pending_snapshot() > qgroup_account_snapshot() > btrfs_qgroup_inherit() > mutex_lock(fs_info->qgroup_ioctl_lock) > --> deadlock, mutex already locked > by this task at > btrfs_quota_enable() The backtrace looks valid. > > So fix this by adding a flag to the transaction handle that signals if the > transaction is being used for enabling quotas (only seen by the task doing > it) and do not lock the mutex qgroup_ioctl_lock at btrfs_qgroup_inherit() > if the transaction handle corresponds to the one being used to enable the > quotas. > > Fixes: 6426c7ad697d ("btrfs: qgroup: Fix qgroup accounting when creating > snapshot") > Signed-off-by: Filipe Manana > --- > fs/btrfs/qgroup.c | 10 -- > fs/btrfs/transaction.h | 1 + > 2 files changed, 9 insertions(+), 2 deletions(-) > > diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c > index d4917c0cddf5..3aec3bfa3d70 100644 > --- a/fs/btrfs/qgroup.c > +++ b/fs/btrfs/qgroup.c > @@ -908,6 +908,7 @@ int btrfs_quota_enable(struct btrfs_fs_info *fs_info) > trans = NULL; > goto out; > } > + trans->enabling_quotas = true; Should we put enabling_quotas bit into btrfs_transaction instead of btrfs_trans_handle? >>> >>> Why? >>> Only the task which is enabling quotas needs to know about it. >> >> But it's the btrfs_qgroup_inherit() using the trans handler to avoid >> dead lock. >> >> What makes sure btrfs_qgroup_inherit() get the exactly same trans >> handler allocated here? > > If it's the other task (the one creating a snapshot) that starts the > transaction commit, > it will have to wait for the task enabling quotas to release the > transaction - once that task > also calls commit_transaction(), it will skip doing the commit itself > and wait for the snapshot > one to finish the commit, while holding the qgroup mutex (this part I > missed before). > So yes we'll have to use a bit in the transaction itself instead. > >> >>> Isn't it possible to have different trans handle pointed to the same transaction? >>> >>> Yes. >>> And I'm not really sure about the naming "enabling_quotas". What about "quota_ioctl_mutex_hold"? (Well, this also sounds awful) >>> >>> Too long. >> >> Anyway, current naming doesn't really show why we could skip mutex >> locking. Just hope to get some name better. > > No name will ever show you that. > You'll always have to see where and how it's used, unless you want a > name like "dont_lock_mutex_because_we_locked_it_at_btrfs...". (Personally speaking I indeed prefer this one naming as it doesn't exceed 80 chars yet) Your statement makes sense, just keep current naming. Thanks, Qu > >> >> Thanks, >> Qu >> >>> >>> Thanks, Qu > > fs_info->qgroup_ulist = ulist_alloc(GFP_KERNEL); > if (!fs_info->qgroup_ulist) { > @@ -2250,7 +2251,11 @@ int btrfs_qgroup_inherit(struct btrfs_trans_handle > *trans, u64 srcid, > u32 level_size = 0; > u64 nums; > > - mutex_lock(_info->qgroup_ioctl_lock); > + if (trans->enabling_quotas) > + lockdep_assert_held(_info->qgroup_ioctl_lock); > + else > + mutex_lock(_info->qgroup_ioctl_lock); > + > if (!test_bit(BTRFS_FS_QUOTA_ENABLED,
Re: [PATCH] Btrfs: fix deadlock when enabling quotas due to concurrent snapshot creation
On Mon, Nov 19, 2018 at 11:35 AM Qu Wenruo wrote: > > > > On 2018/11/19 下午7:13, Filipe Manana wrote: > > On Mon, Nov 19, 2018 at 11:09 AM Qu Wenruo wrote: > >> > >> > >> > >> On 2018/11/19 下午5:48, fdman...@kernel.org wrote: > >>> From: Filipe Manana > >>> > >>> If the quota enable and snapshot creation ioctls are called concurrently > >>> we can get into a deadlock where the task enabling quotas will deadlock > >>> on the fs_info->qgroup_ioctl_lock mutex because it attempts to lock it > >>> twice. The following time diagram shows how this happens. > >>> > >>>CPU 0CPU 1 > >>> > >>> btrfs_ioctl() > >>> btrfs_ioctl_quota_ctl() > >>>btrfs_quota_enable() > >>> mutex_lock(fs_info->qgroup_ioctl_lock) > >>> btrfs_start_transaction() > >>> > >>> btrfs_ioctl() > >>> btrfs_ioctl_snap_create_v2 > >>>create_snapshot() > >>> --> adds snapshot to the > >>> list pending_snapshots > >>> of the current > >>> transaction > >>> > >>> btrfs_commit_transaction() > >>> create_pending_snapshots() > >>>create_pending_snapshot() > >>> qgroup_account_snapshot() > >>> btrfs_qgroup_inherit() > >>> mutex_lock(fs_info->qgroup_ioctl_lock) > >>> --> deadlock, mutex already locked > >>> by this task at > >>> btrfs_quota_enable() > >> > >> The backtrace looks valid. > >> > >>> > >>> So fix this by adding a flag to the transaction handle that signals if the > >>> transaction is being used for enabling quotas (only seen by the task doing > >>> it) and do not lock the mutex qgroup_ioctl_lock at btrfs_qgroup_inherit() > >>> if the transaction handle corresponds to the one being used to enable the > >>> quotas. > >>> > >>> Fixes: 6426c7ad697d ("btrfs: qgroup: Fix qgroup accounting when creating > >>> snapshot") > >>> Signed-off-by: Filipe Manana > >>> --- > >>> fs/btrfs/qgroup.c | 10 -- > >>> fs/btrfs/transaction.h | 1 + > >>> 2 files changed, 9 insertions(+), 2 deletions(-) > >>> > >>> diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c > >>> index d4917c0cddf5..3aec3bfa3d70 100644 > >>> --- a/fs/btrfs/qgroup.c > >>> +++ b/fs/btrfs/qgroup.c > >>> @@ -908,6 +908,7 @@ int btrfs_quota_enable(struct btrfs_fs_info *fs_info) > >>> trans = NULL; > >>> goto out; > >>> } > >>> + trans->enabling_quotas = true; > >> > >> Should we put enabling_quotas bit into btrfs_transaction instead of > >> btrfs_trans_handle? > > > > Why? > > Only the task which is enabling quotas needs to know about it. > > But it's the btrfs_qgroup_inherit() using the trans handler to avoid > dead lock. > > What makes sure btrfs_qgroup_inherit() get the exactly same trans > handler allocated here? If it's the other task (the one creating a snapshot) that starts the transaction commit, it will have to wait for the task enabling quotas to release the transaction - once that task also calls commit_transaction(), it will skip doing the commit itself and wait for the snapshot one to finish the commit, while holding the qgroup mutex (this part I missed before). So yes we'll have to use a bit in the transaction itself instead. > > > > >> > >> Isn't it possible to have different trans handle pointed to the same > >> transaction? > > > > Yes. > > > >> > >> And I'm not really sure about the naming "enabling_quotas". > >> What about "quota_ioctl_mutex_hold"? (Well, this also sounds awful) > > > > Too long. > > Anyway, current naming doesn't really show why we could skip mutex > locking. Just hope to get some name better. No name will ever show you that. You'll always have to see where and how it's used, unless you want a name like "dont_lock_mutex_because_we_locked_it_at_btrfs...". > > Thanks, > Qu > > > > > > >> > >> Thanks, > >> Qu > >> > >>> > >>> fs_info->qgroup_ulist = ulist_alloc(GFP_KERNEL); > >>> if (!fs_info->qgroup_ulist) { > >>> @@ -2250,7 +2251,11 @@ int btrfs_qgroup_inherit(struct btrfs_trans_handle > >>> *trans, u64 srcid, > >>> u32 level_size = 0; > >>> u64 nums; > >>> > >>> - mutex_lock(_info->qgroup_ioctl_lock); > >>> + if (trans->enabling_quotas) > >>> + lockdep_assert_held(_info->qgroup_ioctl_lock); > >>> + else > >>> + mutex_lock(_info->qgroup_ioctl_lock); > >>> + > >>> if (!test_bit(BTRFS_FS_QUOTA_ENABLED, _info->flags)) > >>> goto out; > >>> > >>> @@ -2413,7 +2418,8 @@ int btrfs_qgroup_inherit(struct btrfs_trans_handle > >>> *trans, u64 srcid, > >>> unlock: > >>> spin_unlock(_info->qgroup_lock); > >>> out: > >>> -
Re: [PATCH] Btrfs: fix deadlock when enabling quotas due to concurrent snapshot creation
On 2018/11/19 下午7:13, Filipe Manana wrote: > On Mon, Nov 19, 2018 at 11:09 AM Qu Wenruo wrote: >> >> >> >> On 2018/11/19 下午5:48, fdman...@kernel.org wrote: >>> From: Filipe Manana >>> >>> If the quota enable and snapshot creation ioctls are called concurrently >>> we can get into a deadlock where the task enabling quotas will deadlock >>> on the fs_info->qgroup_ioctl_lock mutex because it attempts to lock it >>> twice. The following time diagram shows how this happens. >>> >>>CPU 0CPU 1 >>> >>> btrfs_ioctl() >>> btrfs_ioctl_quota_ctl() >>>btrfs_quota_enable() >>> mutex_lock(fs_info->qgroup_ioctl_lock) >>> btrfs_start_transaction() >>> >>> btrfs_ioctl() >>> btrfs_ioctl_snap_create_v2 >>>create_snapshot() >>> --> adds snapshot to the >>> list pending_snapshots >>> of the current >>> transaction >>> >>> btrfs_commit_transaction() >>> create_pending_snapshots() >>>create_pending_snapshot() >>> qgroup_account_snapshot() >>> btrfs_qgroup_inherit() >>> mutex_lock(fs_info->qgroup_ioctl_lock) >>> --> deadlock, mutex already locked >>> by this task at >>> btrfs_quota_enable() >> >> The backtrace looks valid. >> >>> >>> So fix this by adding a flag to the transaction handle that signals if the >>> transaction is being used for enabling quotas (only seen by the task doing >>> it) and do not lock the mutex qgroup_ioctl_lock at btrfs_qgroup_inherit() >>> if the transaction handle corresponds to the one being used to enable the >>> quotas. >>> >>> Fixes: 6426c7ad697d ("btrfs: qgroup: Fix qgroup accounting when creating >>> snapshot") >>> Signed-off-by: Filipe Manana >>> --- >>> fs/btrfs/qgroup.c | 10 -- >>> fs/btrfs/transaction.h | 1 + >>> 2 files changed, 9 insertions(+), 2 deletions(-) >>> >>> diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c >>> index d4917c0cddf5..3aec3bfa3d70 100644 >>> --- a/fs/btrfs/qgroup.c >>> +++ b/fs/btrfs/qgroup.c >>> @@ -908,6 +908,7 @@ int btrfs_quota_enable(struct btrfs_fs_info *fs_info) >>> trans = NULL; >>> goto out; >>> } >>> + trans->enabling_quotas = true; >> >> Should we put enabling_quotas bit into btrfs_transaction instead of >> btrfs_trans_handle? > > Why? > Only the task which is enabling quotas needs to know about it. But it's the btrfs_qgroup_inherit() using the trans handler to avoid dead lock. What makes sure btrfs_qgroup_inherit() get the exactly same trans handler allocated here? > >> >> Isn't it possible to have different trans handle pointed to the same >> transaction? > > Yes. > >> >> And I'm not really sure about the naming "enabling_quotas". >> What about "quota_ioctl_mutex_hold"? (Well, this also sounds awful) > > Too long. Anyway, current naming doesn't really show why we could skip mutex locking. Just hope to get some name better. Thanks, Qu > > >> >> Thanks, >> Qu >> >>> >>> fs_info->qgroup_ulist = ulist_alloc(GFP_KERNEL); >>> if (!fs_info->qgroup_ulist) { >>> @@ -2250,7 +2251,11 @@ int btrfs_qgroup_inherit(struct btrfs_trans_handle >>> *trans, u64 srcid, >>> u32 level_size = 0; >>> u64 nums; >>> >>> - mutex_lock(_info->qgroup_ioctl_lock); >>> + if (trans->enabling_quotas) >>> + lockdep_assert_held(_info->qgroup_ioctl_lock); >>> + else >>> + mutex_lock(_info->qgroup_ioctl_lock); >>> + >>> if (!test_bit(BTRFS_FS_QUOTA_ENABLED, _info->flags)) >>> goto out; >>> >>> @@ -2413,7 +2418,8 @@ int btrfs_qgroup_inherit(struct btrfs_trans_handle >>> *trans, u64 srcid, >>> unlock: >>> spin_unlock(_info->qgroup_lock); >>> out: >>> - mutex_unlock(_info->qgroup_ioctl_lock); >>> + if (!trans->enabling_quotas) >>> + mutex_unlock(_info->qgroup_ioctl_lock); >>> return ret; >>> } >>> >>> diff --git a/fs/btrfs/transaction.h b/fs/btrfs/transaction.h >>> index 703d5116a2fc..a5553a1dee30 100644 >>> --- a/fs/btrfs/transaction.h >>> +++ b/fs/btrfs/transaction.h >>> @@ -122,6 +122,7 @@ struct btrfs_trans_handle { >>> bool reloc_reserved; >>> bool sync; >>> bool dirty; >>> + bool enabling_quotas; >>> struct btrfs_root *root; >>> struct btrfs_fs_info *fs_info; >>> struct list_head new_bgs; >>> >> signature.asc Description: OpenPGP digital signature
Re: [PATCH] Btrfs: fix deadlock when enabling quotas due to concurrent snapshot creation
On Mon, Nov 19, 2018 at 11:09 AM Qu Wenruo wrote: > > > > On 2018/11/19 下午5:48, fdman...@kernel.org wrote: > > From: Filipe Manana > > > > If the quota enable and snapshot creation ioctls are called concurrently > > we can get into a deadlock where the task enabling quotas will deadlock > > on the fs_info->qgroup_ioctl_lock mutex because it attempts to lock it > > twice. The following time diagram shows how this happens. > > > >CPU 0CPU 1 > > > > btrfs_ioctl() > > btrfs_ioctl_quota_ctl() > >btrfs_quota_enable() > > mutex_lock(fs_info->qgroup_ioctl_lock) > > btrfs_start_transaction() > > > > btrfs_ioctl() > > btrfs_ioctl_snap_create_v2 > >create_snapshot() > > --> adds snapshot to the > > list pending_snapshots > > of the current > > transaction > > > > btrfs_commit_transaction() > > create_pending_snapshots() > >create_pending_snapshot() > > qgroup_account_snapshot() > > btrfs_qgroup_inherit() > > mutex_lock(fs_info->qgroup_ioctl_lock) > > --> deadlock, mutex already locked > > by this task at > > btrfs_quota_enable() > > The backtrace looks valid. > > > > > So fix this by adding a flag to the transaction handle that signals if the > > transaction is being used for enabling quotas (only seen by the task doing > > it) and do not lock the mutex qgroup_ioctl_lock at btrfs_qgroup_inherit() > > if the transaction handle corresponds to the one being used to enable the > > quotas. > > > > Fixes: 6426c7ad697d ("btrfs: qgroup: Fix qgroup accounting when creating > > snapshot") > > Signed-off-by: Filipe Manana > > --- > > fs/btrfs/qgroup.c | 10 -- > > fs/btrfs/transaction.h | 1 + > > 2 files changed, 9 insertions(+), 2 deletions(-) > > > > diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c > > index d4917c0cddf5..3aec3bfa3d70 100644 > > --- a/fs/btrfs/qgroup.c > > +++ b/fs/btrfs/qgroup.c > > @@ -908,6 +908,7 @@ int btrfs_quota_enable(struct btrfs_fs_info *fs_info) > > trans = NULL; > > goto out; > > } > > + trans->enabling_quotas = true; > > Should we put enabling_quotas bit into btrfs_transaction instead of > btrfs_trans_handle? Why? Only the task which is enabling quotas needs to know about it. > > Isn't it possible to have different trans handle pointed to the same > transaction? Yes. > > And I'm not really sure about the naming "enabling_quotas". > What about "quota_ioctl_mutex_hold"? (Well, this also sounds awful) Too long. > > Thanks, > Qu > > > > > fs_info->qgroup_ulist = ulist_alloc(GFP_KERNEL); > > if (!fs_info->qgroup_ulist) { > > @@ -2250,7 +2251,11 @@ int btrfs_qgroup_inherit(struct btrfs_trans_handle > > *trans, u64 srcid, > > u32 level_size = 0; > > u64 nums; > > > > - mutex_lock(_info->qgroup_ioctl_lock); > > + if (trans->enabling_quotas) > > + lockdep_assert_held(_info->qgroup_ioctl_lock); > > + else > > + mutex_lock(_info->qgroup_ioctl_lock); > > + > > if (!test_bit(BTRFS_FS_QUOTA_ENABLED, _info->flags)) > > goto out; > > > > @@ -2413,7 +2418,8 @@ int btrfs_qgroup_inherit(struct btrfs_trans_handle > > *trans, u64 srcid, > > unlock: > > spin_unlock(_info->qgroup_lock); > > out: > > - mutex_unlock(_info->qgroup_ioctl_lock); > > + if (!trans->enabling_quotas) > > + mutex_unlock(_info->qgroup_ioctl_lock); > > return ret; > > } > > > > diff --git a/fs/btrfs/transaction.h b/fs/btrfs/transaction.h > > index 703d5116a2fc..a5553a1dee30 100644 > > --- a/fs/btrfs/transaction.h > > +++ b/fs/btrfs/transaction.h > > @@ -122,6 +122,7 @@ struct btrfs_trans_handle { > > bool reloc_reserved; > > bool sync; > > bool dirty; > > + bool enabling_quotas; > > struct btrfs_root *root; > > struct btrfs_fs_info *fs_info; > > struct list_head new_bgs; > > >
Re: [PATCH] Btrfs: fix deadlock when enabling quotas due to concurrent snapshot creation
On 2018/11/19 下午5:48, fdman...@kernel.org wrote: > From: Filipe Manana > > If the quota enable and snapshot creation ioctls are called concurrently > we can get into a deadlock where the task enabling quotas will deadlock > on the fs_info->qgroup_ioctl_lock mutex because it attempts to lock it > twice. The following time diagram shows how this happens. > >CPU 0CPU 1 > > btrfs_ioctl() > btrfs_ioctl_quota_ctl() >btrfs_quota_enable() > mutex_lock(fs_info->qgroup_ioctl_lock) > btrfs_start_transaction() > > btrfs_ioctl() > btrfs_ioctl_snap_create_v2 >create_snapshot() > --> adds snapshot to the > list pending_snapshots > of the current > transaction > > btrfs_commit_transaction() > create_pending_snapshots() >create_pending_snapshot() > qgroup_account_snapshot() > btrfs_qgroup_inherit() > mutex_lock(fs_info->qgroup_ioctl_lock) > --> deadlock, mutex already locked > by this task at > btrfs_quota_enable() The backtrace looks valid. > > So fix this by adding a flag to the transaction handle that signals if the > transaction is being used for enabling quotas (only seen by the task doing > it) and do not lock the mutex qgroup_ioctl_lock at btrfs_qgroup_inherit() > if the transaction handle corresponds to the one being used to enable the > quotas. > > Fixes: 6426c7ad697d ("btrfs: qgroup: Fix qgroup accounting when creating > snapshot") > Signed-off-by: Filipe Manana > --- > fs/btrfs/qgroup.c | 10 -- > fs/btrfs/transaction.h | 1 + > 2 files changed, 9 insertions(+), 2 deletions(-) > > diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c > index d4917c0cddf5..3aec3bfa3d70 100644 > --- a/fs/btrfs/qgroup.c > +++ b/fs/btrfs/qgroup.c > @@ -908,6 +908,7 @@ int btrfs_quota_enable(struct btrfs_fs_info *fs_info) > trans = NULL; > goto out; > } > + trans->enabling_quotas = true; Should we put enabling_quotas bit into btrfs_transaction instead of btrfs_trans_handle? Isn't it possible to have different trans handle pointed to the same transaction? And I'm not really sure about the naming "enabling_quotas". What about "quota_ioctl_mutex_hold"? (Well, this also sounds awful) Thanks, Qu > > fs_info->qgroup_ulist = ulist_alloc(GFP_KERNEL); > if (!fs_info->qgroup_ulist) { > @@ -2250,7 +2251,11 @@ int btrfs_qgroup_inherit(struct btrfs_trans_handle > *trans, u64 srcid, > u32 level_size = 0; > u64 nums; > > - mutex_lock(_info->qgroup_ioctl_lock); > + if (trans->enabling_quotas) > + lockdep_assert_held(_info->qgroup_ioctl_lock); > + else > + mutex_lock(_info->qgroup_ioctl_lock); > + > if (!test_bit(BTRFS_FS_QUOTA_ENABLED, _info->flags)) > goto out; > > @@ -2413,7 +2418,8 @@ int btrfs_qgroup_inherit(struct btrfs_trans_handle > *trans, u64 srcid, > unlock: > spin_unlock(_info->qgroup_lock); > out: > - mutex_unlock(_info->qgroup_ioctl_lock); > + if (!trans->enabling_quotas) > + mutex_unlock(_info->qgroup_ioctl_lock); > return ret; > } > > diff --git a/fs/btrfs/transaction.h b/fs/btrfs/transaction.h > index 703d5116a2fc..a5553a1dee30 100644 > --- a/fs/btrfs/transaction.h > +++ b/fs/btrfs/transaction.h > @@ -122,6 +122,7 @@ struct btrfs_trans_handle { > bool reloc_reserved; > bool sync; > bool dirty; > + bool enabling_quotas; > struct btrfs_root *root; > struct btrfs_fs_info *fs_info; > struct list_head new_bgs; > signature.asc Description: OpenPGP digital signature
Re: [PATCH] Btrfs: fix deadlock when enabling quotas due to concurrent snapshot creation
On 19.11.18 г. 11:48 ч., fdman...@kernel.org wrote: > From: Filipe Manana > > If the quota enable and snapshot creation ioctls are called concurrently > we can get into a deadlock where the task enabling quotas will deadlock > on the fs_info->qgroup_ioctl_lock mutex because it attempts to lock it > twice. The following time diagram shows how this happens. > >CPU 0CPU 1 > > btrfs_ioctl() > btrfs_ioctl_quota_ctl() >btrfs_quota_enable() > mutex_lock(fs_info->qgroup_ioctl_lock) > btrfs_start_transaction() > > btrfs_ioctl() > btrfs_ioctl_snap_create_v2 >create_snapshot() > --> adds snapshot to the > list pending_snapshots > of the current > transaction > > btrfs_commit_transaction() > create_pending_snapshots() >create_pending_snapshot() > qgroup_account_snapshot() > btrfs_qgroup_inherit() > mutex_lock(fs_info->qgroup_ioctl_lock) > --> deadlock, mutex already locked > by this task at > btrfs_quota_enable() > > So fix this by adding a flag to the transaction handle that signals if the > transaction is being used for enabling quotas (only seen by the task doing > it) and do not lock the mutex qgroup_ioctl_lock at btrfs_qgroup_inherit() > if the transaction handle corresponds to the one being used to enable the > quotas. > > Fixes: 6426c7ad697d ("btrfs: qgroup: Fix qgroup accounting when creating > snapshot") > Signed-off-by: Filipe Manana > --- > fs/btrfs/qgroup.c | 10 -- > fs/btrfs/transaction.h | 1 + > 2 files changed, 9 insertions(+), 2 deletions(-) > > diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c > index d4917c0cddf5..3aec3bfa3d70 100644 > --- a/fs/btrfs/qgroup.c > +++ b/fs/btrfs/qgroup.c > @@ -908,6 +908,7 @@ int btrfs_quota_enable(struct btrfs_fs_info *fs_info) > trans = NULL; > goto out; > } > + trans->enabling_quotas = true; > > fs_info->qgroup_ulist = ulist_alloc(GFP_KERNEL); > if (!fs_info->qgroup_ulist) { > @@ -2250,7 +2251,11 @@ int btrfs_qgroup_inherit(struct btrfs_trans_handle > *trans, u64 srcid, > u32 level_size = 0; > u64 nums; > > - mutex_lock(_info->qgroup_ioctl_lock); > + if (trans->enabling_quotas) > + lockdep_assert_held(_info->qgroup_ioctl_lock); > + else > + mutex_lock(_info->qgroup_ioctl_lock); > + nit: That's a bit ugly for my taste, but I don't think the alternative is any better: ASSERT((trans->enabling_quotas && !lockdep_assert_held(_info->qgroup_ioctl_lock)) || !trans->enabling_quotas) if (!trans->enabling_quotas) mutex_lock(...) > if (!test_bit(BTRFS_FS_QUOTA_ENABLED, _info->flags)) > goto out; > > @@ -2413,7 +2418,8 @@ int btrfs_qgroup_inherit(struct btrfs_trans_handle > *trans, u64 srcid, > unlock: > spin_unlock(_info->qgroup_lock); > out: > - mutex_unlock(_info->qgroup_ioctl_lock); > + if (!trans->enabling_quotas) > + mutex_unlock(_info->qgroup_ioctl_lock); > return ret; > } > > diff --git a/fs/btrfs/transaction.h b/fs/btrfs/transaction.h > index 703d5116a2fc..a5553a1dee30 100644 > --- a/fs/btrfs/transaction.h > +++ b/fs/btrfs/transaction.h > @@ -122,6 +122,7 @@ struct btrfs_trans_handle { > bool reloc_reserved; > bool sync; > bool dirty; > + bool enabling_quotas; > struct btrfs_root *root; > struct btrfs_fs_info *fs_info; > struct list_head new_bgs; >