Re: [RFC PATCH] btrfs: qgroup: Fix hang when using inode_cache and qgroup

2017-07-11 Thread Qu Wenruo



在 2017年06月06日 01:38, Goldwyn Rodrigues 写道:



On 05/31/2017 03:08 AM, Qu Wenruo wrote:

Commit 48a89bc4f2ce ("btrfs: qgroups: Retry after commit on getting EDQUOT")
is causing hang, with the following backtrace:

Call Trace:
  __schedule+0x374/0xaf0
  schedule+0x3d/0x90
  wait_for_commit+0x4a/0x80 [btrfs]
  ? wake_atomic_t_function+0x60/0x60
  btrfs_commit_transaction+0xe0/0xa10 [btrfs]  <<< Here
  ? start_transaction+0xad/0x510 [btrfs]
  qgroup_reserve+0x1f0/0x350 [btrfs]
  btrfs_qgroup_reserve_data+0xf8/0x2f0 [btrfs]
  ? _raw_spin_unlock+0x27/0x40
  btrfs_check_data_free_space+0x6d/0xb0 [btrfs]
  btrfs_delalloc_reserve_space+0x25/0x70 [btrfs]
  btrfs_save_ino_cache+0x402/0x650 [btrfs]
  commit_fs_roots+0xb7/0x170 [btrfs]
  btrfs_commit_transaction+0x425/0xa10 [btrfs] <<< And here
  qgroup_reserve+0x1f0/0x350 [btrfs]
  btrfs_qgroup_reserve_data+0xf8/0x2f0 [btrfs]
  ? _raw_spin_unlock+0x27/0x40
  btrfs_check_data_free_space+0x6d/0xb0 [btrfs]
  btrfs_delalloc_reserve_space+0x25/0x70 [btrfs]
  btrfs_direct_IO+0x1c5/0x3b0 [btrfs]
  generic_file_direct_write+0xab/0x150
  btrfs_file_write_iter+0x243/0x530 [btrfs]
  __vfs_write+0xc9/0x120
  vfs_write+0xcb/0x1f0
  SyS_pwrite64+0x79/0x90
  entry_SYSCALL_64_fastpath+0x18/0xad

The problem is that, inode_cache will be written in commit_fs_roots(),
which is called in btrfs_commit_transaction().

And when it fails to reserve enough data space, qgroup_reserve() will
try to call btrfs_commit_transaction() again, then we are waiting for
ourselves.

The patch will introduce can_retry parameter for qgroup_reserve(),
allowing related callers to avoid deadly commit transaction deadlock.

Now for space cache inode, we will not allow qgroup retry, so it will
not cause deadlock.

Fixes: 48a89bc4f2ce ("btrfs: qgroups: Retry after commit on getting EDQUOT")
Cc: Goldwyn Rodrigues 
Signed-off-by: Qu Wenruo 
---
Commit  48a89bc4f2ce ("btrfs: qgroups: Retry after commit on getting EDQUOT")
is not only causing such deadlock, but also screwing up qgroup reserved
space for even generic test cases.

I'm afraid we may need to revert that commit if we can't find a good way
to fix the newly caused qgroup meta reserved space underflow.
(Unlike old bug which is qgroup data reserved space underflow, this time
the commit is causing new metadata space underflow).


I tried the same with direct I/O and have the same results. I run into
underflows often. By reverting the patch, we are avoiding the problem
not resolving it. The numbers don't add up and the point is to find out
where the numbers are getting lost (or counted in excess). I will
continue investigating on this front.


What about moving the transaction commit to other thread?
E.g btrfs-transaction kernel thread.

I know this won't solve the problem as pinpoin as your solution.
We need to info btrfs-transaction thread to commit transcation before 
it's too late.

So we need a threshold mechanism for it to commit transaction in advance.

I think the possibility to encounter such problem can be hugely reduced.
And we can avoid any possible dead lock.

How do you think about this idea?

Thanks,
Qu



By ignoring the warning (unset BTRFS_DEBUG) and continuing during
overflow, we are just avoiding the problem. It does not show up in dmesg
any longer.



--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC PATCH] btrfs: qgroup: Fix hang when using inode_cache and qgroup

2017-06-05 Thread Goldwyn Rodrigues


On 05/31/2017 03:08 AM, Qu Wenruo wrote:
> Commit 48a89bc4f2ce ("btrfs: qgroups: Retry after commit on getting EDQUOT")
> is causing hang, with the following backtrace:
> 
> Call Trace:
>  __schedule+0x374/0xaf0
>  schedule+0x3d/0x90
>  wait_for_commit+0x4a/0x80 [btrfs]
>  ? wake_atomic_t_function+0x60/0x60
>  btrfs_commit_transaction+0xe0/0xa10 [btrfs]  <<< Here
>  ? start_transaction+0xad/0x510 [btrfs]
>  qgroup_reserve+0x1f0/0x350 [btrfs]
>  btrfs_qgroup_reserve_data+0xf8/0x2f0 [btrfs]
>  ? _raw_spin_unlock+0x27/0x40
>  btrfs_check_data_free_space+0x6d/0xb0 [btrfs]
>  btrfs_delalloc_reserve_space+0x25/0x70 [btrfs]
>  btrfs_save_ino_cache+0x402/0x650 [btrfs]
>  commit_fs_roots+0xb7/0x170 [btrfs]
>  btrfs_commit_transaction+0x425/0xa10 [btrfs] <<< And here
>  qgroup_reserve+0x1f0/0x350 [btrfs]
>  btrfs_qgroup_reserve_data+0xf8/0x2f0 [btrfs]
>  ? _raw_spin_unlock+0x27/0x40
>  btrfs_check_data_free_space+0x6d/0xb0 [btrfs]
>  btrfs_delalloc_reserve_space+0x25/0x70 [btrfs]
>  btrfs_direct_IO+0x1c5/0x3b0 [btrfs]
>  generic_file_direct_write+0xab/0x150
>  btrfs_file_write_iter+0x243/0x530 [btrfs]
>  __vfs_write+0xc9/0x120
>  vfs_write+0xcb/0x1f0
>  SyS_pwrite64+0x79/0x90
>  entry_SYSCALL_64_fastpath+0x18/0xad
> 
> The problem is that, inode_cache will be written in commit_fs_roots(),
> which is called in btrfs_commit_transaction().
> 
> And when it fails to reserve enough data space, qgroup_reserve() will
> try to call btrfs_commit_transaction() again, then we are waiting for
> ourselves.
> 
> The patch will introduce can_retry parameter for qgroup_reserve(),
> allowing related callers to avoid deadly commit transaction deadlock.
> 
> Now for space cache inode, we will not allow qgroup retry, so it will
> not cause deadlock.
> 
> Fixes: 48a89bc4f2ce ("btrfs: qgroups: Retry after commit on getting EDQUOT")
> Cc: Goldwyn Rodrigues 
> Signed-off-by: Qu Wenruo 
> ---
> Commit  48a89bc4f2ce ("btrfs: qgroups: Retry after commit on getting EDQUOT")
> is not only causing such deadlock, but also screwing up qgroup reserved
> space for even generic test cases.
> 
> I'm afraid we may need to revert that commit if we can't find a good way
> to fix the newly caused qgroup meta reserved space underflow.
> (Unlike old bug which is qgroup data reserved space underflow, this time
> the commit is causing new metadata space underflow).

I tried the same with direct I/O and have the same results. I run into
underflows often. By reverting the patch, we are avoiding the problem
not resolving it. The numbers don't add up and the point is to find out
where the numbers are getting lost (or counted in excess). I will
continue investigating on this front.

By ignoring the warning (unset BTRFS_DEBUG) and continuing during
overflow, we are just avoiding the problem. It does not show up in dmesg
any longer.


-- 
Goldwyn
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[RFC PATCH] btrfs: qgroup: Fix hang when using inode_cache and qgroup

2017-05-31 Thread Qu Wenruo
Commit 48a89bc4f2ce ("btrfs: qgroups: Retry after commit on getting EDQUOT")
is causing hang, with the following backtrace:

Call Trace:
 __schedule+0x374/0xaf0
 schedule+0x3d/0x90
 wait_for_commit+0x4a/0x80 [btrfs]
 ? wake_atomic_t_function+0x60/0x60
 btrfs_commit_transaction+0xe0/0xa10 [btrfs]  <<< Here
 ? start_transaction+0xad/0x510 [btrfs]
 qgroup_reserve+0x1f0/0x350 [btrfs]
 btrfs_qgroup_reserve_data+0xf8/0x2f0 [btrfs]
 ? _raw_spin_unlock+0x27/0x40
 btrfs_check_data_free_space+0x6d/0xb0 [btrfs]
 btrfs_delalloc_reserve_space+0x25/0x70 [btrfs]
 btrfs_save_ino_cache+0x402/0x650 [btrfs]
 commit_fs_roots+0xb7/0x170 [btrfs]
 btrfs_commit_transaction+0x425/0xa10 [btrfs] <<< And here
 qgroup_reserve+0x1f0/0x350 [btrfs]
 btrfs_qgroup_reserve_data+0xf8/0x2f0 [btrfs]
 ? _raw_spin_unlock+0x27/0x40
 btrfs_check_data_free_space+0x6d/0xb0 [btrfs]
 btrfs_delalloc_reserve_space+0x25/0x70 [btrfs]
 btrfs_direct_IO+0x1c5/0x3b0 [btrfs]
 generic_file_direct_write+0xab/0x150
 btrfs_file_write_iter+0x243/0x530 [btrfs]
 __vfs_write+0xc9/0x120
 vfs_write+0xcb/0x1f0
 SyS_pwrite64+0x79/0x90
 entry_SYSCALL_64_fastpath+0x18/0xad

The problem is that, inode_cache will be written in commit_fs_roots(),
which is called in btrfs_commit_transaction().

And when it fails to reserve enough data space, qgroup_reserve() will
try to call btrfs_commit_transaction() again, then we are waiting for
ourselves.

The patch will introduce can_retry parameter for qgroup_reserve(),
allowing related callers to avoid deadly commit transaction deadlock.

Now for space cache inode, we will not allow qgroup retry, so it will
not cause deadlock.

Fixes: 48a89bc4f2ce ("btrfs: qgroups: Retry after commit on getting EDQUOT")
Cc: Goldwyn Rodrigues 
Signed-off-by: Qu Wenruo 
---
Commit  48a89bc4f2ce ("btrfs: qgroups: Retry after commit on getting EDQUOT")
is not only causing such deadlock, but also screwing up qgroup reserved
space for even generic test cases.

I'm afraid we may need to revert that commit if we can't find a good way
to fix the newly caused qgroup meta reserved space underflow.
(Unlike old bug which is qgroup data reserved space underflow, this time
the commit is causing new metadata space underflow).
---
 fs/btrfs/extent-tree.c |  7 +--
 fs/btrfs/qgroup.c  | 15 ++-
 fs/btrfs/qgroup.h  |  2 +-
 fs/btrfs/transaction.c |  2 +-
 4 files changed, 17 insertions(+), 9 deletions(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index e390451c72e6..15e7fcf3a7ab 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -5817,7 +5817,7 @@ int btrfs_subvolume_reserve_metadata(struct btrfs_root 
*root,
if (test_bit(BTRFS_FS_QUOTA_ENABLED, _info->flags)) {
/* One for parent inode, two for dir entries */
num_bytes = 3 * fs_info->nodesize;
-   ret = btrfs_qgroup_reserve_meta(root, num_bytes, true);
+   ret = btrfs_qgroup_reserve_meta(root, num_bytes, true, true);
if (ret)
return ret;
} else {
@@ -5945,6 +5945,7 @@ int btrfs_delalloc_reserve_metadata(struct btrfs_inode 
*inode, u64 num_bytes)
u64 to_free = 0;
unsigned dropped;
bool release_extra = false;
+   bool can_qgroup_retry = true;
 
/* If we are a free space inode we need to not flush since we will be in
 * the middle of a transaction commit.  We also don't need the delalloc
@@ -5957,6 +5958,7 @@ int btrfs_delalloc_reserve_metadata(struct btrfs_inode 
*inode, u64 num_bytes)
if (btrfs_is_free_space_inode(inode)) {
flush = BTRFS_RESERVE_NO_FLUSH;
delalloc_lock = false;
+   can_qgroup_retry = false;
} else if (current->journal_info) {
flush = BTRFS_RESERVE_FLUSH_LIMIT;
}
@@ -5987,7 +5989,8 @@ int btrfs_delalloc_reserve_metadata(struct btrfs_inode 
*inode, u64 num_bytes)
 
if (test_bit(BTRFS_FS_QUOTA_ENABLED, _info->flags)) {
ret = btrfs_qgroup_reserve_meta(root,
-   nr_extents * fs_info->nodesize, true);
+   nr_extents * fs_info->nodesize, true,
+   can_qgroup_retry);
if (ret)
goto out_fail;
}
diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
index deffbeb74a0b..3859acb03ae9 100644
--- a/fs/btrfs/qgroup.c
+++ b/fs/btrfs/qgroup.c
@@ -2322,7 +2322,8 @@ static bool qgroup_check_limits(const struct btrfs_qgroup 
*qg, u64 num_bytes)
return true;
 }
 
-static int qgroup_reserve(struct btrfs_root *root, u64 num_bytes, bool enforce)
+static int qgroup_reserve(struct btrfs_root *root, u64 num_bytes, bool enforce,
+ bool can_retry)
 {
struct btrfs_root *quota_root;
struct btrfs_qgroup *qgroup;
@@ -2364,7 +2365,8 @@ static int qgroup_reserve(struct btrfs_root *root, u64 
num_bytes, bool