[BUG]
With current subpage RW support, the following script can hang the fs
with a 64K page size:

 # mkfs.btrfs -f -s 4k $dev
 # mount $dev -o nospace_cache $mnt
 # fsstress -w -n 50 -p 1 -s 1607749395 -d $mnt

The kernel will get stuck in an infinite loop in
btrfs_punch_hole_lock_range().

[CAUSE]
In btrfs_punch_hole_lock_range() we:
- Truncate the page cache range
- Lock the extent io tree
- Wait for any ordered extents in the range

We only exit the loop when all of the following conditions are met:
- No ordered extent in the locked range
- No page in the locked range

The latter condition has a pitfall: it only works for the sector size ==
PAGE_SIZE case.
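
For reference, a simplified sketch of that loop (reference counting on
the ordered extent and error handling omitted; the real code is in the
diff below):

  while (1) {
          truncate_pagecache_range(inode, lockstart, lockend);
          lock_extent_bits(&BTRFS_I(inode)->io_tree, lockstart, lockend,
                           cached_state);
          ordered = btrfs_lookup_first_ordered_extent(BTRFS_I(inode),
                                                      lockend);

          /* Exit only if no ordered extent overlaps and no page is cached */
          if ((!ordered ||
               ordered->file_offset + ordered->num_bytes <= lockstart ||
               ordered->file_offset > lockend) &&
              !filemap_range_has_page(inode->i_mapping, lockstart, lockend))
                  break;

          /* Otherwise unlock, wait for the ordered extents and retry */
          unlock_extent_cached(&BTRFS_I(inode)->io_tree, lockstart, lockend,
                               cached_state);
          btrfs_wait_ordered_range(inode, lockstart,
                                   lockend - lockstart + 1);
  }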

It cannot handle the following subpage case:

  0       32K     64K     96K     128K
  |       |///////||//////|       ||

lockstart=32K
lockend=96K - 1

In this case, although the range crosses two pages,
truncate_pagecache_range() will not invalidate any page at all, but only
zero the [32K, 96K) range of the two pages.

Thus filemap_range_has_page(32K, 96K-1) will always return true, and we
will never meet the loop exit condition.
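
To make this concrete with the numbers above, assuming a 64K page size
(illustrative snippet only, not kernel code; the index variables are
just for demonstration):

  u64 lockstart = 32 * 1024;       /* 32K, inside page 0, i.e. [0, 64K)       */
  u64 lockend   = 96 * 1024 - 1;   /* 96K - 1, inside page 1, i.e. [64K, 128K) */

  pgoff_t first_index = lockstart >> 16;   /* page 0 (PAGE_SHIFT == 16) */
  pgoff_t last_index  = lockend >> 16;     /* page 1 */

  /*
   * Neither page 0 nor page 1 is fully covered by [32K, 96K), so
   * truncate_pagecache_range() keeps both pages and only zeroes the
   * covered parts.  filemap_range_has_page(mapping, lockstart, lockend)
   * keeps finding those two pages and returns true on every iteration.
   */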

[FIX]
Fix the problem by doing extra page alignment for the
filemap_range_has_page() check.

Function filemap_range_has_page() already handles the lend < lstart
case, so we only need to round up @lockstart and round down @lockend
when computing the range to check.

This modification should not change anything for the sector size ==
PAGE_SIZE case, as the range is already page aligned in that case.
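
With the same example (64K page size, lockstart = 32K, lockend = 96K - 1),
the extra alignment gives (illustrative arithmetic only):

  u64 page_lockstart = round_up(lockstart, PAGE_SIZE);          /* 64K     */
  u64 page_lockend   = round_down(lockend + 1, PAGE_SIZE) - 1;  /* 64K - 1 */

  /*
   * Now page_lockend < page_lockstart, so filemap_range_has_page() treats
   * the page range as empty and returns false, and the loop can exit as
   * soon as the ordered extents in the range are finished.
   */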

Signed-off-by: Qu Wenruo <w...@suse.com>
---
 fs/btrfs/file.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index 8f71699fdd18..45ec3f5ef839 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -2471,6 +2471,16 @@ static int btrfs_punch_hole_lock_range(struct inode *inode,
                                       const u64 lockend,
                                       struct extent_state **cached_state)
 {
+       /*
+        * For subpage case, if the range is not at page boundary, we could
+        * have pages at the leading/trailing part of the range.
+        * This could lead to a dead loop since filemap_range_has_page()
+        * will always return true.
+        * So here we need to do extra page alignment for
+        * filemap_range_has_page().
+        */
+       u64 page_lockstart = round_up(lockstart, PAGE_SIZE);
+       u64 page_lockend = round_down(lockend + 1, PAGE_SIZE) - 1;
        while (1) {
                struct btrfs_ordered_extent *ordered;
                int ret;
@@ -2491,7 +2501,7 @@ static int btrfs_punch_hole_lock_range(struct inode *inode,
                    (ordered->file_offset + ordered->num_bytes <= lockstart ||
                     ordered->file_offset > lockend)) &&
                     !filemap_range_has_page(inode->i_mapping,
-                                            lockstart, lockend)) {
+                                            page_lockstart, page_lockend)) {
                        if (ordered)
                                btrfs_put_ordered_extent(ordered);
                        break;
-- 
2.31.1
