On 2017年07月25日 04:00, Josef Bacik wrote:
On Wed, Jul 12, 2017 at 04:49:48PM +0800, Lu Fengqi wrote:
From: Wang Xiaoguang <wangxg.f...@cn.fujitsu.com>
This issue was revealed by modifying BTRFS_MAX_EXTENT_SIZE(128MB) to 64KB,
When modifying BTRFS_MAX_EXTENT_SIZE(128MB) to 64KB, fsstress test often
gets these warnings from btrfs_destroy_inode():
WARN_ON(BTRFS_I(inode)->outstanding_extents);
WARN_ON(BTRFS_I(inode)->reserved_extents);
Simple test program below can reproduce this issue steadily.
Note: you need to modify BTRFS_MAX_EXTENT_SIZE to 64KB to have test,
otherwise there won't be such WARNING.
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
int main(void)
{
int fd;
char buf[68 *1024];
memset(buf, 0, 68 * 1024);
fd = open("testfile", O_CREAT | O_EXCL | O_RDWR);
pwrite(fd, buf, 68 * 1024, 64 * 1024);
return;
}
When BTRFS_MAX_EXTENT_SIZE is 64KB, and buffered data range is:
64KB 128K 132KB
|-----------------------------------------------|---------------|
64 + 4KB
1) for above data range, btrfs_delalloc_reserve_metadata() will reserve
metadata and set BTRFS_I(inode)->outstanding_extents to 2.
(68KB + 64KB - 1) / 64KB == 2
Outstanding_extents: 2
2) then btrfs_dirty_page() will be called to dirty pages and set
EXTENT_DELALLOC flag. In this case, btrfs_set_bit_hook() will be called
twice.
The 1st set_bit_hook() call will set DEALLOC flag for the first 64K.
64KB 128KB
|-----------------------------------------------|
64KB DELALLOC
Outstanding_extents: 2
Set_bit_hooks() uses FIRST_DELALLOC flag to avoid re-increase
outstanding_extents counter.
So for 1st set_bit_hooks() call, it won't modify outstanding_extents,
it's still 2.
Then FIRST_DELALLOC flag is *CLEARED*.
3) 2nd btrfs_set_bit_hook() call.
Because FIRST_DELALLOC have been cleared by previous set_bit_hook(),
btrfs_set_bit_hook() will increase BTRFS_I(inode)->outstanding_extents by
one, so now BTRFS_I(inode)->outstanding_extents is 3.
64KB 128KB 132KB
|-----------------------------------------------|----------------|
64K DELALLOC 4K DELALLOC
Outstanding_extents: 3
But the correct outstanding_extents number should be 2, not 3.
The 2nd btrfs_set_bit_hook() call just screwed up this, and leads to the
WARN_ON().
Normally, we can solve it by only increasing outstanding_extents in
set_bit_hook().
But the problem is for delalloc_reserve/release_metadata(), we only have
a 'length' parameter, and calculate in-accurate outstanding_extents.
If we only rely on set_bit_hook() release_metadata() will crew things up
as it will decrease inaccurate number.
So the fix we use is:
1) Increase *INACCURATE* outstanding_extents at delalloc_reserve_meta
Just as a place holder.
2) Increase *accurate* outstanding_extents at set_bit_hooks()
This is the real increaser.
3) Decrease *INACCURATE* outstanding_extents before returning
This makes outstanding_extents to correct value.
For 128M BTRFS_MAX_EXTENT_SIZE, due to limitation of
__btrfs_buffered_write(), each iteration will only handle about 2MB
data.
So btrfs_dirty_pages() won't need to handle cases cross 2 extents.
NACK on this, it just makes a crappy problem harder to understand. We need to
stop using outstanding_extents as both a reservation thing _and_ an accurate
count.
Indeed, outstanding_extent is hard to understand and causing tons of
problems.
I'll rework this and fix the other issues related to over-reservation
with compression. Thanks,
Looking forward to it.
Thanks,
Qu
Josef
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html