BTRFS kernel OOPS 4.8.11

2016-12-05 Thread Gerard Saraber
I have a NAS with a mix of 6, 4 and 3 TB drives:
shrapnel zm # btrfs filesystem df /home/exports
Data, RAID1: total=19.59TiB, used=19.51TiB
System, RAID1: total=32.00MiB, used=2.75MiB
Metadata, RAID1: total=76.00GiB, used=74.71GiB
GlobalReserve, single: total=512.00MiB, used=0.00B
shrapnel zm #

Re: bio linked list corruption.

2016-12-05 Thread Vegard Nossum
On 5 December 2016 at 19:11, Andy Lutomirski wrote: > On Sun, Dec 4, 2016 at 3:04 PM, Vegard Nossum wrote: >> On 23 November 2016 at 20:58, Dave Jones wrote: >>> On Wed, Nov 23, 2016 at 02:34:19PM -0500, Dave Jones wrote: >>>

Re: bio linked list corruption.

2016-12-05 Thread Linus Torvalds
On Mon, Dec 5, 2016 at 10:11 AM, Andy Lutomirski wrote:
> So your kernel has been smp-alternatived. That 3e comes from
> alternatives_smp_unlock. If you're running on SMP with UP
> alternatives, things will break.
I'm assuming he's just running in a VM with a single CPU.

Re: bio linked list corruption.

2016-12-05 Thread Linus Torvalds
On Mon, Dec 5, 2016 at 11:11 AM, Vegard Nossum wrote:
> [ cut here ]
> WARNING: CPU: 22 PID: 14012 at mm/shmem.c:2668 shmem_fallocate+0x9a7/0xac0
Ok, good. So that's confirmed as the cause of this problem. And the call chain that I wanted is

Re: bio linked list corruption.

2016-12-05 Thread Vegard Nossum
On 5 December 2016 at 20:11, Vegard Nossum wrote: > On 5 December 2016 at 18:55, Linus Torvalds > wrote: >> On Mon, Dec 5, 2016 at 9:09 AM, Vegard Nossum >> wrote: >> Since you apparently can recreate this fairly

Re: bio linked list corruption.

2016-12-05 Thread Vegard Nossum
On 5 December 2016 at 21:35, Linus Torvalds wrote: > Note for Ingo and Peter: this patch has not been tested at all. But > Vegard did test an earlier patch of mine that just verified that yes, > the issue really was that wait queue entries remained on the wait >

Re: bio linked list corruption.

2016-12-05 Thread Linus Torvalds
Adding the scheduler people to the participants list, and re-attaching the patch, because while this patch is internal to the VM code, the issue itself is not. There might well be other cases where somebody goes "wake_up_all()" will wake everybody up, so I can put the wait queue head on the

Re: bio linked list corruption.

2016-12-05 Thread Linus Torvalds
On Mon, Dec 5, 2016 at 9:09 AM, Vegard Nossum wrote:
> The warning shows that it made it past the list_empty_careful() check
> in finish_wait() but then bugs out on the >task_list
> dereference.
>
> Anything stick out?
I hate that shmem waitqueue garbage. It's really

[RESEND][PATCH v2] btrfs-progs: add dev stats returncode option

2016-12-05 Thread Austin S. Hemmelgarn
Currently, `btrfs device stats` returns non-zero only when there was an error getting the counter values. This is fine for when it gets run by a user directly, but is a serious pain when trying to use it in a script or for monitoring since you need to parse the (not at all machine friendly)
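The parsing workaround that the message above calls a pain can be sketched roughly as follows. This is only an illustration of what a monitoring script has to do without a return-code option. The line format in `SAMPLE` (`[device].counter value`) and the helper name `nonzero_counters` are assumptions made for this sketch, not something taken from the patch.

```python
import re

# Assumed line format of `btrfs device stats` output, e.g.
#   [/dev/sdb].write_io_errs   0
_STAT_LINE = re.compile(r"^\[(?P<dev>[^\]]+)\]\.(?P<counter>\w+)\s+(?P<value>\d+)$")


def nonzero_counters(stats_output):
    """Return {(device, counter): value} for every counter above zero.

    Hypothetical helper: lines that do not match the assumed format
    are silently skipped.
    """
    found = {}
    for line in stats_output.splitlines():
        match = _STAT_LINE.match(line.strip())
        if match is None:
            continue
        value = int(match.group("value"))
        if value > 0:
            found[(match.group("dev"), match.group("counter"))] = value
    return found


# Made-up sample text for illustration only; not captured from a real system.
SAMPLE = """\
[/dev/sdb].write_io_errs   0
[/dev/sdb].read_io_errs    3
[/dev/sdb].flush_io_errs   0
[/dev/sdb].corruption_errs 0
[/dev/sdb].generation_errs 0
"""
```

A monitoring script would feed the command's real output to `nonzero_counters` and exit non-zero when the returned dictionary is not empty, which is the behaviour the proposed option is meant to provide directly.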

Re: [PATCH v2] btrfs-progs: utils: negative numbers are more plausible than sizes over 8 EiB

2016-12-05 Thread Omar Sandoval
On Sat, Dec 03, 2016 at 03:39:54PM -0500, Zygo Blaxell wrote: > I got tired of seeing "16.00EiB" whenever btrfs-progs encounters a > negative size value, e.g. during resize: > > Unallocated: >/dev/mapper/datamd18 16.00EiB > > This version is much more useful: > > Unallocated: >
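A small sketch of why a negative size shows up as "16.00EiB": a small negative 64-bit value, read as unsigned, sits just below 2^64 bytes, which rounds to 16 EiB. The helper names below are hypothetical and this is not the btrfs-progs code; it only illustrates the arithmetic behind the subject line's "sizes over 8 EiB".

```python
U64_MASK = (1 << 64) - 1  # all ones in 64 bits
EIB = 1 << 60             # bytes per EiB


def as_u64(value):
    """Reinterpret an integer as an unsigned 64-bit quantity."""
    return value & U64_MASK


def as_s64(value):
    """Reinterpret a 64-bit quantity as signed (two's complement)."""
    value &= U64_MASK
    return value - (1 << 64) if value >= (1 << 63) else value


def pretty_eib(value):
    """Format a size in EiB the naive way, treating it as unsigned."""
    return f"{as_u64(value) / EIB:.2f}EiB"
```

For example, `pretty_eib(-4096)` gives `16.00EiB`, while `as_s64(as_u64(-4096))` recovers `-4096`. Any unsigned value at or above 2^63 bytes (8 EiB) maps to a negative number under `as_s64`, which is the reinterpretation the patch title argues is more plausible.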

Re: bio linked list corruption.

2016-12-05 Thread Dave Jones
On Mon, Dec 05, 2016 at 06:09:29PM +0100, Vegard Nossum wrote: > On 5 December 2016 at 12:10, Vegard Nossum wrote: > > On 5 December 2016 at 00:04, Vegard Nossum wrote: > >> FWIW I hit this as well: > >> > >> BUG: unable to handle kernel

Re: bio linked list corruption.

2016-12-05 Thread Vegard Nossum
On 5 December 2016 at 18:55, Linus Torvalds wrote: > On Mon, Dec 5, 2016 at 9:09 AM, Vegard Nossum wrote: >> >> The warning shows that it made it past the list_empty_careful() check >> in finish_wait() but then bugs out on the >task_list >>

Re: bio linked list corruption.

2016-12-05 Thread Andy Lutomirski
On Sun, Dec 4, 2016 at 3:04 PM, Vegard Nossum wrote: > On 23 November 2016 at 20:58, Dave Jones wrote: >> On Wed, Nov 23, 2016 at 02:34:19PM -0500, Dave Jones wrote: >> >> > [ 317.689216] BUG: Bad page state in process kworker/u8:8 pfn:4d8fd4

Re: system hangs due to qgroups

2016-12-05 Thread Marc Joliet
On Monday 05 December 2016 11:16:35 Marc Joliet wrote: [...] > https://dl.dropboxusercontent.com/u/5328255/arthur_root_4.7.3_sanitized.imag > e.xz > https://dl.dropboxusercontent.com/u/5328255/arthur_root_4.8.5_sanitized.ima > ge.xz BTW, since my problem appears to have been known, does anybody

Re: [PATCH 1/2] btrfs-progs: Correct value printed by assertions/BUG_ON/WARN_ON

2016-12-05 Thread Qu Wenruo
BTW, the DISABLE_BACKTRACE branch seems quite different from backtrace one.
#define BUG_ON(c) assert_trace(#c, __FILE__, __func__, __LINE__, (long)(c))
#define WARN_ON(c) warning_trace(#c, __FILE__, __func__, __LINE__, (long)(c))
#define ASSERT(c) assert_trace(#c, __FILE__, __func__, __LINE__,

Re: [PATCH 1/2] btrfs-progs: Correct value printed by assertions/BUG_ON/WARN_ON

2016-12-05 Thread Qu Wenruo
At 12/06/2016 10:51 AM, Goldwyn Rodrigues wrote: On 12/05/2016 08:03 PM, Qu Wenruo wrote: BTW, the DISABLE_BACKTRACE branch seems quite different from backtrace one. #define BUG_ON(c) assert_trace(#c, __FILE__, __func__, __LINE__, (long)(c)) #define WARN_ON(c) warning_trace(#c, __FILE__,

[PATCH] generic/35[67]: disable swapfile tests on Btrfs

2016-12-05 Thread Omar Sandoval
From: Omar Sandoval Btrfs doesn't support swapfiles (yet?), so generic/356 fails erroneously, and generic/357 only passes by accident. Let's add a _require_scratch_swapfile helper and add it to these tests. Signed-off-by: Omar Sandoval --- I have some code

Re: [SOLVED] Re: system hangs due to qgroups

2016-12-05 Thread Qu Wenruo
At 12/05/2016 10:43 PM, Marc Joliet wrote: On Monday 05 December 2016 12:01:28 Marc Joliet wrote: This seems to be a NULL pointer bug in qgroup relocation fix. The latest fix (not merged yet) should address it. You could try the for-next-20161125 branch from David to fix it:

Re: [PATCH v2 02/14] btrfs-progs: check: introduce function to find dir_item

2016-12-05 Thread Qu Wenruo
At 11/02/2016 11:21 PM, David Sterba wrote: On Wed, Sep 21, 2016 at 11:15:52AM +0800, Qu Wenruo wrote: From: Lu Fengqi Introduce a new function find_dir_item() to find DIR_ITEM for the given key, and check it with the specified INODE_REF/INODE_EXTREF match.

Re: crc32c_le performance hit

2016-12-05 Thread Chris Murphy
On Mon, Dec 5, 2016 at 8:46 AM, Chris Mason wrote: > On 12/04/2016 04:28 PM, Chris Murphy wrote: >> >> 4.8.11-300.fc25.x86_64 >> >> I'm currently doing a btrfs send/receive and I'm seeing a rather large >> hit for crc32c, bigger than aes-ni (the volume is on dm crypt), using >> perf

Re: [PATCH 1/2] btrfs-progs: Correct value printed by assertions/BUG_ON/WARN_ON

2016-12-05 Thread Goldwyn Rodrigues
On 12/05/2016 08:03 PM, Qu Wenruo wrote: > BTW, the DISABLE_BACKTRACE branch seems quite different from backtrace one. > > #define BUG_ON(c) assert_trace(#c, __FILE__, __func__, __LINE__, (long)(c)) > #define WARN_ON(c) warning_trace(#c, __FILE__, __func__, __LINE__, > (long)(c)) > #define

[PATCH 2/3] btrfs: cow_file_range() num_bytes and disk_num_bytes are same

2016-12-05 Thread Anand Jain
This patch deletes local variable disk_num_bytes as its value is same as num_bytes in the function cow_file_range(). Signed-off-by: Anand Jain --- fs/btrfs/inode.c | 11 --- 1 file changed, 4 insertions(+), 7 deletions(-) diff --git a/fs/btrfs/inode.c

[PATCH 1/3] btrfs: use BTRFS_COMPRESS_NONE to specify no compression

2016-12-05 Thread Anand Jain
Signed-off-by: Anand Jain --- fs/btrfs/inode.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 8e3a5a266917..96e5f8a49d4c 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -540,7 +540,7 @@ static

[PATCH 0/3] Misc fixes mostly cleanup

2016-12-05 Thread Anand Jain
A set of unrelated miscellaneous cleanup patches. Anand Jain (3): btrfs: use BTRFS_COMPRESS_NONE to specify no compression btrfs: cow_file_range() num_bytes and disk_num_bytes are same btrfs: consolidate auto defrag kick off policies fs/btrfs/inode.c | 44

[PATCH 3/3] btrfs: consolidate auto defrag kick off policies

2016-12-05 Thread Anand Jain
As of now writes smaller than 64k for non compressed extents and 16k for compressed extents inside eof are considered as candidate for auto defrag, put them together at a place. --- fs/btrfs/inode.c | 27 +++ 1 file changed, 19 insertions(+), 8 deletions(-) diff --git
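The policy described above can be sketched as a single predicate. This is a rough illustration under stated assumptions, not the kernel code: the function name and parameters are hypothetical, and "inside eof" is read here as "the write ends at or before the current file size", which may differ from the actual condition in fs/btrfs/inode.c.

```python
SZ_16K = 16 * 1024
SZ_64K = 64 * 1024


def is_autodefrag_candidate(start, length, file_size, compressed):
    """Return True if a write looks like an auto-defrag candidate.

    Hypothetical sketch: the size threshold is 16k for compressed
    extents and 64k otherwise, and the write must lie inside EOF
    (assumed here to mean start + length <= file_size).
    """
    threshold = SZ_16K if compressed else SZ_64K
    inside_eof = start + length <= file_size
    return length < threshold and inside_eof
```

Keeping both thresholds in one helper mirrors the stated goal of the patch, which is to put the two policies together in one place.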

[PATCH] btrfs-progs: recursive defrag cleanup duplicate code

2016-12-05 Thread Anand Jain
Signed-off-by: Anand Jain --- cmds-filesystem.c | 20 ++-- 1 file changed, 6 insertions(+), 14 deletions(-) diff --git a/cmds-filesystem.c b/cmds-filesystem.c index 41623f3183a8..ecac37edf936 100644 --- a/cmds-filesystem.c +++ b/cmds-filesystem.c @@

[PATCH] recursive defrag cleanup

2016-12-05 Thread Anand Jain
The command,
btrfs fi defrag -v /btrfs
does nothing, it won't defrag the files under /btrfs as user may expect. The command with recursive option
btrfs fi defrag -vr /btrfs
would defrag all the files under /btrfs including files in its sub directories. While attempting to fix this. The

[PATCH 1/4] btrfs-progs: check: Fix assert when using lowmem on fs with tree reloc tree

2016-12-05 Thread Qu Wenruo
When using lowmem mode, btrfs check will report ASSERT for calling btrfs_read_fs_root() on tree reloc tree. Fix it by checking objectid and call btrfs_read_fs_root_no_cache() for tree reloc tree. Signed-off-by: Qu Wenruo --- cmds-check.c | 8 +++- 1 file changed, 7

[PATCH 2/4] btrfs-progs: check: Fix lowmem mode stack overflow caused by fsck/023

2016-12-05 Thread Qu Wenruo
Lowmem mode fsck will overflow its stack since it will do infinite backref check for tree reloc root. We should not check backref if it's pointing to itself for tree reloc root. Signed-off-by: Qu Wenruo --- cmds-check.c | 16 ++-- 1 file changed, 14

[PATCH 3/4] btrfs-progs: check: Fix lowmem false alert on tree reloc tree

2016-12-05 Thread Qu Wenruo
Lowmem mode will report false alert if the fs has tree reloc tree like: ERROR: shared extent[30011392 4096] lost its parent (parent: 30011392, level: 1) The problem is check_shared_block_backref() can't handle tree reloc tree's self-pointing backref. And still try to read out the tree block then

[PATCH 4/4] btrfs-progs: check: Fix false alert on generation mismatch for tree reloc tree

2016-12-05 Thread Qu Wenruo
For fs with tree reloc tree(under balancing), lowmem mode will report false alert like: ERROR: extent[62914560 4096] backref generation mismatch, wanted: <=9, have: 13 This is because lowmem mode adds a more restrict check, to ensure generation in fs tree won't be smaller than extent tree. In

Re: [PATCH 1/2] btrfs-progs: Correct value printed by assertions/BUG_ON/WARN_ON

2016-12-05 Thread Goldwyn Rodrigues
Hi Qu, Yes, the assert for ifdef BTRFS_DISABLE_BACKTRACE is not correct. The condition should not have a not(!). Thanks for reporting.
On 12/05/2016 01:10 AM, Qu Wenruo wrote:
> Hi, Goldwyn and David,
>
> This patch seems to cause btrfs test case 023 to fail.
>
> Bisect leads me to this patch.

Re: system hangs due to qgroups

2016-12-05 Thread Marc Joliet
On Monday 05 December 2016 10:00:13 Marc Joliet wrote: > OK, I'll post the URLs once the images are uploaded. (I had Dropbox public > URLs right before my desktop crashed -- see below -- but now dropbox-cli > doesn't want to create them.) Alright, here you go:

Re: bio linked list corruption.

2016-12-05 Thread Vegard Nossum
On 5 December 2016 at 00:04, Vegard Nossum wrote: > FWIW I hit this as well: > > BUG: unable to handle kernel paging request at 81ff08b7 > IP: [] __lock_acquire.isra.32+0xda/0x1a30 > CPU: 0 PID: 21744 Comm: trinity-c56 Tainted: GB 4.9.0-rc7+ #217

Re: [PATCH] Btrfs: fix lockdep warning about log_mutex

2016-12-05 Thread Filipe Manana
On Thu, Dec 1, 2016 at 9:45 PM, Liu Bo wrote:
> While checking INODE_REF/INODE_EXTREF for a corner case, we may acquire a
> different inode's log_mutex with holding the current inode's log_mutex, and
> lockdep has complained this with a possible deadlock warning.
>
> Fix

Re: Metadata balance fails ENOSPC

2016-12-05 Thread Duncan
Stefan Priebe - Profihost AG posted on Mon, 05 Dec 2016 12:12:12 +0100 as excerpted:
> isn't there a way to move free space to unallocated space again?
Yes, btrfs balance, but...
--
Duncan - List replies preferred. No HTML msgs. "Every nonfree program has a lord, a master -- and if you use

Re: Metadata balance fails ENOSPC

2016-12-05 Thread Stefan Priebe - Profihost AG
isn't there a way to move free space to unallocated space again? Am 03.12.2016 um 05:43 schrieb Andrei Borzenkov: > 01.12.2016 18:48, Chris Murphy пишет: >> On Thu, Dec 1, 2016 at 7:10 AM, Stefan Priebe - Profihost AG >> wrote: >>> >>> Am 01.12.2016 um 14:51 schrieb Hans

Re: system hangs due to qgroups

2016-12-05 Thread Marc Joliet
On Monday 05 December 2016 12:01:28 Marc Joliet wrote: > > You could try the for-next-20161125 branch from David to fix it: > > https://github.com/kdave/btrfs-devel/tree/for-next-20161125 > > OK, I'll try that, thanks! I just have to wait for it to finish cloning... FWIW, I get this warning:

Re: system hangs due to qgroups

2016-12-05 Thread Marc Joliet
On Sunday 04 December 2016 11:52:40 Chris Murphy wrote: > On Sun, Dec 4, 2016 at 9:02 AM, Marc Joliet wrote: > > Also, now the file system fails with the BUG I mentioned, see here: > > > > [Sun Dec 4 12:27:07 2016] BUG: unable to handle kernel paging request at > >

[PATCH 0/4] Lowmem fsck false alert fixes

2016-12-05 Thread Qu Wenruo
Btrfs-progs test case 023 will cause assert and a lot of false alerts for lowmem mode. The problems are caused by several reasons, from bad handler for tree reloc root(calling btrfs_read_fs_root on tree reloc tree) to too restrict check. Fix the lowmem mode bugs. There is another bug which

[SOLVED] Re: system hangs due to qgroups

2016-12-05 Thread Marc Joliet
On Monday 05 December 2016 12:01:28 Marc Joliet wrote: > > This seems to be a NULL pointer bug in qgroup relocation fix. > > > > > > > > The latest fix (not merged yet) should address it. > > > > > > > > You could try the for-next-20161125 branch from David to fix it: > >

Re: BTRFS kernel OOPS 4.8.11

2016-12-05 Thread Borislav Petkov
+ linux-btrfs On Mon, Dec 05, 2016 at 09:30:52AM -0600, Gerard Saraber wrote: > I have a NAS with a mix of 6, 4 and 3 TB drives: > > shrapnel zm # btrfs filesystem df /home/exports > Data, RAID1: total=19.59TiB, used=19.51TiB > System, RAID1: total=32.00MiB, used=2.75MiB > Metadata, RAID1:

Re: crc32c_le performance hit

2016-12-05 Thread Chris Mason
On 12/04/2016 04:28 PM, Chris Murphy wrote: 4.8.11-300.fc25.x86_64 I'm currently doing a btrfs send/receive and I'm seeing a rather large hit for crc32c, bigger than aes-ni (the volume is on dm crypt), using perf top. 14.03% btrfs[.] __crc32c_le 10.50%

Re: [PATCH 10/18] btrfs: root->fs_info cleanup, btrfs_calc_{trans,trunc}_metadata_size

2016-12-05 Thread David Sterba
On Fri, Dec 02, 2016 at 12:07:30AM -0500, je...@suse.com wrote: > -static inline u64 btrfs_calc_trans_metadata_size(struct btrfs_root *root, > +static inline u64 btrfs_calc_trans_metadata_size(struct btrfs_fs_info > *fs_info, >unsigned num_items) >

Re: [PATCH 10/18] btrfs: root->fs_info cleanup, btrfs_calc_{trans,trunc}_metadata_size

2016-12-05 Thread Jeff Mahoney
On 12/5/16 10:29 AM, David Sterba wrote: > On Fri, Dec 02, 2016 at 12:07:30AM -0500, je...@suse.com wrote: >> -static inline u64 btrfs_calc_trans_metadata_size(struct btrfs_root *root, >> +static inline u64 btrfs_calc_trans_metadata_size(struct btrfs_fs_info >> *fs_info, >>

Re: bio linked list corruption.

2016-12-05 Thread Vegard Nossum
On 5 December 2016 at 12:10, Vegard Nossum wrote: > On 5 December 2016 at 00:04, Vegard Nossum wrote: >> FWIW I hit this as well: >> >> BUG: unable to handle kernel paging request at 81ff08b7 >> IP: [] __lock_acquire.isra.32+0xda/0x1a30