Re: [PATCH 1/9] btrfs: qgroup: Add trace point for qgroup reserved space

2017-03-08 Thread Qu Wenruo
At 03/08/2017 12:30 AM, Jeff Mahoney wrote: On 2/27/17 2:10 AM, Qu Wenruo wrote: Introduce the following trace points: qgroup_update_reserve qgroup_meta_reserve These trace points are handy to trace qgroup reserve space related problems. Signed-off-by: Qu Wenruo

[PATCH v3.1 7/7] btrfs: Enhance missing device kernel message

2017-03-08 Thread Qu Wenruo
For a missing device, btrfs will just refuse to mount, with an almost meaningless kernel message like:
 BTRFS info (device vdb6): disk space caching is enabled
 BTRFS info (device vdb6): has skinny extents
 BTRFS error (device vdb6): failed to read the system array: -5
 BTRFS error (device vdb6):

Re: [PATCH] btrfs: add missing memset while reading compressed inline extents

2017-03-08 Thread Zygo Blaxell
On Wed, Mar 08, 2017 at 10:27:33AM +, Filipe Manana wrote: > On Wed, Mar 8, 2017 at 3:18 AM, Zygo Blaxell > wrote: > > From: Zygo Blaxell > > > > This is a story about 4 distinct (and very old) btrfs bugs. > > > > Commit

[PATCH v3] btrfs: add missing memset while reading compressed inline extents

2017-03-08 Thread Zygo Blaxell
This is a story about 4 distinct (and very old) btrfs bugs. Commit c8b978188c ("Btrfs: Add zlib compression support") added three data corruption bugs for inline extents (bugs #1-3). Commit 93c82d5750 ("Btrfs: zero page past end of inline file items") fixed bug #1: uncompressed inline extents

Re: [PATCH 5/8] nowait aio: return on congested block device

2017-03-08 Thread Goldwyn Rodrigues
On 03/08/2017 10:17 AM, Jens Axboe wrote: > On 03/08/2017 08:00 AM, Goldwyn Rodrigues wrote: >> >> >> On 03/08/2017 01:03 AM, Sagi Grimberg wrote: >>> -if (likely(blk_queue_enter(q, false) == 0)) { +if (likely(blk_queue_enter(q, bio_flagged(bio, BIO_NOWAIT)) == 0))

[PATCH v3.1 0/7] Chunk level degradable check

2017-03-08 Thread Qu Wenruo
Btrfs currently uses num_tolerated_disk_barrier_failures to do a global check for tolerated missing devices. Although the one-size-fits-all solution is quite safe, it's too strict if data and metadata have different duplication levels. For example, if one uses Single data and RAID1 metadata for 2

[PATCH v3.1 2/7] btrfs: Do chunk level rw degrade check at mount time

2017-03-08 Thread Qu Wenruo
Now use btrfs_check_rw_degradable() to do the mount time degradation check. With this patch, we can now mount in the following case:
 # mkfs.btrfs -f -m raid1 -d single /dev/sdb /dev/sdc
 # wipefs -a /dev/sdc
 # mount /dev/sdb /mnt/btrfs -o degraded
As the single data chunk is only in sdb, so

[PATCH v3.1 1/7] btrfs: Introduce a function to check if all chunks a OK for degraded rw mount

2017-03-08 Thread Qu Wenruo
Introduce a new function, btrfs_check_rw_degradable(), to check if all chunks in btrfs are OK for degraded rw mount. It provides the new basis for accurate btrfs mount/remount and even runtime degraded mount checks, rather than the old one-size-fits-all method. Signed-off-by: Qu Wenruo
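
The difference between a per-chunk check and a single global threshold can be sketched in a small userspace model. This is not the kernel implementation: struct chunk_model, model_check_rw_degradable() and model_check_global() are hypothetical names invented for illustration, and the only facts assumed are that the Single profile tolerates no missing device while RAID1 tolerates one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names come from the kernel. */
struct chunk_model {
    const int *devids;   /* devices that hold this chunk's stripes */
    size_t num_stripes;  /* number of entries in devids */
    int max_tolerated;   /* missing devices this chunk's profile survives */
};

static bool dev_missing(int devid, const int *missing, size_t n_missing)
{
    for (size_t i = 0; i < n_missing; i++)
        if (missing[i] == devid)
            return true;
    return false;
}

/* Per-chunk check: each chunk is judged only by its own lost stripes. */
bool model_check_rw_degradable(const struct chunk_model *chunks, size_t n_chunks,
                               const int *missing, size_t n_missing)
{
    assert(chunks != NULL || n_chunks == 0);
    for (size_t i = 0; i < n_chunks; i++) {
        int lost = 0;

        for (size_t s = 0; s < chunks[i].num_stripes; s++)
            if (dev_missing(chunks[i].devids[s], missing, n_missing))
                lost++;
        if (lost > chunks[i].max_tolerated)
            return false;
    }
    return true;
}

/* Global check: total missing devices against the least tolerant profile. */
bool model_check_global(const struct chunk_model *chunks, size_t n_chunks,
                        size_t n_missing)
{
    int min_tolerated = -1;

    for (size_t i = 0; i < n_chunks; i++)
        if (min_tolerated < 0 || chunks[i].max_tolerated < min_tolerated)
            min_tolerated = chunks[i].max_tolerated;
    if (min_tolerated < 0)  /* no chunks at all: nothing can be lost */
        return true;
    return n_missing <= (size_t)min_tolerated;
}
```

With a Single data chunk on device 1 only and a RAID1 metadata chunk on devices 1 and 2, losing device 2 passes the per-chunk model but fails the global one, because the global threshold is dictated by the Single profile even though no Single chunk lived on the lost device.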

[PATCH v3.1 5/7] btrfs: Allow barrier_all_devices to do chunk level device check

2017-03-08 Thread Qu Wenruo
The last user of num_tolerated_disk_barrier_failures is barrier_all_devices(). But it can easily be changed to the new per-chunk degradable check framework. Now btrfs_device will have two extra members, representing send/wait error, set at write_dev_flush() time. With these 2 new members,

[PATCH v3.1 3/7] btrfs: Do chunk level degradation check for remount

2017-03-08 Thread Qu Wenruo
Just the same as for the mount time check, use btrfs_check_rw_degradable() to check if we are OK to be remounted rw. Signed-off-by: Qu Wenruo Tested-by: Austin S. Hemmelgarn Tested-by: Adam Borowski Tested-by: Dmitrii Tcvetkov

[PATCH v3.1 4/7] btrfs: Introduce extra_rw_degrade_errors parameter for btrfs_check_rw_degradable

2017-03-08 Thread Qu Wenruo
Introduce a new structure, extra_rw_degrade_errors, to record devid<->error mapping. This structure will have an array to record runtime errors which affect degraded mount, like failure to flush or wait on one device. Also allow btrfs_check_rw_degradable() to accept such a structure as another error

[PATCH v3.1 6/7] btrfs: Cleanup num_tolerated_disk_barrier_failures

2017-03-08 Thread Qu Wenruo
As we use the per-chunk degradable check, the global num_tolerated_disk_barrier_failures is now of no use. So clean it up. Signed-off-by: Qu Wenruo Tested-by: Austin S. Hemmelgarn Tested-by: Adam Borowski Tested-by: Dmitrii

Re: [PATCH v2 1/6] btrfs: Introduce a function to check if all chunks a OK for degraded rw mount

2017-03-08 Thread Qu Wenruo
At 03/09/2017 02:26 AM, Anand Jain wrote: Introduce a new function, btrfs_check_rw_degradable(), to check if all chunks in btrfs is OK for degraded rw mount. It provides the new basis for accurate btrfs mount/remount and even runtime degraded mount check other than old one-size-fit-all

Re: [PATCH v3 0/7] Chunk level degradable check

2017-03-08 Thread Goffredo Baroncelli
Hi Qu, I made some tests (see table below). Basically I created a hybrid btrfs filesystem composed of a metadata "single" profile, and a data single/raid1/raid10/raid5/raid6 profile. For each test case I tried to remove a disk (which could be used by data and/or metadata), and then I checked

Re: [PATCH v3 0/7] Chunk level degradable check

2017-03-08 Thread Anand Jain
Looks like I found an unrelated bug, though, that messed with my testing. And it looks like a nasty one: once "btrfs dev scan" sees a disk, it stores its device and will then happily use it without verification even if it's been pulled out and replaced by something else. Lemme investigate that

Re: [PATCH v3 0/7] Chunk level degradable check

2017-03-08 Thread Anand Jain
Looks like I found an unrelated bug, though, that messed with my testing. And it looks like a nasty one: once "btrfs dev scan" sees a disk, it stores its device and will then happily use it without verification even if it's been pulled out and replaced by something else. Lemme investigate that

Re: [PATCH v3 0/7] Chunk level degradable check

2017-03-08 Thread Anand Jain
v3: Remove one duplicated missing device output Use the advice from Anand Jain, not to add new members in btrfs_device, but use a new structure extra_rw_degrade_errors, to record error when sending down/waiting device. Suggested local variables because, v2 had theoretical bug as

Re: [PATCH v2 1/6] btrfs: Introduce a function to check if all chunks a OK for degraded rw mount

2017-03-08 Thread Anand Jain
Introduce a new function, btrfs_check_rw_degradable(), to check if all chunks in btrfs is OK for degraded rw mount. It provides the new basis for accurate btrfs mount/remount and even runtime degraded mount check other than old one-size-fit-all method. Sorry for late response. But this

Re: btrfs throws ENOSPC even on almost empty filesystem when using "cfq" scheduler

2017-03-08 Thread Luca Citi
More information in case it may be useful. I also ran the tests with different mainline kernels. I have downloaded kernels from: http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.8.2/ http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.8.3/ The problem does not seem to affect 4.8.2 but it does

[PATCH] Btrfs: consistent usage of types in balance_args

2017-03-08 Thread Hans van Kranenburg
The btrfs_balance_args are only used for the balance ioctl, so use __u instead of __le here for consistency. The __le usage was introduced in bc3094673f22d and dee32d0ac3719 and was probably a result of copy/pasting when the code was written. The usage of __le did not break anything, but it's
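
As a rough illustration of the distinction behind the two type families: a __le field denotes a value stored in a fixed little-endian byte order, whereas a __u field is an ordinary host-order integer. The userspace sketch below models only the fixed-order side using <stdint.h> types; put_le64() and get_le64() are hypothetical helpers written for this example, not kernel functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helpers: serialize a host-order value into the fixed
 * little-endian byte layout that a __le64-style field represents.
 */
void put_le64(uint8_t out[8], uint64_t v)
{
    assert(out != NULL);
    for (int i = 0; i < 8; i++)
        out[i] = (uint8_t)(v >> (8 * i));   /* least significant byte first */
}

/* Read the fixed little-endian layout back into a host-order value. */
uint64_t get_le64(const uint8_t in[8])
{
    uint64_t v = 0;

    assert(in != NULL);
    for (int i = 0; i < 8; i++)
        v |= (uint64_t)in[i] << (8 * i);
    return v;
}
```

A value that only travels between a process and the kernel on the same machine needs no such conversion step, which is the sense in which a host-order type is the consistent choice for an ioctl-only argument structure.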

Re: [RFC PATCH v1 00/30] fs: inode->i_version rework and optimization

2017-03-08 Thread J. Bruce Fields
On Fri, Mar 03, 2017 at 07:53:57PM -0500, Jeff Layton wrote: > On Fri, 2017-03-03 at 18:00 -0500, J. Bruce Fields wrote: > > On Wed, Dec 21, 2016 at 12:03:17PM -0500, Jeff Layton wrote: > > > tl;dr: I think we can greatly reduce the cost of the inode->i_version > > > counter, by exploiting the

[PATCH] Btrfs: fix incorrect space accounting after failure to insert inline extent

2017-03-08 Thread fdmanana
From: Filipe Manana When using compression, if we fail to insert an inline extent we incorrectly end up attempting to free the reserved data space twice, once through extent_clear_unlock_delalloc(), because we pass it the flag EXTENT_DO_ACCOUNTING, and once through a direct
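
Why releasing the same reservation twice corrupts accounting can be shown with a toy counter. This is a sketch under stated assumptions, not btrfs code: struct space_model, model_reserve() and model_release() are hypothetical names, and the model simply tracks one total of reserved bytes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a reserved-data-space counter; not a btrfs structure. */
struct space_model {
    uint64_t reserved;  /* bytes currently reserved by all users */
};

void model_reserve(struct space_model *s, uint64_t bytes)
{
    assert(s != NULL);
    s->reserved += bytes;
}

/* Release bytes; refuse (return false) if that would underflow the counter. */
bool model_release(struct space_model *s, uint64_t bytes)
{
    assert(s != NULL);
    if (bytes > s->reserved)
        return false;
    s->reserved -= bytes;
    return true;
}
```

If two ranges each reserve 4096 bytes and one of them is released twice, the second release still succeeds in this model because the other range's reservation is consumed instead. The counter then reads zero while one range is still outstanding, and its later legitimate release is the call that fails.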

Re: [PATCH 5/8] nowait aio: return on congested block device

2017-03-08 Thread Jens Axboe
On 03/08/2017 08:00 AM, Goldwyn Rodrigues wrote: > > > On 03/08/2017 01:03 AM, Sagi Grimberg wrote: >> >>> -if (likely(blk_queue_enter(q, false) == 0)) { >>> +if (likely(blk_queue_enter(q, bio_flagged(bio, BIO_NOWAIT)) >>> == 0)) { >>> ret = q->make_request_fn(q,

Re: [PATCH 5/8] nowait aio: return on congested block device

2017-03-08 Thread Christoph Hellwig
On Wed, Mar 08, 2017 at 04:28:06PM +0100, Jan Kara wrote: > Well, that's not really good. If we cannot support this for both blk-mq and > legacy block layer the feature will not be usable. So please work on blk-mq > support as well. Exactly. In addition to that anything implementing a feature

Re: [PATCH 5/8] nowait aio: return on congested block device

2017-03-08 Thread Jan Kara
On Wed 08-03-17 09:00:09, Goldwyn Rodrigues wrote: > > > On 03/08/2017 01:03 AM, Sagi Grimberg wrote: > > > >> -if (likely(blk_queue_enter(q, false) == 0)) { > >> +if (likely(blk_queue_enter(q, bio_flagged(bio, BIO_NOWAIT)) > >> == 0)) { > >> ret =

btrfs throws ENOSPC even on almost empty filesystem when using "cfq" scheduler

2017-03-08 Thread Luca Citi
This is a follow-up to my previous bug report: http://www.spinics.net/lists/linux-btrfs/msg62916.html I think this is a serious bug in btrfs and should be investigated. I have made some further progress trying to collect information that may help btrfs developers understand where the problem

Re: [PATCH 0/9 PULL REQUEST] Qgroup fixes for 4.11

2017-03-08 Thread Jeff Mahoney
On 3/6/17 3:08 AM, Qu Wenruo wrote: > Any response? > > These patches are already here for at least 2 kernel releases. > And are all bug fixes, and fix bugs that are already reported. > > I didn't see any reason why it should be delayed for so long time. I'll work my way through as I can. I

Re: [PATCH 5/8] nowait aio: return on congested block device

2017-03-08 Thread Goldwyn Rodrigues
On 03/08/2017 01:03 AM, Sagi Grimberg wrote: > >> -if (likely(blk_queue_enter(q, false) == 0)) { >> +if (likely(blk_queue_enter(q, bio_flagged(bio, BIO_NOWAIT)) >> == 0)) { >> ret = q->make_request_fn(q, bio); > > I think that for ->make_request to not block we'd

Re: [PATCH 2/2] btrfs-progs: receive: handle root subvol path in clone

2017-03-08 Thread David Sterba
On Wed, Feb 22, 2017 at 11:56:37PM +0100, Benedikt Morbach wrote: > testcase: > # ro subvol /src/parent > # rw subvol /src/foo > clone /src/parent/file /src/foo/file > subvol snapshot -r /src/foo /src/foo.snap > > # generates a "clone parent/file -> foo.snap/file" send command

Re: [PATCH] btrfs: Change s_flags instead of returning -EBUSY

2017-03-08 Thread Goldwyn Rodrigues
On 03/07/2017 07:47 AM, David Sterba wrote: > On Sat, Mar 04, 2017 at 12:33:22PM -0600, Goldwyn Rodrigues wrote: >> From: Goldwyn Rodrigues >> >> The problem is with parallel mounting multiple subvolumes rw when the >> root filesystem is marked as read-only such as a boot

Re: [PATCH 3/9] btrfs: qgroup: Fix qgroup corruption caused by inode_cache mount option

2017-03-08 Thread Jeff Mahoney
On 3/7/17 7:36 PM, Qu Wenruo wrote: > > > At 03/08/2017 03:21 AM, Jeff Mahoney wrote: >> On 2/27/17 2:10 AM, Qu Wenruo wrote: >>> [BUG] >>> The easiest way to reproduce the bug is: >>> -- >>> # mkfs.btrfs -f $dev -n 16K >>> # mount $dev $mnt -o inode_cache >>> # btrfs quota enable $mnt >>>

Btrfs progs release 4.10

2017-03-08 Thread David Sterba
Hi, btrfs-progs version 4.10 has been released. There are patches that have been queued so far, plus a few recent additions from the mailing list. Changes: * send: dump output fixes: missing newlines * check: several fixes for the lowmem mode, improved error reporting * build * removed some

Re: [PATCH v3 0/7] Chunk level degradable check

2017-03-08 Thread Austin S. Hemmelgarn
On 2017-03-07 21:41, Qu Wenruo wrote: Btrfs currently uses num_tolerated_disk_barrier_failures to do a global check for tolerated missing devices. Although the one-size-fits-all solution is quite safe, it's too strict if data and metadata have different duplication levels. For example, if one uses

Re: mount policy for subvols if root is remount,ro

2017-03-08 Thread Goldwyn Rodrigues
On 03/07/2017 07:59 AM, Goldwyn Rodrigues wrote: > Hi, > > I want to know if re-mounting the root filesystem read-only should > change subvolumes mounted as read-write to read-only as well? We do > allow mounting subvolumes RW _after_ the root filesystem is mounted RO. > > # mount /dev/vdb

Re: [PATCH v2] btrfs-progs: report I/O errors when closing the filesystem

2017-03-08 Thread David Sterba
On Mon, Mar 06, 2017 at 08:42:01AM +0800, Qu Wenruo wrote: > > > At 03/04/2017 01:02 AM, Omar Sandoval wrote: > > From: Omar Sandoval > > > > If the final fsync() on the Btrfs device fails, we just swallow the > > error and don't alert the user in any way. This was uncovered by

Re: Btrfs progs pre-release 4.10-rc1

2017-03-08 Thread David Sterba
On Wed, Mar 08, 2017 at 09:11:37AM +0900, Tsutomu Itoh wrote: > >>> > >>> Benedikt Morbach (1): > >>> btrfs-progs: send-dump: add missing newlines > >>> > >>> David Sterba (102): > >> > >>> btrfs-progs: rework option parser to use getopt for global options > >> > >> I think that

Re: [PATCH v7 2/2] btrfs: Handle delalloc error correctly to avoid ordered extent hang

2017-03-08 Thread Filipe Manana
On Wed, Mar 8, 2017 at 2:25 AM, Qu Wenruo wrote: > [BUG] > If run_delalloc_range() returns error and there is already some ordered > extents created, btrfs will be hanged with the following backtrace: > > Call Trace: > __schedule+0x2d4/0xae0 > schedule+0x3d/0x90 >

Re: [PATCH v7 1/2] btrfs: Fix metadata underflow caused by btrfs_reloc_clone_csum error

2017-03-08 Thread Filipe Manana
On Wed, Mar 8, 2017 at 2:25 AM, Qu Wenruo wrote: > [BUG] > When btrfs_reloc_clone_csum() reports error, it can underflow metadata > and leads to kernel assertion on outstanding extents in > run_delalloc_nocow() and cow_file_range(). > > BTRFS info (device vdb5):

Re: [PATCH] btrfs: add missing memset while reading compressed inline extents

2017-03-08 Thread Filipe Manana
On Wed, Mar 8, 2017 at 3:18 AM, Zygo Blaxell wrote: > From: Zygo Blaxell > > This is a story about 4 distinct (and very old) btrfs bugs. > > Commit c8b978188c ("Btrfs: Add zlib compression support") added > three data corruption bugs

Re: [PATCH v2] Btrfs: fix invalid attempt to free reserved space on failure to cow range

2017-03-08 Thread Liu Bo
On Tue, Mar 07, 2017 at 04:24:49AM +, fdman...@kernel.org wrote: > From: Filipe Manana > > When attempting to COW a file range (we are starting writeback and doing > COW), if we manage to reserve an extent for the range we will write into > but fail after reserving it and

Re: [PATCH v3 0/7] Chunk level degradable check

2017-03-08 Thread Dmitrii Tcvetkov
On Wed, 8 Mar 2017 10:41:17 +0800 Qu Wenruo wrote: > This patchset will introduce a new per-chunk degradable check for > btrfs, allow above case to succeed, and it's quite small anyway. > v2: > Update after almost 2 years. > Add the last patch to enhance the kernel