Re: [PATCH 3/6] fstests: regression test for btrfs dio read repair

2017-05-16 Thread Liu Bo
On Tue, May 16, 2017 at 11:48:46AM -0600, Liu Bo wrote: > On Wed, May 10, 2017 at 06:53:26PM +0800, Eryu Guan wrote: > > On Tue, May 09, 2017 at 11:56:08AM -0600, Liu Bo wrote: [...] > > > + > > > +# step 3, 128k dio read (this read can repair bad copy) > > > +echo "step 3..repair the bad

Re: Can't remount a BTRFS partition read write after a drive failure

2017-05-16 Thread Chris Murphy
On Tue, May 16, 2017 at 6:56 AM, Sylvain Leroux wrote: > Hi, > I'm investigating BTRFS using an external USB HDD on a Linux Debian > Stretch/Sid system. > > The drive is not reliable. And I noticed when there is an error and the > USB device appears to be dead to the kernel,

[RFC PATCH v3.2 6/6] btrfs: qgroup: Fix qgroup reserved space underflow by only freeing reserved ranges

2017-05-16 Thread Qu Wenruo
[BUG] For the following case, btrfs can underflow qgroup reserved space at error path: (Page size 4K, function name without "btrfs_" prefix) Task A | Task B -- Buffered_write [0, 2K)

[RFC PATCH v3.2 3/6] btrfs: qgroup: Return actually freed bytes for qgroup release or free data

2017-05-16 Thread Qu Wenruo
btrfs_qgroup_release/free_data() only returns 0 or a negative error number (ENOMEM is the only possible error). This is normally good enough, but sometimes we need the exact number of bytes it freed/released. Change it to return the number of bytes actually released/freed instead of 0 on success. And
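The return-value convention this patch describes (a byte count on success, a negative errno on failure) can be sketched in plain userspace C. This is only an illustration under stated assumptions, not the kernel code: toy_release_range(), its parameters, the reserved[] array, and the simulate_enomem switch are all hypothetical names invented for the example.

```c
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical userspace stand-in; this is NOT btrfs code.
 * Each slot of reserved[] holds the bytes reserved for one block.
 * Returns the number of bytes actually released (>= 0), or -ENOMEM
 * when the caller asks to simulate an allocation failure, rather
 * than a bare 0 for success.
 */
long toy_release_range(long *reserved, size_t nr, size_t start,
                       size_t end, int simulate_enomem)
{
    long freed = 0;

    if (simulate_enomem)
        return -ENOMEM;

    /* Release every block in [start, end), clamped to the array size. */
    for (size_t i = start; i < end && i < nr; i++) {
        freed += reserved[i];
        reserved[i] = 0;
    }
    return freed;
}
```

With this shape a caller can tell "nothing was reserved" (0) from "8192 bytes were released" and adjust its own accounting by exactly the returned amount, while still checking for a negative value on error.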

[RFC PATCH v3.2 4/6] btrfs: qgroup: Fix qgroup reserved space underflow caused by buffered write and quota enable

2017-05-16 Thread Qu Wenruo
[BUG] Under the following case, we can underflow qgroup reserved space. Task A|Task B --- Quota disabled | Buffered write | |- btrfs_check_data_free_space() |

[RFC PATCH v3.2 1/6] btrfs: qgroup: Add quick exit for non-fs extents

2017-05-16 Thread Qu Wenruo
Modify btrfs_qgroup_account_extent() to exit quicker for non-fs extents. The quick exit condition is: 1) The extent belongs to a non-fs tree. Only fs-tree extents can affect qgroup numbers, and they are the only case where an extent can be shared between different trees. Although strictly speaking

[RFC PATCH v3.2 5/6] btrfs: qgroup: Introduce extent changeset for qgroup reserve functions

2017-05-16 Thread Qu Wenruo
Introduce a new parameter, struct extent_changeset, for btrfs_qgroup_reserve_data() and its callers. Such an extent_changeset was used in btrfs_qgroup_reserve_data() to record which range it reserved in the current reserve, so it can free it in the error path. The reason we need to export it to callers is,

[RFC PATCH v3.2 2/6] btrfs: qgroup: Cleanup btrfs_qgroup_prepare_account_extents function

2017-05-16 Thread Qu Wenruo
Quite a lot of qgroup corruption happens due to wrong timing of calling btrfs_qgroup_prepare_account_extents(). Since the safest timing is calling it just before btrfs_qgroup_account_extents(), there is no need to separate these two functions. Merging them will make the code cleaner and less bug-prone.

[RFC PATCH v3.2 0/6] Qgroup fixes, Non-stack version

2017-05-16 Thread Qu Wenruo
The remaining qgroup fix patches, based on Chris' for-linus-4.12 branch with commit 9bcaaea7418d09691f1ffab5c49aacafe3eef9d0 as base. Can be fetched from github: https://github.com/adam900710/linux/tree/qgroup_fixes_non_stack This update changes commit messages only. v2: Add

Re: [RFC PATCH v3.1 3/6] btrfs: qgroup: Return actually freed bytes for qgroup release or free data

2017-05-16 Thread Qu Wenruo
At 05/17/2017 09:12 AM, Qu Wenruo wrote: At 05/16/2017 10:23 PM, David Sterba wrote: On Fri, May 12, 2017 at 11:30:43AM +0800, Qu Wenruo wrote: btrfs_qgroup_release/free_data() only returns 0 or minus error number(ENOMEM is the only possible error). "btrfs_qgroup_release/free_data() only

Re: [RFC PATCH v3.1 3/6] btrfs: qgroup: Return actually freed bytes for qgroup release or free data

2017-05-16 Thread Qu Wenruo
At 05/16/2017 10:23 PM, David Sterba wrote: On Fri, May 12, 2017 at 11:30:43AM +0800, Qu Wenruo wrote: btrfs_qgroup_release/free_data() only returns 0 or minus error number(ENOMEM is the only possible error). "btrfs_qgroup_release/free_data() only returns 0 or a negative error number

Re: "Corrected" errors persist after scrubbing

2017-05-16 Thread Chris Murphy
On Tue, May 16, 2017 at 3:53 AM, Tom Hale wrote: > # mdadm --build /dev/md0 --level=faulty --raid-devices=1 /dev/loop0 > # mdadm --grow /dev/md0 --layout=rp400 > layout for /dev/md0 set to 12803 I'm not familiar with this method of simulating problems. Everything I've seen on this

Re: [PATCH 3/6] fstests: regression test for btrfs dio read repair

2017-05-16 Thread Liu Bo
On Wed, May 10, 2017 at 06:53:26PM +0800, Eryu Guan wrote: > On Tue, May 09, 2017 at 11:56:08AM -0600, Liu Bo wrote: > > This case tests whether dio read can repair the bad copy if we have > > a good copy. > > > > Commit 2dabb3248453 ("Btrfs: Direct I/O read: Work on sectorsized blocks") > >

Re: [PATCH 2/6] Btrfs: use bio_clone_bioset_partial to simplify DIO submit

2017-05-16 Thread Liu Bo
On Tue, May 16, 2017 at 07:37:37AM -0700, Christoph Hellwig wrote: > > } > > > > +struct bio *btrfs_bio_clone_partial(struct bio *orig, gfp_t gfp_mask, int > > offset, int size) > > +{ > > + struct bio *bio; > > + > > + bio = bio_clone_fast(orig, gfp_mask, btrfs_bioset); > > + if (bio) {

[PATCH 4/4] btrfs: remove unused member list from btrfs_end_io_wq

2017-05-16 Thread David Sterba
The end io work queue items have been tracked by the work queues since "Btrfs: Add async worker threads for pre and post IO checksumming" (8b7128429235d9bd72cfd5e) (2008). Signed-off-by: David Sterba --- fs/btrfs/disk-io.c | 1 - 1 file changed, 1 deletion(-) diff --git

[PATCH 0/4] Btrfs, remove unused structure members

2017-05-16 Thread David Sterba
With some help from coccinelle and coccigrep I found a few structure members that are completely unused; we can remove them and reduce structure sizes. David Sterba (4): btrfs: remove unused member err from reada_extent btrfs: remove unused member list from async_submit_bio btrfs: remove

[PATCH 1/4] btrfs: remove unused member err from reada_extent

2017-05-16 Thread David Sterba
Seems to be unused since the initial commit, we ignore readahead errors anyway, the full read will handle that if necessary. Signed-off-by: David Sterba --- fs/btrfs/reada.c | 1 - 1 file changed, 1 deletion(-) diff --git a/fs/btrfs/reada.c b/fs/btrfs/reada.c index

[PATCH 2/4] btrfs: remove unused member list from async_submit_bio

2017-05-16 Thread David Sterba
The list was used to track checksums in the early version (2.6.29), but I was not able to pinpoint the commit that stopped using it. Everything has apparently worked without it for a long time. Signed-off-by: David Sterba --- fs/btrfs/disk-io.c | 1 - 1 file changed, 1 deletion(-) diff

[PATCH 3/4] btrfs: remove unused members dir_path from recorded_ref

2017-05-16 Thread David Sterba
The two members do not seem to be used since the initial commit. Signed-off-by: David Sterba --- fs/btrfs/send.c | 8 1 file changed, 8 deletions(-) diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c index fc496a6f842a..e8185c83f667 100644 --- a/fs/btrfs/send.c +++

[PATCH 09/10] btrfs: scrub: clean up division in scrub_find_csum

2017-05-16 Thread David Sterba
Use proper helpers for 64bit division. Signed-off-by: David Sterba --- fs/btrfs/scrub.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c index a2366f041f47..6c8c4535dc43 100644 --- a/fs/btrfs/scrub.c +++

[PATCH 10/10] btrfs: scrub: simplify scrub worker initialization

2017-05-16 Thread David Sterba
Minor simplification, merge calls to one. Signed-off-by: David Sterba --- fs/btrfs/scrub.c | 10 ++ 1 file changed, 2 insertions(+), 8 deletions(-) diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c index 6c8c4535dc43..59b053feb42e 100644 --- a/fs/btrfs/scrub.c +++

[PATCH 06/10] btrfs: scrub: use bool for flush_all_writes

2017-05-16 Thread David Sterba
flush_all_writes is an atomic but does not use the atomic semantics at all; it's just an on/off indicator, so we can use bool. Signed-off-by: David Sterba --- fs/btrfs/scrub.c | 18 -- 1 file changed, 8 insertions(+), 10 deletions(-) diff --git a/fs/btrfs/scrub.c

[PATCH 03/10] btrfs: scrub: simplify cleanup of wr_ctx in scrub_free_ctx

2017-05-16 Thread David Sterba
We don't need to take the mutex and zero out wr_cur_bio, as this is called after the scrub finished. Signed-off-by: David Sterba --- fs/btrfs/scrub.c | 6 +- 1 file changed, 1 insertion(+), 5 deletions(-) diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c index

[PATCH 05/10] btrfs: scrub: remove pages_per_wr_bio from scrub context

2017-05-16 Thread David Sterba
Its only purpose seems to be to store SCRUB_PAGES_PER_WR_BIO; we can use the constant directly. Signed-off-by: David Sterba --- fs/btrfs/scrub.c | 6 ++ 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c index

[PATCH 08/10] btrfs: scrub: clean up division in __scrub_mark_bitmap

2017-05-16 Thread David Sterba
Use proper helpers for 64bit division and then cast to narrower type. Signed-off-by: David Sterba --- fs/btrfs/scrub.c | 8 ++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c index 5b135df691fa..a2366f041f47 100644 ---
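The pattern named in this patch (divide in 64 bits through a helper, then narrow explicitly) can be sketched in userspace C. The kernel provides div_u64() for a 64-bit dividend and 32-bit divisor because a plain `/` on 64-bit operands does not link on 32-bit kernel builds; the sketch below only mirrors that shape. toy_div_u64() and toy_sector_index() are hypothetical names for illustration, not the code touched by the patch.

```c
#include <stdint.h>

/*
 * Userspace stand-in mirroring the shape of the kernel's div_u64():
 * 64-bit dividend, 32-bit divisor, 64-bit quotient.
 */
static inline uint64_t toy_div_u64(uint64_t dividend, uint32_t divisor)
{
    return dividend / divisor;
}

/*
 * Hypothetical example: index of the sector that contains byte `offset`.
 * The division is done in 64 bits first; the result is narrowed only
 * afterwards, with an explicit cast. The caller must ensure it fits.
 */
uint32_t toy_sector_index(uint64_t offset, uint32_t sectorsize)
{
    uint64_t idx = toy_div_u64(offset, sectorsize);

    return (uint32_t)idx;
}
```

Keeping the division and the narrowing as two separate steps makes the truncation visible at the point where it happens.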

[PATCH 04/10] btrfs: scrub: embed scrub_wr_ctx into scrub context

2017-05-16 Thread David Sterba
The structure scrub_wr_ctx is not used anywhere but the scrub context, so we can move its members there. The tgtdev is renamed so it's clearer that it belongs to the "wr" part. Signed-off-by: David Sterba --- fs/btrfs/scrub.c | 103

[PATCH 07/10] btrfs: scrub: use fs_info::sectorsize and drop it from scrub context

2017-05-16 Thread David Sterba
As we now have the node/block sizes in fs_info, we can use them and can drop the local copies. Signed-off-by: David Sterba --- fs/btrfs/scrub.c | 24 ++-- 1 file changed, 10 insertions(+), 14 deletions(-) diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c

[PATCH 02/10] btrfs: scrub: inline helper scrub_free_wr_ctx

2017-05-16 Thread David Sterba
The helper scrub_free_wr_ctx is used only once and fits into scrub_free_ctx as it continues sctx shutdown, no need to keep it separate. Signed-off-by: David Sterba --- fs/btrfs/scrub.c | 14 -- 1 file changed, 4 insertions(+), 10 deletions(-) diff --git

[PATCH 00/10] Btrfs, minor scrub code cleanups

2017-05-16 Thread David Sterba
A bunch of minor cleanups in the scrub code. For 4.13. David Sterba (10): btrfs: scrub: inline helper scrub_setup_wr_ctx btrfs: scrub: inline helper scrub_free_wr_ctx btrfs: scrub: simplify cleanup of wr_ctx in scrub_free_ctx btrfs: scrub: embed scrub_wr_ctx into scrub context btrfs:

[PATCH 01/10] btrfs: scrub: inline helper scrub_setup_wr_ctx

2017-05-16 Thread David Sterba
The helper scrub_setup_wr_ctx is used only once and fits into scrub_setup_ctx as it continues initialization, no need to keep it separate. Signed-off-by: David Sterba --- fs/btrfs/scrub.c | 36 +--- 1 file changed, 9 insertions(+), 27

Re: Btrfs/SSD

2017-05-16 Thread Kai Krakow
Am Tue, 16 May 2017 14:21:20 +0200 schrieb Tomasz Torcz : > On Tue, May 16, 2017 at 03:58:41AM +0200, Kai Krakow wrote: > > Am Mon, 15 May 2017 22:05:05 +0200 > > schrieb Tomasz Torcz : > > > [...] > > > > > > Let me add my 2 cents.

[PATCH] btrfs: use generic slab for btrfs_transaction

2017-05-16 Thread David Sterba
Observing the number of slab objects of btrfs_transaction, there's just one active on an almost quiescent filesystem, and the number of objects goes to about ten when sync is in progress. Then the number goes down to 1. This matches the expectations of the transaction lifetime. For such use the

Re: [PATCH 2/6] Btrfs: use bio_clone_bioset_partial to simplify DIO submit

2017-05-16 Thread Christoph Hellwig
> } > > +struct bio *btrfs_bio_clone_partial(struct bio *orig, gfp_t gfp_mask, int > offset, int size) > +{ > + struct bio *bio; > + > + bio = bio_clone_fast(orig, gfp_mask, btrfs_bioset); > + if (bio) { bio_clone_fast will never fail when backed by a bioset, which this one always

Re: [RFC PATCH v3.1 3/6] btrfs: qgroup: Return actually freed bytes for qgroup release or free data

2017-05-16 Thread David Sterba
On Fri, May 12, 2017 at 11:30:43AM +0800, Qu Wenruo wrote: > btrfs_qgroup_release/free_data() only returns 0 or minus error > number(ENOMEM is the only possible error). "btrfs_qgroup_release/free_data() only returns 0 or a negative error number (ENOMEM is the only possible error)." > This is

Can't remount a BTRFS partition read write after a drive failure

2017-05-16 Thread Sylvain Leroux
Hi, I'm investigating BTRFS using an external USB HDD on a Linux Debian Stretch/Sid system. The drive is not reliable, and I noticed that when there is an error and the USB device appears to be dead to the kernel, I am later unable to remount the drive read-write. I can mount it read-only, though. This seems

Re: [RFC PATCH v3.1 1/6] btrfs: qgroup: Add quick exit for non-fs extents

2017-05-16 Thread David Sterba
On Fri, May 12, 2017 at 11:30:41AM +0800, Qu Wenruo wrote: > For btrfs_qgroup_account_extent(), modify it to exit quicker for > non-fs extents. A short explanation why this is ok would be desired here. -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a

Re: [RFC PATCH 0/2] Introduce blkdev_issue_flush_no_wait()

2017-05-16 Thread Bart Van Assche
On Tue, 2017-05-16 at 17:39 +0800, Anand Jain wrote: > BTRFS wanted a block device flush function which does not wait for > its completion, so that the flush for the next device can be called > in the same thread. > > Here is an RFC patch to provide the function > 'blkdev_issue_flush_no_wait()',

Re: Btrfs/SSD

2017-05-16 Thread Austin S. Hemmelgarn
On 2017-05-16 08:21, Tomasz Torcz wrote: On Tue, May 16, 2017 at 03:58:41AM +0200, Kai Krakow wrote: Am Mon, 15 May 2017 22:05:05 +0200 schrieb Tomasz Torcz : My drive has # smartctl -a /dev/sda | grep LBA 241 Total_LBAs_Written 0x0032 099 099 000 Old_age

Re: Btrfs/SSD

2017-05-16 Thread Tomasz Torcz
On Tue, May 16, 2017 at 03:58:41AM +0200, Kai Krakow wrote: > Am Mon, 15 May 2017 22:05:05 +0200 > schrieb Tomasz Torcz : > > > > Yes, I considered that, too. And when I tried, there was almost no > > > perceivable performance difference between bcache-writearound and > > >

Re: [PATCH 2/2] btrfs: Use blkdev_issue_flush_no_wait()

2017-05-16 Thread Christoph Hellwig
On Tue, May 16, 2017 at 05:39:14PM +0800, Anand Jain wrote: > Signed-off-by: Anand Jain An explanation on why things are changed is entirely missing here.. > --- > fs/btrfs/disk-io.c | 108 > - > fs/btrfs/volumes.h |

Re: [PATCH 1/2] block: Introduce blkdev_issue_flush_no_wait()

2017-05-16 Thread Christoph Hellwig
On Tue, May 16, 2017 at 05:39:13PM +0800, Anand Jain wrote: > blkdev_issue_flush() is a blocking function and returns only after > the flush bio is completed, so a module handling more than one > device can't issue flush for all the devices unless it uses a worker > thread. > > This patch adds a

Re: Btrfs/SSD

2017-05-16 Thread Austin S. Hemmelgarn
On 2017-05-15 15:49, Kai Krakow wrote: Am Mon, 15 May 2017 08:03:48 -0400 schrieb "Austin S. Hemmelgarn" : That's why I don't trust any of my data to them. But I still want the benefit of their speed. So I use SSDs mostly as frontend caches to HDDs. This gives me big

Re: "Corrected" errors persist after scrubbing

2017-05-16 Thread Austin S. Hemmelgarn
On 2017-05-16 05:53, Tom Hale wrote: Hi Chris, On 09/05/17 02:26, Chris Murphy wrote: Read errors are fixed by overwrites. If the underlying device doesn't report an error for the write command, it's assumed to succeed. Even md and LVM raid's do this. I understand assuming writes succeed in

Re: [PATCH] btrfs: add mount umount logs

2017-05-16 Thread David Sterba
On Tue, May 16, 2017 at 04:41:49PM +0800, Anand Jain wrote: > By looking at the logs we should be able to know when the FS was > mounted and unmounted and the options used, so to help forensic > investigations. Could be a useful feature, but you're adding it to the wrong layer.

Re: "Corrected" errors persist after scrubbing

2017-05-16 Thread Tom Hale
Hi Chris, On 09/05/17 02:26, Chris Murphy wrote: > Read errors are fixed by overwrites. If the underlying device doesn't > report an error for the write command, it's assumed to succeed. Even > md and LVM raid's do this. I understand assuming writes succeed in general. However, for a tool which

Re: [PATCH v2] btrfs: add framework to handle device flush error as a volume

2017-05-16 Thread Anand Jain
On 05/10/2017 01:12 AM, David Sterba wrote: On Sat, May 06, 2017 at 07:17:54AM +0800, Anand Jain wrote: This adds comments to the flush error handling part of the code, and hopes to maintain the same logic with a framework which can be used to handle the errors at the volume level.

[PATCH 2/2] btrfs: Use blkdev_issue_flush_no_wait()

2017-05-16 Thread Anand Jain
Signed-off-by: Anand Jain --- fs/btrfs/disk-io.c | 108 - fs/btrfs/volumes.h | 2 +- 2 files changed, 33 insertions(+), 77 deletions(-) diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index

[PATCH 1/2] block: Introduce blkdev_issue_flush_no_wait()

2017-05-16 Thread Anand Jain
blkdev_issue_flush() is a blocking function and returns only after the flush bio is completed, so a module handling more than one device can't issue a flush for all the devices unless it uses a worker thread. This patch adds a new function blkdev_issue_flush_no_wait(), which uses submit_bio() instead

[RFC PATCH 0/2] Introduce blkdev_issue_flush_no_wait()

2017-05-16 Thread Anand Jain
BTRFS wanted a block device flush function which does not wait for its completion, so that the flush for the next device can be called in the same thread. Here is an RFC patch to provide the function 'blkdev_issue_flush_no_wait()', which is based on the current device flush function

Re: [PATCH] btrfs: add mount umount logs

2017-05-16 Thread Qu Wenruo
Yes, this is indeed a nice feature. But what about only outputting this info when BTRFS_DEBUG is selected? I'm sure there will be some guys not liking such kernel message flooding everywhere when doing testing. Thanks, Qu At 05/16/2017 04:41 PM, Anand Jain wrote: By looking at the logs we

[PATCH] btrfs: add mount umount logs

2017-05-16 Thread Anand Jain
By looking at the logs we should be able to tell when the FS was mounted and unmounted and which options were used, which helps forensic investigations. Signed-off-by: Anand Jain --- fs/btrfs/super.c | 17 + 1 file changed, 17 insertions(+) diff --git