[PATCH v4 13/27] lib: add errseq_t type and infrastructure for handling it

2017-05-09 Thread Jeff Layton
An errseq_t is a way of recording errors in one place, and allowing any number of "subscribers" to tell whether an error has been set again since a previous time. It's implemented as an unsigned 32-bit value that is managed with atomic operations. The low order bits are designated to hold an

[PATCH v4 10/27] 9p: set mapping error when writeback fails in launder_page

2017-05-09 Thread Jeff Layton
launder_page is just writeback under the page lock. We still need to mark the mapping for errors there when they occur. Signed-off-by: Jeff Layton Reviewed-by: Jan Kara Reviewed-by: Christoph Hellwig --- fs/9p/vfs_addr.c | 5 - 1 file

[GIT PULL] Btrfs

2017-05-09 Thread Chris Mason
Hi Linus, My for-linus-4.12 branch: git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs.git for-linus-4.12 Has fixes and cleanups Dave Sterba collected for the merge window. The biggest functional fixes are between btrfs raid5/6 and scrub, and raid5/6 and device replacement. Some

Re: [PATCH 1/1] btrfs-progs: send: fail on first -ENODATA only

2017-05-09 Thread David Sterba
On Mon, May 01, 2017 at 03:09:41PM +0200, Christian Brauner wrote: > The original bug-reporter verified that my patch fixes the bug. See > > https://bugzilla.kernel.org/show_bug.cgi?id=195597 Thanks, patch applied. I'll add reference to the changelog.

[PATCH v4 01/27] fs: remove unneeded forward definition of mm_struct from fs.h

2017-05-09 Thread Jeff Layton
Signed-off-by: Jeff Layton --- include/linux/fs.h | 2 -- 1 file changed, 2 deletions(-) diff --git a/include/linux/fs.h b/include/linux/fs.h index 7251f7bb45e8..38adefd8e2a0 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -1252,8 +1252,6 @@ extern void

Re: [PATCH] btrfs: remove redundant assignment and check on variable ret

2017-05-09 Thread Colin Ian King
On 09/05/17 17:55, Liu Bo wrote: > On Sat, May 06, 2017 at 11:01:05PM +0100, Colin King wrote: >> From: Colin Ian King >> >> Variable ret is assigned to zero and is always zero throughout the >> function. Thus the check for ret being less than zero is always >> false

[PATCH v4 12/27] cifs: set mapping error when page writeback fails in writepage or launder_pages

2017-05-09 Thread Jeff Layton
Signed-off-by: Jeff Layton Reviewed-by: Christoph Hellwig --- fs/cifs/file.c | 12 +++- 1 file changed, 7 insertions(+), 5 deletions(-) diff --git a/fs/cifs/file.c b/fs/cifs/file.c index 21d404535739..0bee7f8d91ad 100644 --- a/fs/cifs/file.c +++

[PATCH v4 14/27] fs: new infrastructure for writeback error handling and reporting

2017-05-09 Thread Jeff Layton
Most filesystems currently use mapping_set_error and filemap_check_errors for setting and reporting/clearing writeback errors at the mapping level. filemap_check_errors is indirectly called from most of the filemap_fdatawait_* functions and from filemap_write_and_wait*. These functions are called

[xfstests PATCH v2 1/3] generic: add a writeback error handling test

2017-05-09 Thread Jeff Layton
I'm working on a set of kernel patches to change how writeback errors are handled and reported in the kernel. Instead of reporting a writeback error to only the first fsync caller on the file, I aim to make the kernel report them once on every file description. This patch adds a test for the new

[xfstests PATCH v2 0/3] xfstest for updated writeback error handling

2017-05-09 Thread Jeff Layton
I've numbered the new test as 999 for the moment so as not to collide with tests being added while I've been working on this. I can change that and resend if this should go in. I'm working on a set of kernel patches to change how writeback errors are handled and reported in the kernel. Instead of

Re: [PATCH v2] btrfs: add framework to handle device flush error as a volume

2017-05-09 Thread David Sterba
On Sat, May 06, 2017 at 07:17:54AM +0800, Anand Jain wrote: > This adds comments to the flush error handling part of > the code, and hopes to maintain the same logic with a > framework which can be used to handle the errors at the > volume level. > > Signed-off-by: Anand Jain

Re: "corrupt leaf, invalid item offset size pair"

2017-05-09 Thread Roman Mamedov
On Mon, 8 May 2017 20:05:44 +0200 "Janos Toth F." wrote: > May be someone more talented will be able to assist you but in my > experience this kind of damage is fatal in practice (even if you could > theoretically fix it, it's probably easier to recreate the fs and >

Re: [PATCH v3] btrfs: relocation: Enhance kernel error output for relocation

2017-05-09 Thread David Sterba
On Wed, Feb 15, 2017 at 09:39:05AM +0800, Qu Wenruo wrote: > When balance(relocation) fails, btrfs-progs will report like: > > ERROR: error during balancing '/mnt/scratch': Input/output error > There may be more info in syslog - try dmesg | tail > > However kernel can't provide may useful info

Re: [PATCH] btrfs-progs: print-tree: Add leaf flags and backref revision output

2017-05-09 Thread David Sterba
On Mon, May 08, 2017 at 03:38:10PM +0800, Qu Wenruo wrote: > Btrfs header has a u64 member flags, whose lowest 56 bits are for header > flags like WRITTEN and RELOC. > And its highest 8 bits are for backref revision. > > Manually checking btrfs_header_flags() will be a pain, so add such leaf >

[PATCH v4 23/27] gfs2: clean up some filemap_* calls

2017-05-09 Thread Jeff Layton
In some places, it's trying to reset the mapping error after calling filemap_fdatawait. That's no longer required. Also, turn several filemap_fdatawrite+filemap_fdatawait calls into filemap_write_and_wait. That will at least return writeback errors that occur during the write phase.

[PATCH v4 22/27] jbd2: don't reset error in journal_finish_inode_data_buffers

2017-05-09 Thread Jeff Layton
Now that we don't clear writeback errors after fetching them, there is no need to reset them. This is also potentially racy. Signed-off-by: Jeff Layton --- fs/jbd2/commit.c | 13 ++--- 1 file changed, 2 insertions(+), 11 deletions(-) diff --git a/fs/jbd2/commit.c

[PATCH v4 08/27] dax: set errors in mapping when writeback fails

2017-05-09 Thread Jeff Layton
Jan's description for this patch is much better than mine, so I'm quoting it verbatim here: DAX currently doesn't set errors in the mapping when cache flushing fails in dax_writeback_mapping_range(). Since this function can get called only from fsync(2) or sync(2), this is actually as good as it

[PATCH v4 11/27] fuse: set mapping error in writepage_locked when it fails

2017-05-09 Thread Jeff Layton
This ensures that we see errors on fsync when writeback fails. Signed-off-by: Jeff Layton Reviewed-by: Christoph Hellwig --- fs/fuse/file.c | 1 + 1 file changed, 1 insertion(+) diff --git a/fs/fuse/file.c b/fs/fuse/file.c index ec238fb5a584..07d0efcb050c

Re: [PATCH] btrfs: remove redundant assignment and check on variable ret

2017-05-09 Thread Liu Bo
On Sat, May 06, 2017 at 11:01:05PM +0100, Colin King wrote: > From: Colin Ian King > > Variable ret is assigned to zero and is always zero throughout the > function. Thus the check for ret being less than zero is always > false and so mapping_set_error always has an

[PATCH v4 07/27] orangefs: don't call filemap_write_and_wait from fsync

2017-05-09 Thread Jeff Layton
Orangefs doesn't do buffered writes yet, so there's no point in initiating and waiting for writeback. Signed-off-by: Jeff Layton Reviewed-by: Christoph Hellwig Acked-by: Mike Marshall --- fs/orangefs/file.c | 5 + 1 file changed, 1

[PATCH v4 05/27] btrfs: btrfs_wait_tree_block_writeback can be void return

2017-05-09 Thread Jeff Layton
Nothing checks its return value. Signed-off-by: Jeff Layton --- fs/btrfs/disk-io.c | 6 +++--- fs/btrfs/disk-io.h | 2 +- 2 files changed, 4 insertions(+), 4 deletions(-) diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index eb1ee7b6f532..8c479bd5534a 100644 ---

[PATCH v4 03/27] mm: fix mapping_set_error call in me_pagecache_dirty

2017-05-09 Thread Jeff Layton
The error code should be negative. Since this ends up in the default case anyway, this is harmless, but it's less confusing to negate it. Also, later patches will require a negative error code here. Signed-off-by: Jeff Layton Reviewed-by: Ross Zwisler

[PATCH v4 20/27] cifs: cleanup writeback handling errors and comments

2017-05-09 Thread Jeff Layton
Now that writeback errors are handled on a per-file basis using the new sequence counter method at the vfs layer, we no longer need to re-set errors in the mapping after doing writeback in non-fsync codepaths. Also, fix up some bogus comments. Signed-off-by: Jeff Layton ---

[PATCH v4 18/27] mm: don't TestClearPageError in __filemap_fdatawait_range

2017-05-09 Thread Jeff Layton
The -EIO returned here can end up overriding whatever error is marked in the address space, and be returned at fsync time, even when there is a more appropriate error stored in the mapping. Read errors are also sometimes tracked on a per-page level using PG_error. Suppose we have a read error on

[PATCH v4 19/27] buffer: set errors in mapping at the time that the error occurs

2017-05-09 Thread Jeff Layton
I noticed on xfs that I could still sometimes get back an error on fsync on a fd that was opened after the error condition had been cleared. The problem is that the buffer code sets the write_io_error flag and then later checks that flag to set the error in the mapping. That flag persists for

[PATCH v4 17/27] mm: remove AS_EIO and AS_ENOSPC flags

2017-05-09 Thread Jeff Layton
They're no longer used. Signed-off-by: Jeff Layton --- include/linux/pagemap.h | 10 -- 1 file changed, 4 insertions(+), 6 deletions(-) diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h index 32512ffc15fa..9593eac41499 100644 ---

[PATCH v4 06/27] fs: check for writeback errors after syncing out buffers in generic_file_fsync

2017-05-09 Thread Jeff Layton
ext2 currently does a test+clear of the AS_EIO flag, which is problematic for some coming changes. What we really need to do instead is call filemap_check_errors in __generic_file_fsync after syncing out the buffers. That will be sufficient for this case, and help other callers detect these

[PATCH v4 09/27] nilfs2: set the mapping error when calling SetPageError on writeback

2017-05-09 Thread Jeff Layton
In a later patch, we're going to want to make the fsync codepath not do a TestClearPageError call as that can override the error set in the address space. To do that though, we need to ensure that filesystems that are relying on the PG_error bit for reporting writeback errors also set an error in

[PATCH v4 04/27] buffer: use mapping_set_error instead of setting the flag

2017-05-09 Thread Jeff Layton
Signed-off-by: Jeff Layton Reviewed-by: Jan Kara Reviewed-by: Matthew Wilcox Reviewed-by: Christoph Hellwig --- fs/buffer.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/buffer.c b/fs/buffer.c

Re: [PATCH] btrfs-progs: btrfs-convert: Add larger device support

2017-05-09 Thread Lakshmipathi.G
> > Does the mke2fs command exist on your system? We maybe want to use > mkfs.ext4 directly. Yes, mke2fs is there. (I'll check using mkfs.ext4 directly) # ./convert-tests/009-common-inode-flags/test.sh [TEST/conv] common inode flags test, btrfs defaults failed: mke2fs -t ext4 -b 4096 -F

[PATCH] btrfs: fix incorrect error return ret being passed to mapping_set_error

2017-05-09 Thread Colin King
From: Colin Ian King The setting of return code ret should be based on the error code passed into function end_extent_writepage and not on ret. Thanks to Liu Bo for spotting this mistake in the original fix I submitted. Detected by CoverityScan, CID#1414312 ("Logically

Re: [PATCH 2/3] btrfs: remove inode argument from repair_io_failure

2017-05-09 Thread Chandan Rajendra
On Friday, May 05, 2017 11:57:14 AM Josef Bacik wrote: > Once we remove the btree_inode we won't have an inode to pass anymore, just > pass > the fs_info directly and the inum since we use that to print out the repair > message. > The changes look fine, Reviewed-by: Chandan Rajendra

Re: [PATCH 3/3] Btrfs: don't pass the inode through clean_io_failure

2017-05-09 Thread Chandan Rajendra
On Friday, May 05, 2017 11:57:15 AM Josef Bacik wrote: > Instead pass around the failure tree and the io tree. > The changes look fine, Reviewed-by: Chandan Rajendra -- chandan

Re: [PATCH 1/3] Btrfs: replace tree->mapping with tree->private_data

2017-05-09 Thread Chandan Rajendra
On Friday, May 05, 2017 11:57:13 AM Josef Bacik wrote: > For extent_io tree's we have carried the address_mapping of the inode around > in > the io tree in order to pull the inode back out for calling into various tree > ops hooks. This works fine when everything that has an extent_io_tree has

Re: [PATCH] btrfs: fix incorrect error return ret being passed to mapping_set_error

2017-05-09 Thread Liu Bo
On Tue, May 09, 2017 at 06:14:01PM +0100, Colin King wrote: > From: Colin Ian King > > The setting of return code ret should be based on the error code > passed into function end_extent_writepage and not on ret. Thanks > to Liu Bo for spotting this mistake in the

Re: [PATCH 6/7] btrfs: Make flush bios explicitely sync

2017-05-09 Thread Liu Bo
On Tue, May 02, 2017 at 05:03:50PM +0200, Jan Kara wrote: > Commit b685d3d65ac7 "block: treat REQ_FUA and REQ_PREFLUSH as > synchronous" removed REQ_SYNC flag from WRITE_{FUA|PREFLUSH|...} > definitions. generic_make_request_checks() however strips REQ_FUA and > REQ_PREFLUSH flags from a bio when

[PATCH 1/6] fstests: add _filter_filefrag

2017-05-09 Thread Liu Bo
_filter_filefrag is a helper function to filter filefrag's output and it can be used to get a file's file offset and physical offset. Signed-off-by: Liu Bo --- common/filter | 19 +++ 1 file changed, 19 insertions(+) diff --git a/common/filter

[PATCH 6/6] fstests: regression test for nocsum buffered read's repair

2017-05-09 Thread Liu Bo
This is to test whether buffered read retry-repair code is able to work in raid1 case as expected. Please note that without checksum, btrfs doesn't know if the data used to repair is correct, so repair is more of a resync which makes sure that both copies have the same content. Commit

[PATCH 2/6] fstests: add _get_current_dmesg

2017-05-09 Thread Liu Bo
_get_current_dmesg can be used to grep customized pattern. Signed-off-by: Liu Bo --- common/rc | 9 +++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/common/rc b/common/rc index 78a2101..111ed69 100644 --- a/common/rc +++ b/common/rc @@ -3215,6

Re: Struggling with file system slowness

2017-05-09 Thread Liu Bo
On Fri, May 05, 2017 at 09:24:32AM -0400, Matt McKinnon wrote: > > Too little information. Is IO happening at the same time? Is > > compression on? Deduplicated? Lots of subvolumes? SSD? What > > kind of workload and file size/distribution profile? > > Only write IO during the load spikes. No

Re: Struggling with file system slowness

2017-05-09 Thread Matt McKinnon
Those snapshots were created using Marc Merlin's script (thanks, Marc). They don't do anything except sit around on the file system for a week or so and then are removed. I'm now doing quarter-hourly snaps instead of nightly since I have nightly backups of the filesystem going off-site. So

Re: [PATCH ping] btrfs: warn about RAID5/6 being experimental at mount time

2017-05-09 Thread Goffredo Baroncelli
Hi, On 2017-05-09 03:49, Adam Borowski wrote: > Write hole is pretty nasty for metadata (likely to cause total filesystem > loss) but when on -draid{5,6} -mraid{1,10} it's nowhere as bad. So for 4.12 > it might be ok to put up big warnings only for metadata. On the other hand, > data loss

Re: [PATCH v3] fstests: regression test for btrfs dio read repair

2017-05-09 Thread Liu Bo
On Mon, May 08, 2017 at 12:50:40PM -0700, Liu Bo wrote: > On Wed, May 03, 2017 at 06:08:36PM +0800, Eryu Guan wrote: > > On Fri, Apr 28, 2017 at 11:25:52AM -0600, Liu Bo wrote: > > > This case tests whether dio read can repair the bad copy if we have > > > a good copy. > > > > > > Commit

[PATCH 4/6] fstests: regression test for btrfs buffered read's repair

2017-05-09 Thread Liu Bo
This case tests whether buffered read can repair the bad copy if we have a good copy. Commit 20a7db8ab3f2 ("btrfs: add dummy callback for readpage_io_failed and drop checks") introduced the regression. The upstream fix is Btrfs: bring back repair during read Signed-off-by: Liu Bo

[PATCH 5/6] fstests: regression test for nocsum dio read's repair

2017-05-09 Thread Liu Bo
Commit 2dabb3248453 ("Btrfs: Direct I/O read: Work on sectorsized blocks") introduced this regression. It'd cause 'Segmentation fault' error. The upstream fix is Btrfs: fix segment fault when doing dio read Signed-off-by: Liu Bo --- tests/btrfs/142 | 137

[PATCH 3/6] fstests: regression test for btrfs dio read repair

2017-05-09 Thread Liu Bo
This case tests whether dio read can repair the bad copy if we have a good copy. Commit 2dabb3248453 ("Btrfs: Direct I/O read: Work on sectorsized blocks") introduced the regression. The upstream fix is Btrfs: fix invalid dereference in btrfs_retry_endio Signed-off-by: Liu Bo

[PATCH 0/6] Regression test for btrfs read repair

2017-05-09 Thread Liu Bo
This set adds four regression test cases for btrfs read repair, and both directIO read and buffered read are tested. Patches 1 and 2 add two helpers to summarize the common code among the following test cases in patch 3-6. Liu Bo (6): fstests: add _filter_filefrag fstests: add

Re: [PATCH] Btrfs: fix the mount failure due to missing devices

2017-05-09 Thread Liu Bo
On Wed, May 03, 2017 at 06:54:23AM +0200, Adam Borowski wrote: > On Tue, May 02, 2017 at 05:48:40PM -0600, Liu Bo wrote: > > Say there is a raid1 btrfs which consists of two disks, after one disk > > becomes unavailable, we can still mount it in degraded mode once, for > > the second mount it

Re: [PATCH] Btrfs: tolerate errors if we have retried successfully

2017-05-09 Thread Liu Bo
On Fri, May 05, 2017 at 06:52:45PM +0200, David Sterba wrote: > On Thu, Apr 13, 2017 at 06:11:56PM -0700, Liu Bo wrote: > > With raid1 profile, dio read isn't tolerating IO errors if read length is > > less than the stripe length (64K). > > Can you please write more details why this is true? Some

Re: [PATCH v4 13/27] lib: add errseq_t type and infrastructure for handling it

2017-05-09 Thread NeilBrown
On Tue, May 09 2017, Jeff Layton wrote: > An errseq_t is a way of recording errors in one place, and allowing any > number of "subscribers" to tell whether an error has been set again > since a previous time. > > It's implemented as an unsigned 32-bit value that is managed with atomic >

[PATCH 0/2] space_info update/creation refactoring

2017-05-09 Thread Nikolay Borisov
Here are two patches which aim to disentangle and make more explicit the situation when a space_info has to be created VS when space_info values are being updated. It survived multiple xfstest runs with additional ASSERTs which I have removed in this posting. One such assert which didn't

[PATCH 1/2] btrfs: Separate space_info create/update

2017-05-09 Thread Nikolay Borisov
Currently the struct space_info creation code is intermixed in the update_space_info function. There are well-defined points at which we actually want to create brand-new space_info structs (e.g. during mount of the filesystem as well as sometimes when adding/initialising new chunks). In such

[PATCH 2/2] btrfs: Refactor update_space_info

2017-05-09 Thread Nikolay Borisov
Following the factoring out of the creation code, update_space_info can only be called for already-existing space_info structs. As such it cannot fail. Remove superfluous error handling and make the function return void. Signed-off-by: Nikolay Borisov ---

Re: btrfsck lowmem mode shows corruptions

2017-05-09 Thread Qu Wenruo
At 05/06/2017 02:15 AM, Kai Krakow wrote: Am Fri, 5 May 2017 08:55:10 +0800 schrieb Qu Wenruo : At 05/05/2017 01:29 AM, Kai Krakow wrote: Hello! Since I saw a few kernel freezes lately (due to experimenting with ck-sources) including some filesystem-related

Re: [PATCH v3] btrfs: relocation: Enhance kernel error output for relocation

2017-05-09 Thread Qu Wenruo
At 05/10/2017 01:29 AM, David Sterba wrote: On Wed, Feb 15, 2017 at 09:39:05AM +0800, Qu Wenruo wrote: When balance(relocation) fails, btrfs-progs will report like: ERROR: error during balancing '/mnt/scratch': Input/output error There may be more info in syslog - try dmesg | tail However

[RFC PATCH v3 5/6] btrfs: qgroup: Introduce extent changeset for qgroup reserve functions

2017-05-09 Thread Qu Wenruo
Introduce a new parameter, struct extent_changeset for btrfs_qgroup_reserved_data() and its callers. Such extent_changeset was used in btrfs_qgroup_reserve_data() to record which range it reserved in current reserve, so it can free it at error path. The reason we need to export it to callers is,

[RFC PATCH v3 0/6] Qgroup fixes, Non-stack version

2017-05-09 Thread Qu Wenruo
The remaining qgroup fixes patches, based on the Chris' for-linus-4.12 branch with commit 9bcaaea7418d09691f1ffab5c49aacafe3eef9d0 as base. Can be fetched from github: https://github.com/adam900710/linux/tree/qgroup_fixes_non_stack Despite the 5th patch, patches are mostly unchanged. Only minor

[RFC PATCH v3 6/6] btrfs: qgroup: Fix qgroup reserved space underflow by only freeing reserved ranges

2017-05-09 Thread Qu Wenruo
[BUG] For the following case, btrfs can underflow qgroup reserved space at error path: (Page size 4K, function name without "btrfs_" prefix) Task A | Task B -- Buffered_write [0, 2K)

[RFC PATCH v3 4/6] btrfs: qgroup: Fix qgroup reserved space underflow caused by buffered write and quota enable

2017-05-09 Thread Qu Wenruo
[BUG] Under the following case, we can underflow qgroup reserved space. Task A|Task B --- Quota disabled | Buffered write | |- btrfs_check_data_free_space() |

[RFC PATCH v3 3/6] btrfs: qgroup: Return actually freed bytes for qgroup release or free data

2017-05-09 Thread Qu Wenruo
btrfs_qgroup_release/free_data() only returns 0 or minus error number(ENOMEM is the only possible error). This is normally good enough, but sometimes we need the accurate byte number it freed/released. Change it to return actually released/freed bytenr number instead of 0 for success. And

[RFC PATCH v3 2/6] btrfs: qgroup: Cleanup btrfs_qgroup_prepare_account_extents function

2017-05-09 Thread Qu Wenruo
Quite a lot of qgroup corruption happens due to wrong timing of calling btrfs_qgroup_prepare_account_extents(). Since the safest timing is calling it just before btrfs_qgroup_account_extents(), there is no need to separate these 2 function. Merging them will make code cleaner and less bug prone.

[RFC PATCH v3 1/6] btrfs: qgroup: Add quick exit for non-fs extents

2017-05-09 Thread Qu Wenruo
For btrfs_qgroup_account_extent(), make it exit quicker for non-fs extents. This will also reduce the noise in trace_btrfs_qgroup_account_extent event. Signed-off-by: Qu Wenruo --- fs/btrfs/qgroup.c | 41 +++-- 1 file changed,

Re: [PATCH] btrfs-progs: btrfs-convert: Add larger device support

2017-05-09 Thread David Sterba
On Tue, May 09, 2017 at 07:46:08PM +0530, Lakshmipathi.G wrote: > created a test script but it fails to detect mke2fs. Test script 009 > also produces: > > # ./convert-tests/009-common-inode-flags/test.sh > [TEST/conv] common inode flags test, btrfs defaults > failed: mke2fs -t ext4 -b 4096

[PATCH v4 15/27] fs: retrofit old error reporting API onto new infrastructure

2017-05-09 Thread Jeff Layton
Now that we have a better way to store and report errors that occur during writeback, we need to convert the existing codebase to use it. We could just adapt all of the filesystem code and related infrastructure to the new API, but that's a lot of churn. When it comes to setting errors in the

[PATCH v4 16/27] fs: adapt sync_file_range to new reporting infrastructure

2017-05-09 Thread Jeff Layton
Since it returns errors in a way similar to fsync, have it use the same method for returning previously-reported writeback errors. Signed-off-by: Jeff Layton --- fs/sync.c | 13 + 1 file changed, 9 insertions(+), 4 deletions(-) diff --git a/fs/sync.c b/fs/sync.c

[PATCH v4 21/27] mm: clean up error handling in write_one_page

2017-05-09 Thread Jeff Layton
Don't try to check PageError since that's potentially racy and not necessarily going to be set after writepage errors out. Instead, sample the mapping error early on, and use that value to tell us whether we got a writeback error since then. Signed-off-by: Jeff Layton ---

Re: [GIT PULL] Btrfs

2017-05-09 Thread Chris Mason
On 05/09/2017 01:56 PM, Chris Mason wrote: > Hi Linus, > > My for-linus-4.12 branch: > > git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs.git > for-linus-4.12 I hit send too soon, sorry. There's a trivial conflict with our WARN_ON fix that went into 4.11. I pushed the

Re: help converting btrfs to new writeback error tracking?

2017-05-09 Thread Jeff Layton
On Mon, 2017-05-08 at 11:39 -0700, Liu Bo wrote: > Hi Jeff, > > On Fri, May 05, 2017 at 04:11:18PM -0400, Jeff Layton wrote: > > On Fri, 2017-05-05 at 12:21 -0700, Liu Bo wrote: > > > Hi Jeff, > > > > > > On Thu, May 04, 2017 at 07:26:17AM -0400, Jeff Layton wrote: > > > > I've been working on

Re: [PATCH 0/6 RFC] utilize bio_clone_fast to clean up

2017-05-09 Thread Liu Bo
On Tue, May 09, 2017 at 03:49:13PM -0700, Liu Bo wrote: > On Fri, May 05, 2017 at 04:24:47PM +0200, David Sterba wrote: > > On Mon, Apr 17, 2017 at 06:16:21PM -0700, Liu Bo wrote: > > > This attempts to use bio_clone_fast() in the places where we clone bio, > > > such as when bio got cloned for

Re: [PATCH] btrfs-progs: btrfs-convert: Add larger device support

2017-05-09 Thread Lakshmipathi.G
created a test script but it fails to detect mke2fs. Test script 009 also produces: # ./convert-tests/009-common-inode-flags/test.sh [TEST/conv] common inode flags test, btrfs defaults failed: mke2fs -t ext4 -b 4096 -F /home/laks/centos/laks/BTRFS/btrfs-progs/tests/test.img Seems like

[PATCH] Btrfs: clear EXTENT_DEFRAG bits in finish_ordered_io

2017-05-09 Thread Liu Bo
Before this, we use 'filled' mode here, ie. if all range has been filled with EXTENT_DEFRAG bits, get to clear it, but if the defrag range joins the adjacent delalloc range, then we'll leave EXTENT_DEFRAG bits until evicting inode. This clears the bit if any was found within the ordered extent.

Re: [PATCH 0/6 RFC] utilize bio_clone_fast to clean up

2017-05-09 Thread Liu Bo
On Fri, May 05, 2017 at 04:24:47PM +0200, David Sterba wrote: > On Mon, Apr 17, 2017 at 06:16:21PM -0700, Liu Bo wrote: > > This attempts to use bio_clone_fast() in the places where we clone bio, > > such as when bio got cloned for multiple disks and when bio got split > > during dio submit. > >

[PATCH 0/8 v7] No wait AIO

2017-05-09 Thread Goldwyn Rodrigues
Formerly known as non-blocking AIO. This series adds nonblocking feature to asynchronous I/O writes. io_submit() can be delayed because of a number of reasons: - Block allocation for files - Data writebacks for direct I/O - Sleeping because of waiting to acquire i_rwsem - Congested block

[PATCH 2/8] nowait aio: Introduce RWF_NOWAIT

2017-05-09 Thread Goldwyn Rodrigues
From: Goldwyn Rodrigues This flag informs kernel to bail out if an AIO request will block for reasons such as file allocations, or a writeback triggered, or would block while allocating requests while performing direct I/O. Unfortunately, aio_flags is not checked for

[PATCH 3/8] nowait aio: return if direct write will trigger writeback

2017-05-09 Thread Goldwyn Rodrigues
From: Goldwyn Rodrigues Find out if the write will trigger a wait due to writeback. If yes, return -EAGAIN. This introduces a new function filemap_range_has_page() which returns true if the file's mapping has a page within the range mentioned. Return -EINVAL for buffered

[PATCH 5/8] nowait aio: return on congested block device

2017-05-09 Thread Goldwyn Rodrigues
From: Goldwyn Rodrigues A new bio operation flag REQ_NOWAIT is introduced to identify bios originating from iocb with IOCB_NOWAIT. This flag indicates to return immediately if a request cannot be made instead of retrying. Stacked devices such as md (the ones with

[PATCH 6/8] nowait aio: ext4

2017-05-09 Thread Goldwyn Rodrigues
From: Goldwyn Rodrigues Return EAGAIN if any of the following checks fail for direct I/O: + i_rwsem is lockable + Writing beyond end of file (will trigger allocation) + Blocks are not allocated at the write location Signed-off-by: Goldwyn Rodrigues

[PATCH 8/8] nowait aio: btrfs

2017-05-09 Thread Goldwyn Rodrigues
From: Goldwyn Rodrigues Return EAGAIN if any of the following checks fail + i_rwsem is not lockable + NODATACOW or PREALLOC is not set + Cannot nocow at the desired location + Writing beyond end of file which is not allocated Signed-off-by: Goldwyn Rodrigues

[PATCH 7/8] nowait aio: xfs

2017-05-09 Thread Goldwyn Rodrigues
From: Goldwyn Rodrigues If IOCB_NOWAIT is set, bail if the i_rwsem is not lockable immediately. If IOMAP_NOWAIT is set, return EAGAIN in xfs_file_iomap_begin if it needs allocation either due to file extension, writing to a hole, or COW or waiting for other DIOs to finish.

[PATCH 4/8] nowait-aio: Introduce IOMAP_NOWAIT

2017-05-09 Thread Goldwyn Rodrigues
From: Goldwyn Rodrigues IOCB_NOWAIT translates to IOMAP_NOWAIT for iomaps. This is used by XFS in the XFS patch. Signed-off-by: Goldwyn Rodrigues Reviewed-by: Christoph Hellwig --- fs/iomap.c| 2 ++ include/linux/iomap.h | 1 + 2

[PATCH 1/8] Use RWF_* flags for AIO operations

2017-05-09 Thread Goldwyn Rodrigues
From: Goldwyn Rodrigues RWF_* flags are used for preadv2/pwritev2 calls. Port to use it for aio operations as well. For this, aio_rw_flags is introduced in struct iocb (using aio_reserved1) which will carry these flags. This is a precursor to the nowait AIO calls. Note, the

Re: [PATCH 6/8] nowait aio: ext4

2017-05-09 Thread Jan Kara
On Tue 09-05-17 07:22:17, Goldwyn Rodrigues wrote: > From: Goldwyn Rodrigues > > Return EAGAIN if any of the following checks fail for direct I/O: > + i_rwsem is lockable > + Writing beyond end of file (will trigger allocation) > + Blocks are not allocated at the write

Re: [PATCH 1/3] Btrfs: replace tree->mapping with tree->private_data

2017-05-09 Thread David Sterba
One comment to the callbacks in extent_io_ops: there are 2 instances of the callback structure and 2 groups of callbacks. The callbacks that exist for both instances are in the "mandatory" group and the rest is in the "optional" group. As you add set_range_writeback and tree_fs_info to both, please move

Re: [PATCH 2/3] btrfs: remove inode argument from repair_io_failure

2017-05-09 Thread David Sterba
On Fri, May 05, 2017 at 11:57:14AM -0400, Josef Bacik wrote: > Once we remove the btree_inode we won't have an inode to pass anymore, just > pass > the fs_info directly and the inum since we use that to print out the repair > message. > > Signed-off-by: Josef Bacik Reviewed-by:

Re: [PATCH 3/3] Btrfs: don't pass the inode through clean_io_failure

2017-05-09 Thread David Sterba
On Fri, May 05, 2017 at 11:57:15AM -0400, Josef Bacik wrote: > Instead pass around the failure tree and the io tree. > > Signed-off-by: Josef Bacik Reviewed-by: David Sterba

[PATCH v4 27/27] mm: clean up comments in me_pagecache_dirty

2017-05-09 Thread Jeff Layton
This no longer applies with the new writeback error tracking and reporting infrastructure. Signed-off-by: Jeff Layton --- mm/memory-failure.c | 35 +-- 1 file changed, 5 insertions(+), 30 deletions(-) diff --git a/mm/memory-failure.c

[PATCH v4 25/27] Documentation: flesh out the section in vfs.txt on storing and reporting writeback errors

2017-05-09 Thread Jeff Layton
I waxed a little loquacious here, but I figured that more detail was better, and writeback error handling is so hard to get right. Cc: Jan Kara Signed-off-by: Jeff Layton --- Documentation/filesystems/vfs.txt | 54 --- 1

[PATCH v4 24/27][RFC] nfs: convert to new errseq_t based error tracking for writeback errors

2017-05-09 Thread Jeff Layton
Drop the ERROR_WRITE flag and convert the error field in the context to an errseq_t. Add a new wb_err_cursor to track the reporting of the errseq_t. In principle, we could use the f_wb_err field in struct file for that, but that's problematic with the stock reporting in call_fsync. Signed-off-by:
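The "error value plus per-reader cursor" idea can be illustrated with a simplified userspace model. This is a sketch under stated assumptions, not the kernel's errseq_t implementation: the errseq_model_* names are invented here, the 12-bit split between error code and counter is an assumption, and the model leaves out any optimisation the real code may use to avoid bumping the counter when nobody has looked at it.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Assumed layout for this model: low bits hold the errno, the rest a counter. */
#define ERRSEQ_MODEL_ERR_BITS 12u
#define ERRSEQ_MODEL_ERR_MASK ((1u << ERRSEQ_MODEL_ERR_BITS) - 1u)

typedef _Atomic uint32_t errseq_model_t;

/* Record an error: store the code and bump the counter in one atomic step. */
void errseq_model_set(errseq_model_t *eseq, int err)
{
    assert(err > 0 && (uint32_t)err <= ERRSEQ_MODEL_ERR_MASK);

    uint32_t old = atomic_load(eseq);
    uint32_t next;
    do {
        uint32_t seq = (old >> ERRSEQ_MODEL_ERR_BITS) + 1u;
        next = (seq << ERRSEQ_MODEL_ERR_BITS) | (uint32_t)err;
    } while (!atomic_compare_exchange_weak(eseq, &old, next));
}

/* Take a cursor: remember the current value so later changes can be seen. */
uint32_t errseq_model_sample(errseq_model_t *eseq)
{
    return atomic_load(eseq);
}

/*
 * Report an error once per cursor: if the value changed since the cursor
 * was taken, advance the cursor and return the negative error code.
 */
int errseq_model_check_and_advance(errseq_model_t *eseq, uint32_t *cursor)
{
    uint32_t cur = atomic_load(eseq);

    if (cur == *cursor)
        return 0;
    *cursor = cur;
    return -(int)(cur & ERRSEQ_MODEL_ERR_MASK);
}
```

Because each reader keeps its own cursor, two readers that sampled before a failure each see the error exactly once, and the same error code recorded a second time is still reported as new.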

[PATCH v4 26/27] mm: flesh out comments over mapping_set_error

2017-05-09 Thread Jeff Layton
Signed-off-by: Jeff Layton --- include/linux/pagemap.h | 14 ++ 1 file changed, 14 insertions(+) diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h index 9593eac41499..9b453eae0aa1 100644 --- a/include/linux/pagemap.h +++ b/include/linux/pagemap.h @@

[PATCH v4 00/27] fs: introduce new writeback error reporting and convert existing API as a wrapper around it

2017-05-09 Thread Jeff Layton
v4:
- several more cleanup patches
- documentation and kerneldoc comment updates
- fix bugs in gfs2 patches
- make sync_file_range use same error reporting semantics
- bugfixes in buffer.c
- convert nfs to new scheme (maybe bogus, can be dropped)

v3:
- wb_err_t -> errseq_t conversion

[PATCH v4 02/27] mm: drop "wait" parameter from write_one_page

2017-05-09 Thread Jeff Layton
The callers all set it to 1. Also, make it clear that this function will not set any sort of AS_* error, and that the caller must do so if necessary. No existing caller uses this on normal files, so none of them need it. Also, add __must_check here since, in general, the callers need to handle

[xfstests PATCH v2 2/3] ext4: allow ext4 to use $SCRATCH_LOGDEV

2017-05-09 Thread Jeff Layton
The writeback error handling test requires that you put the journal on a separate device. This allows us to use dmerror to simulate data writeback failure, without affecting the journal. xfs already has infrastructure for this (à la $SCRATCH_LOGDEV), so wire up the ext4 code so that it can do the

[xfstests PATCH v2 3/3] btrfs: allow it to use $SCRATCH_LOGDEV

2017-05-09 Thread Jeff Layton
With btrfs, we can't really put the log on a separate device. What we can do however is mirror the metadata across two devices and put the data on a single device. When we turn on dmerror then the metadata can fall back to using the other mirror while the data errors out. Signed-off-by: Jeff

Re: [PATCH v4 25/27] Documentation: flesh out the section in vfs.txt on storing and reporting writeback errors

2017-05-09 Thread Jeff Layton
On Tue, 2017-05-09 at 11:49 -0400, Jeff Layton wrote: > I waxed a little loquacious here, but I figured that more detail was > better, and writeback error handling is so hard to get right. > > Cc: Jan Kara > Signed-off-by: Jeff Layton > --- >

Re: [PATCH] Btrfs:Unused comments:"FIXME: check if the IDs really exist" in ioctl.c

2017-05-09 Thread David Sterba
On Mon, May 08, 2017 at 10:10:02AM +0800, Daichou wrote: > These comments were already fixed in 2013. > btrfs_add_qgroup_relation, btrfs_ioctl_qgroup_create, btrfs_limit_qgroup > btrfs_add_qgroup_relation, btrfs_del_qgroup_relation have checked qgroupid > existed. Patch is ok, thanks. (I'll

Re: [PATCH] btrfs: remove redundant assignment and check on variable ret

2017-05-09 Thread David Sterba
On Sat, May 06, 2017 at 11:01:05PM +0100, Colin King wrote: > From: Colin Ian King > > Variable ret is assigned to zero and is always zero throughout the > function. Thus the check for ret being less than zero is always > false and so mapping_set_error always has an

Re: [RFC xfstests PATCH] xfstests: add a writeback error handling test

2017-05-09 Thread Jeff Layton
On Mon, 2017-04-24 at 08:00 -0700, Christoph Hellwig wrote: > On Mon, Apr 24, 2017 at 09:45:51AM -0400, Jeff Layton wrote: > > With the patch series above, ext4 now passes. xfs and btrfs end up in > > r/o mode after the test. xfs returns -EIO at that point though, and > > btrfs returns -EROFS.