Re: [GIT PULL] ext4 bug fixes for 4.6
Linus, Thanks for reverting the broken commit. I'll work on a proper fix for the readdir() issue for the next merge cycle, and make sure it gets proper testing. I'll also take care of packaging up Greg's test program and getting it submitted to xfstests so we have a proper regression test case. Cheers, - Ted
Re: [GIT PULL] ext4 bug fixes for 4.6
Linus, Thanks for reverting the broken commit. I'll work on a proper fix for the readdir() issue for the next merge cycle, and make sure it gets proper testing. I'll also take care of packaging up Greg's test program and getting it submitted to xfstests so we have a proper regression test case. Cheers, - Ted
Re: [GIT PULL] ext4 bug fixes for 4.6
On Sun, Apr 10, 2016 at 1:48 PM, Linus Torvaldswrote: > > The attached patch is actually tested and seems to fix the issue. Christ. Now really attached. I'll just go back to bed, because today is just not working out. Maybe things will be better tomorrow. Or maybe I should start drinking, and at least have an excuse. Linus fs/ext4/namei.c | 2 +- fs/readdir.c| 2 ++ 2 files changed, 3 insertions(+), 1 deletion(-) diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c index db98f89f737f..c07422d254b6 100644 --- a/fs/ext4/namei.c +++ b/fs/ext4/namei.c @@ -1107,7 +1107,7 @@ int ext4_htree_fill_tree(struct file *dir_file, __u32 start_hash, } while (1) { - if (signal_pending(current)) { + if (fatal_signal_pending(current)) { err = -ERESTARTSYS; goto errout; } diff --git a/fs/readdir.c b/fs/readdir.c index e69ef3b79787..2b0bb4fb8990 100644 --- a/fs/readdir.c +++ b/fs/readdir.c @@ -248,6 +248,8 @@ static int filldir64(struct dir_context *ctx, const char *name, int namlen, return -EINVAL; dirent = buf->previous; if (dirent) { + if (signal_pending(current)) + return -EINTR; if (__put_user(offset, >d_off)) goto efault; }
Re: [GIT PULL] ext4 bug fixes for 4.6
On Sun, Apr 10, 2016 at 1:29 PM, Linus Torvaldswrote: > > I think the right choice might be to > > (a) revert that patch (or just change the signal_pending() into > fatal_signal_pending()) > > (b) to get the latency advantage, do something like this: The attached patch is actually tested and seems to fix the issue. I do not have a good way to check the latency of signal delivery and I didn't check if the signal_pending() actually ever triggers, but your test-case that showed the problem before seems to be fine with it. Linus
Re: [GIT PULL] ext4 bug fixes for 4.6
On Sun, Apr 10, 2016 at 1:48 PM, Linus Torvalds wrote: > > The attached patch is actually tested and seems to fix the issue. Christ. Now really attached. I'll just go back to bed, because today is just not working out. Maybe things will be better tomorrow. Or maybe I should start drinking, and at least have an excuse. Linus fs/ext4/namei.c | 2 +- fs/readdir.c| 2 ++ 2 files changed, 3 insertions(+), 1 deletion(-) diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c index db98f89f737f..c07422d254b6 100644 --- a/fs/ext4/namei.c +++ b/fs/ext4/namei.c @@ -1107,7 +1107,7 @@ int ext4_htree_fill_tree(struct file *dir_file, __u32 start_hash, } while (1) { - if (signal_pending(current)) { + if (fatal_signal_pending(current)) { err = -ERESTARTSYS; goto errout; } diff --git a/fs/readdir.c b/fs/readdir.c index e69ef3b79787..2b0bb4fb8990 100644 --- a/fs/readdir.c +++ b/fs/readdir.c @@ -248,6 +248,8 @@ static int filldir64(struct dir_context *ctx, const char *name, int namlen, return -EINVAL; dirent = buf->previous; if (dirent) { + if (signal_pending(current)) + return -EINTR; if (__put_user(offset, >d_off)) goto efault; }
Re: [GIT PULL] ext4 bug fixes for 4.6
On Sun, Apr 10, 2016 at 1:29 PM, Linus Torvalds wrote: > > I think the right choice might be to > > (a) revert that patch (or just change the signal_pending() into > fatal_signal_pending()) > > (b) to get the latency advantage, do something like this: The attached patch is actually tested and seems to fix the issue. I do not have a good way to check the latency of signal delivery and I didn't check if the signal_pending() actually ever triggers, but your test-case that showed the problem before seems to be fine with it. Linus
Re: [GIT PULL] ext4 bug fixes for 4.6
On Sun, Apr 10, 2016 at 1:32 PM, Linus Torvaldswrote: > > if (signal_pending()) > return -EINTR; "signal_pending(current)", dammit. I give up. I'm sure you understood what I meant. I need to go make myself more coffee. Or something. Linus
Re: [GIT PULL] ext4 bug fixes for 4.6
On Sun, Apr 10, 2016 at 1:32 PM, Linus Torvalds wrote: > > if (signal_pending()) > return -EINTR; "signal_pending(current)", dammit. I give up. I'm sure you understood what I meant. I need to go make myself more coffee. Or something. Linus
Re: [GIT PULL] ext4 bug fixes for 4.6
On Sun, Apr 10, 2016 at 1:29 PM, Linus Torvaldswrote: > > (b) to get the latency advantage, do something like this: > > + if (signal_pending) > + return -EINTR; That should be if (signal_pending()) return -EINTR; of course. Missing function call parenthesis.. I'm just testing that people are awake. Linus
Re: [GIT PULL] ext4 bug fixes for 4.6
On Sun, Apr 10, 2016 at 1:29 PM, Linus Torvalds wrote: > > (b) to get the latency advantage, do something like this: > > + if (signal_pending) > + return -EINTR; That should be if (signal_pending()) return -EINTR; of course. Missing function call parenthesis.. I'm just testing that people are awake. Linus
Re: [GIT PULL] ext4 bug fixes for 4.6
On Sat, Apr 9, 2016 at 7:23 PM, Greg Thelenwrote: > > > I've been testing 5b5b7fd185e9 (linus/master) and seeing that > interrupted readdir() now returns duplicate dirents. Yes, your test-program recreates this beautifully here. Sadly not while stracing. Worth adding as a filesystem test to fsstress? > Reverting commit 1028b55bafb7 ("ext4: allow readdir()'s of large empty > directories to be interrupted") avoids duplicates. Ok, so the the "fatal_signal_pending()" case in ext4_readdir() should be ok, since we're killing the thread at that point. So I assume it's the ext4_htree_fill_tree() part of the patch. What I *think* happens is: - We are perfectly happy to take a break at "call_filldir()" time (if that returns an error because the buffer was full or whatever), and at that point, ctx->pos will point to the last entry that we did *not* successfully fill in. - but if we take an exception at ext4_htree_fill_tree() time, we end the loop, and now "ctx->pos" contains the offset of the previous entry - the one we *successfully* already copied - so now, when we exit the getdents, we will save the re-start value at the entry that we already successfully handled. So I think that commit is entirely bogus. I think the right choice might be to (a) revert that patch (or just change the signal_pending() into fatal_signal_pending()) (b) to get the latency advantage, do something like this: diff --git a/fs/readdir.c b/fs/readdir.c index e69ef3b79787..a2244fe9c003 100644 --- a/fs/readdir.c +++ b/fs/readdir.c @@ -248,6 +248,8 @@ static int filldir64(struct dir_context *ctx, const char *name, int namlen, return -EINVAL; dirent = buf->previous; if (dirent) { + if (signal_pending) + return -EINTR; if (__put_user(offset, >d_off)) goto efault; } which returns that error at a point where we can actually take it (note that we only do this if we have already filled in at least one entry: "buf->previous" was non-NULL, so we can return EINTR, because the caller will actually return the length of the result, never the EINTR error we're returning). The above patch is *entirely* untested. It might be pure garbage. But that commit 1028b55bafb7 is _definitely_ broken, and the above _might_ work. Caveat patchor. Linus
Re: [GIT PULL] ext4 bug fixes for 4.6
On Sat, Apr 9, 2016 at 7:23 PM, Greg Thelen wrote: > > > I've been testing 5b5b7fd185e9 (linus/master) and seeing that > interrupted readdir() now returns duplicate dirents. Yes, your test-program recreates this beautifully here. Sadly not while stracing. Worth adding as a filesystem test to fsstress? > Reverting commit 1028b55bafb7 ("ext4: allow readdir()'s of large empty > directories to be interrupted") avoids duplicates. Ok, so the the "fatal_signal_pending()" case in ext4_readdir() should be ok, since we're killing the thread at that point. So I assume it's the ext4_htree_fill_tree() part of the patch. What I *think* happens is: - We are perfectly happy to take a break at "call_filldir()" time (if that returns an error because the buffer was full or whatever), and at that point, ctx->pos will point to the last entry that we did *not* successfully fill in. - but if we take an exception at ext4_htree_fill_tree() time, we end the loop, and now "ctx->pos" contains the offset of the previous entry - the one we *successfully* already copied - so now, when we exit the getdents, we will save the re-start value at the entry that we already successfully handled. So I think that commit is entirely bogus. I think the right choice might be to (a) revert that patch (or just change the signal_pending() into fatal_signal_pending()) (b) to get the latency advantage, do something like this: diff --git a/fs/readdir.c b/fs/readdir.c index e69ef3b79787..a2244fe9c003 100644 --- a/fs/readdir.c +++ b/fs/readdir.c @@ -248,6 +248,8 @@ static int filldir64(struct dir_context *ctx, const char *name, int namlen, return -EINVAL; dirent = buf->previous; if (dirent) { + if (signal_pending) + return -EINTR; if (__put_user(offset, >d_off)) goto efault; } which returns that error at a point where we can actually take it (note that we only do this if we have already filled in at least one entry: "buf->previous" was non-NULL, so we can return EINTR, because the caller will actually return the length of the result, never the EINTR error we're returning). The above patch is *entirely* untested. It might be pure garbage. But that commit 1028b55bafb7 is _definitely_ broken, and the above _might_ work. Caveat patchor. Linus
Re: [GIT PULL] ext4 bug fixes for 4.6
Theodore Ts'o wrote: > The following changes since commit 243d50678583100855862bc084b8b307eea67f68: > > Merge branch 'overlayfs-linus' of > git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs (2016-03-22 > 13:11:15 -0700) > n> are available in the git repository at: > > git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4.git > tags/ext4_for_linus_stable > > for you to fetch changes up to c325a67c72903e1cc30e990a15ce745bda0dbfde: > > ext4: ignore quota mount options if the quota feature is enabled > (2016-04-03 17:03:37 -0400) > > > These changes contains a fix for overlayfs interacting with some > (badly behaved) dentry code in various file systems. These have been > reviewed by Al and the respective file system mtinainers and are going > through the ext4 tree for convenience. > > This also has a few ext4 encryption bug fixes that were discovered in > Android testing (yes, we will need to get these sync'ed up with the > fs/crypto code; I'll take care of that). It also has some bug fixes > and a change to ignore the legacy quota options to allow for xfstests > regression testing of ext4's internal quota feature and to be more > consistent with how xfs handles this case. > > > Dan Carpenter (1): > ext4 crypto: fix some error handling > > Filipe Manana (1): > btrfs: fix crash/invalid memory access on fsync when using overlayfs > > Jan Kara (1): > ext4: retry block allocation for failed DIO and DAX writes > > Miklos Szeredi (4): > fs: add file_dentry() > nfs: use file_dentry() > ext4: use dget_parent() in ext4_file_open() > ext4: use file_dentry() > > Theodore Ts'o (7): > ext4: check if in-inode xattr is corrupted in > ext4_expand_extra_isize_ea() > ext4 crypto: don't let data integrity writebacks fail with ENOMEM > ext4 crypto: use dget_parent() in ext4_d_revalidate() > ext4: allow readdir()'s of large empty directories to be interrupted Ted, I've been testing 5b5b7fd185e9 (linus/master) and seeing that interrupted readdir() now returns duplicate dirents. Reverting commit 1028b55bafb7 ("ext4: allow readdir()'s of large empty directories to be interrupted") avoids duplicates. On 5b5b7fd185e9 a SIGPROF to the test program below occasionally causes "already seen name" error. On older kernels, it runs indefinitely without error. /* mkdir /tmp/foo cd /tmp/foo for i in $(seq 0 599); do touch $i; done /tmp/scanner & kill -PROF %1 */ #include #include #include #include #include #include static void handler(int unused) { } int main() { enum { count = 600 }; char seen[count]; struct dirent *ent; DIR *dir; struct sigaction sa = {0}; sa.sa_handler = handler; if (sigemptyset(_mask)) err(1, "sigemptyset"); sa.sa_flags = SA_RESTART; if (sigaction(SIGPROF, , NULL)) err(1, "sigaction"); for (;;) { memset(seen, 0, sizeof(seen)); dir = opendir("."); if (dir == NULL) err(1, "opendir(.)"); while ((ent = readdir(dir)) != NULL) { int idx; if ((strcmp(ent->d_name, ".") == 0) || (strcmp(ent->d_name, "..") == 0)) { continue; } idx = atoi(ent->d_name); if (idx >= count) { errx(1, "bogus name %s index %d", ent->d_name, idx); } if (seen[idx]) { errx(1, "already seen name %s index %d", ent->d_name, idx); } seen[idx] = 1; } if (closedir(dir)) err(1, "closedir(.)"); } } > ext4: add lockdep annotations for i_data_sem > ext4: avoid calling dquot_get_next_id() if quota is not enabled > ext4: ignore quota mount options if the quota feature is enabled > > fs/btrfs/file.c| 2 +- > fs/dcache.c| 5 - > fs/ext4/crypto.c | 49 + > fs/ext4/dir.c | 5 + > fs/ext4/ext4.h | 29 +++-- > fs/ext4/file.c | 12 > fs/ext4/inode.c| 58 > -- > fs/ext4/move_extent.c | 11 +-- > fs/ext4/namei.c| 5 + > fs/ext4/page-io.c | 14 +- > fs/ext4/readpage.c | 2 +- > fs/ext4/super.c| 61 > +++-- > fs/ext4/xattr.c| 32 > fs/nfs/dir.c | 6 +++---
Re: [GIT PULL] ext4 bug fixes for 4.6
Theodore Ts'o wrote: > The following changes since commit 243d50678583100855862bc084b8b307eea67f68: > > Merge branch 'overlayfs-linus' of > git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs (2016-03-22 > 13:11:15 -0700) > n> are available in the git repository at: > > git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4.git > tags/ext4_for_linus_stable > > for you to fetch changes up to c325a67c72903e1cc30e990a15ce745bda0dbfde: > > ext4: ignore quota mount options if the quota feature is enabled > (2016-04-03 17:03:37 -0400) > > > These changes contains a fix for overlayfs interacting with some > (badly behaved) dentry code in various file systems. These have been > reviewed by Al and the respective file system mtinainers and are going > through the ext4 tree for convenience. > > This also has a few ext4 encryption bug fixes that were discovered in > Android testing (yes, we will need to get these sync'ed up with the > fs/crypto code; I'll take care of that). It also has some bug fixes > and a change to ignore the legacy quota options to allow for xfstests > regression testing of ext4's internal quota feature and to be more > consistent with how xfs handles this case. > > > Dan Carpenter (1): > ext4 crypto: fix some error handling > > Filipe Manana (1): > btrfs: fix crash/invalid memory access on fsync when using overlayfs > > Jan Kara (1): > ext4: retry block allocation for failed DIO and DAX writes > > Miklos Szeredi (4): > fs: add file_dentry() > nfs: use file_dentry() > ext4: use dget_parent() in ext4_file_open() > ext4: use file_dentry() > > Theodore Ts'o (7): > ext4: check if in-inode xattr is corrupted in > ext4_expand_extra_isize_ea() > ext4 crypto: don't let data integrity writebacks fail with ENOMEM > ext4 crypto: use dget_parent() in ext4_d_revalidate() > ext4: allow readdir()'s of large empty directories to be interrupted Ted, I've been testing 5b5b7fd185e9 (linus/master) and seeing that interrupted readdir() now returns duplicate dirents. Reverting commit 1028b55bafb7 ("ext4: allow readdir()'s of large empty directories to be interrupted") avoids duplicates. On 5b5b7fd185e9 a SIGPROF to the test program below occasionally causes "already seen name" error. On older kernels, it runs indefinitely without error. /* mkdir /tmp/foo cd /tmp/foo for i in $(seq 0 599); do touch $i; done /tmp/scanner & kill -PROF %1 */ #include #include #include #include #include #include static void handler(int unused) { } int main() { enum { count = 600 }; char seen[count]; struct dirent *ent; DIR *dir; struct sigaction sa = {0}; sa.sa_handler = handler; if (sigemptyset(_mask)) err(1, "sigemptyset"); sa.sa_flags = SA_RESTART; if (sigaction(SIGPROF, , NULL)) err(1, "sigaction"); for (;;) { memset(seen, 0, sizeof(seen)); dir = opendir("."); if (dir == NULL) err(1, "opendir(.)"); while ((ent = readdir(dir)) != NULL) { int idx; if ((strcmp(ent->d_name, ".") == 0) || (strcmp(ent->d_name, "..") == 0)) { continue; } idx = atoi(ent->d_name); if (idx >= count) { errx(1, "bogus name %s index %d", ent->d_name, idx); } if (seen[idx]) { errx(1, "already seen name %s index %d", ent->d_name, idx); } seen[idx] = 1; } if (closedir(dir)) err(1, "closedir(.)"); } } > ext4: add lockdep annotations for i_data_sem > ext4: avoid calling dquot_get_next_id() if quota is not enabled > ext4: ignore quota mount options if the quota feature is enabled > > fs/btrfs/file.c| 2 +- > fs/dcache.c| 5 - > fs/ext4/crypto.c | 49 + > fs/ext4/dir.c | 5 + > fs/ext4/ext4.h | 29 +++-- > fs/ext4/file.c | 12 > fs/ext4/inode.c| 58 > -- > fs/ext4/move_extent.c | 11 +-- > fs/ext4/namei.c| 5 + > fs/ext4/page-io.c | 14 +- > fs/ext4/readpage.c | 2 +- > fs/ext4/super.c| 61 > +++-- > fs/ext4/xattr.c| 32 > fs/nfs/dir.c | 6 +++---
[GIT PULL] ext4 bug fixes for 4.6
The following changes since commit 243d50678583100855862bc084b8b307eea67f68: Merge branch 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs (2016-03-22 13:11:15 -0700) are available in the git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4.git tags/ext4_for_linus_stable for you to fetch changes up to c325a67c72903e1cc30e990a15ce745bda0dbfde: ext4: ignore quota mount options if the quota feature is enabled (2016-04-03 17:03:37 -0400) These changes contains a fix for overlayfs interacting with some (badly behaved) dentry code in various file systems. These have been reviewed by Al and the respective file system mtinainers and are going through the ext4 tree for convenience. This also has a few ext4 encryption bug fixes that were discovered in Android testing (yes, we will need to get these sync'ed up with the fs/crypto code; I'll take care of that). It also has some bug fixes and a change to ignore the legacy quota options to allow for xfstests regression testing of ext4's internal quota feature and to be more consistent with how xfs handles this case. Dan Carpenter (1): ext4 crypto: fix some error handling Filipe Manana (1): btrfs: fix crash/invalid memory access on fsync when using overlayfs Jan Kara (1): ext4: retry block allocation for failed DIO and DAX writes Miklos Szeredi (4): fs: add file_dentry() nfs: use file_dentry() ext4: use dget_parent() in ext4_file_open() ext4: use file_dentry() Theodore Ts'o (7): ext4: check if in-inode xattr is corrupted in ext4_expand_extra_isize_ea() ext4 crypto: don't let data integrity writebacks fail with ENOMEM ext4 crypto: use dget_parent() in ext4_d_revalidate() ext4: allow readdir()'s of large empty directories to be interrupted ext4: add lockdep annotations for i_data_sem ext4: avoid calling dquot_get_next_id() if quota is not enabled ext4: ignore quota mount options if the quota feature is enabled fs/btrfs/file.c| 2 +- fs/dcache.c| 5 - fs/ext4/crypto.c | 49 + fs/ext4/dir.c | 5 + fs/ext4/ext4.h | 29 +++-- fs/ext4/file.c | 12 fs/ext4/inode.c| 58 -- fs/ext4/move_extent.c | 11 +-- fs/ext4/namei.c| 5 + fs/ext4/page-io.c | 14 +- fs/ext4/readpage.c | 2 +- fs/ext4/super.c| 61 +++-- fs/ext4/xattr.c| 32 fs/nfs/dir.c | 6 +++--- fs/nfs/inode.c | 2 +- fs/nfs/nfs4file.c | 4 ++-- fs/overlayfs/super.c | 33 + include/linux/dcache.h | 10 ++ include/linux/fs.h | 10 ++ 19 files changed, 264 insertions(+), 86 deletions(-)
[GIT PULL] ext4 bug fixes for 4.6
The following changes since commit 243d50678583100855862bc084b8b307eea67f68: Merge branch 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs (2016-03-22 13:11:15 -0700) are available in the git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4.git tags/ext4_for_linus_stable for you to fetch changes up to c325a67c72903e1cc30e990a15ce745bda0dbfde: ext4: ignore quota mount options if the quota feature is enabled (2016-04-03 17:03:37 -0400) These changes contains a fix for overlayfs interacting with some (badly behaved) dentry code in various file systems. These have been reviewed by Al and the respective file system mtinainers and are going through the ext4 tree for convenience. This also has a few ext4 encryption bug fixes that were discovered in Android testing (yes, we will need to get these sync'ed up with the fs/crypto code; I'll take care of that). It also has some bug fixes and a change to ignore the legacy quota options to allow for xfstests regression testing of ext4's internal quota feature and to be more consistent with how xfs handles this case. Dan Carpenter (1): ext4 crypto: fix some error handling Filipe Manana (1): btrfs: fix crash/invalid memory access on fsync when using overlayfs Jan Kara (1): ext4: retry block allocation for failed DIO and DAX writes Miklos Szeredi (4): fs: add file_dentry() nfs: use file_dentry() ext4: use dget_parent() in ext4_file_open() ext4: use file_dentry() Theodore Ts'o (7): ext4: check if in-inode xattr is corrupted in ext4_expand_extra_isize_ea() ext4 crypto: don't let data integrity writebacks fail with ENOMEM ext4 crypto: use dget_parent() in ext4_d_revalidate() ext4: allow readdir()'s of large empty directories to be interrupted ext4: add lockdep annotations for i_data_sem ext4: avoid calling dquot_get_next_id() if quota is not enabled ext4: ignore quota mount options if the quota feature is enabled fs/btrfs/file.c| 2 +- fs/dcache.c| 5 - fs/ext4/crypto.c | 49 + fs/ext4/dir.c | 5 + fs/ext4/ext4.h | 29 +++-- fs/ext4/file.c | 12 fs/ext4/inode.c| 58 -- fs/ext4/move_extent.c | 11 +-- fs/ext4/namei.c| 5 + fs/ext4/page-io.c | 14 +- fs/ext4/readpage.c | 2 +- fs/ext4/super.c| 61 +++-- fs/ext4/xattr.c| 32 fs/nfs/dir.c | 6 +++--- fs/nfs/inode.c | 2 +- fs/nfs/nfs4file.c | 4 ++-- fs/overlayfs/super.c | 33 + include/linux/dcache.h | 10 ++ include/linux/fs.h | 10 ++ 19 files changed, 264 insertions(+), 86 deletions(-)