Re: [git pull] vfs fixes
The pull request you sent on Tue, 22 Sep 2020 22:29:08 +0100:

> git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs.git fixes

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/805c6d3c19210c90c109107d189744e960eae025

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html
[git pull] vfs fixes
No common topic, just several assorted fixes.

The following changes since commit 9123e3a74ec7b934a4a099e98af6a61c2f80bbf5:

  Linux 5.9-rc1 (2020-08-16 13:04:57 -0700)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs.git fixes

for you to fetch changes up to 933a3752babcf6513117d5773d2b70782d6ad149:

  fuse: fix the ->direct_IO() treatment of iov_iter (2020-09-17 17:26:56 -0400)

Al Viro (1):
      fuse: fix the ->direct_IO() treatment of iov_iter

Alexey Dobriyan (1):
      fs: fix cast in fsparam_u32hex() macro

Hans de Goede (1):
      vboxsf: Fix the check for the old binary mount-arguments struct

 fs/fuse/file.c            | 25 ++++++++++++-------------
 fs/vboxsf/super.c         |  2 +-
 include/linux/fs_parser.h |  2 +-
 3 files changed, 14 insertions(+), 15 deletions(-)
Re: [git pull] vfs fixes
On Fri, 20 Apr 2018 20:09:56 +0100 Al Viro wrote: > On Fri, Apr 20, 2018 at 11:29:45AM -0700, Andrew Morton wrote: > > On Fri, 20 Apr 2018 16:58:46 +0100 Al Viro wrote: > > > > > Assorted fixes. Some of that is only a matter with fault injection > > > (broken handling of small allocation failure in various mount-related > > > places), > > > but the last one is a root-triggerable stack overflow, and combined with > > > userns it gets really nasty ;-/ > > > > > > The following changes since commit > > > 60cc43fc888428bb2f18f08997432d426a243338: > > > > > > Linux 4.17-rc1 (2018-04-15 18:24:20 -0700) > > > > > > are available in the git repository at: > > > > > > git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs.git for-linus > > > > > > for you to fetch changes up to 16a34adb9392b2fe4195267475ab5b472e55292c: > > > > > > Don't leak MNT_INTERNAL away from internal mounts (2018-04-19 23:52:15 > > > -0400) > > > ... > > > > > > Tetsuo Handa (1): > > > mm,vmscan: Allow preallocating memory for register_shrinker(). > > > > Confused. You had a bunch of issues with this patch > > (http://lkml.kernel.org/r/20180411005938.gn30...@zeniv.linux.org.uk) > > and Tetsuo sent out a v2 but now we've merged the v1. Deliberate? > > I think by that time I'd applied v1 and fixed those issues myself (same as his > variant, modulo slightly different names). AH. > > Also, it would be nice if you could get the Link: thing working in your > > commits please - this one took a bit of hunting down. > > *blink* > > What Link: thing? You mean lkml.kernel.org references to original postings? yup. It's often fairly useful. > Or is it something else? Never done that, actually; any tips re tools needed > for that? Normally it's save a bunch of postings into a local file in mutt, > scp it over to development box, then git am -s - either all at once, or step > by step with ediiting and git commit --amend in between... I'd have expected git-am to have a way of doing that by now. 
> I realize that message-id can be picked and massaged into Link: ... form,
> of course, but I'd rather not reinvent the wheel if it's already done by
> somebody...

I just do this:

message_url()
{
	pname="$1"
	idline=$(grep -i "^Message-Id:" "$pname" | head -1)
	if [ x"$idline" != "x" ]
	then
		msgid=$(echo "$idline" | sed -e 's/[^<]*<\([^>]*\).*/\1/')
		if [ x"$msgid" != "x" ]
		then
			echo "http://lkml.kernel.org/r/$msgid";
		fi
	fi
}
Re: [git pull] vfs fixes
On Fri, Apr 20, 2018 at 11:29:45AM -0700, Andrew Morton wrote:
> On Fri, 20 Apr 2018 16:58:46 +0100 Al Viro wrote:
>
> > Assorted fixes. Some of that is only a matter with fault injection
> > (broken handling of small allocation failure in various mount-related
> > places),
> > but the last one is a root-triggerable stack overflow, and combined with
> > userns it gets really nasty ;-/
> >
> > The following changes since commit 60cc43fc888428bb2f18f08997432d426a243338:
> >
> >   Linux 4.17-rc1 (2018-04-15 18:24:20 -0700)
> >
> > are available in the git repository at:
> >
> >   git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs.git for-linus
> >
> > for you to fetch changes up to 16a34adb9392b2fe4195267475ab5b472e55292c:
> >
> >   Don't leak MNT_INTERNAL away from internal mounts (2018-04-19 23:52:15 -0400)
> > ...
> >
> > Tetsuo Handa (1):
> >       mm,vmscan: Allow preallocating memory for register_shrinker().
>
> Confused. You had a bunch of issues with this patch
> (http://lkml.kernel.org/r/20180411005938.gn30...@zeniv.linux.org.uk)
> and Tetsuo sent out a v2 but now we've merged the v1. Deliberate?

I think by that time I'd applied v1 and fixed those issues myself (same as his
variant, modulo slightly different names).

> Also, it would be nice if you could get the Link: thing working in your
> commits please - this one took a bit of hunting down.

*blink*

What Link: thing? You mean lkml.kernel.org references to original postings?
Or is it something else? Never done that, actually; any tips re tools needed
for that? Normally it's save a bunch of postings into a local file in mutt,
scp it over to development box, then git am -s - either all at once, or step
by step with editing and git commit --amend in between...

I realize that message-id can be picked and massaged into Link: ... form,
of course, but I'd rather not reinvent the wheel if it's already done by
somebody...
Re: [git pull] vfs fixes
On Fri, 20 Apr 2018 16:58:46 +0100 Al Viro wrote: > Assorted fixes. Some of that is only a matter with fault injection > (broken handling of small allocation failure in various mount-related places), > but the last one is a root-triggerable stack overflow, and combined with > userns it gets really nasty ;-/ > > The following changes since commit 60cc43fc888428bb2f18f08997432d426a243338: > > Linux 4.17-rc1 (2018-04-15 18:24:20 -0700) > > are available in the git repository at: > > git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs.git for-linus > > for you to fetch changes up to 16a34adb9392b2fe4195267475ab5b472e55292c: > > Don't leak MNT_INTERNAL away from internal mounts (2018-04-19 23:52:15 > -0400) > ... > > Tetsuo Handa (1): > mm,vmscan: Allow preallocating memory for register_shrinker(). Confused. You had a bunch of issues with this patch (http://lkml.kernel.org/r/20180411005938.gn30...@zeniv.linux.org.uk) and Tetsuo sent out a v2 but now we've merged the v1. Deliberate? Also, it would be nice if you could get the Link: thing working in your commits please - this one took a bit of hunting down.
[git pull] vfs fixes
Assorted fixes. Some of that is only a matter with fault injection (broken handling of small allocation failure in various mount-related places), but the last one is a root-triggerable stack overflow, and combined with userns it gets really nasty ;-/ The following changes since commit 60cc43fc888428bb2f18f08997432d426a243338: Linux 4.17-rc1 (2018-04-15 18:24:20 -0700) are available in the git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs.git for-linus for you to fetch changes up to 16a34adb9392b2fe4195267475ab5b472e55292c: Don't leak MNT_INTERNAL away from internal mounts (2018-04-19 23:52:15 -0400) Al Viro (5): hypfs_kill_super(): deal with failed allocations jffs2_kill_sb(): deal with failed allocations orangefs_kill_sb(): deal with allocation failures rpc_pipefs: fix double-dput() Don't leak MNT_INTERNAL away from internal mounts Tetsuo Handa (1): mm,vmscan: Allow preallocating memory for register_shrinker(). arch/s390/hypfs/inode.c | 2 +- fs/jffs2/super.c | 2 +- fs/namespace.c | 3 ++- fs/orangefs/super.c | 5 + fs/super.c | 9 - include/linux/shrinker.h | 7 +-- mm/vmscan.c | 21 - net/sunrpc/rpc_pipe.c| 1 + 8 files changed, 39 insertions(+), 11 deletions(-)
[git pull] vfs fixes for -rc7
untangle sys_close() abuses in xt_bpf, deal with register_shrinker() failures in sget() The following changes since commit d7ee946942bdd12394809305e3df05aa4c8b7b8f: VFS: Handle lazytime in do_mount() (2017-12-09 20:16:33 -0500) are available in the git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs.git for-linus for you to fetch changes up to 040ee69226f8a96b7943645d68f41d5d44b5ff7d: fix "netfilter: xt_bpf: Fix XT_BPF_MODE_FD_PINNED mode of 'xt_bpf_info_v1'" (2018-01-05 11:43:39 -0500) Al Viro (2): sget(): handle failures of register_shrinker() fix "netfilter: xt_bpf: Fix XT_BPF_MODE_FD_PINNED mode of 'xt_bpf_info_v1'" Tetsuo Handa (1): mm,vmscan: Make unregister_shrinker() no-op if register_shrinker() failed. fs/super.c | 6 +- include/linux/bpf.h| 10 ++ kernel/bpf/inode.c | 40 +++- kernel/bpf/syscall.c | 2 +- mm/vmscan.c| 3 +++ net/netfilter/xt_bpf.c | 14 ++ 6 files changed, 60 insertions(+), 15 deletions(-)
[git pull] vfs fixes
A couple of fixes; a leak in mntns_install() caught by Andrei (this cycle regression) + d_invalidate() softlockup fix - that had been reported by a bunch of people lately, but the problem is pretty old. The following changes since commit 32c1431eea4881a6b17bd7c639315010aeefa452: Linux 4.12-rc5 (2017-06-11 16:48:20 -0700) are available in the git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs.git for-linus for you to fetch changes up to 4068367c9ca7b515a209f9c0c8741309a1e90495: fs: don't forget to put old mntns in mntns_install (2017-06-15 06:53:05 -0400) Al Viro (1): Hang/soft lockup in d_invalidate with simultaneous calls Andrei Vagin (1): fs: don't forget to put old mntns in mntns_install fs/dcache.c| 10 -- fs/namespace.c | 2 ++ 2 files changed, 6 insertions(+), 6 deletions(-)
Re: [git pull] vfs fixes
On Sat, Apr 15, 2017 at 09:51:40AM -0700, Linus Torvalds wrote: > On Fri, Apr 14, 2017 at 11:41 PM, Vegard Nossum > wrote: > > > > I'm seeing the same memfd_create/name_to_handle_at/path_lookupat > > use-after-free that Dmitry was seeing here: > > Ok, see if that is gone in current git with commit c0eb027e5aef ("vfs: > don't do RCU lookup of empty pathnames") FWIW, I'm finishing testing of fixes for crap found during the discussion of that stuff last week (making sure that mntns_install() can't be abused into setting ->fs->root/->fs->pwd to dentry of NFS referral and its ilk and doing that in a sane way).
Re: [git pull] vfs fixes
On Fri, Apr 14, 2017 at 11:41 PM, Vegard Nossum wrote: > > I'm seeing the same memfd_create/name_to_handle_at/path_lookupat > use-after-free that Dmitry was seeing here: Ok, see if that is gone in current git with commit c0eb027e5aef ("vfs: don't do RCU lookup of empty pathnames") Linus
Re: [git pull] vfs fixes
On 9 April 2017 at 07:40, Al Viro wrote: > > The following changes since commit a71c9a1c779f2499fb2afc0553e543f18aff6edf: > > Linux 4.11-rc5 (2017-04-02 17:23:54 -0700) > > are available in the git repository at: > > git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs.git for-linus > > for you to fetch changes up to a8e28440016bfb23bec266c4c66eacca6ea2d48b: > > Merge branch 'work.statx' into for-next (2017-04-03 01:06:59 -0400) > > > Al Viro (2): > alpha: fix stack smashing in old_adjtimex(2) > Merge branch 'work.statx' into for-next I'm seeing the same memfd_create/name_to_handle_at/path_lookupat use-after-free that Dmitry was seeing here: https://lkml.org/lkml/2017/3/4/118 I haven't tried the patch from that thread yet, but was there any reason for it not to get merged so far? Vegard
Re: [git pull] vfs fixes
On Tue, Apr 11, 2017 at 2:02 PM, Andreas Dilger wrote:
> On Apr 11, 2017, at 12:48 AM, Al Viro wrote:
>>
>> It's more obscure than I would like, and can grow into a bug one day, but...
>> nd_jump_root() can only return non-zero if you have LOOKUP_RCU.
>
> So possibly a comment like the following would be helpful:
>
>	rcu_read_unlock(); /* nd_jump_root() returns if !LOOKUP_RCU */
>
> so that us mere mortals have a chance to understand this in the future?

That might be good, but the reason I noticed this at all was that I
looked at all those "if (LOOKUP_RCU)" in that function, and was
thinking that the whole function would be better being split up into
the RCU case and the non-RCU case.

Because the two cases do have shared code, but the sharing is almost
less than the non-shared stuff.

And when I started doing that split to see what it looked like, that
rcu_read_unlock() really stood out like a sore thumb.

           Linus
Re: [git pull] vfs fixes
On Apr 11, 2017, at 12:48 AM, Al Viro wrote:
> On Mon, Apr 10, 2017 at 11:10:19PM -0700, Linus Torvalds wrote:
>
>> It looks odd because the lock part is
>>
>>	if (flags & LOOKUP_RCU)
>>		rcu_read_lock();
>>
>> ie it's locked conditionally, and the code in between does not seem to
>> return every time LOOKUP_RCU is clear.
>>
>> So mind giving this a look? Is it as obviously buggy as I think it is,
>> or is there something I'm missing?
>
> It's more obscure than I would like, and can grow into a bug one day, but...
> nd_jump_root() can only return non-zero if you have LOOKUP_RCU. So without
> LOOKUP_RCU in flags, this
>	if (flags & LOOKUP_RCU)
>		rcu_read_lock();
>	set_root(nd);
>	if (likely(!nd_jump_root(nd)))
>		return s;
>	nd->root.mnt = NULL;
>	rcu_read_unlock();
> won't get to that rcu_read_unlock() at all - it'll get zero from
> nd_jump_root()
> and proceed to return s;

So possibly a comment like the following would be helpful:

	rcu_read_unlock(); /* nd_jump_root() returns if !LOOKUP_RCU */

so that us mere mortals have a chance to understand this in the future?

Cheers, Andreas
Re: [git pull] vfs fixes
On Mon, Apr 10, 2017 at 11:10:19PM -0700, Linus Torvalds wrote:

> It looks odd because the lock part is
>
>	if (flags & LOOKUP_RCU)
>		rcu_read_lock();
>
> ie it's locked conditionally, and the code in between does not seem to
> return every time LOOKUP_RCU is clear.
>
> So mind giving this a look? Is it as obviously buggy as I think it is,
> or is there something I'm missing?

It's more obscure than I would like, and can grow into a bug one day, but...
nd_jump_root() can only return non-zero if you have LOOKUP_RCU. So without
LOOKUP_RCU in flags, this
	if (flags & LOOKUP_RCU)
		rcu_read_lock();
	set_root(nd);
	if (likely(!nd_jump_root(nd)))
		return s;
	nd->root.mnt = NULL;
	rcu_read_unlock();
won't get to that rcu_read_unlock() at all - it'll get zero from nd_jump_root()
and proceed to return s;
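
[Editor's illustration, not part of the original message.] The argument above can be checked with a small user-space model of the control flow. This is only a sketch under stated assumptions: every toy_* name, the depth counter and the flag value are invented for illustration and are not kernel interfaces. The one property it encodes is the one stated above, namely that the jump helper can return non-zero only when the RCU flag is set.

```c
#include <assert.h>

/* Toy stand-ins; names and values are illustrative, not the kernel's. */
#define TOY_LOOKUP_RCU 0x1

static int toy_rcu_depth;	/* models the RCU read-side nesting count */

static void toy_rcu_read_lock(void)
{
	toy_rcu_depth++;
}

static void toy_rcu_read_unlock(void)
{
	toy_rcu_depth--;
	assert(toy_rcu_depth >= 0);	/* an unbalanced unlock would trip this */
}

/*
 * Assumed property (taken from the explanation above): the jump can
 * fail, i.e. return non-zero, only in RCU mode.  "stale" stands in for
 * whatever makes the RCU-mode jump fail.
 */
static int toy_jump_root(unsigned flags, int stale)
{
	if ((flags & TOY_LOOKUP_RCU) && stale)
		return -1;
	return 0;
}

/*
 * Returns 0 on success and -1 on failure.  On success in RCU mode the
 * read lock is left held, as in the quoted fragment.
 */
static int toy_path_init(unsigned flags, int stale)
{
	if (flags & TOY_LOOKUP_RCU)
		toy_rcu_read_lock();
	if (!toy_jump_root(flags, stale))
		return 0;
	/* Reachable only with TOY_LOOKUP_RCU set, so this unlock is balanced. */
	toy_rcu_read_unlock();
	return -1;
}
```

Calling toy_path_init() with and without TOY_LOOKUP_RCU from a small main() shows the unconditional unlock is reached only on the path that took the lock. The balance depends entirely on the property of toy_jump_root(), which is why a comment at the unlock site was suggested.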
Re: [git pull] vfs fixes
Hey Al,
 mind looking at fs/namei.c line 2186:

	if (likely(!nd_jump_root(nd)))
		return s;
	nd->root.mnt = NULL;
-->	rcu_read_unlock();	<--
	return ERR_PTR(-ECHILD);

because that rcu_read_unlock() looks odd.

It looks odd because the lock part is

	if (flags & LOOKUP_RCU)
		rcu_read_lock();

ie it's locked conditionally, and the code in between does not seem to
return every time LOOKUP_RCU is clear.

So mind giving this a look? Is it as obviously buggy as I think it is,
or is there something I'm missing?

           Linus
[git pull] vfs fixes
The following changes since commit a71c9a1c779f2499fb2afc0553e543f18aff6edf: Linux 4.11-rc5 (2017-04-02 17:23:54 -0700) are available in the git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs.git for-linus for you to fetch changes up to a8e28440016bfb23bec266c4c66eacca6ea2d48b: Merge branch 'work.statx' into for-next (2017-04-03 01:06:59 -0400) Al Viro (2): alpha: fix stack smashing in old_adjtimex(2) Merge branch 'work.statx' into for-next Darrick J. Wong (1): xfs: report crtime and attribute flags to statx David Howells (3): ext4: Add statx support statx: Reserve the top bit of the mask for future struct expansion statx: Include a mask for stx_attributes in struct statx Eric Biggers (4): Documentation/filesystems: fix documentation for ->getattr() statx: reject unknown flags when using NULL path statx: remove incorrect part of vfs_statx() comment statx: optimize copy of struct statx to userspace Documentation/filesystems/Locking | 3 +- Documentation/filesystems/porting | 6 +++ Documentation/filesystems/vfs.txt | 3 +- arch/alpha/kernel/osf_sys.c | 2 +- fs/ext4/ext4.h| 1 + fs/ext4/file.c| 2 +- fs/ext4/inode.c | 41 +-- fs/ext4/namei.c | 2 + fs/ext4/symlink.c | 3 ++ fs/stat.c | 86 ++- fs/xfs/xfs_iops.c | 14 +++ include/linux/stat.h | 1 + include/uapi/linux/stat.h | 5 ++- samples/statx/test-statx.c| 12 -- 14 files changed, 120 insertions(+), 61 deletions(-)
Re: [git pull] vfs fixes
On Mon, 2017-04-03 at 01:00 -0500, Eric W. Biederman wrote: > Al Viro writes: > > > On Sun, Apr 02, 2017 at 05:58:41PM -0700, Linus Torvalds wrote: > > > > > I had to go and double-check that "DCACHE_DIRECTORY_TYPE" is what > > > d_can_lookup() actually checks, so _that_ part is perhaps a bit > > > subtle, and might be worth noting in that comment that you edited. > > > > > > So the real "rule" ends up being that we only ever look up things from > > > dentries of type DCACHE_DIRECTORY_TYPE set, and those had better have > > > DCACHE_RCUACCESS bit set. > > > > > > And the only reason path_init() only checks it for that case is that > > > nd->root and nd->pwd both have to be of type d_can_lookup(). > > > > > > Do we check that when we set it? I hope/assume we do. > > > > For chdir()/chroot()/pivot_root() it's done by LOOKUP_DIRECTORY in lookup > > flags; fchdir() is slightly different - there we check S_ISDIR of inode > > of opened file. Which is almost the same thing, except for > > kinda-sorta directories that have no ->lookup() - we have them for > > NFS referral points. It should be impossible to end up with > > one of those opened - not even with O_PATH; follow_managed() will be called > > and we'll either fail or cross into whatever ends up overmounting them. > > Still, it might be cleaner to turn that check into > > d_can_lookup(f.file->f_path.dentry) > > simply for consistency sake. > > > > The thing I really don't like is mntns_install(). With sufficiently > > nasty nfsroot setup it might be possible to end up with referral point > > as one's root/pwd; getting out of such state would be interesting... > > Smells like that place should be a solitary follow_down(), not a loop > > of follow_down_one(), but I want Eric's opinion on that one; userns stuff > > is weird. > > If I read the conversation correctly the concern is that we might > initialize a pwd or root with something that is almost but not quite a > directory in mntns_install. 
>
> Refereshing my memory. d_automount mounts things and is what
> is used for nfs referrals. d_manage blocks waiting for
> an automounts to complete or expire. follow_down just calls d_manage,
> follow_manage calls both d_manage and d_automount as appropriate.
>
> If the concern is nfs referral points calling follow_down is wrong and
> what is wanted is follow_managed.

The case Al was concerned about is (it sounds like) the one where the root
(or pwd) being followed is an NFS referral (a similar case could be NFS file
system migration if (when?) being used, and that's probably more likely to be
triggered from a file system root than a referral).

I can't see how that could happen for a referral, but if it did the follow
would need to call d_automount().

It's unlikely ->d_manage() would factor into it but it is available for use so
should be part of it.

So follow_down() rather than follow_down_one() sounds like the right thing to
do.

>
> The only thing that follow_down prevents is changing onto directories
> that are only half mounted, and not really directories yet. Which
> is certainly part of the invarient we want to preserve.
>
>
>
> The intent of the logic in mntns_install is to just pick a reasonable
> looking place somewhere in that mount namespace to use as a root
> directory. I arbitrarily picked the top of the mount stack on "/". Which
> is typically used as the root directory. If people really care where
> their root is they save a directory file descriptor off somewhere and
> call chroot. So there is a little wiggle room exactly what the code
> does.
>
> There is a secondary use of mntns_install which is to give you a way to
> access what is under "/" if you are so foolish as to umount "/". I keep
> thinking setns to your own mount namespace would be a handy way to get
> back to the rootfs and to use it for something during system shutdown.
> I don't know if anyone has actually used setns to your own mount
> namespace for that.
> > The worst case I can see from the proposed change is we would > not be able to umount all of the way down to rootfs. That > would be a self inflicted wound so I don't care. > > I can't imagine anyone mounting an automount point deliberately on / > except as way to confuse the vfs. Though I can almost imagine getting > there by accident if an automount expires. > > So yes please let's change the follow_down_one loop to follow_managed > to preserve the invariant that we always have a directory that > supports d_can_lookup to pass to set_fs_pwd and set_fs_root. > > Eric > > > diff --git a/fs/dcache.c b/fs/dcache.c > > index 95d71eda8142..05550139a8a6 100644 > > --- a/fs/dcache.c > > +++ b/fs/dcache.c > > @@ -1757,7 +1757,13 @@ static unsigned d_flags_for_inode(struct inode > > *inode) > > return DCACHE_MISS_TYPE; > > > > if (S_ISDIR(inode->i_mode)) { > > - add_flags = DCACHE_DIRECTORY_TYPE; > > + /* > > + * Any potential starting point of lookup should have
Re: [git pull] vfs fixes
On Mon, 2017-04-03 at 01:00 -0500, Eric W. Biederman wrote: > Al Viro writes: > > > On Sun, Apr 02, 2017 at 05:58:41PM -0700, Linus Torvalds wrote: > > > > > I had to go and double-check that "DCACHE_DIRECTORY_TYPE" is what > > > d_can_lookup() actually checks, so _that_ part is perhaps a bit > > > subtle, and might be worth noting in that comment that you edited. > > > > > > So the real "rule" ends up being that we only ever look up things from > > > dentries of type DCACHE_DIRECTORY_TYPE set, and those had better have > > > DCACHE_RCUACCESS bit set. > > > > > > And the only reason path_init() only checks it for that case is that > > > nd->root and nd->pwd both have to be of type d_can_lookup(). > > > > > > Do we check that when we set it? I hope/assume we do. > > > > For chdir()/chroot()/pivot_root() it's done by LOOKUP_DIRECTORY in lookup > > flags; fchdir() is slightly different - there we check S_ISDIR of inode > > of opened file. Which is almost the same thing, except for > > kinda-sorta directories that have no ->lookup() - we have them for > > NFS referral points. It should be impossible to end up with > > one of those opened - not even with O_PATH; follow_managed() will be called > > and we'll either fail or cross into whatever ends up overmounting them. > > Still, it might be cleaner to turn that check into > > d_can_lookup(f.file->f_path.dentry) > > simply for consistency sake. > > > > The thing I really don't like is mntns_install(). With sufficiently > > nasty nfsroot setup it might be possible to end up with referral point > > as one's root/pwd; getting out of such state would be interesting... > > Smells like that place should be a solitary follow_down(), not a loop > > of follow_down_one(), but I want Eric's opinion on that one; userns stuff > > is weird. > > If I read the conversation correctly the concern is that we might > initialize a pwd or root with something that is almost but not quite a > directory in mntns_install. 
>
> Refereshing my memory. d_automount mounts things and is what
> is used for nfs referrals. d_manage blocks waiting for
> an automounts to complete or expire. follow_down just calls d_manage,
> follow_manage calls both d_manage and d_automount as appropriate.

AFAIK d_manage() is only defined by autofs. It was needed by autofs because
the mount creation and addition is done by another (user space) thread,
whereas "normal" file systems like NFS do all the work in-kernel.

>
> If the concern is nfs referral points calling follow_down is wrong and
> what is wanted is follow_managed.
>
> The only thing that follow_down prevents is changing onto directories
> that are only half mounted, and not really directories yet. Which
> is certainly part of the invarient we want to preserve.
>
>
>
> The intent of the logic in mntns_install is to just pick a reasonable
> looking place somewhere in that mount namespace to use as a root
> directory. I arbitrarily picked the top of the mount stack on "/". Which
> is typically used as the root directory. If people really care where
> their root is they save a directory file descriptor off somewhere and
> call chroot. So there is a little wiggle room exactly what the code
> does.
>
> There is a secondary use of mntns_install which is to give you a way to
> access what is under "/" if you are so foolish as to umount "/". I keep
> thinking setns to your own mount namespace would be a handy way to get
> back to the rootfs and to use it for something during system shutdown.
> I don't know if anyone has actually used setns to your own mount
> namespace for that.
>
> The worst case I can see from the proposed change is we would
> not be able to umount all of the way down to rootfs. That
> would be a self inflicted wound so I don't care.
>
> I can't imagine anyone mounting an automount point deliberately on /
> except as way to confuse the vfs. Though I can almost imagine getting
> there by accident if an automount expires.
> > So yes please let's change the follow_down_one loop to follow_managed > to preserve the invariant that we always have a directory that > supports d_can_lookup to pass to set_fs_pwd and set_fs_root. > > Eric > > > diff --git a/fs/dcache.c b/fs/dcache.c > > index 95d71eda8142..05550139a8a6 100644 > > --- a/fs/dcache.c > > +++ b/fs/dcache.c > > @@ -1757,7 +1757,13 @@ static unsigned d_flags_for_inode(struct inode > > *inode) > > return DCACHE_MISS_TYPE; > > > > if (S_ISDIR(inode->i_mode)) { > > - add_flags = DCACHE_DIRECTORY_TYPE; > > + /* > > + * Any potential starting point of lookup should have > > + * DCACHE_RCUACCESS; currently directory dentries > > + * come from d_alloc() anyway, but it costs us nothing > > + * to enforce it here. > > + */ > > + add_flags = DCACHE_DIRECTORY_TYPE | DCACHE_RCUACCESS; > > if (unlikely(!(inode->i_opflags & IOP_LOOKUP))) { > >
Re: [git pull] vfs fixes
On Mon, Apr 03, 2017 at 01:00:56AM -0500, Eric W. Biederman wrote:

> Refereshing my memory. d_automount mounts things and is what
> is used for nfs referrals. d_manage blocks waiting for
> an automounts to complete or expire. follow_down just calls d_manage,
> follow_manage calls both d_manage and d_automount as appropriate.

D'oh... Right. What's more, by that point getting back to original state on
error is needed.

> If the concern is nfs referral points calling follow_down is wrong and
> what is wanted is follow_managed.

... except that follow_managed() takes nameidata and there is no way in hell
we are letting that animal out of fs/namei.c ever again. Too low-level.

> The intent of the logic in mntns_install is to just pick a reasonable
> looking place somewhere in that mount namespace to use as a root
> directory. I arbitrarily picked the top of the mount stack on "/". Which
> is typically used as the root directory. If people really care where
> their root is they save a directory file descriptor off somewhere and
> call chroot. So there is a little wiggle room exactly what the code
> does.

Hmm... If anything, I'm tempted to add LOOKUP_DOWN that would have
path_lookupat() do
	if (unlikely(flags & LOOKUP_DOWN)) {
		struct path path = nd->path;
		dget(nd->path.dentry);
		err = follow_managed(&path, nd);
		if (unlikely(err < 0)) {
			terminate_walk(nd);
			return err;
		}
		path_to_nameidata(&path, nd);
	}
right after path_init(). Then your stuff would've turned into
	get_mnt_ns(mnt_ns);
	old_mnt_ns = nsproxy->mnt_ns;
	nsproxy->mnt_ns = mnt_ns;

	/* Find the root */
	err = vfs_path_lookup(mnt_ns->root->mnt.mnt_root, &mnt_ns->root->mnt,
				"/", LOOKUP_DOWN, &root);
	if (err) {
		/* revert to old namespace */
		nsproxy->mnt_ns = old_mnt_ns;
		put_mnt_ns(mnt_ns);
		return err;
	}

	/* Update the pwd and root */
	set_fs_pwd(fs, &root);
	set_fs_root(fs, &root);

	path_put(&root);
	put_mnt_ns(old_mnt_ns);
	return 0;

This is absolutely untested, and I won't get around to testing it until
tomorrow, but something along those lines would IMO be better than exposing
a trimmed-down follow_managed(), not to mention struct nameidata itself...
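
[Editor's illustration, not part of the original message.] The reference handling in the sketch above (take a reference on the new namespace, switch, revert and drop that reference if the lookup fails, otherwise drop the old one) can be modelled in user space. Everything here is hypothetical: the toy_* types and functions are invented stand-ins, and lookup_err merely represents the result of the root lookup.

```c
#include <assert.h>

/* Toy stand-ins for the namespace and nsproxy; not the kernel's structures. */
struct toy_ns {
	int refcount;
};

struct toy_nsproxy {
	struct toy_ns *mnt_ns;
};

static void toy_get_ns(struct toy_ns *ns)
{
	ns->refcount++;
}

static void toy_put_ns(struct toy_ns *ns)
{
	assert(ns->refcount > 0);
	ns->refcount--;
}

/*
 * Switch the proxy over to new_ns.  lookup_err stands in for the result
 * of looking up the new root; non-zero means the lookup failed and the
 * switch has to be undone.
 */
static int toy_install(struct toy_nsproxy *proxy, struct toy_ns *new_ns,
		       int lookup_err)
{
	struct toy_ns *old_ns = proxy->mnt_ns;

	toy_get_ns(new_ns);
	proxy->mnt_ns = new_ns;

	if (lookup_err) {
		/* revert to the old namespace, drop the reference just taken */
		proxy->mnt_ns = old_ns;
		toy_put_ns(new_ns);
		return lookup_err;
	}

	/* success: the proxy no longer pins the old namespace */
	toy_put_ns(old_ns);
	return 0;
}
```

In this model both exits leave the counts balanced: a failed install changes nothing, and a successful one moves exactly one reference from the old namespace to the new one.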
Re: [git pull] vfs fixes
Al Viro writes:

> On Sun, Apr 02, 2017 at 05:58:41PM -0700, Linus Torvalds wrote:
>
>> I had to go and double-check that "DCACHE_DIRECTORY_TYPE" is what
>> d_can_lookup() actually checks, so _that_ part is perhaps a bit
>> subtle, and might be worth noting in that comment that you edited.
>>
>> So the real "rule" ends up being that we only ever look up things from
>> dentries of type DCACHE_DIRECTORY_TYPE set, and those had better have
>> DCACHE_RCUACCESS bit set.
>>
>> And the only reason path_init() only checks it for that case is that
>> nd->root and nd->pwd both have to be of type d_can_lookup().
>>
>> Do we check that when we set it? I hope/assume we do.
>
> For chdir()/chroot()/pivot_root() it's done by LOOKUP_DIRECTORY in lookup
> flags; fchdir() is slightly different - there we check S_ISDIR of inode
> of opened file. Which is almost the same thing, except for
> kinda-sorta directories that have no ->lookup() - we have them for
> NFS referral points. It should be impossible to end up with
> one of those opened - not even with O_PATH; follow_managed() will be called
> and we'll either fail or cross into whatever ends up overmounting them.
> Still, it might be cleaner to turn that check into
>	d_can_lookup(f.file->f_path.dentry)
> simply for consistency sake.
>
> The thing I really don't like is mntns_install(). With sufficiently
> nasty nfsroot setup it might be possible to end up with referral point
> as one's root/pwd; getting out of such state would be interesting...
> Smells like that place should be a solitary follow_down(), not a loop
> of follow_down_one(), but I want Eric's opinion on that one; userns stuff
> is weird.

If I read the conversation correctly, the concern is that we might
initialize a pwd or root with something that is almost but not quite a
directory in mntns_install.

Refreshing my memory: d_automount mounts things and is what
is used for nfs referrals. d_manage blocks waiting for
an automount to complete or expire. follow_down just calls d_manage;
follow_managed calls both d_manage and d_automount as appropriate.

If the concern is nfs referral points, calling follow_down is wrong and
what is wanted is follow_managed.

The only thing that follow_down prevents is changing onto directories
that are only half mounted, and not really directories yet. Which
is certainly part of the invariant we want to preserve.

The intent of the logic in mntns_install is to just pick a reasonable
looking place somewhere in that mount namespace to use as a root
directory. I arbitrarily picked the top of the mount stack on "/". Which
is typically used as the root directory. If people really care where
their root is they save a directory file descriptor off somewhere and
call chroot. So there is a little wiggle room in exactly what the code
does.

There is a secondary use of mntns_install which is to give you a way to
access what is under "/" if you are so foolish as to umount "/". I keep
thinking setns to your own mount namespace would be a handy way to get
back to the rootfs and to use it for something during system shutdown.
I don't know if anyone has actually used setns to your own mount
namespace for that.

The worst case I can see from the proposed change is we would
not be able to umount all of the way down to rootfs. That
would be a self-inflicted wound so I don't care.

I can't imagine anyone mounting an automount point deliberately on /
except as a way to confuse the vfs. Though I can almost imagine getting
there by accident if an automount expires.

So yes please let's change the follow_down_one loop to follow_managed
to preserve the invariant that we always have a directory that
supports d_can_lookup to pass to set_fs_pwd and set_fs_root.

Eric

> diff --git a/fs/dcache.c b/fs/dcache.c
> index 95d71eda8142..05550139a8a6 100644
> --- a/fs/dcache.c
> +++ b/fs/dcache.c
> @@ -1757,7 +1757,13 @@ static unsigned d_flags_for_inode(struct inode *inode)
>  		return DCACHE_MISS_TYPE;
>
>  	if (S_ISDIR(inode->i_mode)) {
> -		add_flags = DCACHE_DIRECTORY_TYPE;
> +		/*
> +		 * Any potential starting point of lookup should have
> +		 * DCACHE_RCUACCESS; currently directory dentries
> +		 * come from d_alloc() anyway, but it costs us nothing
> +		 * to enforce it here.
> +		 */
> +		add_flags = DCACHE_DIRECTORY_TYPE | DCACHE_RCUACCESS;
>  		if (unlikely(!(inode->i_opflags & IOP_LOOKUP))) {
>  			if (unlikely(!inode->i_op->lookup))
>  				add_flags = DCACHE_AUTODIR_TYPE;
> diff --git a/fs/namei.c b/fs/namei.c
> index d41fab78798b..19dcf62133cc 100644
> --- a/fs/namei.c
> +++ b/fs/namei.c
> @@ -2145,6 +2145,9 @@ static const char *path_init(struct nameidata *nd, unsigned flags)
>  	int retval = 0;
>  	const char *s = nd->name->name;
>
> +	if (!*s)
> +		flags &= ~LOOKUP_RCU;
> +
>  	nd->last_typ
Re: [git pull] vfs fixes
On Sun, Apr 02, 2017 at 05:58:41PM -0700, Linus Torvalds wrote: > I had to go and double-check that "DCACHE_DIRECTORY_TYPE" is what > d_can_lookup() actually checks, so _that_ part is perhaps a bit > subtle, and might be worth noting in that comment that you edited. > > So the real "rule" ends up being that we only ever look up things from > dentries of type DCACHE_DIRECTORY_TYPE set, and those had better have > DCACHE_RCUACCESS bit set. > > And the only reason path_init() only checks it for that case is that > nd->root and nd->pwd both have to be of type d_can_lookup(). > > Do we check that when we set it? I hope/assume we do. For chdir()/chroot()/pivot_root() it's done by LOOKUP_DIRECTORY in lookup flags; fchdir() is slightly different - there we check S_ISDIR of inode of opened file. Which is almost the same thing, except for kinda-sorta directories that have no ->lookup() - we have them for NFS referral points. It should be impossible to end up with one of those opened - not even with O_PATH; follow_managed() will be called and we'll either fail or cross into whatever ends up overmounting them. Still, it might be cleaner to turn that check into d_can_lookup(f.file->f_path.dentry) simply for consistency sake. The thing I really don't like is mntns_install(). With sufficiently nasty nfsroot setup it might be possible to end up with referral point as one's root/pwd; getting out of such state would be interesting... Smells like that place should be a solitary follow_down(), not a loop of follow_down_one(), but I want Eric's opinion on that one; userns stuff is weird. 
diff --git a/fs/dcache.c b/fs/dcache.c index 95d71eda8142..05550139a8a6 100644 --- a/fs/dcache.c +++ b/fs/dcache.c @@ -1757,7 +1757,13 @@ static unsigned d_flags_for_inode(struct inode *inode) return DCACHE_MISS_TYPE; if (S_ISDIR(inode->i_mode)) { - add_flags = DCACHE_DIRECTORY_TYPE; + /* +* Any potential starting point of lookup should have +* DCACHE_RCUACCESS; currently directory dentries +* come from d_alloc() anyway, but it costs us nothing +* to enforce it here. +*/ + add_flags = DCACHE_DIRECTORY_TYPE | DCACHE_RCUACCESS; if (unlikely(!(inode->i_opflags & IOP_LOOKUP))) { if (unlikely(!inode->i_op->lookup)) add_flags = DCACHE_AUTODIR_TYPE; diff --git a/fs/namei.c b/fs/namei.c index d41fab78798b..19dcf62133cc 100644 --- a/fs/namei.c +++ b/fs/namei.c @@ -2145,6 +2145,9 @@ static const char *path_init(struct nameidata *nd, unsigned flags) int retval = 0; const char *s = nd->name->name; + if (!*s) + flags &= ~LOOKUP_RCU; + nd->last_type = LAST_ROOT; /* if there are only slashes... */ nd->flags = flags | LOOKUP_JUMPED | LOOKUP_PARENT; nd->depth = 0; diff --git a/fs/namespace.c b/fs/namespace.c index cc1375eff88c..31ec9a79d2d4 100644 --- a/fs/namespace.c +++ b/fs/namespace.c @@ -3467,6 +3467,7 @@ static int mntns_install(struct nsproxy *nsproxy, struct ns_common *ns) struct fs_struct *fs = current->fs; struct mnt_namespace *mnt_ns = to_mnt_ns(ns); struct path root; + int err; if (!ns_capable(mnt_ns->user_ns, CAP_SYS_ADMIN) || !ns_capable(current_user_ns(), CAP_SYS_CHROOT) || @@ -3484,15 +3485,14 @@ static int mntns_install(struct nsproxy *nsproxy, struct ns_common *ns) root.mnt= &mnt_ns->root->mnt; root.dentry = mnt_ns->root->mnt.mnt_root; path_get(&root); - while(d_mountpoint(root.dentry) && follow_down_one(&root)) - ; - - /* Update the pwd and root */ - set_fs_pwd(fs, &root); - set_fs_root(fs, &root); - + err = follow_down(&root); + if (likely(!err)) { + /* Update the pwd and root */ + set_fs_pwd(fs, &root); + set_fs_root(fs, &root); + } path_put(&root); - return 0; 
+ return err; } static struct user_namespace *mntns_owner(struct ns_common *ns) diff --git a/fs/open.c b/fs/open.c index 949cef29c3bb..217b5db588c8 100644 --- a/fs/open.c +++ b/fs/open.c @@ -459,20 +459,17 @@ SYSCALL_DEFINE1(chdir, const char __user *, filename) SYSCALL_DEFINE1(fchdir, unsigned int, fd) { struct fd f = fdget_raw(fd); - struct inode *inode; int error = -EBADF; error = -EBADF; if (!f.file) goto out; - inode = file_inode(f.file); - error = -ENOTDIR; - if (!S_ISDIR(inode->i_mode)) + if (!d_can_lookup(f.file->f_path.dentry)) goto out_putf; - error = inode_permission(inode, MAY_EXEC | MAY_CHDIR); + error = inode_permission(file_inode(f.file), MAY_EXEC | MAY_CHDIR); if (!error) set_fs_pwd(current->fs, &f.file->f_path); out_putf:
Re: [git pull] vfs fixes
On Sun, Apr 2, 2017 at 5:43 PM, Al Viro wrote: > > Do you have any objections against the following (still untested) variant? > I don't see any point in checking for flags & LOOKUP_RCU in case of !*s - > flags is in register at that point, so... Looks sane to me. I had to go and double-check that "DCACHE_DIRECTORY_TYPE" is what d_can_lookup() actually checks, so _that_ part is perhaps a bit subtle, and might be worth noting in that comment that you edited. So the real "rule" ends up being that we only ever look up things from dentries of type DCACHE_DIRECTORY_TYPE set, and those had better have DCACHE_RCUACCESS bit set. And the only reason path_init() only checks it for that case is that nd->root and nd->pwd both have to be of type d_can_lookup(). Do we check that when we set it? I hope/assume we do. Linus
Re: [git pull] vfs fixes
On Mon, Apr 03, 2017 at 01:30:45AM +0100, Al Viro wrote: > Currently true and almost certainly will remain so. Point taken, what you > are suggesting is better. Actually, the invariant to watch for is > "no d_can_lookup() withtout DCACHE_RCUACCESS" and that we can trivially > enforce by one-liner change in d_flags_for_inode() - > s/DCACHE_DIRECTORY_TYPE/& | DCACHE_RCUACCESS/ > > OK... Do you have any objections against the following (still untested) variant? I don't see any point in checking for flags & LOOKUP_RCU in case of !*s - flags is in register at that point, so... diff --git a/fs/dcache.c b/fs/dcache.c index 95d71eda8142..05550139a8a6 100644 --- a/fs/dcache.c +++ b/fs/dcache.c @@ -1757,7 +1757,13 @@ static unsigned d_flags_for_inode(struct inode *inode) return DCACHE_MISS_TYPE; if (S_ISDIR(inode->i_mode)) { - add_flags = DCACHE_DIRECTORY_TYPE; + /* +* Any potential starting point of lookup should have +* DCACHE_RCUACCESS; currently directory dentries +* come from d_alloc() anyway, but it costs us nothing +* to enforce it here. +*/ + add_flags = DCACHE_DIRECTORY_TYPE | DCACHE_RCUACCESS; if (unlikely(!(inode->i_opflags & IOP_LOOKUP))) { if (unlikely(!inode->i_op->lookup)) add_flags = DCACHE_AUTODIR_TYPE; diff --git a/fs/namei.c b/fs/namei.c index d41fab78798b..19dcf62133cc 100644 --- a/fs/namei.c +++ b/fs/namei.c @@ -2145,6 +2145,9 @@ static const char *path_init(struct nameidata *nd, unsigned flags) int retval = 0; const char *s = nd->name->name; + if (!*s) + flags &= ~LOOKUP_RCU; + nd->last_type = LAST_ROOT; /* if there are only slashes... */ nd->flags = flags | LOOKUP_JUMPED | LOOKUP_PARENT; nd->depth = 0;
Re: [git pull] vfs fixes
On Sun, Apr 02, 2017 at 05:10:08PM -0700, Linus Torvalds wrote: > On Sun, Apr 2, 2017 at 4:59 PM, Linus Torvalds > wrote: > > On Sun, Apr 2, 2017 at 10:01 AM, Al Viro wrote: > >> statx followup fixes, fix for a nasty corner case in path_init() > >> leaving path.dentry in RCU mode pointing to a dentry without > >> DCACHE_RCUACCESS > >> and a fix for stack-smashing on alpha. > > > > These were apparently committed minutes before sending me the pull request. > > > > Why? What kind of testing did this all get? > > Also, that RCU fix really stinks. It makes no sense. Any valid base > for actual pathname lookup will already have the bit set, so the only > issue is when people play games and use a non-path file descriptor > without a pathname. Right? > > And that case shouldn't actually use RCU lookup AT ALL! > > By definition such a case will just be immediately unlazied anyway in > complete_walk(), so doing an RCU lookup on an empty path only adds > overhead, and there is no actual point in doing an RCU walk on an > empty pathname. > > So I get the feeling that the proper fix would be just something like > > /* Don't RCU-lookup empty pathnames */ > if ((flags & LOOKUP_RCU) && !*s) > flags &= ~LOOKUP_RCU; > > at the very top of path_init(), which > > (a) makes the code more efficient, since we don't do those > unnecessary games with sequence numbers and RCU locking only to un-RCU > it immediately > > and > > (b) obviates the need for those DCACHE_RCUACCESS games entirely, > since anything that can actually be used as a base for pathname lookup > will already have that bit set, afaik. Currently true and almost certainly will remain so. Point taken, what you are suggesting is better. Actually, the invariant to watch for is "no d_can_lookup() without DCACHE_RCUACCESS" and that we can trivially enforce by one-liner change in d_flags_for_inode() - s/DCACHE_DIRECTORY_TYPE/& | DCACHE_RCUACCESS/ OK...
Re: [git pull] vfs fixes
On Sun, Apr 02, 2017 at 04:59:10PM -0700, Linus Torvalds wrote: > On Sun, Apr 2, 2017 at 10:01 AM, Al Viro wrote: > > statx followup fixes, fix for a nasty corner case in path_init() > > leaving path.dentry in RCU mode pointing to a dentry without > > DCACHE_RCUACCESS > > and a fix for stack-smashing on alpha. > > These were apparently committed minutes before sending me the pull request. > > Why? The first two used to sit on a branch that predated statx merge into mainline, the rest obviously depends on statx. > What kind of testing did this all get? xfstests/ltp for the last part (-rc4 based branch with statx), same + Dmitry's reproducer for path_init() crap, trivial test for alpha old_adjtimex(). My apologies for rebase; if anything, it would've been cleaner to cherry-pick the old pair of fixes on top of statx series.
Re: [git pull] vfs fixes
On Sun, Apr 2, 2017 at 4:59 PM, Linus Torvalds wrote: > On Sun, Apr 2, 2017 at 10:01 AM, Al Viro wrote: >> statx followup fixes, fix for a nasty corner case in path_init() >> leaving path.dentry in RCU mode pointing to a dentry without DCACHE_RCUACCESS >> and a fix for stack-smashing on alpha. > > These were apparently committed minutes before sending me the pull request. > > Why? What kind of testing did this all get? Also, that RCU fix really stinks. It makes no sense. Any valid base for actual pathname lookup will already have the bit set, so the only issue is when people play games and use a non-path file descriptor without a pathname. Right? And that case shouldn't actually use RCU lookup AT ALL! By definition such a case will just be immediately unlazied anyway in complete_walk(), so doing an RCU lookup on an empty path only adds overhead, and there is no actual point in doing an RCU walk on an empty pathname. So I get the feeling that the proper fix would be just something like /* Don't RCU-lookup empty pathnames */ if ((flags & LOOKUP_RCU) && !*s) flags &= ~LOOKUP_RCU; at the very top of path_init(), which (a) makes the code more efficient, since we don't do those unnecessary games with sequence numbers and RCU locking only to un-RCU it immediately and (b) obviates the need for those DCACHE_RCUACCESS games entirely, since anything that can actually be used as a base for pathname lookup will already have that bit set, afaik. So honestly, that fix to add DCACHE_RCUACCESS just looks like the wrong thing to me. Linus
Re: [git pull] vfs fixes
On Sun, Apr 2, 2017 at 10:01 AM, Al Viro wrote: > statx followup fixes, fix for a nasty corner case in path_init() > leaving path.dentry in RCU mode pointing to a dentry without DCACHE_RCUACCESS > and a fix for stack-smashing on alpha. These were apparently committed minutes before sending me the pull request. Why? What kind of testing did this all get? Linus
[git pull] vfs fixes
statx followup fixes, fix for a nasty corner case in path_init() leaving path.dentry in RCU mode pointing to a dentry without DCACHE_RCUACCESS and a fix for stack-smashing on alpha. The following changes since commit c02ed2e75ef4c74e41e421acb4ef1494671585e8: Linux 4.11-rc4 (2017-03-26 14:15:16 -0700) are available in the git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs.git for-linus for you to fetch changes up to 0b1367305acc8e8706e13aa4c6766a747cf489d4: statx: Include a mask for stx_attributes in struct statx (2017-04-02 12:27:59 -0400) Al Viro (2): path_init(): make sure that nd->path.dentry freeing will be RCU-delayed alpha: fix stack smashing in old_adjtimex(2) Darrick J. Wong (1): xfs: report crtime and attribute flags to statx David Howells (3): ext4: Add statx support statx: Reserve the top bit of the mask for future struct expansion statx: Include a mask for stx_attributes in struct statx Eric Biggers (4): Documentation/filesystems: fix documentation for ->getattr() statx: reject unknown flags when using NULL path statx: remove incorrect part of vfs_statx() comment statx: optimize copy of struct statx to userspace Documentation/filesystems/Locking | 3 +- Documentation/filesystems/porting | 6 +++ Documentation/filesystems/vfs.txt | 3 +- arch/alpha/kernel/osf_sys.c | 2 +- fs/ext4/ext4.h| 1 + fs/ext4/file.c| 2 +- fs/ext4/inode.c | 41 +-- fs/ext4/namei.c | 2 + fs/ext4/symlink.c | 3 ++ fs/namei.c| 11 +++-- fs/stat.c | 86 ++- fs/xfs/xfs_iops.c | 14 +++ include/linux/stat.h | 1 + include/uapi/linux/stat.h | 5 ++- samples/statx/test-statx.c| 12 -- 15 files changed, 128 insertions(+), 64 deletions(-)
[git pull] vfs fixes
A couple more of d_walk()/d_subdirs reordering fixes (stable fodder; ought to solve that crap for good) and a fix for a brown paperbag bug in d_alloc_parallel() (this cycle). The following changes since commit 1607f09c226d1378439c411b020042750338: coredump: fix dumping through pipes (2016-06-07 22:07:09 -0400) are available in the git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs.git for-linus for you to fetch changes up to e7d6ef9790bc281f5c29d0132b68031248523fe8: fix idiotic braino in d_alloc_parallel() (2016-06-20 10:07:42 -0400) Al Viro (3): much milder d_walk() race autofs races fix idiotic braino in d_alloc_parallel() fs/autofs4/autofs_i.h | 8 -- fs/autofs4/expire.c| 27 ++ fs/autofs4/root.c | 2 +- fs/dcache.c| 75 ++ fs/internal.h | 1 + fs/libfs.c | 4 +-- include/linux/dcache.h | 1 + 7 files changed, 82 insertions(+), 36 deletions(-)
[git pull] vfs fixes
Fixes for crap of assorted ages: EOPENSTALE one is 4.2+, autofs one is 4.6, d_walk - 3.2+, atomic_open() and coredump ones are this window regressions. The following changes since commit 1a695a905c18548062509178b98bc91e67510864: Linux 4.7-rc1 (2016-05-29 09:29:24 -0700) are available in the git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs.git for-linus for you to fetch changes up to 1607f09c226d1378439c411b020042750338: coredump: fix dumping through pipes (2016-06-07 22:07:09 -0400) Al Viro (4): fix EOPENSTALE bug in do_last() autofs braino fix for do_last() fix d_walk()/non-delayed __d_free() race fix a regression in atomic_open() Mateusz Guzik (1): coredump: fix dumping through pipes arch/powerpc/platforms/cell/spufs/coredump.c | 2 +- fs/binfmt_elf.c | 2 +- fs/binfmt_elf_fdpic.c| 2 +- fs/coredump.c| 4 +- fs/dcache.c | 4 +- fs/namei.c | 61 +++- include/linux/binfmts.h | 1 + 7 files changed, 24 insertions(+), 52 deletions(-)
[git pull] vfs fixes
work.lookups followups - update docs, restore killability of the places that used to take ->i_mutex killably now that we have down_write_killable() merged. Additionally, it turns out that I missed a prereq for security_d_instantiate() stuff - ->getxattr() wasn't the only thing that could be called before dentry is attached to inode; with smack we needed the same treatment applied to ->setxattr() as well. The following changes since commit 0985b65d3ba2c09f10a594b73df45c1f7f68d317: Merge branch 'work.iov_iter' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs (2016-05-25 15:59:09 -0700) are available in the git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs.git for-linus for you to fetch changes up to 3767e255b390d72f9a33c08d9e86c5f21f25860f: switch ->setxattr() to passing dentry and inode separately (2016-05-27 20:09:16 -0400) Al Viro (5): update D/f/directory-locking add down_write_killable_nested() restore killability of old mutex_lock_killable(&inode->i_mutex) users switch xattr_handler->set() to passing dentry and inode separately switch ->setxattr() to passing dentry and inode separately Documentation/filesystems/directory-locking| 32 ++ Documentation/filesystems/porting | 7 + .../staging/lustre/lustre/llite/llite_internal.h | 4 +-- drivers/staging/lustre/lustre/llite/xattr.c| 6 ++-- fs/9p/acl.c| 6 ++-- fs/9p/xattr.c | 5 ++-- fs/bad_inode.c | 4 +-- fs/btrfs/ioctl.c | 18 +--- fs/btrfs/xattr.c | 12 fs/ceph/xattr.c| 7 +++-- fs/cifs/xattr.c| 9 +++--- fs/ecryptfs/crypto.c | 9 +++--- fs/ecryptfs/ecryptfs_kernel.h | 4 +-- fs/ecryptfs/inode.c| 7 +++-- fs/ecryptfs/mmap.c | 3 +- fs/ext2/xattr_security.c | 7 +++-- fs/ext2/xattr_trusted.c| 7 +++-- fs/ext2/xattr_user.c | 9 +++--- fs/ext4/xattr_security.c | 7 +++-- fs/ext4/xattr_trusted.c| 7 +++-- fs/ext4/xattr_user.c | 9 +++--- fs/f2fs/xattr.c| 12 fs/fuse/dir.c | 6 ++-- fs/gfs2/xattr.c| 6 ++-- fs/hfs/attr.c | 6 ++-- fs/hfs/hfs_fs.h| 2 +- fs/hfsplus/xattr.c | 12 fs/hfsplus/xattr.h | 2 +- 
fs/hfsplus/xattr_security.c| 7 +++-- fs/hfsplus/xattr_trusted.c | 7 +++-- fs/hfsplus/xattr_user.c| 7 +++-- fs/jffs2/security.c| 7 +++-- fs/jffs2/xattr_trusted.c | 7 +++-- fs/jffs2/xattr_user.c | 7 +++-- fs/jfs/xattr.c | 14 -- fs/kernfs/inode.c | 11 fs/kernfs/kernfs-internal.h| 3 +- fs/libfs.c | 5 ++-- fs/nfs/nfs4proc.c | 19 ++--- fs/ocfs2/xattr.c | 23 +--- fs/orangefs/xattr.c| 10 --- fs/overlayfs/inode.c | 5 ++-- fs/overlayfs/overlayfs.h | 5 ++-- fs/overlayfs/readdir.c | 4 +-- fs/posix_acl.c | 6 ++-- fs/readdir.c | 12 fs/reiserfs/xattr_security.c | 9 +++--- fs/reiserfs/xattr_trusted.c| 9 +++--- fs/reiserfs/xattr_user.c | 9 +++--- fs/ubifs/xattr.c | 7 ++--- fs/xattr.c | 10 --- fs/xfs/xfs_xattr.c | 9 +++--- include/linux/fs.h | 3 +- include/linux/rwsem.h | 2 ++ include/linux/xattr.h | 7 +++-- kernel/locking/rwsem.c | 16 +++ mm/shmem.c | 7 +++-- security/smack/smack_lsm.c | 2 +- 58 files changed, 265 insertions(+), 209
[git pull] vfs fixes
The following changes since commit 7ae8fd0351f912b075149a1e03a017be8b903b9a: fs/pnode.c: treat zero mnt_group_id-s as unequal (2016-02-20 00:15:52 -0500) are available in the git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs.git for-linus for you to fetch changes up to 5129fa482b16615fd4464d2f5d23acb1b7056c66: do_last(): ELOOP failure exit should be done after leaving RCU mode (2016-02-27 19:37:37 -0500) Al Viro (4): do_last(): don't let a bogus return value from ->open() et.al. to confuse us namei: ->d_inode of a pinned dentry is stable only for positives should_follow_link(): validate ->d_seq after having decided to follow do_last(): ELOOP failure exit should be done after leaving RCU mode Christoph Hellwig (1): fs: return -EOPNOTSUPP if clone is not supported Mikulas Patocka (1): hpfs: don't truncate the file when delete fails fs/hpfs/namei.c | 31 +++ fs/namei.c | 22 +++--- fs/read_write.c | 6 -- 3 files changed, 22 insertions(+), 37 deletions(-)
[git pull] vfs fixes for -rc3
A couple of fixes for bugs caught while digging in fs/namei.c. The first one is this cycle regression, the second is 3.11 and later. Please, pull from the usual place - git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs.git for-linus Shortlog: Al Viro (2): namei: d_is_negative() should be checked before ->d_seq validation path_openat(): fix double fput() Diffstat: fs/namei.c | 22 +++--- 1 file changed, 15 insertions(+), 7 deletions(-) -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [git pull] vfs fixes
On 26 September 2014 23:28, Joachim Eastwood wrote: > On 26 September 2014 22:58, Al Viro wrote: >> On Fri, Sep 26, 2014 at 10:46:14PM +0200, Joachim Eastwood wrote: >>> On 14 September 2014 21:47, Al Viro wrote: >>> > double iput() on failure exit in lustre, racy removal of spliced dentries >>> > from ->s_anon in __d_materialise_dentry() plus a bunch of assorted RCU >>> > pathwalk >>> > fixes. Please, pull from the usual place - >>> > git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs.git for-linus >>> > >>> > Shortlog: >>> > Al Viro (5): >>> > [fix] lustre: d_make_root() does iput() on dentry allocation failure >>> > move the call of __d_drop(anon) into __d_materialise_unique(dentry, >>> > anon) >>> > fix bogus read_seqretry() checks introduced in b37199e >>> > don't bugger nd->seq on set_root_rcu() from follow_dotdot_rcu() >>> > be careful with nd->inode in path_init() and follow_dotdot_rcu() >>> >>> Hi, >>> >>> Commit 4023bfc9f351a7994 "be careful with nd->inode in path_init() and >>> follow_dotdot_rcu(), seem to hang my ARM no-MMU platform when mounting >>> the ramdisk. >>> >>> 3.17-rc4 - works >>> 3.17-rc5 - works with 4023bfc9f351a7994 reverted. >>> >>> Boot log with from rc5: >>> [ 5.81] TCP: cubic registered >>> [ 5.82] NET: Registered protocol family 17 >>> [ 5.86] lpc2k-rtc 40046000.rtc: hctosys: unable to read the hardware >>> clock >>> [ 5.91] mmc_host mmc0: Bus speed (slot 0) = 1200Hz (slot req >>> 2500Hz, actual 1200HZ div = 0) >>> [ 5.93] mmc0: new SDHC card at address 0007 >>> [ 5.95] mmcblk0: mmc0:0007 SD08G 7.42 GiB >>> [ 6.15] clk: Not disabling unused clocks >>> [ 81.24] random: nonblocking pool is initialized >>> >>> And there it just hangs it seems. 
>>> >>> >>> With patch reverted >>> [ 5.81] TCP: cubic registered >>> [ 5.82] NET: Registered protocol family 17 >>> [ 5.85] lpc2k-rtc 40046000.rtc: hctosys: unable to read the hardware >>> clock >>> [ 6.10] clk: Not disabling unused clocks >>> [ 6.11] RAMDISK: gzip image found at block 0 >>> [ 9.59] VFS: Mounted root (ext2 filesystem) readonly on device 1:0. >>> [ 9.60] devtmpfs: mounted >>> [ 9.61] Freeing unused kernel memory: 68K (281e5000 - 281f6000) >>> >>> And then user space starts. >> >> *blink* What happens to mmc-related messages on successful boot? And what >> in that commit could've possibly led to those not being produced? > > Now I am puzzled too. I can no longer reproduce that hang. > > I am guessing it was probably related to the mmc card being flaky or > something random like that. > > Sorry for noise! Just to confirm what was happening: I managed to trigger it again and removing the mmc driver makes the problem go away. So this is an mmc issue and _not_ vfs. But what is strange is that reverting 4023bfc9f351a also makes it boot again... Which is what fooled me. Here are the boot logs just in case you want to have a look. Boot log with hang: http://slexy.org/raw/s2eRhKN3aG Boot ok (4023bfc9f351a reverted): http://slexy.org/raw/s21NgzZkCA mmc is behaving differently. Sorry for prematurely blaming vfs. regards, Joachim Eastwood
Re: [git pull] vfs fixes
On 26 September 2014 22:58, Al Viro wrote: > On Fri, Sep 26, 2014 at 10:46:14PM +0200, Joachim Eastwood wrote: >> On 14 September 2014 21:47, Al Viro wrote: >> > double iput() on failure exit in lustre, racy removal of spliced dentries >> > from ->s_anon in __d_materialise_dentry() plus a bunch of assorted RCU >> > pathwalk >> > fixes. Please, pull from the usual place - >> > git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs.git for-linus >> > >> > Shortlog: >> > Al Viro (5): >> > [fix] lustre: d_make_root() does iput() on dentry allocation failure >> > move the call of __d_drop(anon) into __d_materialise_unique(dentry, >> > anon) >> > fix bogus read_seqretry() checks introduced in b37199e >> > don't bugger nd->seq on set_root_rcu() from follow_dotdot_rcu() >> > be careful with nd->inode in path_init() and follow_dotdot_rcu() >> >> Hi, >> >> Commit 4023bfc9f351a7994 "be careful with nd->inode in path_init() and >> follow_dotdot_rcu(), seem to hang my ARM no-MMU platform when mounting >> the ramdisk. >> >> 3.17-rc4 - works >> 3.17-rc5 - works with 4023bfc9f351a7994 reverted. >> >> Boot log with from rc5: >> [ 5.81] TCP: cubic registered >> [ 5.82] NET: Registered protocol family 17 >> [ 5.86] lpc2k-rtc 40046000.rtc: hctosys: unable to read the hardware >> clock >> [ 5.91] mmc_host mmc0: Bus speed (slot 0) = 1200Hz (slot req >> 2500Hz, actual 1200HZ div = 0) >> [ 5.93] mmc0: new SDHC card at address 0007 >> [ 5.95] mmcblk0: mmc0:0007 SD08G 7.42 GiB >> [ 6.15] clk: Not disabling unused clocks >> [ 81.24] random: nonblocking pool is initialized >> >> And there it just hangs it seems. >> >> >> With patch reverted >> [ 5.81] TCP: cubic registered >> [ 5.82] NET: Registered protocol family 17 >> [ 5.85] lpc2k-rtc 40046000.rtc: hctosys: unable to read the hardware >> clock >> [ 6.10] clk: Not disabling unused clocks >> [ 6.11] RAMDISK: gzip image found at block 0 >> [ 9.59] VFS: Mounted root (ext2 filesystem) readonly on device 1:0. 
>> [ 9.60] devtmpfs: mounted >> [ 9.61] Freeing unused kernel memory: 68K (281e5000 - 281f6000) >> >> And then user space starts. > > *blink* What happens to mmc-related messages on successful boot? And what > in that commit could've possibly led to those not being produced? Now I am puzzled too. I can no longer reproduce that hang. I am guessing it was probably related to the mmc card being flaky or something random like that. Sorry for noise! regards, Joachim Eastwood
Re: [git pull] vfs fixes
On Fri, Sep 26, 2014 at 10:46:14PM +0200, Joachim Eastwood wrote: > On 14 September 2014 21:47, Al Viro wrote: > > double iput() on failure exit in lustre, racy removal of spliced dentries > > from ->s_anon in __d_materialise_dentry() plus a bunch of assorted RCU > > pathwalk > > fixes. Please, pull from the usual place - > > git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs.git for-linus > > > > Shortlog: > > Al Viro (5): > > [fix] lustre: d_make_root() does iput() on dentry allocation failure > > move the call of __d_drop(anon) into __d_materialise_unique(dentry, > > anon) > > fix bogus read_seqretry() checks introduced in b37199e > > don't bugger nd->seq on set_root_rcu() from follow_dotdot_rcu() > > be careful with nd->inode in path_init() and follow_dotdot_rcu() > > Hi, > > Commit 4023bfc9f351a7994 "be careful with nd->inode in path_init() and > follow_dotdot_rcu(), seem to hang my ARM no-MMU platform when mounting > the ramdisk. > > 3.17-rc4 - works > 3.17-rc5 - works with 4023bfc9f351a7994 reverted. > > Boot log with from rc5: > [ 5.81] TCP: cubic registered > [ 5.82] NET: Registered protocol family 17 > [ 5.86] lpc2k-rtc 40046000.rtc: hctosys: unable to read the hardware clock > [ 5.91] mmc_host mmc0: Bus speed (slot 0) = 1200Hz (slot req > 2500Hz, actual 1200HZ div = 0) > [ 5.93] mmc0: new SDHC card at address 0007 > [ 5.95] mmcblk0: mmc0:0007 SD08G 7.42 GiB > [ 6.15] clk: Not disabling unused clocks > [ 81.24] random: nonblocking pool is initialized > > And there it just hangs it seems. > > > With patch reverted > [ 5.81] TCP: cubic registered > [ 5.82] NET: Registered protocol family 17 > [ 5.85] lpc2k-rtc 40046000.rtc: hctosys: unable to read the hardware clock > [ 6.10] clk: Not disabling unused clocks > [ 6.11] RAMDISK: gzip image found at block 0 > [ 9.59] VFS: Mounted root (ext2 filesystem) readonly on device 1:0. > [ 9.60] devtmpfs: mounted > [ 9.61] Freeing unused kernel memory: 68K (281e5000 - 281f6000) > > And then user space starts. 
*blink* What happens to mmc-related messages on successful boot? And what in that commit could've possibly led to those not being produced? Another question: what happens if you revert half of that commit? There are two separate parts, easy to isolate - one in follow_dotdot_rcu(), another in path_init(). They deal with similar problems, but they are independent from each other; which one is triggering that crap? Al, really mystified...
Re: [git pull] vfs fixes
On 14 September 2014 21:47, Al Viro wrote: > double iput() on failure exit in lustre, racy removal of spliced dentries > from ->s_anon in __d_materialise_dentry() plus a bunch of assorted RCU > pathwalk > fixes. Please, pull from the usual place - > git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs.git for-linus > > Shortlog: > Al Viro (5): > [fix] lustre: d_make_root() does iput() on dentry allocation failure > move the call of __d_drop(anon) into __d_materialise_unique(dentry, > anon) > fix bogus read_seqretry() checks introduced in b37199e > don't bugger nd->seq on set_root_rcu() from follow_dotdot_rcu() > be careful with nd->inode in path_init() and follow_dotdot_rcu() Hi, Commit 4023bfc9f351a7994 "be careful with nd->inode in path_init() and follow_dotdot_rcu()" seems to hang my ARM no-MMU platform when mounting the ramdisk. 3.17-rc4 - works 3.17-rc5 - works with 4023bfc9f351a7994 reverted. Boot log from rc5: [ 5.81] TCP: cubic registered [ 5.82] NET: Registered protocol family 17 [ 5.86] lpc2k-rtc 40046000.rtc: hctosys: unable to read the hardware clock [ 5.91] mmc_host mmc0: Bus speed (slot 0) = 1200Hz (slot req 2500Hz, actual 1200HZ div = 0) [ 5.93] mmc0: new SDHC card at address 0007 [ 5.95] mmcblk0: mmc0:0007 SD08G 7.42 GiB [ 6.15] clk: Not disabling unused clocks [ 81.24] random: nonblocking pool is initialized And there it just hangs, it seems. With patch reverted: [ 5.81] TCP: cubic registered [ 5.82] NET: Registered protocol family 17 [ 5.85] lpc2k-rtc 40046000.rtc: hctosys: unable to read the hardware clock [ 6.10] clk: Not disabling unused clocks [ 6.11] RAMDISK: gzip image found at block 0 [ 9.59] VFS: Mounted root (ext2 filesystem) readonly on device 1:0. [ 9.60] devtmpfs: mounted [ 9.61] Freeing unused kernel memory: 68K (281e5000 - 281f6000) And then user space starts. This is an ARM Cortex-M4 no-MMU platform that is not yet upstream. 
regards, Joachim Eastwood
[git pull] vfs fixes
double iput() on failure exit in lustre, racy removal of spliced dentries from ->s_anon in __d_materialise_dentry() plus a bunch of assorted RCU pathwalk fixes. Please, pull from the usual place -
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs.git for-linus

Shortlog:
Al Viro (5):
      [fix] lustre: d_make_root() does iput() on dentry allocation failure
      move the call of __d_drop(anon) into __d_materialise_unique(dentry, anon)
      fix bogus read_seqretry() checks introduced in b37199e
      don't bugger nd->seq on set_root_rcu() from follow_dotdot_rcu()
      be careful with nd->inode in path_init() and follow_dotdot_rcu()

Diffstat:
 drivers/staging/lustre/lustre/llite/llite_lib.c |  2 +-
 fs/dcache.c                                     |  8 +++-
 fs/namei.c                                      | 52 ++-
 3 files changed, 39 insertions(+), 23 deletions(-)
[GIT PULL] VFS fixes for 3.16
Hi Linus,

here's a userspace triggered vfsmount leak fix, and a compile warning fix for 3.16:

The following changes since commit 82e13c71bc655b6dc7110da4e164079dadb44892:

  Merge branch 'for-3.16' of git://linux-nfs.org/~bfields/linux (2014-07-23 17:55:11 -0700)

are available in the git repository at:

  git://git.infradead.org/users/hch/vfs.git vfs-for-3.16

for you to fetch changes up to 295dc39d941dc2ae53d5c170365af4c9d5c16212:

  fs: umount on symlink leaks mnt count (2014-07-24 06:18:12 -0400)

Boaz Harrosh (1):
      direct-io: fix uninitialized warning in do_direct_IO()

Vasily Averin (1):
      fs: umount on symlink leaks mnt count

 fs/direct-io.c | 14 +++---
 fs/namei.c     |  3 ++-
 2 files changed, 9 insertions(+), 8 deletions(-)
Re: [git pull] vfs fixes
On Sun, Mar 30, 2014 at 03:39:33PM -0700, Linus Torvalds wrote:
> On Sun, Mar 30, 2014 at 1:55 PM, Al Viro wrote:
> >
> > Commit ID of that branch should be 7fc5aaa083922420e2dec5d985420cb5f959b1ce;
> > diffs are the same as what sat there since last Sunday, commit messages
> > got updated a bit.
>
> Ugh. You have apparently rebased the parts that I already pulled too,
> which is annoying and causes duplicate commits.

Sorry, I'd only noticed your pull after sending that... Missing commits on top of linux.git#master are in vfs.git#for-linus-2, so if you prefer to grab them from there, it's
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs.git for-linus-2

Shortlog:
Al Viro (4):
      resizable namespace.c hashes
      keep shadowed vfsmounts together
      don't bother with propagate_mnt() unless the target is shared
      switch mnt_hash to hlist

Diffstat:
 fs/mount.h     |   4 +--
 fs/namespace.c | 177 +++--
 fs/pnode.c     |  26 +--
 fs/pnode.h     |   4 +--
 4 files changed, 134 insertions(+), 77 deletions(-)
Re: [git pull] vfs fixes
On Sun, Mar 30, 2014 at 1:55 PM, Al Viro wrote:
>
> Commit ID of that branch should be 7fc5aaa083922420e2dec5d985420cb5f959b1ce;
> diffs are the same as what sat there since last Sunday, commit messages
> got updated a bit.

Ugh. You have apparently rebased the parts that I already pulled too, which is annoying and causes duplicate commits.

Linus
Re: [git pull] vfs fixes
On Sun, Mar 30, 2014 at 09:33:07PM +0100, Al Viro wrote:
> Al, finally back to life and digging himself from under ~5e3 mails in l-k
> mailbox...

Several fixes; a couple of fdget_pos()-related ones from Eric Biggers, prepend_name() fix (bug was reported by many people, Imre and Jan being the ones I'd been able to find addresses of), plus a series of fixes for bug found by Max Kellermann - missing checks for false negatives from __lookup_mnt() in fs/namei.c + switching mnt_hash to hlist, turning the races between __lookup_mnt() and hash modifications into false negatives from __lookup_mnt() (instead of hangs).

Please, pull from the usual place -
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs.git for-linus

Commit ID of that branch should be 7fc5aaa083922420e2dec5d985420cb5f959b1ce; diffs are the same as what sat there since last Sunday, commit messages got updated a bit. Should've propagated by now...

Shortlog:
Al Viro (6):
      make prepend_name() work correctly when called with negative *buflen
      rcuwalk: recheck mount_lock after mountpoint crossing attempts
      resizable namespace.c hashes
      keep shadowed vfsmounts together
      don't bother with propagate_mnt() unless the target is shared
      switch mnt_hash to hlist

Eric Biggers (2):
      vfs: atomic f_pos access in llseek()
      vfs: Don't let __fdget_pos() get FMODE_PATH files

Diffstat:
 fs/dcache.c     |   4 +-
 fs/file.c       |  19 ++
 fs/mount.h      |   4 +-
 fs/namei.c      |  29 -
 fs/namespace.c  | 177 ---
 fs/pnode.c      |  26
 fs/pnode.h      |   4 +-
 fs/read_write.c |   4 +-
 8 files changed, 155 insertions(+), 112 deletions(-)
Re: [git pull] vfs fixes
On Wed, Mar 26, 2014 at 01:55:51PM -0700, Linus Torvalds wrote:
> On Wed, Mar 26, 2014 at 9:36 AM, Sedat Dilek wrote:
> >
> > Looking at [1] you did not pull-in the new changes.
> > Are you waiting for a new pull-request?
>
> Yeah, with the top commit updated, I'd like to make sure I get the right pull.

Will do in a few. Sorry about delay (and even more about the reasons for it - the lack of sleep from redeye flight was to be expected, of course, but catching stomach flu was not ;-/)

Al, finally back to life and digging himself from under ~5e3 mails in l-k mailbox...
Re: [git pull] vfs fixes
On Wed, Mar 26, 2014 at 9:55 PM, Linus Torvalds wrote:
> On Wed, Mar 26, 2014 at 9:36 AM, Sedat Dilek wrote:
>>
>> Looking at [1] you did not pull-in the new changes.
>> Are you waiting for a new pull-request?
>
> Yeah, with the top commit updated, I'd like to make sure I get the right pull.
>

AFAICS, it was a typo... s/hlist_del_rcu()/hlist_del_init_rcu()

- Sedat -
Re: [git pull] vfs fixes
On Wed, Mar 26, 2014 at 9:36 AM, Sedat Dilek wrote:
>
> Looking at [1] you did not pull-in the new changes.
> Are you waiting for a new pull-request?

Yeah, with the top commit updated, I'd like to make sure I get the right pull.

Linus
Re: [git pull] vfs fixes
On Tue, Mar 25, 2014 at 1:46 AM, Linus Torvalds wrote:
> Just to clarify: the current vfs tree from Al works for you, no new issues?
>
> I was delaying the release first a day, and now I think I'll just do
> an rc8 after all (and do the final 3.14 next weekend), but I'd like to
> be sure what the status of Al's tree is.
>
> Al, can you send a new pull request with fixed information (assuming I
> understood correctly and everything in vfs-land works for Sedat).
>

Looking at [1] you did not pull-in the new changes.
Are you waiting for a new pull-request?

- Sedat -

[1] http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=3e79d97828841305e3369ad1e07cfed5bf5989ef

>> P.S.: [Off-topic] With vanilla -66 kernel I have some OOPS in the
>> oom[1..5] tests of LTP. Whom to address? mm-folks?
>
> Yes, please send to -mm (and cc lkml and me).
>
> Linus
Re: [git pull] vfs fixes
On Mon, Mar 24, 2014 at 11:58 PM, Imre Deak wrote:
>> [...]
>> Shortlog:
>> Al Viro (6):
>>       make prepend_name() work correctly when called with negative *buflen
>
> A proper attribution for the above fix would have been nice. Tracking
> down the bug was the main thing after all:
>
> https://lkml.org/lkml/2014/3/12/620
>

I cannot follow that link right now. seems to have some problems.

- Sedat -
Re: [git pull] vfs fixes
Just to clarify: the current vfs tree from Al works for you, no new issues?

I was delaying the release first a day, and now I think I'll just do an rc8 after all (and do the final 3.14 next weekend), but I'd like to be sure what the status of Al's tree is.

Al, can you send a new pull request with fixed information (assuming I understood correctly and everything in vfs-land works for Sedat).

> P.S.: [Off-topic] With vanilla -66 kernel I have some OOPS in the
> oom[1..5] tests of LTP. Whom to address? mm-folks?

Yes, please send to -mm (and cc lkml and me).

Linus
Re: [git pull] vfs fixes
> [...]
> Shortlog:
> Al Viro (6):
>       make prepend_name() work correctly when called with negative *buflen

A proper attribution for the above fix would have been nice. Tracking down the bug was the main thing after all:

https://lkml.org/lkml/2014/3/12/620

--Imre
Re: [git pull] vfs fixes
On Sun, Mar 23, 2014 at 6:01 PM, Linus Torvalds wrote:
> On Sun, Mar 23, 2014 at 9:45 AM, Al Viro wrote:
>>
>> It's easier to skip checking the overflow on prepend() of "\0" in the
>> beginning of the whole thing and just let the next operation to fail.
>> That's where the corner case comes from.
>
> Ok, I'll buy the first four commits. Let's wait for Sedat's report on the rest.
>
> Linus
>

Hi,

I have tested w/ Al's patch from [1]. AFAICS these changes are now updated in the vfs.git#for-linus branch as commit 9d25fe7e232b ("switch mnt_hash to hlist").

I had a 2nd problem w/ the old code: installing a Debian package also resulted in a freeze. This now works.

The LTP test is still running... and my -5 kernel with Al's fix seems to hit the 700+ KiB barrier in my logs.

$ LC_ALL=C ls -l /opt/ltp/runltp-log_3.14.0-rc7-[0-9]*-iniza-small.txt
-rw-r--r-- 1 root root  733184 Mar 23 09:51 /opt/ltp/runltp-log_3.14.0-rc7-4-iniza-small.txt
-rw-r--r-- 1 root root  822533 Mar 24 09:51 /opt/ltp/runltp-log_3.14.0-rc7-5-iniza-small.txt
-rw-r--r-- 1 root root 1142248 Mar 23 11:44 /opt/ltp/runltp-log_3.14.0-rc7-66-iniza-small.txt

[ NOTE ]
-66: vanilla Linus-tree
-4: -66 plus old vfs-fixes
-5: -66 plus old vfs-fixes plus Al's fix from [1]

If you like, give the usual credits.

Regards,
- Sedat -

P.S.: [Off-topic] With vanilla -66 kernel I have some OOPS in the oom[1..5] tests of LTP. Whom to address? mm-folks?

[1] http://marc.info/?l=linux-kernel&m=139559379906251&w=2
Re: [git pull] vfs fixes
On Sun, Mar 23, 2014 at 9:45 AM, Al Viro wrote:
>
> It's easier to skip checking the overflow on prepend() of "\0" in the
> beginning of the whole thing and just let the next operation to fail.
> That's where the corner case comes from.

Ok, I'll buy the first four commits. Let's wait for Sedat's report on the rest.

Linus
Re: [git pull] vfs fixes
On Sun, Mar 23, 2014 at 03:35:05PM +0000, Al Viro wrote:
> On Sun, Mar 23, 2014 at 11:57:16AM +0100, Sedat Dilek wrote:
>
> > Your branch on top of Linux v3.14-rc7-66-g774868c7094d is freezing my
> > Ubuntu/precise AMD64 (WUBI) system when running LTP.
>
> Which test?

Argh... I see what's going on; could you check if the following fixes all the problems you are seeing?

diff --git a/fs/namespace.c b/fs/namespace.c
index d6e6daf..2ffc5a2 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -746,7 +746,7 @@ static void detach_mnt(struct mount *mnt, struct path *old_path)
 	mnt->mnt_parent = mnt;
 	mnt->mnt_mountpoint = mnt->mnt.mnt_root;
 	list_del_init(&mnt->mnt_child);
-	hlist_del_rcu(&mnt->mnt_hash);
+	hlist_del_init_rcu(&mnt->mnt_hash);
 	put_mountpoint(mnt->mnt_mp);
 	mnt->mnt_mp = NULL;
 }
@@ -1236,7 +1236,7 @@ void umount_tree(struct mount *mnt, int how)
 	struct mount *last = NULL;
 	for (p = mnt; p; p = next_mnt(p, mnt)) {
-		hlist_del_rcu(&p->mnt_hash);
+		hlist_del_init_rcu(&p->mnt_hash);
 		hlist_add_head(&p->mnt_hash, &tmp_list);
 	}
diff --git a/fs/pnode.c b/fs/pnode.c
index 72aa2b7..88396df 100644
--- a/fs/pnode.c
+++ b/fs/pnode.c
@@ -341,7 +341,7 @@ static void __propagate_umount(struct mount *mnt)
 		 * other children
 		 */
 		if (child && list_empty(&child->mnt_mounts)) {
-			hlist_del_rcu(&child->mnt_hash);
+			hlist_del_init_rcu(&child->mnt_hash);
 			hlist_add_before_rcu(&child->mnt_hash, &mnt->mnt_hash);
 		}
 	}
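For readers unfamiliar with the two helpers swapped in the patch above, here is a minimal userspace sketch of the difference, with RCU left out entirely. All names (`hnode`, `hhead`, `sketch_*`, `SKETCH_POISON`) are made up for illustration and only loosely modelled on the kernel's hlist; it is not the kernel code, and it does not claim to reproduce the exact failure Sedat hit. The point it shows: after a plain delete the node still looks hashed, while after a delete-and-init it reports unhashed and a second delete is a harmless no-op.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's hlist_node / hlist_head. */
struct hnode { struct hnode *next, **pprev; };
struct hhead { struct hnode *first; };

/* Stand-in for the kernel's poison value; never dereferenced here. */
#define SKETCH_POISON ((struct hnode **)0x200)

/* A node counts as "unhashed" only when pprev is NULL. */
static bool sketch_unhashed(const struct hnode *n)
{
	return n->pprev == NULL;
}

static void sketch_add_head(struct hnode *n, struct hhead *h)
{
	n->next = h->first;
	if (h->first)
		h->first->pprev = &n->next;
	h->first = n;
	n->pprev = &h->first;
}

/* Unlink n from its chain; n's own fields are left untouched. */
static void sketch_unlink(struct hnode *n)
{
	*n->pprev = n->next;
	if (n->next)
		n->next->pprev = n->pprev;
}

/* Plain delete: pprev is poisoned, so the node still looks hashed. */
static void sketch_del(struct hnode *n)
{
	sketch_unlink(n);
	n->pprev = SKETCH_POISON;
}

/* Delete-and-init: safe to call twice, node reports unhashed afterwards. */
static void sketch_del_init(struct hnode *n)
{
	if (!sketch_unhashed(n)) {
		sketch_unlink(n);
		n->pprev = NULL;
	}
}
```

Any later code that asks "is this node still on a chain?" gets a misleading answer after `sketch_del()`, and a second `sketch_del()` on the same node would write through the poison pointer; `sketch_del_init()` avoids both.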
Re: [git pull] vfs fixes
On Sun, Mar 23, 2014 at 09:36:28AM -0700, Linus Torvalds wrote:
> On Sun, Mar 23, 2014 at 12:16 AM, Al Viro wrote:
> > Several fixes; first 4 commits are obvious fixes (a couple
> > of fdget_pos()-related ones from Eric Biggers, prepend_name() fix, missing
> > checks for false negatives from __lookup_mnt() in fs/namei.c)
>
> I'm not seeing the obvious fix in the prepend_name() thing, and I
> think it's horrible to *update* the name-len to negative like it now
> does.
>
> Why is anybody calling it with a negative buffer length in the first
> place? *That* is the bug. Making the buflen become negative just makes
> the bug worse, imnsho.

It's easier to skip checking the overflow on prepend() of "\0" in the beginning of the whole thing and just let the next operation to fail. That's where the corner case comes from.
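The pattern described in the reply above (leave the first prepend of the terminating "\0" unchecked and let the following operation report the overflow) can be sketched in userspace roughly as follows. This is only an illustration under assumptions: `sketch_prepend()` and `sketch_build()` are hypothetical names, not the functions in fs/dcache.c, and the sketch takes no position on whether the approach is the right one. What it shows is that once the remaining length goes negative, nothing is written and every later prepend fails too.

```c
#include <assert.h>
#include <string.h>

/*
 * Prepend len bytes of str in front of *end, tracking the space left in
 * *buflen. On overflow *buflen goes (and stays) negative and nothing is
 * written.
 */
static int sketch_prepend(char **end, int *buflen, const char *str, int len)
{
	*buflen -= len;
	if (*buflen < 0)
		return -1;
	*end -= len;
	memcpy(*end, str, len);
	return 0;
}

/*
 * Build "name\0" at the tail of buf. The result of the first prepend
 * (the terminating NUL) is deliberately ignored; if it overflowed,
 * buflen is already negative and the second prepend fails as well.
 */
static char *sketch_build(char *buf, int buflen, const char *name)
{
	char *end = buf + buflen;

	(void)sketch_prepend(&end, &buflen, "", 1);
	if (sketch_prepend(&end, &buflen, name, (int)strlen(name)) < 0)
		return NULL;
	return end;
}
```

With an 8-byte buffer, `sketch_build(buf, 8, "abc")` returns a pointer to "abc" at the tail of the buffer, while a zero-length buffer yields NULL without any write.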
Re: [git pull] vfs fixes
On Sun, Mar 23, 2014 at 12:16 AM, Al Viro wrote:
> Several fixes; first 4 commits are obvious fixes (a couple
> of fdget_pos()-related ones from Eric Biggers, prepend_name() fix, missing
> checks for false negatives from __lookup_mnt() in fs/namei.c)

I'm not seeing the obvious fix in the prepend_name() thing, and I think it's horrible to *update* the name-len to negative like it now does.

Why is anybody calling it with a negative buffer length in the first place? *That* is the bug. Making the buflen become negative just makes the bug worse, imnsho.

So I'm not pulling this, since the obvious fixes don't all look obvious to me, and Sedat reports there are problems with it.

Linus
Re: [git pull] vfs fixes
On Sun, Mar 23, 2014 at 11:57:16AM +0100, Sedat Dilek wrote:

> Your branch on top of Linux v3.14-rc7-66-g774868c7094d is freezing my
> Ubuntu/precise AMD64 (WUBI) system when running LTP.

Which test?
[git pull] vfs fixes
Several fixes; first 4 commits are obvious fixes (a couple of fdget_pos()-related ones from Eric Biggers, prepend_name() fix, missing checks for false negatives from __lookup_mnt() in fs/namei.c), followed by 4 commits dealing with the bug found by Max last week - switch of mnt_hash to hlist, to avoid the fun with non-terminating __lookup_mnt().

I'm fairly comfortable with that pile, but whether its second part is OK at this point is up to you; it seems to survive everything I'd thrown at it, and it's quite straightforward. If you really feel that it's too close to -final, well... alternative variant is to replace the last 4 with "if we are spinning too much in __lookup_mnt(), check mount_lock" kludge like the one I've posted early in the "don't clobber mnt_hash.next" thread. I'd rather go for "let's just use hlist", obviously...

git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs.git for-linus

Shortlog:
Al Viro (6):
      make prepend_name() work correctly when called with negative *buflen
      rcuwalk: recheck mount_lock after mountpoint crossing attempts
      resizable namespace.c hashes
      keep shadowed vfsmounts together
      don't bother with propagate_mnt() unless the target is shared
      switch mnt_hash to hlist

Eric Biggers (2):
      vfs: atomic f_pos access in llseek()
      vfs: Don't let __fdget_pos() get FMODE_PATH files

Diffstat:
 fs/dcache.c     |   4 +-
 fs/file.c       |  19 ++
 fs/mount.h      |   4 +-
 fs/namei.c      |  29 -
 fs/namespace.c  | 177 ---
 fs/pnode.c      |  26
 fs/pnode.h      |   4 +-
 fs/read_write.c |   4 +-
 8 files changed, 155 insertions(+), 112 deletions(-)
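As a rough illustration of why a NULL-terminated chain sidesteps the "non-terminating __lookup_mnt()" problem mentioned above, consider the following userspace sketch. It is an assumption-laden toy, not the kernel's hash code: `cnode`, `hnode` and the `sketch_walk_*` helpers are invented names, there is no locking or RCU, and the step cap exists only so that the would-be endless walk can be observed. A walker on a circular list stops only when it sees its own head, so if the node it is standing on is moved to another chain it circles that chain forever; a walker on a NULL-terminated chain runs off the end instead.

```c
#include <assert.h>
#include <stddef.h>

struct cnode { struct cnode *next; };	/* circular: last->next == head */
struct hnode { struct hnode *next; };	/* NULL-terminated chain */

/*
 * Walk from start until head is reached. Returns the number of nodes
 * visited, or -1 if head was not seen within max_steps (in real code
 * without the cap this would be an endless loop).
 */
static int sketch_walk_circular(const struct cnode *start,
				const struct cnode *head, int max_steps)
{
	int steps = 0;

	for (const struct cnode *p = start; p != head; p = p->next)
		if (++steps > max_steps)
			return -1;
	return steps;
}

/* Walk from start until NULL; terminates whichever chain start is on. */
static int sketch_walk_hlist(const struct hnode *start, int max_steps)
{
	int steps = 0;

	for (const struct hnode *p = start; p != NULL; p = p->next)
		if (++steps > max_steps)
			return -1;
	return steps;
}
```

In the circular case, moving a node from list A to list B while a walker stands on it leaves the walker looking for A's head on B's ring; in the NULL-terminated case the same move merely makes the walker finish on the wrong chain.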
[git pull] vfs fixes for -rc2
sget() one is a long-standing bug and will need to go into -stable (in fact, it had been originally caught in RHEL6), the other two are 3.11-only.

Please, pull from
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs.git for-linus

Shortlog:
Al Viro (2):
      allow O_TMPFILE to work with O_WRONLY
      livelock avoidance in sget()

Peng Tao (1):
      vfs: constify dentry parameter in d_count()

Diffstat:
 fs/open.c                        |  2 ++
 fs/super.c                       | 25 ++--
 include/linux/dcache.h           |  2 +-
 include/uapi/asm-generic/fcntl.h |  4 ++--
 4 files changed, 15 insertions(+), 18 deletions(-)
[git pull] vfs fixes
Several fixes for bugs caught while looking through f_pos (ab)users. Please, pull from the usual place -
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs.git for-linus

Shortlog:
Al Viro (3):
      mconsole: we'd better initialize pos before passing it to vfs_read()...
      splice: don't pass the address of ->f_pos to methods
      aout32 coredump compat fix

Diffstat:
 arch/um/drivers/mconsole_kern.c |  2 +-
 arch/x86/ia32/ia32_aout.c       |  2 +-
 fs/internal.h                   |  6 ++
 fs/read_write.c                 | 24
 fs/splice.c                     | 31 ++-
 include/linux/fs.h              |  2 --
 include/linux/splice.h          |  1 +
 7 files changed, 43 insertions(+), 25 deletions(-)
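The general pattern named by the "don't pass the address of ->f_pos to methods" commit title can be sketched as below. This is a hypothetical userspace illustration, not the fs/splice.c change itself: `sketch_file`, `sketch_method` and the helper names are invented, and writing the position back only on success is a choice made for the sketch rather than something the thread states. The idea is that the method works on a private copy of the position, and the caller decides whether the shared field is updated.

```c
#include <assert.h>

typedef long long sketch_off_t;

/* Hypothetical stand-in for an open file with a shared position. */
struct sketch_file { sketch_off_t f_pos; };

/* A method that advances *pos; a negative return means failure. */
typedef long (*sketch_method)(struct sketch_file *f, sketch_off_t *pos, long len);

/* Well-behaved method: consumes len bytes. */
static long sketch_advance(struct sketch_file *f, sketch_off_t *pos, long len)
{
	(void)f;
	*pos += len;
	return len;
}

/* Misbehaving method: scribbles on *pos and then fails. */
static long sketch_fail(struct sketch_file *f, sketch_off_t *pos, long len)
{
	(void)f;
	(void)len;
	*pos += 1;
	return -1;
}

/*
 * Call m with the address of a local copy instead of &f->f_pos, and
 * store the result back only if the method reported progress.
 */
static long sketch_call_with_local_pos(struct sketch_file *f, sketch_method m, long len)
{
	sketch_off_t pos = f->f_pos;
	long ret = m(f, &pos, len);

	if (ret > 0)
		f->f_pos = pos;
	return ret;
}
```

Starting from position 10, a successful 5-byte call leaves the file at 15, and a subsequent failing call leaves it at 15 even though the method modified its copy.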
[git pull] vfs fixes
-stable fodder; assorted deadlock fixes. Please, pull from
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs.git for-linus

Shortlog:
Al Viro (3):
      Don't bother with redoing rw_verify_area() from default_file_splice_from()
      Nest rename_lock inside vfsmount_lock
      vt: synchronize_rcu() under spinlock is not nice...

Diffstat:
 drivers/tty/vt/vc_screen.c |  6 --
 fs/dcache.c                | 16 +++-
 fs/internal.h              |  5 +
 fs/read_write.c            | 25 +
 fs/splice.c                |  4 +++-
 5 files changed, 48 insertions(+), 8 deletions(-)