Re: [PATCH 00/17] clean up readlinks
On Wed, Sep 28, 2016 at 4:17 AM, Al Viro wrote: > Symlink traversal is a much hotter path than readlink() would ever be. > What's more, we do have jumps on normal symlink traversal - after all, > absolute symlinks are exactly that; it's "jump to root, then traverse > the following sequence of components". So having ->get_link() that > includes jumps is not that much of a stretch (note that it could both > jump and return a relative pathname to traverse after that; none of the > procfs-style ones do that, but there's no reason to prohibit that). > > What I'd prefer is > * it's a symlink iff it has ->get_link() > * readlink(2) on a symlink is normally just using generic_readlink() > * that can be overridden by supplying a ->readlink() method. > * the first time readlink() hits a symlink it will check both > ->get_link() and ->readlink() presence. Then, if it's a normal symlink, > the inode will get marked as such and all subsequent calls will just call > generic_readlink(). IOW, I would go for > if (unlikely(!marked)) { > if ->readlink is present > call ->readlink and return > if ->get_link is absent > fail > mark > } > call generic_readlink Redid patchset confirming to the above. Also moved the overlayfs/ecryptfs fixes to the front of the queue, as they are independent of the rest of the cleanups. The last part is new: changing ->readlink signature to match that of ->get_link. This moves the user buffer handling out of filesystems, simplifying the implementation in the process. Force pushed new set to: git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs.git #readlink It's got a couple of trivial conflicts against #rename2 and #overlayfs-next. I can do the merge and send a big pull request if you want. Thanks, Miklos --- Miklos Szeredi (14): vfs: add vfs_get_link() helper ovl: use vfs_get_link() ecryptfs: use vfs_get_link() ovl: use generic_readlink proc/self: use generic_readlink bad_inode: use generic_readlink vfs: replace calling i_op->readlink with vfs_readlink() vfs: default to generic_readlink() vfs: remove ".readlink = generic_readlink" assignments vfs: make generic_readlink() static vfs: convert ->readlink to same signature as ->get_link vfs: remove page_readlink() vfs: make readlink_copy() static nsfs: clean up ns_get_name() interface --- Documentation/filesystems/Locking | 6 +- Documentation/filesystems/porting | 7 +++ Documentation/filesystems/vfs.txt | 14 +++-- drivers/staging/lustre/lustre/llite/symlink.c | 1 - fs/9p/vfs_inode.c | 1 - fs/9p/vfs_inode_dotl.c| 1 - fs/affs/symlink.c | 1 - fs/afs/mntpt.c| 2 +- fs/autofs4/symlink.c | 1 - fs/bad_inode.c| 7 --- fs/btrfs/inode.c | 1 - fs/ceph/inode.c | 1 - fs/cifs/cifsfs.c | 1 - fs/coda/cnode.c | 1 - fs/configfs/symlink.c | 1 - fs/ecryptfs/inode.c | 30 -- fs/ext2/symlink.c | 2 - fs/ext4/symlink.c | 3 - fs/f2fs/namei.c | 2 - fs/fuse/dir.c | 1 - fs/gfs2/inode.c | 1 - fs/hostfs/hostfs_kern.c | 1 - fs/jffs2/symlink.c| 1 - fs/jfs/symlink.c | 2 - fs/kernfs/symlink.c | 1 - fs/libfs.c| 1 - fs/minix/inode.c | 1 - fs/namei.c| 83 +++ fs/ncpfs/inode.c | 1 - fs/nfs/symlink.c | 1 - fs/nfsd/nfs4xdr.c | 8 +-- fs/nfsd/vfs.c | 6 +- fs/nilfs2/namei.c | 1 - fs/nsfs.c | 17 -- fs/ocfs2/symlink.c| 1 - fs/orangefs/symlink.c | 1 - fs/overlayfs/copy_up.c| 46 ++- fs/overlayfs/inode.c | 30 +- fs/proc/base.c| 57 +++--- fs/proc/inode.c | 1 - fs/proc/namespaces.c | 15 +++-- fs/proc/self.c| 13 - fs/proc/thread_self.c | 14 - fs/reiserfs/namei.c
Re: [PATCH 00/17] clean up readlinks
On Tue, Sep 27, 2016 at 11:38:33AM +0200, Miklos Szeredi wrote: > > I have no problem with "let's get rid of generic_readlink" - not > > that > > it bought us much, but sure, if you want to have decision made based upon > > the combination of flags, let's do it. Just make NULL ->readlink + non-NULL > > ->get_link() mean generic_readlink(), and we are done. > > Indeed. Except it really should be the other way round: > > - .get_link always returning the symlink body > - only proc setting .jump_link to do its thing > - RIP .readlink > But that's an extra branch in the symlink following. I was worried > about that and hence gone for the unification of the two. Symlink traversal is a much hotter path than readlink() would ever be. What's more, we do have jumps on normal symlink traversal - after all, absolute symlinks are exactly that; it's "jump to root, then traverse the following sequence of components". So having ->get_link() that includes jumps is not that much of a stretch (note that it could both jump and return a relative pathname to traverse after that; none of the procfs-style ones do that, but there's no reason to prohibit that). What I'd prefer is * it's a symlink iff it has ->get_link() * readlink(2) on a symlink is normally just using generic_readlink() * that can be overridden by supplying a ->readlink() method. * the first time readlink() hits a symlink it will check both ->get_link() and ->readlink() presence. Then, if it's a normal symlink, the inode will get marked as such and all subsequent calls will just call generic_readlink(). IOW, I would go for if (unlikely(!marked)) { if ->readlink is present call ->readlink and return if ->get_link is absent fail mark } call generic_readlink > Yeah. We can do your above suggestion, it's certainly less brittle. > But I think it's rather confusing, having ->get_link normally do > readlink, except for proc, where readlink is done by ->readlink. ->readlink() is just an override for the cases when readlink(2) wants to fake something (or, as in case of AFS ugliness, is used on non-symlinks). The primary function of symlinks is traversal, not readlink...
Re: [PATCH 00/17] clean up readlinks
On Tue, Sep 27, 2016 at 5:10 AM, Al Viro wrote: > On Mon, Sep 12, 2016 at 09:29:02PM +0200, Miklos Szeredi wrote: >> The first patch is actually a bug fix, but I put it into this bunch for >> simplicity... >> >> The rest are really cleanups as well as minor bugfixes that are byproducts >> of the cleanups. >> >> This series builds on the fact that i_op.readlink is already set to >> generic_readlink() in 43/50 of the cases. And of those 7 only 4 are doing >> something special. So more than 90% of readlinks are/could actually just >> call back into get_link. >> >> The interesting cases are: >> >> - AFS, which has readlink but not get_link >> - proc, that allow jumping while following symlinks >> >> The first is handled by setting IOP_NOFOLLOW on the inode by the fs. >> >> The second one is handled by introducing is_following_link() which returns >> a bool depending on whether current->nameidata is NULL or not. If it >> returns false ->get_link() should behave as ->readlink() did. Otherwise it >> should behave as id did previously. >> >> Builds and boots. Can even read symlinks. > > I have no problem with "let's get rid of generic_readlink" - not that > it bought us much, but sure, if you want to have decision made based upon > the combination of flags, let's do it. Just make NULL ->readlink + non-NULL > ->get_link() mean generic_readlink(), and we are done. Indeed. Except it really should be the other way round: - .get_link always returning the symlink body - only proc setting .jump_link to do its thing - RIP .readlink But that's an extra branch in the symlink following. I was worried about that and hence gone for the unification of the two. > > Overloading ->get_link() for procfs-style ones is just plain wrong, > though. Your current->nameidata != NULL thing is bloody brittle - what > happens if some code triggers those readlinks when called by something > during pathname resolution? Sure, right now existing callers won't. > But it doesn't take much to grow such a place _and_ have the implications > go unnoticed for quite a while. Yeah. We can do your above suggestion, it's certainly less brittle. But I think it's rather confusing, having ->get_link normally do readlink, except for proc, where readlink is done by ->readlink. Thanks, Miklos
Re: [PATCH 00/17] clean up readlinks
On Mon, Sep 12, 2016 at 09:29:02PM +0200, Miklos Szeredi wrote: > The first patch is actually a bug fix, but I put it into this bunch for > simplicity... > > The rest are really cleanups as well as minor bugfixes that are byproducts > of the cleanups. > > This series builds on the fact that i_op.readlink is already set to > generic_readlink() in 43/50 of the cases. And of those 7 only 4 are doing > something special. So more than 90% of readlinks are/could actually just > call back into get_link. > > The interesting cases are: > > - AFS, which has readlink but not get_link > - proc, that allow jumping while following symlinks > > The first is handled by setting IOP_NOFOLLOW on the inode by the fs. > > The second one is handled by introducing is_following_link() which returns > a bool depending on whether current->nameidata is NULL or not. If it > returns false ->get_link() should behave as ->readlink() did. Otherwise it > should behave as id did previously. > > Builds and boots. Can even read symlinks. I have no problem with "let's get rid of generic_readlink" - not that it bought us much, but sure, if you want to have decision made based upon the combination of flags, let's do it. Just make NULL ->readlink + non-NULL ->get_link() mean generic_readlink(), and we are done. Especially if you do the usual "set the flag on inode the first time we need to check". I also have no problem with overlayfs and ecryptfs assuming that we only deal with normal symlinks. Overloading ->get_link() for procfs-style ones is just plain wrong, though. Your current->nameidata != NULL thing is bloody brittle - what happens if some code triggers those readlinks when called by something during pathname resolution? Sure, right now existing callers won't. But it doesn't take much to grow such a place _and_ have the implications go unnoticed for quite a while.
[PATCH 00/17] clean up readlinks
The first patch is actually a bug fix, but I put it into this bunch for simplicity... The rest are really cleanups as well as minor bugfixes that are byproducts of the cleanups. This series builds on the fact that i_op.readlink is already set to generic_readlink() in 43/50 of the cases. And of those 7 only 4 are doing something special. So more than 90% of readlinks are/could actually just call back into get_link. The interesting cases are: - AFS, which has readlink but not get_link - proc, that allow jumping while following symlinks The first is handled by setting IOP_NOFOLLOW on the inode by the fs. The second one is handled by introducing is_following_link() which returns a bool depending on whether current->nameidata is NULL or not. If it returns false ->get_link() should behave as ->readlink() did. Otherwise it should behave as id did previously. Builds and boots. Can even read symlinks. Git tree is here: git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs.git readlink Thanks, Miklos --- Miklos Szeredi (17): bad_inode: add missing i_op initializers ovl: use generic_readlink proc/self: use generic_readlink afs: use generic_readlink bad_inode: use generic_readlink vfs: remove page_readlink() vfs: add is_following_link() helper proc: merge proc_pid_readlink() into proc_pid_get_link() proc: merge proc_ns_readlink() into proc_ns_get_link() nsfs: clean up ns_get_name() interface vfs: replace calling i_op->readlink with vfs_readlink() vfs: remove ".readlink = generic_readlink" assignments vfs: remove unused i_op->readlink vfs: remove unused generic_readlink() vfs: add vfs_get_link() helper ovl: use vfs_get_link() ecryptfs: use vfs_get_link() Documentation/filesystems/Locking | 2 - Documentation/filesystems/porting | 15 + Documentation/filesystems/vfs.txt | 25 drivers/staging/lustre/lustre/llite/symlink.c | 1 - fs/9p/vfs_inode.c | 1 - fs/9p/vfs_inode_dotl.c| 1 - fs/affs/symlink.c | 1 - fs/afs/inode.c| 2 + fs/afs/mntpt.c| 2 +- fs/autofs4/symlink.c | 1 - fs/bad_inode.c| 62 fs/btrfs/inode.c | 1 - fs/ceph/inode.c | 1 - fs/cifs/cifsfs.c | 1 - fs/coda/cnode.c | 1 - fs/configfs/symlink.c | 1 - fs/ecryptfs/inode.c | 30 -- fs/ext2/symlink.c | 2 - fs/ext4/symlink.c | 3 - fs/f2fs/namei.c | 2 - fs/fuse/dir.c | 1 - fs/gfs2/inode.c | 1 - fs/hostfs/hostfs_kern.c | 1 - fs/jffs2/symlink.c| 1 - fs/jfs/symlink.c | 2 - fs/kernfs/symlink.c | 1 - fs/libfs.c| 1 - fs/minix/inode.c | 1 - fs/namei.c| 82 --- fs/ncpfs/inode.c | 1 - fs/nfs/symlink.c | 1 - fs/nfsd/nfs4xdr.c | 8 +-- fs/nfsd/vfs.c | 4 +- fs/nilfs2/namei.c | 1 - fs/nsfs.c | 17 -- fs/ocfs2/symlink.c| 1 - fs/orangefs/symlink.c | 1 - fs/overlayfs/copy_up.c| 46 ++- fs/overlayfs/inode.c | 25 +--- fs/proc/base.c| 66 +++-- fs/proc/inode.c | 1 - fs/proc/namespaces.c | 41 +- fs/proc/self.c| 13 - fs/proc/thread_self.c | 14 - fs/reiserfs/namei.c | 1 - fs/squashfs/symlink.c | 1 - fs/stat.c | 5 +- fs/sysv/inode.c | 1 - fs/ubifs/file.c | 1 - fs/xfs/xfs_ioctl.c| 4 +- fs/xfs/xfs_iops.c | 2 - include/linux/fs.h| 9 ++- include/linux/namei.h | 1 + include/linux/proc_ns.h | 4 +- mm/shmem.c| 2 - 55 files changed, 226 insertions(+), 291 deletions(-) -- 2.5.5