From: NeilBrown <[email protected]> lookup_one_qstr_excl() is used for lookups prior to directory modifications (other than open(O_CREATE)), whether create, remove, or rename.
A future patch will lift lookup out of the i_rwsem lock that protects the directory during these operations (only taking a shared lock if the target name is not yet in the dcache). To prepare for this change and particularly to allow lookup to always be done outside the parent i_rwsem, change lookup_one_qstr_excl() to use d_alloc_parallel(). For the target of create and rename some filesystems skip the preliminary lookup and combine it with the main operation. This is only safe if the operation has exclusive access to the dentry. Currently this is guaranteed by an exclusive lock on the directory. d_alloc_parallel() provides alternate exclusive access (in the case where the name isn't in the dcache and ->lookup will be called). As a result of this change, ->lookup is now only ever called with a d_in_lookup() dentry. Consequently we can remove the d_in_lookup() check from d_add_ci() which is only used in ->lookup. If LOOKUP_EXCL or LOOKUP_RENAME_TARGET is passed, the caller must ensure d_lookup_done() is called at an appropriate time, and must not assume that it can test for positive or negative dentries without confirming that the dentry is no longer d_in_lookup() - unless it is filesystem code acting on itself and *knows* that ->lookup() always completes the lookup (currently true for all filesystems other than NFS). This is all handled in start_creating() and end_dirop() and friends. Note that as lookup_one_qstr_excl() is called with an exclusive lock on the directory, d_alloc_parallel() cannot race with another thread and cannot return a non-in-lookup dentry. However that is expected to change so that case is handled with this patch. Signed-off-by: NeilBrown <[email protected]> --- Documentation/filesystems/porting.rst | 14 ++++++++++ fs/dcache.c | 16 +++-------- fs/namei.c | 38 ++++++++++++++++++++------- 3 files changed, 46 insertions(+), 22 deletions(-) diff --git a/Documentation/filesystems/porting.rst b/Documentation/filesystems/porting.rst index 7e83bd3c5a12..5ddc5ecfcc64 100644 --- a/Documentation/filesystems/porting.rst +++ b/Documentation/filesystems/porting.rst @@ -1403,3 +1403,17 @@ you really don't want it. lookup_one() and lookup_noperm() are no longer available. Use start_creating() or similar instead. + +--- + +**mandatory** + +All start_creating and start_renaming functions may return a +d_in_lookup() dentry if passed "O_CREATE|O_EXCL" or "O_RENAME_TARGET". +end_dirop() calls the necessary d_lookup_done(). If the caller +*knows* which filesystem is being used, it may know that this is not +possible. Otherwise it must be careful testing if the dentry is +positive or negative as the lookup may not have been performed yet. + +inode_operations.lookup() is now only ever called with a d_in_lookup() +dentry. diff --git a/fs/dcache.c b/fs/dcache.c index abb96ad8e015..f573716d1a04 100644 --- a/fs/dcache.c +++ b/fs/dcache.c @@ -2261,18 +2261,10 @@ struct dentry *d_add_ci(struct dentry *dentry, struct inode *inode, inode_unlock_shared(d_inode(dentry->d_parent)); else inode_unlock(d_inode(dentry->d_parent)); - if (d_in_lookup(dentry)) { - found = d_alloc_parallel(dentry->d_parent, name); - if (IS_ERR(found) || !d_in_lookup(found)) { - iput(inode); - return found; - } - } else { - found = d_alloc(dentry->d_parent, name); - if (!found) { - iput(inode); - return ERR_PTR(-ENOMEM); - } + found = d_alloc_parallel(dentry->d_parent, name); + if (IS_ERR(found) || !d_in_lookup(found)) { + iput(inode); + return found; } if (shared) inode_lock_shared(d_inode(dentry->d_parent)); diff --git a/fs/namei.c b/fs/namei.c index cb80490a869f..bba419f2fc53 100644 --- a/fs/namei.c +++ b/fs/namei.c @@ -1774,13 +1774,14 @@ static struct dentry *lookup_dcache(const struct qstr *name, } /* - * Parent directory has inode locked exclusive. This is one - * and only case when ->lookup() gets called on non in-lookup - * dentries - as the matter of fact, this only gets called - * when directory is guaranteed to have no in-lookup children - * at all. - * Will return -ENOENT if name isn't found and LOOKUP_CREATE wasn't passed. - * Will return -EEXIST if name is found and LOOKUP_EXCL was passed. + * Parent directory has inode locked. + * If Lookup_EXCL or LOOKUP_RENAME_TARGET is set + * d_lookup_done() must be called before the dentry is dput() + * If the dentry is not d_in_lookup(): + * Will return -ENOENT if name isn't found and LOOKUP_CREATE wasn't passed. + * Will return -EEXIST if name is found and LOOKUP_EXCL was passed. + * If it is d_in_lookup() then these conditions can only be checked by the + * file system when carrying out the intent (create or rename). */ static struct dentry *lookup_one_qstr_excl(const struct qstr *name, struct dentry *base, unsigned int flags) @@ -1798,18 +1799,27 @@ static struct dentry *lookup_one_qstr_excl(const struct qstr *name, if (unlikely(IS_DEADDIR(dir))) return ERR_PTR(-ENOENT); - dentry = d_alloc(base, name); - if (unlikely(!dentry)) - return ERR_PTR(-ENOMEM); + dentry = d_alloc_parallel(base, name); + if (unlikely(IS_ERR(dentry))) + return dentry; + if (unlikely(!d_in_lookup(dentry))) + /* Raced with another thread which did the lookup */ + goto found; old = dir->i_op->lookup(dir, dentry, flags); if (unlikely(old)) { + d_lookup_done(dentry); dput(dentry); dentry = old; } found: if (IS_ERR(dentry)) return dentry; + if (d_in_lookup(dentry)) + /* We cannot check for errors - the caller will have to + * wait for any create-etc attempt to get relevant errors. + */ + return dentry; if (d_is_negative(dentry) && !(flags & LOOKUP_CREATE)) { dput(dentry); return ERR_PTR(-ENOENT); @@ -2921,6 +2931,8 @@ static struct dentry *__start_dirop(struct dentry *parent, struct qstr *name, * The lookup is performed and necessary locks are taken so that, on success, * the returned dentry can be operated on safely. * The qstr must already have the hash value calculated. + * The dentry may be d_in_lookup() if %LOOKUP_EXCL or %LOOKUP_RENAME_TARGET + * is given, depending on the filesystem. * * Returns: a locked dentry, or an error. * @@ -2942,6 +2954,7 @@ void end_dirop(struct dentry *de) { if (!IS_ERR(de)) { inode_unlock(de->d_parent->d_inode); + d_lookup_done(de); dput(de); } } @@ -3854,8 +3867,10 @@ __start_renaming(struct renamedata *rd, int lookup_flags, return 0; out_dput_d2: + d_lookup_done(d2); dput(d2); out_dput_d1: + d_lookup_done(d1); dput(d1); out_unlock: unlock_rename(rd->old_parent, rd->new_parent); @@ -3950,6 +3965,7 @@ __start_renaming_dentry(struct renamedata *rd, int lookup_flags, return 0; out_dput_d2: + d_lookup_done(d2); dput(d2); out_unlock: unlock_rename(old_dentry->d_parent, rd->new_parent); @@ -4059,6 +4075,8 @@ EXPORT_SYMBOL(start_renaming_two_dentries); void end_renaming(struct renamedata *rd) { + d_lookup_done(rd->old_dentry); + d_lookup_done(rd->new_dentry); unlock_rename(rd->old_parent, rd->new_parent); dput(rd->old_dentry); dput(rd->new_dentry); -- 2.50.0.107.gf914562f5916.dirty
