Re: [PATCH 1/2] Btrfs: be more precise on errors when getting an inode from disk

J. Bruce Fields Fri, 22 Jul 2016 13:09:06 -0700

On Fri, Jul 22, 2016 at 12:40:26PM +1000, NeilBrown wrote:
> On Fri, Jul 22 2016, J. Bruce Fields wrote:
> 
> > On Fri, Jul 22, 2016 at 11:08:17AM +1000, NeilBrown wrote:
> >> On Fri, Jun 10 2016, [email protected] wrote:
> >> 
> >> > From: Filipe Manana <[email protected]>
> >> >
> >> > When we attempt to read an inode from disk, we end up always returning an
> >> > -ESTALE error to the caller regardless of the actual failure reason, 
> >> > which
> >> > can be an out of memory problem (when allocating a path), some error 
> >> > found
> >> > when reading from the fs/subvolume btree (like a genuine IO error) or the
> >> > inode does not exists. So lets start returning the real error code to the
> >> > callers so that they don't treat all -ESTALE errors as meaning that the
> >> > inode does not exists (such as during orphan cleanup). This will also be
> >> > needed for a subsequent patch in the same series dealing with a special
> >> > fsync case.
> >> >
> >> > Signed-off-by: Filipe Manana <[email protected]>
> >> 
> >> SNIP
> >> 
> >> > @@ -5594,7 +5602,8 @@ struct inode *btrfs_iget(struct super_block *s, 
> >> > struct btrfs_key *location,
> >> >                  } else {
> >> >                          unlock_new_inode(inode);
> >> >                          iput(inode);
> >> > -                        inode = ERR_PTR(-ESTALE);
> >> > +                        ASSERT(ret < 0);
> >> > +                        inode = ERR_PTR(ret < 0 ? ret : -ESTALE);
> >> >                  }
> >> 
> >> Just a heads-up.  This change breaks NFS :-(
> >> 
> >> The change in error code percolates up the call chain:
> >> 
> >>  nfs4_pufh->fh_verify->nfsd_set_fh_dentry->exportfs_decode_fh
> >>     ->btrfs_fh_to_dentry->ntrfs_get_dentry->btrfs_iget
> >> 
> >> and nfsd returns NFS4ERR_NOENT to the client instead of NFS4ERR_STALE,
> >> and the client doesn't handle that quite the same way.
> >> 
> >> This doesn't mean that the change is wrong, but it could mean we need to
> >> fix something else in the path to sanitize the error code.
> >> 
> >> nfsd_set_fh_dentry already has
> >> 
> >>    error = nfserr_stale;
> >>    if (PTR_ERR(exp) == -ENOENT)
> >>            return error;
> >> 
> >>    if (IS_ERR(exp))
> >>            return nfserrno(PTR_ERR(exp));
> >> 
> >> for a different error case, so duplicating that would work, but I doubt
> >> it is best.  At the very least we should check for valid errors, not
> >> specific invalid ones.
> >> 
> >> Bruce: do you have an opinion where we should make sure that PUTFH (and
> >> various other requests) returns a valid error code?
> >
> > Uh, I guess not.  Maybe exportfs_decode_fh?
> >
> > Though my kneejerk reaction is to be cranky and wonder why btrfs
> > suddenly needs a different convention for decode_fh().
> >
> 
> I can certainly agree with that perspective, though it would be
> appropriate in that case to make sure we document the requirements for
> fh_to_dentry (the current spelling for 'decode_fh').  So I went looking
> for documentation and found:
> 
>  * fh_to_dentry:
>  *    @fh_to_dentry is given a &struct super_block (@sb) and a file handle
>  *    fragment (@fh, @fh_len). It should return a &struct dentry which refers
>  *    to the same file that the file handle fragment refers to.  If it cannot,
>  *    it should return a %NULL pointer if the file was found but no acceptable
>  *    &dentries were available, or an %ERR_PTR error code indicating why it
>  *    couldn't be found (e.g. %ENOENT or %ENOMEM).  Any suitable dentry can be
>  *    returned including, if necessary, a new dentry created with 
> d_alloc_root.
>  *    The caller can then find any other extant dentries by following the
>  *    d_alias links.
>  *
> 
> So the new btrfs code is actually conformant!!
> That documentation dates back to 2002 when I wrote it....
> And it looks like ENOENT wasn't handled correctly then :-(
> 
> I suspect anything that isn't ENOMEM should be converted to ESTALE.
> ENOMEM causes the client to be asked to retry the request later.
> 
> Does this look reasonable to you?
> (Adding Christof as he as contributed a lot to exportfs)
> 
> If there is agreement I'll test and post a proper patch.


I can live with it.  It bothers me that we're losing potentially useful
information about what went wrong in the filesystem.  Maybe this is a
place a dprintk could be handy?

--b.

> 
> Thanks,
> NeilBrown
> 
> 
> diff --git a/fs/exportfs/expfs.c b/fs/exportfs/expfs.c
> index 207ba8d627ca..3527b58cd5bc 100644
> --- a/fs/exportfs/expfs.c
> +++ b/fs/exportfs/expfs.c
> @@ -428,10 +428,10 @@ struct dentry *exportfs_decode_fh(struct vfsmount *mnt, 
> struct fid *fid,
>       if (!nop || !nop->fh_to_dentry)
>               return ERR_PTR(-ESTALE);
>       result = nop->fh_to_dentry(mnt->mnt_sb, fid, fh_len, fileid_type);
> -     if (!result)
> -             result = ERR_PTR(-ESTALE);
> -     if (IS_ERR(result))
> -             return result;
> +     if (PTR_ERR(result) == -ENOMEM)
> +             return ERR_CAST(result)
> +     if (IS_ERR_OR_NULL(result))
> +             return ERR_PTR(-ESTALE);
>  
>       if (d_is_dir(result)) {
>               /*
> @@ -541,6 +541,8 @@ struct dentry *exportfs_decode_fh(struct vfsmount *mnt, 
> struct fid *fid,
>  
>   err_result:
>       dput(result);
> +     if (err != -ENOMEM)
> +             err = -ESTALE;
>       return ERR_PTR(err);
>  }
>  EXPORT_SYMBOL_GPL(exportfs_decode_fh);


--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 1/2] Btrfs: be more precise on errors when getting an inode from disk

Reply via email to