On Thu, Feb 22, 2018 at 1:17 PM, Raghavendra G <raghaven...@gluster.com> wrote:
> > > On Wed, Oct 11, 2017 at 7:32 PM, J. Bruce Fields <bfie...@fieldses.org> > wrote: > >> On Wed, Oct 11, 2017 at 04:11:51PM +0530, Raghavendra G wrote: >> > On Thu, Mar 31, 2016 at 1:22 AM, J. Bruce Fields <bfie...@fieldses.org> >> > wrote: >> > >> > > On Mon, Mar 28, 2016 at 04:21:00PM -0400, Vijay Bellur wrote: >> > > > I would prefer to: >> > > > >> > > > 1. Return ENOENT for all system calls that operate on a path. >> > > > >> > > > 2. ESTALE might be ok for file descriptor based operations. >> > > >> > > Note that operations which operate on paths can fail with ESTALE when >> > > they attempt to look up a component within a directory that no longer >> > > exists. >> > > >> > >> > But, "man 2 rmdir" or "man 2 unlink" doesn't list ESTALE as a valid >> error. >> >> In fact, almost no man pages list ESTALE as a valid error: >> >> [bfields@patate man-pages]$ git grep ESTALE >> Changes.old: Change description for ESTALE >> man2/open_by_handle_at.2:.B ESTALE >> man2/open_by_handle_at.2:.B ESTALE >> man3/errno.3:.B ESTALE >> >> Cc'ing Michael Kerrisk for advice. Is there some reason for that, or >> can we fix those man pages? >> >> > Also rm doesn't seem to handle ESTALE too [3] >> > >> > [4] https://github.com/coreutils/coreutils/blob/master/src/remov >> e.c#L305 >> >> I *think* that code is just deciding whether a given error should be >> silently ignored in the rm -f case. I don't think -ESTALE (indicating >> the directory is bad) is such an error, so I think this code is correct. >> But my understanding may be wrong. >> > > For a local filesystem, we may not end up in ESTALE errors. But, when > rmdir is executed from multiple clients of a network fs (like NFS, > Glusterfs), unlink or rmdir can easily fail with ESTALE as the other rm > invocation could've deleted it. I think this is what has happened in bugs > like: > https://bugzilla.redhat.com/show_bug.cgi?id=1546717 > https://bugzilla.redhat.com/show_bug.cgi?id=1245065 > > This in fact was the earlier motivation to convert ESTALE into ENOENT, so > that rm would ignore it. Now that I reverted the fix, looks like the bug > has promptly resurfaced :) > > There is one glitch though. Bug 1245065 mentions that some parts of > directory structure remain undeleted. From my understanding, atleast one > instance of rm (which is racing ahead of all others causing others to > fail), should've delted the directory structure completely. Though, I need > to understand the directory traversal done by rm to find whether there are > cyclic dependency between two rms causing both of them to fail. > Also note that VFS retries unlink and rmdir if there is an estale: https://github.com/torvalds/linux/blob/master/fs/namei.c#L4056 https://github.com/torvalds/linux/blob/master/fs/namei.c#L3927 So, underlying fs like Glusterfs cannot mask ESTALE as it'll break some functionality. In this scenario what are the ways to fix a failing rm when run from multiple mount points (Bugs I pointed out above)? I really can't think of a way other than fixing rm to ignore ESTALE error. > >> > > Maybe non-creating open("./foo") returning ENOENT would be reasonable >> in >> > > this case since that's what you'd get in the local filesystem case, >> but >> > > creat("./foo") returning ENOENT, for example, isn't something >> > > applications will be written to handle. >> > > >> > > The Linux VFS will retry ESTALE on path-based systemcalls *one* time, >> to >> > > reduce the chance of ESTALE in those cases. >> > >> > >> > I should've anticipated bug [2] due to this comment. My mistake. Bug >> [2] is >> > indeed due to kernel not retrying open on receiving an ENOENT error. >> > Glusterfs sent ENOENT because file's inode-number/nodeid changed but >> same >> > path exists. The correct error would've been ESTALE, but due to our >> > conversion of ESTALE to ENOENT, the latter was sent back to kernel. >> > >> > Looking through kernel VFS code, only open *seems* to retry >> > (do_filep_open). I couldn't find similar logic to other path based >> syscalls >> > like rmdir, unlink, stat, chmod etc >> >> I believe there is a retry in those cases, but I'm not sure exactly >> where it is. Looking around.... See the retry_estale() checks sprinkled >> around namei.c, which were added by Jeff Layton a few years ago. >> >> --b. >> _______________________________________________ >> Gluster-devel mailing list >> Gluster-devel@gluster.org >> http://lists.gluster.org/mailman/listinfo/gluster-devel >> > > > > -- > Raghavendra G > -- Raghavendra G
_______________________________________________ Gluster-devel mailing list Gluster-devel@gluster.org http://lists.gluster.org/mailman/listinfo/gluster-devel