Re: rm -r sometimes produces errors under NFS

Vincent Lefevre Thu, 08 Mar 2007 17:18:00 -0800

On 2007-03-09 00:44:55 +0100, Jim Meyering wrote:
> Realize that for most people (everyone except you, afaik),
> rm works just fine.


Yes, for most people, rm works fine. But the problem exists (I have
had it on 3 different NFS servers over the past few years). And for
your information, other users have reported the same problem, e.g.

http://forums1.itrc.hp.com/service/forums/questionanswer.do?threadId=994291&admit=-682735245+1173400109463+28353475

where some user posted the problem, and another user replied he had
the same problem ("I have a full encapsulated procedure which stops
when rm fails. Very annoying problem [...] While my system Is in
production it is pretty hard to turn off EMC or nfs server cache,
cause it would dracstically impact performances."). See? Other users
find this problem annoying, and though there may be a solution on the
NFS side (turning off the cache), such a solution is not reasonable.

> Please step back a moment and consider whether you have an unusual
> NFS setup, since you are the only one to report such a problem.

Correction: I'm the only one who has reported it in the right place
(well, perhaps not the right place, seeing how this problem is
being treated here...). It is well known that most users don't report
bugs, or report them somewhere else, and are more likely to look
for an immediate workaround. I sometimes do the same. Here is
the first time I ran into this problem (with GNU fileutils 4.0p,
in 2001):

http://groups.google.com/group/fr.comp.os.unix/browse_thread/thread/2e526832a2f3947d/

Also note that the problem occurred much more frequently with the
coreutils snapshot (6.8+) than with the current Debian version (5.97).
And I doubt that many people use the snapshot version.

I'm also one of the heaviest users of these machines (I'm often the
only one to report bugs, and some of them do eventually get
identified and fixed).

> You should start by trying to reproduce the failure using stock versions
> of client and server kernels, tools, etc.

The problem is that, as a user, I can't choose. But FYI, the client
runs Debian/testing (because Debian/stable doesn't exist for
x86_64), so it is not really old. The server, however, is quite old,
but the sysadmins don't want to upgrade it as they are not sure that
it would still work... (This is not surprising!) It will be replaced
by a new server under AFS, but not in the short term.

> Better still, write a script that will demonstrate the problem,
> given a small number of inputs (e.g., directory, hostname) and ask
> people to run it and report any problem they see.

The problem is that it is difficult to reproduce under different
conditions, in particular if the number of inputs is small. BTW,
I can no longer reproduce the problem with my testcase, which was
100% reproducible a few days ago (even though conditions on my side
are the same and the machines haven't been rebooted). It probably
depends on the load of the machine or the network, as is very often
the case when a bug depends on race conditions.

> I admit that the "rm skips rmdir" may be technically contrary to POSIX,
> but unless there's a more realistic way to trigger misbehavior, then I
> won't try to change it.  However, if you develop a clean, non-invasive
> patch to make rm conform to the letter of POSIX, and add a test script,
> I'll consider it.

A suggestion concerning the "rm skips rmdir" behavior: treat ENOENT
errors as not blocking rmdir (while other errors still do). Indeed,
such an "error" doesn't mean that an existing file couldn't be
unlinked, just that the file didn't exist. Implementing that should
only require an additional flag, shouldn't it? (But I haven't looked
at the coreutils source very closely.)
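
To make the suggestion concrete, here is a rough sketch of the idea in
C. This is not coreutils code: the function names
(unlink_blocks_rmdir, remove_files_then_dir) and the flat list of
paths are made up for illustration, and the real rm traverses the
tree quite differently. It only shows the proposed rule: an ENOENT
from unlink leaves the "blocked" flag untouched, any other failure
sets it, and rmdir is attempted only if the flag is still clear.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper (not from coreutils): try to unlink PATH.
   Return true if the failure should prevent the later rmdir of the
   parent directory, false otherwise.  ENOENT is not treated as
   blocking, since the entry is already gone.  */
static bool
unlink_blocks_rmdir (const char *path)
{
  if (unlink (path) == 0)
    return false;
  if (errno == ENOENT)
    return false;
  perror (path);                /* any other error is reported */
  return true;
}

/* Hypothetical driver: unlink each of the N_FILES paths in FILES,
   then rmdir DIR unless some unlink failed for a reason other than
   ENOENT.  Return 0 on success, -1 otherwise.  */
static int
remove_files_then_dir (const char *dir, const char *const *files,
                       size_t n_files)
{
  bool blocked = false;         /* the "additional flag" */

  for (size_t i = 0; i < n_files; i++)
    if (unlink_blocks_rmdir (files[i]))
      blocked = true;           /* keep going, but remember it */

  if (blocked)
    return -1;
  return rmdir (dir) == 0 ? 0 : -1;
}
```

With this rule, a directory whose entries have all disappeared (or
were reported as missing because of a stale NFS cache) would still be
removed, whereas a failure such as EACCES would still skip the rmdir.
Whether rm should also stay silent about the ENOENT itself is a
separate question that this sketch does not try to answer.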

-- 
Vincent Lefèvre <[EMAIL PROTECTED]> - Web: <http://www.vinc17.org/>
100% accessible validated (X)HTML - Blog: <http://www.vinc17.org/blog/>
Work: CR INRIA - computer arithmetic / Arenaire project (LIP, ENS-Lyon)


_______________________________________________
Bug-coreutils mailing list
[email protected]
http://lists.gnu.org/mailman/listinfo/bug-coreutils
