>Steve Mcclure wrote:
>>
>> But, in some cases, NFS mounts get hung up (i.e. a "df" will hang, trying
>> to access the directory will hang, etc.) From what I've seen, it's always
>> the directories that had problems mounting. In looking at the log to see
>> why they hang, I see these types of errors:
>>
>> Sep 16 18:17:54 luc0242 kernel: autofs warning: lookup failure on positive dentry, status = -4, name = user2
>> Sep 16 18:18:29 luc0242 last message repeated 2 times
>> Sep 16 18:18:35 luc0242 kernel: autofs warning: lookup failure on positive dentry, status = -4, name = user2
>> Sep 16 18:19:06 luc0242 last message repeated 2 times
>> Sep 16 18:30:51 luc0242 kernel: autofs warning: lookup failure on positive dentry, status = -4, name = user2
>>
>> while trying to access the directories.
>>
>
>Okay, this is bad. -4 is EINTR; this implies a race condition involving
>signal handling. However, the relevant parts of autofs are protected by
>the global kernel lock. This may be the VFS bug that Linus and I were
>chasing a while ago, a fix for which is in 2.3.14. You may want to see
>if you can reproduce this using a recent 2.3 kernel.
>
> -hpa

OK, I've been trying to reproduce my problem. It turns out that the
log entries above were generated with the patch I asked you about
previously applied (I'll include it again below for reference; it's
the one you said would give me silent data corruption), so I
included the wrong log file entries.

What happens without that patch is that when we get an RPC timeout on
an automount and then immediately try to automount again, the
automounter hangs completely, and you have to reboot the box to
recover.

This is worse than with the patch. With the patch, only that user's
automount would hang, and you could recover everything by stopping
the automounter, unmounting everything by hand, and restarting the
automounter. Without it, the entire automounter hangs, for every
directory, and you have to reboot the box. In fact, you can't even
kill the user-level automounter no matter how hard you try.

Note that this is on a Quad Xeon box running 2.2.10 SMP. I have
sorta-kinda been able to get a test case that reproduces this by
taking another SMP box and pulling the network cable at inopportune
times; sometimes (but not reliably) the automounter will hang. I've
been testing with 2.2.10, 2.2.12 and 2.3.16 and get the same results
on the test box, i.e. I can't reproduce it consistently, so I'm not
exactly sure where the problem is or why it happens. Sometimes after
it hangs it will come back, but only after you kill some processes.
Other times you have to reboot. Very unpredictable.

Now, 2.2.10 (or possibly even 2.2.12 or 2.2.13, though that is
pushing it :) ) is a requirement for what I am doing, so I need to
track down a fix for this for the 2.2.x series kernels. I was trying
to see exactly what this VFS layer patch is, but was unable to pick
it out of the diffs between 2.3.13 and 2.3.14. Is there a specific
area of the code I should look at? Does anyone else have any other
ideas?

-- Steve
--- Patch follows ---
--- root.c.orig Thu Sep 2 12:23:41 1999
+++ root.c Thu Sep 2 12:26:29 1999
@@ -165,7 +165,7 @@ static int try_to_fill_dentry(struct den
  * yet completely filled in, and revalidate has to delay such
  * lookups..
  */
-static int autofs_revalidate(struct dentry * dentry, int flags)
+static int autofs_do_revalidate(struct dentry * dentry, int flags)
 {
 	struct inode * dir = dentry->d_parent->d_inode;
 	struct autofs_sb_info *sbi = autofs_sbi(dir->i_sb);
@@ -200,6 +200,17 @@ static int autofs_revalidate(struct dent
 	return 1;
 }
 
+static int autofs_revalidate(struct dentry * dentry, int flags)
+{
+	int status;
+
+	up(&dentry->d_parent->d_inode->i_sem);
+	status = autofs_do_revalidate(dentry, flags);
+	down(&dentry->d_parent->d_inode->i_sem);
+
+	return (status);
+}
+
 static struct dentry_operations autofs_dentry_operations = {
 	autofs_revalidate,	/* d_revalidate */
 	NULL,			/* d_hash */
@@ -237,9 +248,7 @@ static struct dentry *autofs_root_lookup
 	dentry->d_flags |= DCACHE_AUTOFS_PENDING;
 	d_add(dentry, NULL);
 
-	up(&dir->i_sem);
 	autofs_revalidate(dentry, 0);
-	down(&dir->i_sem);
 
 	/*
 	 * If we are still pending, check if we had to handle