Hello. While wandering through the kernel code looking for places
where my RPC timeout bug might be occurring, I started wondering
whether there is a race condition in the autofs wait queues.

If you look in linux/fs/autofs/waitq.c, you can see the creation and
manipulation of the autofs wait queues. However, as far as I can see,
they are not protected by any locking mechanism. The box that I am
having problems with is a 4-way SMP box, and two of these boxes
actually started producing kernel Oopses in the autofs code after the
RPC timeouts caused the automounter to go nuts...

So, is it possible that these queues are getting corrupted when
multiple automount requests come in at the same time? Since the
requests wait for a long time (due to network problems), there is a
wide window in which several CPUs could be modifying the queues at
once. I don't think fixing this would solve my specific problem, but
it does look like it could be a problem nonetheless.
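
To make the concern concrete, here is a small user-space sketch of the
kind of protection I would expect around such a queue. This is NOT the
autofs code: the names (demo_wait, demo_queue, demo_queue_add,
demo_queue_release) are made up for illustration, and a pthread mutex
stands in for whatever locking primitive the kernel would actually use.
The point is only that the list insert and unlink both happen with the
lock held.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for one pending mount request (not the real autofs struct). */
struct demo_wait {
    unsigned long token;
    struct demo_wait *next;
};

/* Hypothetical queue head; the mutex guards head and next_token. */
struct demo_queue {
    pthread_mutex_t lock;
    struct demo_wait *head;
    unsigned long next_token;
};

void demo_queue_init(struct demo_queue *q)
{
    assert(q != NULL);
    pthread_mutex_init(&q->lock, NULL);
    q->head = NULL;
    q->next_token = 1;
}

/* Queue a new request. Returns its token, or 0 if allocation fails. */
unsigned long demo_queue_add(struct demo_queue *q)
{
    struct demo_wait *wq = malloc(sizeof(*wq));
    unsigned long token;

    if (wq == NULL)
        return 0;

    pthread_mutex_lock(&q->lock);
    /* Token assignment and list insertion are done under the lock, so two
     * callers cannot take the same token or lose each other's entry. */
    token = q->next_token++;
    wq->token = token;
    wq->next = q->head;
    q->head = wq;
    pthread_mutex_unlock(&q->lock);

    return token;
}

/* Unlink and free the request with the given token. Returns 1 if found, else 0. */
int demo_queue_release(struct demo_queue *q, unsigned long token)
{
    struct demo_wait **pp;
    struct demo_wait *wq = NULL;

    pthread_mutex_lock(&q->lock);
    for (pp = &q->head; *pp != NULL; pp = &(*pp)->next) {
        if ((*pp)->token == token) {
            wq = *pp;
            *pp = wq->next;   /* unlink while still holding the lock */
            break;
        }
    }
    pthread_mutex_unlock(&q->lock);

    if (wq == NULL)
        return 0;
    free(wq);
    return 1;
}
```

Without the lock/unlock pairs, two CPUs running the insert at the same
time could both read the same head pointer and one entry would be lost,
and a release racing with an insert could leave a dangling pointer in
the list. Whether the real autofs code can actually hit this is exactly
what I am asking.
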
(I'm still trying to find a way to reproduce the problem manually on
these boxes BTW, instead of waiting for random network errors to
cause it. Once I have a repeatable test case, I'll post it here.)

Thanks!
-- Steve McClure
[EMAIL PROTECTED]