On Wed, 2003-11-05 at 21:53, Ian Kent wrote:
> On Wed, 5 Nov 2003, Matthew Mitchell wrote:
> 
> > Looking at the 4.1.0-beta2 code, I don't see any mutex-looking code in
> > handle_packet_missing.  Isn't that where it should be?  Or are you
> > taking care of it elsewhere?  (Or are you allowing the bind mounts to go
> > through in the odd case where you get past the lstat() test while still
> > waiting on the lookup_mount to finish?)
> 
> Huh?
> 
> Do you know of a user-space semaphore implementation other than SysV IPC?
> Tell me and I'll use it.
> 
> There is no mutex-type code, and the code that is there is not around the
> lookup.

Sorry, what I wrote was a bit unclear.  You don't need a real mutex
because there is only one automount process per mount point.  The
problem (it seems to me) is that the automount process runs mount
asynchronously (fork && exec without wait()), which can lead to
unnecessary bind mounts.  I would think that you either need to
wait() on the mount or put a manual exclusion check around the fork
&& exec.  Of course I am only thinking about a simple NFS mount
situation; there may be other automount uses that preclude this.  If so
please educate me before I take the naive path. :)
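
A minimal sketch of the "manual exclusion check around the fork && exec" idea, written in Python for brevity rather than in the daemon's own C. The class and method names (MountTracker, request, wait) are hypothetical and do not come from autofs; the only assumption is a single process handling the requests for its mount points, as described above.

```python
import subprocess


class MountTracker:
    """Remember which mount points have a mount command still running."""

    def __init__(self):
        self._pending = {}  # mount point -> subprocess.Popen of the running command

    def request(self, mountpoint, argv):
        """Start argv for mountpoint unless an earlier command is still running.

        Returns True if a new child was started, False if the request was
        skipped because the previous one has not exited yet.
        """
        child = self._pending.get(mountpoint)
        if child is not None and child.poll() is None:
            return False  # still mounting; do not fire a second mount
        # Popen() forks and execs without waiting, like the asynchronous case.
        self._pending[mountpoint] = subprocess.Popen(argv)
        return True

    def wait(self, mountpoint):
        """Reap the child for mountpoint; return its exit status, or None."""
        child = self._pending.pop(mountpoint, None)
        return None if child is None else child.wait()
```

The simpler alternative is to wait() right after the fork, which serializes every mount; the tracker above keeps mounts asynchronous and only refuses a duplicate request for the same path.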

> First, I believe one of the problems is caused by contention for the mtab
> file within mount. This can cause mount to return a failure even though the
> mount has happened. Similar things happen at umount time when there are
> many master map entries. Hence using a -n switch on the bind mount test
> helps when there are many entries in the master map. This problem can also
> occur when mount requests occur in rapid succession. The other possibility
> is that the kernel module incorrectly fires multiple mount requests. If
> this is the case (as it was at one point) then that needs to be fixed in
> the kernel module, not the daemon.

Contention for mtab might explain the weird /tmp/mount-foobar garbage
that df complains about.  I don't have any idea what was in mtab when
this happened, though.  Have to check it the next time the problem shows
up.

> I can't remember whether you gave details of your scenario.
> 
> All I do is use a lock file around mount calls and a timed delay once
> the lock file is acquired. I'm not giving any guarantee that this will
> always work or that it will even work at all. However, it does seem to work
> better than I expected. It is a temporary workaround until a better
> solution is implemented.
> 
> Perhaps you can help here with a patch. But first try it and see if it
> works. If it ain't broke then I don't have the time to fix it.

Understood.
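
A rough sketch of the lock-file-plus-delay workaround described above, in Python for brevity. This is not taken from the autofs source: the lock path, the delay value, and the function name locked_run are all hypothetical, and flock() is swapped in here as one common way to hold a lock file; the daemon may well do it differently.

```python
import fcntl
import os
import subprocess
import time

LOCK_PATH = "/var/lock/autofs-mount.lock"  # hypothetical location
SETTLE_DELAY = 0.1  # hypothetical delay in seconds after taking the lock


def locked_run(argv, lock_path=LOCK_PATH, delay=SETTLE_DELAY):
    """Run argv while holding an exclusive lock on lock_path.

    Only one caller at a time gets past flock(), so commands that update
    mtab are serialized. Returns the command's exit status.
    """
    fd = os.open(lock_path, os.O_CREAT | os.O_RDWR, 0o600)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX)  # blocks until the lock is free
        time.sleep(delay)               # timed delay once the lock is held
        return subprocess.run(argv).returncode
    finally:
        os.close(fd)  # closing the descriptor releases the lock
```

A call might look like locked_run(["mount", "-n", "--bind", "/export/home", "/home"]), where the paths are made up for illustration. As noted in the quoted text, serializing like this is a workaround, not a guarantee.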

> You should be using the latest autofs4 kernel module with this. It is
> available from the same place you got the beta autofs-4.

OK.  Looks like a big change to the kernel setup (from a quick browse of
the patch).  In the interest of isolating the change to userspace, can
you point me to a version of autofs4 that works with the stock 2.4.20
kernel module?  I'd be happy to forward-port anything that works and
test it again with the newer code.  (I can test the new stuff pretty
easily on one machine but I don't want to undertake the week-long
process of updating the kernel on the whole cluster just for this
problem.)

-m
