On Sat, 22 Nov 2003, Ian Kent wrote:
> On Fri, 21 Nov 2003, Jim Carter wrote:
> > B. If the daemon is hit with SIGUSR1, it goes into an infinite loop
> > trying unsuccessfully to dismount eligible filesystems, spitting out
> > typically 1000 syslog messages over 2 seconds until item C (below)
> > supervenes. I put in both a rate throttle (20/second) and a dynamic
> > limit on the number of dismounts.
>
> This sounds like a problem that needs to be identified and fixed.
> Rate throttling seems more of a workaround than a solution.
> Can you give more information please.

This part of the patch is definitely a kludge. The daemon's logic goes
like this: When it's time to purge mounts, it sends a packet to the driver
saying "find an expired mount". The driver sends up a packet saying
"dismount /net/tupelo//h1". The daemon tries to do that, but the
filesystem is not actually dismounted (there are lots of possible reasons
this could happen). Repeating the loop, the daemon asks "find an expired
mount". The driver again sends up a packet saying "dismount
/net/tupelo//h1"...

A possible non-kludge fix might go like this. The daemon walks the tree
of its own (sub-)mounts and, for each one, makes a preliminary judgment
about whether the mount might be expired. For likely-looking mounts, it
asks the driver "is this expired?" or "when was it really last used?" If
the mount is really expired, the daemon attempts to dismount it. But if
the filesystem fails to go away, the daemon will not return to it until
the next USR1 or ALRM, avoiding the infinite loop.

Here's another possibility: you shouldn't go around updating the atime of
the mounted filesystem, but the mount point belongs to the driver, and if
you stat the mount point's inode, the driver can provide the last access
time (what it uses to decide about expiration) as the atime of that inode.
Then the daemon can do the entire logic of picking expired mounts. That
would be preferable as a design, and it avoids all infinite loop
possibilities. Presumably to stat the inode, you would open(2) the mount
point directory before mounting on it, and then fstat(2) the open
descriptor, since lstat on the path would report the root of the mounted
filesystem. I hope that will actually work.

Of course, both of these fixes require protocol changes in the driver.

> > C. Upon auto-dismount or SIGUSR1 looping, st_prepare_shutdown is called
> > when ap.state != ST_READY and an assertion fails, killing the thread.
> > I changed it to die on ST_SHUTDOWN_PENDING, i.e. a recursive call. I'm
> > not 100% sure that this is the correct contingency, but automount does
> > dismount the unused filesystems and does exit.
>
> Have seen this. I'm not sure if I fixed this in the 4.0.0 release either.
> Will check into it.

When the submounted daemon dismounts its last filesystem, it's in
ST_EXPIRE (I think that's the spelling), but correctly calls
st_prepare_shutdown. I don't know if there are any other non-obvious but
correct transitions into SHUTDOWN state.

> > The patches follow. They are against autofs-4.0.0pre10, which is the
> > version distributed with SuSE 8.2, the distro we are using.
>
> The SuSE maintainer contacted me a while ago, sent me a copy of his
> autofs which was much appreciated. I merged some of the SuSE patches into
> the current 4.1.0 beta.
>
> I hope to encourage him to adopt 4.1.0 when a final version is released.

We're definitely looking forward to it. We're pretty aggressive about
patching machines and auditing software, and when we make private patches
a big problem is making sure they stay installed.

James F. Carter          Voice 310 825 2897    FAX 310 206 6673
UCLA-Mathnet;  6115 MSA; 405 Hilgard Ave.; Los Angeles, CA, USA  90095-1555
Email: [EMAIL PROTECTED]    http://www.math.ucla.edu/~jimc (q.v. for PGP key)

_______________________________________________
autofs mailing list
[EMAIL PROTECTED]
http://linux.kernel.org/mailman/listinfo/autofs
