On Wed, 2008-04-23 at 16:04 -0400, Jeff Moyer wrote:
> [EMAIL PROTECTED] (Jim Carter) writes:
>
> > Our two webservers serve UserDirs that are automounted (NFS) from other
> > hosts. Every few days we discover a catatonic webserver (Apache2) with
> > $ServerLimit child processes (150 of them), and many but not all home
> > directories cannot be accessed manually (ls -d ~$user, which hangs).
> > This started immediately after we upgraded the server host from SuSE
> > 10.1 to SuSE 10.3; autofs version changed from 4.1.4 to 5.0.2.
>
> That's a big jump!
>
> > I was hoping to include debug output from autofs, but when I set
> > DEFAULT_LOGGING=debug and started the test program it totally locked up
> > the machine and I haven't been able to get on it since (because I'm
> > working from home). Update: a co-worker rebooted it for me and I was
> > able to clear the debug switch and recover the syslog output (attached).
> > But evidently the test program also seized up; I don't see a lot of
> > actual mounting going on. Anyway I've included it, for what it's worth.
>
> That's strange. Given the number of mounts you're talking about,
> though, it may just be that you overcommitted the boxes memory. It will
> be hard to say without further digging.
>
> > I was hoping to include useful strace output, and I have 80 Mbytes of
> > turgid information (on a different machine), but I have a feeling that
> > it's going to be more useful to include the test program and let
> > someone overload their own testbed system. Here's my impression of the
> > traces:
>
> Or, you could just give us a backtrace of the automount process when
> things go pear-shaped. See below.
>
> > /bin/mount used to have notorious problems locking /etc/mtab.
> > But I compare /etc/mtab with /proc/mounts before forking the directory
> > access process, and it was the same on several thousand comparisons with
> > only two unequal comparisons; in both cases the filesystem about to be
> > accessed (remounted) was in mtab and not /proc/mounts, and at most 8
> > seconds later it was in both and the content had been read. 2 minutes
> > after the second such event, and 38 minutes into the test run, client
> > processes started to hang.
>
> This is less of a problem these days, due to the fact that we've fixed
> the bugs we've found in util-linux and the fact that we don't use mtab
> anymore. ;)

v5 still does, but much less so than previously.

> >
> > Here are the particulars of our autofs setup.
> >
> > Distro:         OpenSuSE 10.3
> > Kernel:         2.6.22.17 (kernel-default-2.6.22.17-0.1)
> > Autofs:         5.0.2-30.2 (recompiled with the DNS timeout mitigation
> >                 patch that Ian Kent made for us) (and identical behavior
> >                 without the patch)
> > Mount program:  util-linux-2.12r+2.13rc2+git20070725-24.2 (/bin/mount)
> > NFS:            nfs-client-1.1.0-8 (/sbin/mount.nfs)
> >
> > =-- auto.master --- (comments omitted in all conf files)
> > /net    /etc/auto.net      <== giving trouble
> > /home   yp:auto.home
> >
> > =-- auto.net ---
> > *  -rsize=8192,wsize=8192,retry=1,soft,fstype=autofs,-DSERVER=&
> >    file:/etc/auto.net.generic
>
> A ha! Submounts! We're currently chasing a couple of issues in this
> area.
>
> > =------------- Output from DEFAULT_LOGGING=debug -------
> [snip]
>
> Jim, I'm not sure I see anything out of the ordinary in this snippet of
> the debug log. Can you search your logs for a message that contains,
> "ask umount returned busy"? If you see that, then we're looking at the
> same problem. If you don't, well, we'll have to get more information
> from you.

Also, we don't know what patches have been included in the SuSE release.
Any chance of providing a source rpm?

>
> For starters, can you install the autofs debuginfo package and attach to
> the running automounter (when in a bad state) and get the output from:
>
> gdb> thr a a bt
>
> ? That would be a great help.

I don't know if SuSE provide debuginfo packages but the thread trace is
useless without debug info.

The backtrace is the most effective way to identify a few known problems.
It's really important.

Ian

_______________________________________________
autofs mailing list
[email protected]
http://linux.kernel.org/mailman/listinfo/autofs
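
For anyone following along, the log search requested above ("ask umount
returned busy") could be done with something like the sketch below. It is
only an illustration: the function names are made up for this example, and
the log file location is not given anywhere in the thread, so the path has
to be supplied by the caller.

```python
# Minimal sketch: scan syslog text for the message quoted in this thread.
# NEEDLE is the phrase from the message above; everything else here
# (function names, the idea of passing a path) is a hypothetical helper,
# not part of autofs.

NEEDLE = "ask umount returned busy"


def find_busy_umounts(lines, needle=NEEDLE):
    """Return (line_number, text) for every line that contains needle."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if needle in line:
            hits.append((number, line.rstrip("\n")))
    return hits


def scan_log(path, needle=NEEDLE):
    """Open a log file (path supplied by the caller) and search it."""
    # errors="replace" keeps stray non-UTF-8 bytes from aborting the scan.
    with open(path, errors="replace") as log:
        return find_busy_umounts(log, needle)
```

Calling scan_log() with the path of the syslog file that holds the automount
debug output returns the matching lines with their line numbers; an empty
list means the message never appeared in that file.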
