Joe Pruett wrote:
> i had one of my servers get into the mode where automount is hung up doing 
> something.  i started attaching to each one and doing the gdb stack trace 
> you asked for.  here are the results.  after looking at the third one, 
> things cleared up.  hopefully we can figure something out on this.
>
> Script started on Fri 22 Aug 2008 01:11:35 PM PDT
> [EMAIL PROTECTED] ~]# ps axf | grep auto
>   1741 ?        Ssl  116:45 automount
> 16772 ?        S      0:00  \_ automount
> 16777 ?        S      0:00  \_ automount
> 20963 ?        S      0:00  \_ automount
> 21865 ?        S      0:00  \_ automount
> 23136 ?        S      0:00  \_ automount
> 25322 pts/0    S+     0:00                  \_ grep auto
> [EMAIL PROTECTED] ~]# gdb -p 1741 /usr/sbin/automount
>   

snip ...

> Attaching to program: /usr/sbin/automount, process 1741
> Loaded symbols for /usr/sbin/automount
> Reading symbols from /lib/libpthread.so.0...done.
> [Thread debugging using libthread_db enabled]
> [New Thread -1208218944 (LWP 1741)]
> [New Thread -1228944496 (LWP 23135)]
> [New Thread -1212564592 (LWP 21864)]
> [New Thread -1222640752 (LWP 20962)]
> [New Thread -1218868336 (LWP 18488)]
> [New Thread -1216767088 (LWP 16774)]
> [New Thread -1214665840 (LWP 16773)]
> [New Thread -1224742000 (LWP 16771)]
> [New Thread -1210463344 (LWP 1749)]
> [New Thread -1208362096 (LWP 1746)]
> [New Thread -1208292464 (LWP 1743)]
> [New Thread -1208222832 (LWP 1742)]
>   

snip ...

> 0x002ba402 in __kernel_vsyscall ()
> (gdb) thr a a bt
>   

snip ...

>
> Thread 7 (Thread -1214665840 (LWP 16773)):
> #0  0x002ba402 in __kernel_vsyscall ()
> #1  0x00711e2b in read () from /lib/libpthread.so.0
> #2  0x0024ef62 in do_spawn (logopt=0, options=0, prog=0xb7996bdd "/bin/mount",
>      argv=0xb7996b60) at /usr/include/bits/unistd.h:35
> #3  0x0024f8f5 in spawn_mount (logopt=0) at spawn.c:301
> #4  0x0068bb4d in mount_mount (ap=0x8c651b8, root=0x8c65298 "/disks",
>      name=0xb7996dd0 "hyperion.0", name_len=10, 
>      what=0xb7996da0 "hyperion.spiretech.com:/disk/0", fstype=0x1379e4 "nfs",
>      options=0xb7996df0 "udp,rsize=32768,wsize=32768", context=0x215280)
>      at mount_nfs.c:259
> #5  0x00128b85 in sun_mount (ap=0x8c651b8, root=0x8c65298 "/disks",
>      name=0xb7999108 "hyperion.0", namelen=10,
>      loc=0x8cd0420 "hyperion.spiretech.com:/disk/0", loclen=30,
>      options=0x8cd00e0 "udp,rsize=32768,wsize=32768", ctxt=0x8c5db38)
>      at parse_sun.c:638
> #6  0x00129ebc in parse_mount (ap=0x8c651b8, name=0xb7999108 "hyperion.0",
>      name_len=10,
>      mapent=0xb7999080 "-udp,rsize=32768,wsize=32768 hyperion.spiretech.com:/disk/0",
>      context=0x8c5db38) at parse_sun.c:1452
> #7  0x00ce8c6c in lookup_mount (ap=0x8c651b8, name=0x8ccfb50 "hyperion.0",
>      name_len=10, context=0x8c5db08) at lookup_yp.c:646
> #8  0x00250d99 in do_lookup_mount (ap=0x8c651b8, map=0x8c5dac0,
>      name=0x8ccfb50 "hyperion.0", name_len=10) at lookup.c:669
> #9  0x00251f13 in lookup_nss_mount (ap=0x8c651b8, source=0x0,
>      name=0x8ccfb50 "hyperion.0", name_len=10) at lookup.c:731
> #10 0x00249e9a in do_mount_indirect (arg=0x8ccfaf0) at indirect.c:835
> #11 0x0070b46b in start_thread () from /lib/libpthread.so.0
> #12 0x00424dbe in clone () from /lib/libc.so.6
>   

It looks like this thread is what's blocking the rest, and the trace
itself looks OK: it is sitting in read() inside do_spawn(), waiting on
the /bin/mount it spawned for hyperion.spiretech.com:/disk/0.
AFAICT there's no evidence in the backtrace that autofs itself is
deadlocked or waiting on a completion message that has been missed.
If mount(8) is waiting for a mount that's higher up in the tree then
everything else should also wait.
Without more information I'd have to say there's not much autofs can do
here.
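
As a rough illustration of why a thread can sit in read() for as long
as mount(8) takes, here is a minimal Python sketch of the
spawn-and-read-until-EOF pattern that the do_spawn frame suggests. It is
not autofs code: spawn_and_collect is a hypothetical name, and a
sleeping Python child stands in for a slow /bin/mount.

```python
import os
import subprocess
import sys
import time


def spawn_and_collect(argv):
    """Run argv, capture its stdout/stderr via a pipe, block until it exits.

    Hypothetical stand-in for the pattern visible in the backtrace
    (a read() inside do_spawn); this is not autofs's actual code.
    """
    read_fd, write_fd = os.pipe()
    start = time.monotonic()
    child = subprocess.Popen(argv, stdout=write_fd, stderr=write_fd)
    # The parent must close its copy of the write end, otherwise
    # read() below would never see EOF.
    os.close(write_fd)

    chunks = []
    while True:
        # Blocks for as long as the child keeps the pipe open. This is
        # the kind of wait the read() frame in the trace corresponds to.
        data = os.read(read_fd, 4096)
        if not data:
            break
        chunks.append(data)
    os.close(read_fd)

    status = child.wait()
    return status, b"".join(chunks).decode(), time.monotonic() - start


if __name__ == "__main__":
    # A slow child standing in for a mount that is waiting on the server.
    slow_child = [sys.executable, "-c",
                  "import time; time.sleep(1); print('mounted')"]
    status, output, elapsed = spawn_and_collect(slow_child)
    print(status, output.strip(), round(elapsed, 1))
```

The caller returns only once the child exits and the pipe reaches EOF,
so a child stuck waiting on the server keeps the calling thread parked
in read() without any deadlock in the caller itself.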

Ian

_______________________________________________
autofs mailing list
[email protected]
http://linux.kernel.org/mailman/listinfo/autofs
