On Thu, 4 May 2006, Mike Marion wrote:

> Seeing some of our hosts in only one site having problems with hangs
> occurring.  Seems to be the same filer and even same paths, but what I
> see is odd.  The kernel rpciod thread is even stuck in state D,
> seemingly because the umount call is.  

I think that might be the other way around: umount is probably stuck
because rpciod is.
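
To see everything that is blocked, not just these two processes, a small
script can list every task in state D (uninterruptible sleep). This is only
a sketch: it assumes a Linux /proc and Python on the host, and the function
names are made up for illustration, not part of autofs or any other tool.

```python
import os


def parse_stat(stat_text):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the first '(' and the last ')'.
    open_paren = stat_text.index("(")
    close_paren = stat_text.rindex(")")
    pid = int(stat_text[:open_paren])
    comm = stat_text[open_paren + 1:close_paren]
    state = stat_text[close_paren + 1:].split()[0]
    return pid, comm, state


def uninterruptible_tasks(proc_root="/proc"):
    """Return a sorted list of (pid, comm) for tasks in state D."""
    tasks = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self" or "net"
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state = parse_stat(f.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited or its stat file is unreadable
        if state == "D":
            tasks.append((pid, comm))
    return sorted(tasks)


if __name__ == "__main__":
    for pid, comm in uninterruptible_tasks():
        print(pid, comm)
```

Running it a few times in a row shows whether the set of stuck tasks is
growing or whether it is always the same pair.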

> 
> i.e.
> root     20302  1.2  0.0  2468  584 ?        D    12:01   2:39 /bin/umount //usr/local/projects/dsp/qdsp6
> 
> root      6270  0.0  0.0     0    0 ?        D    Apr28   3:17 [rpciod]
> 
> unfortunately, once this happens, any new mounts will fail.  Can't even
> stat the path above via df.  Basically the whole NFS layer is stuck.

Tell us what the maps look like.

> 
> Using autofs-4.1.4 with 
> autofs-4.1.4-misc-fixes.patch 
> autofs-4.1.4-multi-parse-fix.patch
> autofs-4.1.4-non-replicated-ping.patch
> patches (slight possibility one of the above is missing, but I'm pretty
> damn sure they're in there).
> 
> Mounts are TCP based so I can't even use a spoofed interface to force a
> umount.  
> 
> Wondering why the extra / in the path on the umount call as well.  Also
> wondering if there's something in the filer (netapp) wrong that's giving
> some kind of response to the umount that's tickling a bug there.  Not
> much I've found online yet though.

The extra "/" is probably a bug, but the kernel ignores it in this
case.

> 
> Oh, and umount call shows socks in fd list that don't appear to exist
> anymore:
> :~# ls -l /proc/20302/fd
> total 3
> dr-x------  2 root root  0 May  4 15:26 .
> dr-xr-xr-x  3 root root  0 May  4 12:01 ..
> lrwx------  1 root root 64 May  4 15:26 0 -> /dev/null
> l-wx------  1 root root 64 May  4 15:26 1 -> pipe:[4528730]
> l-wx------  1 root root 64 May  4 15:26 2 -> pipe:[4528730]
> :~ # socklist | grep 4528730
> :~ #
> 
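
A note on the listing above: fds 1 and 2 point at pipe:[4528730], which is
an anonymous pipe, not a socket, and as far as I know socklist only reports
sockets, so the empty grep is expected. The sketch below tells the two kinds
of link target apart. It assumes a Linux /proc and Python on the host;
classify_fd_target and fd_targets are hypothetical helper names, not
existing tools.

```python
import os
import re

# Anonymous objects show up as "pipe:[inode]" or "socket:[inode]".
_ANON = re.compile(r"^(pipe|socket):\[(\d+)\]$")


def classify_fd_target(target):
    """Return (kind, inode) for a /proc/<pid>/fd link target."""
    match = _ANON.match(target)
    if match:
        return match.group(1), int(match.group(2))
    return "path", None  # a regular path such as /dev/null


def fd_targets(pid, proc_root="/proc"):
    """Map each open fd number of pid to (kind, inode, raw link target)."""
    fd_dir = os.path.join(proc_root, str(pid), "fd")
    result = {}
    for name in os.listdir(fd_dir):
        try:
            target = os.readlink(os.path.join(fd_dir, name))
            kind, inode = classify_fd_target(target)
            result[int(name)] = (kind, inode, target)
        except (OSError, ValueError):
            continue  # fd closed while scanning, or unexpected entry name
    return result


if __name__ == "__main__":
    for fd, info in sorted(fd_targets(os.getpid()).items()):
        print(fd, info)
```

Only fds reported as "socket" are worth looking up in socklist or netstat
output.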
> Problem happens on hosts using same autofs daemons with or without
> direct maps enabled.  Not really sure if it's technically an autofs
> issue (unless there's a glitch in how it's calling umount and it's
> timing there) or an NFS layer issue.
> 
> SLES9-SP1, kernel 2.6.5-7.147-smp (from suse-9.2 updates) on
> x86_64 hosts.
> 
> -- 
> Mike Marion-Unix SysAdmin/Staff Engineer-http://www.qualcomm.com
> Drew Carey: "Look, this is an odd question, but you're kind of cute and you're
> pretty nice to me. Are you drunk? It's OK if you are." => Drew Carey Show.
> 
> _______________________________________________
> autofs mailing list
> [email protected]
> http://linux.kernel.org/mailman/listinfo/autofs
> 
