Dave,
Now when I tried to do something similar, I found that if you weren't on node1 or node2, the filesystem was read-only, so I had to do this:
/vol/vol1 -rw=node1:node2,root=node1,node2
/vol/vol1/foo1 -root=node1:node2
/vol/vol1/foo2 -root=node1:node2
On this one here, the top line is correct but the other two lines should be:
/vol/vol1/foo1 -rw,root=node1:node2
/vol/vol1/foo2 -rw,root=node1:node2
This way, the /vol/vol1 directory does not mount when you cd to /net/machine/vol/vol1, but the other two directories do mount and are accessible by all workstations that need to read and write to them. This should work under both RedHat 8 and Enterprise 3. I don't know why autofs4 seems to require the exports to be this way on a Netapp box when Solaris didn't seem to care, but this is what is working for us.
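A small sketch of how export lines like these break down, assuming (as this thread suggests) that a bare -rw means read-write for every host while -rw=hostlist limits write access to the listed hosts. The helper names parse_export and rw_scope are made up for illustration and are not part of any Netapp tool:

```python
def parse_export(line):
    """Split one export line into (path, {option: [hosts]})."""
    path, _, opt_str = line.strip().partition(" ")
    options = {}
    # Options follow a leading "-" and are comma-separated; host lists use ":".
    for item in opt_str.strip().lstrip("-").split(","):
        if not item:
            continue
        name, sep, hosts = item.partition("=")
        options[name] = hosts.split(":") if sep else []
    return path, options


def rw_scope(options):
    """Describe who gets read-write access, under the assumption stated above."""
    if "rw" not in options:
        return "not stated"
    hosts = options["rw"]
    return "all hosts" if not hosts else "only " + ", ".join(hosts)


if __name__ == "__main__":
    examples = (
        "/vol/vol1/foo1 -rw,root=node1:node2",
        # Illustrative variant (not from the thread) with a host-limited rw:
        "/vol/vol1/foo1 -rw=node1:node2,root=node1:node2",
    )
    for line in examples:
        path, opts = parse_export(line)
        print(path, "->", rw_scope(opts))
```

This only inspects the text of an export line; it does not say anything about how the filer itself evaluates the options.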
Dwight Marzolf
David Meleedy wrote:
Hi Ian & Jeff,
I am trying to track down an autofs issue that has been plaguing us. It seems to be caused by the interaction of autofs version 4 with a Network Appliance server, and cd'ing to /net directories on the Netapp server.
A similar issue was seen at Analog Devices under Redhat 8, and apparently the problem was worked around by Dwight Marzolf with Ian Kent's help. Following what Dwight did, I have been trying to recreate the fix for Redhat Enterprise 3 update 3, and so far have not met with success.
THE PROBLEM DESCRIPTION:
Autofs hangs and refuses to mount any directories for a period of time after cd'ing to /net/<Netapp>/vol/vol[0-3] and waiting a while. The only way to clear this is to reboot the client.
Initially we started using the following software (Redhat Enterprise 3 update 3):
autofs 4.1.3-12
kernel 2.4.21-20
nfs-utils 1.0.6-31EL
WHAT HAS BEEN TRIED SO FAR:
Mike Waychison, after seeing the messages from our log file, said:
"These messages are due to starvation for reserved ports (< 1024). Specifically, the kernel will only use ports < 800. Currently, the kernel uses one port per nfs filesystem. If you mount filesystems very fast, then you can also run out of reserved ports as the local (mountd iirc?) will close tcp sessions and each must wait 2 minutes before being released.
One solution is to try out the patch I posted last week that allows nfs mounts to share tcp/udp connections:
http://marc.theaimsgroup.com/?l=linux-nfs&m=110261671705396&w=2 "
The problem is that we are running a 2.4 kernel, and his patch was for the 2.6 kernel. Also, although his patch might increase the number of ports available, I think it does not really solve the problem; it just gives more breathing room.
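One rough way to watch for the reserved-port starvation Mike describes is to count the sockets whose local port is below 1024, grouped by TCP state. The sketch below parses text in the /proc/net/tcp layout (hex address:port in the second column, hex state in the fourth). It assumes the NFS and mount sockets of interest are listed there; UDP sockets would need /proc/net/udp, which uses the same layout. The function name and the sample data are made up for illustration:

```python
# State codes used in /proc/net/tcp (only the ones relevant here).
TCP_STATES = {"01": "ESTABLISHED", "06": "TIME_WAIT", "0A": "LISTEN"}


def count_reserved_ports(proc_net_text, limit=1024):
    """Count sockets with a local port below `limit`, grouped by state."""
    counts = {}
    for line in proc_net_text.splitlines():
        fields = line.split()
        # Data rows start with an index such as "0:"; skip the header and blanks.
        if len(fields) < 4 or not fields[0].endswith(":"):
            continue
        port = int(fields[1].rsplit(":", 1)[1], 16)
        if port < limit:
            state = TCP_STATES.get(fields[3], fields[3])
            counts[state] = counts.get(state, 0) + 1
    return counts


# Made-up sample rows: local ports 800, 799 and 8080.
SAMPLE = """\
  sl  local_address rem_address   st
   0: 0100007F:0320 0200007F:0801 01
   1: 0100007F:031F 0200007F:0801 06
   2: 0100007F:1F90 00000000:0000 0A
"""

if __name__ == "__main__":
    print(count_reserved_ports(SAMPLE))
    print(count_reserved_ports(SAMPLE, limit=800))
```

On a live client the same function could be fed the contents of /proc/net/tcp; a steadily growing count across automount timeouts would be consistent with the port leak.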
After talking with Jeff Moyer about the issue, I updated autofs to autofs-4.1.3-67. This was supposed to incorporate a patch that fixes the port leak problem.
This did not solve the problem, but it did seem to improve things a bit.
After looking at Dwight Marzolf's document on his workaround I found the following information (this is exactly the same sort of thing we are seeing too):
" we quickly found that if you did a cd via /net to one of our Network Appliance filers (all our other netapp filers worked correctly when unmounting /net mounts), the port release issue still existed. In fact, the mountpoints actively took more ports. This meant that if you mounted this filer with /net, your workstation could be rendered useless in less than 24 hours. It also became evident that this active taking of ports by this filer was not limited to just autofs-4.1.3-28 but also earlier versions of autofs ... Further research revealed the ports were being taken at the point of automount timeout. When the automounter had declared these mountpoints to be timed out and ready to be unmounted and attempted to umount them, in fact, it ended up remounting them, using new ports for the remount ... "
HOW TO REPRODUCE THE PROBLEM:
Actually in our case we can render a machine useless in just about an hour or two, and this happens for all of our Netapp filers. The procedure to do this is reproducible.
1) You cd to a /net directory on the filer.
2) Leave the shell in that /net directory for about 15 minutes to half an hour, and watch the "BUG" messages in the /var/log/messages file.
3) Log out (so the automounter tries to unmount everything that was mounted).
4) Log in again after 30 minutes, and by then you won't be able to mount anything anymore.
You can replace steps 3 and 4 with "init 6". When the automounter process is stopped by init, you will see the port messages scroll up the console screen.
EXAMPLE OF REPRODUCING THE PROBLEM:
codered-51: cd /net/aflac/vol/vol2 ( I can't help but wonder if this BUG message that shows up once a minute is indicative of a problem )
codered-52: tail -f /var/log/messages
Jan 11 15:32:37 codered automount[6214]: attempting to mount entry /net/aflac
Jan 11 15:33:41 codered automount[7915]: BUG: /net/aflac/vol/vol2 already mounted
Jan 11 15:34:42 codered automount[8049]: BUG: /net/aflac/vol/vol2 already mounted
Jan 11 15:36:42 codered automount[8311]: BUG: /net/aflac/vol/vol2 already mounted
Jan 11 15:37:43 codered automount[8441]: BUG: /net/aflac/vol/vol2 already mounted
... (continues once a minute to print out this bug) ...
codered-53: sudo init 6
(after reboot log in to see error messages)
THE REALLY WEIRD PART:
Now the interesting thing here is that the machine is rebooting, so there is no program requesting additional mounts, yet in the log files you can see that mounts are attempted for almost every subdirectory of /vol/vol1, /vol/vol2 and /vol/vol3, even though the only thing that should be happening is an unmount of the directory aflac:/vol/vol2.
jetcar-189: cd /net/aflac/vol/vol3
jetcar-190: ls
ad1983/ cad_archive/ emerald/ layout_old/ ta/
archive/ design/ is_013std/ lx3/
jetcar-191: cd ../vol2
jetcar-192: ls
9xcores/ danube/ nwd_layout/ ulc3/
DSPS_Finance/ gpdsp_PLD/ nwd_testmgr/ win2k/
WWM/ gpdsp_marketing/ pc_backups/
bitpower/ india_mirror/ sh/
bluetooth/ nile/ spitfire/
jetcar-194: cd ../vol1
jetcar-195: ls
IssueManager/ diablo/ is_013std/ ras/ tigersharc/
admin/ ed/ jordan/ soft/
archive/ fsp/ nwd_fsp@ teton_lite/
cpd/ herc_eval/ pe_workspace/ thor/
codered-54: less /var/log/messages
Jan 11 15:51:14 codered automount[6214]: can't shutdown: filesystem /net still busy
Jan 11 15:51:17 codered autofs: automount -USR2 succeeded
Jan 11 15:51:19 codered automount[6214]: can't shutdown: filesystem /net still busy
Jan 11 15:51:20 codered autofs: automount -USR2 succeeded
Jan 11 15:51:23 codered autofs: automount -USR2 succeeded
Jan 11 15:51:26 codered autofs: automount -USR2 succeeded
Jan 11 15:51:26 codered automount[6214]: can't shutdown: filesystem /net still busy
Jan 11 15:51:28 codered automount[14708]: >> mount: wrong fs type, bad option, bad superblock on aflac:/vol/vol2/spitfire,
Jan 11 15:51:28 codered automount[14708]: >> or too many mounted file systems
Jan 11 15:51:28 codered automount[14708]: mount(nfs): nfs: mount failure aflac:/vol/vol2/spitfire on /net/aflac/vol/vol2/spitfire
Jan 11 15:51:28 codered kernel: RPC: Can't bind to reserved port (98).
Jan 11 15:51:28 codered kernel: nfs_get_root: getattr error = 5
Jan 11 15:51:28 codered kernel: RPC: Can't bind to reserved port (98).
Jan 11 15:51:28 codered kernel: nfs_get_root: getattr error = 5
Jan 11 15:51:28 codered kernel: nfs_read_super: get root inode failed
Jan 11 15:51:28 codered kernel: nfs warning: mount version older than kernel
Jan 11 15:51:28 codered kernel: RPC: Can't bind to reserved port (98).
Jan 11 15:51:28 codered kernel: nfs_get_root: getattr error = 5
Jan 11 15:51:28 codered kernel: nfs_read_super: get root inode failed
Jan 11 15:51:28 codered automount[14708]: >> mount: wrong fs type, bad option, bad superblock on aflac:/vol/vol2/ulc3,
Jan 11 15:51:28 codered automount[14708]: >> or too many mounted file systems
Jan 11 15:51:28 codered automount[14708]: mount(nfs): nfs: mount failure aflac:/vol/vol2/ulc3 on /net/aflac/vol/vol2/ulc3
...
This same pattern of error messages repeats for (in this order)
aflac:/vol/vol2/win2k
aflac:/vol/vol3/ad1983
aflac:/vol/vol3/archive
aflac:/vol/vol3/cad_archive
aflac:/vol/vol3/design
aflac:/vol/vol3/emerald
aflac:/vol/vol3
aflac:/vol/vol3/is_013std
aflac:/vol/vol3/layout_old
aflac:/vol/vol3/lx3
aflac:/vol/vol3/ta
aflac:/vol/vol2/DSPS_Finance
aflac:/vol/vol2
aflac:/vol/vol2/gpdsp_marketing
aflac:/vol/vol2/gpdsp_PLD
aflac:/vol/vol2/india_mirror
aflac:/vol/vol2/nile
aflac:/vol/vol2/nwd_layout
aflac:/vol/vol2/nwd_testmgr
aflac:/vol/vol2/pc_backups
aflac:/vol/vol2/sh
aflac:/vol/vol2/spitfire
(repeats the whole thing again)
eventually gets to vol1:
...
aflac:/vol/vol3/ta
aflac:/vol/vol1/pe_workspace
aflac:/vol/vol1/ras
aflac:/vol/vol1/soft
aflac:/vol/vol1/teton_lite
aflac:/vol/vol1/thor
aflac:/vol/vol1/tigersharc
aflac:/vol/vol2/9xcores
aflac:/vol/vol2/bitpower
aflac:/vol/vol2/bluetooth
aflac:/vol/vol2/danube
aflac:/vol/vol2/DSPS_Finance
...
(repeats the whole thing again)...
Jan 11 15:51:37 codered automount[15971]: rm_unwanted: /net/aflac/vol/vol3/ta
Jan 11 15:51:37 codered automount[15971]: rm_unwanted: /net/aflac/vol/vol3/lx3
Jan 11 15:51:37 codered automount[15971]: rm_unwanted: /net/aflac/vol/vol3/layout_old
Jan 11 15:51:37 codered automount[15971]: rm_unwanted: /net/aflac/vol/vol3/is_013std
Jan 11 15:51:37 codered automount[15971]: rm_unwanted: /net/aflac/vol/vol3
Jan 11 15:51:37 codered automount[15971]: rm_unwanted: /net/aflac/vol/vol2/win2k
Jan 11 15:51:37 codered automount[15971]: rm_unwanted: /net/aflac/vol/vol2/ulc3
Jan 11 15:51:37 codered automount[15971]: rm_unwanted: /net/aflac/vol/vol2/spitfire
Jan 11 15:51:37 codered automount[15971]: rm_unwanted: /net/aflac/vol/vol2/sh
Jan 11 15:51:37 codered automount[15971]: rm_unwanted: /net/aflac/vol/vol2/pc_backups
Jan 11 15:51:37 codered automount[15971]: rm_unwanted: /net/aflac/vol/vol2/nwd_testmgr
Jan 11 15:51:37 codered automount[15971]: rm_unwanted: /net/aflac/vol/vol2/nwd_layout
Jan 11 15:51:37 codered automount[15971]: rm_unwanted: /net/aflac/vol/vol2/nile
Jan 11 15:51:37 codered automount[15971]: rm_unwanted: /net/aflac/vol/vol2/india_mirror
Jan 11 15:51:37 codered automount[15971]: rm_unwanted: /net/aflac/vol/vol2/gpdsp_marketing
Jan 11 15:51:37 codered automount[15971]: rm_unwanted: /net/aflac/vol/vol2/gpdsp_PLD
Jan 11 15:51:37 codered automount[15971]: rm_unwanted: /net/aflac/vol/vol2
Jan 11 15:51:37 codered automount[15971]: rm_unwanted: /net/aflac/vol/vol1/tigersharc
Jan 11 15:51:37 codered automount[15971]: rm_unwanted: /net/aflac/vol/vol1/thor
Jan 11 15:51:37 codered automount[15971]: rm_unwanted: /net/aflac/vol/vol1/teton_lite
Jan 11 15:51:37 codered automount[15971]: rm_unwanted: /net/aflac/vol/vol1/soft
Jan 11 15:51:37 codered automount[15971]: rm_unwanted: /net/aflac/vol/vol1/ras
Jan 11 15:51:37 codered automount[15971]: rm_unwanted: /net/aflac/vol/vol1/pe_workspace
Jan 11 15:51:37 codered automount[15971]: rm_unwanted: /net/aflac/vol/vol1/jordan
Jan 11 15:51:37 codered automount[15971]: rm_unwanted: /net/aflac/vol/vol1/is_013std
Jan 11 15:51:37 codered automount[15971]: rm_unwanted: /net/aflac/vol/vol1/herc_eval
Jan 11 15:51:37 codered automount[15971]: rm_unwanted: /net/aflac/vol/vol1/fsp
Jan 11 15:51:37 codered automount[15971]: rm_unwanted: /net/aflac/vol/vol1/IssueManager
Jan 11 15:51:37 codered automount[15971]: rm_unwanted: /net/aflac/vol/vol1
Jan 11 15:51:37 codered automount[15971]: rm_unwanted: /net/aflac/vol/vol0
Jan 11 15:51:37 codered automount[15971]: rm_unwanted: /net/aflac/vol
Jan 11 15:51:37 codered automount[15971]: rm_unwanted: /net/aflac
Jan 11 15:51:37 codered automount[15971]: expired /net/aflac
Jan 11 15:51:37 codered automount[15974]: rm_unwanted: /net/aflac/vol/vol3
Jan 11 15:51:37 codered automount[15974]: rm_unwanted: /net/aflac/vol/vol2
Jan 11 15:51:37 codered automount[15974]: rm_unwanted: /net/aflac/vol/vol1
Jan 11 15:51:37 codered automount[15974]: rm_unwanted: /net/aflac/vol
Jan 11 15:51:37 codered automount[15974]: rm_unwanted: /net/aflac
Jan 11 15:51:37 codered automount[15974]: expired /net/aflac
Jan 11 15:51:37 codered automount[15975]: rm_unwanted: /net/aflac/vol/vol3
Jan 11 15:51:37 codered automount[15975]: rm_unwanted: /net/aflac/vol/vol2
Jan 11 15:51:37 codered automount[15975]: rm_unwanted: /net/aflac/vol/vol1
Jan 11 15:51:37 codered automount[15975]: rm_unwanted: /net/aflac/vol
Jan 11 15:51:37 codered automount[15975]: rm_unwanted: /net/aflac
Jan 11 15:51:37 codered automount[15975]: expired /net/aflac
Jan 11 15:51:37 codered automount[15976]: rm_unwanted: /net/aflac/vol/vol3
Jan 11 15:51:37 codered automount[15976]: rm_unwanted: /net/aflac/vol/vol2
Jan 11 15:51:37 codered automount[15976]: rm_unwanted: /net/aflac/vol/vol1
Jan 11 15:51:37 codered automount[15976]: rm_unwanted: /net/aflac/vol
Jan 11 15:51:37 codered automount[15976]: rm_unwanted: /net/aflac
Jan 11 15:51:37 codered automount[15976]: expired /net/aflac
Jan 11 15:51:37 codered automount[15977]: rm_unwanted: /net/aflac/vol/vol3
Jan 11 15:51:37 codered automount[15977]: rm_unwanted: /net/aflac/vol/vol2
Jan 11 15:51:37 codered automount[15977]: rm_unwanted: /net/aflac/vol/vol1
Jan 11 15:51:37 codered automount[15977]: rm_unwanted: /net/aflac/vol
Jan 11 15:51:37 codered automount[15977]: rm_unwanted: /net/aflac
Jan 11 15:51:37 codered automount[15977]: expired /net/aflac
Jan 11 15:51:38 codered automount[15978]: rm_unwanted: /net/aflac/vol/vol3
Jan 11 15:51:38 codered automount[15978]: rm_unwanted: /net/aflac/vol/vol2
Jan 11 15:51:38 codered automount[15978]: rm_unwanted: /net/aflac/vol/vol1
Jan 11 15:51:38 codered automount[15978]: rm_unwanted: /net/aflac/vol
Jan 11 15:51:38 codered automount[15978]: rm_unwanted: /net/aflac
Jan 11 15:51:38 codered automount[15978]: expired /net/aflac
Jan 11 15:51:38 codered autofs: automount -USR2 succeeded
Jan 11 15:51:38 codered automount[15986]: rm_unwanted: /net/aflac/vol/vol3
Jan 11 15:51:38 codered automount[15986]: rm_unwanted: /net/aflac/vol/vol2
Jan 11 15:51:38 codered automount[15986]: rm_unwanted: /net/aflac/vol/vol1
Jan 11 15:51:38 codered automount[15986]: rm_unwanted: /net/aflac/vol
Jan 11 15:51:38 codered automount[15986]: rm_unwanted: /net/aflac
Jan 11 15:51:38 codered automount[15986]: expired /net/aflac
Jan 11 15:51:39 codered automount[6214]: can't shutdown: filesystem /net still busy
.... (keeps repeating) ....
Jan 11 15:51:45 codered automount[6214]: can't shutdown: filesystem /net still busy
Jan 11 15:51:47 codered autofs: automount shutdown failed
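For anyone comparing logs from several clients, the message types seen above can be tallied with a short script. This is only a sketch: the regular expressions are written against the exact message wording quoted in this report, and the category names are made up for illustration:

```python
import re
from collections import Counter

# One pattern per message type that appears in the excerpts above.
PATTERNS = {
    "already_mounted": re.compile(r"BUG: \S+ already mounted"),
    "port_bind_failure": re.compile(r"RPC: Can't bind to reserved port"),
    "mount_failure": re.compile(r"mount\(nfs\): nfs: mount failure \S+ on"),
    "rm_unwanted": re.compile(r"rm_unwanted: \S+"),
    "still_busy": re.compile(r"can't shutdown: filesystem \S+ still busy"),
}


def summarize(lines):
    """Count how many log lines fall into each known message type."""
    counts = Counter()
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
                break  # each line is counted at most once
    return counts


if __name__ == "__main__":
    import sys
    # Usage (path is an example): python summarize.py /var/log/messages
    with open(sys.argv[1], errors="replace") as log:
        for name, count in summarize(log).most_common():
            print(f"{count:8d}  {name}")
```

A large port_bind_failure or mount_failure count during shutdown, when nothing should be requesting mounts, would match the behavior described here.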
HOW IT WAS FIXED IN REDHAT 8:
Dwight had implemented his fix in 3 steps for Redhat 8:
1) He updated his autofs to autofs-4.1.3-28 which had the port leak fix
2) He patched his kernel with the autofs4-2.4.20-20040508.patch
(is some equivalent patch needed for Redhat Enterprise 3, which uses kernel 2.4.21-20?)
3) He changed the way he exported filesystems from the Netapp:
"The last issue was the matter of how /vol/vol0 is exported from a Network Appliance filer. We found that the following exports broke autofs4:
/vol/vol0 -root=node1:node2:node3:node4
/vol/vol0 -rw,root=node1:node2:node3
/vol/vol0 -anon=0
The export syntax that worked was:
/vol/vol0 -rw=node1:node2,root=node1,node2 "
WHAT HAPPENED WHEN I TRIED THE REDHAT 8 WORKAROUND:
Now when I tried to do something similar, I found that if you weren't on node1 or node2, the filesystem was read-only, so I had to do this:
/vol/vol1 -rw=node1:node2,root=node1,node2
/vol/vol1/foo1 -root=node1:node2
/vol/vol1/foo2 -root=node1:node2
This way, if you cd to /net/filer/vol/vol1 it was read-only for most machines, but if you cd'd to /net/filer/vol/vol1/foo1 it was read-write.
So using the Netapp export workaround that fixed the Redhat 8 autofs4 problem, plus using autofs-4.1.3-67, has not yet solved the problem for our Redhat Enterprise 3 clients.
CONCLUSION:
I hope this is enough info to track down this problem. It appears as though the interaction of using /net with a Netapp is causing spurious mounts, and unmounting is not working. I will assist with any patch tests that you require, so let me know, and I will be able to verify any fixes.
Thanks,
-Dave
________________________________________________________________________
David Meleedy             Analog Devices, Inc.
[EMAIL PROTECTED]         Three Technology Way
Phone: 781 461 3494       Norwood, MA 02062-9106 USA
_______________________________________________
autofs mailing list
[email protected]
http://linux.kernel.org/mailman/listinfo/autofs
