Re: [autofs] Recovering from the loss of a NFS Server

Ian Kent Sun, 13 Mar 2011 01:19:06 -0800

On Sat, 2011-03-12 at 23:27 -0500, Breitman, Jason wrote:
> OS
>       Linux hostname 2.6.18-238.el5 #1 SMP Sun Dec 19 14:22:44 EST 2010 
> x86_64 x86_64 x86_64 GNU/Linux
> 
> autofs package
>       autofs-5.0.1-0.rc2.148.bz579312.1.el5
> 
> Mount options
>       $ cat /etc/auto.master
>       # Master map for automounter
>       #
>       /home             auto_home               -hard,intr,retry=10
> 
>       $ cat /etc/sysconfig/autofs
>       TIMEOUT=86400 - we have a long TIMEOUT to avoid mount storms.
> 
> What am I trying to do?
>       Prior to a disaster recovery test, my home directory will be mounted 
> from my-nfs-server.domainname:/home/jbreitma.
>       At this point my-nfs-server.domainname points to 1.1.1.1.
>       There are active reads and writes to my home directory.
>       Let's say I have a subdirectory called htdocs and am running apache.
> 
>       Now we are cut off from 1.1.1.1 because the Data Center where 1.1.1.1 
> lives is no longer accessible.
>       We simulate this with an ACL.
>       We now repoint my-nfs-server.domainname to 2.2.2.2.
> 
>       The NFS Clients where /home/jbreitma is mounted are now confused.
> 
>       What is my best course of action?
>               umount -l /home/jbreitma
>               /etc/init.d/autofs restart
>               fuser -k /home/jbreitma
>               kill -USR1 `pgrep automount`
>               etc ...


That's about all you can do.

The "umount -l" has its own set of problems.
In particular, any process whose working directory is inside the mount
must do a "cd ." (I believe that will work) to recover from the changed
mount; otherwise getcwd(3) will fail and /proc/<pid>/cwd will point to
"/" instead of a valid working directory.
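
To see which processes would need that "cd .", one option is to read the
/proc/<pid>/cwd links and keep the ones that point into the mount. This is
only a rough sketch: the function name pids_with_cwd_under and its optional
second argument (an alternate proc root, there purely so it can be tried
against a fake tree) are made up for illustration and are not part of
autofs or any other tool.

```shell
#!/bin/sh
# List PIDs whose current working directory is at or below a given path.
# Usage: pids_with_cwd_under DIR [PROC_ROOT]
# Both the function name and PROC_ROOT are hypothetical helpers for this sketch.
pids_with_cwd_under() {
    dir=${1%/}                 # strip one trailing slash, if any
    proc_root=${2:-/proc}      # defaults to the real /proc
    for link in "$proc_root"/[0-9]*/cwd; do
        # Links we cannot read (other users' processes, exited PIDs) are skipped.
        cwd=$(readlink "$link" 2>/dev/null) || continue
        case $cwd in
            "$dir"|"$dir"/*)
                pid=${link%/cwd}       # drop the trailing "/cwd"
                echo "${pid##*/}"      # keep only the PID component
                ;;
        esac
    done
}
```

For example, "pids_with_cwd_under /home/jbreitma" run as root prints one PID
per line. Since the cwd link is reported as "/" once the lazy umount has been
done, the list is only useful if it is taken before the "umount -l".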

Also, there is pretty much no way to get the RPC layer to give up on
those outstanding I/Os, which will cause ongoing problems.
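
Processes blocked on those I/Os normally show up in uninterruptible sleep
(state "D" in ps), so a quick filter over ps output gives a rough idea of
what is stuck. This is a sketch under that assumption: the helper name
d_state_tasks is invented here, and a "D" state only says the task is
waiting in the kernel, not that NFS is the cause.

```shell
#!/bin/sh
# Print the PID of every task whose state field starts with "D".
# Expects "PID STAT COMMAND" lines on stdin, for example from:
#   ps -eo pid=,stat=,comm=
# d_state_tasks is a hypothetical helper name used only in this sketch.
d_state_tasks() {
    awk '$2 ~ /^D/ { print $1 }'
}
```

Typical use would be "ps -eo pid=,stat=,comm= | d_state_tasks", comparing the
result before and after the recovery attempt.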

> 
>       How do I recover from this situation?   

There's not much you can do for read/write mounts, and even read-only
failover hasn't been implemented in the Linux kernel NFS client.

>       I am open to a new approach if that is required.

The only way I think high-availability NFS can work today is when the
backend deals with the change, as in clustered environments.

> 
> 
> I have had some success with umount -l /home/jbreitma followed by
> a /etc/init.d/autofs restart, but this does not always work.
> It specifically fails when active reads and/or writes are occurring
> to /home/jbreitma.
>               
> 
> Jason Breitman
> A&T-Tech-GTI
> jason.breit...@blackrock.com
> BlackRock
> 
> 
> _______________________________________________
> autofs mailing list
> autofs@linux.kernel.org
> http://linux.kernel.org/mailman/listinfo/autofs

