Awesome. I've wanted someone to write this for a while. :) Can you please subscribe to linux-ha-dev so we can do a proper patch review there?
Cheers,
Florian

On 03/24/2010 05:28 PM, Ben Timby wrote:
> I would like some opinions on the OCF RA I wrote. I needed to support
> an active-active setup for NFS, and googling found me no working
> solution, so I put one together. I have read these list archives and
> various resources around the 'net while putting this together. My
> testing is favorable so far, but I would like to ask the experts. I
> wrote up a description of my solution on my blog; the RA is linked
> from there. I will copy the text and link in this email. I am using
> Heartbeat 3 and Pacemaker on CentOS 5.4.
>
> http://ben.timby.com/?p=109
> ------
> I have need for an active-active NFS cluster. For review, an
> active-active cluster is two boxes that export two resources (one
> each). Each box acts as a backup for the other box’s resource. This
> way, both boxes actively serve clients (albeit for different NFS
> exports).
>
> The first problem I ran into with this setup is that the nfsserver OCF
> resource agent that comes with Heartbeat is not suitable. This is
> because it works by stopping/starting the NFS server via its init
> script. For my situation, NFS will always be running; I just want to
> add/remove exports on failover.
>
> Adding and removing exports is fairly easy under Linux, using the
> exportfs command:
>
> $ exportfs -o rw,sync,mp 192.168.1.0/24:/mnt/fs/to/export
>
> The options correspond to those you would place into /etc/exports, and
> the rest is the host:/path portion, also as it would go into
> /etc/exports. To remove an export, you specify the following:
>
> $ exportfs -u 192.168.1.0/24:/mnt/fs/to/export
>
> Therefore, what I needed was an OCF RA that manages NFS exports using
> exportfs. I wrote one, and it is available at the link below.
>
> http://ben.timby.com/pub/exportfs.txt
>
> However, there are two remaining issues.
>
> The first is that when you export a file system via NFS, a unique fsid
> is generated for that file system.
> The client machines that mount the exported file system use this id
> to generate handles to directories/files. This fsid is generated
> using the major/minor of the device being exported. This is a problem
> for me, as the device being exported is a DRBD volume with LVM on top
> of it. This means that when the LVM OCF RA fails over the LVM
> volgroup, the major/minor will change. In fact, the first device on
> my system had a minor of 4. This was true of both nodes. If a
> resource migrates, it cannot receive the minor 4, as the existing
> volgroup already occupies 4. This means that the fsid will change for
> the exported file system, and all client file handles become stale
> after failover.
>
> To fix this, each exported file system needs a unique fsid option
> passed to exportfs:
>
> $ exportfs -o rw,sync,mp,fsid=1 192.168.1.0/24:/mnt/fs/to/export
>
> Note that fsid=0 has special meaning in NFSv4, so avoid it unless you
> read the docs and understand its special use. I have taken care of
> this in my RA by generating a random fsid in case one is not already
> assigned. This random fsid is then written to the DRBD device and
> used on the other node when the file system is exported. This way,
> the fsid is both unique and persistent (it remains the same on the
> other node after failover).
>
> The other problem is that the /var/lib/nfs/rmtab file needs to be
> synchronized. This file contains the clients that have mounted the
> exported file system. Again, I handle this in my RA by saving the
> relevant rmtab entries onto the DRBD device and restoring them to the
> other node’s rmtab file. I also remove these entries from the node on
> which the resource is stopped.
>
> This gives me a smooth failover of NFS from one node to the other and
> back again. To use my RA, simply install it onto your cluster nodes
> at:
>
> /usr/lib/ocf/resource.d/custom/exportfs
>
> Then you can create a resource using that RA. It requires three
> parameters:
>
> 1. exportfs_dir - the directory to export.
> 2. exportfs_clientspec - the client specification to export to
>    (e.g. 192.168.1.0/24).
> 3. exportfs_options - the options as you would specify in
>    /etc/exports.
>
> If you provide an fsid in the exportfs_options param, that value will
> be honored; the random fsid is only generated when fsid is absent.
>
> This seems to work perfectly on my cluster running CentOS 5.4; I
> tested using an Ubuntu 9.10 client.
>
> ** Update **
>
> I posted a new version of the OCF RA. The problem was that the old
> version only backed up rmtab when the resource was being stopped.
> Needless to say, this only covers the graceful failover scenario; if
> the service dies, the backup is never made. I have remedied this by
> spawning a process that continually backs up rmtab. This process is
> then killed when the resource is stopped. This should cover resource
> failures as well as graceful failovers.
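For anyone following along, the exportfs add/remove pair quoted above maps naturally onto an RA's start/stop actions. A minimal sketch, assuming the three parameter names from the description (the OCF_RESKEY_ prefix is how Heartbeat/Pacemaker passes RA parameters); the run() indirection only echoes the commands so this can be tried without root:

```shell
#!/bin/sh
# Minimal sketch: wrap the exportfs calls from the post as the RA's
# start/stop actions. The OCF_RESKEY_* names mirror the three
# parameters the RA takes; run() only echoes the command instead of
# executing it, so this is safe to experiment with.

OCF_RESKEY_exportfs_dir="/mnt/fs/to/export"
OCF_RESKEY_exportfs_clientspec="192.168.1.0/24"
OCF_RESKEY_exportfs_options="rw,sync,mp"

run() { echo "$*"; }    # swap the echo for "$@" to actually execute

exportfs_start() {
    run exportfs -o "$OCF_RESKEY_exportfs_options" \
        "$OCF_RESKEY_exportfs_clientspec:$OCF_RESKEY_exportfs_dir"
}

exportfs_stop() {
    run exportfs -u \
        "$OCF_RESKEY_exportfs_clientspec:$OCF_RESKEY_exportfs_dir"
}

exportfs_start   # prints: exportfs -o rw,sync,mp 192.168.1.0/24:/mnt/fs/to/export
exportfs_stop    # prints: exportfs -u 192.168.1.0/24:/mnt/fs/to/export
```

A real RA would of course also implement monitor, validate-all, and meta-data actions and return proper OCF exit codes; this only shows the core exportfs plumbing.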
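The fsid persistence described above can be sketched as follows: generate a random fsid once, store it on the DRBD-backed file system itself, and reuse the stored value on whichever node exports next. The file name .exportfs_fsid and the urandom-based generator are illustrative assumptions, not necessarily what the actual RA does:

```shell
#!/bin/sh
# Sketch of the fsid persistence idea from the post. The fsid lives in
# a dotfile on the exported (DRBD-backed) file system, so the takeover
# node reads the same value and client file handles stay valid.

EXPORT_DIR="${EXPORT_DIR:-$(mktemp -d)}"   # real use: the mounted DRBD fs

get_fsid() {
    fsid_file="$EXPORT_DIR/.exportfs_fsid"
    if [ -f "$fsid_file" ]; then
        cat "$fsid_file"                   # reuse the persisted fsid
    else
        # pick a random fsid in 1..65535; fsid=0 is special in NFSv4
        fsid=$(( $(od -An -N2 -tu2 /dev/urandom) % 65535 + 1 ))
        echo "$fsid" > "$fsid_file"
        echo "$fsid"
    fi
}

fsid=$(get_fsid)
echo "would run: exportfs -o rw,sync,mp,fsid=$fsid 192.168.1.0/24:$EXPORT_DIR"
```

Because the fsid file lives on the replicated volume, both nodes see the same value; an fsid already supplied in exportfs_options would simply bypass this helper.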
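The rmtab handling from the update can be sketched the same way: a background loop continually copies this export's rmtab entries onto the DRBD-backed file system and is killed when the resource stops, while the takeover node merges the saved entries back in. The file names and the 5-second interval here are illustrative assumptions:

```shell
#!/bin/sh
# Sketch of the continual rmtab backup described in the update.
# rmtab lines have the form client:path:counter, so grepping for
# ":$EXPORT_DIR:" keeps only the entries for this export.

RMTAB="/var/lib/nfs/rmtab"
EXPORT_DIR="/mnt/fs/to/export"
BACKUP="$EXPORT_DIR/.rmtab.backup"
PIDFILE="$EXPORT_DIR/.rmtab_backup.pid"

backup_rmtab() {
    # save this export's rmtab entries onto the DRBD-backed fs
    grep ":$EXPORT_DIR:" "$RMTAB" > "$BACKUP" 2>/dev/null || :
}

monitor_start() {
    # spawn the continual backup loop and remember its pid
    ( while :; do backup_rmtab; sleep 5; done ) &
    echo $! > "$PIDFILE"
}

monitor_stop() {
    # kill the loop when the resource is stopped
    [ -f "$PIDFILE" ] && kill "$(cat "$PIDFILE")" 2>/dev/null
    rm -f "$PIDFILE"
}

restore_rmtab() {
    # on the node taking over, merge the saved entries back into rmtab
    [ -f "$BACKUP" ] && sort -u "$BACKUP" "$RMTAB" > "$RMTAB.new" \
        && mv "$RMTAB.new" "$RMTAB"
}
```

Running the backup from a loop rather than only in the stop action is what covers the hard-failure case: the copy on the DRBD device is at most one interval stale when the node dies.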
_______________________________________________ Linux-HA mailing list [email protected] http://lists.linux-ha.org/mailman/listinfo/linux-ha See also: http://linux-ha.org/ReportingProblems
