Hi Tim,

On Wed, Mar 24, 2010 at 09:55:39PM -0600, Tim Serong wrote:
> On 3/25/2010 at 05:59 AM, Ben Timby <[email protected]> wrote: 
> > Attached is a resource agent that I call exportfs. 
> >  
> > Rather than starting/stopping NFS, it uses exportfs to add/remove 
> > individual exports. 
> 
> Awesome, as Florian said :)
> 
> A couple of random comments...
> 
> > It also takes care to use cluster-wide unique fsid parameters for each 
> > export. It ensures that this fsid is migrated with the resource. 
> 
> An alternative to automating this would be to just push the burden of
> fsid assignment to the sysadmin (have the RA return $OCF_ERR_CONFIGURED
> if no fsid was explicitly specified).  Makes the code slightly simpler
> at the expense of some small administrative effort :)

I'd also rather have this done by the user than have the RA mess with
random numbers, syncing, and duplicate avoidance.
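For the record, the validation Tim suggests would be a few lines in the
RA's validate action. A minimal sketch, assuming the usual OCF
conventions (the function name and parameter name are illustrative):

```shell
# Hypothetical validate hook for the exportfs RA: refuse to run
# unless the admin explicitly supplied an fsid.
OCF_SUCCESS=0
OCF_ERR_CONFIGURED=6

exportfs_validate() {
    # OCF_RESKEY_fsid would come from the resource configuration.
    if [ -z "$OCF_RESKEY_fsid" ]; then
        echo "exportfs: required parameter fsid is not set" >&2
        return $OCF_ERR_CONFIGURED
    fi
    return $OCF_SUCCESS
}
```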

> Now for a little potential nastiness...  I did some work in this area
> a year or two ago, and at the time, we ran into some curious edge cases.
> Hopefully things have moved on a little since then in NFS-land (I was
> using SLES 10 SP2, from memory), but for reference, have a look at:
> 
>   http://marc.info/?l=linux-nfs&m=123175640421702&w=2
> 
> This describes an edge case where (depending on what the clients are
> doing), it's possible that running "exportfs -i" to export one directory
> will result in an interruption of service to an unrelated exported
> directory on the same node.

This sounds unexpected.

BTW, I guess this is a Linux-specific exportfs, right?
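For context, the per-export add/remove the RA performs boils down to two
exportfs invocations. A sketch under that assumption (the client spec,
path, and fsid are illustrative; EXPORTFS exists only so a stub can
stand in for the real binary):

```shell
# Per-export start/stop using Linux exportfs (illustrative names).
EXPORTFS=${EXPORTFS:-exportfs}

export_start() {
    # -i exports the directory directly, ignoring /etc/exports;
    # the fsid must stay unique cluster-wide so clients survive failover.
    "$EXPORTFS" -i -o "rw,fsid=$2" "$1"
}

export_stop() {
    # -u unexports just this one entry, leaving other exports alone.
    "$EXPORTFS" -u "$1"
}
```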

Cheers,

Dejan

> There's also a problem whereby you almost certainly can't rely on the
> return code from exportfs actually telling you the directory was exported
> successfully.  The only reason exportfs will fail is if you pass invalid
> options, and it's possible that exportfs will return before the export
> has actually appeared in /var/lib/nfs/etab (exportfs says "please kernel,
> export this when you get a chance, kthxbye").
> 
> We ran into these issues because we were doing failover testing while
> the system was under heavy load (continuous write of several GB, followed
> by reading the same data back for verification), while failing over
> multiple NFS exports from node to node.  You probably won't ever hit them
> unless the system is being severely hammered...  But I'd still recommend
> further testing along these lines, out of sheer paranoia.
> 
> Regards,
> 
> Tim
> 
> 
> -- 
> Tim Serong <[email protected]>
> Senior Clustering Engineer, OPS Engineering, Novell Inc.
> 
> 
> 
> _______________________________________________________
> Linux-HA-Dev: [email protected]
> http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
> Home Page: http://linux-ha.org/
