Matt Lawrence wrote:
> It is an intentionally vague description since I want to keep an open
> mind.
>
> This needs to run on linux, both clients and servers.
>
> Right now I am dealing with an older HA linux setup with NFS. It seems
> to be having problems and there have been some ugly and expensive
> application failures. I am looking for a solution that will be a lot
> more reliable and robust.
>
> I am just starting my search for a replacement. The first things that
> come to mind are GPFS, GFS2 and GlusterFS. I am expecting that GFS2
> will require mounting remote drives via iSCSI that are mirrored; I'm
> not sure exactly how the resync would occur after a failure, so I
> could use some insight. I have a call scheduled with the Gluster folks
> to discuss how to do it with GlusterFS; I do know that it supports
> mirroring on the client end. I have never done a GPFS setup and
> haven't even touched one in many years.
>
> AFS isn't an option, the Kerberos and ticketing infrastructure isn't
> feasible to implement here.
>
> I have already suggested buying a NAS setup, but was shot down.
>
> What other solutions should I be looking at?
>
> -- Matt
>    It's not what I know that counts.
>    It's what I can remember in time to use.

GlusterFS is really academically cool and high performance with some gee-whiz features, but I'm not sure it's there yet if you want rock-solid enterprise reliability. That's not really its niche. I'd feel fine using it as a large scratch or research base where some downtime and other twiddling could be tolerated. It might even run absolutely fine, but I'd ask for references from people who have had it deployed in production at scale for at least 6 months.
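For reference, the client-side mirroring Matt mentions is Gluster's "replica" volume type, where the client itself writes to every replica. A minimal sketch of standing one up, with the hostnames (server1, server2), brick path, and volume name all made up for illustration:

    # From server1, peer the two storage nodes:
    gluster peer probe server2

    # Two-way replicated volume; each file lands on both bricks,
    # and the client side does the mirroring:
    gluster volume create testvol replica 2 \
        server1:/data/brick1 server2:/data/brick1
    gluster volume start testvol

    # Clients use the native FUSE mount and talk to both replicas:
    mount -t glusterfs server1:/testvol /mnt/testvol

    # After a failed brick comes back, self-heal re-syncs it; newer
    # releases expose this directly:
    gluster volume heal testvol
    gluster volume heal testvol info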
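On the quoted question about how resync would occur after a failure: if the iSCSI mirror were plain md RAID1 assembled on a single head node, the flow looks like the sketch below (portal addresses and device names are invented). One caveat: md is not cluster-aware, so for GFS2 mounted from several nodes at once you'd want LVM's cluster mirror (cmirror) rather than md doing the mirroring.

    # Log in to both iSCSI targets:
    iscsiadm -m discovery -t sendtargets -p 10.0.0.11
    iscsiadm -m discovery -t sendtargets -p 10.0.0.12
    iscsiadm -m node --login

    # Mirror the two LUNs (assume they appear as sdb and sdc):
    mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sdb /dev/sdc

    # A write-intent bitmap makes the eventual resync incremental:
    mdadm --grow /dev/md0 --bitmap=internal

    # If one target drops, md fails that leg and keeps running on the
    # other.  When the target returns, re-add the leg and md resyncs
    # only the stale regions (it's a full copy without the bitmap):
    mdadm /dev/md0 --re-add /dev/sdb
    cat /proc/mdstat    # watch resync progress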
GPFS is kind of the opposite. We have some GPFS and some Gluster. GPFS is an enterprise filesystem with the kind of locking, knobs for tweaking, and gazillion ancillary programs for doing everything that you expect from IBM. They have many options, published papers, and a multi-volume documentation set to go with it. We've had some interesting minor issues with it recently, but we've only had it deployed in production for a month now, after extensive testing. I'd like the failover to work a little quicker and more seamlessly, but the good news is that it does fail over, except for one pathological case where the 10G interface on a box went down in such a way that it wasn't really down. Since GPFS uses link state rather than a heartbeat, it didn't catch the failure, so the clients didn't automatically fail over to another GPFS/NFS head. When we rebooted the server, it recovered automatically. GPFS doesn't collapse under heavy load and seems to scale extremely well, plus the storage tiering capabilities are amazingly cool.
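If you end up testing that failover behavior yourself, GPFS's own admin commands are the quickest way to watch what the cluster thinks is happening while you pull cables; a quick sketch (the filesystem name gpfs0 is illustrative):

    # Cluster membership and quorum layout:
    mmlscluster

    # Per-node daemon state as GPFS sees it; in the half-dead-NIC
    # case above, the wedged node can still show up as "active":
    mmgetstate -a

    # Which node currently holds the filesystem manager role:
    mmlsmgr

    # Disk/NSD availability for a filesystem after a failure:
    mmlsdisk gpfs0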
