Hi Erich,

I'd say that the GPFS failure groups are a good example of exactly what I'm talking about.

From [1]:

---

GPFS failover support allows you to organize your hardware into a number of failure groups. A failure group is a set of disks that share a common point of failure that could cause them all to become simultaneously unavailable. When used in conjunction with the replication feature of GPFS, the creation of multiple failure groups provides for increased file availability should a group of disks fail. GPFS maintains each instance of replicated data and metadata on disks in different failure groups. Should a set of disks become unavailable, GPFS fails over to the replicated copies in another failure group.

During configuration, you assign a replication factor to indicate the total number of copies of data and metadata you wish to store. Replication allows you to set different levels of protection for each file or one level for an entire file system. Since replication uses additional disk space and requires extra write time, you might want to consider replicating only file systems that are frequently read from but seldom written to. To reduce the overhead involved with the replication of data, you may also choose to replicate only metadata as a means of providing additional file system protection. For further information on GPFS replication, see File system recoverability parameters.

---

You can see here that this is *not* something they intend for general use, especially not for write-heavy workloads (like computational science). Further, this is the mechanism they suggest avoiding in favor of shared hardware and failover.
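
To make the failure-group idea concrete, here is a minimal placement sketch in Python. It's my own illustration of the concept, not GPFS code; the disk names, group tags, and function are invented:

# Hypothetical sketch of failure-group-aware replica placement (not GPFS code).
# Each disk is tagged with the failure group it belongs to, e.g. all disks
# behind one controller or attached to one node share a group.
DISKS = {
    "disk0": "fg1", "disk1": "fg1",   # failure group 1
    "disk2": "fg2", "disk3": "fg2",   # failure group 2
    "disk4": "fg3", "disk5": "fg3",   # failure group 3
}

def place_replicas(block_id, replication_factor):
    """Pick one disk from each of `replication_factor` distinct failure
    groups, so that losing any single group still leaves a copy."""
    groups = {}
    for disk, fg in DISKS.items():
        groups.setdefault(fg, []).append(disk)
    if replication_factor > len(groups):
        raise ValueError("not enough failure groups for that many replicas")
    # Rotate the starting group per block so load spreads across groups.
    ordered = sorted(groups)
    chosen = [ordered[(block_id + i) % len(ordered)] for i in range(replication_factor)]
    return [groups[fg][block_id % len(groups[fg])] for fg in chosen]

if __name__ == "__main__":
    for blk in range(4):
        print(blk, place_replicas(blk, replication_factor=2))

The write penalty the documentation warns about is visible even in this toy version: every write has to land on replication_factor disks instead of one, which is exactly why they steer write-heavy workloads toward shared hardware and failover instead.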

*Conceptually* lots of things are possible, and in fact there are a lot of really interesting ideas that have been pursued in research and production domains. As another production example, Panasas has an interesting way of driving redundant storage from the clients.

So far these approaches aren't widely used in production HEC deployments, to my knowledge, because they simply slow things down too much. They might make good sense in a bioinformatics application, etc., where datasets are often read-only.

The Ceph group at UCSC is another group that is looking at options in this area, close to home for you.

[1] http://publib.boulder.ibm.com/infocenter/clresctr/vxrx/index.jsp?topic=/com.ibm.cluster.gpfs.doc/gpfs23/bl1ins10/bl1ins1015.html

Erich Weiler wrote:
IBM's GPFS has done this quite nicely with primary and redundant server disks; they also use a concept called 'failure groups' that provides backups for nodes with common failure points. It's a sort of replication technique, not exactly RAID 5-style redundancy, but it works. I understand this kind of thing is not trivial to code, but conceptually it seems doable.

-erich

Rob Ross wrote:
Hi Steve,

We get this question a lot.

Software redundancy in a parallel file system is a very challenging problem, particularly if you want to provide efficient access at the same time.

The group at Clemson has been looking into this as a research project, and I believe that others have as well. If a group creates a solution that performs well, operates reliably, and fits into the rest of the PVFS system, then we would certainly consider integrating it into the production releases. This hasn't happened so far...

Regards,

Rob

Steve wrote:
Is built-in redundancy planned? Or is it not in the scope of the project?

Steve

I'm trusting my 1.1TB to the reliability of my drives and, touch wood, in 20 years of computing I've never had a drive fail. Now I've just put a curse on them!
-------Original Message-------
From: Robert Latham
Date: 24/04/2007 14:14:13
To: Erich Weiler
Cc: [email protected]
Subject: Re: [Pvfs2-users] Question about redundancy

On Mon, Apr 23, 2007 at 05:03:39PM -0700, Erich Weiler wrote:
I need to be clear on this before putting a lot of time into it, but it sounds like this might be a good solution for our firm, as we have a 200-node cluster, each node with one 500GB disk, 400GB of which can be leveraged for a massive parallel file system (400GB x 200 nodes = one big ~80TB distributed file system). But that assumes there is no redundancy; otherwise that 80TB would be more like 50-60TB max or something, because there would be some redundancy in there...?
Murali's explanation is spot-on: there is no software-based redundancy scheme. For users concerned with redundancy, we suggest hardware failover to shared storage, which works quite well. ==rob
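
As a quick back-of-the-envelope check on the capacity question above: the overhead depends entirely on the redundancy scheme. The layouts below are generic illustrations (nothing PVFS actually implements), just to bound the 50-60TB guess:

# Rough usable-capacity estimates for 200 nodes x 400GB = 80TB raw.
# The schemes below are generic illustrations, not anything PVFS ships.
RAW_TB = 200 * 0.4  # 80 TB of raw space

schemes = {
    "no redundancy":          1.0,       # all 80 TB usable
    "2-way replication":      1.0 / 2,   # every block stored twice
    "3-way replication":      1.0 / 3,
    "8+2 parity (RAID6-ish)": 8.0 / 10,
}

for name, efficiency in schemes.items():
    print(f"{name:24s} ~{RAW_TB * efficiency:5.1f} TB usable")

So a 50-60TB figure would sit somewhere between plain mirroring (40TB usable) and a wide parity layout (64TB usable).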

_______________________________________________
Pvfs2-users mailing list
[email protected]
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-users
