Re: [Lustre-discuss] OST failure, data recovery

Andreas Dilger Fri, 08 Jun 2007 10:35:28 -0700

On Jun 08, 2007  13:41 +0200, Thomas Roth wrote:
> If the user data gets spread out in chunks to a number of OSTs, and one 
> of the OSTs fails completely - say, all the disks on the fileserver 
> behind that OST are gone for good - how does the cluster recover from that?


Losing an OST completely is of course possible.  The risk of losing any
given Lustre file is a function of how reliable the back-end storage is
and how many OSTs the file is striped over.  We already recommend
striping over only as many OSTs as are needed to achieve the bandwidth
the file's usage requires, so if your network is relatively slow
compared to the storage, and you don't need many clients to access a
file at one time, you can just stripe over a single OST by default.
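
To make the stripe-count trade-off concrete, here is a rough sketch of
the arithmetic.  It assumes every OST fails independently with the same
probability, which is a simplification of my own and not a model Lustre
provides; the function name and the 1% figure are hypothetical and only
there for illustration.

```python
def file_loss_probability(p_ost_failure: float, stripe_count: int) -> float:
    """Probability that a file loses at least one of its stripes.

    Assumes each OST fails independently with the same probability
    (an illustrative simplification, not a measured failure model).
    """
    if not 0.0 <= p_ost_failure <= 1.0:
        raise ValueError("p_ost_failure must be between 0 and 1")
    if stripe_count < 1:
        raise ValueError("stripe_count must be at least 1")
    # The file survives only if every OST holding one of its stripes survives.
    return 1.0 - (1.0 - p_ost_failure) ** stripe_count


if __name__ == "__main__":
    p = 0.01  # hypothetical per-OST failure probability, not a real figure
    for count in (1, 2, 4, 8):
        print(f"stripe_count={count}: {file_loss_probability(p, count):.4f}")
```

Under these assumptions the chance of losing a file grows with every
additional OST it is striped over, which is why a stripe count of one
is the safest default when the extra bandwidth is not needed.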

> It can't be a backup replay from tape or keeping all fileservers as HA 
> pairs, right?

Well, RAID is never really a substitute for a backup in any case, because
RAID doesn't protect you from "rm -rf *" and other user mistakes.  That is
true for Lustre as much as for other filesystems.

> Is lustre doing some kind of RAID across the OSTs?

This is something that we are working toward.  However, it is far more
difficult to implement reliably than it first appears.

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.

