On Apr 27, 2007  14:50 +0200, Andrei Maslennikov wrote:
> I am just fresh from a workshop where one whole day was dedicated to
> file systems and Lustre was one of the key solutions. There was one
> person attending this workshop who was asking every speaker to
> comment on the fsck downtime for each of the file systems discussed,
> and we looked at Lustre from this point of view, as well.
> 
> In case of Lustre, this downtime is estimated at 1-3 hours per 1 TB of ext3
> in use on an OST (depending on the underlying hardware), and may last up
> to several days per 1 PB for the metadata part.

While the 1-3h per TB is reasonable, what is important to note is that this
checking happens IN PARALLEL for Lustre, because each OST is a separate
ext3 filesystem with its own e2fsck.  If you have 500 2TB OSTs = 1PB,
you can still check all of them in 2-4 hours.
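
To put rough numbers on that, here is a minimal sketch of the arithmetic
(the 1-3h/TB rate and the OST sizes are just the figures from this thread,
not a CFS-supplied formula):

    # Rough fsck downtime estimate.  The rate and sizes are assumptions
    # taken from the discussion above, not measured values.
    def fsck_downtime_hours(ost_sizes_tb, hours_per_tb):
        # Each OST is an independent ext3 filesystem, so the e2fsck
        # runs do not depend on each other.  Serially the times add up;
        # in parallel the downtime is bounded by the slowest single OST.
        serial = sum(ost_sizes_tb) * hours_per_tb
        parallel = max(ost_sizes_tb) * hours_per_tb
        return serial, parallel

    # 500 x 2TB OSTs = 1PB, at the pessimistic end of 1-3h/TB:
    serial, parallel = fsck_downtime_hours([2] * 500, 3.0)
    print(serial, parallel)   # 3000.0 hours serially vs. 6.0 in parallel

This assumes one e2fsck per OST with no shared bottleneck; in practice
several OSTs often share an OSS, so the real bound is the slowest server,
not the slowest OST.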

CFS has also recently developed patches that improve e2fsck speed on
ext3 filesystems by 2-20x (depending on filesystem usage).  Production
filesystems that used to take 1h to check have been shown to complete
in only 10 minutes.

> Could someone from CFS suggest a sort of formula to calculate the fsck
> downtime in a more accurate manner? This is often important when
> planning for service levels. If a file system is spread over multiple OSTs,
> which fsck operations run in parallel? May metadata checking be
> parallelized?

Yes, the e2fsck runs for the OSTs and the MDS can all be done in parallel.
The distributed checking phase (lfsck) is not needed before returning
the filesystem to service, and can also be run while the filesystem is
in use.  We are planning to eliminate the need for running a separate
lfsck entirely; the filesystem will instead do "scrubbing" internally,
during idle periods or as a low-priority background task.
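
As an illustration of how the parallel pass might be driven (the device
paths below are made up, and a real site would run one e2fsck per server
rather than launching everything from a single node):

    # Hypothetical sketch: start e2fsck on the MDT and every OST at
    # once, then wait for all of them before returning the filesystem
    # to service.  "-f" forces a full check, "-p" repairs automatically.
    import subprocess

    devices = ["/dev/mdt0"] + ["/dev/ost%d" % i for i in range(4)]
    procs = {dev: subprocess.Popen(["e2fsck", "-f", "-p", dev])
             for dev in devices}
    # A nonzero exit means e2fsck found (or fixed) problems; see
    # e2fsck(8) for the exact exit-code meanings.
    needs_attention = [dev for dev, p in procs.items() if p.wait() != 0]
    if needs_attention:
        print("re-run interactively on:", needs_attention)

The distributed lfsck pass is deliberately absent here, since as noted
above it can run after the filesystem is back in service.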

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.
