On Apr 27, 2007 14:50 +0200, Andrei Maslennikov wrote:
> I am just fresh from a workshop where one whole day was dedicated to
> file systems and Lustre was one of the key solutions. There was one
> person attending this workshop who was asking every speaker to
> comment on the fsck downtime for each of the file systems discussed,
> and we looked at Lustre from this point of view as well.
>
> In the case of Lustre, this downtime is estimated at 1-3 hours per 1 TB
> of ext3 in use on an OST (depending on the underlying hardware), and may
> last up to several days per 1 PB for the metadata part.

While the 1-3h per TB figure is reasonable, what is important to note is
that this checking happens IN PARALLEL for Lustre. If you have 500 2TB
OSTs = 1PB, then you can still check all of them in 2-4 hours. CFS has
also recently developed patches that improve e2fsck speed on ext3
filesystems by 2-20x (depending on filesystem usage). What used to take
1h to check has been shown on production filesystems to take only 10
minutes.
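To make the parallelism concrete, here is a minimal sketch of kicking off
one e2fsck per OST at the same time. The /dev/ostN device names are
hypothetical; substitute the real OST block devices for your site:

    import subprocess

    # Hypothetical OST block devices -- use your real OST devices here.
    ost_devices = ["/dev/ost%d" % i for i in range(500)]

    # Launch one e2fsck per OST; -f forces a full check, -p repairs
    # automatically without prompting.
    procs = [subprocess.Popen(["e2fsck", "-f", "-p", dev])
             for dev in ost_devices]

    # All 500 checks run concurrently, so total wall time is bounded by
    # the slowest single OST rather than the sum over all of them.
    for p in procs:
        p.wait()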
> Could someone from CFS suggest a sort of formula to calculate the fsck
> downtime in a more accurate manner? This is often important when
> planning for service levels. If a file system is spread over multiple
> OSTs, which fsck operations run in parallel? May metadata checking be
> parallelized?

Yes, the OST and MDS e2fsck checks can be run in parallel. The
distributed checking phase (lfsck) is not needed before returning the
filesystem to service, and can also be run while the filesystem is in
use. We are planning to eliminate the need to run a separate lfsck
entirely; the filesystem will simply do "scrubbing" internally on an
ongoing basis, during idle periods or as a low-priority task.
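As a sort of formula, then, a rough back-of-envelope estimate (my numbers
below are illustrative assumptions, not measurements) is that downtime is
the slowest single check, not the sum over all targets:

    ost_size_tb = 2.0        # per-OST size (illustrative)
    hours_per_tb = 3.0       # pessimistic end of the 1-3 h/TB range
    mds_check_hours = 4.0    # assumed MDS e2fsck time; measure your own

    # OST checks run in parallel with each other and with the MDS check,
    # so downtime is the maximum of the individual checks.  lfsck is
    # excluded because it can run after the filesystem is back in service.
    downtime_hours = max(ost_size_tb * hours_per_tb, mds_check_hours)
    print("estimated fsck downtime: %.1f hours" % downtime_hours)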
Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.