>>> On Fri, 27 Apr 2007 14:13:56 -0600, Andreas Dilger
>>> <[EMAIL PROTECTED]> said:
[ ... 'fsck' times ... ]

adilger> While the 1-3h per TB is reasonable, what is important to
adilger> note is that this checking happens IN PARALLEL for lustre.
adilger> If you have 500 2TB OSTs = 1PB, then you can still check
adilger> all of them in 2-4 hours.

Ahhh interesting. But yes, if they are on separate hosts, but
for example I have only one with 12TB on a RAID10.

My main reason to look at Lustre is not to take advantage of
the cluster based parallelism, but to have 6x2TB OSTs on the
same machine and hope that if there are active updates to only
one then only one needs 'fsck'ing.

Basically my main reason is to reduce post-crash service
unavailability due to 'fsck'. My particular application would
have 12TB of 20-80MB files, let's say around 200,000-700,000
inodes in total.

adilger> CFS has also recently developed patches to improve the
adilger> e2fsck speed for ext3 filesystems by 2-20x (depends on
adilger> filesystem usage). What used to take 1h to check has been
adilger> shown for production filesystems to take only 10
adilger> minutes...

Well, that would be nice, but also sounds a bit implausible.
Production filesystems tend to be full, with metadata scattered
all over the place, and 'ext3' has quite a bit of quite
scattered metadata, and become very fragmented quite rapidly.

>> Could someone from CFS suggest a sort of formula to calculate
>> the fsck downtime in a more accurate manner? This is often
>> important when planning for service levels. If a file system is
>> spread over multiple OSTs, which fsck operations run in
>> parallel? May metadata checking be parallelized?

adilger> Yes, the OST and MDS e2fsck checking can be done in
adilger> parallel.

I wonder if one had those 6x2TB OSTs on the same RAID10 then
parallel checking would be faster thanks to all those arms.

adilger> The distributed checking phase (lfsck) is not needed
adilger> before returning the filesystem to service, and can also
adilger> be run while the filesystem is in use. We are planning to
adilger> eliminate the need for running a separate lfsck entirely,
adilger> and the filesystem will just do "scrubbing" internally
adilger> all the time during idle times or as a low-priority task.

Ahh interesting too, but this may not always be feasible: the
application I am thinking of has 24x7 simultaneous read and
write rates of around 100MB/s each (and yes using just a single
system is unfortunately non-negotiable right now).
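
For what it is worth, the arithmetic discussed above can be sketched as a back-of-envelope estimate. This is not a formula from CFS: the function name `fsck_hours` and its parameters are made up for illustration, and it assumes that check time scales linearly with OST size at the quoted 1-3h per TB, and that parallel checks do not slow each other down (which may well not hold for several OSTs sharing one RAID10).

```python
def fsck_hours(ost_sizes_tb, hours_per_tb, parallel=True):
    """Rough e2fsck wall-clock estimate, in hours, for a set of OSTs.

    Assumptions (illustrative only):
      - check time is linear in OST size (hours_per_tb * size);
      - parallel checks are fully independent, so the wall-clock
        time is that of the slowest single OST;
      - sequential checks simply add up.
    """
    per_ost = [size * hours_per_tb for size in ost_sizes_tb]
    if not per_ost:
        return 0
    return max(per_ost) if parallel else sum(per_ost)


# 500 x 2TB OSTs checked in parallel, at 1h and 3h per TB
print(fsck_hours([2] * 500, 1), fsck_hours([2] * 500, 3))

# a single 12TB filesystem, at 1h and 3h per TB
print(fsck_hours([12], 1), fsck_hours([12], 3))

# 6 x 2TB OSTs on one machine where only one needs checking
print(fsck_hours([2], 1), fsck_hours([2], 3))
```

Under these assumptions the single 12TB filesystem comes out at 12-36 hours, while one dirty 2TB OST out of six comes out at 2-6 hours, which is the reduction in post-crash unavailability hoped for above. Passing `parallel=False` gives the pessimistic case where OSTs on shared disks end up being checked one after another.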
