On Thu, 27 Mar 2008, Bly, MJ (Martin) wrote:

<snip>
On our hardware RAID arrays (3ware, Areca, Infortrend) with many (12/14)
SATA disks, 500/750GB each, we fsck 2TB+ ext3 filesystems (as
infrequently as possible!) and it takes ~2 hours each.  We have some
5.5TB arrays that take less than three hours.  Note that these are
created with '-T largefile4 -O dir_index' among other options.

At least one pass of the ext3 fsck involves checking every inode table entry, so '-T largefile4' would help you since you get a much smaller number of inodes. As the inode tables are spread through the disk it will read the various chunks, then seek off somewhere and read more...

[ one of the planned features for ext4 is a way to safely mark that an entire inode-table lump is unused, to save things like fsck from having to scan all those unused blocks. Of course doing so safely isn't quite trivial, and it causes problems with the current model of how to choose the location of the inode for a new file... ]
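
To put rough numbers on that (the bytes-per-inode ratios below are just the usual mke2fs defaults of the era - 16KiB normally, 4MiB for largefile4 - so treat this as an illustration, not a measurement of any particular box):

  # Back-of-envelope: how '-T largefile4' shrinks the inode tables fsck has to read.
  FS_SIZE = 6.8 * 2**40        # ~6.8TB filesystem, as in the example further down
  INODE_SIZE = 128             # on-disk inode size in bytes (the old ext3 default)

  for name, bytes_per_inode in [("default", 16 * 2**10), ("largefile4", 4 * 2**20)]:
      inodes = FS_SIZE / bytes_per_inode
      table_bytes = inodes * INODE_SIZE
      print(f"{name:11s}: ~{inodes/1e6:6.0f} M inodes, "
            f"~{table_bytes/2**30:5.1f} GiB of inode tables to scan")

That's roughly 456 million inodes (~54 GiB of tables) at the default ratio versus under 2 million (~0.2 GiB) with largefile4, before you even start seeking between the scattered chunks.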

I'd be very suspicious of a HW RAID controller that took 'days' to fsck
a file system unless the filesystem was already in serious trouble, and
from bitter experience, fsck on a filesystem with holes in it caused by
a bad raid controller interconnect (SCSI!) can do more damage than good.

To give you one example, at least one fsck pass needs to check that every inode in use has the right (link-count) number of directory entries pointing at it. The current ext3 fsck seems to do a good impersonation of a linear search through its in-memory inode-state table for each directory entry - at least for files with non-trivial link-counts.

A trivial analysis shows that such a set of checks would be O(n^2) in the number of files needing to be checked, not counting the performance problems when the 'in-memory' tables get too big for RAM...

[ In case I'm slandering the ext3 fsck people - I've not actually checked that the ext3 fsck code really does anything as simple as a linear search, but anything more complex will need to use more memory and ... ]
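
A toy sketch of what that pass has to do (purely illustrative Python, not the real e2fsck data structures), just to show where the quadratic behaviour comes from:

  # For every directory entry, find its inode's record and bump the observed
  # link count, then (elsewhere) compare against i_links_count.
  def count_links_linear(dir_entries, inode_table):
      """dir_entries: (name, inode_number) pairs;
      inode_table: [inode_number, expected_links, seen_links] records."""
      for _name, ino in dir_entries:
          for rec in inode_table:          # linear search - the O(n^2) part
              if rec[0] == ino:
                  rec[2] += 1
                  break

  def count_links_hashed(dir_entries, inode_table):
      by_ino = {rec[0]: rec for rec in inode_table}   # one pass to build an index
      for _name, ino in dir_entries:
          by_ino[ino][2] += 1              # O(1) expected per entry

With a few hundred million entries the inner loop of the first version is what turns hours into days; the second version needs the whole index in memory at once, which is the other half of the problem.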

Last year we were trying to fsck a ~6.8TB ext3 fs which was about 70% filled with hundreds of hard-link trees of home directories. So huge numbers of inode entries (many/most files are small), many with link-counts of, say, 150. Our poor server had only 8G of RAM and the ext3 fsck wanted rather a lot more. Obviously in such a case it will be *slow*.
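
Rough arithmetic only (the per-inode byte counts below are assumptions for illustration, not measurements of e2fsck's real bitmap/icount structures):

  FS_SIZE = 6.8 * 2**40                 # the filesystem above
  BYTES_PER_INODE_RATIO = 16 * 2**10    # assume the default mke2fs ratio
  inodes = FS_SIZE / BYTES_PER_INODE_RATIO      # ~456 million inode slots

  for bookkeeping in (8, 24, 48):       # assumed bytes of in-memory state per inode
      print(f"{bookkeeping:2d} B/inode -> ~{inodes * bookkeeping / 2**30:5.1f} GiB")

Even a few tens of bytes of state per inode is comparable to or beyond 8G of RAM, and once fsck starts paging everything crawls.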

Of course that was after we built a version which didn't simply go into an infinite loop somewhere between 3 and 4TB into scanning through the inode-table.

Now, as you can guess, any dump needs to do much the same kind of work with regard to scanning the inodes and looking for hard-links, so you may not be shocked to discover that attempting a backup was rather slow too...
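
For what it's worth, here is the shape of the bookkeeping any dump/tar-style backup ends up doing (illustrative only - dump itself works from the inode tables rather than a path walk):

  import os

  def walk_with_hardlinks(root):
      seen = {}                                  # (st_dev, st_ino) -> first path seen
      for dirpath, _dirs, files in os.walk(root):
          for name in files:
              path = os.path.join(dirpath, name)
              st = os.lstat(path)
              if st.st_nlink > 1:
                  key = (st.st_dev, st.st_ino)
                  if key in seen:
                      yield ("link", path, seen[key])   # record as a link to the first copy
                      continue
                  seen[key] = path
              yield ("file", path, None)

With hundreds of hard-link trees the 'seen' table ends up holding an entry for almost every file, so the backup hits much the same memory wall as the fsck did.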

--
Jon Peatfield,  Computer Officer,  DAMTP,  University of Cambridge
Mail:  [EMAIL PROTECTED]     Web:  http://www.damtp.cam.ac.uk/
