On Wed, 2008-04-30 at 17:32 +0200, Per Jessen wrote:
> Peter Grandi wrote:
>
> > This and "very busy storing/processing new files (24h/day)" later
> > seem to describe a fairly critical system with somewhat high
> > availability requirements.
>
> Fairly high, yes.  It has now been down for almost 48hours, which is
> probably just about as far as I can let it go.  We've already promised
> our customers it will be back up Friday morning.  Tomorrow is a holiday
> here, very fortunate.
>
> >>> but someone will if it goes on for another 12-15 hours. I
> >>> really do not want to have a 2nd day of this tomorrow ...
> >
> >> Well, looks like that was wishful thinking.
> >
> > Indeed, and if one has availability constraints, relying on
> > 'fsck' being quick is equally unrealistic.
>
> That's an interesting comment - I guess I _have_ been relying on 1) the
> system only rarely needing a reboot and 2) a fast fsck when it happens.

JFS is, of course, designed so that under normal circumstances, fsck
only replays the journal, which is very fast.  When something bad
happens and it has to do the full processing, it isn't necessarily
going to be fast.

> Do you have any insights to share wrt availability, large filesystems
> (up to 1Tb in our case) and millions of files?  (apart from "don't do
> it" :-)

JFS's fsck time is basically tied to the number of inodes.  I don't
have numbers to give you: a huge, nearly-empty file system won't take
too much time to check, but one with millions of inodes may take a long
time.  The worst case (other than a fatal error that fsck can't recover
from) is when cross-linked blocks are detected, and it has to do the
pass that is causing you so much delay.  It used to be MUCH worse
before jfsutils-1.1.5, if you can believe it.

> >> Now 35 hours and counting.
> >
> > The time taken to do a deep check of entangled filesystems can
> > be long. For an 'ext3' filesystem it was 75 days, and there are
> > other interesting reports of long 'fsck' times:

I doubt it gets as bad as that, but again, I have no idea how much
longer it will take.

> Uh oh.  I guess I'd better move ahead with my new system, and hope to
> migrate whatever I can later on.
>
> > but I haven't found (even on this mailing list) many reports of
> > 'fsck' durations for JFS, and my own filesystems are rather small
> > like yours (a few hundred thousand files, a few hundred GB of
> > data), and 'fsck' takes a few minutes on undamaged or mostly OK
> > filesystems.
>
> That has been my experience too - right up until 28 April at around
> 20:00.  :-(
>
>
> /Per Jessen, Zürich

-- 
David Kleikamp
IBM Linux Technology Center
_______________________________________________
Jfs-discussion mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/jfs-discussion
