[ ... ] >> Fortunately, no customer has complained yet,
This and "very busy storing/processing new files (24h/day)" later seem to describe a fairly critical system with somewhat high availability requirements. >> but someone will if it goes on for another 12-15 hours. I >> really do not want to have a 2nd day of this tomorrow ... > Well, looks like that was wishful thinking. Indeed, and if one has availability constraints, relying on 'fsck' being quick is equally unrealistic. > Now 35 hours and counting. The time taken to do a deep check of entangled filesystems can be long. For an 'ext3' filesystem it was 75 days, and there are other interesting reports of long 'fsck' times: http://www.sabi.co.uk/blog/anno05-4th.html#051009 http://www.sabi.co.uk/blog/anno05-4th.html#051108 http://www.sabi.co.uk/blog/0802feb.html#080210 My impression is that JFS has a much better 'fsck' than 'ext3', but I haven't found (even on this mailing list) many reports of 'fsck' durations for JFS, and my own filesystems are rather small like yours (a few hundred thousand files, a few hundred GB of data), and 'fsck' takes a few minutes on undamaged or mostly OK filesystems. Anyhow the high bounds on 'fsck' times and space are a well known problems, especially for multi-TB filesystems, and this are some of the most recent news for a couple of other filesystems: http://kerneltrap.org/Linux/Improving_fsck_Speeds_in_ext4 http://oss.sgi.com/archives/xfs/2008-01/msg00187.html > Recent output is stuff like this: [ ... shared blocks ... ] > [ ... ] what could possibly have caused such a mess?? A very optimistic sysadm? :-) > This is a old(ish) SMP system, running 2.4.33, [ ... ] My impression is that both SMP and JFS in 2.4.33 are not as well tested as in 2.6, as there have been some important bug fixes in the 2.6 series that probably apply very much to high load systems, especially for SMP. Using a kernel that old means accepting whatever issues it has and hoping that they don't affect your load. Anyhow in my experience most events like the above are caused by hardware issues, more than old old bugs remaining unfixed in the SMP or JFS code of old kernels. Even a single bit error in RAM or a single block error during IO can have devastating effects. Never mind firmware or other errors. Consider for example this interesting report on IO "silent corruption" from a largish installation with a lot of experience: https://indico.desy.de/contributionDisplay.py?contribId=65&sessionId=42&confId=257 and their subsquent update: http://indico.fnal.gov/contributionDisplay.py?contribId=44&sessionId=15&confId=805 System integration and qualification is a very difficult and expensive activity... > [ ... ] The filesystem is about 140Gb in total of which 90Gb > is in use. It's backed by a software RAID5. As you now have duscovered it would have been much quicker to restore it from backups ('-o nointegrity' would have made it even faster). That's a way of doing 'fsck' that is often faster than 'fsck', because it relies largely on straightforward sequential accesses, while 'fsck' relies a lot on random accesses and somewhat hairy algorithms. > I'm guessing the filesystem probably had some 500.000 files > with up to maybe 40,000 in some directories, That's generally unwise, but the real problems are the overlapping allocations because then 'fsck' must check everything against everything. > but generally less. The system was generally very busy > storing/processing new files (24h/day). 
