Hello,
I've got a problem that I'm hoping someone on this list can help me with...
Read-only fsck.jfs checks on my oldest volumes are reporting an alarming number
of corrupted root nodes despite the fact that these volumes appear to be
healthy when mounted read-only. Here's the error that I'm getting...
fsck.jfs -n -v /dev/md/10
fsck.jfs version 1.1.14, 06-Apr-2009
processing started: 8/13/2010 10.9.6
The current device is: /dev/md/10
Open(...READONLY...) returned rc = 0
Primary superblock is valid.
The type of file system for the device is JFS.
Block size in bytes: 4096
Filesystem size in blocks: 4756914448
**Phase 1 - Check Blocks, Files/Directories, and Directory Entries
Invalid data format detected in root directory.
CANNOT CONTINUE.
ERRORS HAVE BEEN DETECTED. Run fsck with the -f parameter to repair.
processing terminated: 8/13/2010 10:10:05 with return code: 10062 exit code: 4.
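For anyone reading the two exit codes in this post (4 here, 1 in the repair run further down), here is a minimal sketch that turns an exit code into text. The meanings in the table are my recollection of the fsck.jfs(8) man page, and treating the value as a bit mask follows the generic fsck(8) convention; both are assumptions and should be checked against the man page installed on your system. The name describe_fsck_exit is only illustrative.

```python
# Exit-code meanings as recalled from the fsck.jfs(8) man page (assumption:
# verify against your local man page before relying on this table).
FSCK_EXIT_BITS = {
    1: "file system errors corrected and/or transaction log replayed",
    2: "file system errors corrected, system should be rebooted",
    4: "file system errors left uncorrected",
    8: "operational error",
    16: "usage or syntax error",
    128: "shared library error",
}


def describe_fsck_exit(code):
    """Return a list of human-readable meanings for an fsck exit code.

    The code is treated as a bit mask (assumption, per generic fsck(8)),
    so a value such as 5 yields two entries.
    """
    if code == 0:
        return ["no errors"]
    return [text for bit, text in FSCK_EXIT_BITS.items() if code & bit]


if __name__ == "__main__":
    for code in (4, 1):
        print(code, describe_fsck_exit(code))
```

Under these assumptions, the read-only run above (exit code 4) means errors were found and left uncorrected, which is what one would expect from -n.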
Despite the catastrophic-sounding error above, mounting the file system
read-only and listing the directory from the command line works fine:
ls
20090110 20090303 20090418 20090605 20090721 20090914 20091030 20091215
20100130 20100317 20100502 20100617
20090111 20090304 20090419 20090606 20090722 20090915 20091031 20091216
20100131 20100318 20100503 20100618
20090113 20090305 20090420 20090607 20090723 20090916 20091101 20091217
20100201 20100319 20100504 20100619
20090114 20090306 20090421 20090608 20090724 20090917 20091102 20091218
20100202 20100320 20100505 20100620
20090115 20090307 20090422 20090609 20090725 20090918 20091103 20091219
20100203 20100321 20100506 20100622
20090116 20090308 20090423 20090610 20090727 20090919 20091104 20091220
20100204 20100322 20100507 20100623
20090117 20090309 20090424 20090611 20090728 20090920 20091105 20091221
20100205 20100323 20100508 20100624
20090118 20090310 20090425 20090612 20090729 20090921 20091106 20091222
20100206 20100324 20100509 20100625
20090119 20090311 20090426 20090613 20090730 20090922 20091107 20091223
20100207 20100325 20100510 20100626
20090120 20090312 20090427 20090614 20090731 20090923 20091108 20091224
20100208 20100326 20100511 20100627
20090121 20090313 20090428 20090615 20090801 20090924 20091109 20091225
20100209 20100327 20100512 20100628
20090122 20090314 20090429 20090616 20090802 20090925 20091110 20091226
20100210 20100328 20100513 20100629
20090123 20090315 20090430 20090617 20090803 20090926 20091111 20091227
20100211 20100329 20100514 20100630
20090126 20090316 20090501 20090618 20090804 20090927 20091112 20091228
20100212 20100330 20100515 20100701
20090127 20090317 20090502 20090619 20090805 20090928 20091113 20091229
20100213 20100331 20100516 20100702
20090128 20090318 20090503 20090620 20090809 20090929 20091114 20091230
20100214 20100401 20100517 20100703
20090129 20090319 20090504 20090621 20090810 20090930 20091115 20091231
20100215 20100402 20100518 20100704
20090130 20090320 20090505 20090622 20090811 20091001 20091116 20100101
20100216 20100403 20100519 20100705
20090202 20090321 20090506 20090623 20090812 20091002 20091117 20100102
20100217 20100404 20100520 20100706
20090204 20090322 20090507 20090624 20090813 20091003 20091118 20100103
20100218 20100405 20100521 20100707
20090205 20090323 20090508 20090625 20090814 20091004 20091119 20100104
20100219 20100406 20100522 20100708
20090206 20090324 20090509 20090626 20090815 20091005 20091120 20100105
20100220 20100407 20100523 20100709
20090207 20090325 20090510 20090627 20090816 20091006 20091121 20100106
20100221 20100408 20100524 20100710
20090208 20090326 20090511 20090628 20090817 20091007 20091122 20100107
20100222 20100409 20100525 20100711
20090209 20090327 20090512 20090629 20090818 20091008 20091123 20100108
20100223 20100410 20100526 20100712
20090210 20090328 20090513 20090630 20090819 20091009 20091124 20100109
20100224 20100411 20100527 20100713
20090211 20090329 20090514 20090701 20090820 20091010 20091125 20100110
20100225 20100412 20100528 20100714
20090212 20090330 20090515 20090702 20090821 20091011 20091126 20100111
20100226 20100413 20100529 20100715
20090213 20090331 20090516 20090703 20090822 20091012 20091127 20100112
20100227 20100414 20100530 20100716
20090214 20090401 20090517 20090704 20090823 20091013 20091128 20100113
20100228 20100415 20100531 20100717
20090215 20090402 20090518 20090705 20090824 20091014 20091129 20100114
20100301 20100416 20100601 20100718
20090216 20090403 20090519 20090706 20090825 20091015 20091130 20100115
20100302 20100417 20100602 20100719
20090217 20090404 20090520 20090707 20090826 20091016 20091201 20100116
20100303 20100418 20100603 20100720
20090218 20090405 20090521 20090708 20090827 20091017 20091202 20100117
20100304 20100419 20100604 20100721
20090219 20090406 20090522 20090709 20090828 20091018 20091203 20100118
20100305 20100420 20100605 20100722
20090220 20090407 20090523 20090710 20090901 20091019 20091204 20100119
20100306 20100421 20100606 20100723
20090221 20090408 20090524 20090711 20090902 20091020 20091205 20100120
20100307 20100422 20100607 20100724
20090222 20090409 20090527 20090712 20090903 20091021 20091206 20100121
20100308 20100423 20100608 20100725
20090223 20090410 20090528 20090713 20090904 20091022 20091207 20100122
20100309 20100424 20100609 20100726
20090224 20090411 20090529 20090714 20090905 20091023 20091208 20100123
20100310 20100425 20100610 20100727
20090225 20090412 20090530 20090715 20090906 20091024 20091209 20100124
20100311 20100426 20100611 20100728
20090226 20090413 20090531 20090716 20090907 20091025 20091210 20100125
20100312 20100427 20100612 20100729
20090227 20090414 20090601 20090717 20090908 20091026 20091211 20100126
20100313 20100428 20100613 mount_check
20090228 20090415 20090602 20090718 20090909 20091027 20091212 20100127
20100314 20100429 20100614
20090301 20090416 20090603 20090719 20090912 20091028 20091213 20100128
20100315 20100430 20100615
20090302 20090417 20090604 20090720 20090913 20091029 20091214 20100129
20100316 20100501 20100616
Running fsck.jfs read-write re-initializes the root node and moves all of its
former contents into lost+found. I can recover the data from lost+found, so this
is not fatal, but it is still something I would like to fix or avoid.
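Since the root directory is still listable when mounted read-only, one way to make the recovery less painful is to record the inode number of each root entry before repairing, then match the entries in lost+found by inode number afterwards. The following is a minimal sketch of that idea, not something I have run against a repaired JFS volume. It assumes that inode numbers survive the repair (the "Directory inode N has been reconnected" messages suggest they do) and that the map file is stored on a different volume. The function names and the map file path are hypothetical.

```python
import json
import os


def save_root_inode_map(root, map_path):
    """Record {inode number: entry name} for every entry directly under root.

    Run this while the volume is mounted read-only, BEFORE the repair,
    and write map_path somewhere off the affected volume.
    """
    mapping = {}
    with os.scandir(root) as entries:
        for entry in entries:
            ino = entry.stat(follow_symlinks=False).st_ino
            mapping[str(ino)] = entry.name  # JSON keys must be strings
    with open(map_path, "w") as fh:
        json.dump(mapping, fh, indent=2)
    return mapping


def restore_from_lost_found(root, map_path, dry_run=True):
    """Move lost+found entries back into root under their recorded names.

    Returns the list of (source, target) moves. With dry_run=True nothing
    is renamed, so the plan can be reviewed first.
    """
    with open(map_path) as fh:
        mapping = json.load(fh)
    lost_found = os.path.join(root, "lost+found")
    with os.scandir(lost_found) as entries:
        candidates = [(e.path, e.stat(follow_symlinks=False).st_ino)
                      for e in entries]
    moves = []
    for path, ino in candidates:
        name = mapping.get(str(ino))
        if name is None:
            continue  # not a former root entry; leave it alone
        target = os.path.join(root, name)
        if os.path.exists(target):
            continue  # never overwrite something already in place
        moves.append((path, target))
        if not dry_run:
            os.rename(path, target)
    return moves
```

A typical sequence would be save_root_inode_map("/mnt/vol", "/root/vol-root.json") before the repair, then restore_from_lost_found("/mnt/vol", "/root/vol-root.json") as a dry run after it, and again with dry_run=False once the plan looks right. Both paths are placeholders.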
I have not repaired the above volume yet but have repaired others... Here's the
fsck.jfs output for a read-write repair on a volume that had the same errors as
those described above.
fsck.jfs -v /dev/md10
fsck.jfs version 1.1.14, 06-Apr-2009
processing started: 4/23/2010 4.32.24
Using default parameter: -p
The current device is: /dev/md10
Open(...READ/WRITE EXCLUSIVE...) returned rc = 0
Primary superblock is valid.
The type of file system for the device is JFS.
Block size in bytes: 4096
Filesystem size in blocks: 4756914448
**Phase 0 - Replay Journal Log
LOGREDO: Log record for Sync Point at: 0x05774f34
LOGREDO: Beginning to update the Inode Allocation Map.
LOGREDO: Done updating the Inode Allocation Map.
LOGREDO: Beginning to update the Block Map.
LOGREDO: Incorrect leaf index detected (k=(d) 0, j=(d) 0, idx=(d) 0) while writing Block Map.
LOGREDO: Write Block Map control page failed in UpdateMaps().
LOGREDO: Unable to update map(s).
logredo failed (rc=-231). fsck continuing.
**Phase 1 - Check Blocks, Files/Directories, and Directory Entries
Root directory has a corrupt tree.
Initialized tree created for root directory.
The root directory has an invalid data format. Will correct.
**Phase 2 - Count links
**Phase 3 - Duplicate Block Rescan and Directory Connectedness
**Phase 4 - Report Problems
**Phase 5 - Check Connectivity
**Phase 6 - Perform Approved Corrections
Superblock marked dirty because repairs are about to be written.
No \lost+found directory found in the filesystem.
Directory inode 18661404 has been reconnected to /lost+found/.
Directory inode 18637982 has been reconnected to /lost+found/.
Directory inode 18614880 has been reconnected to /lost+found/.
Directory inode 18595359 has been reconnected to /lost+found/.
Directory inode 18581312 has been reconnected to /lost+found/.
Directory inode 18556038 has been reconnected to /lost+found/.
.
.
.
Directory inode 448971 has been reconnected to /lost+found/.
File inode 443531 has been reconnected to /lost+found/.
Directory inode 442414 has been reconnected to /lost+found/.
.
.
.
Directory inode 2320 has been reconnected to /lost+found/.
Directory inode 101 has been reconnected to /lost+found/.
Directory inode 32 has been reconnected to /lost+found/.
622 directories reconnected to /lost+found/.
1 file reconnected to /lost+found/.
**Phase 7 - Rebuild File/Directory Allocation Maps
**Phase 8 - Rebuild Disk Allocation Maps
**Phase 9 - Reformat File System Log
logformat returned rc = 0
Filesystem Summary:
Blocks in use for inodes: 2276956
Inode count: 18215648
File count: 16453081
Directory count: 1529882
Block count: 4756914448
Free block count: 655162544
19027657792 kilobytes total disk space.
6342069 kilobytes in 1529882 directories.
16397493672 kilobytes in 16453081 user files.
0 kilobytes in extended attributes
0 kilobytes in access control lists
15856013 kilobytes reserved for system use.
2620650176 kilobytes are available for use.
Filesystem is clean.
All observed inconsistencies have been repaired.
Filesystem has been marked clean.
**** Filesystem was modified. ****
processing terminated: 4/23/2010 9:08:55 with return code: 0 exit code: 1.
This problem appears to be related to the age of the volume and/or the number of
directories in the root node. It's hard to distinguish between these two factors
in our environment because the root node of each data volume contains one
directory for each day the volume has been in use. The tipping point appears to
be around 500 days/directories.
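One possible way to separate age from directory count is to populate the root of a freshly created scratch JFS volume with the same kind of entries and run fsck.jfs -n at various counts. The sketch below only does the populating step. It assumes empty YYYYMMDD directories are enough to exercise the root directory tree, which may not hold if the problem depends on the contents or on the order of creation and deletion over time. The function name and the start date are illustrative.

```python
import os
from datetime import date, timedelta


def make_daily_dirs(root, count, start=date(2009, 1, 10)):
    """Create `count` empty directories named YYYYMMDD directly under root.

    Names are consecutive calendar days beginning at `start`. Existing
    directories are left untouched. Returns the list of names in order.
    """
    names = []
    for offset in range(count):
        name = (start + timedelta(days=offset)).strftime("%Y%m%d")
        os.makedirs(os.path.join(root, name), exist_ok=True)
        names.append(name)
    return names


if __name__ == "__main__":
    # Placeholder mount point for a scratch volume; adjust before running.
    created = make_daily_dirs("/mnt/scratch", 600)
    print("created", len(created), "directories")
```

If a brand-new volume with 600 such directories passes a read-only check after unmounting, that would point towards age or accumulated history rather than the raw entry count.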
Is this a known issue? Is there really a problem with the root node or does
fsck.jfs have an analysis bug? In any event, since the OS can list the contents
of the root node, fsck.jfs should be able to do better than just dumping all
the contents into lost+found.
I've also seen corruption in my allocation maps which could be related... How
can I help debug this further?
Thanks!
Tim
_______________________________________________
Jfs-discussion mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/jfs-discussion