Thank you for the response, that was very helpful! It appears most of the errors were easily fixed by the fsck, but one thing worries me -- I'm still getting all of the 'Internal logic failure !! duplicate cluster' messages. Is this a problem?

Bob

Sunil Mushran wrote:
> A directory is corrupted. To get the name of the dir, do:
>
> $ debugfs.ocfs2 -R "findpath <12862125>" /dev/sdX
>
> It may take time as it will have to traverse the dirs.
>
> To fix, you will have to run fsck in rw mode. fsck.ocfs2 -fy /dev/sdX.
>
> That the nodes do not crash is as expected. Dir corruption is always
> localized and only rears up as an error.
>
> 800K files in one dir is not efficient since the current version does not
> support indexed dirs. We hope to add support for the same in the near term.
>
> Sunil
>
>
> Bob Ziuchkovski wrote:
>> Hi All,
>>
>> I'm trying to move my company from our current frenzy of rsyncing
>> towards ocfs2. I've deployed ocfs2 on a few test servers and in
>> general things seem to be working. However, I've run into a couple of
>> problems and wanted to run them by this mailing list.
>>
>> Earlier today I encountered errors that at first appeared to be
>> permission errors. However, when I checked dmesg output, I saw the
>> following entries repeated over and over:
>>
>> (20830,0):ocfs2_mknod:351 ERROR: status = -2
>> (20830,0):ocfs2_check_dir_entry:1727 ERROR: bad entry in directory
>> #12862125: rec_len is smaller than minimal - offset=258867
>> 2, inode=1099511657728, rec_len=0, name_len=0
>>
>> I'm not exactly sure what this means. Is there a way for me to
>> determine the path to the directory and/or file referenced above?
>>
>> Since this seemed like it might be fs corruption, I ran the fsck.ocfs2
>> utility, but in read-only mode. I ended up with output that looks
>> like the following:
>>
>> Pass 0a: Checking cluster allocation chains
>> Pass 0b: Checking inode allocation chains
>> Pass 0c: Checking extent block allocation chains
>> Pass 1: Checking inodes and blocks.
>> o2fsck_mark_cluster_allocated: Internal logic faliure !! duplicate
>> cluster 4387633
>> o2fsck_mark_cluster_allocated: Internal logic faliure !! duplicate
>> cluster 4387634
>> o2fsck_mark_cluster_allocated: Internal logic faliure !! duplicate
>> cluster 4387635
>> o2fsck_mark_cluster_allocated: Internal logic faliure !! duplicate
>> cluster 4387636
>> <-----------------SNIP Similar-------------------->
>> Pass 2: Checking directory entries.
>> [DIRENT_LENGTH] Directory inode 12862125 corrupted in logical block
>> 632 physical block 35102672 offset 0. Attempt to repair this block's
>> directory entries? n
>> [DIRENT_LENGTH] Directory inode 12862125 corrupted in logical block
>> 633 physical block 35102673 offset 0. Attempt to repair this block's
>> directory entries? n
>> [DIRENT_LENGTH] Directory inode 12862125 corrupted in logical block
>> 634 physical block 35102674 offset 0. Attempt to repair this block's
>> directory entries? n
>> [DIRENT_LENGTH] Directory inode 12862125 corrupted in logical block
>> 635 physical block 35102675 offset 0. Attempt to repair this block's
>> directory entries? n
>> [DIRENT_LENGTH] Directory inode 12862125 corrupted in logical block
>> 636 physical block 35102676 offset 0. Attempt to repair this block's
>> directory entries? n
>> [DIRENT_LENGTH] Directory inode 12862125 corrupted in logical block
>> 637 physical block 35102677 offset 0. Attempt to repair this block's
>> directory entries? n
>> [DIRENT_LENGTH] Directory inode 12862125 corrupted in logical block
>> 638 physical block 35102678 offset 0. Attempt to repair this block's
>> directory entries? n
>> [DIRENT_LENGTH] Directory inode 12862125 corrupted in logical block
>> 639 physical block 35102679 offset 0. Attempt to repair this block's
>> directory entries? n
>> Pass 3: Checking directory connectivity.
>> Pass 4a: checking for orphaned inodes
>> Pass 4b: Checking inodes link counts.
>> [INODE_NOT_CONNECTED] Inode 0 isn't referenced by any directory
>> entries. Move it to lost+found? n
>> [INODE_COUNT] Inode 31231457 has a link count of 1 on disk but
>> directory entry references come to 0. Update the count on disk
>> to match? n
>> [INODE_NOT_CONNECTED] Inode 31231457 isn't referenced by any directory
>> entries. Move it to lost+found? n
>> [INODE_COUNT] Inode 31231458 has a link count of 1 on disk but
>> directory entry references come to 0. Update the count on disk
>> to match? n
>> [INODE_NOT_CONNECTED] Inode 31231458 isn't referenced by any directory
>> entries. Move it to lost+found? n
>> [INODE_COUNT] Inode 31231459 has a link count of 1 on disk but
>> directory entry references come to 0. Update the count on disk
>> to match? n
>> [INODE_NOT_CONNECTED] Inode 31231459 isn't referenced by any directory
>> entries. Move it to lost+found? n
>> [INODE_COUNT] Inode 31231460 has a link count of 1 on disk but
>> directory entry references come to 0. Update the count on disk
>> to match? n
>> [INODE_NOT_CONNECTED] Inode 31231460 isn't referenced by any directory
>> entries. Move it to lost+found? n
>> [INODE_COUNT] Inode 31231461 has a link count of 1 on disk but
>> directory entry references come to 0. Update the count on disk
>> to match? n
>> <-----------------SNIP Similar----------------------->
>>
>> As far as I know, none of the nodes that are running ocfs2 have
>> actually crashed and I created the filesystems just last week. One
>> thing I should mention, though, is that the filesystem in question has
>> about 3.3 million small files, 800k of which are contained within a
>> single flat directory -- I know, it's terrible...I've inherited this
>> mess from previous admins. Additionally, I rsync'ed the files to the
>> ocfs2 volume from one of our existing servers. I have never been able
>> to fsck the filesystem of the existing server without errors, but
>> fixing the errors generally leads to a bunch of the small files being
>> unlinked and moved to lost+found. My impression is that rsync reads
>> things at a high-enough level that this shouldn't duplicate filesystem
>> errors on a target volume, but maybe I'm wrong.
>>
>> Anyway, any help that could be offered would be greatly appreciated.
>> I'm really trying to fix the filesystem mess I've inherited, but I get
>> the impression it will be an arduous task. :) In terms of package
>> information, we're running RHEL 4u7 on x86_64 with the following
>> packages installed: ocfs2-2.6.9-55.0.9.ELsmp-1.2.9-1.el4 and
>> ocfs2-tools-1.2.7-1.el4. Thanks!
>>
>> Bob Ziuchkovski
>>
>> _______________________________________________
>> Ocfs2-users mailing list
>> [email protected]
>> http://oss.oracle.com/mailman/listinfo/ocfs2-users
>>
