On 2009-12-03, at 20:19, 恩强周 wrote:
> Hi all,
> I have also hit ldiskfs problems. Two of my OSTs report messages like this:
> LDISKFS-fs: group 22879: 30128 blocks in bitmap, 29885 in gd
> LDISKFS-fs: group 22810: 29150 blocks in bitmap, 29242 in gd
> LDISKFS-fs: group 22846: 28278 blocks in bitmap, 28324 in gd
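
For readers unfamiliar with the message quoted above, here is a minimal sketch of the kind of consistency check that produces it. It assumes both numbers are free-block counts (one recounted from the on-disk block bitmap, one stored in the group descriptor), which is how the upstream ext4 allocator words this warning. The function names, the bitmap layout and the plain integer types are hypothetical simplifications, not the ldiskfs code.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Count the clear (free) bits among the first nbits bits of a block bitmap.
 * Bit i lives in byte i / 8 at position i % 8. */
static unsigned count_free_bits(const uint8_t *bitmap, unsigned nbits)
{
    unsigned free_blocks = 0;

    assert(bitmap != NULL);
    for (unsigned i = 0; i < nbits; i++) {
        if (!(bitmap[i / 8] & (1u << (i % 8))))
            free_blocks++;
    }
    return free_blocks;
}

/* Compare the recounted value with the count kept in the group descriptor
 * (gd_free). Returns 1 when they agree, 0 after printing a warning shaped
 * like the log lines above. */
static int check_group_free_count(unsigned group, const uint8_t *bitmap,
                                  unsigned nbits, unsigned gd_free)
{
    unsigned in_bitmap = count_free_bits(bitmap, nbits);

    if (in_bitmap != gd_free) {
        printf("group %u: %u blocks in bitmap, %u in gd\n",
               group, in_bitmap, gd_free);
        return 0;
    }
    return 1;
}
```

Calling check_group_free_count() with a 16-bit bitmap whose low four bits are set and a descriptor count of 12 reports agreement; any other descriptor count triggers the warning.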

I believe this is a bug that was already fixed in newer Lustre releases.
You should run the Lustre "e2fsck -f" on the device while it is unmounted.

> Does this mean ldiskfs will become corrupted at some later time?
>
> Also, one OST reported messages like "Remounting ... read-only", so
> some files couldn't be written at that time. We ran e2fsck to fix
> it, but it has been reported again now.
> We have found that ldiskfs seems unstable since 1.6 (1.4 was better
> than 1.6).
> We are worried about problems like filesystem corruption. Can anyone
> give some suggestions?

You should update to a newer version of Lustre.

> 2009/12/4 Craig Prescott <[email protected]>
> Craig Prescott wrote:
> > Andreas Dilger wrote:
> >> Hmm, the code shouldn't be checking the checksums if the uninit_bg
> >> feature is not enabled. I believe this was fixed in ext4 already:
> >>
> >> in ldiskfs_group_desc_csum_verify() change it to be:
> >>
> >> int ldiskfs_group_desc_csum_verify(struct ext4_sb_info *sbi,
> >>                                    __u32 block_group,
> >>                                    struct ext4_group_desc *gdp)
> >> {
> >>         if ((sbi->s_es->s_feature_ro_compat &
> >>              cpu_to_le32(LDISKFS_FEATURE_RO_COMPAT_GDT_CSUM)) &&
> >>             (gdp->bg_checksum != ldiskfs_group_desc_csum(sbi,
> >>                                              block_group, gdp)))
> >>                 return 0;
> >>         return 1;
> >> }
> >
> > Ok, thanks. I'll try that.
> >
> > <snip>
> > Again, I really appreciate the help, and will let the list know how it
> > goes.
>
> Sadly, we didn't have any luck with this. We had written off the OST in
> our minds anyway, so to get any of the data back would have been a
> windfall.
>
> Wouldn't mount as ldiskfs with the group descriptor checksum disabled:
>
> Dec 3 10:58:05 tebow2 kernel: LDISKFS-fs error (device dm-7):
> ldiskfs_check_descriptors: Block bitmap for group 10112 not in group
> (block 484237063)!
> Dec 3 10:58:05 tebow2 kernel: LDISKFS-fs: group descriptors corrupted!
>
> Disabling that check and trying to mount yielded this one:
>
> Dec 3 11:01:13 tebow2 kernel: LDISKFS-fs error (device dm-7):
> ldiskfs_check_descriptors: Inode bitmap for group 10112 not in group
> (block 14342712)!
> Dec 3 11:01:13 tebow2 kernel: LDISKFS-fs: group descriptors corrupted!
>
> Disabling that check yielded this one:
>
> Dec 3 11:01:59 tebow2 kernel: LDISKFS-fs error (device dm-7):
> ldiskfs_check_descriptors: Inode table for group 10112 not in group
> (block 3538357782)!
> Dec 3 11:01:59 tebow2 kernel: LDISKFS-fs: group descriptors corrupted!
>
> All these messages were seen repeatedly in our fsck attempts. If we had
> been able to get past this group, several thousand more would have
> followed.
>
> Disabling the inode table present in group check:
>
> Dec 3 11:02:35 tebow2 kernel: ldiskfs: No journal on filesystem on dm-7
>
> At that point we tried to rewrite superblocks with mkfs.lustre and
> --mkfsoptions="-S", which panic'd the OSS. At that point, we gave up.
>
> Though it didn't work out this time, we'll be in a better position to be
> successful if this ever happens again.
>
> Thanks,
> Craig Prescott
> UF HPC Center
> _______________________________________________
> Lustre-discuss mailing list
> [email protected]
> http://lists.lustre.org/mailman/listinfo/lustre-discuss

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
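
The three "not in group" errors quoted in this thread all come from the same kind of range test on a group descriptor. The following is a rough sketch of that test, assuming a filesystem without flex_bg, so that each group's bitmaps and inode table must lie inside that group's own block range. The struct and function names are hypothetical and the geometry is simplified; this is not the ldiskfs_check_descriptors() source.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical, simplified filesystem geometry. */
struct fs_geom {
    uint64_t first_data_block;  /* block number where group 0 starts */
    uint64_t blocks_per_group;
    uint64_t blocks_count;      /* total blocks in the filesystem */
    uint64_t itb_per_group;     /* inode table blocks per group */
};

/* Hypothetical, simplified group descriptor: three block pointers. */
struct group_desc {
    uint64_t block_bitmap;
    uint64_t inode_bitmap;
    uint64_t inode_table;
};

/* Returns 1 if all three pointers fall inside the group, else prints a
 * message modeled on the log lines in this thread and returns 0. */
static int check_descriptor(const struct fs_geom *g, uint32_t group,
                            const struct group_desc *gd)
{
    uint64_t first = g->first_data_block +
                     (uint64_t)group * g->blocks_per_group;
    uint64_t last = first + g->blocks_per_group - 1;

    /* The last group may be shorter than the others. */
    if (last > g->blocks_count - 1)
        last = g->blocks_count - 1;

    if (gd->block_bitmap < first || gd->block_bitmap > last) {
        printf("Block bitmap for group %u not in group (block %" PRIu64 ")!\n",
               (unsigned)group, gd->block_bitmap);
        return 0;
    }
    if (gd->inode_bitmap < first || gd->inode_bitmap > last) {
        printf("Inode bitmap for group %u not in group (block %" PRIu64 ")!\n",
               (unsigned)group, gd->inode_bitmap);
        return 0;
    }
    /* The whole inode table, not just its first block, must fit. */
    if (gd->inode_table < first ||
        gd->inode_table + g->itb_per_group - 1 > last) {
        printf("Inode table for group %u not in group (block %" PRIu64 ")!\n",
               (unsigned)group, gd->inode_table);
        return 0;
    }
    return 1;
}
```

Because the function stops at the first failing pointer, bypassing one test simply exposes the next one, which matches the sequence of errors Craig reported for group 10112.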
