On Mar 20, 2008 13:48 +0100, Papp Tamás wrote:
> What could cause this error?
> Kernel: 2.6.9-42.0.10.EL_lustre-1.6.0.1custom-drbd and
> 2.6.9-55.0.9.EL_lustre.1.6.4.1smp (CentOS 4.4)
>
> After the node freezed up, his failover pair took over the resource, but
> it did it too.
>
> I've just looked back in logs and I see, this header corrupted messages
> some more times in the last few days.
> After I turned it on again, it freezed up in 10 minutes.
>
> Mar 20 10:57:19 node2 kernel: LDISKFS-fs: header is corrupted!
> Mar 20 10:57:19 node2 kernel: LDISKFS-fs: invalid magic = 0x281e
> Mar 20 10:57:19 node2 kernel: LDISKFS-fs: header is corrupted!

This means you have on-disk corruption and an "e2fsck -f" is needed
(while the filesystem is unmounted, of course).

> Mar 20 11:03:25 node2 kernel: ------------[ cut here ]------------
> Mar 20 11:03:25 node2 kernel: kernel BUG at
> /usr/src/redhat/BUILD/lustre-1.6.0.1/lustre/ldiskfs/extents.c:1751!

You have quite an old version of Lustre, and several ldiskfs bugs have
been fixed since then. I don't think it will BUG() on finding disk
errors anymore.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
