On Thu, 2008-03-13 at 12:34 +0100, Frank Mietke wrote:
> okay I've found the following in /var/log/messages before the bulk of above
> messages come. It seems that something with the RAID went wrong.

I don't see anything RAID-specific, however.

> Mar 13 06:17:31 chic2e24 kernel: [3068633.701448] attempt to access beyond
> end of device
> Mar 13 06:17:31 chic2e24 kernel: [3068633.701454] sda: rw=1,
> want=11287722456, limit=7796867072

This is pretty self-explanatory. Something tried to write (rw=1) beyond the
end of the disk, so something has a wrong idea of how big the disk is. Is it
possible that the disk format process was misled about the disk size during
initialization?

Andreas, does mkfs do any bounds checking to verify the sanity of the mkfs
request? I.e., when you specify a number of blocks for a filesystem, does it
make sure that many blocks are actually available on the device?

Frank, is it at all possible that the size of the device has somehow gotten
smaller since you first initialized it?

> Mar 13 06:17:31 chic2e24 kernel: [3068633.701555] attempt to access beyond
> end of device
> Mar 13 06:17:31 chic2e24 kernel: [3068633.701558] sda: rw=1,
> want=25366292592, limit=7796867072
> Mar 13 06:17:31 chic2e24 kernel: [3068633.701562] Buffer I/O error on device
> sda, logical block 3170786573
> Mar 13 06:17:31 chic2e24 kernel: [3068633.701785] lost page write due to I/O
> error on sda
> Mar 13 06:17:31 chic2e24 kernel: [3068633.702004] Aborting journal on device
> sda.

These are all just fallout error messages from the attempted write beyond the
end of the device.

> Mar 13 06:17:31 chic2e24 kernel: [3068633.702226] LustreError:
> 4493:0:(obd.h:1038:obd_transno_commit_cb()) chicfs-OST0010: transno
> 6510615555435490347 commit error: 2
> Mar 13 06:17:31 chic2e24 kernel: [3068633.702933] LDISKFS-fs error (device
> sda) in ldiskfs_reserve_inode_write: Journal has aborted
> Mar 13 06:17:31 chic2e24 kernel: [3068633.703587] Remounting filesystem
> read-only
> Mar 13 06:17:31 chic2e24 kernel: [3068633.704001] journal commit I/O error
> Mar 13 06:17:31 chic2e24 kernel: [3068633.704981] LDISKFS-fs error (device
> sda) in ldiskfs_dirty_inode: Journal has aborted

And this is the ldiskfs fallout.

b.
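
P.S. The numbers in the log can be sanity-checked with a little arithmetic.
The following is only a rough Python sketch, not anything taken from the
kernel or from mkfs. It assumes that want/limit are counted in 512-byte
sectors and that ldiskfs uses 4 KiB blocks on this OST; all function names
are made up for illustration, and fs_fits() is merely the kind of bounds
check one would hope a mkfs performs.

```python
SECTOR_BYTES = 512      # assumption: want/limit in the kernel message are 512-byte sectors
FS_BLOCK_BYTES = 4096   # assumption: ldiskfs block size on this OST is 4 KiB


def overshoot_sectors(want, limit):
    """Sectors by which a request ending at `want` runs past a device of `limit` sectors."""
    return max(0, want - limit)


def end_sector_of_block(block, block_bytes=FS_BLOCK_BYTES):
    """First sector after the given filesystem block (hypothetical helper)."""
    return (block + 1) * block_bytes // SECTOR_BYTES


def fs_fits(fs_blocks, device_sectors, block_bytes=FS_BLOCK_BYTES):
    """True if a filesystem of `fs_blocks` blocks fits on the device (hypothetical check)."""
    return fs_blocks * block_bytes <= device_sectors * SECTOR_BYTES


def tib(sectors):
    """Convert a sector count to TiB for easier reading."""
    return sectors * SECTOR_BYTES / 2**40


if __name__ == "__main__":
    limit = 7796867072  # from the log: limit=7796867072
    for want in (11287722456, 25366292592):  # from the log: want=...
        print(f"want={want} ({tib(want):.2f} TiB), "
              f"limit={limit} ({tib(limit):.2f} TiB), "
              f"overshoot={overshoot_sectors(want, limit)} sectors")
    # Does the reported logical block line up with the second 'want' value?
    print(end_sector_of_block(3170786573) == 25366292592)
```

Under those assumptions the device is about 3.6 TiB, the first failing
request ends at about 5.3 TiB, and the end sector computed from logical block
3170786573 matches want=25366292592 exactly, which at least suggests the
sector and block sizes assumed here are the right ones.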
