Package: e2fsprogs
Version: 1.37-2sarge1

Greetings,

I'm checking an ext3 file system on a ~229 GiB RAID-1 array (two "250
GB" drives; df reports 240362560 1K-blocks), and e2fsck segfaults.
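
For reference, a small sketch of how the df figure converts to binary and decimal units. The only input is the 1K-blocks count quoted above; the variable names are just illustrative.

```python
# df reports sizes in 1K-blocks, i.e. units of 1024 bytes.
DF_1K_BLOCKS = 240362560  # figure quoted from df in this report

size_bytes = DF_1K_BLOCKS * 1024
size_gib = size_bytes / 2**30   # binary gibibytes (GiB)
size_gb = size_bytes / 10**9    # decimal gigabytes (GB), as drive vendors count

print(f"{size_gib:.1f} GiB, {size_gb:.1f} GB")
```

So the file system is roughly 229 GiB, or about 246 GB in the decimal units used on the drive labels.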

Today it segfaulted while I wasn't there; the last message it printed
was: "i_fsize for inode xxxxx (...) is 234848, should be zero. Clear?"
(Pass 4?) Last night it segfaulted after going back for at least a Pass
1b to resolve a bunch of duplicate blocks; I'm not sure whether it
entered further passes.

The command line is:

e2fsck -y -C 0 /dev/md0

Today I ran it a third time under gdb. It got to about 93% and then, in
Pass 4, after a handful of non-zero i_file_acls, bad modes and an
illegal FIFO, it segfaulted. The last messages were:

Inode 2919156 ref count is 90, should be 1.  Fix? yes

i_file_acl for inode 5350463 (...) is 3961695421, should be zero.
Clear? yes

i_faddr for inode 5350463 (...) is 2987423426, should be zero.
Clear? yes

i_frag for inode 5350463 (...) is 51, should be zero.
Clear? yes

i_fsize for inode 5350463 (...) is 214, should be zero.
Clear? yes


Program received signal SIGSEGV, Segmentation fault.
0x400338fa in ext2fs_unmark_generic_bitmap () from /lib/libext2fs.so.2
(gdb) backtrace
#0  0x400338fa in ext2fs_unmark_generic_bitmap ()
from /lib/libext2fs.so.2
#1  0x0805712a in ?? ()
#2  0x00000000 in ?? ()
#3  0x0051a43f in ?? ()
#4  0xbffff830 in ?? ()
#5  0x080631d4 in _IO_stdin_used ()
#6  0x00000000 in ?? ()
#7  0xbffff8a5 in ?? ()
#8  0xbffff8a4 in ?? ()
#9  0x00000004 in ?? ()
#10 0x00000000 in ?? ()
#11 0x0051a43f in ?? ()
#12 0x00000000 in ?? ()
#13 0x00000000 in ?? ()
#14 0xbffff830 in ?? ()
#15 0x00000000 in ?? ()
#16 0x00000000 in ?? ()
#17 0x00000000 in ?? ()
#18 0xffffffff in ?? ()
#19 0xffffffff in ?? ()
#20 0xffffffff in ?? ()
#17 0x00000000 in ?? ()
#17 0x00000000 in ?? ()
#17 0x00000000 in ?? ()
... (not sure more of this is useful to copy from the screen)
#92 0x400bde6c in malloc_set_state () from /lib/libc.so.6
Previous frame inner to this frame (corrupt stack?)
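
As a rough sanity check on the values above, the sketch below compares the reported i_file_acl (which should be a block number) against the largest block count the file system could have. It assumes the df 1K-blocks figure approximates the file system size and tries the usual ext3 block sizes, since the actual block size isn't stated here. Whether this is related to the crash in ext2fs_unmark_generic_bitmap() is not established; it only shows the value is implausible as a block number.

```python
DF_1K_BLOCKS = 240362560      # from df, as quoted in this report
I_FILE_ACL = 3961695421       # value e2fsck printed for inode 5350463

def max_blocks(block_size):
    """Upper bound on the block count for a given block size (assumption:
    the df total is close to the real file system size)."""
    return DF_1K_BLOCKS * 1024 // block_size

def out_of_range(block_number):
    """True if the number exceeds the block count for every block size tried."""
    return all(block_number >= max_blocks(bs) for bs in (1024, 2048, 4096))

for bs in (1024, 2048, 4096):
    print(f"block size {bs}: at most {max_blocks(bs)} blocks")
print("i_file_acl out of range:", out_of_range(I_FILE_ACL))
```

Even with 1 KiB blocks the file system has at most about 240 million blocks, so a block number near 3.96 billion cannot refer to anything on this device.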

This is /home on a production server (but with a backup), so I'm not too
interested in taking it down for another hour and a half for testing --
though if/when it crashes again, I'll give it another try.  Right now
it's up and running with the errors. :-(

Also, we just migrated from an 80 GB RAID-5 array to this array; this is
on an ABIT BP6 HPT 366 controller (I know, bad, bad, bad! but we don't
have the money for a new server).  The old array on the same controller
had an MTBF of around 40 days before it crashed the machine; the new
array's MTBF was a few hours, so we upgraded to the RU BIOS and it
hasn't crashed since...

Thanks,

-Adam
-- 
GPG fingerprint: D54D 1AEE B11C CE9B A02B  C5DD 526F 01E8 564E E4B6

Welcome to the best software in the world today cafe!
http://www.take6.com/albums/greatesthits.html
