-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hans -

Most of the error messages make it pretty clear that the problems are
hardware related. The BUG() at the end *is* a software error; this is a
valid bug report. I'll take a look at it on Monday.

That said, I don't think that kernel error messages are the best place
for data recovery howtos. Even so, ReiserFS is not the only filesystem
affected by I/O errors; every disk filesystem is. ext3's failure
messages are very similar, and I haven't heard much confusion over them.

- -Jeff

Hans Reiser wrote:
> Chris/Jeff, can you modify your code to whenever it sees an I/O error,
> to say "I/O errors usually indicate bad hardware not bad software,
> probably you need to get a new disk and use dd_rescue to copy everything
> to it."?
>
> Thanks,
>
> Hans
>
> Linas Vepstas wrote:
>
>
>>Hi,
>>
>>I've been experimenting with automatic bus error recovery in the
>>2.6.11 kernel. During one of my failed experiments, I tripped over
>>a Reiserfs bug, below. Basically, my error recovery failed, which
>>means a SCSI disk went permanently offline, which, admittedly,
>>is pretty catastrophic, but shouldn't be a kernel panic. It seems
>>that reiser hits a 'BUG_ON' in this case.
>>
>>FWIW, in my limited experience with ext3 in the same exact situation,
>>it seems that ext3 handles this gracefully, returning -EIO to all
>>affected apps accessing the disk.
>>
>>Unfortunately, I don't know how to tell you how to reproduce this :)
>>
>>--linas
>>
>>
>>Here's dmesg leading up to the failure, and the stack traces are shown below.
>>
>><4>sym0:8:0: HOST RESET operation timed-out.
>><6>scsi: Device offlined - not ready after error recovery: host 0 channel 0
>>id 8 lun 0
>><3>scsi0 (8:0): rejecting I/O to offline device
>><3>scsi0 (8:0): rejecting I/O to offline device
>><3>Buffer I/O error on device sda3, logical block 8210
>><4>lost page write due to I/O error on sda3
>><4>ReiserFS: sda3: warning: journal-837: IO error during journal replay
>><2>REISERFS: abort (device sda3): Write error while updating journal header
>>in flush_journal_list
>><2>REISERFS: Aborting journal for filesystem on sda3
>><3>scsi0 (8:0): rejecting I/O to offline device
>><3>Buffer I/O error on device sda3, logical block 741
>><4>lost page write due to I/O error on sda3
>><3>Buffer I/O error on device sda3, logical block 742
>><4>lost page write due to I/O error on sda3
>><3>Buffer I/O error on device sda3, logical block 743
>><4>lost page write due to I/O error on sda3
>><3>Buffer I/O error on device sda3, logical block 744
>><4>lost page write due to I/O error on sda3
>><3>Buffer I/O error on device sda3, logical block 745
>><4>lost page write due to I/O error on sda3
>><3>Buffer I/O error on device sda3, logical block 746
>><4>lost page write due to I/O error on sda3
>><3>Buffer I/O error on device sda3, logical block 747
>><4>lost page write due to I/O error on sda3
>><3>Buffer I/O error on device sda3, logical block 748
>><4>lost page write due to I/O error on sda3
>><3>Buffer I/O error on device sda3, logical block 749
>><4>lost page write due to I/O error on sda3
>><3>scsi0 (8:0): rejecting I/O to offline device
>><4>ReiserFS: sda3: warning: clm-6006: writing inode 346759 on readonly FS
>><4>ReiserFS: sda3: warning: clm-6006: writing inode 346759 on readonly FS
>><4>ReiserFS: sda3: warning: clm-6006: writing inode 346759 on readonly FS
>><4>ReiserFS: sda3: warning: clm-6006: writing inode 346759 on readonly FS
>><4>ReiserFS: sda3: warning: clm-6006: writing inode 346759 on readonly FS
>><4>ReiserFS: sda3: warning: clm-6006: writing inode 346759 on readonly FS
>><2>kernel BUG in submit_ordered_buffer at fs/reiserfs/journal.c:616!
>><3>scsi0 (8:0): rejecting I/O to offline device
>>
>>
>>cpu 0x1: Vector: 700 (Program Check) at [c00000000fcef740]
>> pc: c000000000132ac8: .write_ordered_chunk+0xa4/0x100
>> lr: c000000000133274: .write_ordered_buffers+0x348/0x364
>> sp: c00000000fcef9c0
>> msr: 9000000000029032
>> current = 0xc00000000fea87b0
>> paca = 0xc00000000053b400
>> pid = 953, comm = reiserfs/1
>>kernel BUG in submit_ordered_buffer at fs/reiserfs/journal.c:616!
>>enter ? for help
>>1:mon> t
>>[c00000000fcefa60] c000000000133274 .write_ordered_buffers+0x348/0x364
>>[c00000000fcefc30] c000000000133af0 .flush_commit_list+0x80c/0x8cc
>>[c00000000fcefd10] c000000000138ac0 .flush_async_commits+0xf0/0xf4
>>[c00000000fcefdb0] c00000000006d2fc .worker_thread+0x258/0x32c
>>[c00000000fcefee0] c000000000073d80 .kthread+0x174/0x1c8
>>[c00000000fceff90] c000000000014240 .kernel_thread+0x4c/0x6c
>>1:mon>
>>1:mon> c
>>cpus stopped: 0-3
>>1:mon> c 0
>>0:mon> t
>>[c0000000004efdd0] c00000000000f948 .cpu_idle+0x3c/0x54
>>[c0000000004efe50] c00000000000c188 .rest_init+0x3c/0x58
>>[c0000000004efed0] c00000000049b7dc .start_kernel+0x27c/0x2fc
>>[c0000000004eff90] c00000000000c000 .__setup_cpu_power3+0x0/0x4
>>0:mon> c 2
>>2:mon> t
>>[c00000000424fe80] c00000000000f948 .cpu_idle+0x3c/0x54
>>[c00000000424ff00] c00000000003a878 .start_secondary+0x108/0x148
>>[c00000000424ff90] c00000000000bd28 .enable_64b_mode+0x0/0x28
>>2:mon> c 3
>>3:mon> t
>>[c000000004253e80] c00000000000f948 .cpu_idle+0x3c/0x54
>>[c000000004253f00] c00000000003a878 .start_secondary+0x108/0x148
>>[c000000004253f90] c00000000000bd28 .enable_64b_mode+0x0/0x28
>>
>>
>>
>>
>>
>
>
>

- --
Jeff Mahoney
SuSE Labs
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.5 (GNU/Linux)

iD8DBQFCMjW/LPWxlyuTD7IRAtQSAJ4yBQxFRPZcMmU/vo4mUcki6aZ/KgCfZrXP
qF9JJ+nVRQtT4vE0OIAtGlM=
=EtXb
-----END PGP SIGNATURE-----
