-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hans -

Most of the error messages make it pretty clear that the problems are
hardware related. The BUG() at the end *is* a software error; this is a
valid bug report. I'll take a look at it on Monday.
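
To illustrate the distinction, here is a rough userspace sketch. It is not
the code in fs/reiserfs/journal.c; fake_dev, submit_buffer, and write_blocks
are hypothetical names made up for this example. The assumption is simply
that a write to an offlined device should hand -EIO back to the caller
instead of tripping an assertion.

```c
#include <errno.h>

/* Hypothetical stand-in for a block device; not a kernel structure. */
struct fake_dev {
    int offline;   /* nonzero once the device has gone away */
};

/* Hypothetical submit routine: reports failure instead of asserting. */
static int submit_buffer(const struct fake_dev *dev, int block)
{
    (void)block;               /* the block number is unused in this sketch */
    if (dev->offline)
        return -EIO;           /* pass the hardware error up to the caller */
    return 0;
}

/* Write a run of blocks, stopping at the first error and returning it. */
static int write_blocks(const struct fake_dev *dev, int first, int count)
{
    for (int i = 0; i < count; i++) {
        int err = submit_buffer(dev, first + i);
        if (err)
            return err;
    }
    return 0;
}
```

With a fake_dev whose offline field is set, write_blocks() returns -EIO on
the first block and the caller decides what to do next; with an online
device it returns 0.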

That said, I don't think that kernel error messages are the best place
for data recovery howtos. Besides, ReiserFS is not the only filesystem
affected by I/O errors; every disk filesystem is. ext3's failure
messages are very similar, and I haven't heard much confusion over them.

- -Jeff

Hans Reiser wrote:
> Chris/Jeff, can you modify your code so that whenever it sees an I/O error,
> it says "I/O errors usually indicate bad hardware, not bad software;
> you probably need to get a new disk and use dd_rescue to copy everything
> to it."?
> 
> Thanks,
> 
> Hans
> 
> Linas Vepstas wrote:
> 
> 
>>Hi,
>>
>>I've been experimenting with automatic bus error recovery in the
>>2.6.11 kernel.  During one of my failed experiments, I tripped over
>>a Reiserfs bug, below.  Basically, my error recovery failed, which
>>means a SCSI disk went permanently offline, which, admittedly,
>>is pretty catastrophic, but shouldn't be a kernel panic.  It seems
>>that reiser hits a 'BUG_ON' in this case.
>>
>>FWIW, in my limited experience with ext3 in the same exact situation, 
>>it seems that ext3 handles this gracefully, returning -EIO to all 
>>affected apps accessing the disk.
>>
>>Unfortunately, I don't know how to tell you how to reproduce this :)
>>
>>--linas
>>
>>
>>Here's dmesg leading up to the failure, and the stack traces are shown below.
>>
>><4>sym0:8:0: HOST RESET operation timed-out.
>><6>scsi: Device offlined - not ready after error recovery: host 0 channel 0 id 8 lun 0
>><3>scsi0 (8:0): rejecting I/O to offline device
>><3>scsi0 (8:0): rejecting I/O to offline device
>><3>Buffer I/O error on device sda3, logical block 8210
>><4>lost page write due to I/O error on sda3
>><4>ReiserFS: sda3: warning: journal-837: IO error during journal replay 
>><2>REISERFS: abort (device sda3): Write error while updating journal header in flush_journal_list
>><2>REISERFS: Aborting journal for filesystem on sda3
>><3>scsi0 (8:0): rejecting I/O to offline device
>><3>Buffer I/O error on device sda3, logical block 741
>><4>lost page write due to I/O error on sda3
>><3>Buffer I/O error on device sda3, logical block 742
>><4>lost page write due to I/O error on sda3
>><3>Buffer I/O error on device sda3, logical block 743
>><4>lost page write due to I/O error on sda3
>><3>Buffer I/O error on device sda3, logical block 744
>><4>lost page write due to I/O error on sda3
>><3>Buffer I/O error on device sda3, logical block 745
>><4>lost page write due to I/O error on sda3
>><3>Buffer I/O error on device sda3, logical block 746
>><4>lost page write due to I/O error on sda3
>><3>Buffer I/O error on device sda3, logical block 747
>><4>lost page write due to I/O error on sda3
>><3>Buffer I/O error on device sda3, logical block 748
>><4>lost page write due to I/O error on sda3
>><3>Buffer I/O error on device sda3, logical block 749
>><4>lost page write due to I/O error on sda3
>><3>scsi0 (8:0): rejecting I/O to offline device
>><4>ReiserFS: sda3: warning: clm-6006: writing inode 346759 on readonly FS
>><4>ReiserFS: sda3: warning: clm-6006: writing inode 346759 on readonly FS
>><4>ReiserFS: sda3: warning: clm-6006: writing inode 346759 on readonly FS
>><4>ReiserFS: sda3: warning: clm-6006: writing inode 346759 on readonly FS
>><4>ReiserFS: sda3: warning: clm-6006: writing inode 346759 on readonly FS
>><4>ReiserFS: sda3: warning: clm-6006: writing inode 346759 on readonly FS
>><2>kernel BUG in submit_ordered_buffer at fs/reiserfs/journal.c:616!
>><3>scsi0 (8:0): rejecting I/O to offline device
>>
>>
>>cpu 0x1: Vector: 700 (Program Check) at [c00000000fcef740]
>>   pc: c000000000132ac8: .write_ordered_chunk+0xa4/0x100
>>   lr: c000000000133274: .write_ordered_buffers+0x348/0x364
>>   sp: c00000000fcef9c0
>>  msr: 9000000000029032
>> current = 0xc00000000fea87b0
>> paca    = 0xc00000000053b400
>>   pid   = 953, comm = reiserfs/1
>>kernel BUG in submit_ordered_buffer at fs/reiserfs/journal.c:616!
>>enter ? for help
>>1:mon> t
>>[c00000000fcefa60] c000000000133274 .write_ordered_buffers+0x348/0x364
>>[c00000000fcefc30] c000000000133af0 .flush_commit_list+0x80c/0x8cc
>>[c00000000fcefd10] c000000000138ac0 .flush_async_commits+0xf0/0xf4
>>[c00000000fcefdb0] c00000000006d2fc .worker_thread+0x258/0x32c
>>[c00000000fcefee0] c000000000073d80 .kthread+0x174/0x1c8
>>[c00000000fceff90] c000000000014240 .kernel_thread+0x4c/0x6c
>>1:mon>
>>1:mon> c
>>cpus stopped: 0-3
>>1:mon> c 0
>>0:mon> t
>>[c0000000004efdd0] c00000000000f948 .cpu_idle+0x3c/0x54
>>[c0000000004efe50] c00000000000c188 .rest_init+0x3c/0x58
>>[c0000000004efed0] c00000000049b7dc .start_kernel+0x27c/0x2fc
>>[c0000000004eff90] c00000000000c000 .__setup_cpu_power3+0x0/0x4
>>0:mon> c 2
>>2:mon> t
>>[c00000000424fe80] c00000000000f948 .cpu_idle+0x3c/0x54
>>[c00000000424ff00] c00000000003a878 .start_secondary+0x108/0x148
>>[c00000000424ff90] c00000000000bd28 .enable_64b_mode+0x0/0x28
>>2:mon> c 3
>>3:mon> t
>>[c000000004253e80] c00000000000f948 .cpu_idle+0x3c/0x54
>>[c000000004253f00] c00000000003a878 .start_secondary+0x108/0x148
>>[c000000004253f90] c00000000000bd28 .enable_64b_mode+0x0/0x28
>>
>>
>>
>> 
>>
> 
> 
> 


- --
Jeff Mahoney
SuSE Labs
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.5 (GNU/Linux)

iD8DBQFCMjW/LPWxlyuTD7IRAtQSAJ4yBQxFRPZcMmU/vo4mUcki6aZ/KgCfZrXP
qF9JJ+nVRQtT4vE0OIAtGlM=
=EtXb
-----END PGP SIGNATURE-----
