Re: Checksums wrong on one disk of mirror

2006-11-13 Thread Henrik Holst
David wrote: snip mdadm is version 1.12. Looking at the most recently available version, this seems incredibly out of date, but it seems to be the default installed in Ubuntu. Even Debian stable seems to have 1.9. I can bug this with them for an update if necessary. It's already on its way.

Re: Re[2]: RAID1 submirror failure causes reboot?

2006-11-13 Thread Jens Axboe
On Mon, Nov 13 2006, Neil Brown wrote: On Friday November 10, [EMAIL PROTECTED] wrote: Hello Neil, [87398.531579] blk: request botched That looks bad. Possibly some bug in the IDE controller or elsewhere in the block layer. Jens:

Re: Re[2]: RAID1 submirror failure causes reboot?

2006-11-13 Thread Neil Brown
On Monday November 13, [EMAIL PROTECTED] wrote: It doesn't sound at all unreasonable. It's most likely either a bug in the ide driver, or a bad bio being passed to the block layer (and later on to the request and driver). By bad I mean one that isn't entirely consistent, which could be a bug

[PATCH 001 of 4] md: Fix innocuous bug in raid6 stripe_to_pdidx

2006-11-13 Thread NeilBrown
stripe_to_pdidx finds the index of the parity disk for a given stripe. It assumes raid5 in that it uses disks-1 to determine the number of data disks. This is incorrect for raid6 but fortunately the two usages cancel each other out. The only way that 'data_disks' affects the calculation of
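
The preview above is cut off, so the following is only one reading of how the two usages of 'data_disks' could cancel out, written as a toy model in Python rather than kernel C. It assumes the value is first multiplied in to build a sector number and later divided back out to recover the stripe. All function names here are hypothetical and none of this is the actual md code.

```python
# Toy model (not kernel code): the same data_disks value is multiplied in
# when building a sector number and divided out again when recovering the
# stripe, so a wrong value still yields the right stripe.

def build_sector(chunk_number, chunk_offset, data_disks, sectors_per_chunk):
    """Hypothetical helper: turn a per-device chunk number into an array sector."""
    return chunk_number * data_disks * sectors_per_chunk + chunk_offset

def stripe_from_sector(sector, data_disks, sectors_per_chunk):
    """Hypothetical helper: recover the stripe number from an array sector."""
    chunk = sector // sectors_per_chunk
    return chunk // data_disks

def roundtrip(chunk_number, chunk_offset, data_disks, sectors_per_chunk):
    """Multiply data_disks in, then divide it back out."""
    sector = build_sector(chunk_number, chunk_offset, data_disks, sectors_per_chunk)
    return stripe_from_sector(sector, data_disks, sectors_per_chunk)

disks = 6
raid5_style = roundtrip(7, 3, disks - 1, 128)  # the "disks-1" assumption
raid6_style = roundtrip(7, 3, disks - 2, 128)  # two parity disks
print(raid5_style, raid6_style)
```

In this model both calls recover the same stripe, which is why a wrong 'data_disks' would be harmless as long as it is used consistently on both sides.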

[PATCH 003 of 4] md: Misc fixes for aligned-read handling.

2006-11-13 Thread NeilBrown
1/ When aligned requests fail (read error) they need to be retried via the normal method (stripe cache). As we cannot be sure that we can process a single read in one go (we may not be able to allocate all the stripes needed) we store a bio-being-retried and a list of

[PATCH 004 of 4] md: Fix a couple more bugs in raid5/6 aligned reads

2006-11-13 Thread NeilBrown
1/ We don't de-reference the rdev when the read completes. This means we need to record the rdev so it is still available in the end_io routine. Fortunately bi_next in the original bio is unused at this point so we can stuff it in there. 2/ We leak a cloned bio if the target rdev

Re: bio too big device dm-XX (256 255) on 2.6.17

2006-11-13 Thread Jure Pečar
Hello, this is getting more and more annoying. Somewhere in the stack reiserfs-dm-md-hd[bd] lies the problem that's causing bio too big device dm-10 (256 255) errors, which cause i/o failures. It works as expected on reiserfs-dm-sda and on ext3-dm-md-hd[bd]. Debian Etch, 2.6.17-2. On
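
As a rough illustration of the error being reported, the sketch below assumes the two numbers in the message are the size of the request in sectors (256) and the largest request the device accepts (255). That reading, the function names, and the per-layer limits are all assumptions for illustration, not taken from the kernel source or from this thread.

```python
# Illustrative sketch only: models a request-size check against a device
# limit, and the smallest limit across a stack of layers.

def bio_fits(bio_sectors, max_sectors):
    """True if a request of bio_sectors is within the device limit."""
    return bio_sectors <= max_sectors

def stack_limit(layer_limits):
    """A request must satisfy every layer, so the usable limit is the minimum."""
    return min(layer_limits)

# Hypothetical limits for a dm -> md -> disk stack, in sectors.
limits = [1024, 256, 255]

print(bio_fits(256, 255))             # the case from the error message
print(bio_fits(256, limits[1]))       # fits if only the middle layer is checked
print(bio_fits(256, stack_limit(limits)))
```

If an upper layer builds requests against a larger limit than the lowest layer accepts, a request can pass the upper check and still be rejected further down, which is one way such a message could arise.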

Re: bio too big device dm-XX (256 255) on 2.6.17

2006-11-13 Thread Neil Brown
On Tuesday November 14, [EMAIL PROTECTED] wrote: Hello, this is getting more and more annoying. Somewhere in the stack reiserfs-dm-md-hd[bd] lies the problem that's causing bio too big device dm-10 (256 255) errors, which cause i/o failures. It works as expected on reiserfs-dm-sda

Re: Re[2]: RAID1 submirror failure causes reboot?

2006-11-13 Thread Jens Axboe
On Tue, Nov 14 2006, Neil Brown wrote: On Monday November 13, [EMAIL PROTECTED] wrote: It doesn't sound at all unreasonable. It's most likely either a bug in the ide driver, or a bad bio being passed to the block layer (and later on to the request and driver). By bad I mean one that