Re: RAID 10 resync leading to attempt to access beyond end of device

2007-02-19 Thread John Stilson
Hey Neil, I tested this new patch and it seems to work! I'm going to do some more vigorous testing, and I'll let you know if any more issues bubble out. Thanks! -John On 2/15/07, Neil Brown <[EMAIL PROTECTED]> wrote: On Thursday February 15, [EMAIL PROTECTED] wrote: > Ok tried the patch and go

Re: RAID 10 resync leading to attempt to access beyond end of device

2007-02-15 Thread Neil Brown
On Thursday February 15, [EMAIL PROTECTED] wrote: > Ok tried the patch and got a kernel BUG this time (BUG_ON(k == conf->copies)?) Thanks obviously I missed some subtlety. I think I have it right now. I've tested this against a setup which I think is sufficiently identical to yours this time

Re: RAID 10 resync leading to attempt to access beyond end of device

2007-02-15 Thread John Stilson
Oh, an additional piece of information I just realized I had not put in my original email is that this failure only happens intermittently -- 50%-75% of the time a rebuild occurs -John On 2/15/07, John Stilson <[EMAIL PROTECTED]> wrote: Ok tried the patch and got a kernel BUG this time (BUG_ON(k

Re: RAID 10 resync leading to attempt to access beyond end of device

2007-02-15 Thread John Stilson
Ok tried the patch and got a kernel BUG this time (BUG_ON(k == conf->copies)?) -John Feb 15 12:52:35 testsvr kernel: md: recovery of RAID array md0 Feb 15 12:52:35 testsvr kernel: md: minimum _guaranteed_ speed: 1000 KB/sec/disk. Feb 15 12:52:35 testsvr kernel: md: using maximum available idle

Re: RAID 10 resync leading to attempt to access beyond end of device

2007-02-14 Thread Neil Brown
On Wednesday February 14, [EMAIL PROTECTED] wrote: > Feb 14 16:23:45 testsvr kernel: attempt to access beyond end of device > Feb 14 16:23:45 testsvr kernel: sdc1: rw=1, want=901904331651136, > limit=16081002 That 'want=' value is an enormous number! 52 bits. Looks a lot like an uninitialised var
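
As a rough sanity check on the numbers in the log excerpt above, the sketch below compares want= against limit= and measures how wide the requested value is. The two constants are copied from the quoted log; the helper name describe_request is hypothetical and used only for illustration. The value prints as 13 hex digits (13 x 4 = 52 bits of hex width), which is in line with the "52 bits" remark, while its exact bit length is 50.

```python
# Values copied verbatim from the quoted kernel log line.
WANT = 901904331651136   # want=  (offset the request tried to reach)
LIMIT = 16081002         # limit= (size of sdc1 as reported by the kernel)


def describe_request(want: int, limit: int) -> dict:
    """Summarise how far out of range a block request is.

    Hypothetical helper for illustration; not part of any md or kernel tool.
    """
    return {
        "beyond_end": want > limit,       # True if the request overruns the device
        "bit_length": want.bit_length(),  # exact number of significant bits
        "hex_digits": len(f"{want:x}"),   # width when printed in hexadecimal
        "ratio": want // limit,           # how many device-lengths past the start
    }


info = describe_request(WANT, LIMIT)
print(info)
```

A value tens of millions of times larger than the device is far more likely to be garbage (for example an uninitialised or stale variable, as suggested in the reply) than a plausible off-by-a-few-sectors error.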

RAID 10 resync leading to attempt to access beyond end of device

2007-02-14 Thread John Stilson
Hi, I'm experiencing what appears to be a kernel bug in the raid10 driver, where immediately after a resync completes an access beyond the end of the rebuilt disk is attempted which causes the disk to be failed. The system is a single-processor dual-core Xeon 3000 at 1.86GHz. It has four 2