Re: drive failure during rebuild causes page fault

2005-05-22 Thread Joe Rhett
You need to overwrite the metadata (see above) which are located in different places again depending on metadata format. So where is it located with the sil3114 controller? (same as 3112, but with 4 ports...) On Sun, May 22, 2005 at 12:45:05AM +0200, Søren Schmidt wrote: Depends on what

Re: drive failure during rebuild causes page fault

2005-05-22 Thread Søren Schmidt
On 22/05/2005, at 18:11, Joe Rhett wrote: You need to overwrite the metadata (see above) which are located in different places again depending on metadata format. So where is it located with the sil3114 controller? (same as 3112, but with 4 ports...) On Sun, May 22, 2005 at 12:45:05AM

Re: drive failure during rebuild causes page fault

2005-05-21 Thread Søren Schmidt
On 21/05/2005, at 1:10, Joe Rhett wrote: On Thu, May 19, 2005 at 08:21:13AM +0200, Søren Schmidt wrote: On 19/05/2005, at 2.20, Joe Rhett wrote: Soren, I've just retested all of this with 5.4-REL and most of the problems listed here are solved. The only problems appear to be related to

Re: drive failure during rebuild causes page fault

2005-05-20 Thread Joe Rhett
On Thu, May 19, 2005 at 08:21:13AM +0200, Søren Schmidt wrote: On 19/05/2005, at 2.20, Joe Rhett wrote: Soren, I've just retested all of this with 5.4-REL and most of the problems listed here are solved. The only problems appear to be related to these ghost arrays that appear when it

Re: drive failure during rebuild causes page fault

2005-05-19 Thread Søren Schmidt
On 19/05/2005, at 2.20, Joe Rhett wrote: Soren, I've just retested all of this with 5.4-REL and most of the problems listed here are solved. The only problems appear to be related to these ghost arrays that appear when it finds a drive that was taken offline earlier. For example, pull a

Re: drive failure during rebuild causes page fault

2005-05-18 Thread Joe Rhett
Soren, I've just retested all of this with 5.4-REL and most of the problems listed here are solved. The only problems appear to be related to these ghost arrays that appear when it finds a drive that was taken offline earlier. For example, pull a drive and then reboot the system. 1. If you

Re: drive failure during rebuild causes page fault

2004-12-16 Thread Peter Jeremy
On Wed, 2004-Dec-15 19:16:59 -0500, asym wrote: [audio jukebox] what would be your recommendations for this particular (and very limited) application? Honestly I'd probably go for a RAID1+0 setup. It wastes half the space in total for mirroring, but it has none of the performance penalties of

Re: drive failure during rebuild causes page fault

2004-12-15 Thread asym
At 18:16 12/15/2004, Gianluca wrote: barracudas and at this point I wonder if it's best to go w/ a small hw raid controller like the 3ware 7506-4LP or use sw raid. I don't really care about speed (I know RAID5 is not the best for that) nor hot swapping, my main concern is data integrity. I tried

Re: drive failure during rebuild causes page fault

2004-12-15 Thread asym
At 18:57 12/15/2004, Gianluca wrote: actually all the data I plan to keep on that server is gonna be backed up, either to cdr/dvdr or in the original audio cds that I still have. what I meant by integrity is trying to avoid having to go back to the backups to restore 120G (or more in this case)

Re: drive failure during rebuild causes page fault

2004-12-15 Thread Gianluca
Hello, I've been following this thread w/ apprehension since I'm in the process of putting together my first RAID server. maybe this problem has nothing to do w/ what I have in mind but I figure I'd ask the experts first. I want to make a fileserver for home use, mostly as a music jukebox and

Re: drive failure during rebuild causes page fault

2004-12-15 Thread Gianluca
If you're thinking of using RAID instead of good timely backups, you need to go back to the drawing board, because that is not what RAID is intended to replace -- and is something it cannot replace. actually all the data I plan to keep on that server is gonna be backed up, either to cdr/dvdr

Re: drive failure during rebuild causes page fault

2004-12-14 Thread Joe Rhett
Soren, do you have any thoughts on what I could do to alleviate or better debug this page fault? I've found three ways to cause this (in all cases "pull" is either a physical pull or atacontrol detach channel): 1. Pull a drive and rebuild onto hot spare. Pull hot spare *boom* 2. Pull a drive and

Re: drive failure during rebuild causes page fault

2004-12-14 Thread Joe Rhett
On Tue, Dec 14, 2004 at 07:58:53AM +0100, Søren Schmidt wrote: Anyhow. I can only test with the HW I have here in the lab, which by far covers all possible permutations, so testing etc by the community is very much needed here to get things sorted out... So this system is just my sandbox in

Re: drive failure during rebuild causes page fault

2004-12-13 Thread Doug White
On Sun, 12 Dec 2004, Joe Rhett wrote: On Sun, Dec 12, 2004 at 09:59:16PM -0800, Doug White wrote: That's a nice shotgun you have there. Yessir. And that's what testing is designed to uncover. The question is why this works, and how do we prevent it? I'm sure Soren appreciates you donating

Re: drive failure during rebuild causes page fault

2004-12-13 Thread Joe Rhett
On Sun, Dec 12, 2004 at 09:59:16PM -0800, Doug White wrote: That's a nice shotgun you have there. On Sun, 12 Dec 2004, Joe Rhett wrote: Yessir. And that's what testing is designed to uncover. The question is why this works, and how do we prevent it? On Mon, Dec 13, 2004 at 10:28:53AM

Re: drive failure during rebuild causes page fault

2004-12-13 Thread Paul Mather
On Mon, 2004-12-13 at 10:28 -0800, Doug White wrote: On Sun, 12 Dec 2004, Joe Rhett wrote: On Sun, Dec 12, 2004 at 09:59:16PM -0800, Doug White wrote: That's a nice shotgun you have there. Yessir. And that's what testing is designed to uncover. The question is why this works, and

Re: drive failure during rebuild causes page fault

2004-12-13 Thread Joe Rhett
On Mon, Dec 13, 2004 at 04:03:06PM -0500, Paul Mather wrote: That's not quite fair. He was obviously testing to see how resilient ATA RAID is to drive failures during rebuilding, as part of a series of tests. (Obviously, it is not.) If you look at his original message, he did not even yank

Re: drive failure during rebuild causes page fault

2004-12-13 Thread Doug White
On Mon, 13 Dec 2004, Joe Rhett wrote: This is why I don't trust ATA RAID for fault tolerance -- it'll save your data, but the system will tank. Since the disk state is maintained by the OS and not abstracted by a separate processor, if a disk dies in a particularly bad way the system may

Re: drive failure during rebuild causes page fault

2004-12-13 Thread Søren Schmidt
Doug White wrote: On Mon, 13 Dec 2004, Joe Rhett wrote: This is why I don't trust ATA RAID for fault tolerance -- it'll save your data, but the system will tank. Since the disk state is maintained by the OS and not abstracted by a separate processor, if a disk dies in a particularly bad way the

drive failure during rebuild causes page fault

2004-12-12 Thread Joe Rhett
And another, I can now confirm that it is fairly easy to kill 5.3-release during the rebuilding process. The following steps will cause a kernel page fault consistently: atacontrol create RAID0 ad6 ad10 atacontrol detach 5 log: ad10 deleted from ar0 disk1 log: ad10 WARNING -

Re: drive failure during rebuild causes page fault

2004-12-12 Thread Doug White
On Sun, 12 Dec 2004, Joe Rhett wrote: And another, I can now confirm that it is fairly easy to kill 5.3-release during the rebuilding process. The following steps will cause a kernel page fault consistently: atacontrol create RAID0 ad6 ad10 atacontrol detach 5 log: ad10 deleted from

Re: drive failure during rebuild causes page fault

2004-12-12 Thread Joe Rhett
And here's where I found even more interesting stuff. (again with the sil3114 controller) If you detach a channel and then attach the channel, a new raid device gets created. And the removed drive shows up in the new array... # atacontrol create RAID0 ad6 ad8 # atacontrol

Re: drive failure during rebuild causes page fault

2004-12-12 Thread Joe Rhett
On Sun, 12 Dec 2004, Joe Rhett wrote: And another, I can now confirm that it is fairly easy to kill 5.3-release during the rebuilding process. The following steps will cause a kernel page fault consistently: atacontrol create RAID0 ad6 ad10 atacontrol detach 5 log: ad10 deleted