Re: Adaptec 2400A RAID controller corrupting data (4.8)

2003-07-16 Thread Matt Staroscik
I am going to break this saga into 2 posts, one with the ugly details for 
those who are interested, and one short post with the essential questions 
and observations.

> I have that card with six 60 gig drives and set the box up (FreeBSD 4.7?)
> and it would run for a day or so and just crash.  I also recall having
> similar panics when moving large amounts of data.  I've given up on
> using the box for any real work so it's just sitting doing nothing
> waiting... hoping for a solution... a glimmer of hope.  ;-)
>
> If you get it working please post.

Here is an update. While I have made progress I am not 100% hopeful for a 
solution that is stable in the long term.

To make a long story short, I seem to have made the system much more stable by 
turning off soft updates. I was able to do a make buildworld, and then 
delete the contents of /usr/obj. Previously, one of those actions was sure 
to trigger a panic. Before I tried disabling soft updates I also did all 
this, some of which I readily admit is voodoo:

- Replaced cables
- Jumpered drives to Master instead of Cable Select
- Changed RAID card PCI slot
- Wiggled everything

I continued my test by cvsupping my source and doing another make 
buildworld. However, this time it bombed out while working on groff. I 
checked the file in an editor and it didn't look munged, so I am not sure 
if there is an error in the cvs tree, an innocent file transfer error, or a 
sign of deeper issues with my disk subsystem. I am going to thrash the 
machine with more builds but avoid CVS for now.
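
Eyeballing a file in an editor will not catch a few flipped bytes. One way to separate a bad transfer from on-disk corruption is to compare a checksum of the suspect file against a copy known to be good (for example, one fetched on another machine). A minimal sketch in Python, assuming such a reference copy exists; the paths in the usage comment are hypothetical:

```python
import hashlib

def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # Read fixed-size chunks so large files do not need to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def same_content(path_a, path_b):
    """True if both files hash to the same digest."""
    return file_digest(path_a) == file_digest(path_b)

# Hypothetical usage: compare the file on the array with a reference copy.
# same_content("/usr/src/path/to/suspect_file", "/mnt/reference/suspect_file")
```

If the digests match, the file on the array is intact and the build failure lies elsewhere. If they differ, repeating the comparison after a fresh copy would show whether the damage recurs on the array.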

Unfortunately, turning off soft updates isn't a great solution, if indeed 
it IS a solution, which I am still testing. It definitely makes things 
slower. My buildworld went from about 23 minutes to 34 minutes this way. 
Removing the contents of /usr/obj took about 1 minute, whereas with soft 
updates it took only a few seconds (though it panicked afterwards).
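
To put a number on the buildworld slowdown, here is the arithmetic as a short Python sketch. It uses only the two approximate timings quoted above; the helper name is made up for illustration.

```python
def percent_increase(before, after):
    """Relative increase from 'before' to 'after', in percent."""
    return (after - before) / before * 100.0

# Approximate buildworld times in minutes, with and without soft updates.
buildworld = percent_increase(23, 34)
print(f"buildworld: {buildworld:.0f}% longer without soft updates")
# prints "buildworld: 48% longer without soft updates"
```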

Update: I created a custom kernel config (adding only device pcm and 
removing nothing) and successfully built it. I then installed it, rebooted, 
and tried to make installworld. Bomb city! getty dumped core before I even 
logged in and it got worse from there.

Then I tried deleting /usr/obj and I got the kernel panic again. :)

Observation: My last 2 panics (ffs_blkfree) reported these block numbers: 
54608, 54592. Those are awfully close. Could my trouble stem from a defect 
on a disk?
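
How close is "awfully close"? A rough sketch of the arithmetic in Python follows. Two things here are assumptions, not facts from the panic output: that both numbers refer to the same filesystem, and that they are counted in 2048-byte fragments. The real unit depends on how the filesystem was created.

```python
# Block numbers reported by the last two ffs_blkfree panics.
blocks = [54608, 54592]

# ASSUMPTION: the numbers are in filesystem fragments of 2048 bytes.
# Check the actual fragment size of the filesystem before relying on this.
FRAG_SIZE = 2048

gap = abs(blocks[0] - blocks[1])
print(gap)              # prints 16
print(gap * FRAG_SIZE)  # prints 32768 (bytes, under the assumption above)
```

Under those assumptions the two blocks sit about 32 KB apart. That would fit a small damaged region, but it would equally fit one piece of corrupted filesystem metadata being hit twice, so it does not settle the question by itself.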

Things I have yet to try:

- Removing the Maxtor 160s from the RAID and trying them individually on 
the motherboard controller.
- Applying a hammer to the system

Cheers,
Matt
___
[EMAIL PROTECTED] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: Adaptec 2400A RAID controller corrupting data (4.8)

2003-07-15 Thread Matt Bettinger
On Tue, 15 Jul 2003 19:59:32 -0700
Matt Staroscik <[EMAIL PROTECTED]> wrote:

> 
> I have started building myself a new file server, and at the heart of
> it are 2 Maxtor 160GB drives in a RAID-1, using the Adaptec 2400a. 
> Unfortunately I am having some kind of issue with data on the array
> getting corrupted.
> 
> During disk activity (like makeworld, cvsup, rm -rf /usr/obj/*) I get 
> kernel panics like this:
> 
> dev=#da/0x20006, block=54608, fs=/usr
> panic: ffs_blkfree: freeing free block
> 
> 
> Weird thing is, this box was working GREAT just a day or so ago. I had
> multiple successful builds on the RAID array. But something has gone
> south.
> 
> Many thanks in advance. When/if I solve this I will post a followup...
> 
> Cheers,
> Matt

I have that card with six 60 gig drives and set the box up (FreeBSD 4.7?)
and it would run for a day or so and just crash.  I also recall having
similar panics when moving large amounts of data.  I've given up on
using the box for any real work so it's just sitting doing nothing
waiting... hoping for a solution... a glimmer of hope.  ;-)

If you get it working please post.

Regards, 

-mb