jeff- i am using 2.2.14 with mingo patch, and it is great. i have a dozen or
so boxes, 512meg, SMP pIII 450, ncr scsi, etc in this config. all are fine.
it would be interesting to see if raid is the issue, or your adaptec (i am
inclined to think the latter).
1. swap scsi cards. i like symbios/ncr 53c875 and 876 controllers. they are
cheaper than adaptecs, and are much better with borderline cabling, and
recover from disk hangs. i have found the adaptec does not.
2. turn off raid temporarily. to do this, try something like this- but BE
CAREFUL, i have not tried this with the hacked raid1 lilo...
a. use the raidhotremove command to take the sdbX partitions out of their
respective raids. that way, when you return the system to its raid state
later, /dev/sda will be considered the most recent copy of your data.
b. change the partition types of ALL the raid constituents from fd to 83
(fd is the raid autodetect type, 83 is plain linux).
c. change your /etc/fstab to mount /dev/sdaX instead of /dev/md0 (where X is
the partition corresponding to the one in your raid set; check your raidtab
for that info)
d. change your lilo config to use /dev/sdaX as root instead of /dev/md0
(again- where X is the partition on sda that is part of your root raid device)
e. run lilo.
now, when you reboot, make sure you choose the kernel that you changed the
root= for, and you will be running a simple, non-raid system.
see if you can get the lockups.
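
for steps c and d, here is a rough, untested-on-a-real-box sketch of the
lookup and the fstab edit in python. it only transforms text and does not
touch any files. the sample raidtab and fstab are made up for illustration,
modeled on the md0 = sda2 + sdb2 layout from your mdstat, and the function
names are hypothetical helpers, not part of raidtools.

```python
def raid_members(raidtab_text):
    """Map each md device in a raidtab to its member partitions, in file order."""
    members = {}
    current = None
    for line in raidtab_text.splitlines():
        fields = line.split("#", 1)[0].split()  # drop comments, split on whitespace
        if len(fields) < 2:
            continue
        if fields[0] == "raiddev":
            current = fields[1]
            members[current] = []
        elif fields[0] == "device" and current is not None:
            members[current].append(fields[1])
    return members


def fstab_without_raid(fstab_text, replacements):
    """Swap the device column of fstab lines according to `replacements`."""
    out = []
    for line in fstab_text.splitlines():
        fields = line.split()
        if fields and fields[0] in replacements:
            # Only the first occurrence (the device column) is replaced.
            line = line.replace(fields[0], replacements[fields[0]], 1)
        out.append(line)
    return "\n".join(out) + "\n"


# Hypothetical sample files, assumed for illustration only.
SAMPLE_RAIDTAB = """\
raiddev /dev/md0
    raid-level            1
    nr-raid-disks         2
    persistent-superblock 1
    chunk-size            4
    device                /dev/sda2
    raid-disk             0
    device                /dev/sdb2
    raid-disk             1
"""
SAMPLE_FSTAB = (
    "/dev/md0  /      ext2  defaults  0 1\n"
    "proc      /proc  proc  defaults  0 0\n"
)

if __name__ == "__main__":
    members = raid_members(SAMPLE_RAIDTAB)
    # Pick the member that lives on sda for each array.
    on_sda = {md: part
              for md, parts in members.items()
              for part in parts if part.startswith("/dev/sda")}
    print(fstab_without_raid(SAMPLE_FSTAB, on_sda), end="")
```

the same /dev/md0 to /dev/sdaX mapping is what you put in root= in lilo.conf.
check the result by eye before copying it over the real /etc/fstab.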
reverse the procedure to get the system back into raid:
a. change lilo.conf, setting root=/dev/md0
b. run lilo
c. edit fstab, moving back to md's
d. change partition types to fd instead of 83
e. reboot. system will bitch about /dev/sdbX being old, and all the arrays
will be degraded, but syncing.
f. monitor /proc/mdstat to make sure the raid is reconstructing. if not, you
may need some help from the raidhotadd command.
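
for step f, a small python sketch of what to look for in /proc/mdstat. it
assumes the one-line-per-array format shown in jeff's mdstat below and only
reads the [UU] style flags, where an underscore marks a missing mirror. the
degraded sample is hypothetical, and the function name is made up for this
example.

```python
import re

ARRAY_RE = re.compile(r"^(md\d+) : ")
FLAGS_RE = re.compile(r"\[\d+/\d+\] \[([U_]+)\]")


def array_states(mdstat_text):
    """Return {array: {"flags": "UU", "degraded": bool}} from mdstat text."""
    states = {}
    current = None
    for line in mdstat_text.splitlines():
        start = ARRAY_RE.match(line)
        if start:
            current = start.group(1)
        flags = FLAGS_RE.search(line)
        if flags and current is not None:
            states[current] = {
                "flags": flags.group(1),
                "degraded": "_" in flags.group(1),  # "_" means a member is missing
            }
    return states


# Taken from the mdstat quoted in this thread.
HEALTHY = """\
Personalities : [linear] [raid0] [raid1] [translucent]
read_ahead 1024 sectors
md0 : active raid1 sdb2[1] sda2[0] 8739264 blocks [2/2] [UU]
unused devices: <none>
"""

# Hypothetical: what the md0 line might look like with sdb2 removed.
DEGRADED = """\
Personalities : [linear] [raid0] [raid1] [translucent]
read_ahead 1024 sectors
md0 : active raid1 sda2[0] 8739264 blocks [2/1] [U_]
unused devices: <none>
"""

if __name__ == "__main__":
    for name, text in (("healthy", HEALTHY), ("degraded", DEGRADED)):
        print(name, array_states(text))
```

on the real box you would feed it open("/proc/mdstat").read() every so often.
an array that stays degraded with no sign of a resync is the case where
raidhotadd comes in.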
allan
Jeff Hill <[EMAIL PROTECTED]> said:
> Jakob Østergaard wrote:
> >
> > On Sat, 25 Mar 2000, Jeff Hill wrote:
> >>--snip--<<
> > > My system hangs for 30 seconds to 5 minutes several times a day using a
> > > vanilla kernel 2.2.14 from ftp.kernel.org with a 2.2.14 RAID patch from
> > > Redhat on my Debian (Potato version) server. When the system hangs, it
> >>--snip--<<
> > Is it a SMP system ?
>
> Nope. ASUS P3B-F motherboard w/Intel 440BX AGPset, 512MB PC100SDRAM,
> Pentium III 450Mhz
> I removed SMP support when I compiled the kernel.
>
> > > My mdstat reads:
> > > Personalities : [linear] [raid0] [raid1] [translucent]
> > > read_ahead 1024 sectors
> > > md0 : active raid1 sdb2[1] sda2[0] 8739264 blocks [2/2] [UU]
> > > unused devices: <none>
> > Disable translucent mode ! It's not intended to be used yet.
>
> I assume the only way to disable it is to recompile the kernel?
>
> >>--snip--<<
> > It sounds pretty strange what you're seeing. It would be very interesting
> > to see if you could reproduce your problems without RAID. You're running
> > RAID-1, so you should be able to simply not start the RAID devices, and
> > then mount one of the mirrors disregarding the /dev/mdX devices.
>
> Sorry for my ignorance of RAID, but I'm not certain I'm following. How
> do you not start RAID when it is compiled into the kernel to automount
> /dev/md0 at root (I use the RedHat lilo version that allows this). Doing
> a "raidstop /dev/md0"? Or reboot to a 2.2.14 kernel without RAID
> compiled in?
>
> I would just shoot in the dark at this, but I'm a little paranoid as it
> is my main webserver (yes, I should have done more testing before making
> it a production machine).
>
> Thanks for the assistance.
>
> Jeff Hill
>
> --
> ------------------------------------------------------------
> ------ HR On-Line: The Network for Workplace Issues ------
> http://www.hronline.com - Ph:416-604-7251 - Fax:416-604-4708
> ------------------------------------------------------------
>