Re: Root RAID and unmounting /boot

1999-10-27 Thread Jakob Østergaard

On Tue, Oct 26, 1999 at 09:07:34AM -0600, Marcos Lopez wrote:
 Egon Eckert wrote:
  
   syslogd   378   root5w   REG9,18548   192396
   /var/log/boot.log
   klogd 389   root2r   REG8,1  191102   12
   /boot/System.map-2.2.12-20
  
   Is it safe to kill these?
  
  These are loggers, so I guess nothing terrible would happen.  But I
  wouldn't kill them anyway..  I would fix the problem, not make it worse.
  
   Also I would be quite grateful if someone could explain to me why I must
   unmount /boot in order for the lilo -r /mnt/newroot to work?
  
  I don't unmount anything before running lilo.
 
 Okay, I tried performing that function without doing the umount. I tried
 doing it without copying the files in /boot over to /mnt/newroot/boot, but
 it couldn't find /boot/boot.b.  So I copied over all the files in /boot
 to /mnt/newroot/boot.  Then after running "lilo -r /mnt/newroot" I get
 the following error:
 "Sorry, don't know how to handle device 0x0900"
 
 What does the above mean and how do I fix it?
 

It's because /mnt/newroot/boot is not mounted on your boot partition, but
is a regular subdirectory on your /mnt/newroot RAID.

umount /boot, mount /mnt/newroot/boot.  Then run LILO.
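As a rough sketch of that sequence (the boot partition device is an example; substitute your own, and make sure /boot is not busy first):

```shell
# Hedged sketch: /dev/hda1 stands in for your real boot partition.
umount /boot                        # free the real boot partition
mount /dev/hda1 /mnt/newroot/boot   # remount it under the new root tree
lilo -r /mnt/newroot                # install LILO relative to the new root
```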

-- 

: [EMAIL PROTECTED]  : And I see the elder races, :
:.: putrid forms of man:
:   Jakob Østergaard  : See him rise and claim the earth,  :
:OZ9ABN   : his downfall is at hand.   :
:.:{Konkhra}...:



RE: Build in degraded mode?

1999-10-27 Thread Tom Livingston

Mark Spencer wrote:
  as "failed-disk" as opposed to "raid-disk" in your
  /etc/raidtab file.  The device you give it is not even
  actually accessed or written to;  you could put /foo/bar
  and it will still build degraded without an error, if I'm
  not mistaken.

 mkraid says "unrecognized option "failed-disk" detected error line 13...

You need a newer version of the raidtools package and the kernel patch.  You
can get the latest at ftp://ftp.us.kernel.org/pub/linux/daemons/raid/alpha

This is the only way to build in degraded mode, so you will need to upgrade
if you want to accomplish this.
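With the newer raidtools, a degraded build is requested in /etc/raidtab roughly like this (a hedged sketch; device names and the RAID level are illustrative, not taken from the poster's setup):

```
raiddev /dev/md0
    raid-level            1
    nr-raid-disks         2
    chunk-size            4
    persistent-superblock 1
    device                /dev/sda1
    raid-disk             0
    device                /dev/sdb1
    failed-disk           1    # placeholder slot; this device is not written to
```

mkraid then builds the array with the failed-disk slot missing, so you can add the real second disk later with raidhotadd.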

tom



AW: raid not auto detecting

1999-10-27 Thread Gordon Booth

Which kernel version are you using??  Have you patched it with the latest
RAID patches?

I'm running a stock rh 6.0 install.  Where can I look to autodetect and
start my raid arrays?




RE: Root RAID and unmounting /boot

1999-10-27 Thread Bruno Prior

 my lilo.conf:
 disk=/dev/hda
  bios = 0x80
  sectors = 63
  heads = 128
  cylinders = 779
  partition = /dev/hda1
  start = 63

 boot=/dev/hda
 map=/boot/map
 install=/boot/boot.b
 prompt
 timeout=50
 default=linux
 image=/boot/bzImage
  label=linux
  read-only
  root=/dev/md0
  append="ether=0,0,0,0,eth1"

 Note the disk paragraph. My setup is hda1 as boot-disk (32mb) and the rest
 of my 2 hd's as raid0. Why exactly this is needed, I don't know. But without
 it, I also got the 0x0900 from lilo.

In the context of booting from RAID, the disk geometry is only needed if the
boot-loader is trying to read from or write to a file on a RAID-1 array that
contains that partition. But in that case, the disk= and partition= lines would
point to RAID arrays, not normal disks and partitions (see Harald
Nordgård-Hansen's message to the list of 22 August on the "Booting Root RAID 1
Directly _Is_ Possible" thread). You may need this section in your lilo.conf to
enable booting, but it has nothing to do with RAID, nor should it be a cure for
the "Sorry, don't know how to handle device 0x0900" error, particularly as I
believe your /dev/md0 is RAID-0, which cannot be read by lilo whatever
parameters you pass to it.

The lines in lilo.conf that could produce this error by pointing lilo at
/dev/md0 are the "boot=", "map=", "install=", "loader=" or "image=" lines. In
this case, it is clearly not the "boot=/dev/hda" line, as that is a
straightforward installation of the boot-loader to the MBR on the first IDE
disk. The "map=", "install=" and "image=" lines all point to files in /boot, and
the absent "loader=" line defaults to /boot/chain.b as well, so if this is
really the error message you are receiving, your problem is clearly something to
do with /boot. You say that /dev/hda1 is your "boot-disk". Do you mean by this
that /dev/hda1 is mounted on /boot? If so and you have all the normal files in
/boot, I don't understand why you get the error message. If not, that is why you
get the error. But even then, I still don't understand how adding disk geometry
for /dev/hda has solved the problem.

Something is not right with your description. Are you really saying that
removing the disk geometry lines from your lilo.conf results in a "Sorry, don't
know how to handle device 0x0900" error when you run lilo? The device number is
particularly crucial. Could you try it again and check? Be careful that the only
change you make is to remove the geometry lines. And if you get an error, see
if it is _exactly_ as stated above. Because as it stands, your solution doesn't
make sense, as you yourself recognize.

Cheers,


Bruno Prior [EMAIL PROTECTED]

 -Original Message-
 From: [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED]]On Behalf Of Kelina
 Sent: 26 October 1999 18:04
 To: Marcos Lopez; Egon Eckert
 Cc: [EMAIL PROTECTED]
 Subject: Re: Root RAID and unmounting /boot


 At 09:07 26/10/99 -0600, Marcos Lopez wrote:
 Egon Eckert wrote:
  
syslogd   378   root5w   REG9,18548   192396
/var/log/boot.log
klogd 389   root2r   REG8,1  191102   12
/boot/System.map-2.2.12-20
   
Is it safe to kill these?
  
   These are loggers, so I guess nothing terrible would happen.  But I
   wouldn't kill them anyway..  I would fix the problem, not make it worse.
  
Also I would be quite grateful if someone could explain to me why I must
unmount /boot in order for the lilo -r /mnt/newroot to work?
  
   I don't unmount anything before running lilo.
 
 Okay, I tried performing that function without doing the umount. I tried
 doing it without copying the files in /boot over to /mnt/newroot/boot, but
 it couldn't find /boot/boot.b.  So I copied over all the files in /boot
 to /mnt/newroot/boot.  Then after running "lilo -r /mnt/newroot" I get
 the following error:
 "Sorry, don't know how to handle device 0x0900"
 
 What does the above mean and how do I fix it?

 Okay, bear with me here... I'm doing this from memory and some of my own
 configs (if you have the time to search the archives, you'll find something
 about this in August, I think)

 my lilo.conf:
 disk=/dev/hda
  bios = 0x80
  sectors = 63
  heads = 128
  cylinders = 779
  partition = /dev/hda1
  start = 63

 boot=/dev/hda
 map=/boot/map
 install=/boot/boot.b
 prompt
 timeout=50
 default=linux
 image=/boot/bzImage
  label=linux
  read-only
  root=/dev/md0
  append="ether=0,0,0,0,eth1"

 Note the disk paragraph. My setup is hda1 as boot-disk (32mb) and the rest
 of my 2 hd's as raid0. Why exactly this is needed, I don't know. But without
 it, I also got the 0x0900 from lilo.  The sectors, heads and cylinders lines
 are taken directly from fdisk -l /dev/hda. The start, as far as I remember,
 is the same as the sectors value, since it's at the beginning of the disk.
 (correct me if i'm
 

RE: Root RAID and unmounting /boot

1999-10-27 Thread Bruno Prior

Marcos,

 My boot partition is on sda1 and that is on that partition.

What do you mean by this? What is on that partition? Is /dev/sda1 mounted on
/boot with all the normal /boot files on it? If so, your setup sounds fine.

Assuming the setup is OK, you have 2 alternatives. Kill klogd, as you suggest.
But you presumably have this running for a reason. So if you don't want to do
this, the other alternative would be to leave /dev/sda1 on /boot for now. Edit
/mnt/newroot/etc/lilo.conf so that you have an image section which selects your
root-RAID array as root (e.g. root=/dev/md?). Delete the files you copied to
/mnt/newroot/boot (you want it to be an empty mount-point). Make sure
/mnt/newroot/etc/fstab shows / on your root-RAID device and /boot still on
/dev/sda1. Now run lilo -C /mnt/newroot/etc/lilo.conf. If you now reboot and
select the RAID image section at the boot prompt, everything should be working.
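Condensed as commands, the alternative above looks roughly like this (a sketch only; /dev/md0, /dev/sda1 and the paths are examples from the discussion, so adjust them to your setup):

```shell
# 1. Add an image section with root=/dev/md0 to the new lilo.conf:
vi /mnt/newroot/etc/lilo.conf
# 2. Empty the mount-point; /mnt/newroot/boot should hold no copied files:
rm -f /mnt/newroot/boot/*
# 3. Check fstab: / on /dev/md0, /boot still on /dev/sda1:
vi /mnt/newroot/etc/fstab
# 4. Install lilo using the new configuration file:
lilo -C /mnt/newroot/etc/lilo.conf
```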

Note to Jakob: The advice to move the boot device to /mnt/newroot/boot and
run lilo with the -r option is more confusing than it needs to be. Why not
simply make sure that lilo.conf includes a RAID section as above (where
root=/dev/md?)? You don't need the boot device to be on /mnt/newroot/boot. All
the files which lilo references are in the same physical place, whether it is
mounted on /boot or /mnt/newroot/boot. You could replace the section beginning
"Now, unmount the current /boot filesystem" and ending with "complete with no
errors" with a simple instruction to run lilo without parameters, as long as you
insert an item before the filesystem is copied across to the effect of:

* Edit /etc/lilo.conf to add an image section which points to the root-RAID
array as the root device (i.e. use "root=/dev/md?" where ? is the device number
of your root RAID array).

(We do this before the filesystem is copied so that /etc/lilo.conf matches
/mnt/newroot/etc/lilo.conf, so we don't even have to use -C with lilo to pass it
the new lilo.conf address.) I think this would be a good change to make as the
unnecessary complication of moving the boot device seems to be tripping a lot of
people up.

Cheers,


Bruno Prior [EMAIL PROTECTED]

 -Original Message-
 From: [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED]]On Behalf Of Marcos Lopez
 Sent: 25 October 1999 19:09
 To: [EMAIL PROTECTED]
 Subject: Root RAID and unmounting /boot


 I am attempting to get my system to run RAID 1 for /, /usr, /var, /home

 I have moved over all except for / and it works fine.

 After reading the howto "Root file system on RAID" by Jakob Østergaard,
 I have decided to take the first approach; unfortunately I cannot umount
 /boot, even after performing a umount -f /boot.  I also performed a
 lsof | grep /boot to see what was being used and it returned the
 following.
 syslogd   378   root5w   REG9,18548   192396
 /var/log/boot.log
 klogd 389   root2r   REG8,1  191102   12
 /boot/System.map-2.2.12-20

 Is it safe to kill these?

 Also I would be quite grateful if someone could explain to me why I must
 unmount /boot in order for the lilo -r /mnt/newroot to work?

 My boot partition is on sda1 and that is on that partition. I actually
 had to copy my /boot to my /mnt/newroot/boot else there would be no
 files there. An explanation here would really be grand.

 What happens when I run lilo -r /mnt/newroot? How will this affect the
 current /boot, as it is never touched as far as I can see? Hehe, I am
 kinda confused :-)

 thanks
 -marcos




RE: Root RAID and unmounting /boot

1999-10-27 Thread Bruno Prior

There seems to be quite a lot of confusion about root-RAID out there.

  to /mnt/newroot/boot.  Then after running "lilo -r /mnt/newroot" I get
  the following error:
  "Sorry, don't know how to handle device 0x0900"
 
  What does the above mean and how do I fix it?

 It does mean that lilo doesn't know anything about md device.  Change the
 line containing "root=" in lilo conf to "root=0x900".  This way lilo tells
 the kernel properly where from mount the root fs.

This shouldn't be necessary. The root= line in /etc/lilo.conf is simply a
pointer to a filesystem that is passed to the init scripts. It doesn't require
reading of that filesystem, so lilo shouldn't have a problem with it pointing to
a RAID array. I have root=/dev/md? in my machines with no problems.

If lilo is reporting "Sorry, don't know how to handle device 0x0900", that means
it is being asked to read from or write to /dev/md0. The lines in your lilo.conf
that might require this include the "boot=", "install=", "map=", "loader=" and
"image=" lines. If any of these lines points to something on /dev/md0, or if any
of them is missing and the default (mostly in /boot) is on /dev/md0, and lilo
cannot handle /dev/md0, then you will get this message. You can get round it by:

(a) Getting the latest patched lilo from RedHat or applying the lilo.raid1 patch
and rebuilding it yourself (if /dev/md0 is RAID-1)
(b) Providing lilo with the geometry of one of the devices in the array (again
if /dev/md0 is RAID-1)
(c) Using a slightly-adapted grub instead of lilo (again if /dev/md0 is RAID-1)
(d) Making sure the files to which these lines point are not on a software-RAID
array.

Check the mailing list archives for instructions on these methods.
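A quick way to see why 0x0900 means /dev/md0: Linux encodes an old-style 16-bit device number as (major << 8) | minor, and major 9 is the md (software RAID) driver. A shell one-liner makes the decoding explicit:

```shell
# Major 9 (md driver), minor 0 (first array) -> the 0x0900 in lilo's error.
printf '0x%02x%02x\n' 9 0
```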

 Yes.  Put everything needed to run lilo to a non-raided partition.

Almost. But not exactly. Put everything needed to *boot from* lilo on a
non-raided partition. lilo.conf doesn't have to be on a non-raided partition
itself. It isn't read during the boot process. It is read when installing lilo.
At this time, your system is fully operational, so lilo can easily read it from
a RAID array. The best advice would be to leave it in /etc, where it should be
by default. Nor do you need /dev or any of its devices on a non-RAIDed
partition. In fact, all you need on a non-RAIDed partition is the kernel image,
the system map, boot.b and associated files. In other words, the files that live
in /boot by default. So the simplest strategy is to stick with the default
filesystem structure, and have /boot on its own tiny partition. Then you can use
a bog standard /etc/lilo.conf, and run lilo straight without the need to pass it
parameters.

Cheers,


Bruno Prior [EMAIL PROTECTED]

 -Original Message-
 From: [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED]]On Behalf Of Egon Eckert
 Sent: 26 October 1999 18:18
 To: Marcos Lopez
 Cc: [EMAIL PROTECTED]
 Subject: Re: Root RAID and unmounting /boot


  to /mnt/newroot/boot.  Then after running "lilo -r /mnt/newroot" I get
  the following error:
  "Sorry, don't know how to handle device 0x0900"
 
  What does the above mean and how to i fix it?

 It does mean that lilo doesn't know anything about md device.  Change the
 line containing "root=" in lilo conf to "root=0x900".  This way lilo tells
 the kernel properly where from mount the root fs.

  it get hammered.  I think i am close just have to figure out this lilo
  thing.

 Yes.  Put everything needed to run lilo to a non-raided partition.

 For example, this is my 'ls -lR /mnt/hda7' :

 boot:
 total 23
 -rw-r--r--   1 root root  512 Jul 28 16:09 boot.0300
 -rw-r--r--   1 root root 4540 Feb  2  1999 boot.b
 -rw-r--r--   1 root root  612 Feb  2  1999 chain.b
 -rw---   1 root root14336 Jul 28 16:09 map
 -rw-r--r--   1 root root  620 Feb  2  1999 os2_d.b

 dev:
 total 0
 brw-rw   1 root disk   3,   0 Jul 21  1998 hda
 brw-rw   1 root disk   3,   7 Jul 21  1998 hda7

 lilo:
 total 933
 -rw-r--r--   1 root root  176 Jul 28 14:43 lilo.conf
 -rw-r--r--   1 root root   451591 Jul 28 15:49 zImage
 -rw-r--r--   1 root root   494994 Feb  4  1999 zImage-bak

 Then run lilo like this:

 archiv:/home/egon# lilo -r /mnt/hda7 -C /lilo/lilo.conf
 Added Linux-smp *
 Added Linux-bak
 archiv:/home/egon#

 Here's my lilo.conf:

 boot=/dev/hda
 root=0x900
 install=/boot/boot.b
 map=/boot/map
 vga=ext
 delay=30
 image=/lilo/zImage
   label=Linux-smp
 image=/lilo/zImage-bak
   label=Linux-bak
 read-only


 Simple.

 Egon





Re: raid not auto detecting

1999-10-27 Thread Luca Berra

On Wed, Oct 27, 1999 at 01:38:05PM +0200, Gordon Booth wrote:
Stock rh 6.0 already includes the raid patches; FYI, it uses kernel 2.2.5.

The correct questions are:
have you changed the partition type to 0xfd?
have you created an initrd with mkinitrd --with raid1 ...

 Which kernel version are you using??  Have you patched it with the latest
 RAID patches?
 
   I'm running a stock rh 6.0 install.  Where can i look to autodetect
 and
   start my
 raid arrays?
 
 

-- 
Luca Berra -- [EMAIL PROTECTED]
Communications Media  Services S.r.l.



Re: autorun

1999-10-27 Thread klkrause

I forgot to set the partition type to 0xfd.

Philipp Krause, SPTH-Projekt
[EMAIL PROTECTED]
http://www.spth.de



RE: raid not auto detecting

1999-10-27 Thread Bruno Prior

 After setting up raid, the box is unable to autodetect the raid 1 array.
 Doing a cat /proc/mdstat shows that the array is not running; however,
 when I type raidstart /dev/md0, then do a cat /proc/mdstat, everything is
 fine.
 I'm running a stock rh 6.0 install.  Where can I look to autodetect and
 start my raid arrays?

Firstly, I take it this explains your previous problem with fsck failing on
/dev/md0? fsck won't be able to do much if the array hasn't been initialized.

This sounds like the classic RedHat 6.* auto-recognition problem. If you check
dmesg ("dmesg | less"), is there a line that reads something like:

kmod: failed to exec /sbin/modprobe -s -k md-personality-4, errno = 2

If so, this means that neither your initrd nor your kernel include RAID support.
By default, RedHat 6.* comes with modular RAID support, which means that RAID
arrays cannot be started until the filesystem which contains the modules (in
/lib/modules) has been mounted. This occurs after the kernel tries to
auto-recognize the arrays, so auto-recognition fails for lack of RAID support at
that time. This explains why you can start it later with raidstart -
/lib/modules is now available.

To solve the problem, you can either rebuild the kernel to include RAID support
built-in (as opposed to as a module), or you can add raid support to your
initrd. To do this, run "mkinitrd --with=raid1 /boot/initrd 2.2.5-22",
substituting your desired initrd name for /boot/initrd and the correct kernel
version for 2.2.5-22. Make sure /etc/lilo.conf points to the initrd you just
created in the "initrd=" line(s). Run lilo. RAID support should now be available
during auto-recognition.
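Put together, the initrd route looks like this (a sketch; the initrd path and kernel version are examples, so substitute your own, and use --with=raid5 or similar if your root array is a different level):

```shell
# Build an initrd that carries the raid1 module so autodetection can run
# before /lib/modules is mounted (paths/version are examples):
mkinitrd --with=raid1 /boot/initrd-2.2.5-22.img 2.2.5-22

# Then make sure the relevant image section in /etc/lilo.conf contains:
#   initrd=/boot/initrd-2.2.5-22.img
# and reinstall the boot loader:
lilo
```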

The other thing to check is that you have marked the partitions which make up
this array as type "fd" using fdisk. This tells the kernel that the partitions
are part of an array. Without it, the partitions will be ignored during the
auto-recognition phase. Make sure that your RAID is umounted and raidstopped
before you run fdisk on the partitions.
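For reference, marking a partition as type fd in fdisk's interactive prompt goes roughly like this (a sketch; /dev/sdb and partition 1 are examples):

```shell
# fdisk is interactive; the keystrokes at its prompt are:
#   t    - change a partition's system id
#   1    - the partition number to change
#   fd   - hex code for "Linux raid autodetect"
#   w    - write the table and exit
fdisk /dev/sdb
```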

Cheers,


Bruno Prior [EMAIL PROTECTED]

 -Original Message-
 From: [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED]]On Behalf Of Jason Wong
 Sent: 26 October 1999 22:05
 To: [EMAIL PROTECTED]
 Subject: raid not auto detecting


 After setting up raid, the box is unable to autodetect the raid 1 array.
 Doing a cat /proc/mdstat shows that the array is not running; however,
 when I type raidstart /dev/md0, then do a cat /proc/mdstat, everything is
 fine.
 I'm running a stock rh 6.0 install.  Where can I look to autodetect and
 start my raid arrays?  TIA for any help.








Probable software RAID vs hardware RAID conflict -out of memory kernel crash

1999-10-27 Thread Enyue . Jia



Dear Oh Great Raid Masters!!

We're running RAID on two linux boxes.   Both use software RAID, and one also
has a hardware RAID (Mylex DAC960/DAC1100).  Our kernel is 2.2.11 patched with
raid0145-19990824-2.2.11.

The box that only runs software RAID has no problems at all, whereas the box
that also has hardware RAID crashes every day with "out of memory" and we have
to reboot it.  It's our department server, so it's making everyone unhappy. :(

It turns out the used memory slowly increases until all the processes are
swapped out and (we're guessing) the kernel itself runs out of memory.  Could it
be that the software RAID boot patch we applied doesn't work well with the
hardware RAID code, that it introduces a memory leak in the hardware RAID code
in the kernel?  Looking at the code this seems probable since the patch replaces
statically allocated arrays with dynamically allocated arrays (for some buffers)
in code that could well be common between the two implementations.  The software
RAID patch probably adds the correct memory management functionality to the
software RAID code, but maybe not the hardware RAID code.

I'm not a kernel hacker, so this is the best bug report I can provide.  Can you
help me;  do you have any pointers?

Best regards,
Jey



"WorldSecure Server ogilvy.com" made the following
 annotations on 10/27/99 10:09:11
--
Privileged/Confidential Information may be contained in this message.  If you are not 
the addressee indicated in this message (or responsible for delivery of the message to 
such person), you may not copy or deliver this message to anyone. In such case, you 
should destroy this message and kindly notify the sender by reply email. Please advise 
immediately if you or your employer does not consent to email for messages of this 
kind.  Opinions, conclusions and other information in this message that do not relate 
to the official business of the Ogilvy Group shall be understood as neither given nor 
endorsed by it.

==



RE: autorun

1999-10-27 Thread Fernandez, Richard

Hello,
I've been able to set up raid 1 for my /usr /var and /usr/local partitions
on RH6.0

I've been lurking in the list for a while and I keep seeing references to
setting the partition type to 0xfd for root-RAID.

I feel dumb asking the question 'cuz the answer must be staring me in the
face:

How do you set the partition type to 0xfd? 
I don't see any reference to "fd" when I list the Hex codes (fdisk v2.9n)

TIA

Rich Fernandez

 --
 From: [EMAIL PROTECTED][SMTP:[EMAIL PROTECTED]]
 Sent: Wednesday, October 27, 1999 8:30 AM
 To:   LINUX-RAID-Mailingliste
 Subject:  Re: autorun
 
 I forgot to set the partition type to 0xfd.
 
 Philipp Krause, SPTH-Projekt
 [EMAIL PROTECTED]
 http://www.spth.de
 



RE: Root RAID and unmounting /boot

1999-10-27 Thread Theo Van Dinter

On Wed, 27 Oct 1999, Bruno Prior wrote:

BP (a) Getting the latest patched lilo from RedHat or applying the lilo.raid1 patch
BP and rebuilding it yourself (if /dev/md0 is RAID-1)
BP (b) Providing lilo with the geometry of one of the devices in the array (again
BP if /dev/md0 is RAID-1)
BP (c) Using a slightly-adapted grub instead of lilo (again if /dev/md0 is RAID-1)
BP (d) Making sure the files to which these lines point are not on a software-RAID
BP array.

Just a note:  I set up root RAID1 over the weekend on my RH61 box. The
configuration file is really simple: you run lilo as normal, it writes the
boot block to all disks in the array, and you're done:

boot=/dev/md0
map=/boot/map
install=/boot/boot.b
timeout=50
default=linux

image=/boot/vmlinuz
label=linux
read-only
root=/dev/md0


-- 
Randomly Generated Tagline:
"How should I know if it works?  That's what beta testers are for.  I only
 coded it."
 (Attributed to Linus Torvalds, somewhere in a posting)



RE: Root RAID and unmounting /boot

1999-10-27 Thread Kelina


In the context of booting from RAID, the disk geometry is only needed if the
boot-loader is trying to read from or write to a file on a RAID-1 array that
contains that partition. But in that case, the disk= and partition= lines would
point to RAID arrays, not normal disks and partitions (see Harald
Nordgård-Hansen's message to the list of 22 August on the "Booting Root RAID 1
Directly _Is_ Possible" thread). You may need this section in your lilo.conf to
enable booting, but it has nothing to do with RAID, nor should it be a cure for
the "Sorry, don't know how to handle device 0x0900" error, particularly as I
believe your /dev/md0 is RAID-0, which cannot be read by lilo whatever
parameters you pass to it.

The lines in lilo.conf that could produce this error by pointing lilo at
/dev/md0 are the "boot=", "map=", "install=", "loader=" or "image=" lines. In
this case, it is clearly not the "boot=/dev/hda" line, as that is a
straightforward installation of the boot-loader to the MBR on the first IDE
disk. The "map=", "install=" and "image=" lines all point to files in /boot,
and the absent "loader=" line defaults to /boot/chain.b as well, so if this is
really the error message you are receiving, your problem is clearly something
to do with /boot. You say that /dev/hda1 is your "boot-disk". Do you mean by
this that /dev/hda1 is mounted on /boot? If so and you have all the normal
files in /boot, I don't understand why you get the error message. If not, that
is why you get the error. But even then, I still don't understand how adding
disk geometry for /dev/hda has solved the problem.

Something is not right with your description. Are you really saying that
removing the disk geometry lines from your lilo.conf results in a "Sorry,
don't know how to handle device 0x0900" error when you run lilo? The device
number is particularly crucial. Could you try it again and check? Be careful
that the only change you make is to remove the geometry lines. And if you get
an error, see if it is _exactly_ as stated above. Because as it stands, your
solution doesn't make sense, as you yourself recognize.


Hrm, it seems you are right... (figures :-) I just commented those lines out
and it worked. Must've been another mistake I made which I fixed when I added
these lines to lilo.conf, 'cuz I redid my whole conf file back then...

Sorry for the misinformation :)

Surge




RE: Root RAID and unmounting /boot

1999-10-27 Thread Admin Mailing Lists


I've been following this thread as I research using software RAID1 on a
production system.
I haven't read the new HOWTO yet (I plan to sometime today if I get time),
but I don't think I understand the concept of bootable root RAID1 so far.
What I've got so far is that you have a small bootable partition which is
not raided, and then every other partition on that drive raided to like
partitions on the mirror drive.
The thing I TOTALLY don't get is: if the first drive dies, how can you boot
the 2nd drive (mirror) when you're still losing the small bootable partition
(as it's still part of drive 1)? Of course you can put this bootable
partition on a separate drive, but you still don't have the redundancy,
because THAT drive can die.
Wouldn't it be easier to stick the kernel, lilo config, and relevant boot
info on a floppy and boot raid1 systems from that?
Perhaps I'm missing something... more likely than not :-)

-Tony 
.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-.
Anthony J. Biacco   Network Administrator/Engineer
[EMAIL PROTECTED]Intergrafix Internet Services

"Dream as if you'll live forever, live as if you'll die today"
http://cygnus.ncohafmuta.comhttp://www.intergrafix.net
.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-._.-.

On Wed, 27 Oct 1999, Bruno Prior wrote:

 There seems to be quite a lot of confusion about root-RAID out there.
 
   to /mnt/newroot/boot.  Then after running "lilo -r /mnt/newroot" I get
   the following error:
   "Sorry, don't know how to handle device 0x0900"
  
   What does the above mean and how do I fix it?
 
  It does mean that lilo doesn't know anything about md device.  Change the
  line containing "root=" in lilo conf to "root=0x900".  This way lilo tells
  the kernel properly where from mount the root fs.
 
 This shouldn't be necessary. The root= line in /etc/lilo.conf is simply a
 pointer to a filesystem that is passed to the init scripts. It doesn't require
 reading of that filesystem, so lilo shouldn't have a problem with it pointing to
 a RAID array. I have root=/dev/md? in my machines with no problems.
 
 If lilo is reporting "Sorry, don't know how to handle device 0x0900", that means
 it is being asked to read from or write to /dev/md0. The lines in your lilo.conf
 that might require this include the "boot=", "install=", "map=", "loader=" and
 "image=" lines. If any of these lines points to something on /dev/md0, or if any
 of them is missing and the default (mostly in /boot) is on /dev/md0, and lilo
 cannot handle /dev/md0, then you will get this message. You can get round it by:
 
 (a) Getting the latest patched lilo from RedHat or applying the lilo.raid1 patch
 and rebuilding it yourself (if /dev/md0 is RAID-1)
 (b) Providing lilo with the geometry of one of the devices in the array (again
 if /dev/md0 is RAID-1)
 (c) Using a slightly-adapted grub instead of lilo (again if /dev/md0 is RAID-1)
 (d) Making sure the files to which these lines point are not on a software-RAID
 array.
 
 Check the mailing list archives for instructions on these methods.
 
  Yes.  Put everything needed to run lilo to a non-raided partition.
 
 Almost. But not exactly. Put everything needed to *boot from* lilo on a
 non-raided partition. lilo.conf doesn't have to be on a non-raided partition
 itself. It isn't read during the boot process. It is read when installing lilo.
 At this time, your system is fully operational, so lilo can easily read it from
 a RAID array. The best advice would be to leave it in /etc, where it should be
 by default. Nor do you need /dev or any of its devices on a non-RAIDed
 partition. In fact, all you need on a non-RAIDed partition is the kernel image,
 the system map, boot.b and associated files. In other words, the files that live
 in /boot by default. So the simplest strategy is to stick with the default
 filesystem structure, and have /boot on its own tiny partition. Then you can use
 a bog standard /etc/lilo.conf, and run lilo straight without the need to pass it
 parameters.
 
 Cheers,
 
 
 Bruno Prior [EMAIL PROTECTED]
 
  -Original Message-
  From: [EMAIL PROTECTED]
  [mailto:[EMAIL PROTECTED]]On Behalf Of Egon Eckert
  Sent: 26 October 1999 18:18
  To: Marcos Lopez
  Cc: [EMAIL PROTECTED]
  Subject: Re: Root RAID and unmounting /boot
 
 
   to /mnt/newroot/boot.  Then after running "lilo -r /mnt/newroot" I get
   the following error:
   "Sorry, don't know how to handle device 0x0900"
  
   What does the above mean and how do I fix it?
 
  It does mean that lilo doesn't know anything about md device.  Change the
  line containing "root=" in lilo conf to "root=0x900".  This way lilo tells
  the kernel properly where from mount the root fs.
 
   it get hammered.  I think i am close just have to figure out this lilo
   thing.
 
  Yes.  Put everything needed to run lilo to a non-raided partition.
 
  For example, this is my 'ls -lR /mnt/hda7' :
 
  

RE: DPT Linux RAID.

1999-10-27 Thread G.W. Wettstein

On Oct 26, 12:27pm, [EMAIL PROTECTED] wrote:
} Subject: RE: DPT Linux RAID.

 On Tue, 26 Oct 1999, G.W. Wettstein wrote:

  CONFIG_SCSI_EATA_DMA
  
  The instability is especially profound in an SMP environment.  Under
  any kind of load there will be crashes and hangs.

 Red Hat defaults to using CONFIG_SCSI_EATA_DMA if you have a PM2144UW.  I
 have a client with 2 servers using PM2144UW's with CONFIG_SCSI_EATA_DMA
 that have been rock stable.  Neither is SMP.  One is a mail/web/dns
 server.  The other is backup mail/dns and squid.

That is an interesting datapoint.

The PM3334 in a dual-PII simply would not hold up under load on our
main IMAP server with the EATA_DMA driver.  Our news server (dual
PII-300) was also running with one of these and we saw problems.

The problems seemed to be compounded when we had SMC Etherpower-II (EPIC)
cards in the machines.  I had a lot of respect for SMC but these cards
have no place in a production environment from our experience.

The issue is probably pretty much moot for us at this point.  The DPT
cards, at least the 3334 we have, simply don't have the I/O
performance that we need as our load scales.  We have been pretty
happy with the DAC960 cards, although we had to turn off tagged queuing
in order to keep the drives on-line.

We are moving toward fibre-channel and outboard RAID controllers to
implement the SAN that we are deploying for our Linux server farms.
Given the excellent luck that we have had with the software RAID code
for Linux I probably see a diminishing future for hardware RAID
controller cards in most of our servers.  We are using software RAID1
to mirror root, var and swap and then deploying the service
filesystems on the RAID5 composite volumes provided by the
fibre-channel controllers.

  Jon Lewis *[EMAIL PROTECTED]*|  Spammers will be winnuked or 

Thanks again for the note, have a pleasant remainder of the week.

Greg

}-- End of excerpt from [EMAIL PROTECTED]

As always,
Dr. G.W. Wettstein   Enjellic Systems Development - Specializing
4206 N. 19th Ave.in information infra-structure solutions.
Fargo, ND  58102 WWW: http://www.enjellic.com
Phone: 701-281-1686  EMAIL: [EMAIL PROTECTED]
--
"MS can classify NT however they like.  Calling a pig a bird still
doesn't get you flying ham, however."
-- Steven N. Hirsch



Re: Probable software RAID vs hardware RAID conflict - out ofmemory kernel crash

1999-10-27 Thread David Holl

2.2.11 has a memory leak in its tcp code.  2.2.12 is friendly, and I'm
waiting to see how people take to 2.2.13 before test driving.  The 2.2.11
patch should apply to 2.2.12 and you can ignore the 2 rejected hunks in
fs.h

On Wed, 27 Oct 1999 [EMAIL PROTECTED] wrote:

-
-
-Dear Oh Great Raid Masters!!
-
-We're running RAID on two linux boxes.   Both use software RAID, and one also
-has a hardware RAID (Mylex DAC960/DAC1100).  Our kernel is 2.2.11 patched with
-raid0145-19990824-2.2.11.
-
-The box that only runs software RAID has no problems at all, whereas the box
-that also has hardware RAID crashes every day with "out of memory" and we have
-to reboot it.  It's our department server, so it's making everyone unhappy. :(
-
-It turns out the used memory slowly increases until all the processes are
-swapped out and (we're guessing) the kernel itself runs out of memory.  Could it
-be that the software RAID boot patch we applied doesn't work well with the
-hardware RAID code, that it introduces a memory leak in the hardware RAID code
-in the kernel?  Looking at the code this seems probable since the patch replaces
-statically allocated arrays with dynamically allocated arrays (for some buffers)
-in code that could well be common between the two implementations.  The software
-RAID patch probably adds the correct memory management functionality to the
-software RAID code, but maybe not the hardware RAID code.
-
-I'm not a kernel hacker, so this is the best bug report I can provide.  Can you
-help me;  do you have any pointers?
-
-Best regards,
-Jey
-



Re: autorun

1999-10-27 Thread Tim Walberg

On 10/27/1999 10:07 -0400, Fernandez, Richard wrote:
  Hello,
  I've been able to set up raid 1 for my /usr /var and /usr/local partitions
  on RH6.0
  
  I've been lurking in the list for a while and I keep seeing references to
  setting the partition type to 0xfd for root-RAID.
  
  I feel dumb asking the question 'cuz the answer must be staring me in the
  face:
  
  How do you set the partition type to 0xfd? 
  I don't see any reference to "fd" when I list the Hex codes (fdisk v2.9n)
  

To quote a popular shoe maker... "Just do it". Never mind that fd isn't
in the list; when it asks for the new partition type, type in '0xfd'.



tw


-- 
+--+--+
| Tim Walberg  | Phone: 847-782-2472  |
| TERAbridge Technologies Corp | FAX:   847-623-1717  |
| 1375 Tri-State Parkway   | [EMAIL PROTECTED]  |
| Gurnee, IL 60031 | 800-SKY-TEL2 PIN 9353299 |
+--+--+

 PGP signature


RE: mkraid aborts, no info?

1999-10-27 Thread Fernandez, Richard

Mandrake doesn't have RAID support built into the kernel AFAIK.
I was trying to do the same thing you're doing using Mandrake 6.0.

Below is an e-mail I received from Mandrake...

dear Richard Fernandez,

you should recompile the kernel with raid support or use the RedHat
compiled kernel which already has that.

sincerely,

-- 
Florin Grad
(Technical Support Team)
[EMAIL PROTECTED]


 --
 From: Bill Carlson[SMTP:[EMAIL PROTECTED]]
 Sent: Wednesday, October 27, 1999 4:17 PM
 To:   [EMAIL PROTECTED]
 Subject:  mkraid aborts, no info?
 
 Hello,
 
 I'm trying to set up your basic RAID0 Software raid and running into some
 problems.
 
 Details:
 
 Mandrake 6.1,(not RedHat!)
 Kernel 2.2.13 and 2.2.12
 2 SCSI drives, completely separate from root, 4 GB each
 
 My test raidtab (/etc/raidtab.test)
 
 raiddev /dev/md0
 raid-level  0
 nr-raid-disks   2
 persistent-superblock   1
 chunk-size  8
 
 device  /dev/sdb1
 raid-disk   0
 
 device  /dev/sdc1
 raid-disk   1
 
 
 
 
 When I run mkraid I get the following:
 
 [root@washu /root]# mkraid -c /etc/raidtab.test /dev/md0
 handling MD device /dev/md0
 analyzing super-block
 disk 0: /dev/sdb1, 4192933kB, raid superblock at 4192832kB
 disk 1: /dev/sdc1, 4192933kB, raid superblock at 4192832kB
 mkraid: aborted, see the syslog and /proc/mdstat for potential clues.
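 
 [As an aside, the superblock offsets mkraid prints above follow the 0.90
 on-disk layout: the member size is rounded down to a 64 KB boundary, then one
 reserved 64 KB block is stepped back. A sketch of that arithmetic, not
 mkraid's actual source:
 
```shell
# 0.90 superblock offset: round size down to 64 KB, back off one 64 KB block
size_kb=4192933
echo $(( size_kb / 64 * 64 - 64 ))   # prints 4192832, matching mkraid's output
```
 ]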
 
 Output from /proc/mdstat:
 
 [root@washu /root]# chkraid
 Personalities : [2 raid0]
 read_ahead not set
 md0 : inactive
 md1 : inactive
 md2 : inactive
 md3 : inactive
 [root@washu /root]#
 
 Nothing of note in syslogd (*.*  /dev/tty12)
 
 Any ideas on what I am doing wrong?
 
 I just joined the list, are there any searchable archives?
 Is a kernel patch still required for the 2.2.1x series?
 
 Thanks in advance,
 
 
 Bill Carlson
 
 Systems Programmer[EMAIL PROTECTED]  |  Opinions are mine,
 Virtual Hospital  http://www.vh.org/|  not my employer's.
 University of Iowa Hospitals and Clinics  |
 
 
 



RE: mkraid aborts, no info?

1999-10-27 Thread Bill Carlson

On Wed, 27 Oct 1999, Fernandez, Richard wrote:

 Mandrake doesn't have RAID support built into the kernel AFAIK.
 I was trying to do the same thing you're doing using Mandrake 6.0.
 
 Below is an e-mail I received from Mandrake...
 
 dear Richard Fernandez,
 
 you should recompile the kernel with raid support or use the RedHat
 compiled kernel which already has that.
 
 sincerely,
 
 -- 
 Florin Grad
 (Technical Support Team)
 [EMAIL PROTECTED]
 


I'm thinking Florin means the kernel is not compiled with support by
default.

From the info on raidtools:

This package includes the tools you need to set up and maintain a software
RAID
device under Linux. It only works with Linux 2.2 kernels and later, or 2.0
kernel specifically patched with newer raid support.

To me that implies a 2.2.x kernel does not need a patch. On Mandrake 6.1,
the required RAID modules were already in place after installation. 

Bill Carlson

Systems Programmer[EMAIL PROTECTED]|  Opinions are mine,
Virtual Hospital  http://www.vh.org/|  not my employer's.
University of Iowa Hospitals and Clinics|




RE: mkraid aborts, no info?

1999-10-27 Thread Fernandez, Richard



 --
 From: Bill Carlson[SMTP:[EMAIL PROTECTED]]
 Sent: Wednesday, October 27, 1999 4:30 PM
 To:   Fernandez, Richard
 Cc:   [EMAIL PROTECTED]
 Subject:  RE: mkraid aborts, no info?
 
 On Wed, 27 Oct 1999, Fernandez, Richard wrote:
 
  Mandrake doesn't have RAID support built into the kernel AFAIK.
  I was trying to do the same thing you're doing using Mandrake 6.0.
  
  Below is an e-mail I received from Mandrake...
  
  dear Richard Fernandez,
  
  you should recompile the kernel with raid support or use the RedHat
  compiled kernel which already has that.
  
  sincerely,
  
  -- 
  Florin Grad
  (Technical Support Team)
  [EMAIL PROTECTED]
  
 
 
 I'm thinking Florin means the kernel is not compiled with support by
 default.
 
That's how I read it too.
I re-compiled the kernel with support for RAID. Then I was able to
"mkraid /dev/md0", but recieved "mkraid: aborted..." same as you. My
/proc/mdstat
showed the same as yours.

Honestly, I never figured out how to fix it. We happened to have a
copy of RH6 on hand and rather than "waste" any more time I installed it.
Worked like a charm.





Re: mkraid aborts, no info?

1999-10-27 Thread Luca Berra

On Wed, Oct 27, 1999 at 03:30:31PM -0500, Bill Carlson wrote:
 I'm thinking Florin means the kernel is not compiled with support by
 default.

no.
 This package includes the tools you need to set up and maintain a software
 RAID
 device under Linux. It only works with Linux 2.2 kernels and later, or 2.0
 kernel specifically patched with newer raid support.
 
 To me that implies a 2.2.x kernel does not need a patch. On Mandrake 6.1,
 the required RAID modules were already in place after installation. 

this is FALSE
kernel 2.2.x does require a patch.
get either a redhat kernel or a linux-2.2.13ac1

mandrake was unfortunate in that they included the raidtools package
but did not patch the kernel to support raid.

{to see if you have a patched kernel look for the "autodetecting raid"
message during bootup}
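
{a hedged way to script that check -- both the boot message and /proc/mdstat
are artifacts of a raid-patched kernel, so on an unpatched kernel both tests
simply fail:

```shell
# Report whether the running kernel carries the 0.90 raid patch
if [ -r /proc/mdstat ] || dmesg 2>/dev/null | grep -qi 'autodetecting raid'; then
    echo "raid support present"
else
    echo "raid support absent"
fi
```
}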


-- 
Luca Berra -- [EMAIL PROTECTED]
Communications Media  Services S.r.l.



Re: mkraid aborts, no info?

1999-10-27 Thread David A. Cooley

Hi Bill,
You need to get the latest raid kernel patch (ignore the errors it gives... 
one hunk is included in the 2.2.12/2.2.13 kernel) and the latest raidtools 
(0.90).


At 03:17 PM 10/27/99 -0500, Bill Carlson wrote:
Hello,

I'm trying to set up your basic RAID0 Software raid and running into some
problems.

Details:

Mandrake 6.1,(not RedHat!)
Kernel 2.2.13 and 2.2.12
2 SCSI drives, completely separate from root, 4 GB each

My test raidtab (/etc/raidtab.test)

raiddev /dev/md0
 raid-level  0
 nr-raid-disks   2
 persistent-superblock   1
 chunk-size  8

 device  /dev/sdb1
 raid-disk   0

 device  /dev/sdc1
 raid-disk   1




When I run mkraid I get the following:

[root@washu /root]# mkraid -c /etc/raidtab.test /dev/md0
handling MD device /dev/md0
analyzing super-block
disk 0: /dev/sdb1, 4192933kB, raid superblock at 4192832kB
disk 1: /dev/sdc1, 4192933kB, raid superblock at 4192832kB
mkraid: aborted, see the syslog and /proc/mdstat for potential clues.

Output from /proc/mdstat:

[root@washu /root]# chkraid
Personalities : [2 raid0]
read_ahead not set
md0 : inactive
md1 : inactive
md2 : inactive
md3 : inactive
[root@washu /root]#

Nothing of note in syslogd (*.*  /dev/tty12)

Any ideas on what I am doing wrong?

I just joined the list, are there any searchable archives?
Is a kernel patch still required for the 2.2.1x series?

Thanks in advance,


Bill Carlson

Systems Programmer[EMAIL PROTECTED]|  Opinions are mine,
Virtual Hospital  http://www.vh.org/|  not my employer's.
University of Iowa Hospitals and Clinics|



===
David Cooley N5XMT   Internet: [EMAIL PROTECTED]
  Packet: N5XMT@KQ4LO.#INT.NC.USA.NA   T.A.P.R. Member #7068
Sponges grow in the ocean... Wonder how deep it would be if they didn't?!
===



Re: mkraid aborts, no info?

1999-10-27 Thread Bill Carlson

On Wed, 27 Oct 1999, David A. Cooley wrote:

 Hi Bill,
 You need to get the latest raid kernel patch (ignore the errors it gives... 
 one hunk is included in the 2.2.12/2.2.13 kernel) and the latest raidtools 
 (0.90).
 

Ah, I see now. I'll try applying the patch to the 2.2.13 now.

Thanks David, Luca.

Bill Carlson

Systems Programmer[EMAIL PROTECTED]|  Opinions are mine,
Virtual Hospital  http://www.vh.org/|  not my employer's.
University of Iowa Hospitals and Clinics|





Re: raid problem

1999-10-27 Thread Luca Berra

On Wed, Oct 27, 1999 at 11:09:46PM +0200, Mathieu ARNOLD wrote:
  I'm using a redhat 6.1 with a 2.3.21 kernel.

 well, i've gone back to the 2.2 series, i'm now using a 2.2.13 kernel, and
 well, it does exactly the same thing.
 does someone have a clue ?
yup.
neither 2.3.x nor 2.2.x supports the new raid code, but you can find a patch
for 2.2.x (while there is no patch for 2.3.x)

i'd get
ftp://ftp.fr.kernel.org/pub/linux/kernel/people/alan/2.2.13ac/patch-2.2.13ac1.gz

other option is:
ftp://ftp.fr.kernel.org/pub/linux/daemons/raid/alpha/raid0145-19990824-2.2.11.gz
and fix the rejects by hand

regards,
Luca

-- 
Luca Berra -- [EMAIL PROTECTED]
Communications Media  Services S.r.l.



raid0145 patch

1999-10-27 Thread Bill Carlson

Hey,

The patch did the trick, just like it was supposed to.

cd /usr/src/linux
patch -p1 < raid0145-19990824-2.2.11

There was the one error, which I ignored, as I was patching against
2.2.12. Does the same patch apply vs 2.2.13? I'm guessing that Mandrake's
sources are what caused the errors that led me to go with a fresh 2.2.12
source tree.

Recompile, reboot and the magic messages started. :)

2 minutes later I had me an 8 GB array.

Thanks a lot everyone!

Bill Carlson

Systems Programmer[EMAIL PROTECTED]|  Opinions are mine,
Virtual Hospital  http://www.vh.org/|  not my employer's.
University of Iowa Hospitals and Clinics|




Re: raid problem

1999-10-27 Thread Mathieu ARNOLD



Luca Berra wrote:
 
 On Wed, Oct 27, 1999 at 11:09:46PM +0200, Mathieu ARNOLD wrote:
   I'm using a redhat 6.1 with a 2.3.21 kernel.
 ...
  well, i've gone back to the 2.2 series, i'm now using a 2.2.13 kernel, and
  well, it does exactly the same thing.
  does someone have a clue ?
 yup.
 neither 2.3.x nor 2.2.x supports the new raid code, but you can find a patch
 for 2.2.x (while there is no patch for 2.3.x)
 
 i'd get
 ftp://ftp.fr.kernel.org/pub/linux/kernel/people/alan/2.2.13ac/patch-2.2.13ac1.gz
 
 other option is:
 ftp://ftp.fr.kernel.org/pub/linux/daemons/raid/alpha/raid0145-19990824-2.2.11.gz
 and fix the rejects by hand

yep, i've read about it just after posting my msg :/

i now have :
md0 : active raid5 hdh1[4] hdg1[3] hdf1[2] hde1[1] hdb1[0] 60066304 blocks
level 5, 32k chunk, algorithm 2 [5/5] [UUUUU] resync=15% finish=163.4min

and 
/dev/md0   56G  3.7G   53G   7% /opt1

i'm just happy now :)

-- 
Cordialement
Mathieu Arnold   PGP key id : 0x2D92519F
IRC : _mat  ICQ 1827742  http://www.mat.cc/



raid problems solved - Thanks to all who helped

1999-10-27 Thread Jason Wong

Well, after receiving numerous emails with suggestions and help, I've
finally got this raid thing setup
I just wanted to post and say thanks to all who helped.
allan
bruno
mike
surge
luca
gordon
and anyone else that helped that I might have forgotten.  This has been a
really good experience and I must say that this is one of the best lists
that I've ever subscribed to.  Anyhow, here's a rundown of what I finally
did to get a running system.  Hopefully this will help in case another
newbie decides to give this a whirl.

1.  Installed stock RH 6.0 system using kernel 2.2.5-15 (This kernel has the
necessary raid patch)
2.  Setup up following partitions using fdisk during install process

disk 0:

/dev/sda116MB/boot  83  linux
/dev/sda2128MB 82  linux swap
/dev/sda3400MB /   83  linux
/dev/sda45  extended
/dev/sda5500MB  fd  unknown
/dev/sda6500MB  fd  unknown
/dev/sda7175MB  fd  unknown
/dev/sda8175MB  fd  unknown

disk 1:

/dev/sdb116MB   83  linux
/dev/sdb2128MB 82  linux swap
/dev/sdb3400MB 83  linux
/dev/sdb45  extended
/dev/sdb5500MB  fd  unknown
/dev/sdb6500MB  fd  unknown
/dev/sdb7175MB  fd  unknown
/dev/sdb8175MB  fd  unknown

(to set the partition type to fd, type the command 't', select the partition
you want to change and type 'fd' when prompted for the Hex code)

3. Finish installation process.
4. Reboot system.
5. Login as root
6. Install raidtools-0.90-3.i386.rpm from the distro CD
7. Setup /etc/raidtab per instructions from
http://ostenfeld.dk/~jakob/Software-RAID.HOWTO/Software-RAID.HOWTO.txt for
raid 1
8. run 'mkraid /dev/md0'
9. run 'mke2fs -b 4096 /dev/md0'
10. run 'mkinitrd --with=raid1 /boot/initrd 2.2.5-15'
11. edit /etc/lilo.conf  -  change line 'initrd=/boot/initrd-2.2.5-15.img'
to 'initrd=/boot/initrd'
12. run 'lilo'
13. Reboot system
14. Login as root
15. run 'cat /proc/mdstat' and it should show an active raid setup
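
For reference, the raid-1 raidtab from step 7 might look like this (a sketch
only; the member partitions /dev/sda5 and /dev/sdb5 are an illustration chosen
from the tables in step 2):

```
raiddev /dev/md0
    raid-level              1
    nr-raid-disks           2
    nr-spare-disks          0
    persistent-superblock   1
    chunk-size              4

    device                  /dev/sda5
    raid-disk               0
    device                  /dev/sdb5
    raid-disk               1
```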

Thanks again for all the help.  Now to setup a bootable raid setup.  :)




simple question

1999-10-27 Thread Michael Cunningham


I am sure this is probably obvious but I wanna make sure:) 
I have a raid 1 setup using two ide disks. I am not booting from these
disks. The RAID seems to be working GREAT so far:) 

I assume I should just put a normal entry in the fstab right?

/dev/md0   /data  ext2defaults1 2

fsck won't screw the raid array up, right, if it's run against it?

Thanks.. Mike

--
Friends help you move. Real friends help you move bodies.



raid performance? good?

1999-10-27 Thread Michael Cunningham

Is this good software raid performance for a 
Dual celeron 500 mhz
128 mb ram
abit bp6 mb
using the htp66 controller in dma 33 mode
2 x 18 gig ibm dma 33 drives
Running kernel 2.2.13ac1, big ide patch (latest as of today), 2.2.14pre1
as well..

Its a raid 1 mirror with each disk alone on its own ide channel (one 3,
one 4). OS is on a different drive and controller.

               ---Sequential Output---- ---Sequential Input-- --Random--
               -Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --Seeks---
Machine     MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU  /sec %CPU
           400  7644 98.3 15585 23.1  7000 21.1  7861 92.5 17368 12.4 145.4  3.2

 hdparm -Tt /dev/md0

/dev/md0:
 Timing buffer-cache reads:   64 MB in  0.85 seconds =75.29 MB/sec
 Timing buffered disk reads:  32 MB in  1.87 seconds =17.11 MB/sec

I can understand the write performance but I would think the read
performance would be better given that it should be reading from both 
halves of the mirror? I am using a 4096 block size on these drives
if that helps. Basically I will be storing a lot of big files,
database..etc. Any tuning suggestions? Basically looking for the
MAXimum performance out of this thing. I am sure more ram will
help which is my next purchase.

Thanks, Mike

--
Friends help you move. Real friends help you move bodies.



fsck and raid 1

1999-10-27 Thread Michael Cunningham

I just rebooted my system for the first time automatically
mounting md0 through fstab, and I got an fsck failure on the
md0 device. I made sure I let it totally sync before rebooting
so I am not sure what's wrong. It was great until fsck ran against it.

It seems that the partition table and the superblock don't agree
on what the raid size is. Are there any known problems with 
fsck and raid v90 patch? fsck is from redhat 6.0. 
I am running kernel 2.2.13 

After I created the raid device I mounted it up right away and I got
these messages to the console right after the mount. 

md0: blocksize changed during read
nr_blocks changed to 128 (blocksize 1024, j 264192, max_blocks 17619712)
md0: blocksize changed during read
nr_blocks changed to 32 (blocksize 4096, j 66048, max_blocks 4404928)

the 4404928 is one of the numbers fsck complained about. It said the
partition is actually smaller but the superblock says it's larger
(4404928).

Any ideas?
Mike

--
Friends help you move. Real friends help you move bodies.



fsck and raid 1 more info

1999-10-27 Thread Michael Cunningham


Here are the actual messages that fsck spits out when the system 
boots.. well basically the same. I am just running it manually here.

[root@server1 /]# fsck /dev/md0
Parallelizing fsck version 1.14 (9-Jan-1999)
e2fsck 1.14, 9-Jan-1999 for EXT2 FS 0.5b, 95/08/09
The filesystem size (according to the superblock) is 4404952 blocks
The physical size of the device is 4404928 blocks
Either the superblock or the partition table is likely to be corrupt!
Abort<y>? 

And a copy of my /proc/mdstat

[root@server1 /]# more /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid5] 
read_ahead 1024 sectors
md0 : active raid1 hdg1[1] hde1[0] 17619712 blocks [2/2] [UU]
unused devices: none
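
[For what it's worth, the two figures can be cross-checked: /proc/mdstat counts
1 KB blocks, while fsck counts the 4096-byte blocks the filesystem was created
with (mke2fs -b 4096), so the device size fsck sees should be the mdstat figure
divided by four. A sketch of the arithmetic:

```shell
md_blocks_1k=17619712              # from /proc/mdstat above
echo $(( md_blocks_1k / 4 ))       # prints 4404928 -- fsck's "physical size"
```
]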

And a copy of the raid tab if this is useful..

[root@server1 /etc]# more raidtab
raiddev /dev/md0
raid-level1
nr-raid-disks 2
nr-spare-disks0
chunk-size4
persistent-superblock 1  
device/dev/hde1
raid-disk 0
device/dev/hdg1
raid-disk 1

Thanks for ANY info you can offer..

Mike

-
Previous email:

I just rebooted my system for the first time automatically
mounting md0 through fstab, and I got an fsck failure on the
md0 device. I made sure I let it totally sync before rebooting
so I am not sure what's wrong. It was great until fsck ran against it.

It seems that the partition table and the superblock don't agree
on what the raid size is. Are there any known problems with 
fsck and raid v90 patch? fsck is from redhat 6.0. 
I am running kernel 2.2.13 

After I created the raid device I mounted it up right away and I got
these messages to the console right after the mount. 

md0: blocksize changed during read
nr_blocks changed to 128 (blocksize 1024, j 264192, max_blocks 17619712)
md0: blocksize changed during read
nr_blocks changed to 32 (blocksize 4096, j 66048, max_blocks 4404928)

the 4404928 is one of the numbers fsck complained about. It said the
partition is actually smaller but the superblock says it's larger
(4404928).

Any ideas?
Mike

--
Friends help you move. Real friends help you move bodies.




RE: Please Help - SCSI RAID device very slow

1999-10-27 Thread David Cooley

At 05:39 PM 10/27/1999 +0100, Matthew Clark wrote:
Does anyone know how I stop the MegaRAID using the onboard INTEL SCSI
Controller? (i960) It should be using my SYMBIOS card...???

It will only use the card it is physically connected to.  Move the cable to 
the SYMBIOS card and it will be seen there.
===
David Cooley N5XMT Internet: [EMAIL PROTECTED]
Packet: N5XMT@KQ4LO.#INT.NC.USA.NA T.A.P.R. Member #7068
We are Borg... Prepare to be assimilated!
===



RE: Please Help - SCSI RAID device very slow

1999-10-27 Thread Matthew Clark

Apparently the i960 is used to control the Symbios somehow... still - I doubt
this explains the horrendous performance.. I am beginning to suspect a
hardware problem elsewhere..

 -Original Message-
 From: [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED]]On Behalf Of David Cooley
 Sent: 27 October 1999 21:08
 To: [EMAIL PROTECTED]; linux-scsi; linux-raid
 Subject: RE: Please Help - SCSI RAID device very slow


 At 05:39 PM 10/27/1999 +0100, Matthew Clark wrote:
 Does anyone know how I stop the MegaRAID using the onboard INTEL SCSI
 Controller? (i960) It should be using my SYMBIOS card...???

 It will only use the card it is physically connected to.  Move
 the cable to
 the SYMBIOS card and it will be seen there.
 ===
 David Cooley N5XMT Internet: [EMAIL PROTECTED]
 Packet: N5XMT@KQ4LO.#INT.NC.USA.NA T.A.P.R. Member #7068
 We are Borg... Prepare to be assimilated!
 ===


 -
 To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
 the body of a message to [EMAIL PROTECTED]




RE: Please Help - SCSI RAID device very slow

1999-10-27 Thread Scott Marlowe

On Wed, 27 Oct 1999, Matthew Clark wrote:

 I do want to use the hardware RAID.. I think I have been looking at this for
 too long.. I now think the INTEL i960 IS the MegaRAID controller.. I didn't
 realise it was registered as a PCI device... How come it connects to the
 Symbios controller?
 
 I am very confused now!!

Some days I think I'm perpetually confused, but hopefully this particular
time, I'm the cluefull one.

 Does anyone know how I stop the MegaRAID using the onboard INTEL SCSI
 Controller? (i960) It should be using my SYMBIOS card...???

The AMI MegaRAIDs I've worked with (the 428) had an Intel i960 RISC
processor integrated onto the megaraid card, along with several SymBIOS 8xx
series SCSI controller chips.  The i960 generates parity data and controls
the RAID arrays, while the Symbios chips interface to the HDs under control
of the i960.

The newer MegaRAID cards can now "hijack" your other Symbios based
controllers and make them part of the "collective" that is the megaraid.

How slow is too slow?  What numbers do you get from hdparm -tT?

 Looking at /proc/scsi/scsi, it appears that linux is talking
 directly to my
 MegaRAID controller..

 Shouldn't it be talking to it via the SymBIOS controller?

No, the MegaRAID is what it should be talking to.  The Symbios controllers
are now subsystems to the MegaRAID.




RE: Please Help - SCSI RAID device very slow

1999-10-27 Thread Matthew Clark

I do want to use the hardware RAID.. I think I have been looking at this for
too long.. I now think the INTEL i960 IS the MegaRAID controller.. I didn't
realise it was registered as a PCI device... How come it connects to the
Symbios controller?

I am very confused now!!



 -Original Message-
 From: casler, heather [mailto:[EMAIL PROTECTED]]
 Sent: 27 October 1999 18:54
 To: '[EMAIL PROTECTED]'
 Subject: RE: Please Help - SCSI RAID device very slow


 If you don't want to use the MegaRaid at all, you might want to
 go into the
 system's BIOS and turn the card off in the BIOS.  That was the only way I
 got the HP NetServer LH3 that I was working with to use the Symbios
 controller.
 Hope that helps.
 Heather
 -Original Message-
 From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]
 Sent: Wednesday, October 27, 1999 12:39 PM
 To: linux-scsi; linux-raid
 Subject: RE: Please Help - SCSI RAID device very slow


 Does anyone know how I stop the MegaRAID using the onboard INTEL SCSI
 Controller? (i960) It should be using my SYMBIOS card...???

  -Original Message-
  From: [EMAIL PROTECTED]
  [mailto:[EMAIL PROTECTED]]On Behalf Of Matthew Clark
  Sent: 27 October 1999 15:51
  To: linux-scsi; linux-raid
  Subject: Please Help - SCSI RAID device very slow
 
 
  Looking at /proc/scsi/scsi, it appears that linux is talking
  directly to my
  MegaRAID controller..
 
  Shouldn't it be talking to it via the SymBIOS controller?
 
  Host: scsi2 Channel: 01 Id: 00 Lun: 00
Vendor: MegaRAID Model: LD0 RAID5 26031R Rev: D
Type: Direct-Access  ANSI SCSI Revision: 02
 
  Any help would be appreciated.. the disk array is running unacceptably
  slowly when writing - could this be the cause?
 
  Regards,
 
  Matthew Clark.
  --
  NetDespatch Ltd - The Internet Call Centre.
  http://www.netdespatch.com
 
 
 


 -
 To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
 the body of a message to [EMAIL PROTECTED]




RE: Please Help - SCSI RAID device very slow

1999-10-27 Thread Matthew Clark

Thanks for replying...

 The AMI MegaRAIDs I've worked with (the 428) had an Intel i960 RISC
 processor integrated onto the megaraid card, along with several
 SymBIOS 8xx
 series SCSI controller chips.  The i960 generates parity data and controls
 the RAID arrays, while the Symbios chips interface to the HDs
 under control
 of the i960.

This is what someone else has mentioned .. so the i960 is actually
controlling the onboard Symbios controllers...


 The newer MegaRAID cards can now "hijack" your other Symbios based
 controllers and make them part of the "collective" that is the megaraid.


Yikes - That sounds unpleasant! Fortunately my other controller is an adaptec
U2W (aic7xxx)


 How slow is too slow?  What numbers do you get from hdparm -tT?

Well .. hdparm -tT isn't "so" bad - here's the output

#hdparm -tT /dev/sda
/dev/sda:
 Timing buffer-cache reads:   64 MB in  0.61 seconds =104.92 MB/sec
 Timing buffered disk reads:  32 MB in  1.50 seconds =21.33 MB/sec

(are 32 & 64 meg enough for these tests??)

it's writing that's the BIG problem - check out this output from bonnie..

               ---Sequential Output---- ---Sequential Input-- --Random--
               -Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --Seeks---
Machine     MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU  /sec %CPU
        1* 300   446  7.0   447  1.7  2440  4.9  4926 61.4 26823 24.6 1047.2 14.1

:-(

 No, the MegaRAID is what it should be talking to.  The Symbios controllers
 are now subsystems to the MegaRAID.

Right - well I'm glad you said that.. that was confusing the hell out of me!
Both you and Gerard Roudier have confirmed that now..

I never heard of the i960 controlling controllers like that.. mind you I
never claimed to be a hardware guru :-)

Regards,

Matthew Clark.



Re: Please Help - SCSI RAID device very slow

1999-10-27 Thread Alan Cox

 Does anyone know how I stop the MegaRAID using the onboard INTEL SCSI
 Controller? (i960) It should be using my SYMBIOS card...???

An i960 isn't a scsi controller. If you want to use the symbios directly load
the ncr53c8xx driver and not the symbios driver. I'm not totally sure if that
will work but it is at least the right driver. That will basically cut out
use of the megaraid totally




Re: Please Help - SCSI RAID device very slow

1999-10-27 Thread Alan Cox

 I do want to use the hardware RAID.. I think I have been looking at this for
 too long.. I now think the INTEL i960 IS the MegaRAID controller.. I didn't
 realise it was registered as a PCI device... How come it connects to the
 Symbios controller?

Linux
   |
[MegaRAID]  ( Does raid stuff, drives scsi card)
   |
[Symbios]   ( SCSI card, does scsi not raid)
   |
 DISK



Please Help - SCSI RAID device very slow

1999-10-27 Thread Matthew Clark

Looking at /proc/scsi/scsi, it appears that linux is talking directly to my
MegaRAID controller..

Shouldn't it be talking to it via the SymBIOS controller?

Host: scsi2 Channel: 01 Id: 00 Lun: 00
  Vendor: MegaRAID Model: LD0 RAID5 26031R Rev: D
  Type: Direct-Access  ANSI SCSI Revision: 02

Any help would be appreciated.. the disk array is running unacceptably
slowly when writing - could this be the cause?

Regards,

Matthew Clark.
--
NetDespatch Ltd - The Internet Call Centre.
http://www.netdespatch.com




RE: Please Help - SCSI RAID device very slow

1999-10-27 Thread Matthew Clark

Does anyone know how I stop the MegaRAID using the onboard INTEL SCSI
Controller? (i960) It should be using my SYMBIOS card...???

 -Original Message-
 From: [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED]]On Behalf Of Matthew Clark
 Sent: 27 October 1999 15:51
 To: linux-scsi; linux-raid
 Subject: Please Help - SCSI RAID device very slow


 Looking at /proc/scsi/scsi, it appears that linux is talking
 directly to my
 MegaRAID controller..

 Shouldn't it be talking to it via the SymBIOS controller?

 Host: scsi2 Channel: 01 Id: 00 Lun: 00
   Vendor: MegaRAID Model: LD0 RAID5 26031R Rev: D
   Type: Direct-Access  ANSI SCSI Revision: 02

 Any help would be appreciated.. the disk array is running unacceptably
 slowly when writing - could this be the cause?

 Regards,

 Matthew Clark.
 --
 NetDespatch Ltd - The Internet Call Centre.
 http://www.netdespatch.com