Re: [SLUG] Two grub/RAID questions?

Grant Parnell Thu, 10 Aug 2006 17:31:08 -0700

On Thu, August 10, 2006 11:56 pm, Luke Kendall wrote:
> On 10 Aug, Grant Parnell wrote:
>>  I'm not going to attempt to answer all these questions as I'm out
>>  of time...
>
> Thanks for taking the time at all!  You're the only reply I've had so
> far.
>
>>  Software Raid devices do not have partitions on them.
>
> Yes, it eventually occurred to me to check on my old system, and fdisk
> made the same sort of complaints.  So the real problem I have to solve
> there is why the boot sequence is deciding that the drive needs checking
> (and why it's using fdisk to look at a raid volume).
>
>> You would typically
>>  mke2fs -j /dev/md0 to format it, alternately put LVM on it but that's
>>  another topic.
>
> Actually, I've already created the mirror from a pre-existing ext2
> partition by adding the missing drive to the raid set and letting it
> rebuild.  /dev/md0 mounts fine manually and seems fine normally (except
> for the boot complaints).
>
> E.g. dropping out of maintenance mode and continuing to boot up from the
> old /dev/hda6 goes on to mount /home on /dev/md2 and everything works
> fine.
>
>>  Software RAID mirroring has a nice side effect: you can mount half of
>>  the mirror in its own right in case of emergency. Similarly, the boot
>>  files, namely the kernel and initrd, will appear in consistent
>>  locations within each component device.
>>
>>  Thus if /dev/hda1 and /dev/hdc1 are components of /dev/md0 you can do
>>  this:-
>>  $ touch /boot/hereplease
>>  $ grub
>>  grub> find /boot/hereplease
>>  (hd0,0)
>>  (hd1,0)
>>  {your actual results may vary}
>>  grub> root (hd0,0)
>>  grub> setup (hd0)
>>  grub> root (hd1,0)
>>  grub> setup (hd1)
>>  grub> quit
>>  $
>>
>>  I typically do a find to make sure GRUB's idea of the partitions is in
>>  line with what I am thinking. The setup command does all the hard work
>>  for you of finding the stage1 and stage2 files and writing the boot
>>  sectors. Another thing to help is to write yourself a device.map file
>>  somewhere in /tmp or something just for doing the recovery work.
>
> As a grub novice, I have to admit I don't know what you mean by a
> device.map file.  It sounds like a helper file you might use with grub
> interactively when recovering from some major boot problem.
>
> But leaving that aside, in a nutshell, are you saying that grub can't be
> And that if a drive fails, I have to run grub interactively and point it
> at the drive within the mirror?!


In my experience, installing grub on each component device is the most
reliable approach. Have a look at an existing /boot/grub/device.map file.
It's a very simple file that tells grub how its (hdX) names map onto
/dev/hdX devices, so theoretically you could use a file like this:-

(fd0)  /dev/fd0
(hd0)  /dev/md0
(hd1)  /dev/hda6

I can't remember why I don't do it that way; possibly grub does not
understand software RAID, as you suppose. I don't think lilo does either.

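For what it's worth, here is a rough sketch of the per-component install driven from a throwaway device.map. It only prints the map and the grub commands rather than running anything. The drive names /dev/hda and /dev/hdc, the assumption that /boot is the first partition on both, and the function names are all mine for illustration. It also assumes the GRUB legacy shell, which accepts --batch and --device-map.

```shell
#!/bin/sh
# Sketch only: prints a temporary device.map and the GRUB (legacy) commands
# that would write a boot sector to each half of the mirror.
# Drive names and the partition number are assumptions; substitute your own.

make_device_map() {
    # $1 = first component drive, $2 = second component drive
    printf '(hd0)  %s\n(hd1)  %s\n' "$1" "$2"
}

grub_setup_commands() {
    # One root/setup pair per mirror half; partition 0 is assumed to
    # hold /boot on both drives.
    for n in 0 1; do
        printf 'root (hd%s,0)\nsetup (hd%s)\n' "$n" "$n"
    done
    printf 'quit\n'
}

make_device_map /dev/hda /dev/hdc
grub_setup_commands

# To apply for real (as root, after checking the output above):
#   make_device_map /dev/hda /dev/hdc > /tmp/device.map
#   grub_setup_commands | grub --batch --device-map=/tmp/device.map
```

Printing first and piping to grub only once the output looks right is deliberate, since a wrong (hdX) mapping writes a boot sector to the wrong drive.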
> In contrast, lilo uses the raid device (via the kernel), so if a drive
> fails everything continues to work seamlessly.  (I know, it happened to
> me several times on my old system.)  In fact you have to do extra work
> to *discover* that there's a problem that needs attention.

I think you're confused. If grub or lilo's purpose is to load the kernel,
then how would it be able to use the kernel raid driver to load the
kernel?

Oh! I know what you mean: when you run "lilo" at the command prompt, the
INSTALLER part knows about the raid device and knows to put the master
boot block on both root devices. The boot-time executable, though, does
NOT know about software raid; it just reads one component partition as if
it were an ordinary filesystem. Therefore you can't use anything other
than RAID-1 for /boot when using GRUB or LILO.
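A quick way to check that the array holding /boot really is RAID-1 is to read its level out of /proc/mdstat. The sketch below assumes the usual "mdN : active raidX ..." line layout. The function name and the sample device names are made up for illustration.

```shell
#!/bin/sh
# Sketch only: report the RAID level of one md array from /proc/mdstat-style
# text on stdin. Assumes the common "mdN : active raidX ..." layout; lines
# with extra status words before the level would need more careful parsing.

md_level() {
    # $1 = array name without /dev/, e.g. md0
    awk -v md="$1" '$1 == md && $2 == ":" { print $4 }'
}

# Example input in the usual /proc/mdstat layout (made-up device names).
sample='md0 : active raid1 hdc1[1] hda1[0]
md2 : active raid5 hdc3[1] hda3[0] hde3[2]'

level=$(printf '%s\n' "$sample" | md_level md0)
if [ "$level" = "raid1" ]; then
    echo "md0 is $level: usable for /boot"
else
    echo "md0 is ${level:-unknown}: not usable for /boot with GRUB or LILO"
fi

# On a live system:
#   md_level md0 < /proc/mdstat
```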

As a footnote, if I'm wrong about lilo installing the MBR on both root
devices, the effect (for LILO only) would be that the secondary component
drive would not have an up-to-date record of where the kernel and initrd
blocks are on the component partition. GRUB, on the other hand, knows a
little more about finding blocks on partitions/filesystems and will cope
better. That's why you don't need to run 'grub' every time you change
grub.conf/menu.lst, whereas you have to run 'lilo' every time you change
/etc/lilo.conf.

The above is why, when using lilo, I carefully partition the drives so
that they have the same geometry, the same number of cylinders for the
boot partition, and the same partition number within the device. This way
you can literally swap the IDE cables around if your component drives
were /dev/hda and /dev/hdc.

Still, you can get caught if you have, say, 3 drives with /boot on a RAID-1
mirror plus a hot spare drive. The hot spare never gets its lilo/grub
updated! You have to soft fail one of the component drives to force use of
the hot spare, and so force an update, OR the dodgy way is to mount the
spare device directly, rsync it, then tell grub or lilo to rewrite the boot
sectors on that device. It's a hot spare so the raid system won't care.
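The soft-fail route might look something like the sketch below, which only prints the steps. It assumes mdadm is the tool managing the array (not something stated above), and /dev/md0 and /dev/hdc1 are placeholder names, as is the function name.

```shell
#!/bin/sh
# Sketch only: prints (does not run) the mdadm steps for rotating a hot
# spare into a RAID-1 /boot mirror so its boot sector can be rewritten.
# The array and member names passed in are assumptions.

rotate_spare_commands() {
    md=$1       # array, e.g. /dev/md0
    member=$2   # active member to soft fail, e.g. /dev/hdc1
    echo "mdadm $md --fail $member"     # the spare starts rebuilding
    echo "cat /proc/mdstat"             # repeat until the resync finishes
    echo "# rerun grub root/setup (or lilo) for the former spare's drive"
    echo "mdadm $md --remove $member"   # take the failed member out
    echo "mdadm $md --add $member"      # it comes back as the new spare
}

rotate_spare_commands /dev/md0 /dev/hdc1
```

The array has no spare while the former spare is resyncing, so this is best done when you can keep an eye on it.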

Hopefully that explains a lot of the black art of booting software raid.

-- 
---<GRiP>---
Electronic Hobbyist, Former Arcadia BBS nut, Occasional nudist, Linux
Guru, SLUG President, AUUG and Linux Australia member, Sydney
Flashmobber, Tenpin Bowler, BMX rider, Walker, Raver & rave music lover,
Big kid that refuses to grow up. I'd make a good family pet, take me home
today!

Some people actually read these things it seems.


-- 
SLUG - Sydney Linux User's Group Mailing List - http://slug.org.au/
Subscription info and FAQs: http://slug.org.au/faq/mailinglists.html
