On Sat, 2007-10-27 at 11:20 -0400, Bill Davidsen wrote:
> > * When using lilo to boot from a raid device, it automatically installs
> > itself to the mbr, not to the partition.  This can not be changed.  Only
> > 0.90 and 1.0 superblock types are supported because lilo doesn't
> > understand the offset to the beginning of the fs otherwise.
> >   
> 
> I'm reasonably sure that's wrong. I used to set up dual boot machines by 
> putting LILO in the partition and making that the boot partition; by 
> changing the active partition flag I could just have the machine boot 
> Windows, to keep people from getting confused.

Yeah, someone else pointed this out too.  The original patch to lilo
*did* do as I suggest, so they must have improved on the patch later.
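
For reference, a minimal lilo.conf sketch for booting from a RAID1 array
(the device names and kernel path are assumptions; raid-extra-boot is
the knob that controls where the extra boot records are written):

```
# Sketch only -- /dev/md0 and /vmlinuz are assumptions for illustration.
boot=/dev/md0            # install the boot record via the md device
raid-extra-boot=mbr      # also write boot records to each member's MBR
image=/vmlinuz
    label=linux
    root=/dev/md0
    read-only
```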

> > * When using grub to boot from a raid device, only 0.90 and 1.0
> > superblocks are supported[1] (because grub is ignorant of the raid and
> > it requires the fs to start at the start of the partition).  You can use
> > either MBR or partition based installs of grub.  However, partition
> > based installs require that all bootable partitions be in exactly the
> > same logical block address across all devices.  This limitation can be
> > extremely hazardous in the event a drive dies and you have
> > to replace it with a new drive as newer drives may not share the older
> > drive's geometry and will require starting your boot partition in an odd
> > location to make the logical block addresses match.
> >
> > * When using grub2, there is supposedly already support for raid/lvm
> > devices.  However, I do not know if this includes version 1.0, 1.1, or
> > 1.2 superblocks.  I intend to find that out today.  If you tell grub2 to
> > install to an md device, it searches out all constituent devices and
> > installs to the MBR on each device[2].  This can't be changed (at least
> > right now, probably not ever though).
> >   
> 
> That sounds like a good reason to avoid grub2, frankly. Software which 
> decides that it knows what to do better than the user isn't my 
> preference. If I wanted software which forces me to do things "their way" 
> I'd be running Windows.

It's not really that unreasonable a restriction.  Most people
aren't aware that when you put a boot sector at the beginning of a
partition, you only have 512 bytes of space, so the boot loader that you
put there is basically nothing more than code to read the remainder of
the boot loader from the file system space.  Now, traditionally, most
boot loaders have had to hard code the block addresses of certain key
components into these second stage boot loaders.  If a user isn't aware
of the fact that the boot loader does this at install time (or at kernel
selection update time in the case of lilo), then they aren't aware that
the files must reside at exactly the same logical block address on all
devices.  Without that knowledge, they can easily create an unbootable
setup by having the various boot partitions in slightly different
locations on the disks.  And intelligent partition editors like parted
can compound the problem because as they insulate the user from having
to pick which partition number is used for what partition, etc., they
can end up placing the various boot partitions in different areas of
different drives.  The requirement above is a means of making sure that
users aren't surprised by a non-working setup: the principle of least
surprise.  Of course, if they keep that requirement, then I
would expect it to be well documented so that people know this going
into putting the boot loader in place, but I would argue that this is at
least better than finding out when a drive dies that your system isn't
bootable.
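
As a sketch of the check implied above (the function name is my own
invention; the sysfs paths are standard but the device names are
assumptions), you can verify that the boot partition begins at the same
LBA on every member disk before trusting a partition-based grub install:

```shell
# same_start: succeed only if every argument (a start sector) is identical.
same_start() {
    first=$1
    for s in "$@"; do
        [ "$s" = "$first" ] || return 1
    done
    return 0
}

# Usage sketch: feed it each member's start sector from sysfs, e.g.
#   same_start $(cat /sys/block/sda/sda1/start /sys/block/sdb/sdb1/start)
# and only rely on mirrored boot sectors if it succeeds.
```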

> > So, given the above situations, really, superblock format 1.2 is likely
> > to never be needed.  None of the shipping boot loaders work with 1.2
> > regardless, and the boot loader under development won't install to the
> > partition in the event of an md device and therefore doesn't need that
> > 4k buffer that 1.2 provides.
> >   
> 
> Sounds right, although it may have other uses for clever people.
> > [1] Grub won't work with either 1.1 or 1.2 superblocks at the moment.  A
> > person could probably hack it to work, but since grub development has
> > stopped in preference to the still under development grub2, they won't
> > take the patches upstream unless they are bug fixes, not new features.
> >   
> 
> If the patches were available, "doesn't work with existing raid formats" 
> would probably qualify as a bug.

Possibly.  I'm a bit overbooked on other work at the moment, but I may
try to squeeze in some work on grub/grub2 to support version 1.1 or 1.2
superblocks.
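
A tiny sketch of the compatibility rule being discussed (the helper name
is mine, not any real mdadm interface): given the metadata version that
`mdadm --examine` reports, decide whether today's boot loaders can read
the array:

```shell
# bootable_metadata: 0.90 and 1.0 put the superblock where it doesn't
# shift the start of the filesystem, so lilo/grub can boot them; per the
# discussion above, 1.1 and 1.2 cannot (at least for now).
bootable_metadata() {
    case "$1" in
        0.90|1.0) return 0 ;;
        *)        return 1 ;;
    esac
}

# e.g.:  bootable_metadata 1.0 && echo "ok to boot from this array"
```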

> > [2] There are two ways to install to a master boot record.  The first is
> > to use the first 512 bytes *only* and hardcode the location of the
> > remainder of the boot loader into those 512 bytes.  The second way is to
> > use the free space between the MBR and the start of the first partition
> > to embed the remainder of the boot loader.  When you point grub2 at an
> > md device, they automatically only use the second method of boot loader
> > installation.  This gives them the freedom to be able to modify the
> > second stage boot loader on a boot disk by boot disk basis.  The
> > downside to this is that they need lots of room after the MBR and before
> > the first partition in order to put their core.img file in place.  I
> > *think*, and I'll know for sure later today, that the core.img file is
> > generated during grub install from the list of optional modules you
> > specify during setup.  Eg., the pc module gives partition table support,
> > the lvm module lvm support, etc.  You list the modules you need, and
> > grub then builds a core.img out of all those modules.  The normal amount
> > of space between the MBR and the first partition is (sectors_per_track -
> > 1).  For a standard geometry of 63 sectors per track, that leaves 62
> > sectors, or 31k of space.  This might not be enough for your particular needs if
> > you have a complex boot environment.  In that case, you would need to
> > bump at least the starting track of your first partition to make room
> > for your boot loader.  Unfortunately, how is a person to know how much
> > room their setup needs until after they've installed and it's too late
> > to bump the partition table start?  They can't.  So, that's another
> > thing I think I will check out today, what the maximum size of grub2
> > might be with all modules included, and what a common size might be.
> >
> >   
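The embedding-space arithmetic above can be sketched as follows (gap_kib
is my own helper name; it assumes the conventional 512-byte sectors):

```shell
# gap_kib: space available between the MBR (sector 0) and a first
# partition starting at the given LBA, in KiB.  Assumes 512-byte sectors.
gap_kib() {
    echo $(( ($1 - 1) * 512 / 1024 ))
}

# With the classic geometry (first partition at LBA 63) this leaves only
# 31 KiB for the embedded core.img; a partition starting at LBA 2048
# leaves roughly 1 MiB.
```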
> Based on your description, it sounds as if grub2 may not have given 
> adequate thought to what users other than the authors might need (that 
> may be a premature conclusion). I have multiple installs on several of 
> my machines, and I assume that the grub2 for 32 and 64 bit will be 
> different. Thanks for the research.

No, not really.  The grub command on the two is different, but they
both build the boot sector out of 16 bit non-protected mode code, just
like DOS, so either one would build the same boot sector given the same
config.

You can also use the same trick I've used in the past: create a large
/boot partition (say 250MB) and use that same partition as /boot in all
of your installs.  They then share a single grub config (while the grub
binaries live in the individual / partitions), from the single grub
instance you can boot any of the installs, and a kernel update in any
install updates that shared grub config.  The other option is to use
separate /boot partitions and chain load the grub instances, but I find
that clunky in comparison.

Of course, in my case I also made /lib/modules its own partition and
shared it between all the installs so that I could manually edit the
various kernel boot params to specify different root partitions; that
way I could boot a RHEL5 kernel on a RHEL4 install and vice versa.  But
if you do that, you have to manually patch /etc/rc.d/rc.sysinit to
mount the /lib/modules partition before anything tries to use modules
(and you have to mount it rw so it can do a depmod if needed), then
remount it ro for the fsck, then it gets remounted rw again after the fs
check.  It was a pain in the ass to maintain because every update to
initscripts would wipe out the patch, and if you forgot to repatch the
file, the system wouldn't boot and you'd have to boot into another
install, mount the / partition of the broken install, and patch the file
before that install would boot again.
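
For that shared-partition setup, each install's /etc/fstab would carry
entries along these lines (the device names and filesystem type are
assumptions; substitute your own):

```
# Sketch only -- same physical partitions mounted by every install.
/dev/sda5   /boot          ext3    defaults        1 2
/dev/sda6   /lib/modules   ext3    defaults        1 2
```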


-- 
Doug Ledford <[EMAIL PROTECTED]>
              GPG KeyID: CFBFF194
              http://people.redhat.com/dledford

Infiniband specific RPMs available at
              http://people.redhat.com/dledford/Infiniband
