Three questions:

1) To be able to choose raid at install time, do you have to install
Ubuntu from the Alternate CD?

2) Has anyone here got a Grub system booting from a raid1 / (slash) root
filesystem?  (Google for grub and raid and you find lots of problems and
a few people saying it can be done.)

3) Can you tell Ubuntu to use Lilo instead of Grub as the boot loader?

Gory details below, mainly for posterity in case some other poor sap
makes the same mistakes I did.

Last minute thought - I should be able to destroy the /dev/md0 on "/"
and recreate it properly, and then copy the files back on.  Yippee!
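
A rough sketch of what I have in mind, as a dry run that only prints the
commands.  It would have to be run from the other install on /dev/hda6,
with /dev/md0 not in use.  plan_md_rebuild is a made-up helper name and
/home/slash-backup is a hypothetical place for the copied-off files; the
device names are the ones from my config below.  Untested, so treat it
as an outline rather than a recipe:

```shell
# Dry-run sketch: prints the commands instead of running them.
# plan_md_rebuild is a hypothetical helper, not part of mdadm.
plan_md_rebuild() {
    md=$1; backup=$2; shift 2
    echo "mdadm --stop $md"
    # Recreate the mirror from the remaining arguments (the components).
    echo "mdadm --create $md --level=1 --raid-devices=$# $*"
    # Make the filesystem on the md device, NOT on the components.
    echo "mke2fs -j $md"
    echo "mount $md /mnt/tmp"
    echo "cp -a $backup/. /mnt/tmp/"
}

plan_md_rebuild /dev/md0 /home/slash-backup /dev/hda7 /dev/sda7
```

The point of the ordering is that mke2fs runs on /dev/md0 after the
array exists, so the filesystem is sized to fit inside the array.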

I do think though that for grub and raid (unlike lilo and raid), the
best you can do is double up the stanza for each kernel you want to
boot and manually choose the working drive when a drive in the raid
mirror fails?

--- gory details -----

I've been trying to retrofit raid mirroring on a system after installing
Ubuntu 6.  This seemed to work out okay until I got to the point where
the fsck failed during boot because /dev/md0's superblock claimed a
different size to the partition table entry:

    fsck 1.38 (30-Jun-2005)
    /dev/md0 is mounted.  fsck 1.38 (30-Jun-2005)
    fsck 1.38 (30-Jun-2005)
    e2fsck 1.38 (30-Jun-2005)
    The filesystem size (according to the superblock) is 3146724 blocks
    The physical size of the device is 3146704 blocks
    Either the superblock or the partition table is likely to be corrupt!
    Abort? yes
    

(A Google search led me to
https://launchpad.net/distros/ubuntu/+source/debian-installer/+bug/13076
which included the comment below.  It interested me because when I
created my raid I also found that the devices all had the same UUID,
which I had to correct:

    I had the exact same problem as described in comment #2. It actually
    appears to be related to the mdadm 1.81 UUID bug, but I can't seem to
    find the reference to that bug right now.
    
    Only one of my RAID-1 arrays (/) started up, because mdadm 1.81 created
    all of my arrays with the same UUID.
)


Running fsck on /dev/md0 manually was unable to fix the problem.
Running fsck manually on /dev/hda7 and /dev/sda7 did seem to repair all
problems (the same problems on each), but that made no difference to
/dev/md0, which is built from them.

In short, I can't boot from it.

I just found why fsck reported the superblock error.  Yes, this is
exactly what I did wrong - I mke2fs'd the individual raid components and
added them together, since I was turning an existing ext3 slash into a
mirror.  Damn:

http://www.linuxjournal.com/node/5653/print says:

    8. Create an ext2 filesystem on /dev/md0 using the command mke2fs
    /dev/md0. Do not mke2fs on the RAID-1 component partitions,
    in this case /dev/hda2 and /dev/hdc2. If you do not create an
    ext2 filesystem on /dev/md0, then e2fsck /dev/md0 will return
    an error message, something like this:
    
    The filesystem size (according to the superblock) is
    2104483 blocks. The physical size of the device is 2104384 blocks.
    Either the superblock or the partition table is likely to be corrupt.
    
    This is because mkraid writes the RAID superblock near the
    end of the component partitions. e2fsck does not recognize
    the RAID superblock that has caused the physical size to be
    smaller. You can mount /dev/md0 at this point, and even use
    /usr, but the ext2 filesystem superblock contains incorrect
    information. You may not notice problems but you should not use
    the filesystem in this state. You will not be able to boot and
    mount /dev/md0 unless you turn off the filesystem checking by
    making the appropriate entry in fstab (e.g., /dev/md0 /usr ext2
    defaults 1 0). The 0 at the end of the line causes e2fsck to be
    skipped. Do not do this unless you have to fix your RAID. Make
    /dev/md0 an ext2 filesystem.
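
The numbers in my own output fit this explanation.  A quick check,
assuming the array uses the 0.90 superblock format (mdadm -D reports
Version 00.90.03), which as far as I understand rounds each component
down to a 64K boundary and then reserves a further 64K for the raid
superblock, and assuming a 4K filesystem block size.  md090_usable_kb
is just a name I made up for the calculation:

```shell
# Usable size, in 1K blocks, of a component of the given size under the
# md 0.90 superblock layout (assumption: round down to a multiple of 64K,
# then subtract 64K for the superblock).
md090_usable_kb() {
    echo $(( ($1 / 64) * 64 - 64 ))
}

part_kb=12586896                      # "Blocks" for /dev/hda7 in fdisk
md_kb=$(md090_usable_kb "$part_kb")   # 12586816, the Array Size in mdadm -D

# Convert 1K blocks to 4K filesystem blocks (assumed ext3 block size).
echo "partition: $(( part_kb / 4 )) blocks"   # 3146724, the superblock's size
echo "md0:       $(( md_kb / 4 )) blocks"     # 3146704, the physical size
```

So the filesystem made on the bare partition is 20 blocks (80K) bigger
than the raid device it now lives on, which is exactly the mismatch
e2fsck complains about.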



Rebooting to an older Ubuntu 6 on a different partition, I was unable to
mount either device in the raid array in question.

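If I tried to mount a device in the array, mount reported that it was
already mounted.  If I forced a mount with -f, it didn't know the
filesystem type.  If I told it ext3, the filesystem mounted but appeared
to be empty.

When I tried to unmount the device (/dev/hda7), "umount" reported that
it wasn't mounted. (?!)  Note that mount's -f flag means "fake" rather
than "force": it only adds the entry to /etc/mtab without performing
the real mount, which would explain both the empty directory and the
umount complaint.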

I thought the data might be lost, but by booting up as far as I can
from the semi-hosed system, I can at least see the slash filesystem and
mount other areas, and copy everything off.  (Interesting to see a
handful of errors in the copy, that match what fsck reported as
problems.)

I'm thinking I should install Ubuntu again from scratch, and redo the
10GB of extra package installation and all the configuration for mail
etc. again. :-(

I gather I do this by installing from the Alternate CD, which is less
beautiful but gives you more control over the installation?


Is there a way to note the list of all the packages I installed, so I
can avoid spending another 4 hours selecting the packages again?
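
One approach I believe should work on a Debian-derived system like
Ubuntu is to dump the dpkg selections on the old install and feed them
back in on the new one.  A sketch, untested here; the two function names
and the file path in the comments are hypothetical:

```shell
# Save the list of package selections to a file.
# (save_package_list / restore_package_list are made-up helper names.)
save_package_list() {
    dpkg --get-selections > "$1"
}

# Load a saved list and ask apt to install whatever is marked "install".
# Needs root on the new system.
restore_package_list() {
    dpkg --set-selections < "$1" && apt-get dselect-upgrade
}

# Old install:  save_package_list /home/luke/packages.txt
# New install:  restore_package_list /home/luke/packages.txt
```

This only records which packages were selected, not their
configuration, so /etc would still need copying or redoing by hand.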

A bit of google searching on ubuntu and raid strongly suggests that
grub just doesn't work properly with a mirrored boot and/or root.
(1: Failed drives cause devices to change name. 2: Even after installing
grub to both devices in the mirror, you still have to have double
stanzas in menu.lst for each raw device so you *manually* choose to
boot off the other device in the event of failure.)

To contradict myself, this page indicates someone doing it happily with
Ubuntu 6.06:

http://users.piuha.net/martti/comp/ubuntu/raid.html

I gather that in contrast, lilo stores the actual locations of the
kernel images on both devices and *also* knows to try each device in
the event of failure.

Does anyone here have any other tips on installing Ubuntu onto a
software raid mirror?

Can you choose to use Lilo with Ubuntu?

Some config details below.

luke

----------------- fdisk /dev/hda ----------------------------
Disk /dev/hda: 200.0 GB, 200049647616 bytes
255 heads, 63 sectors/track, 24321 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/hda1   *           1        5737    46082421    7  HPFS/NTFS
/dev/hda2            5738        5992     2048287+   b  W95 FAT32
/dev/hda3            5993       24321   147227692+   5  Extended
/dev/hda5            5993        6057      522081   82  Linux swap / Solaris
/dev/hda6            6058        7624    12586896   83  Linux
/dev/hda7            7625        9191    12586896   fd  Linux raid autodetect
/dev/hda8            9192       24321   121531693+  fd  Linux raid autodetect

----------------- fdisk /dev/sda ----------------------------
Disk /dev/sda: 250.0 GB, 250059350016 bytes
255 heads, 63 sectors/track, 30401 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1               1        5737    46082421    7  HPFS/NTFS
/dev/sda2            5738        5992     2048287+   b  W95 FAT32
/dev/sda3            5993       30401   196065292+   5  Extended
/dev/sda5            5993        6057      522081   82  Linux swap / Solaris
/dev/sda6            6058        7624    12586896   83  Linux
/dev/sda7            7625        9191    12586896   fd  Linux raid autodetect
/dev/sda8            9192       24321   121531693+  fd  Linux raid autodetect
/dev/sda9           24322       30401    48837568+  83  Linux
----------------- mount ----------------------------
/dev/hda7 on / type ext3 (rw,errors=remount-ro)
proc on /proc type proc (rw)
/sys on /sys type sysfs (rw)
procbususb on /proc/bus/usb type usbfs (rw)
udev on /dev type tmpfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)

-------------- After booting up in the old /dev/hda6 Ubuntu ----------
[EMAIL PROTECTED]:~# mount
/dev/hda6 on / type ext3 (rw,errors=remount-ro)
proc on /proc type proc (rw)
/sys on /sys type sysfs (rw)
varrun on /var/run type tmpfs (rw)
varlock on /var/lock type tmpfs (rw)
procbususb on /proc/bus/usb type usbfs (rw)
udev on /dev type tmpfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
devshm on /dev/shm type tmpfs (rw)
lrm on /lib/modules/2.6.15-25-386/volatile type tmpfs (rw)
/dev/hda1 on /C type ntfs (rw,nls=utf8,umask=007,gid=46)
/dev/hda2 on /D type vfat (rw,utf8,umask=007,gid=46)
/dev/md2 on /home type ext3 (rw)
binfmt_misc on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
----------------- /etc/mdadm/mdadm.conf ----------------------------
DEVICE /dev/hda* /dev/sda*
ARRAY /dev/md0 devices=/dev/hda7,/dev/sda7
ARRAY /dev/md2 devices=/dev/hda8,/dev/sda8
----------------- mdadm -E /dev/md0 ----------------------------
[EMAIL PROTECTED]:~# mdadm -E /dev/md0
mdadm: No super block found on /dev/md0 (Expected magic a92b4efc, got 5c5b8cb7)
----------------- mdadm -D /dev/md0 ----------------------------
/dev/md0:
        Version : 00.90.03
  Creation Time : Sun Aug  6 17:49:47 2006
     Raid Level : raid1
     Array Size : 12586816 (12.00 GiB 12.89 GB)
    Device Size : 12586816 (12.00 GiB 12.89 GB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : Sun Aug 27 22:17:05 2006
          State : clean
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0

           UUID : 198d19c1:54e1b9b4:06d2b37b:ad06ae48
         Events : 0.1298

    Number   Major   Minor   RaidDevice State
       0       8        7        0      active sync   /dev/sda7
       1       3        7        1      active sync   /dev/hda7
----------------- mdadm -D /dev/md1 ----------------------------
/dev/md1:
        Version : 00.90.03
  Creation Time : Sun Aug  6 15:03:29 2006
     Raid Level : raid1
     Array Size : 121531584 (115.90 GiB 124.45 GB)
    Device Size : 121531584 (115.90 GiB 124.45 GB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 1
    Persistence : Superblock is persistent

    Update Time : Sat Aug 12 09:24:07 2006
          State : clean
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0

           UUID : 4da55295:9574e998:08b9b3cb:7a82a70b
         Events : 0.51059

    Number   Major   Minor   RaidDevice State
       0       8        8        0      active sync   /dev/sda8
       1       3        8        1      active sync   /dev/hda8

-------------- Weird stuff --------------------------------------
After mount -f -t ext3 /dev/hda7 /mnt/tmp
-------------------- mount --------------------------------------
/dev/hda6 on / type ext3 (rw,errors=remount-ro)
proc on /proc type proc (rw)
/sys on /sys type sysfs (rw)
varrun on /var/run type tmpfs (rw)
varlock on /var/lock type tmpfs (rw)
procbususb on /proc/bus/usb type usbfs (rw)
udev on /dev type tmpfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
devshm on /dev/shm type tmpfs (rw)
lrm on /lib/modules/2.6.15-25-386/volatile type tmpfs (rw)
/dev/hda1 on /C type ntfs (rw,nls=utf8,umask=007,gid=46)
/dev/hda2 on /D type vfat (rw,utf8,umask=007,gid=46)
/dev/md2 on /home type ext3 (rw)
binfmt_misc on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
/dev/hda7 on /mnt/tmp type ext3 (rw)
/dev/sdb1 on /media/PD 12X type vfat (rw,nosuid,nodev,quiet,shortname=mixed,uid=1000,gid=1000,umask=077,iocharset=utf8)
-------------------- umount --------------------------------------
[EMAIL PROTECTED]:~# umount /dev/hda7
umount: /dev/hda7: not mounted
umount: /dev/hda7: not mounted


-- 
SLUG - Sydney Linux User's Group Mailing List - http://slug.org.au/
Subscription info and FAQs: http://slug.org.au/faq/mailinglists.html