Debian RAID-1 installation (was: Re: [SLUG] Enterprise Platform)

Jamie Wilkinson Thu, 20 May 2004 19:54:18 -0700

This one time, at band camp, Jamie Wilkinson wrote:
>This one time, at band camp, Howard Lowndes wrote:
>>OK, so how do I install Debian with RAID and LVM?
>
>I can elaborate more later


So here's Anchor's RAID migration document, tweaked for Debian unstable
(previously it'd only been done on various flavours of Red Hat), based
on the migration I did last night (and I expect I'll be doing it again
today on a different machine).

I'm going to skip the parts that migrate an existing RAID configuration
to bigger disks or an alternate RAID scheme, and instead assume you've
got two disks in a box and you performed the install onto the first
disk (hda), thus leaving the second disk (hdb) unused.

* Partition hdb how you like it.  My server partitioning scheme for IDE
  machines looks like this (shown for hda; hdb gets the same layout):

  hda1    500M        /
  hda2    2G          swap
  hda3    (extended)
  hda5    2G          /var
  hda6    4G          /usr
  hda7    remaining   /data

  /data holds the directories that get bind mounted onto /home and
  /var/lib/{mysql,postgres}.

* Create a degraded RAID array.

  mdadm -C -l 1 -n 2 /dev/md0 /dev/hdb1 missing
  mdadm -C -l 1 -n 2 /dev/md1 /dev/hdb5 missing
  mdadm -C -l 1 -n 2 /dev/md2 /dev/hdb6 missing
  mdadm -C -l 1 -n 2 /dev/md3 /dev/hdb7 missing

  -l specifies the raid level, in this case RAID-1, and -n specifies the
  number of devices that will be in the array, in our case 2.  'missing'
  lets mdadm know that there is another device we haven't specified yet,
  and the array will be built on one half.

  The same trick works for RAID-5: for a three-disk array you'd list
  two real partitions and one 'missing'.
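
  With more than a couple of arrays it can help to print the create
  commands and eyeball them before running anything.  A minimal sketch,
  assuming the md numbering and hdb partition list used above;
  print_degraded_creates is a made-up helper name and it only echoes
  the commands, it doesn't run them:

```shell
#!/bin/sh
# Hypothetical helper: print (not run) one degraded RAID-1 create
# command per partition, numbering the arrays md0, md1, ... in order.
print_degraded_creates() {
    n=0
    for part in "$@"; do
        echo "mdadm -C -l 1 -n 2 /dev/md$n $part missing"
        n=$((n + 1))
    done
}

# Partition list from the scheme above (hdb2 is swap and is left out,
# as in the commands above).
print_degraded_creates /dev/hdb1 /dev/hdb5 /dev/hdb6 /dev/hdb7
```

  Once the output looks right, pipe it to sh or paste it in by hand.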

* Create the new filesystems:

  mke2fs -j /dev/md0
  ...

* Mount the new partitions.

  mkdir /newroot
  mount /dev/md0 /newroot
  cd /newroot
  mkdir usr var data
  mount ...
  mkdir data/var.lib.postgres
  mkdir -p var/lib/postgres
  mount -o bind data/var.lib.postgres var/lib/postgres
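
  The naming convention above (var/lib/postgres lives in
  data/var.lib.postgres) is easy to script if there are several bind
  mounts.  A sketch only: bind_name and print_bind_mounts are
  hypothetical helpers, the path list is an assumption based on the
  directories mentioned earlier, and the commands are echoed rather
  than run.  The printed paths are relative to /newroot, as above.

```shell
#!/bin/sh
# Hypothetical helper: turn a path such as /var/lib/postgres into the
# dotted directory name used under data/ (var.lib.postgres).
bind_name() {
    echo "$1" | sed -e 's|^/||' -e 's|/|.|g'
}

# Hypothetical helper: print the mkdir and bind mount commands for each
# path given, following the same three steps shown above.
print_bind_mounts() {
    for path in "$@"; do
        rel=${path#/}
        echo "mkdir data/$(bind_name "$path")"
        echo "mkdir -p $rel"
        echo "mount -o bind data/$(bind_name "$path") $rel"
    done
}

# Assumed list of bind-mounted directories.
print_bind_mounts /home /var/lib/mysql /var/lib/postgres
```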

* Copy the data from the existing root to the new root

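  cd /
  # One entry per filesystem on the old disk.  -x stops cp crossing into
  # other filesystems; the trailing /. copies the directory's contents
  # instead of creating e.g. /newroot/usr/usr.
  for mountpoint in / /var /usr /data; do
      cp -ax "$mountpoint/." "/newroot$mountpoint/"
  done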
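
  Rather than typing the list of mountpoints from memory, the mount
  table can be filtered by device.  A sketch, assuming the old
  filesystems all live on /dev/hda partitions; old_mountpoints is a
  hypothetical helper and the sample data is made up.  The root
  filesystem can show up in /proc/mounts under a name like /dev/root
  rather than /dev/hda1, so add / by hand if it's missing.

```shell
#!/bin/sh
# Hypothetical helper: read /proc/mounts-style text on stdin and print
# the mount point (field 2) of every entry whose device (field 1)
# starts with the given disk name.
old_mountpoints() {
    awk -v disk="$1" 'index($1, disk) == 1 { print $2 }'
}

# Demo on made-up sample data; on the real box you would run
#   old_mountpoints /dev/hda < /proc/mounts
old_mountpoints /dev/hda <<'EOF'
/dev/hda5 /var ext3 rw 0 0
/dev/hda6 /usr ext3 rw 0 0
/dev/md1 /newroot/var ext3 rw 0 0
proc /proc proc rw 0 0
EOF
```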

* Now comes the fun part.  Shut down all services that are writing to
  the disk, using ps ax and netstat -lnp to find out who's still alive.

* Rsync across whatever got written during the copy:

  for mountpoint in / /var /usr /data; do
      # -n is a dry run: double check that it did what you thought,
      # then remove the -n option and run it again
      rsync -avnx "$mountpoint/" "/newroot$mountpoint/"
  done

* Pivot the kernel onto the new root:

  mkdir /newroot/oldroot
  cd /newroot
  pivot_root . oldroot

  If you're on the console, you can now run

  exec chroot . /bin/sh <dev/console >dev/console 2>&1

  but if you're playing tough-guy-migration via SSH, don't do that yet.

  Then mount the virtual filesystems on the new root:

  mount -t proc proc /proc
  mount -t devpts devpts /dev/pts
  mount -t tmpfs tmpfs /tmp

  # and for 2.6 kernels
  mount -t sysfs sysfs /sys
* Restart init and SSH:

  telinit u
  /etc/init.d/ssh restart

  Now if you're in tough guy mode, ssh into the new machine.

  fuser -vm /oldroot should show you a few kernel threads and your first
  ssh session.  If the ssh restart was successful and you're logged in,
  log out of the first ssh session.

* Umount the old root:

  See the processes holding up the umount:

  fuser -mv /oldroot

  See the mounts holding up the umount:

  cat /proc/mounts

  Umount the virtual filesystems from oldroot:

  umount /oldroot/proc
  umount /oldroot/dev/pts

  and so on.
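
  To see everything still mounted under /oldroot in a sensible umount
  order, the mount table can be filtered and sorted so that children
  come before their parents.  A sketch; mounts_under is a hypothetical
  helper and the sample data is made up:

```shell
#!/bin/sh
# Hypothetical helper: from /proc/mounts-style text on stdin, print the
# mount points at or below the given directory, deepest paths first, so
# that children can be unmounted before their parents.
mounts_under() {
    awk -v top="$1" '$2 == top || index($2, top "/") == 1 { print $2 }' |
        LC_ALL=C sort -r
}

# Demo on made-up sample data; on the real box you would run
#   mounts_under /oldroot < /proc/mounts
mounts_under /oldroot <<'EOF'
/dev2/root2 /oldroot ext3 rw 0 0
proc /oldroot/proc proc rw 0 0
devpts /oldroot/dev/pts devpts rw 0 0
/dev/md0 / ext3 rw 0 0
/dev/md1 /var ext3 rw 0 0
EOF
```

  A reverse sort is enough here because a child mount point always has
  its parent as a prefix, so it sorts after the parent and lands before
  it once the order is reversed.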

  Chances are /proc/mounts says you've got a /dev/root.old and a
  /dev2/root2, and you've got some kernel threads attached to
  /oldroot/initrd:

  mount -o remount,ro /dev/root.old /oldroot/initrd
  mount -o remount,ro /dev2/root2 /oldroot

  umount -l /oldroot/initrd
  umount /oldroot

  The -l option to umount is a recent feature that does a lazy umount:
  it detaches the filesystem from the mount namespace straight away, so
  it's effectively gone, and the real umount happens once every process
  using it has finished.  It's a good idea to remount it read-only
  first, just so you don't break anything.

* Fix /etc/fstab and /etc/mtab

  /etc/fstab has the old hda filesystems on it, so fix that up.

  /etc/mtab has the old devices listed because the pivot_root doesn't
  update it, so fix that up too.  Cross check against /proc/mounts.
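
  The cross check can be done mechanically by comparing the device and
  mount point columns of the two tables.  A sketch; mount_table_diff is
  a hypothetical helper.  Expect some noise: /proc/mounts can carry
  extra entries and may name the root device differently.

```shell
#!/bin/sh
# Hypothetical helper: print "device mountpoint" pairs that appear in
# only one of the two mount tables given as arguments.  Lines starting
# with "<" are only in the first file, ">" only in the second.
mount_table_diff() {
    a=$(mktemp) && b=$(mktemp) || return 1
    awk '{ print $1, $2 }' "$1" | LC_ALL=C sort > "$a"
    awk '{ print $1, $2 }' "$2" | LC_ALL=C sort > "$b"
    # Keep only the changed lines from diff's output.
    diff "$a" "$b" | sed -n '/^[<>]/p'
    rm -f "$a" "$b"
}

# Typical use:
#   mount_table_diff /etc/mtab /proc/mounts
```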

* Update the boot loader.

  Debian unstable uses grub, so something like this will install the
  first stage bootloader into both MBRs:

  grub
  grub> device (hd0) /dev/hda
  grub> device (hd1) /dev/hdb
  grub> root (hd0,0)
  grub> setup (hd0)
  grub> root (hd1,0)
  grub> setup (hd1)

  That tells grub that its hd0 is Linux's hda, to use hda1 as the
  location for grub's files (/boot/grub, as /boot is on / in my case),
  and to install the MBR on /dev/hda.  The second pass does the same
  thing on hdb, using /dev/hdb1 and hdb's MBR.

  Double check it worked by looking for the string GRUB in the output
  of

  dd if=/dev/hdX count=1 | strings
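
  The same check can be wrapped up so it gives a yes/no answer per
  disk.  A sketch; has_grub_mbr is a hypothetical helper, and it uses
  tr and grep in place of strings so it doesn't depend on binutils
  being installed:

```shell
#!/bin/sh
# Hypothetical helper: succeed if the first 512-byte sector of the given
# device or file contains the string GRUB.  NUL bytes are turned into
# newlines so grep sees ordinary text lines.
has_grub_mbr() {
    dd if="$1" bs=512 count=1 2>/dev/null |
        LC_ALL=C tr '\000' '\n' |
        LC_ALL=C grep -q GRUB
}

# Typical use (needs read access to the disks):
#   for disk in /dev/hda /dev/hdb; do
#       if has_grub_mbr "$disk"; then
#           echo "$disk: GRUB found"
#       else
#           echo "$disk: no GRUB in the MBR"
#       fi
#   done
```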

* Fix the initrd

  I always forget this part and end up booting off of half of the /
  array which has the effect of destroying the raid superblock,
  requiring /dev/md0 to be reconstructed afterwards.  So the moral is
  DON'T FORGET THIS PART.

  mkinitrd -k -o /boot/initrd.img.tmp

  Look in the temporary directory that mkinitrd left its files in,
  /tmp/mkinitrd.*/initrd, and make sure that the file 'script' contains a
  line that builds the /dev/md0 array, like this:

  mdadm -A /devfs/md/0 -R -u ...

  It'll probably be building it using only /dev/hdb1 at this moment;
  that's fine.
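
  Since this is the step that gets forgotten, it's worth making the
  check a one-liner.  A sketch; initrd_assembles_md is a hypothetical
  helper and it only looks for an 'mdadm -A' line, it doesn't verify
  the devices or UUID on that line:

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given initrd 'script' file has a
# line that assembles an array with mdadm -A.
initrd_assembles_md() {
    grep -q 'mdadm -A ' "$1"
}

# Check whatever mkinitrd -k left behind in /tmp.
for script in /tmp/mkinitrd.*/initrd/script; do
    [ -f "$script" ] || continue
    if initrd_assembles_md "$script"; then
        echo "$script: assembles an md array"
    else
        echo "$script: NO mdadm -A line, do not boot this initrd"
    fi
done
```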

* Fix mdadm.conf.

  /etc/init.d/mdadm and /etc/init.d/mdadm-raid will automagically build
  the remaining arrays at boot time if you get this right, otherwise the
  fsck will bomb out because /dev/md1 and friends are corrupted (i.e.
  don't really exist)

  So, in /etc/mdadm/mdadm.conf:

  DEVICE /dev/hda* /dev/hdb*
  ARRAY /dev/md0 devices=/dev/hda1,/dev/hdb1
  ARRAY /dev/md1 devices=/dev/hda5,/dev/hdb5
  ARRAY /dev/md2 devices=/dev/hda6,/dev/hdb6
  ARRAY /dev/md3 devices=/dev/hda7,/dev/hdb7

  Make sure you remember the DEVICE line, otherwise it'll still fail...
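
  A quick sanity check on the file catches the missing DEVICE line
  before the reboot does.  A sketch; check_mdadm_conf is a hypothetical
  helper.  Note that in the real file the keywords start in the first
  column (the lines above are only indented for this mail); mdadm
  treats lines starting with whitespace as continuations.

```shell
#!/bin/sh
# Hypothetical helper: sanity-check an mdadm.conf.  Prints a message and
# returns non-zero if there is no DEVICE line or no ARRAY line starting
# in the first column.
check_mdadm_conf() {
    status=0
    grep -q '^DEVICE[[:space:]]' "$1" ||
        { echo "$1: no DEVICE line"; status=1; }
    grep -q '^ARRAY[[:space:]]' "$1" ||
        { echo "$1: no ARRAY lines"; status=1; }
    return $status
}

# Typical use:
#   check_mdadm_conf /etc/mdadm/mdadm.conf
```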

* Reconstruct the RAID array from the now free hda

  sfdisk -d /dev/hdb | sfdisk /dev/hda

  That'll copy the partition table from hdb, your good disk, to hda, the
  missing disk.

  Hot add the partitions to the array:

  mdadm -a /dev/md0 /dev/hda1
  mdadm -a /dev/md1 /dev/hda5
  mdadm -a /dev/md2 /dev/hda6
  mdadm -a /dev/md3 /dev/hda7

  ramp up the reconstruction speed:

  echo 1000000000 > /proc/sys/dev/raid/speed_limit_max

  watch the progress:

  watch "cat /proc/mdstat"
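
  If you'd rather have a script tell you when the rebuild is done, the
  status field in /proc/mdstat can be parsed: [UU] means both halves
  are up, an underscore means a member is missing or still rebuilding.
  A sketch; degraded_arrays is a hypothetical helper:

```shell
#!/bin/sh
# Hypothetical helper: read /proc/mdstat-style text on stdin and print
# the name of every array whose status field (e.g. [U_]) still has a
# missing or rebuilding member, shown as an underscore.
degraded_arrays() {
    awk '/^md[0-9]+ :/ { name = $1 }
         /\[[U_]*_[U_]*\]/ { print name }'
}

# Poll until nothing is degraded any more (60 s is an arbitrary choice):
#   while [ -n "$(degraded_arrays < /proc/mdstat)" ]; do sleep 60; done
```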

* Do the boot loader again, because it's fun, and likely stuff has moved
  around on hda1.

  Don't do anything on hda until the raid reconstruction is finished.


At this point you can continue using the machine, in fact as early as
the umount oldroot step you can restart all your services and the
machine will be back online: that's a downtime of only as long as it
takes to do the final rsync before pivoting.

I'd recommend rebooting soon after, though, so you can make sure you got
the bootloader and initrd part right.  During booting you can get away
with changing the kernel root= option to use /dev/hda1 if the raid array
isn't getting constructed in your initrd.img.tmp.  Don't delete or
overwrite any of your initrds whilst you're debugging; only once it
boots without assistance should you overwrite the initrd.img that's
listed in the grub menu.lst.

-- 
[EMAIL PROTECTED]                           http://spacepants.org/jaq.gpg
-- 
SLUG - Sydney Linux User's Group Mailing List - http://slug.org.au/
Subscription info and FAQs: http://slug.org.au/faq/mailinglists.html
