Thanks, David, for the scripts and pointers. I'm a week or so away from trying this out on my box(es). I will post my results back.
Most of the examples seem to use PXE. Has anyone tried this with U-Boot? I hope it has a similar capability to pick a custom initrd image with the Lustre modules and then mount and pivot_root into a mounted subdir.

- Sridhar

> -----Original Message-----
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf Of
> David Golden
> Sent: Monday, December 11, 2006 4:55 AM
> To: [email protected]
> Subject: Re: [Lustre-discuss] Lustre client as root filesystem
>
> Follow-up with some scripts. I'm kind of ashamed of
> some of them, follow them entirely at your own risk!
> This is a "Brute Force and Ignorance" method. I meant
> to write it up when I had refined it... just a little more...
> but might as well chime in now, John might want to compare notes.
>
> "unity" is just what I started calling the shared root setup.
>
>
> Assuming RHEL4 and Lustre 1.4.x (1.6.x will make this
> easier I'd guess)
>
> -2. Be set up to PXE boot.
>
> -1. Have a lustre filesystem set up.
>
> 0. Rsync (or whatever) an OS install to a subdir of it
> (outside scope of this quick pseudo-howto). Hack its
> initscripts a bit, similar to system-config-netboot's hack job.
> Make rc.sysinit call rc.bindmounts to do a bunch of bindmounts
> to split out host-specific parts. The bit at the start is
> to prevent little accidents, aborting early on if the
> per-host directory for this hostname isn't around:
>
> -8<---rc.bindmounts-------------------
> HOSTNAME=`hostname`
> if [ ! -d "/perhost/$HOSTNAME" ]; then
>     echo "No /perhost/$HOSTNAME Directory"
>     sulogin
>     umount -a
>     /usr/local/sbin/lustre_flush_cache
>     mount -n -o remount,ro /
>     sync
>     sync
>     sync
>     reboot -f
> fi
>
>
> mount --bind "/lustre1/home" /home
>
> mount --bind "/perhost/$HOSTNAME" /perhost/currenthost
>
> mount --bind /perhost/currenthost/etc/rc.d/rc3.d/ /etc/rc.d/rc3.d/
> mount --bind /perhost/currenthost/etc/rc.d/rc4.d/ /etc/rc.d/rc4.d/
> mount --bind /perhost/currenthost/etc/rc.d/rc5.d/ /etc/rc.d/rc5.d/
> mount --bind /perhost/currenthost/etc/xinetd.d/ /etc/xinetd.d/
> mount --bind /perhost/currenthost/etc/xinetd.conf /etc/xinetd.conf
> mount --bind /perhost/currenthost/etc/ntp/step-tickers /etc/ntp/step-tickers
> mount --bind /perhost/currenthost/etc/adjtime /etc/adjtime
> mount --bind /perhost/currenthost/etc/motd /etc/motd
> mount --bind /perhost/currenthost/etc/crontab /etc/crontab
> mount --bind /perhost/currenthost/etc/cups/certs/ /etc/cups/certs/
> mount --bind /perhost/currenthost/etc/sysconfig/hwconf /etc/sysconfig/hwconf
> mount --bind /perhost/currenthost/var/tmp/ /var/tmp/
> mount --bind /perhost/currenthost/var/lib/nfs/ /var/lib/nfs/
> mount --bind /perhost/currenthost/var/lib/random-seed /var/lib/random-seed
> mount --bind /perhost/currenthost/var/lock/ /var/lock/
> mount --bind /perhost/currenthost/var/run/ /var/run/
> mount --bind /perhost/currenthost/var/spool/ /var/spool/
> mount --bind /perhost/currenthost/var/log/ /var/log/
> mount --bind /perhost/currenthost/var/lib/logrotate.status /var/lib/logrotate.status
> mount --bind /perhost/currenthost/etc/sysconfig/network /etc/sysconfig/network
> mount --bind /perhost/currenthost/etc/sysconfig/network-scripts/ifcfg-eth0 /etc/sysconfig/network-scripts/ifcfg-eth0
> mount --bind /perhost/currenthost/etc/sysconfig/network-scripts/ifcfg-eth1 /etc/sysconfig/network-scripts/ifcfg-eth1
>
> -------------------------------
>
>
> 1. Make an initrd using RH tools. tg3, libata and ata_piix
> are hardware-specific, just our nics and hdds.
>
> -8<---unity_mkinitrd.basic----
> #!/bin/bash
> mkinitrd --preload tg3 \
>          --preload ksocklnd \
>          --preload ptlrpc \
>          --preload lov \
>          --preload osc \
>          --preload llite \
>          --preload libata \
>          --preload ata_piix \
>          initrd.basic.img 2.6.9-42.0.2.EL_lustre.1.4.7.1smp
> ------------------------------
>
> 2. cpio-decompress the initrd and mutilate it.
>
> 2.1. Split its bin and sbin.
> 2.2. Make some extra directories in it - dhcp may fail without
> a /tmp, and lustre itself may need a mountpoint.
> 2.3. rpm2cpio decompress a bunch of redhat rpms, and rsync
> 'em to the initrd.
> As we're tftp booting, we can make an enormous initrd if we want
> (see also: warewulf, and this is the main "brute force" part
> of the comment above...). Might want to strip out manual pages
> and stuff in /usr/share a bit to keep the size below the
> 16MByte threshold where the boot process may get upset though.
>
> for i in \
>     initrd.basic \
>     initrd.extradirs \
>     bash \
>     coreutils \
>     dhclient \
>     glibc \
>     libacl \
>     libattr \
>     libselinux \
>     libtermcap \
>     lustre \
>     ncurses \
>     net-tools \
>     readline \
>     util-linux \
>     mktemp \
>     iproute \
>     initscripts \
>     procps \
>     sed \
>     gawk \
>     grep \
>     pcre \
>     ; do
>
>     echo "$i"
>     rsync -a "$i/" initrd/
>
> done
>
> ----------------------------------
>
> 2.4. Replace the initrd's init with something else...
> Yes, I was actually lazy enough to just stick bash
> in the initrd and script in that instead of busybox's reduced
> shell! After you've debugged a bit, you might want to
> replace the "exit 1s" with reboots, so that clients
> cycle and try again if something is transiently wrong,
> instead of dumping you at a shell prompt within the initrd.
> I think the /dev and /proc umount shenanigans at the end of this
> script are wrong somehow, but do mostly work. You should
> also watch out for /.oldroot/ being left accessible to users
> when boot is complete - make sure you're not opening security holes...
>
> -8<---unityrc---------------------
> #!/bin/bash
>
> PATH=/sbin:/usr/sbin:/bin:/usr/bin
> export PATH
>
> mount -t proc /proc /proc
> echo Mounted /proc filesystem
> echo Mounting sysfs
> mount -t sysfs none /sys
> echo Creating /dev
> mount -o mode=0755 -t tmpfs none /dev
> mknod /dev/console c 5 1
> mknod /dev/null c 1 3
> mknod /dev/zero c 1 5
> mkdir /dev/pts
> mkdir /dev/shm
> echo Starting udev
> /sbin/udevstart
> echo -n "/sbin/hotplug" > /proc/sys/kernel/hotplug
> echo "Loading tg3.ko module"
> insmod /lib/tg3.ko
>
> echo "Bringing up Network"
>
> echo "loopback"
> ifconfig lo 127.0.0.1 netmask 255.0.0.0 up
>
> hostname '(none)'
>
> echo "DHCP: querying"
> dhclient -pf /tmp/dhclient.pid -lf /tmp/dhclient.leases eth1 >/tmp/dhclient.out 2>&1
> if [ $? -ne 0 ]; then
>     echo "ERROR! dhclient failed!"
>     exec /bin/bash
>     exit 1
> fi
> if [ "`hostname`" == '(none)' ]; then
>     echo "ERROR! did not get a hostname - assuming that there's Trouble."
>     exec /bin/bash
>     exit 1
> fi
>
> echo "DHCP: kill -9 dhclient, (hopefully) leaving interface up"
> kill -9 $(</tmp/dhclient.pid)
>
>
> echo "Lustre Init"
>
> echo "Loading libcfs.ko module"
> insmod /lib/libcfs.ko
> echo "Loading lnet.ko module"
> insmod /lib/lnet.ko
> echo "Loading ksocklnd.ko module"
> insmod /lib/ksocklnd.ko
> echo "Loading lvfs.ko module"
> insmod /lib/lvfs.ko
> echo "Loading obdclass.ko module"
> insmod /lib/obdclass.ko
> echo "Loading ptlrpc.ko module"
> insmod /lib/ptlrpc.ko
> echo "Loading lov.ko module"
> insmod /lib/lov.ko
> echo "Loading osc.ko module"
> insmod /lib/osc.ko
> echo "Loading mdc.ko module"
> insmod /lib/mdc.ko
> echo "Loading llite.ko module"
> insmod /lib/llite.ko
>
> echo "Loading scsi_mod.ko module"
> insmod /lib/scsi_mod.ko
> echo "Loading sd_mod.ko module"
> insmod /lib/sd_mod.ko
> echo "Loading libata.ko module"
> insmod /lib/libata.ko
> echo "Loading ata_piix.ko module"
> insmod /lib/ata_piix.ko
> /sbin/udevstart
>
>
> echo Mounting Lustre root filesystem
> mount -t lustre -o user_xattr,acl <YOUR_LUSTRE_SERVER>:/mds1/client /lustre1
> if [ $? -ne 0 ]; then
>     echo "ERROR! Lustre mount failed!"
>     exec /bin/bash
>     exit 1
> fi
>
> grep lustre1 /proc/mounts
> if [ $? -ne 0 ]; then
>     echo "ERROR! Lustre mount not found!"
>     exec /bin/bash
>     exit 1
> fi
>
>
> mount --bind /lustre1/centos4a /sysroot
> mount --bind /lustre1 /sysroot/lustre1
> umount /lustre1
>
> mount -t tmpfs --bind /dev /sysroot/dev
>
> umount /sys
> umount /proc
> umount /dev
> echo Switching to new root
> cd /sysroot
> pivot_root /sysroot .oldroot
> umount /.oldroot/dev
> umount /.oldroot/proc
> exec /sbin/init
>
> ----------------------------------
>
>
> 2.5. Re-compress the initrd.
>
> 3. Attempt to pxe-boot clients with your new setup.
> 4. Attempt to debug the mess you've just made and try again :-)
>
> _______________________________________________
> Lustre-discuss mailing list
> [email protected]
> https://mail.clusterfs.com/mailman/listinfo/lustre-discuss
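
David's steps 2 and 2.5 (unpack the initrd, then re-compress it) are given without commands, so here is a rough sketch of how they might look. It assumes the image mkinitrd produced is a gzip-compressed cpio archive in newc format (worth checking with `file` first) and that GNU cpio is installed. The function names and the output name initrd.unity.img are made up for illustration; initrd.basic.img and the initrd/ directory are the names from the scripts above.

```shell
#!/bin/bash
# Sketch only: assumes a gzip-compressed "newc" cpio image and GNU cpio.
# unpack_initrd and repack_initrd are hypothetical helper names.

# Step 2: unpack IMAGE ($1) into directory DIR ($2).
unpack_initrd() {
    local image dir
    image=$(readlink -f "$1")   # absolute path, because we cd below
    dir=$2
    mkdir -p "$dir" &&
    ( cd "$dir" && gzip -dc "$image" | cpio -idm --quiet )
}

# Step 2.5: repack directory DIR ($1) into IMAGE ($2).
repack_initrd() {
    local dir image
    dir=$1
    image=$(readlink -f "$2")   # absolute path, because we cd below
    ( cd "$dir" && find . | cpio -o -H newc --quiet | gzip -9 > "$image" )
}

# Possible usage around the rsync loop in step 2.3:
#   unpack_initrd initrd.basic.img initrd.basic
#   ... run the rsync loop to assemble initrd/ ...
#   repack_initrd initrd initrd.unity.img
```

Run the repack step as root if you want the device nodes and file ownership inside the image to survive. If `file` reports something other than gzip-compressed data for your image, this sketch does not apply as written.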
