Hi list,

I'm wondering if someone with more experience can help me out here. I am having endless trouble getting a pair of servers up and running with a Sun A5200 JBOD array.
At the moment both boxes have Lenny loaded on them, with limited success. Both make extensive use of software RAID.

Hardware configuration:

Server A
- OpenBoot PROM v3.31
- 4x UltraSPARC II 450 MHz CPUs
- 4 GB RAM
- 2x 36 GB SCSI disks in front bays
- OEM SCSI CD-ROM
- 2x Sun-branded QLogic QLA2100 single-port FC-AL HBAs
- single Quad Fast Ethernet adapter
- using serial console on ttya/tty0 at 38400-8N1

Server B
- OpenBoot PROM v3.33
- otherwise same as A

Sun A5200 JBOD array
- 2x interface boards
- 4x GBICs total (2 per interface board)
- split-loop configuration
- each server has one connection to each 'loop'
- 22x 18 GB FC-AL disks

Note: for the steps below, the other server was disconnected from the array to avoid access conflicts.

Server A boots properly and loads firmware for the QLA2100 FC HBAs. After some experimenting with multipath-tools and a single-loop configuration on the A5200 (which seems impractical due to the changing LUNs on the array), I decided to uninstall the package. After removing it and rebooting, I can read all of the disks and create and manage arrays on the FC-attached devices - but when Server A reboots, the partition tables on all but one of the drives disappear. I have told fdisk to rescan, and they show up every time... until I reboot. I have also used dd to zero out the first 10 MB of the disks and started from scratch, but again, on every reboot of Server A the partition tables disappear.

So I built up Server B in similar fashion, with the idea in my head that multipath-tools had screwed up access to the udev names like /dev/sdc. It booted fine, and I got SILO booting off of a software RAID 1 just like on the other system. I added the qla2xxx firmware, watched the initrd rebuild, rebooted, and my disks showed up! So I partitioned all of the disks from Server B, made a new array, and rebooted - and init failed to find the root partition, dumping me into the initramfs emergency shell.
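For reference, the wipe-and-rescan step on Server A looked roughly like the sketch below. This is not verbatim what I typed: the device names are placeholders for the FC-attached disks on my system, and I've added a DRY_RUN guard so the script only prints the destructive commands instead of running them.

```shell
#!/bin/sh
# Sketch of wiping and rescanning the FC-attached disks.
# DISKS are placeholder names - substitute your own FC devices.
DISKS="/dev/sdc /dev/sdd"
DRY_RUN=1   # leave set to 1 to only print commands; unset to really run them

run() {
    echo "+ $*"
    [ -n "$DRY_RUN" ] || "$@"
}

for d in $DISKS; do
    # zero the first 10 MB, clobbering the disk label / partition table
    run dd if=/dev/zero of="$d" bs=1M count=10
    # ask the kernel to re-read the (now empty) partition table
    run blockdev --rereadpt "$d"
done
```

Even after doing this and repartitioning from scratch, the tables are gone again after the next reboot.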
From that shell, 'mdadm --assemble --scan' finds and assembles the arrays on the internal disks just fine - I can mount them and chroot into the real root filesystem. I figured that maybe SILO suddenly didn't like "partition=0" in silo.conf, so I updated all references (silo.conf, fstab, mdadm.conf) and ran 'mdadm --assemble /dev/md2 -m0 /dev/sda2 /dev/sdb2' so the array would appear as /dev/md2 instead of /dev/md0. Upon reboot it now thinks that /dev/md2 does not exist. Again, an 'mdadm --assemble --scan' finds and assembles everything with no problem.

Any suggestions from the experts? At the moment I'm burning a copy of Solaris 10, but I'd really like to stick with Debian since we've standardized on it for all Linux systems organization-wide.

Please let me know what additional information I should provide, although my ability to capture screen output during boot may be limited, as the boot messages become garbled and strewn randomly across my screen from the point of "console handover: boot [earlyprom0] -> real [tty0]". If anybody has suggestions on that as well, that would be wonderful! (I'm using a Livingston/Lucent Portmaster 2e.)

Thanks!

---
Ross Halliday
Network Operations
WTC Communications

Office: 613-547-6939 x203
Helpdesk: 866-547-6939 option 2
http://www.wtccommunications.ca

--
To UNSUBSCRIBE, email to [email protected] with a subject of "unsubscribe". Trouble? Contact [email protected]
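PS - in case it helps anyone reproduce this, the md0 -> md2 change I described went roughly like the script below. Again, this is a sketch rather than a transcript: the DRY_RUN guard and the ROOT variable are just there so it can be read and run safely outside the chroot, and the config paths are the usual Debian locations on my systems.

```shell
#!/bin/sh
# Sketch of renaming the root array from /dev/md0 to /dev/md2.
# ROOT can point at a chroot or test directory; empty means the live system.
ROOT="${ROOT:-}"
DRY_RUN=1   # leave set to 1 to only print commands; unset to really run them

run() {
    echo "+ $*"
    [ -n "$DRY_RUN" ] || "$@"
}

# rewrite every md0 reference in the boot and RAID config files
for f in "$ROOT/etc/silo.conf" "$ROOT/etc/fstab" "$ROOT/etc/mdadm/mdadm.conf"; do
    run sed -i 's|/dev/md0|/dev/md2|g' "$f"
done

# assemble the array under its new minor number
run mdadm --assemble /dev/md2 -m0 /dev/sda2 /dev/sdb2
```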

