For anyone interested, I started the 2-year overhaul of my PowerEdge
2900 server, ZFS included. After reading a bit about the pitfalls of
exposing each disk as a single-drive RAID-0 volume (with respect to HDD
firmware and to ZFS itself), I ordered a cheap SAS 6/iR controller from
Dell (it was cheaper than second hand on eBay!) as a drop-in replacement
for the original integrated PERC 5/i, in order to expose 10 independent
drives and let ZFS do its thing; I suppose a newer 6Gb/s card like the
HBA200 would have worked as well, given the proper cables. The old
SAS 6/iR cannot handle drives over 2TB, but that is OK with me.

I tried hard, really hard, to move to Ubuntu 12.04 LTS, in recognition
of its superior documentation and ZFS support compared to Debian.
Besides, it seems Dell now supports Ubuntu on its newer servers. Well,
good luck with that: I was unable to get a reliable boot sequence that
would properly bring up the bonded/bridged network (with VLANs), the ZFS
volumes, the NFS exports, and the KVM-based VMs this server runs.
Getting the network to work was quite difficult, and just when I thought
everything was fine, the pool or the zvols started refusing to mount. I
put the blame on race conditions in upstart, which to me is exactly the
kind of reinvention we absolutely don't need in Linux, on the same level
as cdrtools->wodim, oss->alsa->pulse, grub->grub2, or …
So I came back to Debian Squeeze, which still uses System V init scripts
and which, as a result, works. Since I use squeeze-backports (for
qemu-kvm and a 3.x Linux kernel) I had to compile ZFS from the source
package found in Ubuntu's PPA; this is kind of annoying, but it works.
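From memory, the build went roughly like this (a sketch only: the actual
.dsc URLs in the PPA are not reproduced here, and spl has to be built
and installed before zfs):
Code:
--------------------
  # prerequisites on Squeeze (headers must match the running kernel)
  apt-get install build-essential devscripts fakeroot dkms linux-headers-$(uname -r)
  # fetch, unpack and build each source package: spl first, then zfs
  dget -x <URL-of-the-spl-.dsc-in-the-PPA>
  (cd spl-*/ && dpkg-buildpackage -us -uc -b) && dpkg -i spl*.deb
  dget -x <URL-of-the-zfs-.dsc-in-the-PPA>
  (cd zfs-*/ && dpkg-buildpackage -us -uc -b) && dpkg -i zfs*.deb
--------------------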

On either Ubuntu or Debian you may get bitten by the usual shuffling of
/dev/sd* assignments across reboots, something you want to avoid in a
mirrored pool, or simply in order to identify the physical location of a
failed drive. To get around this you'll want to use disk/by-path names
when creating/importing a zpool. On Debian, with my SAS 6/iR controller,
I had to upgrade udev to the version found in testing (that's v175, the
one that ships with Ubuntu 12.04 LTS); otherwise the drives on that
controller were missing from the by-path directory.
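Just for illustration, this is the kind of name by-path gives you, and a
pool can then be created or imported against that directory (sketched;
the path shown is from my own box, yours will differ):
Code:
--------------------
  ls /dev/disk/by-path/
  # e.g. pci-0000:01:00.0-sas-0x1221000000000000-lun-0, ...-lun-0-part1, and so on
  zpool import -d /dev/disk/by-path rz2-120730
--------------------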
Since by-path does not produce the most legible names, and the zdev.conf
file trick only works for a single pool (AFAIK), I crafted udev rules to
get persistent and legible names. My
/etc/udev/rules.d/70-persistent-zpool-sas6ir.rules file looks like this
…
Code:
--------------------
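# one "disk" rule and one "partition" rule per physical drive; the ID_PATH
# value for each drive can be read from "udevadm info --query=property --name=/dev/sdX"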
ENV{DEVTYPE}=="disk", ENV{SUBSYSTEM}=="block", ENV{ID_PATH}=="pci-0000:01:00.0-sas-0x1221000000000000-lun-0", SYMLINK+="disk/zpool-sas6ir/0-sas_A-main_up"
ENV{DEVTYPE}=="partition", ENV{SUBSYSTEM}=="block", ENV{ID_PATH}=="pci-0000:01:00.0-sas-0x1221000000000000-lun-0", SYMLINK+="disk/zpool-sas6ir/0-sas_A-main_up-part%n"
…
ENV{DEVTYPE}=="disk", ENV{SUBSYSTEM}=="block", ENV{ID_PATH}=="pci-0000:01:00.0-sas-0x500188b335af2109-lun-0", SYMLINK+="disk/zpool-sas6ir/9-sas_B-flexbay"
ENV{DEVTYPE}=="partition", ENV{SUBSYSTEM}=="block", ENV{ID_PATH}=="pci-0000:01:00.0-sas-0x500188b335af2109-lun-0", SYMLINK+="disk/zpool-sas6ir/9-sas_B-flexbay-part%n"
--------------------
 … to this effect … 
Code:
--------------------
    ls -la /dev/disk/zpool-sas6ir/
  total 0
  drwxr-xr-x 2 root root 640 Aug 14 22:50 .
  drwxr-xr-x 7 root root 140 Aug 14 22:50 ..
  lrwxrwxrwx 1 root root   9 Aug 14 22:50 0-sas_A-main_up -> ../../sdl
  lrwxrwxrwx 1 root root  10 Aug 14 22:50 0-sas_A-main_up-part1 -> ../../sdl1
  lrwxrwxrwx 1 root root  10 Aug 14 22:50 0-sas_A-main_up-part9 -> ../../sdl9
  …
  lrwxrwxrwx 1 root root   9 Aug 14 22:50 9-sas_B-flexbay -> ../../sdj
  lrwxrwxrwx 1 root root  10 Aug 14 22:50 9-sas_B-flexbay-part1 -> ../../sdj1
  lrwxrwxrwx 1 root root  10 Aug 14 22:50 9-sas_B-flexbay-part9 -> ../../sdj9
--------------------
 … allowing commands such as "zpool import -d /dev/disk/zpool-sas6ir".
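As a side note, after editing the rules file the symlinks can, I
believe, be refreshed without a reboot; with udev 175 something along
these lines should do:
Code:
--------------------
  udevadm control --reload-rules
  udevadm trigger --subsystem-match=block
--------------------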
The zpool itself looks like this:
Code:
--------------------
    # sudo zpool status
  pool: rz2-120730
  state: ONLINE
  scan: resilvered 851G in 11h30m with 0 errors on Sat Aug  4 11:06:48 2012
  config:
  
        NAME                 STATE     READ WRITE CKSUM
        rz2-120730           ONLINE       0     0     0
          raidz2-0           ONLINE       0     0     0
            1-sas_A-main_up  ONLINE       0     0     0
            2-sas_A-main_up  ONLINE       0     0     0
            4-sas_B-main_lo  ONLINE       0     0     0
            5-sas_B-main_lo  ONLINE       0     0     0
            6-sas_B-main_lo  ONLINE       0     0     0
            7-sas_B-main_lo  ONLINE       0     0     0
            8-sas_B-flexbay  ONLINE       0     0     0
            9-sas_B-flexbay  ONLINE       0     0     0
            3-sas_A-main_up  ONLINE       0     0     0
            0-sas_A-main_up  ONLINE       0     0     0
  
  errors: No known data errors
  # sudo zpool list
  NAME         SIZE  ALLOC   FREE    CAP  DEDUP  HEALTH  ALTROOT
  rz2-120730  18.1T  8.71T  9.42T    48%  1.00x  ONLINE  -
--------------------
(Why the 3 and the 0 at the end of the list? Because I initially created
the pool with sd* names, and I don't want to destroy 9TB of data just
for cosmetics.)

I intend to create similar udev files for the other cheap SATA
port-multiplier controllers I have, which would allow moving the
physical drives to another controller or another machine. For the moment
I haven't set up a separate log (ZIL) device; a quick test didn't show a
difference on this machine. I may come back to that. (BTW, zdev.conf
does not allow giving fancy names to ZIL or L2ARC devices, while a udev
rule does.)
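For example, if I ever add an SSD for the ZIL, one more rule of the same
kind would give it a friendly name (the ID_PATH below is invented):
Code:
--------------------
  # hypothetical rule for a future log (ZIL) SSD; the ID_PATH is made up
  ENV{DEVTYPE}=="disk", ENV{SUBSYSTEM}=="block", ENV{ID_PATH}=="pci-0000:03:00.0-scsi-0:0:0:0", SYMLINK+="disk/zpool-sas6ir/ssd-zil"
--------------------
… and then something like "zpool add rz2-120730 log
/dev/disk/zpool-sas6ir/ssd-zil" (or "cache" for an L2ARC device) would
attach it to the pool under that name.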

I may start using zvols for my VMs instead of image files. I guess that
would give me LVM2 without LVM2… I already replaced the loop-mounted
large image files I used as Time Machine backup volumes with zvols, and
that works fine.
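Roughly what I did for one of those backup volumes (the dataset name,
size, filesystem and mount point are just examples):
Code:
--------------------
  # create a 200G zvol, then format and mount it like any other block device
  zfs create -V 200G rz2-120730/tm-macbook
  mkfs.ext4 /dev/zvol/rz2-120730/tm-macbook
  mount /dev/zvol/rz2-120730/tm-macbook /srv/timemachine/macbook
--------------------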

df is completely off when reporting available space on the zpool's
filesystems, but it is fine with zvols that are formatted and mounted
the traditional way.
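The ZFS tools themselves give the figures I trust:
Code:
--------------------
  zfs list rz2-120730     # USED/AVAIL as ZFS sees them
  zpool list rz2-120730   # raw pool size, parity included
--------------------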

Leaving the hardware RAID-5 controller means I have lost the patrol-read
feature which used to check the arrays for defects. So I need to write a
daemon (or a simple cron job) to scrub weekly and report, and probably
configure smartd to run routine checks as well.
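A minimal version of the cron part could be as simple as this (paths and
schedule are my guesses; cron mails the output of the status check):
Code:
--------------------
  # /etc/cron.d/zfs-scrub -- weekly scrub Sunday night, status report on Friday,
  # well after the scrub should be done; adjust the path to wherever zpool lives
  MAILTO=root
  30 3 * * 0   root   /sbin/zpool scrub rz2-120730
  0  8 * * 5   root   /sbin/zpool status -x rz2-120730
--------------------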


Thanks to upstart, my Squeezeboxes were out of order for quite a while.
I shall remember that.

