I did some more ZFS testing on 9.0_BETA, sorry for the length. There are a lot of advantages and use cases for getting ZFS more completely integrated into the NetBSD universe. RAIDframe+LVM is great too, but there are things that are harder to do with it.
One of the purposes of these tests was to see how NetBSD ZFS interacts with the ZFS implementations in other OSs. The following was done: a pool created on NetBSD was exported with "zpool export pool" and then imported with "zpool import pool" on FreeBSD and SmartOS. Both of these worked completely as expected, with full read and write on the pools. There are ZFS features not enabled on pools created with NetBSD, but this does not appear to be a problem; as long as a "zpool upgrade" is not performed on the pool, you can import and export as much as you wish. I have not tried any Linux implementations.

Going the other direction did not work quite as expected. On pools created on FreeBSD, the ZFS features that are turned on will not allow the pool to import as writable on NetBSD. This is not really unexpected, given that features are turned on in FreeBSD that are not present in NetBSD. The pool will import read-only and appears to work as expected in that case. It should be possible to create pools in FreeBSD with features disabled ("zpool create -d", for example) that will allow a better import.

For SmartOS, pools created there will (might?? I only used the filesystems I presented to the ISO) have a GPT header attached to them. When NetBSD sees that, it will by default create wedges. In general this wouldn't be a problem; however, "zpool import" does not appear to see dk devices, or refuses to deal with them, and thus no pools were noticed for import. You can sort of force the issue by creating a directory with symlinks to the /dev/dkN devices in it and then doing a "zpool import -d /dev/directory". When this was attempted, the domU guest I was using panicked:

    panic: kernel diagnostic assertion "seg <= BLKIF_MAX_SEGMENTS_PER_REQUEST" failed:
    file "/usr/src/sys/arch/xen/xen/xbd_xenbus.c", line 1032

I am running an older 8.99.xx with Xen 4.8.x, but as this came from the guest, I suspect the bug is still present. From the backtrace, it appears that the import was attempted but failed somewhere in the lower-level disk routines. I did not try doing this with FreeBSD.

Other things of note: assuming you want separate /var and /usr, trying to use ZFS for them is a challenge. One of the biggest issues is that /sbin/zfs and /sbin/zpool are linked against shared libraries in /usr/lib. This is probably not for the best. Further, mostly because of this, you can't reasonably have an /etc/rc.d/zfs that loads the module and makes it usable for critical_filesystems in /etc/rc.conf (you can fake it out by putting copies of the needed shared libraries in /usr/lib on the root filesystem and then mounting the real /usr over it, but that is a bit ugly). Having /var and /usr on LVM is possible mostly because /etc/rc.d/lvm exists and is started before critical_filesystems are mounted. ZFS should probably work the same way; a rough sketch of such an rc.d script is below.

Another item is that mountpoint=legacy really does not work. Right now there is neither support for ZFS in the general mount system call nor a /sbin/mount_zfs. You can actually get some of this effect with a simple shell script in /sbin called mount_zfs that translates the request into a "/sbin/zfs set mountpoint=XXX" call followed by a "zfs mount XXX". I wrote a quick and dirty one of these (sketched below), and with a quick and dirty /etc/rc.d/zfs it will (mostly) allow the effect of having /var and /usr come from a ZFS pool.
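The wrapper really is just a translation shim. Here is a minimal sketch of the sort of thing I mean; it assumes mount(8) invokes it as "mount_zfs [-o options] dataset mountpoint" and simply ignores any -o options, so treat it as a sketch rather than a finished implementation:

    #!/bin/sh
    # /sbin/mount_zfs - fake "legacy" ZFS mounts by rewriting the
    # request into zfs set/mount calls.  Any -o options are ignored.

    while getopts o: opt; do
        case $opt in
        o)  ;;              # ignore mount options for now
        esac
    done
    shift $((OPTIND - 1))

    dataset=$1              # e.g. pool/var
    mntpt=$2                # e.g. /var

    /sbin/zfs set mountpoint="$mntpt" "$dataset" || exit 1
    exec /sbin/zfs mount "$dataset"

With that in /sbin, an fstab entry with a filesystem type of "zfs" should end up being handed to the wrapper by mount(8).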
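And for completeness, a rough sketch of what an /etc/rc.d/zfs along those lines could look like. The PROVIDE/REQUIRE/BEFORE lines and the modload step are my guesses, and of course it only helps once the /usr/lib linkage issue mentioned above is dealt with:

    #!/bin/sh
    #
    # PROVIDE: zfs
    # REQUIRE: root
    # BEFORE: mountcritlocal

    $_rc_subr_loaded . /etc/rc.subr

    name="zfs"
    rcvar=$name
    start_cmd="zfs_start"
    stop_cmd=":"

    zfs_start()
    {
        # Load the zfs module (ignore the error if it is already
        # loaded) and mount datasets that have a mountpoint set.
        /sbin/modload zfs 2>/dev/null || true
        /sbin/zfs mount -a
    }

    load_rc_config $name
    run_rc_command "$1"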
An unexpected problem with this configuration, however, was /var. For reasons that are not entirely obvious, postfix will not start if /var comes from a ZFS pool; it errors in an unexpected way when trying to start the postfix master process. I did not have time to do a full ktrace to see what postfix was trying to use, but something is not quite supported, or not supported in the same way as on FFS.

An alternative might be to use a zvol with a FFS in it for /var. This mostly does work as expected from the command line interactively. However, it won't work from boot because of a couple of problems:

1) The names of the devices confuse fsck and it fails on boot. The zvol will be called something like /dev/zvol/dsk/pool/volname in fstab, and fsck will want to clean /dev/zvol/dsk/pool/rvolname, which is wrong for zvols; you need to use /dev/zvol/rdsk/pool/volname. You can put a symlink in place and it will mostly work, but that is highly manual. LVM creates a symlink tree that points the devices at the right places in /dev/mapper, which allows fsck and mount to work pretty much as needed. We could do something like that for ZFS, or teach fsck about zvol names.

2) Suppose you wanted to avoid fsck by using WAPBL logging. There are missing IOCTLs that prevent the log option from working, so "mount -o log /dev/zvol/dsk/pool/volname" will not work.

The end result is that while you can define a zvol and put a FFS on it, you can't clean it on boot and you can't use FFS WAPBL.

Disklabel won't work with zvols, but this is not really fatal. I don't think it works with LVM volumes either, and disklabels make little sense in either case. You can put a GPT label on a zvol, but dkctl can't create wedges due to missing IOCTLs. I never tried doing this with an LVM volume, so I don't really know what happens in that case. You cannot use a zvol for swap. Again, probably not fatal, as I don't think you can use an LVM volume for swap either, but this may stand in the way of making the root filesystem a ZFS filesystem.

All in all, however, this all does work pretty nicely for a lot of things one would want to do. I will probably end up moving my Xen disks from raidframe+lvm to zvols.

--
Brad Spencer - [email protected] - KC8VKS - http://anduin.eldar.org
