Re: r253070 and disappearing zpool
On Thu, Jul 25, 2013 at 04:33:58PM +0100, Johannes Totz wrote:
> On 24/07/2013 12:47, Andriy Gapon wrote:
> > on 22/07/2013 23:38 Pawel Jakub Dawidek said the following:
> > > On Mon, Jul 22, 2013 at 10:29:40AM +0300, Andriy Gapon wrote:
> > > > I think that this setup (on the ZFS level) is quite untypical, although
> > > > not impossible on FreeBSD (and perhaps only on FreeBSD). It's untypical
> > > > because you have a separate boot pool (where loader, loader.conf and
> > > > the kernel are taken from) and a root pool (where / is mounted from).
> > > As I said elsewhere, it is pretty typical when full disk encryption is
> > > used.
> > I am judging by the number of reports / amount of feedback so far.
> I'm using a similar configuration too: I have a USB stick with an unencrypted
> kernel and /boot bits which load a GELI keyfile (from its own pool, zboot),
> and then the rest of the system starts up from the fully encrypted HDD (from
> another pool, zsystem), so boot and rootfs are on different pools.
>
> I'm not sure I understand the problem though. What exactly broke after your
> commit? The pool that contains the bits that would normally go to /boot is
> not imported automatically, but the rest is working (i.e. the /boot symlink
> points to nowhere)? Or does booting somehow fail?

/boot disappears and becomes a broken symlink.

Glen
Re: r253070 and disappearing zpool
On 24/07/2013 12:47, Andriy Gapon wrote:
> on 22/07/2013 23:38 Pawel Jakub Dawidek said the following:
> > On Mon, Jul 22, 2013 at 10:29:40AM +0300, Andriy Gapon wrote:
> > > I think that this setup (on the ZFS level) is quite untypical, although
> > > not impossible on FreeBSD (and perhaps only on FreeBSD). It's untypical
> > > because you have a separate boot pool (where loader, loader.conf and the
> > > kernel are taken from) and a root pool (where / is mounted from).
> > As I said elsewhere, it is pretty typical when full disk encryption is
> > used.
> I am judging by the number of reports / amount of feedback so far.

I'm using a similar configuration too: I have a USB stick with an unencrypted kernel and /boot bits which load a GELI keyfile (from its own pool, zboot), and then the rest of the system starts up from the fully encrypted HDD (from another pool, zsystem), so boot and rootfs are on different pools.

I'm not sure I understand the problem though. What exactly broke after your commit? The pool that contains the bits that would normally go to /boot is not imported automatically, but the rest is working (i.e. the /boot symlink points to nowhere)? Or does booting somehow fail?

> [...]
Re: r253070 and disappearing zpool
On Wed, Jul 24, 2013 at 02:47:11PM +0300, Andriy Gapon wrote:
> > > on 22/07/2013 23:38 Pawel Jakub Dawidek said the following:
> > > > The /boot/ has to be unencrypted and can be stored on e.g. a USB
> > > > pendrive which is never left unattended, unlike a laptop, which can be
> > > > left in e.g. a hotel room but with the entire disk encrypted.
> > > As we discussed elsewhere, there are many options for configuring full
> > > disk encryption, including decisions about whether the root filesystem
> > > should be separate from the boot filesystem, the choice of filesystem
> > > type for the boot fs, ways of tying the various pieces together, and
> > > many more. I do not believe that my change is incompatible with full
> > > disk encryption in general.
> > Maybe you can imagine many ways of configuring it, but definitely the most
> > typical one is to have /boot/ separate from /, where /boot/ is unencrypted
> > and where you use one filesystem type for both (UFS or ZFS).
> > > Let's also recall that the system was not created/configured by any of
> > > the existing official or semi-official tools, and thus it does not
> > > represent any recommended way of setting up such systems. Glen
> > > configured it this way, but it doesn't mean that that is the way.
> > Note that there are no official tools to install FreeBSD on ZFS. Is that
> > enough reason to stop supporting it? What Glen did is the recommended way
> > of setting up full disk encryption with ZFS. I'd do it the same way and
> > I'd recommend this configuration to anyone who will (or did) ask me.
> > > I think that there are many ways of changing the configuration of that
> > > system to make it behave as before again. Three I mentioned already.
> > > Another is to add an rc script to import the boot pool, given that it is
> > > a special, designated pool. Yet another is to place zpool.cache onto the
> > > root pool and use nullfs (instead of a symlink) to make /boot come from
> > > the boot pool but /boot/zfs come from the root pool.
> > Come on...
> >
> > BTW. If moving zpool.cache to /etc/zfs/ will work for both cases that's
> > fine by me, although the migration might be tricky.
> Yes, that migration is scary to me too.
>
> Now, about the postponed points. I will reproduce a section from my email
> that you've snipped.
>
> P.S. The ZFS/FreeBSD boot process is extremely flexible. For example,
> zfsboot can take zfsloader from pool1/fsA, zfsloader can boot the kernel
> from pool2/fsB, and the kernel can mount / from pool3/fsC. Of these three
> filesystems, from which should zpool.cache be taken? My firm opinion is
> that it should be taken from / (pool3/fsC in the example above), because it
> is the root filesystem that ultimately defines what a system is going to
> do: which daemons are started, with what configurations, etc. And thus it
> should also determine which pools to auto-import. We can say that
> zpool.cache is analogous to /etc/fstab in this respect.
>
> So do you or do you not agree with my reasoning about where zpool.cache
> should be taken from? If you do not, then please explain why. If you do,
> then please explain how this would be compatible with the old way of
> loading zpool.cache.

I don't have a strong opinion about this. As I said above, I'm fine with moving zpool.cache to /etc/zfs/ if we can ensure it won't break existing installations. Still, I'm not sure this was your initial goal, because you weren't aware of systems with a separate boot pool until recently (if you had been aware of them, I hope you wouldn't have committed the change without prior discussion). Which means that in your eyes zpool.cache was always part of the root pool, because /boot/ was.

> I think that ensuring that zpool.cache is always loaded from the root
> filesystem is the gain from my change.

Were people complaining about zpool.cache being loaded from /boot/zfs/ and not from /etc/zfs/? I don't think so. But people do complain about the boot pool not being autoimported. In my opinion, for the end user it doesn't really matter whether it is /etc/zfs/zpool.cache or /boot/zfs/zpool.cache, as both directories are available once the system is booted. For most people those two directories are placed on the same file system. For the few people who actually care whether it is /etc/zfs/ or /boot/zfs/, because for them those are separate file systems, the latter works and the former doesn't. In my opinion the gain, if any, is only theoretical.

-- 
Pawel Jakub Dawidek                       http://www.wheelsystems.com
FreeBSD committer                         http://www.FreeBSD.org
Am I Evil? Yes, I Am!                     http://mobter.com
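[Editorial sketch, not from the thread: the migration both sides are wary of could, mechanically at least, amount to little more than the following. This is a hypothetical illustration only; the hard part the thread worries about is coordinating loader, kernel and userland expectations across upgrades, which these commands do not address.]

```shell
# Hypothetical sketch of moving zpool.cache to its Solaris home (assumption,
# not a committed FreeBSD mechanism):
mkdir -p /etc/zfs
cp -p /boot/zfs/zpool.cache /etc/zfs/zpool.cache
# Replace the old copy with a compatibility link for anything that still
# expects the historical path:
ln -sf /etc/zfs/zpool.cache /boot/zfs/zpool.cache
```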
Re: r253070 and disappearing zpool
on 22/07/2013 23:38 Pawel Jakub Dawidek said the following:
> On Mon, Jul 22, 2013 at 10:29:40AM +0300, Andriy Gapon wrote:
> > I think that this setup (on the ZFS level) is quite untypical, although
> > not impossible on FreeBSD (and perhaps only on FreeBSD). It's untypical
> > because you have a separate boot pool (where loader, loader.conf and the
> > kernel are taken from) and a root pool (where / is mounted from).
> As I said elsewhere, it is pretty typical when full disk encryption is used.

I am judging by the number of reports / amount of feedback so far.

> The /boot/ has to be unencrypted and can be stored on e.g. a USB pendrive
> which is never left unattended, unlike a laptop, which can be left in e.g.
> a hotel room but with the entire disk encrypted.

As we discussed elsewhere, there are many options for configuring full disk encryption, including decisions about whether the root filesystem should be separate from the boot filesystem, the choice of filesystem type for the boot fs, ways of tying the various pieces together, and many more. I do not believe that my change is incompatible with full disk encryption in general.

> > So, I see three ways of resolving the problem that my changes caused for
> > your configuration.
> >
> > 1. [the easiest] Put the zpool.cache loading instructions that used to be
> >    in defaults/loader.conf into your loader.conf. This way everything
> >    should work as before -- zpool.cache would be loaded from your boot
> >    pool.
> >
> > 2. Somehow (I don't want to go into any technical details here) arrange
> >    that your root pool has a /boot/zfs/zpool.cache that describes your
> >    boot pool. This is probably hard given that your /boot is a symlink at
> >    the moment. It would probably be easier to achieve if zpool.cache
> >    lived in /etc/zfs.
> >
> > 3. [my favorite] Remove the artificial difference between your boot and
> >    root pools, so that they are a single root+boot pool (as the zfs gods
> >    intended). As far as I understand your setup, you use GELI to protect
> >    some sensitive data.
> >    Apparently your kernel is not sensitive data, so I wonder whether your
> >    /bin/sh or /sbin/init are really sensitive either. So perhaps you can
> >    arrange for your unencrypted pool to hold all of the base system
> >    (boot + root) and put all your truly sensitive filesystems (like e.g.
> >    /home, /var/data or /opt/xyz) onto your encrypted pool.
> If all you care about is the laptop being stolen, then that would work. If,
> however, you want to be protected from someone replacing your /sbin/init
> with something evil, then you use encryption, or even better the integrity
> verification also supported by GELI.

There are different ways to ensure that, including storing cryptographic checksums in a safe place, or keeping init in the same place where the kernel is kept. And probably many more.

> Remember, tools not policies.

I am not trying to enforce any policy on end-users here.

> There is also option number 4 - backing out your commit.

That's definitely an option. I'll discuss it a few lines below.

> When I saw your commit removing those entries from defaults/loader.conf, I
> thought it was fine, as we now don't require zpool.cache to import the root
> pool, which was, BTW, a very nice and handy improvement. Now that we know
> it breaks existing installations I'd prefer the commit to be backed out.

"breaks" sounds dramatic, but let's take a step back and see what exactly is broken. The system in question can still boot without a problem, it is fully usable, and it is possible to change its configuration without any hassle. The only thing that changed is that its boot pool is not imported automatically.

Let's also recall that the system was not created/configured by any of the existing official or semi-official tools, and thus it does not represent any recommended way of setting up such systems. Glen configured it this way, but it doesn't mean that that is the way.

I think that there are many ways of changing the configuration of that system to make it behave as before again. Three I mentioned already.
Another is to add an rc script to import the boot pool, given that it is a special, designated pool. Yet another is to place zpool.cache onto the root pool and use nullfs (instead of a symlink) to make /boot come from the boot pool but /boot/zfs come from the root pool.

> This is because, apart from breaking some existing installations, it
> doesn't gain us anything.

I think I addressed the breaking part; as to the gains - a few lines below.

> > So I understand that my change causes a problem for a setup like yours,
> > but I believe that the change is correct.
> The change is clearly incorrect or incomplete, as it breaks existing
> installations and doesn't allow for a full disk encryption configuration on
> ZFS-only systems.

I think I addressed the breaking part and also addressed your overly general statement about full disk encryption. So I don't think that my change is clearly incorrect, otherwise that would be clear even to me.

> BTW. If moving zpool.cache to /etc/zfs/ will work for both cases that's
> fine by me, although the migration
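[Editorial sketch, not from the thread: the rc script suggested above might look roughly like this. Everything here is an assumption -- the script name, its rcvar, and its placement; only the pool name zboot0 comes from Glen's report.]

```shell
#!/bin/sh
#
# Hypothetical /usr/local/etc/rc.d/bootpool sketch: import a designated boot
# pool early in multi-user startup, so that a /boot symlink pointing into it
# becomes valid again.
#
# PROVIDE: bootpool
# REQUIRE: zfs
# BEFORE: LOGIN

. /etc/rc.subr

name="bootpool"
rcvar="bootpool_enable"
start_cmd="bootpool_start"
stop_cmd=":"

bootpool_start()
{
	# Import the boot pool only if it is not already imported.
	if ! zpool list "${bootpool_name:-zboot0}" >/dev/null 2>&1; then
		zpool import "${bootpool_name:-zboot0}"
	fi
}

load_rc_config $name
: ${bootpool_enable:=NO}
run_rc_command "$1"
```

Enabled with bootpool_enable="YES" (and optionally bootpool_name="...") in rc.conf, in the usual rc.subr fashion.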
Re: r253070 and disappearing zpool
[turning this into a public discussion with Glen's permission]

on 10/07/2013 21:05 Glen Barber said the following:
> Hi,
>
> My setup is like this:
>
> root@nucleus:/usr/src # zpool list
> NAME     SIZE  ALLOC   FREE  CAP  DEDUP  HEALTH  ALTROOT
> zboot0  9.94G   379M  9.57G   3%  1.00x  ONLINE  -
> zroot0   159G   113G  46.2G  70%  1.00x  ONLINE  -
>
> root@nucleus:/usr/src # zpool get bootfs
> NAME    PROPERTY  VALUE  SOURCE
> zboot0  bootfs    -      default
> zroot0  bootfs    -      default
>
> root@nucleus:/usr/src # zfs list zboot0
> NAME    USED  AVAIL  REFER  MOUNTPOINT
> zboot0  379M  9.41G   281M  /bootdir
>
> root@nucleus:/usr/src # zfs list zroot0
> NAME    USED  AVAIL  REFER  MOUNTPOINT
> zroot0  113G  43.7G   147M  /
>
> 'zroot0' is a GELI-backed pool, so I have this to fix the boot process:
>
> root@nucleus:/usr/src # ll /boot
> lrwxr-xr-x  1 root  wheel  12 Aug 25  2012 /boot@ -> bootdir/boot
>
> I upgraded from head/ on July 1 to r253159, and when I rebooted the system,
> I could correctly boot from /bootdir/boot. Once I enter the GELI
> passphrase, / (from zroot0) is mounted. Normally, everything would be okay
> at that point, but since the upgrade, /bootdir/boot disappears because the
> zboot0 pool is not imported as it was before.
>
> Any thoughts?

I think that this setup (on the ZFS level) is quite untypical, although not impossible on FreeBSD (and perhaps only on FreeBSD). It's untypical because you have a separate boot pool (where loader, loader.conf and the kernel are taken from) and a root pool (where / is mounted from).

There is this magic zpool.cache file that essentially tells which pools should be automatically imported. On FreeBSD this file lives in the /boot/zfs directory, while originally (in Solaris and its descendants) its home is /etc/zfs. Until recently FreeBSD could use only the zpool.cache from a boot pool and, in fact, if the root pool was different from the boot pool, the root pool had to be in zpool.cache. I changed things a little bit, and now a root pool does not have to be in zpool.cache.
Also, now zpool.cache is taken from the root pool, or to be more precise, from the root filesystem (whatever happens to be at /boot/zfs/zpool.cache after / is mounted). I am considering whether we should now move zpool.cache back to /etc/zfs/.

So, I see three ways of resolving the problem that my changes caused for your configuration.

1. [the easiest] Put the zpool.cache loading instructions that used to be in defaults/loader.conf into your loader.conf. This way everything should work as before -- zpool.cache would be loaded from your boot pool.

2. Somehow (I don't want to go into any technical details here) arrange that your root pool has a /boot/zfs/zpool.cache that describes your boot pool. This is probably hard given that your /boot is a symlink at the moment. It would probably be easier to achieve if zpool.cache lived in /etc/zfs.

3. [my favorite] Remove the artificial difference between your boot and root pools, so that they are a single root+boot pool (as the zfs gods intended). As far as I understand your setup, you use GELI to protect some sensitive data. Apparently your kernel is not sensitive data, so I wonder whether your /bin/sh or /sbin/init are really sensitive either. So perhaps you can arrange for your unencrypted pool to hold all of the base system (boot + root) and put all your truly sensitive filesystems (like e.g. /home, /var/data or /opt/xyz) onto your encrypted pool.

ZFS is really flexible: you can use the mountpoint and canmount properties to place filesystems from the same or different pools into whatever file namespace hierarchy you desire. Remember that your filesystem hierarchy in the mount namespace does not always have to be the same as your ZFS dataset hierarchy.

I hope that this makes sense to you. If you have any additional questions, please do not hesitate.

P.S. The ZFS/FreeBSD boot process is extremely flexible. For example, zfsboot can take zfsloader from pool1/fsA, zfsloader can boot the kernel from pool2/fsB, and the kernel can mount / from pool3/fsC.
Of these three filesystems, from which should zpool.cache be taken? My firm opinion is that it should be taken from / (pool3/fsC in the example above), because it is the root filesystem that ultimately defines what a system is going to do: which daemons are started, with what configurations, etc. And thus it should also determine which pools to auto-import. We can say that zpool.cache is analogous to /etc/fstab in this respect.

So I understand that my change causes a problem for a setup like yours, but I believe that the change is correct.

-- 
Andriy Gapon

_______________________________________________
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org
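[Editorial note: option 1 above refers to the zpool.cache loading knobs that r253070 removed from defaults/loader.conf. To the best of my reading of the history, putting them back into the machine's own loader.conf would look like this -- treat the exact knob names as an assumption to verify against your source tree.]

```shell
# /boot/loader.conf -- re-enable loading zpool.cache from the boot pool, as
# defaults/loader.conf did before r253070:
zpool_cache_load="YES"
zpool_cache_name="/boot/zfs/zpool.cache"
zpool_cache_type="/boot/zfs/zpool.cache"
```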
Re: r253070 and disappearing zpool
On Jul 22, 2013, at 12:29 AM, Andriy Gapon wrote:
> [turning this into a public discussion with Glen's permission]
>
> [...]
>
> Of these three filesystems, from which should zpool.cache be taken? My firm
> opinion is that it should be taken from / (pool3/fsC in the example above),
> because it is the root filesystem that ultimately defines what a system is
> going to do: which daemons are started, with what configurations, etc. And
> thus it should also determine which pools to auto-import. We can say that
> zpool.cache is analogous to /etc/fstab in this respect.
>
> So I understand that my change causes a problem for a setup like yours, but
> I believe that the change is correct.

+1 for zpool.cache on / (pool3/fsC in the last example)

-- 
Devin
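[Editorial sketch, not from the thread: Andriy's option 3 -- a single unencrypted boot+root pool for the base system, with only the sensitive datasets on a GELI-encrypted pool -- might be laid out like this. The device name, pool names, dataset names, and GELI parameters are all illustrative assumptions.]

```shell
# Hypothetical layout for option 3 (illustrative names and parameters).

# Encrypt a partition to back the private pool:
geli init -e AES-XTS -l 256 -s 4096 /dev/ada0p4
geli attach /dev/ada0p4

# Build the private pool on the encrypted provider:
zpool create zpriv /dev/ada0p4.eli

# Use the mountpoint property to graft sensitive filesystems into the
# hierarchy of the (unencrypted) root pool:
zfs create -o mountpoint=/home     zpriv/home
zfs create -o mountpoint=/var/data zpriv/data
```

This leans on the point made earlier in the thread: the mount namespace need not mirror the ZFS dataset hierarchy, so datasets from two pools can interleave freely.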
Re: r253070 and disappearing zpool
On Mon, Jul 22, 2013 at 10:29:40AM +0300, Andriy Gapon wrote:
> I think that this setup (on the ZFS level) is quite untypical, although not
> impossible on FreeBSD (and perhaps only on FreeBSD). It's untypical because
> you have a separate boot pool (where loader, loader.conf and the kernel are
> taken from) and a root pool (where / is mounted from).

As I said elsewhere, it is pretty typical when full disk encryption is used. The /boot/ has to be unencrypted and can be stored on e.g. a USB pendrive which is never left unattended, unlike a laptop, which can be left in e.g. a hotel room but with the entire disk encrypted.

> So, I see three ways of resolving the problem that my changes caused for
> your configuration.
>
> 1. [the easiest] Put the zpool.cache loading instructions that used to be
>    in defaults/loader.conf into your loader.conf. This way everything
>    should work as before -- zpool.cache would be loaded from your boot
>    pool.
>
> 2. Somehow (I don't want to go into any technical details here) arrange
>    that your root pool has a /boot/zfs/zpool.cache that describes your boot
>    pool. This is probably hard given that your /boot is a symlink at the
>    moment. It would probably be easier to achieve if zpool.cache lived in
>    /etc/zfs.
>
> 3. [my favorite] Remove the artificial difference between your boot and
>    root pools, so that they are a single root+boot pool (as the zfs gods
>    intended). As far as I understand your setup, you use GELI to protect
>    some sensitive data. Apparently your kernel is not sensitive data, so I
>    wonder whether your /bin/sh or /sbin/init are really sensitive either.
>    So perhaps you can arrange for your unencrypted pool to hold all of the
>    base system (boot + root) and put all your truly sensitive filesystems
>    (like e.g. /home, /var/data or /opt/xyz) onto your encrypted pool.

If all you care about is the laptop being stolen, then that would work. If, however, you want to be protected from someone replacing your /sbin/init with something evil, then you use encryption, or even better the integrity verification also supported by GELI. Remember, tools not policies.

There is also option number 4 - backing out your commit. When I saw your commit removing those entries from defaults/loader.conf, I thought it was fine, as we now don't require zpool.cache to import the root pool, which was, BTW, a very nice and handy improvement. Now that we know it breaks existing installations I'd prefer the commit to be backed out. This is because, apart from breaking some existing installations, it doesn't gain us anything.

> So I understand that my change causes a problem for a setup like yours, but
> I believe that the change is correct.

The change is clearly incorrect or incomplete, as it breaks existing installations and doesn't allow for a full disk encryption configuration on ZFS-only systems.

BTW. If moving zpool.cache to /etc/zfs/ will work for both cases, that's fine by me, although the migration might be tricky.

-- 
Pawel Jakub Dawidek                       http://www.wheelsystems.com
FreeBSD committer                         http://www.FreeBSD.org
Am I Evil? Yes, I Am!                     http://mobter.com
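[Editorial sketch, not from the thread: the integrity verification Pawel alludes to is GELI's data authentication mode. The device and parameters below are illustrative assumptions.]

```shell
# Hypothetical sketch: GELI with data authentication (integrity) enabled, so
# tampering with on-disk blocks (e.g. a replaced /sbin/init) surfaces as read
# errors instead of being silently accepted:
geli init -e AES-XTS -l 256 -a HMAC/SHA256 /dev/ada0p3
geli attach /dev/ada0p3
```

Note that authentication costs some space and throughput, since each sector carries an HMAC, which is part of the tools-not-policies trade-off being debated here.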