On 2/20/06, Jerry Jelinek <Gerald.Jelinek at sun.com> wrote:
> Mike,
>
> Mike Gerdts wrote:
> >> There is already an RFE open for this:
> >>
> >> 6383119 RFE: add support for using zfs clones when cloning a zone
> >
> > Sigh... I so wish that RFE's (with relevant status information) were
> > accessible outside of Sun.
>
> I thought the bugs and RFEs were all accessible on opensolaris?
> I was able to pull up 6383119 on the opensolaris bug screen but
> I am not sure how easy it is to search for bugs/rfes. When I
> put in "zones" & "zfs" under Solaris/Utilities, I got almost
> 1300 hits so that isn't very helpful although 6383119 was the first
> one on the list.

Ahh... my search was "zoneadm AND clone AND zfs", which yielded no
results. Had it been "zone" instead of "zoneadm" I would have found it.
(It looks like it is important to use AND rather than & to narrow
results.)

> I think we'll have a note posted about those discussions so everyone can see
> what kind of ideas we have internally and maybe get ideas for new projects
> that they want to start working themselves, if they are interested. I haven't
> actually sent out the formal announcement about the zfs cloning to the alias
> yet because we are still having some discussions about what the final design
> should be (what options to offer, how it will actually behave, etc.). I
> should be able to send that out later in the week though. I do have most
> of the first cut at the code changes done, modulo some of these minor tweaks
> like options and things, so it is really far along. We have been
> talking about integrating zfs clones since we started the initial cloning
> changes last November which is why it is almost done.

I look forward to this.

> > FWIW, with S10 GA and update 1 I have been doing an (unsupported) version of
> > cloning where I create an SVM soft partition for each zone, then use dd to
> > copy the data. This has brought zone creation down from about 5 minutes
> > with really fast disks to less than one with most any disk.
>
> Yes, that is a good technique too. The only worry is the manual editing of
> the zones index file that you have to do. That is guaranteed to break in the
> near future. We are working on a supported technique for that using the
> attach/detach -F option. I emailed that proposal out to zones-discuss
> a while back. The code is in final review and should be integrating any
> day now.

Yeah, I kinda expected that it would break at some point in time. The
hope was that the ability to clone using zfs or some other improvement
would help out when that breakage happened.

> We are really interested in you (and others) getting involved with new ideas
> and features for zones and solaris in general. I sure hope this hasn't put
> you off in any way. We would be really interested to hear what things you are
> doing and what other ideas you have for new features for zones. It would be
> great to see those posted on zones-discuss so we can all see what needs to be
> done and who wants to work on it.

Before I get into what is good for my business, I feel I need a bit of
a disclaimer. Code that I contribute to OpenSolaris is based upon
personal interest and is a separate pursuit from the work that I perform
for my employer. My employer does not endorse or support my development
activity with OpenSolaris. All equipment used to develop OpenSolaris is
my own.

Now, here is what makes sense for my business. I would be happy to
provide you with an LSDC contract number to tip the scales in favor of
developing these features.

1. Currently the creation of the first zone on a machine (inheriting
   /usr and /opt) takes longer than a jumpstart that installs a flar of
   SUNWCXall. This, of course, is only counting the actual install
   time, not the POST and OS boot. It would be helpful to be able to
   have flars of standard zones that can be deployed as template or
   standard zones. These would likely contain the
   /etc/zones/<zone/template name>.xml file. Of course, there needs to
   be patch and package compatibility between the OS that is running
   and the zone flar that is applied. This could be tricky, but I am
   confident it is not insurmountable. zonecfg and/or zoneadm should
   be enhanced to support this.

2. If copying a zone happens using a cpio mechanism, it should use the
   same options as flash archives. This would allow me to exclude
   particular directories or files. This is especially important to be
   able to exclude any sparse files (e.g. /var/adm/lastlog), tmp files,
   etc. Bug 4480319

3. If cpio is used to clone zones, cpio should be enhanced to preserve
   sparse files. In my business, there are UIDs clustered at 0 - 64k,
   100 million, 212 million, 500 million, and 900 million. This causes
   lastlog to increase in disk usage from negligible to gigabytes.
   Similar things would likely happen with ufs quotas files. This
   problem currently affects live upgrade, flash archives, and the
   nevada b33 implementation of "zoneadm clone". At one point I filed
   an RFE to use GNU cpio with --sparse but it has not made it very far.

4. "zoneadm boot" should accept options other than "-s". In
   particular, I should be able to boot to other milestones, including
   arbitrary milestones that are site defined. Networking should not
   be up if booted to milestone/none. (See #8 below).

5. There should be a sysidcfg mechanism (maybe there is, but poorly
   documented) that allows me to use multiple name services. It is
   quite likely that DNS is the most commonly used name service for
   hosts and the least common name service for anything else. As it is
   now, I have to choose between using a manual sysidcfg process or
   using an unsupported automated process.

6. By default, zones should inherit sysidcfg information from the
   global zone or the template/source zone if being cloned. This
   implies that there needs to be a way to sys-almost-unconfig to clear
   out things like ssh keys that really should be different between
   zones. However, I am fairly confident that there is no need to
   perform a manual sysidcfg in most zones.

7. There should be a way to specify commands that are run prior to
   booting a zone and as part of the zone boot process. For example,
   if a zone is configured on multiple hosts that are attached to
   shared storage, you may need to import a pool, diskset, diskgroup
   and mount file systems prior to initiating the boot. As the boot is
   happening, there may be other tasks that are important to perform.
   For example, if the global zone and local zones are on different
   subnets, the default route for the local zones' subnets needs to be
   brought up after the local zone IP is up but before the local zone
   starts to try to access network services like NFS and name services.
   For one "zone farm" that we have, I re-wrote the zone startup
   process in the global zone to do things like ping the addresses used
   by zones before booting to help protect against the same zones
   running on multiple machines. It also does things like setting
   default routes if needed.

8. It should be possible to boot a zone with networking disabled. This
   would allow for things like staging a zone that is currently running
   on another machine. Certain operations (e.g. JASS, importing
   services) need to happen as part of the zone initialization process
   and should be able to be done when the same zone (really, a
   different zone with the same hostname and IP running on a different
   machine) is up on another machine. As it is, if I am "migrating" a
   zone between machines, I have to either experience many minutes of
   downtime or use zonecfg to remove all networks, boot and perform
   initialization, shutdown, and use zonecfg to replace the networks.
   If I use my workaround, I can routinely migrate zones between
   machines with less than 1 minute of application downtime (assuming
   app starts reasonably fast).

9. Rapid cloning of zones is very good stuff, as is automatic zfs
   filesystem creation and deletion. Assuming that ZFS makes it into
   Solaris 10 (update 2? Please? As a supported "extra value" package
   for early adopters?) these features should be backported to Solaris
   10. In my environment this is a feature that would accelerate the
   adoption of ZFS.

>
> Thanks a lot for following up on this and working with us on zones,
> Jerry

My pleasure. I really enjoy digging into this stuff. As you are
meeting this week, if you have any questions, feel free to get in touch
with me.
I'll send you a private email with my work contact info. I would be
happy to join in for a bit by phone if that would be helpful.

>
> SUMMARY:
>
> This fast-track enhances the Solaris Zones subsystem
> to address an existing RFE to provide integrated support
> for using ZFS clones when cloning zones.
>
> Patch binding is requested for these sub-commands and the stability
> of the interfaces is "evolving". However, support for placing
> zone roots on a ZFS filesystem will not be provided in a patch
> release until bug 6356600 is fixed.
>
> DETAILS:
>
> PSARC/2005/711 introduced support for cloning zones by copying
> the data from a source zonepath to a target zonepath. The case
> extends that support to use ZFS clones whenever possible in place
> of actually copying the data.

Excellent idea. I especially like the automatic portion. Does this
imply anything about "zoneadm install" as well?

>
> The zonepath cloning technique will be automatic. We will determine
> what kind of filesystem the source zonepath and target zonepath is on.
> If both are UFS or one is ZFS and the other UFS, the code will use the
> existing cpio copy technique. If both are ZFS we will take a snapshot

Please consider bumping up the priority to fix bug 4480319. This adds
the third major Solaris utility that is severely impacted by this bug in
my environment.

> of the source zonepath and then use a ZFS clone to set up the target
> zonepath. The destination zone zonepath value will be used to name
> the ZFS clone. This zonepath must also be in the same zpool or we will
> automatically fall back to using cpio to copy the data. In fact, this
> will be the behavior in all cases if use of a ZFS clone fails. The
> name of the snapshot(s) will be SUNWzoneX (where X is a unique id to
> distinguish between multiple snapshots).
>
> You might want to clone a source zone many times and we don't
> want to take a snapshot each time.
> PSARC/2005/711 introduced
> an optional "-m method" parameter to the "zoneadm clone" subcommand
> with the intention that this would be extended to support ZFS
> clones. The only current value for -m is "-m copy". We will
> extend the -m parameter to allow a "snapshot" value. The following
> argument would then be the name of an existing snapshot of
> a zonepath filesystem. This snapshot could be one we have taken
> from an earlier clone or one that the sysadmin took themselves.

I like the idea of one snapshot.

>
> For example:
>
> # zoneadm -z new-zone clone -m snapshot foo/acct@SUNWzone1
>
> This case introduces a slight change in the behavior described in
> PSARC/2005/711. PSARC/2005/711 stated that the default behavior
> for zone cloning was to "copy". This case modifies that to make
> the default behavior "do the right thing". Now, if you specify
> "-m copy" the zonepath will be copied, not ZFS cloned, even if
> the source could be cloned. If you don't specify a -m
> option, the code will "do the right thing". Since the code for
> PSARC/2005/711 is not released yet, outside of Nevada and OpenSolaris,
> this minor change in behavior should not be a factor.
>
> Cloning a zone by using ZFS clones depends upon the source zonepath
> being its own ZFS filesystem. This is actually behavior that we
> want to encourage since ZFS filesystems are cheap and offer other
> benefits. The "zoneadm install" subcommand will be extended to
> automatically create a ZFS filesystem for the zonepath, if possible,
> when you initially install a zone. If the zonepath is not on a ZFS
> filesystem then of course we won't do that.
>
> It is possible that the sys-admin might not want to automatically
> create a ZFS filesystem when installing a zone. We will add an
> optional -z parameter to "zoneadm install" which means that a ZFS
> filesystem should not be created. If the zonepath is not on
> a ZFS filesystem in the first place, then this parameter will have
> no effect.
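
To check my reading of the method selection described so far, here is a
minimal sketch in Python. It is only an illustration of the rules as
quoted, not the zoneadm code. The ZonePath class, both function names,
and the zfs_clone/cpio_copy callables are hypothetical and stand in for
whatever the real implementation does.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ZonePath:
    """Hypothetical description of a zonepath (not a zoneadm or libzfs type)."""
    path: str
    fstype: str                  # "zfs" or "ufs"
    pool: Optional[str] = None   # zpool name, only meaningful for ZFS
    own_dataset: bool = False    # True if the zonepath is its own ZFS filesystem


def choose_clone_method(src: ZonePath, dst: ZonePath,
                        method: Optional[str] = None) -> str:
    """Return "copy" (cpio) or "zfs-clone" following the quoted rules."""
    if method == "copy":
        # "-m copy" always copies, even when a ZFS clone would be possible.
        return "copy"
    if method == "snapshot":
        # "-m snapshot <name>" asks for a clone of an existing snapshot.
        return "zfs-clone"
    both_zfs = src.fstype == "zfs" and dst.fstype == "zfs"
    if both_zfs and src.own_dataset and src.pool == dst.pool:
        return "zfs-clone"
    # Both UFS, mixed UFS/ZFS, or different zpools: use the cpio copy.
    return "copy"


def clone_zone(src: ZonePath, dst: ZonePath, method: Optional[str],
               zfs_clone: Callable[[ZonePath, ZonePath], None],
               cpio_copy: Callable[[ZonePath, ZonePath], None]) -> str:
    """Run the chosen method and fall back to cpio if the ZFS clone fails."""
    if choose_clone_method(src, dst, method) == "zfs-clone":
        try:
            zfs_clone(src, dst)
            return "zfs-clone"
        except OSError:
            pass  # assumption: a failed clone is reported as OSError
    cpio_copy(src, dst)
    return "copy"
```

With no -m value, a source that is its own ZFS filesystem and a target
in the same zpool give "zfs-clone"; every other combination, and any
failed clone, ends up as "copy".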
>
> When a zone is uninstalled, if the zonepath is a ZFS filesystem,
> instead of using "rm -rf" to remove the zone root, we will destroy
> the ZFS filesystem.
>
> EXPORTED INTERFACES
>
>     zoneadm subcommands
>         clone [-m copy | snapshot]    Evolving
>         install [-z]                  Evolving
>
>     The ZFS snapshot namespace
>         XXX/XXX@SUNWzoneX             Stable
>
> IMPORTED INTERFACES
>
>     libzfs                            Consolidation Private
>     (this is already imported by zones to support ZFS datasets within
>     zones)
>
> REFERENCES
>
> 1. PSARC 2002/174 Virtualization and Namespace Isolation in Solaris
> 2. RFE: add support for using zfs clones when cloning a zone Bugid 6383119
>    http://monaco.sfbay/detail.jsf?cr=6383119
> 3. 6356600 - zfs datasets can't be used to provide space for zone roots
>    http://monaco.sfbay/detail.jsf?cr=6356600

--
Mike Gerdts
http://mgerdts.blogspot.com/