Signed-off-by: Ben Lipton <[email protected]> --- doc/design.rst | 175 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++ 1 files changed, 175 insertions(+), 0 deletions(-) create mode 100644 doc/design.rst
diff --git a/doc/design.rst b/doc/design.rst new file mode 100644 index 0000000..76c0543 --- /dev/null +++ b/doc/design.rst @@ -0,0 +1,175 @@ +Ganeti p2v-transfer design +========================== + +This document describes the design of p2v-transfer, a tool for converting +a physical computer into a ganeti instance. + +Objective +--------- + +p2v-transfer should be a simple tool to move a physical Linux machine +into a ganeti instance. This tool would in its simplest usage be able to copy +all the data from a physical machine and produce an identically configured, +bootable ganeti instance all ready to go. It should automatically make +some changes, such as console and disk names, that are needed to make the +machine function properly as an instance, and it should be configurable to make more site-specific or os-specific changes as necessary. + +Background +---------- + +P2V (physical to virtual) systems already exist for various operating systems +and virtualization platforms. Many of them are proprietary, however, and none +are specifically targeted toward ganeti. + +* `VMware vCenter Converter <http://www.vmware.com/products/converter/>`_ is a + free, proprietary P2V tool. It can create virtual machines in the Open + Virtualization Format (OVF, an open image format for virtual machines) as + well as machines that run on the VMware architecture. +* `Citrix XenServer <http://www.xensource.com>`_ also seems to include a P2V + tool that creates xen virtual machines (instructions: `Physical to Virtual + Conversion (P2V) + <http://docs.vmd.citrix.com/XenServer/4.0.1/guest/ch02s04.html>`_) but they + say it only works on RHEL, CentOS, and SuSe as source distros. I’m also not + sure if the process is compatible with ganeti. +* Though not a P2V solution, `Open-OVF <http://gitorious.org/open-ovf>`_ is an + IBM-sponsored open-source python library for dealing with OVF images that + might be useful. +* There is an interesting (I think open-source) project called virt-p2v that is + part of redhat’s virt-v2v package and automates the process of virtualizing a + physical server booted from a specialized liveCD/PXE image (see `Converting, + Inspecting, & Modifying Virtual Machines with Red Hat Enterprise Linux 6.1 + <http://oirase.annexia.org/booth_w_1020_guest_conversion_in_rhel.pdf>`_). + However, it requires the data to be transferred to a virt-v2v server for + conversion to a RHEV instance, so it doesn’t seem to be directly applicable. + +Requirements and Scale +---------------------- + +The main use case for the system is transferring a single physical machine to a +single instance. It does not need to be optimized for transferring many +machines at a time, although it should be scriptable in case the user has a lot +of servers they want to move to ganeti. It should be able to support several +users transferring machines, just in case. How many? P2V won’t be a very common +operation so the number of simultaneous connections shouldn’t be too large. The +data transfer, however, will be fairly large per user, probably on the order of +hundreds of GB. The tool will need to gracefully handle the case where policy +dictates a smaller disk than the one required to store all the data. + +This P2V migration should be possible with a minimum of privileges on the +cluster. For example, the user doing the migration must not need root +privileges on dom0 to make the transfer. + +In order for the transferred machine to work as a ganeti instance, some changes +to its filesystem will be required. Some of these can be automated because they +are necessary to work on the virtual architecture: + +* Change default console to /dev/hvc0 +* Change disks in /etc/fstab and other places to refer to the new UUIDs of the + appropriate filesystems. Actually, filesystems can be created to have the + same UUID as on the source box, but if fstab refers to /dev/sda0 it still + needs to change. +* Anything that refers specifically to the MAC address + +Others are site-specific and should be specifiable as command-line +(or similar) options: + +* Hostname changes +* IP address / networking changes + +Because some of these changes must be implemented in a way that is specific to +the operating system, it may be preferable to have a script or scripts in the +OS definition that can handle making these changes. However, the P2V tool must +still be able to request that these changes be made so that the new instance +doesn’t come online with an invalid hostname, for example. + +There is also the possibility of making additional changes to the machine +in addition to simply moving it to the cluster. These should be considered +optional features, which wouldn’t be developed unless the core functionality +was working well. Some possible examples are: + +* Changing partitioning scheme of machine / switching partitioning to LVM +* Changing kernel (maybe keeping the original kernel is the hard problem here, + as the kernels have to live in dom0...) + +Transfer Process +---------------- + +To maintain the integrity of the copy, the source machine must not be running +when the transfer is taking place. So, the source machine will be booted from a +liveCD/PXE image, and the transfer script run from that operating system. + +Target instances will be created by the administrator with a bootstrap OS, +which unmounts the disk after booting and awaits a connection by the script +running on the source machine. Then the disk can be partitioned, data copied +over rsync, necessary changes made to the filesystem. Then the instance is +rebooted into the new operating system. + +The migration has the following steps: + +1. The target instance is created with a modified OS template (containing tools + required for imaging) +2. The instance is booted with a modified initrd, which copies the root + filesystem into RAM before running init. This allows the OS to run without + the disk being mounted. The command looks something like:: + + gnt-instance start -H initrd_path=/boot/initrd.img-p2v instance17 + +3. The instance tries to fetch an SSH public key from a predetermined location. + When it finds one, it downloads it to its /root/.ssh/authorized_keys file, + giving the source machine shell access to the target. +4. The instance disks are partitioned and formatted as required to duplicate the + source machine. In the case where the target disks are not the same size as + the source ones this requires some cleverness (or user input, more likely) + to ensure that the important filesystems (e.g. /usr) have some wiggle room. +5. The newly created filesystems are mounted on the target. Data is copied from + the source to the target. +6. Modifications are made to the target so that it + works in ganeti. Some of these modifications may be extremely os-specific, + so they probably shouldn’t be hard-coded into the p2v script, but there + isn’t currently a hook in the OS API for this operation. However, the + instance is still running (from RAM) at this point, so there may be other + options. See “Unresolved Questions,” below. +7. Power the instance off, so ganeti-watcher will restart it using the default + kernel and initrd. Or, potentially, using pvgrub to use the kernel that’s on + the transferred image, depending on the setup of your cluster. +8. Log in. Hopefully everything is where you left it! + +Alternatives Considered +----------------------- + +1. The script running on the source machine creates a dump of the filesystem + that can be imported into the ganeti cluster using ``gnt-backup import``. + The disadvantage of this approach is that the source system probably does + not have enough RAM to store the image that is being built, and the image + can't be put on the disk that is being imaged. So, the image would need to + be built off of the source box, which forces the administrator to make + available a staging area where a several-hundered-gigabyte image can be + placed. +2. If creating a system image is acceptable, another option is to create the + image in the OVF format, which is a standard VM export format that is + understood by VMWare and VirtualBox, among others. To make this work with + ganeti would mean implementing at least sufficient OVF support in ganeti to + import the images created by the script. Enabling ganeti to import OVF + images would increase interoperability with other virtual environments and + allow the images created by the P2V tool to be used on systems other than + ganeti, and is in fact a planned feature, but for the reasons discussed in + option 1 this is a problematic approach to pursue for P2V. +3. Boot the source + machine (Physical) into a tool that speaks the remote import-export API of + Ganeti, and coordinate (with a central system) the import of the source + filesystem into the target ganeti cluster. This doesn’t need any OS API + changes, and it still keeps the streaming/no-copy-needed method. This + requires some work to deal with the shared domain secrets that are required + by the remote import/export, but the real problem is that the remote API + only supports a 1:1 dump of a filesystem, and changes must be made to the + filesystem in order for it to boot on ganeti. Either we need a staging area + like in options 1 and 2, or the migration can be destructive and modify the + source filesystem, or the remote API needs to allow triggering of these + filesystem changes (similar to how it is possible to trigger a rename). +4. Create the target instance on the cluster, and then connect to dom0 of + the node that stores the instance, and partition, mount, copy data to, and + tweak the instance disks directly by writing to the DRBD volumes. This + requires the user to be able to ssh to a particular node, mount disks on + dom0 and change arbitrary files on those disks. These permissions should not + be necessary to do this kind of transfer; it should be possible even if only + the administrator can run commands on dom0. -- 1.7.3.1
