On Thu, Nov 14, 2013 at 2:55 PM, Jose A. Lopes <[email protected]> wrote:
> On Thu, Nov 14, 2013 at 11:31:11AM +0100, Guido Trotter wrote:
>> On Wed, Nov 13, 2013 at 9:57 AM, Michele Tartara <[email protected]> wrote:
>> > On Tue, Nov 12, 2013 at 2:13 PM, Guido Trotter <[email protected]> wrote:
>> >> On Tue, Nov 12, 2013 at 12:41 PM, Michele Tartara <[email protected]> wrote:
>> >>> Add the document describing a new design for the OS installation process for
>> >>> new instances.
>> >>>
>> >>> Signed-off-by: Michele Tartara <[email protected]>
>> >>> ---
>> >>> doc/design-draft.rst | 1 +
>> >>> doc/design-os.rst | 318 ++++++++++++++++++++++++++++++++++++++++++++++++++
>> >>> 2 files changed, 319 insertions(+)
>> >>> create mode 100644 doc/design-os.rst
>> >>>
>> >>> diff --git a/doc/design-draft.rst b/doc/design-draft.rst
>> >>> index c821292..3ed3852 100644
>> >>> --- a/doc/design-draft.rst
>> >>> +++ b/doc/design-draft.rst
>> >>> @@ -20,6 +20,7 @@ Design document drafts
>> >>> design-daemons.rst
>> >>> design-hsqueeze.rst
>> >>> design-ssh-ports.rst
>> >>> + design-os.rst
>> >>>
>> >>> .. vim: set textwidth=72 :
>> >>> .. Local Variables:
>> >>> diff --git a/doc/design-os.rst b/doc/design-os.rst
>> >>> new file mode 100644
>> >>> index 0000000..7a42a7f
>> >>> --- /dev/null
>> >>> +++ b/doc/design-os.rst
>> >>> @@ -0,0 +1,318 @@
>> >>> +===============================
>> >>> +Ganeti OS installation redesign
>> >>> +===============================
>> >>> +
>> >>> +.. contents:: :depth: 3
>> >>> +
>> >>> +This is a design document detailing a new OS installation procedure, more
>> >>> +secure, able to provide more features and easier to use for many common tasks
>> >>> +w.r.t. the current one.
>> >>> +
>> >>> +Current state and shortcomings
>> >>> +==============================
>> >>> +
>> >>> +As of Ganeti 2.10, each instance is associated with an OS definition.
>> >>> An OS
>> >>> +definition is a set of scripts (``create``, ``export``, ``import``, ``rename``)
>> >>> +that are executed with root privileges on the primary host of the instance to
>> >>> +perform all the OS-related functionality (setting up an operating system inside
>> >>> +the disks of the instance being created, exporting/importing the instance,
>> >>> +renaming it).
>> >>> +
>> >>> +These scripts receive, as environment variables, a fixed set of parameters
>> >>> +describing the instance (such as the hypervisor, the name of the instance, the
>> >>> +number of disks, and their location) and a set of user defined parameters. Each
>> >>> +of these parameters is also written into the configuration file of Ganeti, to
>> >>> +allow for future reinstalls of the instance, and in various log files, namely:
>> >>> +
>> >>> +* node daemon log file: contains DEBUG strings of the ``/os_validate``,
>> >>> +  ``/instance_os_add`` and ``/instance_start`` RPC calls.
>> >>> +
>> >>> +* master daemon log file: DEBUG strings related to the same RPC calls are stored
>> >>> +  here as well.
>> >>> +
>> >>> +* commands log: the CLI commands that create a new instance, including their
>> >>> +  parameters, are logged here.
>> >>> +
>> >>> +* RAPI log: the RAPI commands that create a new instances, including their
>> >>> +  parameters, are logged here.
>> >>> +
>> >>> +* job logs: the job files stored in the job queue or in its archive contain the
>> >>> +  parameters.
>> >>> +
>> >>> +The current situation presents a number of shortcomings:
>> >>> +
>> >>> +* Having the installation scripts run with root power on the nodes is a huge
>> >>> +  security issue.
>> >>> +
>> >>
>> >> s/is a huge security issue/doesn't allow user-defined os scripts, as
>> >> they would pose a huge security issue/
>> >>
>> >> Note that there's no security issue *per se* in the current situation,
>> >> if the OS scripts are trusted.
>> >> (except perhaps for export, if the os script mounts the instance disk,
>> >> which is also not necessarily the case)
>> >
>> > Yes, that's what I meant. I'll reword it as you suggest.
>> >
>> >>
>> >> That said it could be a safety issue in the sense that an eventual
>> >> bug/error in the os script could risk disrupting the node.
>> >
>> > ACK
>> >
>> >>
>> >>> +* Ganeti cannot be used to create instances starting from user provided disk
>> >>> +  images: even in the (hypothetical) case where the scripts are completely
>> >>> +  secure and run not by root but by an unprivileged user with only the power to
>> >>> +  mount arbitrary files as disk images, this is a security issue. It has been
>> >>> +  proven that a carefully crafted file system might exploit kernel
>> >>> +  vulnerabilities to gain control of the system. Therefore, directly mounting
>> >>> +  images on the Ganeti nodes is not an option.
>> >>> +
>> >>> +* There is no way to inject files into an existing disk image. A common use case
>> >>> +  is for the system administrator to provide a standard image of the system, to
>> >>> +  be later personalized with the network configuration, private keys identifying
>> >>> +  the machine, ssh keys of the users and so on. A possible workaround would be
>> >>> +  for the scripts to mount the image (only if this is trusted!) and to receive
>> >>> +  the configurations and ssh keys as user defined OS parameters. Unfortunately,
>> >>> +  this is also not an option for security sensitive material (such as the ssh
>> >>> +  keys) because the OS parameters are stored in many places on the system, as
>> >>> +  already described above.
>> >>> +
>> >>> +* Most other virtualization software simply work with instance images, not with
>> >>> +  installation scripts. This difference makes the interaction of Ganeti with
>> >>> +  other softwares difficult.
>> >>
>> >> s/softwares/software/
>> >
>> > ACK
>> >
>> >>
>> >>> +
>> >>> +Proposed changes
>> >>> +================
>> >>> +
>> >>> +In order to fix the shortcomings of the current state, we plan to introduce the
>> >>> +following changes:
>> >>> +
>> >>> +* Change the OS parameters to have three categories:
>> >>> +
>> >>> +  * ``public``: the current behavior. The parameter is logged and stored freely.
>> >>> +
>> >>> +  * ``private``: the parameter is saved inside the Ganeti configuration (to allow
>> >>> +    for instance reinstall) but it is not shown in logs, job logs, or passed back
>> >>> +    via RAPI.
>> >>> +
>> >>> +  * ``secret``: the parameter is not saved inside the Ganeti configuration.
>> >>> +    Reinstall are impossible unless the data is passed again. The parameter will
>> >>> +    not appear in any log file. In order to preserve the functionality of Ganeti,
>> >>> +    the parameters will still need to be stored in the job files, but they will
>> >>> +    be removed from there when the job has finished running (either successfully
>> >>> +    or not).
>> >>> +
>> >>
>> >> Do we actually need to save them in the job files?
>> >> The job files could be saved (to disk) without, and in case the master
>> >> is failed over the job can be failed.
>> >> (this should make it a lot harder to access)
>> >
>> > Unfortunately, I think we need to save them. Currently the job is
>> > created by luxid, serialized, and then read from file and executed by
>> > masterd, as part of the ongoing migration of the job queue from
>> > masterd to luxid.
>> >
>>
>> Ack, but this is hopefully temporary, and the job data can perhaps in
>> the future be passed via socket between the two...
>> So OK temporarily during development, but not by design, let's rather
>> fix the underlying problem.
>>
>> >>> +* A new OS installation procedure, based on a safe virtualized environment.
>> >>> +  This virtualized environment will run with the same hardware parameter as the
>> >>> +  actual instance being installed, as much as possible. This will also allow to
>> >>> +  reduce the memory usage in the host (specifically, in Dom0 for Xen
>> >>> +  installations). Each instance will have these possible execution modes:
>> >>> +
>> >>> +  * ``run``: the default mode, used when the machine is running normally.
>> >>> +
>> >>> +  * ``self_install``: Ganeti will start the instance with a different set of
>> >>> +    user-specified parameters, therefore allowing to attach an installation
>> >>> +    floppy/cdrom/network, change the boot device order, or specify an OS image
>> >>> +    to be used. The instance will then be responsible to get the parameters for
>> >>> +    configuring itself (its network interfaces, IP address, hostname, etc.) from
>> >>> +    a set of metadata provided to it by Ganeti (e.g.: using an approach
>> >>> +    comparable to the one of the ``cloud-init`` tool). When this installation
>> >>> +    mode is used, no OS installation script is required.
>> >>> +    In order for installation of an OS from an image to be possible, a new
>> >>> +    parameter ``--os-image`` will be added, allwoing to specify where to take
>> >>> +    the image from. It will have to be mutually exclusive with ``--os-type``. If
>> >>> +    ``--os-image`` is specified, ``--os-parameters`` can still be used, as it
>> >>> +    will be passed to the instance as part of the metadata.
>> >>> +    The set of ``self_install`` parameters will be stored as part of the
>> >>> +    instance configuration, so that they can be used to reinstall the instance.
>> >>> +    It will be the user's responsibility to ensure that the OS image or any
>> >>> +    installation media is still available in the proper position when a
>> >>> +    reinstall happens.
>> >>> +
>> >>
>> >> Should we use --os-type image:<name> and/or have an image os provider
>> >> that defines:
>> >> 1) the actual parameters needed for installation
>> >> 2) the image (eg. the verify script could double check that the image
>> >> is available from the node or accessible via the network...)
>> >>
>> >> I think in particular it would be useful to still have the concept of
>> >> an OS "provider" that tells ganeti how to install itself (which
>> >> parameters to use). This of course could be overridable, but at least
>> >> there would be a sane default without relying on the user to "get it
>> >> right".
>> >
>> > Regarding using --os-type image:<name>:
>> > That was my initial thought too, and also my favorite choice. Still,
>> > given that we usually want to keep backwards compatibility, this would
>> > cause problems if somebody has an OS definition called "image".
>> > Furthermore, that name would become reserved in the future.
>> > If you think it is a small enough risk, and listing this in the
>> > "incompatible changes" section of the NEWS file is enough, then I'm
>> > absolutely in favor of doing it.
>> >
>>
>> I think it would be OK as it's not conflicting with an OS definition
>> called "image" but one called image:<something>, no?
>>
>> > Regarding the os provider: my idea here was to have a possibility of
>> > using Ganeti without having to provide a provider, but just an OS
>> > image plus some "gnt-instance add" parameters, therefore having a more
>> > standard approach, similar to what other solutions are doing. Having
>> > an OS provider for this as well would defeat this purpose. Moreover,
>> > providing an installation script would still be an option, so whoever
>> > wants to have an OS provider can have it.
>> >
>>
>> Ack.
>>
>> >>
>> >>> +  * ``install``: Ganeti will start the instance using a virtual appliance
>> >>> +    specifically made for installing Ganeti instances. Scripts analogous to the
>> >>> +    current ones will run inside this instance. The disks of the instance being
>> >>> +    installed will be connected to this virtual appliance, so that the scripts
>> >>> +    can mount them and modify them as needed, as currently happens, but with the
>> >>> +    additional protection given by this happening in a VM. The virtual appliance
>> >>> +    will be started in a clean state every time a new instance need to be
>> >>> +    created, to further increase security. Metadata will be provided also to
>> >>> +    this virtual applicance, that will take care of converting them to
>> >>> +    environment variables for the installation scripts.
>> >>> +
>> >>
>> >> Please specify better that by "will be started in a clean state" you
>> >> actually mean "the disk will be reset to its pristine state and not
>> >> reused between reinstallation" because it might be construed to mean
>> >> just the "booting" (runtime info) which is sort of less strict.
>> >
>> > ACK
>> >
>> >>
>> >>> +In order to allow for the metadata to be sent inside the instance, a
>> >>> +communication mechanism between the instance and the host will be created. This
>> >>> +mechanism will be bidirectional (e.g.: to allow the setup process going on
>> >>> +inside the instance to communicate its progress to the host). Each instance will
>> >>> +have access exclusively to its own metadata, and it will be only able to
>> >>> +communicate with its host over this channel.
>> >>> +
>> >>
>> >> Too vague :)
>> >
>> > It's intentionally vague: here it's just meant to state the problem.
>> > The actual description of the metadata and the communication mechanism
>> > is in the implementation section. I'll add a reference to that from
>> > here.
>> >
>>
>> Thanks.
>>
>> >>
>> >>
>> >>> +As part of the instance creation command it will be possible to indicate a URL
>> >>> +for a "personalization package", that is an archive containing a set of files
>> >>> +meant to be overlayed on top of the operating system file system at the end of
>> >>> +the setup process, before the VM is started for the first time in ``run`` mode.
>> >>> +Ganeti will provide a mechanism for receiving and unpacking this archive as part
>> >>> +of the ``install`` execution mode, whereas in ``self_install`` mode it will only
>> >>> +be provided as a metadata for the instance to use.
>> >>> +The archive will be in TAR-GZIP format (with extension ``.tar.gz`` or ``.tgz``)
>> >>> +and will contain the files according to the directory structure that will be
>> >>> +recreated on the installation disk. Files contained in this archive will
>> >>> +overwrite files with the same path created during the install procedure (if
>> >>> +any).
>> >>> +The URL of the "personalization package" will have to specify an extesion to
>> >>> +identify the file format (in order to allow for more formats to be supported in
>> >>> +the future).
>> >>> +The URL will be stored as part of the configuration of the instance (therefore,
>> >>> +the URL should not contain confidential information, but the file there
>> >>> +available can). It is up to the system administrator to ensure that a package
>> >>> +is actually available at that URL at install and reinstall time.
>> >>> +The content of the package is allowed to change. E.g.: a system administrator
>> >>> +might create a package containing the private keys of the instance being
>> >>> +created. When the instance is reinstalled, a new package with new keys can be
>> >>> +made available there, therefore allowing instance reinstall without the need to
>> >>> +store keys.
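
To make the overlay semantics of the personalization package concrete,
here is a minimal sketch of the unpacking step. It only illustrates what
the paragraph above describes (a gzip-compressed tar whose files overwrite
files at the same path). The function names, the extension check on the
URL, and the policy of rejecting links and paths that escape the target
directory are my assumptions, not part of the proposed design.

```python
import os
import tarfile

# The two extensions named in the design text.
SUPPORTED_EXTENSIONS = (".tar.gz", ".tgz")


def is_supported_package(url):
    """Hypothetical helper: tell the format from the URL extension."""
    return url.lower().endswith(SUPPORTED_EXTENSIONS)


def overlay_package(archive_fileobj, target_root):
    """Unpack a tar.gz archive on top of target_root.

    Files already present at the same path are overwritten. As an
    assumed safety policy, only regular files and directories are
    accepted, and every member must stay inside target_root.
    Returns the names of the regular files that were written.
    """
    root = os.path.realpath(target_root)
    with tarfile.open(fileobj=archive_fileobj, mode="r:gz") as tar:
        members = tar.getmembers()
        for member in members:
            if not (member.isfile() or member.isdir()):
                raise ValueError("unsupported member type: %s" % member.name)
            dest = os.path.realpath(os.path.join(root, member.name))
            if os.path.commonpath([root, dest]) != root:
                raise ValueError("member escapes target root: %s" % member.name)
        tar.extractall(root, members=members)
    return [m.name for m in members if m.isfile()]
```

In the ``install`` mode this would run inside the helper VM against the
mounted instance disks, never on the node itself, which is the whole
point of the redesign.
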
>> >>> +
>> >>
>> >> Add something about authentication perhaps (so that an admin can have
>> >> a file available only to the ganeti installer only for the time of the
>> >> installation) and also about the fact that we won't cache/keep the
>> >> file on the node OS.
>> >
>> > ACK
>> >
>> >>
>> >>> +Implementation
>> >>> +==============
>> >>> +
>> >>> +The implementation of this design will happen as an ordered sequence of steps,
>> >>> +of increasing impact on the system and, in some cases, dependent on each other:
>> >>> +
>> >>> +#. Private and secret instance parameters
>> >>> +#. Communication mechanism between host and instance
>> >>> +#. Metadata service
>> >>> +#. Personalization package
>> >>> +#. ``self_install`` mode
>> >>> +#. ``install`` mode (with virtualization environment)
>> >>> +
>> >>> +Some of these steps need to be more deeply specified w.r.t. what is already
>> >>> +written in the `Proposed changes`_ Section. Extra details will be provided in
>> >>> +the following Subsections.
>> >>> +
>> >>> +Communication mechanism and metadata service
>> >>> +++++++++++++++++++++++++++++++++++++++++++++
>> >>> +
>> >>> +The communication mechanism and the metadata service are described together
>> >>> +because they are deeply tied. On the other hand, the communication mechanism
>> >>> +will need to be more generic because it can be used for other reasons in the
>> >>> +future (like allowing instances to esplicitly send commands to Ganeti, or to let
>> >>
>> >> explicitly
>> >
>> > ACK
>> >
>> >>
>> >>> +Ganeti control a helper instance, like the one hereby introduced for performing
>> >>> +OS installs inside a safe environment).
>> >>> +
>> >>> +The communication mechanism will be enabled automatically when the instance is
>> >>> +in ``self_install`` or ``install`` mode, but for backwards compatibility it will
>> >>> +be disabled when the instance is in ``run`` mode unless it is esplicitly
>> >>
>> >> ^ see above
>> >
>> > ACK
>> >
>> >>
>> >>> +requested at instance startup by using a new, ad-hoc, parameter
>> >>> +(``--communication``).
>> >>
>> >> Which parameter is this? An instance, hypervisor or backend parameter?
>> >> And why?
>> >> Also -C could do as well (if we go for instance level). Remember to
>> >> specify here as it has to be clear that an instance once configured
>> >> that way will be always started that way.
>> >>
>> >
>> > Yes, it's intended to be an instance level parameter. I'll specify
>> > that it is set at creation time, or modifiable with "gnt-instance
>> > modify", and then is automatically read from the config and used every
>> > time the instance is started.
>> >
>> >>> +
>> >>> +When the communication mechanism is enabled, Ganeti will create a new network
>> >>> +interface inside the instance. This extra network interface will be the last one
>> >>> +of the instance, after all the user defined ones. On the host side, this
>> >>> +interface will be only accessible to the host itself, and not be routed outside
>> >>> +the machine.
>> >>
>> >> Actually it would be great if we didn't even have to create the tap.
>> >
>> > Do you mean something like (for kvm):
>> > -net user,net=169.254.169.0/24,host=169.254.169.254
>> > that starts a user network showing the host as reachable with address
>> > 169.254.169.254?
>> >
>>
>> Yes, that would be a secure way to do it. Or perhaps using a
>> VDE-compatible connection?
>> But it doesn't have to be. Otherwise let's discuss which rules there
>> will be by default so that we ensure that traffic can't get to the
>> wrong place.
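
As a small aid for the "traffic can't get to the wrong place"
discussion, the sketch below checks the addressing from the "-net user"
example above: that the host address sits inside the given network and
that the whole network is in the IPv4 link-local range, which routers
are not supposed to forward. It uses only the standard ``ipaddress``
module. The function name and the list of checks are my own assumptions,
and this says nothing about firewall rules, which would still need to be
agreed on separately.

```python
import ipaddress

# Values taken from the "-net user" example in this thread.
METADATA_NET = ipaddress.ip_network("169.254.169.0/24")
HOST_ADDR = ipaddress.ip_address("169.254.169.254")


def check_communication_addressing(net=METADATA_NET, host=HOST_ADDR):
    """Return a list of problems; an empty list means no problem found."""
    problems = []
    if host not in net:
        problems.append("host %s is not inside %s" % (host, net))
    # Check both ends of the network so that the whole range is link-local.
    if not (net.network_address.is_link_local
            and net.broadcast_address.is_link_local):
        problems.append("%s is not entirely link-local" % net)
    if host in (net.network_address, net.broadcast_address):
        problems.append("host %s is a network or broadcast address" % host)
    return problems
```
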
>>
>> >>> +On this network interface, the instance will connect using the IP:
>> >>> +169.254.169.1 and netmask 255.255.255.0.
>> >>> +The host will be on the same network, with the IP address: 169.254.169.254.
>> >>> +The instance will be able to connect to 169.254.169.254:80, and issue GET
>> >>> +requests to an HTTP server that will provide the instance metadata.
>> >>> +
>> >>> +The choice of this IP address and port is done for compatibility reasons with
>> >>> +OpenStack's and Amazon EC2's ways of providing metadata to the instance.
>> >>> +
>> >>> +Where possible, the metadata will be provided in a way compatible with OpenStack
>> >>> +at::
>> >>> +
>> >>> +  http://169.254.169.254/openstack/<version>/meta_data.json
>> >>> +
>> >>> +or with Amazon EC2, at::
>> >>> +
>> >>> +  http://169.254.169.254/<version>/meta-data/*
>> >>> +
>> >>> +If some metadata are Ganeti-specific and don't fit this structure, they will be
>> >>> +provided at::
>> >>> +
>> >>> +  http://169.254.169.254/<version>/ganeti/meta_data.json
>> >>> +
>> >>
>> >> Not quite clear! :) How does the OS choose between those? How are they
>> >> expected to differ?
>> >
>> > The idea is to provide the data in both formats, so the OS can choose
>> > based on its own preferences (there are some tools already getting the
>> > data from those positions, such as cloud-init).
>> >
>> >>
>> >>> +``<version>`` is either a date in YYYY-MM-DD format, or ``latest`` to indicate
>> >>> +the most recent available protocol version.
>> >>> +
>> >>
>> >> Is this what openstack and EC2 do?
>> >
>> > Yes, I'm writing this here just as a clarification, but it's exactly
>> > their format.
>> >
>> >>
>> >>> +A bi-directional, pipe-like communication channel will be provided.
>> >>> The instance
>> >>> +will be able to receive data from the host by a GET request at::
>> >>> +
>> >>> +  http://169.254.169.254/<version>/ganeti/pipe_in
>> >>> +
>> >>> +and to send data to the host by a POST request at::
>> >>> +
>> >>> +  http://169.254.169.254/<version>/ganeti/pipe_out
>> >>> +
>> >>
>> >> Why is it /openstack/<version>
>> >> but <version>/meta-data
>> >> and <version>/ganeti ?
>> >> Can we have it a bit more logical?
>> >
>> > EC2 is:
>> > /<version>/meta-data/*
>> >
>> > OpenStack came later but wanted to keep compatibility, so they created
>> > their own directory, including their own API version number:
>> >
>> > /openstack/<version>/meta_data.json
>> >
>> > And Ganeti is supposed to follow the same style as openstack, but I
>> > wrote it wrong, sorry for the mistake:
>> > /ganeti/<version>/*
>> >
>>
>> Ack then.
>>
>> >>
>> >>> +As in a pipe, once the data are read, they will not be in the buffer anymore, so
>> >>> +subsequent get request to ``pipe_in`` will not return the same data twice.
>> >>> +Unlike a pipe, though, it will not be possible to perform blocking I/O
>> >>> +operations.
>> >>> +
>> >>
>> >> So maybe we should just call it read and write? :)
>> >
>> > Perfectly fine for me.
>> >
>> >>> +The OS parameters will be accessible through a GET
>> >>> +request at::
>> >>> +
>> >>> +  http://169.254.169.254/<version>/ganeti/os/parameters/<visibility>.json
>> >>> +
>> >>> +as a JSON serialized dictionary. ``<visibility>`` will be either ``public`` or
>> >>> +``private`` or ``secret``.
>> >>> +
>
> Instead of having 'os/parameters/<visibility>', why not just have one
> endpoint that returns a JSON object with keys 'public', 'private', and
> 'secret'? Something like os/parameters.json. It gives us more
> flexibility in case we want to change the data structure instead of
> having to maintain several endpoints.

As I already replied to Guido in a previous email, that's perfectly
fine, and I'll do it.

Thanks for the suggestion,
Michele

-- 
Google Germany GmbH
Dienerstr. 12
80331 München

Registration court and number: Hamburg, HRB 86891
Registered office: Hamburg
Managing directors: Graham Law, Christine Elizabeth Flores
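
For reference, a minimal sketch of what the single endpoint could
return, assuming the /ganeti/<version>/ path style agreed earlier in
the thread and the os/parameters.json name suggested above. The function
names and the example parameter names are purely illustrative, not
something the design defines.

```python
import json

# The three visibility categories from the design document.
VISIBILITIES = ("public", "private", "secret")


def parameters_path(version="latest"):
    """Hypothetical path of the combined endpoint for a protocol version."""
    return "/ganeti/%s/os/parameters.json" % version


def build_parameters_document(public=None, private=None, secret=None):
    """Serialize the OS parameters as one JSON object keyed by visibility.

    Every key is always present, so a client can look up
    doc["secret"] without first checking whether it exists.
    """
    by_visibility = {"public": public, "private": private, "secret": secret}
    doc = {name: dict(by_visibility[name] or {}) for name in VISIBILITIES}
    return json.dumps(doc, sort_keys=True)
```

A client inside the instance would then issue one GET request and pick
the keys it needs, e.g. json.loads(body)["public"], instead of calling
three separate URLs.
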
