Re: [PATCH master] OS installation redesign

Jose A. Lopes Thu, 14 Nov 2013 05:56:55 -0800

On Thu, Nov 14, 2013 at 11:31:11AM +0100, Guido Trotter wrote:
> On Wed, Nov 13, 2013 at 9:57 AM, Michele Tartara <[email protected]> wrote:
> > On Tue, Nov 12, 2013 at 2:13 PM, Guido Trotter <[email protected]> wrote:
> >> On Tue, Nov 12, 2013 at 12:41 PM, Michele Tartara <[email protected]> wrote:
> >>> Add the document describing a new design for the OS installation process 
> >>> for
> >>> new instances.
> >>>
> >>> Signed-off-by: Michele Tartara <[email protected]>
> >>> ---
> >>>  doc/design-draft.rst |    1 +
> >>>  doc/design-os.rst    |  318 ++++++++++++++++++++++++++++++++++++++++++++++++++
> >>>  2 files changed, 319 insertions(+)
> >>>  create mode 100644 doc/design-os.rst
> >>>
> >>> diff --git a/doc/design-draft.rst b/doc/design-draft.rst
> >>> index c821292..3ed3852 100644
> >>> --- a/doc/design-draft.rst
> >>> +++ b/doc/design-draft.rst
> >>> @@ -20,6 +20,7 @@ Design document drafts
> >>>     design-daemons.rst
> >>>     design-hsqueeze.rst
> >>>     design-ssh-ports.rst
> >>> +   design-os.rst
> >>>
> >>>  .. vim: set textwidth=72 :
> >>>  .. Local Variables:
> >>> diff --git a/doc/design-os.rst b/doc/design-os.rst
> >>> new file mode 100644
> >>> index 0000000..7a42a7f
> >>> --- /dev/null
> >>> +++ b/doc/design-os.rst
> >>> @@ -0,0 +1,318 @@
> >>> +===============================
> >>> +Ganeti OS installation redesign
> >>> +===============================
> >>> +
> >>> +.. contents:: :depth: 3
> >>> +
> >>> +This is a design document detailing a new OS installation procedure, more
> >>> +secure, able to provide more features and easier to use for many common 
> >>> tasks
> >>> +w.r.t. the current one.
> >>> +
> >>> +Current state and shortcomings
> >>> +==============================
> >>> +
> >>> +As of Ganeti 2.10, each instance is associated with an OS definition. An 
> >>> OS
> >>> +definition is a set of scripts (``create``, ``export``, ``import``, 
> >>> ``rename``)
> >>> +that are executed with root privileges on the primary host of the 
> >>> instance to
> >>> +perform all the OS-related functionality (setting up an operating system 
> >>> inside
> >>> +the disks of the instance being created, exporting/importing the 
> >>> instance,
> >>> +renaming it).
> >>> +
> >>> +These scripts receive, as environment variables, a fixed set of 
> >>> parameters
> >>> +describing the instance (such as the hypervisor, the name of the 
> >>> instance, the
> >>> +number of disks, and their location) and a set of user defined 
> >>> parameters. Each
> >>> +of these parameters is also written into the configuration file of 
> >>> Ganeti, to
> >>> +allow for future reinstalls of the instance, and in various log files, 
> >>> namely:
> >>> +
> >>> +* node daemon log file: contains DEBUG strings of the ``/os_validate``,
> >>> +  ``/instance_os_add`` and ``/instance_start`` RPC calls.
> >>> +
> >>> +* master daemon log file: DEBUG strings related to the same RPC calls 
> >>> are stored
> >>> +  here as well.
> >>> +
> >>> +* commands log: the CLI commands that create a new instance, including 
> >>> their
> >>> +  parameters, are logged here.
> >>> +
> >>> +* RAPI log: the RAPI commands that create a new instances, including 
> >>> their
> >>> +  parameters, are logged here.
> >>> +
> >>> +* job logs: the job files stored in the job queue or in its archive 
> >>> contain the
> >>> +  parameters.
> >>> +
> >>> +The current situation presents a number of shortcomings:
> >>> +
> >>> +* Having the installation scripts run with root power on the nodes is a 
> >>> huge
> >>> +  security issue.
> >>> +
> >>
> >> s/is a huge security issue/doesn't allow user-defined os scripts, as
> >> they would pose a huge security issue/
> >>
> >> Note that there's no security issue *per se* in the current situation,
> >> if the OS scripts are trusted.
> >> (except perhaps for export, if the os script mounts the instance disk,
> >> which is also not necessarily the case)
> >
> > Yes, that's what I meant. I'll reword it as you suggest.
> >
> >>
> >> That said it could be a safety issue in the sense that an eventual
> >> bug/error in the os script could risk disrupting the node.
> >
> > ACK
> >
> >>
> >>> +* Ganeti cannot be used to create instances starting from user provided 
> >>> disk
> >>> +  images: even in the (hypothetical) case where the scripts are 
> >>> completely
> >>> +  secure and run not by root but by an unprivileged user with only the 
> >>> power to
> >>> +  mount arbitrary files as disk images, this is a security issue. It has 
> >>> been
> >>> +  proven that a carefully crafted file system might exploit kernel
> >>> +  vulnerabilities to gain control of the system. Therefore, directly 
> >>> mounting
> >>> +  images on the Ganeti nodes is not an option.
> >>> +
> >>> +* There is no way to inject files into an existing disk image. A common 
> >>> use case
> >>> +  is for the system administrator to provide a standard image of the 
> >>> system, to
> >>> +  be later personalized with the network configuration, private keys 
> >>> identifying
> >>> +  the machine, ssh keys of the users and so on. A possible workaround 
> >>> would be
> >>> +  for the scripts to mount the image (only if this is trusted!) and to 
> >>> receive
> >>> +  the configurations and ssh keys as user defined OS parameters. 
> >>> Unfortunately,
> >>> +  this is also not an option for security sensitive material (such as 
> >>> the ssh
> >>> +  keys) because the OS parameters are stored in many places on the 
> >>> system, as
> >>> +  already described above.
> >>> +
> >>> +* Most other virtualization software simply work with instance images, 
> >>> not with
> >>> +  installation scripts. This difference makes the interaction of Ganeti 
> >>> with
> >>> +  other softwares difficult.
> >>
> >> s/softwares/software/
> >
> > ACK
> >
> >>
> >>> +
> >>> +Proposed changes
> >>> +================
> >>> +
> >>> +In order to fix the shortcomings of the current state, we plan to 
> >>> introduce the
> >>> +following changes:
> >>> +
> >>> +* Change the OS parameters to have three categories:
> >>> +
> >>> + * ``public``: the current behavior. The parameter is logged and stored 
> >>> freely.
> >>> +
> >>> + * ``private``: the parameter is saved inside the Ganeti configuration 
> >>> (to allow
> >>> +   for instance reinstall) but it is not shown in logs, job logs, or 
> >>> passed back
> >>> +   via RAPI.
> >>> +
> >>> + * ``secret``: the parameter is not saved inside the Ganeti 
> >>> configuration.
> >>> +   Reinstall are impossible unless the data is passed again. The 
> >>> parameter will
> >>> +   not appear in any log file. In order to preserve the functionality of 
> >>> Ganeti,
> >>> +   the parameters will still need to be stored in the job files, but 
> >>> they will
> >>> +   be removed from there when the job has finished running (either 
> >>> successfully
> >>> +   or not).
> >>> +
> >>
> >> Do we actually need to save them in the job files?
> >> The job files could be saved (to disk) without, and in case the master
> >> is failed over the job can be failed.
> >> (this should make it a lot harder to access)
> >
> > Unfortunately, I think we need to save them. Currently the job is
> > created by luxid, serialized, and then read from file and executed by
> > masterd, as part of the ongoing migration of the job queue from
> > masterd to luxid.
> >
> 
> Ack, but this is hopefully temporary, and the job data can perhaps in
> the future be passed via socket between the two...
> So OK temporarily during development, but not by design, let's rather
> fix the underlying problem.
> 
> >>> +* A new OS installation procedure, based on a safe virtualized 
> >>> environment.
> >>> +  This virtualized environment will run with the same hardware parameter 
> >>> as the
> >>> +  actual instance being installed, as much as possible. This will also 
> >>> allow to
> >>> +  reduce the memory usage in the host (specifically, in Dom0 for Xen
> >>> +  installations).
> >>> Each instance will have these possible execution modes:
> >>> +
> >>> +  * ``run``: the default mode, used when the machine is running normally.
> >>> +
> >>> +  * ``self_install``: Ganeti will start the instance with a different 
> >>> set of
> >>> +    user-specified parameters, therefore allowing to attach an 
> >>> installation
> >>> +    floppy/cdrom/network, change the boot device order, or specify an OS 
> >>> image
> >>> +    to be used. The instance will then be responsible to get the 
> >>> parameters for
> >>> +    configuring itself (its network interfaces, IP address, hostname, 
> >>> etc.) from
> >>> +    a set of metadata provided to it by Ganeti (e.g.: using an approach
> >>> +    comparable to the one of the ``cloud-init`` tool). When this 
> >>> installation
> >>> +    mode is used, no OS installation script is required.
> >>> +    In order for installation of an OS from an image to be possible, a 
> >>> new
> >>> +    parameter ``--os-image`` will be added, allowing to specify where to take
> >>> +    the image from. It will have to be mutually exclusive with 
> >>> ``--os-type``. If
> >>> +    ``--os-image`` is specified, ``--os-parameters`` can still be used, 
> >>> as it
> >>> +    will be passed to the instance as part of the metadata.
> >>> +    The set of ``self_install`` parameters will be stored as part of the
> >>> +    instance configuration, so that they can be used to reinstall the 
> >>> instance.
> >>> +    It will be the user's responsibility to ensure that the OS image or 
> >>> any
> >>> +    installation media is still available in the proper position when a
> >>> +    reinstall happens.
> >>> +
> >>
> >> Should we use --os-type image:<name> and/or have an image os provider
> >> that defines:
> >> 1) the actual parameters needed for installation
> >> 2) the image (eg. the verify script could double check that the image
> >> is available from the node or accessible via the network...)
> >>
> >> I think in particular it would be useful to still have the concept of
> >> an OS "provider" that tells ganeti how to install itself (which
> >> parameters to use). This of course could be overridable, but at least
> >> there would be a sane default without relying on the user to "get it
> >> right".
> >
> > Regarding using --os-type image:<name>:
> > > That was my initial thought too, and also my favorite choice. Still,
> > given that we usually want to keep backwards compatibility, this would
> > cause problems if somebody has an OS definition called "image".
> > Furthermore, that name would become reserved in the future.
> > If you think it is a small enough risk, and listing this in the
> > "incompatible changes" section of the NEWS file is enough, then I'm
> > absolutely in favor of doing it.
> >
> 
> I think it would be OK as it's not conflicting with an OS definition
> called "image" but one called image:<something>, no?
> 
> > Regarding the os provider: my idea here was to have a possibility of
> > using Ganeti without having to provide a provider, but just an OS
> > image plus some "gnt-instance add" parameters, therefore having a more
> > standard approach, similar to what other solutions are doing. Having
> > an OS provider for this as well, would defeat this purpose. Moreover,
> > providing an installation script would still be an option, so who want
> > to have an OS provider, can have it.
> >
> 
> Ack.
> 
> >>
> >>> +  * ``install``: Ganeti will start the instance using a virtual appliance
> >>> +    specifically made for installing Ganeti instances. Scripts analogous 
> >>> to the
> >>> +    current ones will run inside this instance. The disks of the 
> >>> instance being
> >>> +    installed will be connected to this virtual appliance, so that the 
> >>> scripts
> >>> +    can mount them and modify them as needed, as currently happens, but 
> >>> with the
> >>> +    additional protection given by this happening in a VM. The virtual 
> >>> appliance
> >>> +    will be started in a clean state every time a new instance need to be
> >>> +    created, to further increase security. Metadata will be provided 
> >>> also to
> >>> +    this virtual appliance, that will take care of converting them to
> >>> +    environment variables for the installation scripts.
> >>> +
> >>
> >> Please specify better that by "will be started in a clean state" you
> >> actually mean "the disk will be reset to its pristine state and not
> >> reused between reinstallation" because it might be construed to mean
> >> just the "booting" (runtime info) which is sort of less strict.
> >
> > ACK
> >
> >>
> >>> +In order to allow for the metadata to be sent inside the instance, a
> >>> +communication mechanism between the instance and the host will be 
> >>> created. This
> >>> +mechanism will be bidirectional (e.g.: to allow the setup process going 
> >>> on
> >>> +inside the instance to communicate its progress to the host). Each 
> >>> instance will
> >>> +have access exclusively to its own metadata, and it will be only able to
> >>> +communicate with its host over this channel.
> >>> +
> >>
> >> Too vague :)
> >
> > It's intentionally vague: here it's just meant to state the problem.
> > The actual description of the metadata and the communication mechanism
> > is in the implementation section. I'll add a reference to that from
> > here.
> >
> 
> Thanks.
> 
> >>
> >>
> >>> +As part of the instance creation command it will be possible to indicate 
> >>> a URL
> >>> +for a "personalization package", that is an archive containing a set of 
> >>> files
> >>> +meant to be overlayed on top of the operating system file system at the 
> >>> end of
> >>> +the setup process, before the VM is started for the first time in 
> >>> ``run`` mode.
> >>> +Ganeti will provide a mechanism for receiving and unpacking this archive 
> >>> as part
> >>> +of the ``install`` execution mode, whereas in ``self_install`` mode it 
> >>> will only
> >>> +be provided as a metadata for the instance to use.
> >>> +The archive will be in TAR-GZIP format (with extension ``.tar.gz`` or 
> >>> ``.tgz``)
> >>> +and will contain the files according to the directory structure that 
> >>> will be
> >>> +recreated on the installation disk. Files contained in this archive will
> >>> +overwrite files with the same path created during the install procedure 
> >>> (if
> >>> +any).
> >>> +The URL of the "personalization package" will have to specify an extension to
> >>> +identify the file format (in order to allow for more formats to be 
> >>> supported in
> >>> +the future).
> >>> +The URL will be stored as part of the configuration of the instance 
> >>> (therefore,
> >>> +the URL should not contain confidential information, but the file there
> >>> +available can). It is up to the system administrator to ensure that a 
> >>> package
> >>> +is actually available at that URL at install and reinstall time.
> >>> +The content of the package is allowed to change. E.g.: a system 
> >>> administrator
> >>> +might create a package containing the private keys of the instance being
> >>> +created. When the instance is reinstalled, a new package with new keys 
> >>> can be
> >>> +made available there, therefore allowing instance reinstall without the 
> >>> need to
> >>> +store keys.
> >>> +
> >>
> >> Add something about authentication perhaps (so that an admin can have
> >> a file available only to the ganeti installer only for the time of the
> >> installation) and also about the fact that we won't cache/keep the
> >> file on the node OS.
> >
> > ACK
> >
> >>
> >>> +Implementation
> >>> +==============
> >>> +
> >>> +The implementation of this design will happen as an ordered sequence of 
> >>> steps,
> >>> +of increasing impact on the system and, in some cases, dependent on each 
> >>> other:
> >>> +
> >>> +#. Private and secret instance parameters
> >>> +#. Communication mechanism between host and instance
> >>> +#. Metadata service
> >>> +#. Personalization package
> >>> +#. ``self_install`` mode
> >>> +#. ``install`` mode (with virtualization environment)
> >>> +
> >>> +Some of these steps need to be more deeply specified w.r.t. what is 
> >>> already
> >>> +written in the `Proposed changes`_ Section. Extra details will be 
> >>> provided in
> >>> +the following Subsections.
> >>> +
> >>> +Communication mechanism and metadata service
> >>> +++++++++++++++++++++++++++++++++++++++++++++
> >>> +
> >>> +The communication mechanism and the metadata service are described 
> >>> together
> >>> +because they are deeply tied. On the other hand, the communication 
> >>> mechanism
> >>> +will need to be more generic because it can be used for other reasons in 
> >>> the
> >>> +future (like allowing instances to esplicitly send commands to Ganeti, 
> >>> or to let
> >>
> >> explicitly
> >
> > ACK
> >
> >>
> >>> +Ganeti control a helper instance, like the one hereby introduced for 
> >>> performing
> >>> +OS installs inside a safe environment).
> >>> +
> >>> +The communication mechanism will be enabled automatically when the 
> >>> instance is
> >>> +in ``self_install`` or ``install`` mode, but for backwards compatibility 
> >>> it will
> >>> +be disabled when the instance is in ``run`` mode unless it is esplicitly
> >>
> >> ^ see above
> >
> > ACK
> >
> >>
> >>> +requested at instance startup by using a new, ad-hoc, parameter
> >>> +(``--communication``).
> >>
> >> Which parameter is this? An instance, hypervisor or backend parameter? And 
> >> why?
> >> Also -C could do as well (if we go for instance level). Remember to
> >> specify here as it has to be clear that an instance once configured
> >> that way will be always started that way.
> >>
> >
> > Yes, it's intended to be an instance level parameter. I'll specify
> > that it is set at creation time, or modifiable with "gnt-instance
> > modify", and then is automatically read from the config and used every
> > time the instance is started.
> >
> >>> +
> >>> +When the communication mechanism is enabled, Ganeti will create a new 
> >>> network
> >>> +interface inside the instance. This extra network interface will be the 
> >>> last one
> >>> +of the instance, after all the user defined ones. On the host side, this
> >>> +interface will be only accessible to the host itself, and not be routed 
> >>> outside
> >>> +the machine.
> >>
> >> Actually it would be great if we didn't even have to create the tap.
> >
> > Do you mean something like (for kvm):
> >   -net user,net=169.254.169.0/24,host=169.254.169.254
> > that starts a user network showing the host as reachable with address
> > 169.254.169.254?
> >
> 
> Yes, that would be a secure way to do it. Or perhaps using a
> VDE-compatible connection?
> But it doesn't have to be. Otherwise let's discuss which rules will
> there be by default so that we assure that traffic can't get to the
> wrong place.
> 
> >>> +On this network interface, the instance will connect using the IP:
> >>> +169.254.169.1 and netmask 255.255.255.0.
> >>> +The host will be on the same network, with the IP address: 
> >>> 169.254.169.254.
> >>> +The instance will be able to connect to 169.254.169.254:80, and issue GET
> >>> +requests to an HTTP server that will provide the instance metadata.
> >>> +
> >>> +The choice of this IP address and port is done for compatibility reasons 
> >>> with
> >>> +OpenStack's and Amazon EC2's ways of providing metadata to the instance.
> >>> +
> >>> +Where possible, the metadata will be provided in a way compatible with 
> >>> OpenStack
> >>> +at::
> >>> +
> >>> +  http://169.254.169.254/openstack/<version>/meta_data.json
> >>> +
> >>> +or with Amazon EC2, at::
> >>> +
> >>> +  http://169.254.169.254/<version>/meta-data/*
> >>> +
> >>> +If some metadata are Ganeti-specific and don't fit this structure, they 
> >>> will be
> >>> +provided at::
> >>> +
> >>> +  http://169.254.169.254/<version>/ganeti/meta_data.json
> >>> +
> >>
> >> Not quite clear! :) How does the OS choose between those? How are they
> >> expected to differ?
> >
> > > The idea is to provide the data in both formats, so the OS can choose
> > > based on its own preferences (there are some tools already getting the
> > > data from those positions, such as cloud-init).
> >
> >>
> >>> +``<version>`` is either a date in YYYY-MM-DD format, or ``latest`` to 
> >>> indicate
> >>> +the most recent available protocol version.
> >>> +
> >>
> >> Is this what openstack and EC2 do?
> >
> > Yes, I'm writing this here just as a clarification, but it's exactly
> > their format.
> >
> >>
> >>> +A bi-directional, pipe-like communication channel will be provided. The 
> >>> instance
> >>> +will be able to receive data from the host by a GET request at::
> >>> +
> >>> +  http://169.254.169.254/<version>/ganeti/pipe_in
> >>> +
> >>> +and to send data to the host by a POST request at::
> >>> +
> >>> +  http://169.254.169.254/<version>/ganeti/pipe_out
> >>> +
> >>
> >> Why is it /openstack/<version>
> >> but <version>/meta-data
> >> and <version>/ganeti ?
> >> Can we have it a bit more logical?
> >
> > EC2 is:
> > /<version>/meta-data/*
> >
> > OpenStack came later but wanted to keep compatibility, so they created
> > their own directory, including their own API version number:
> >
> > /openstack/<version>/meta-data.json
> >
> > And Ganeti is supposed to follow the same style as openstack, but I
> > wrote it wrong, sorry for the mistake:
> > /ganeti/<version>/*
> >
> 
> Ack then.
> 
> >>
> >>> +As in a pipe, once the data are read, they will not be in the buffer 
> >>> anymore, so
> >>> +subsequent get request to ``pipe_in`` will not return the same data 
> >>> twice.
> >>> +Unlike a pipe, though, it will not be possible to perform blocking I/O
> >>> +operations.
> >>> +
> >>
> >> So maybe we should just call it read and write? :)
> >
> > Perfectly fine for me.
> >
> >>> +The OS parameters will be accessible through a GET
> >>> +request at::
> >>> +
> >>> +  http://169.254.169.254/<version>/ganeti/os/parameters/<visibility>.json
> >>> +
> >>> +as a JSON serialized dictionary. ``<visibility>`` will be either 
> >>> ``public`` or
> >>> +``private`` or ``secret``.
> >>> +


Instead of having 'os/parameters/<visibility>', why not just have one
endpoint that returns a JSON object with keys 'public', 'private', and
'secret'? Something like os/parameters.json. It gives us more
flexibility in case we want to change the data structure, instead of
having to maintain several endpoints.
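
As a rough sketch of what I mean: the helper name and the sample
parameter values below are made up for illustration, nothing like this
exists in Ganeti today. The only point is that a single JSON object
carries all three visibility levels as top-level keys.

```python
import json


def build_os_parameters(public=None, private=None, secret=None):
    """Serialize all OS parameters into one JSON object.

    Hypothetical helper: it shows the body that a single
    os/parameters.json endpoint could return, with one top-level
    key per visibility level.
    """
    doc = {
        "public": dict(public or {}),
        "private": dict(private or {}),
        "secret": dict(secret or {}),
    }
    return json.dumps(doc, sort_keys=True)


# Sample values are invented for the example.
body = build_os_parameters(
    public={"dhcp": "yes"},
    private={"root_password_hash": "hypothetical-hash"},
)

# The instance parses one document and picks the level it needs.
params = json.loads(body)
print(sorted(params))
```

A level with no parameters is still present as an empty object, so the
instance side does not need to special-case missing keys, and adding a
field later does not require a new endpoint.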

> >>
> >> Why does the instance care about the visibility, and why is this
> >> provided at the file level? Couldn't a single json contain all info,
> >> with also ancillary data to specify the level of confidentiality?
> >
> > Yes, a single file is also possible.
> >
> 
> Cool, thanks.
> 
> >>
> >>> +The installation scripts to be run inside the virtualized environment 
> >>> while the
> >>> +instance is run in ``install`` mode will be available at::
> >>> +
> >>> +  http://169.254.169.254/<version>/ganeti/os/scripts/<script_name>
> >>> +
> >>> +where ``<script_name>`` is the name of the script.
> >>> +
> >>> +The host and the instances (as detailed in `Installation process in a
> >>> +virtualized environment`_) will be able to create other communication 
> >>> channels
> >>> +on the other ports of the same IP address.
> >>> +
> >>
> >> Why not at other URLs?
> >
> > In the design with an actual network interface, ports come "for free".
> > If we go towards a design with no TAP device, this is probably going
> > to be more difficult, and providing some way for the users to provide
> > information as other URLS in this hierarchy becomes more interesting.
> >
> 
> Ack.
> 
> >>
> >>> +
> >>> +Rationale
> >>> +---------
> >>> +
> >>> +The choice of using a network interface for instance-host communication, 
> >>> as
> >>> +opposed to VirtIO, XenBus or other methods, is due to the will of having 
> >>> a
> >>> +generic, hypervisor-independent way of creating a communication channel, 
> >>> that
> >>> +doesn't require unusual (para)virtualization drivers.
> >>> +At the same time, a network interface was preferred over solutions 
> >>> involving
> >>> +virtual floppy or USB devices because the latter tend to be detected and
> >>> +configured by the guest operating systems, sometimes even in prominent 
> >>> positions
> >>> +in the user interface, whereas it is fairly common to have an 
> >>> unconfigured
> >>> +network interface in a system, usually without any negative side effects.
> >>> +
> >>> +
> >>> +Installation process in a virtualized environment
> >>> ++++++++++++++++++++++++++++++++++++++++++++++++++
> >>> +
> >>> +In the new OS installation scenario, we distinguish between trusted and
> >>> +untrusted code.
> >>> +
> >>> +The trusted installation code maintains the behavior of the current one, 
> >>> with
> >>> +the scripts running on the node the instance is being created on. The 
> >>> untrusted
> >>> +code is stored in a subdirectory of the OS definition called 
> >>> ``untrusted``.
> >>> +This directory contains scripts that are equivalent to the already 
> >>> existing
> >>> +ones (``create``, ``export``, ``import``, ``rename``) but that will be 
> >>> run
> >>> +inside an virtualized environment, to protect the host from malicious 
> >>> tampering.
> >>> +
> >>> +The ``untrusted`` code is meant to either be untrusted itself, or to be 
> >>> trusted
> >>> +code running operations that might be dangerous (such as mounting a
> >>> +user-provided image).
> >>> +
> >>> +In order to allow for the highest flexibility, if both a trusted and an
> >>> +untrusted script are provided for the same operation (i.e. ``create``), 
> >>> both of
> >>> +them will be executed at the same time, one on the host, and one inside 
> >>> the
> >>> +installation appliance. They will be allowed to communicate with each 
> >>> other
> >>> +through the already described communication mechanism, in order to 
> >>> orchestrate
> >>> +their execution (e.g.: the untrusted code might execute the 
> >>> installation, while
> >>> +the trusted one receives status updates from it and delivers them to a 
> >>> user
> >>> +interface).
> >>> +
> >>
> >> Sounds a bit clunky, and makes it hard to provide OS definitions from
> >> the user (as an admin I have to "open" them and check that the trusted
> >> scripts are empty or allowed... maybe this should be a new version and
> >> disallow the old way altogether.
> >
> > For user provided script, an administrator might simply decide that
> > they are always untrusted, therefore allowing only for the untrusted
> > part, thus requiring only a really simple check.
> >
> > I agree that having the new kind of scripts being completely untrusted
> > and always running inside the VM would be the simplest and cleanest
> > solution.
> >
> > I wrote the proposal this way to meet some explicit requests from the
> > open source community, looking for a way to have trusted and untrusted
> > code running together in a communication-synchronized way. Maybe we
> > can leave this in the design marking it as optional and hope for some
> > code contribution?
> >
> 
> It would be better if an OS had to be explicitly set to trusted via OS
> parameters before insecure scripts in the host could be executed.
> 
> >
> >>
> >>> +Ganeti will provide a script to be run at install time that can be used 
> >>> to
> >>> +create the virtualized environment that will perform the OS installation 
> >>> of new
> >>> +instances.
> >>> +This script will build a debootstrapped basic debian system including 
> >>> including
> >>
> >> s/including including/including/
> >>
> >>> +a software that will read the metadata, setup the environment variables 
> >>> and
> >>> +launch the installation scripts inside the virtualized environment. The 
> >>> script
> >>> +will also provide hooks for personalization.
> >>> +
> >>
> >>
> >>
> >>> +It will also be possible to use other self-made virtualized environment, 
> >>> as long
> >>> +as they connect to ganeti over the described communication mechanism and 
> >>> they
> >>> +know how to read and use the provided metadata to create a new instance.
> >>> +
> >>> +While performing an installation in the virtualized environment, a
> >>> +personalizable timeout will be used to detect possible problems with the
> >>> +installation process, and to kill the virtualized environment.
> >>> +
> >>
> >> Will the timeout be reset upon communication? Will there be a way to reset 
> >> it?
> >> How will it be customizable? Who specifies where to customize it?
> >
> > I think the timeout should be cluster-wide, set by the administrator
> > of the cluster, and not to be reset upon communication.
> > It is supposed to be a way of avoiding an installation VM to run
> > freely and uncontrolled (mainly in case it is taken over by malicious
> > untrusted scripts), therefore a reset upon communication would make it
> > fairly useless.
> >
> 
> Ack, as long as it's optional.
> 
> 
> 
> 
> Thanks,
> 
> Guido

-- 
Jose Antonio Lopes
Ganeti Engineering
Google Germany GmbH
Dienerstr. 12, 80331, München

Registration court and number: Hamburg, HRB 86891
Registered office: Hamburg
Managing directors: Graham Law, Christine Elizabeth Flores
Tax number: 48/725/00206
VAT identification number: DE813741370
