I haven't chimed in on previous iterations of this topic as I was still coming to grips with the bigger picture. I've spent a great deal of time recently trying to execute every combination of rump kernels on most of the target platforms (I haven't done much with ISO yet, but I have with all the others), so hopefully I can contribute.
#### Beginner's perspective

From the perspective of someone new to the project: there is a significant mental model to form of what the rump project is and how all the pieces fit together. It has only recently become clear to me that the primary functionality builds a kernel, the secondary functionality launches that kernel in a specific environment, and that currently these two things are somewhat combined. Because all this functionality is integrated, it is a lot harder to grasp for someone new to the project (though I'm not sure if that matters). It would be easier to understand if it were divided into one kernel build and many run scripts, one for each run target. Then, for example, if I were interested in running on EC2, I would need to understand only the "build-kernel" and "run-on-ec2" scripts and documentation.

#### Core kernel build functionality

Has an *inward* perspective: it builds a unikernel and deals only with the inner workings of the unikernel. Some ideas (some of which may already be current behaviour?):

* Use initramfs/initrd or similar for a built-in rootfs: https://www.kernel.org/doc/Documentation/filesystems/ramfs-rootfs-initramfs.txt
* initramfs loaded by grub along with the kernel at boot
* Kernels are diskless: no requirement for disk beyond the initrd rootfs
* The kernel automounts all attached block volumes by default, unless overridden by json.cfg
* The kernel configures itself with DHCP if available, unless overridden with specific network configuration in json.cfg
* json.cfg is stored in the initial ramfs, or may be overridden by passing the text of json.cfg as a kernel argument at launch

The build process produces three files:

* (optional) json.cfg
* kernel
* initial ramdisk containing the rootfs and a default json.cfg

json.cfg:

* defines configuration that happens INSIDE the unikernel
* does not attempt to specify anything about configuration that happens OUTSIDE the unikernel
* specifies the names of required block devices and their mountpoints
* specifies network device configuration
* other things the kernel needs to configure?
* is not needed at all in, for example, a diskless configuration where there are no block devices to mount and only DHCP is required

Where possible, and where it makes sense, the build process provides EXTERNAL VM host configuration information somehow, suggesting for example the command line required to boot on the various target platforms. Maybe this is just text, or some sort of configuration json structure.

Boot process: grub starts, then:

    config.json —> unikernel <— initrd

I'll give away my lack of understanding of the technical details here, but if there were any way to avoid specifying the rumpbake target, that would lead to there only ever being one sort of rump unikernel. The rumpbake process means that there are four different, incompatible rump unikernels and no easy way to tell one from another. Maybe this is not practical, but it would be a good thing if there were just one single type of rump kernel.

#### Separate launch scripts

Simple solutions for common use cases, one script per execution target:

* Xen
* KVM/QEMU
* Build ISO
* EC2 PV
* EC2 HVM
* Other cloud targets

#### The hardest bit (for me)

The hardest bit currently is trying to understand network device naming: what names are needed by which target, and how to map the kernel's naming to the external name. I'm still trying to figure this out. I'm not sure if this needs code or just a written explanation; working out network device names feels a bit like black magic at the moment. I need to dive into the code to fully understand this area.

#### Responses to Antti's questions

To present the above wall of text as responses to Antti's questions:

>> 1: a distributable format which does not require the toolchain to launch
>> (what rumpbake currently gives you)

* A multiboot kernel is a good, well-defined distributable format.
* Ideally there would be only one type of kernel, not different types for xen_pv, xen_pci, hw_generic, hw_virtio (perhaps foolish to suggest, as I don't know enough about the technical reasons for needing a different kernel for each). It would be great if there were only one single type of rump kernel.

>> 2: a mechanism to configure the runtime behaviour of the distributable
>> format (what the rumprun tool currently does)

* json.cfg stored on the initrd/initramfs, which can be overridden (or initially provided) by passing json.cfg as a kernel argument at boot time.

>> 3: a mechanism to easily launch the result of 1+2 *where available* (what
>> the rumprun tool does for xen+kvm+qemu)

* Separate launch scripts, one for each target. The goal here is just to be helpful in common launch cases, not to provide a universally applicable solution to all launch configuration requirements.

>> 4: a mechanism to "specialize" the distributed format, e.g. "baked in root
>> files" or even including "2" (to enable running without block storage)

* An initrd/initramfs provided as part of the boot process solves the "baked in root files" case.
* A kernel able to run diskless, without block storage, would be great and would simplify things greatly in some configurations.
* Mounting all found block devices by default would be great, optionally turned off in json.cfg.

>> There are actually 2 different configs, the Rumprun runtime config and the
>> config of whatever you launch on. We can't always control the latter from
>> software, consider e.g. the case where you're launching on an embedded
>> system, so this bit is only about the Rumprun config. We now use a json
>> config. I'm thinking that the best option is to not provide a config
>> generating tool at all, simply polish the json format spec and be happy with
>> it.

* Agreed. json.cfg should deal only with the internal configuration of the kernel and specify nothing about the external configuration required by the virtual host.

>> Then, "3" or launching.
>> Now, there needs to be a congruence between your config from "2" and what
>> you launch on. The current rumprun tool sort of attempts to help you
>> there, but with anything except a trivial setup you need to know what
>> you're doing anyway, e.g. "-I ,,'-net tap,ifname=tap0" constructs and so
>> forth. So the tool is not *really* catching you because it cannot read
>> your mind. Do we need a quasi-abstracting tool? I'm going to say "no".
>> The real eye-opener here for me was working on EC2. We simply have no way
>> to sensibly abstract the 3 billion toggles available via EC2. Even for
>> trivial systems like xl.conf (when compared to EC2), if you know xl.conf
>> you'll just have to learn a second syntax to do what you already know. If
>> you don't know xl.conf, we can provide an example or two. Not sure if we
>> should provide some case-specific helper scripts, though nothing which
>> pretends that everything is the same, hides power and throws off people
>> who'd know how to use the relevant backend tools.

* This is spot on. Simple launch scripts and examples for common cases, without attempting to build universally functional VM launcher scripts.
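To make the "internal configuration only" idea concrete, a json.cfg in the spirit described above might look like the sketch below. The key names are purely illustrative (my own invention, not the current rumprun schema); the point is that everything in the file is inside-the-unikernel state: interface configuration, mounts, and the program's arguments, with nothing about the host:

```json
{
  "net": {
    "if": "vioif0",
    "method": "dhcp"
  },
  "blk": [
    { "dev": "ld0a", "mountpoint": "/data" }
  ],
  "cmdline": "myprogram -p 8080"
}
```

A diskless, DHCP-only guest would simply omit the file entirely, per the bullet above.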
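As a sketch of what a per-target launch script could look like, here is the shape I have in mind for the KVM/QEMU case. The flag choices and the DRY_RUN knob are my own invention, not existing rumprun tooling; the idea is just "helpful in the common case, transparent about the underlying command":

```shell
#!/bin/sh
# Hypothetical per-target launch helper for KVM/QEMU.
# Not rumprun's actual interface; a sketch of the shape such a
# script could take.
run_qemu() {
    kernel=$1
    initrd=$2
    shift 2
    # Assemble the QEMU invocation: multiboot kernel plus initrd
    # (rootfs + default json.cfg); remaining args pass through so
    # power users keep full control of the backend tool.
    set -- qemu-system-x86_64 -m 128 -nographic \
        -kernel "$kernel" -initrd "$initrd" "$@"
    if [ -n "$DRY_RUN" ]; then
        # Print the command instead of running it, so users can
        # inspect it and adapt it to their own setup.
        printf '%s\n' "$*"
    else
        exec "$@"
    fi
}
```

Setting DRY_RUN prints the assembled command line rather than running it, which doubles as the "provide an example or two" documentation role discussed above.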

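On the network-naming point above: as far as I can tell, the in-kernel interface name follows whichever driver the baked-in kernel attaches, which is why each launch target wants a different name in json.cfg. The mapping below reflects my current (possibly wrong or incomplete) understanding; corrections welcome:

```shell
# Rough mapping from rumpbake target to the interface name the
# kernel sees internally. This is my current understanding only
# and may well be incomplete or wrong.
guess_ifname() {
    case $1 in
        xen_pv)     echo xenif0 ;;  # Xen netfront
        hw_virtio)  echo vioif0 ;;  # virtio-net under KVM/QEMU
        hw_generic) echo "depends-on-emulated-NIC-driver" ;;
        *)          echo unknown ;;
    esac
}
```

If this mapping could be written down authoritatively (or emitted by the build process), it would remove most of the black magic for newcomers.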