Hi,
In recent weeks I've been pulling the bits of -baremetal and -xen
together and apart and sideways, and it's starting to be clear what the
resulting internal (*) architecture of the rumprun unikernel (**) should
look like. This is a short(ish) description of what I've finished
pushing today.
*) I'm really keen on defining things in terms of internal and external,
since people can now start using our product, and for that it's
important, in Lampson's words, to "keep a place to stand", with the
external interfaces of course being the place where users can stand.
**) which, important to stress, is not the same thing as a rump kernel
First, why unify at all? For one, it keeps me happier since I don't
have to change the same code multiple times. Second, it will keep
gratuitous differences from creeping in, which has observably been
happening (along with equally observable bugs resulting from those
gratuitous differences!). Third, unified code will work the same way, which is
good for consistency between platforms. Note, I'm not saying that all
platforms will work exactly the same, but at least samer (though, that's
probably not an actual word).
The problem that I created with the first rumprun stack, which was based
on MiniOS (and I stress that it was in no way the result of a flaw in
MiniOS), was the lack of any separation between "userspace" and
"kernelspace". That was fine in MiniOS, since it provides a
self-contained package, but not so much for rumprun, where
there is a clear *conceptual* separation between userspace and
kernelspace; the conceptual separation of course does not mean we can't
link everything into a single address space like we are doing with
rumprun. We want to avoid moebius-strip computing where the kernel runs
on top of libc and vice versa, because that makes it very hard to reason
about dependencies of components. While mixing the two seemed like a
good idea at the time, we desperately want to fix the mishmash now.
So, getting to the architecture description, the rule is that upper
layers can depend on lower ones but not vice versa:
1) platform, which provides low-level bootstrap and platform-dependent
routines such as the clock.
2) core, which provides platform-independent routines such as the
scheduler and also MD (machine-dependent) low-level routines such as thread context
switching, parts of rumpuser and, eventually, atomic operations. Note:
core should define the interfaces to be implemented by platform which
are used by layers >=#2 (mostly #3 and #5).
3) rumpuser, which is below libc but above platform/core. Notably, the
rump kernel depends on this layer (and transitively the ones below).
4) rump kernel
5) base, which provides common userlevel routines, e.g. for cases where
the ones from the regular libc are not applicable. Conceptually, libc is
also in this layer, but of course we don't implement our own libc.
The actual application(s) would be layer 6. It's nice that we didn't
need 7 layers, because 7-layer designs suck, as famously demonstrated by
some committee designing networking stacks.
If you think the names suck, just invent better ones, and we can change
them without screwing users because they're exposed only internally(!).
Ok, so layers 1 and 2 don't really follow the dependency rule, since "1"
can use features from "2". We could create layer "0" to address this
and put e.g. the atomic ops there, but I'm not sure it's worth the fuss
currently. If it some day turns out to be worth the fuss, hey,
internal interfaces, we can just do it without screwing users ...
So, we get clean, conceptual separation between the "kernel" and
"usermode", and therefore it should be mostly trivial to run the rump
kernel on a given platform without userspace (build goo
notwithstanding). That userspace-less mode of operation may be
interesting for example to small embedded system vendors who mostly want
kernel functionality and don't mind writing some hundreds of lines of
"application" directly against rump kernel syscalls in exchange for
not having to ship "userspace" at all. For example, networked
sensor devices come to mind.
The other separation we gain is independence between the rumprun
unikernel and the rump kernel. Yes, that makes sense ;)
Purely theoretically speaking *winkwink*, if some other OS besides
NetBSD were to be structured to run on top of the rump kernel hypercall
interface (i.e. turned into an anykernel), and a suitable libc were to
also be available, that alternative OS could be offered as a
more-or-less drop-in replacement for the core of the rumprun unikernel.
All of the above layers now exist, and while there's still a bit of
rototilling left, things are starting to look peachy. For example, you
can have a look at how delightfully empty rumprun/platform/xen already is.
- antti