Dear valued MirageOS developer (or person interested in MirageOS),
as mentioned in the last MirageOS community meeting, and also at the
retreat in Marrakesh, I started to work on removing functors from
MirageOS and using the linking trick instead (compile dependencies
against the interface only, and provide the implementation when
compiling the executable). It was pioneered (as far as I'm aware) by
Daniel Buenzli, and is integrated into dune as "dune variants".
Let me show you the diff for the hello world unikernel as a unified diff:
```
--- a/tutorial/hello/config.ml
+++ b/tutorial/hello/config.ml
@@ -1,6 +1,7 @@
 open Mirage
 
 let main =
-  main "Unikernel.Hello" (time @-> job) ~packages:[ package "duration" ]
+  let extra_deps = [ dep default_time ] in
+  main ~extra_deps "Unikernel" job ~packages:[ package "duration" ]
 
-let () = register "hello" [ main $ default_time ]
+let () = register "hello" [ main ]
--- a/tutorial/hello/unikernel.ml
+++ b/tutorial/hello/unikernel.ml
@@ -1,12 +1,10 @@
 open Lwt.Infix
 
-module Hello (Time : Mirage_time.S) = struct
 let start _time =
   let rec loop = function
     | 0 -> Lwt.return_unit
     | n ->
       Logs.info (fun f -> f "hello");
-      Time.sleep_ns (Duration.of_sec 1) >>= fun () -> loop (n - 1)
+      Mirage_time.sleep_ns (Duration.of_sec 1) >>= fun () -> loop (n - 1)
   in
   loop 4
-end
```
To describe the changes: we no longer apply default_time to the main in
config.ml (the $ operator), and the signature of the unikernel changed
(from "time @-> job" to "job"). The unikernel itself is no longer a functor.
The benefits are that you need less headspace to understand the hello
world, and also editor navigation -- you can directly jump to the
definitions (which you can't in the earlier code, since Mirage_time.S is
abstract, and merlin doesn't know which implementation will be used).
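To make the difference between the two styles concrete, here is a minimal stdlib-only sketch. All module names in it (`TIME`, `Toy_time`, `Mirage_time_stub`, ...) are hypothetical stand-ins and not the real Mirage modules, Lwt is left out, and the module alias only imitates what the build system would do when it selects an implementation.

```ocaml
(* Hypothetical stand-ins: none of these modules are the real Mirage ones. *)

(* The interface both styles rely on (no Lwt, to keep this stdlib-only). *)
module type TIME = sig
  val sleep_ns : int64 -> unit
end

(* A toy implementation that records requested sleeps instead of sleeping. *)
let slept : int64 list ref = ref []

module Toy_time : TIME = struct
  let sleep_ns ns = slept := ns :: !slept
end

(* Functor style: the unikernel is parameterised over the implementation,
   and only the application site decides which one is used. *)
module Hello_functor (Time : TIME) = struct
  let start () =
    for _i = 1 to 4 do
      Time.sleep_ns 1_000_000_000L
    done
end

module Hello_applied = Hello_functor (Toy_time)

(* Defunctorised style: the unikernel names the module directly. In the
   real setup the build system would decide which implementation sits
   behind that name; here a module alias plays that role. *)
module Mirage_time_stub = Toy_time

module Hello_direct = struct
  let start () =
    for _i = 1 to 4 do
      Mirage_time_stub.sleep_ns 1_000_000_000L
    done
end

let () =
  Hello_applied.start ();
  Hello_direct.start ();
  Printf.printf "recorded %d sleeps\n" (List.length !slept)
```

In the second style an editor can resolve `Mirage_time_stub.sleep_ns` to a concrete definition, which is the navigation benefit mentioned above.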
## Motivation
Besides code readability and editor support, I find it important to
evaluate the abstraction mechanisms we use. A functor is really nice
and useful, but not really needed for something that you'll only have
once in your binary -- and let's face it, on solo5 we'll use the solo5
mirage-time implementation, on unix the unix one - there's no mix and
match -- and nobody would ever want to have two different sleep_ns
functions that behave differently in a single unikernel.
The motivation also comes from various people whom I talked to that
wanted to port OCaml libraries over to MirageOS, and discovered they'd
need to restructure the code heavily.
## Other examples
If you're keen to look into other unikernel changes, I opened
https://github.com/mirage/mirage-skeleton/pull/394 as a draft.
## Try it out
If you're keen to explore it for yourself and get your hands dirty,
don't hesitate to create a fresh opam switch and add my opam overlay:
```
opam switch create mirage-time-defunctorised 4.14.2
eval `opam env`
opam repo add defunc https://github.com/hannesm/mirage-time-defunctorised.git
opam install mirage
```
## Implementation
So, under the hood we're now using dune variants - you can take a look at
mirage-time https://github.com/mirage/mirage-time/pull/14 to see how this
is achieved. Due to how `sleep_ns` was implemented earlier, this
currently needs a modification to mirage-runtime
(https://github.com/mirage/mirage/pull/1526), and mirage-solo5 and
mirage-xen need to be adapted along with it
(https://github.com/mirage/mirage-solo5/pull/96,
https://github.com/mirage/mirage-xen/pull/54).
Mirage-time is a virtual library, and has a default implementation (the
unix one), while there's a second one (the solo5 one). Depending on the
target, the mirage tool generates the dependency on mirage-time.unix
or mirage-time.solo5 (see
https://github.com/mirage/mirage/pull/1529/commits/ff4d1a8924158c4145683b884f55ee869518f69f).
I went further and adapted the users:
- mirage-crypto-rng https://github.com/mirage/mirage-crypto/pull/229
- arp https://github.com/mirage/arp/pull/30
- tcpip https://github.com/mirage/mirage-tcpip/pull/515
- mirage-vnetif https://github.com/mirage/mirage-vnetif/pull/36
- charrua https://github.com/mirage/charrua/pull/125
What I noticed is that in arp there was a test case with a Fast_time,
where the implementation of sleep_ns divided the given time by 600 to
run more quickly. I easily found a different mechanism (increasing the
tick frequency from every 1500ms to every 2ms) which resulted in the
test cases succeeding.
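As an illustration of the two mechanisms, here is a minimal stdlib-only sketch. It is not the arp code: `Real_time`, `Ticker_functor`, `Ticker` and the `tick_ns` parameter are hypothetical names, and only the factor of 600 and the 1500ms / 2ms intervals are taken from the description above.

```ocaml
(* Hypothetical sketch, not the arp library's code. *)

module type TIME = sig
  val sleep_ns : int64 -> unit
end

(* Records the requested durations instead of actually sleeping. *)
let requested : int64 list ref = ref []

module Real_time : TIME = struct
  let sleep_ns ns = requested := ns :: !requested
end

(* Before: the test injects a sped-up clock through a functor argument. *)
module Fast_time : TIME = struct
  let sleep_ns ns = Real_time.sleep_ns (Int64.div ns 600L)
end

module Ticker_functor (T : TIME) = struct
  let tick_ns = 1_500_000_000L (* 1500ms *)

  let run n =
    for _i = 1 to n do
      T.sleep_ns tick_ns
    done
end

module Fast_ticker = Ticker_functor (Fast_time)

(* After: the interval is an ordinary optional argument, so the test
   passes a shorter one and no second time implementation is needed. *)
module Ticker = struct
  let run ?(tick_ns = 1_500_000_000L) n =
    for _i = 1 to n do
      Real_time.sleep_ns tick_ns
    done
end

let () =
  Fast_ticker.run 1;
  Ticker.run ~tick_ns:2_000_000L 1; (* 2ms *)
  List.iter (fun ns -> Printf.printf "%Ld\n" ns) (List.rev !requested)
```

Both variants end up sleeping for a few milliseconds per tick; the second one gets there without swapping the time module.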
The full PR for mirage is https://github.com/mirage/mirage/pull/1529
## Downsides
As just outlined, passing a Fast_time isn't possible anymore. But I
couldn't find any use apart from the arp test case - and there I could
revise the code, which is now more readable.
Another question is about extensibility: so what if someone comes up
with a new mirage-time implementation that is faster, uses less memory,
... -- I guess we all agree that we want to use it, and we just merge it
into mirage-time. So, really, we only ever need one mirage-time
implementation in a unikernel! :)
Another axis: what if someone wants to develop a new mirage backend
(let's say unikraft)? Well, you'll have to extend the mirage-time
repository with the new implementation, and the mirage tool to use the
given implementation for the unikraft target. This isn't too different
from what would be needed at the moment, where the mirage tool would be
extended to use the "mirage-time-unikraft" package (NB: dune variants
may only be in the same opam package). I don't see that we're losing
anything here (esp. seeing how MirageOS currently evolves).
## Upsides
- code readability
- editor support
- less frightening for newcomers
- less generated code
- performance? (haven't measured this)
## Future
I'm not yet 100% happy with the result: I think the "~extra_deps" is a
big hammer for what we need -- a simpler way we could have in mirage is
to define "default_time" as a `package` - so it could be added as (in
config.ml):
`main ~packages:[ package "duration" ; time ]`
That `time` would then resolve to "mirage-time.unix" or
"mirage-time.solo5" at mirage configure time (depending on the target
used). Since "Mirage_time" doesn't need any initialization function to
be called, such a lightweight approach would be preferable.
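The resolution step could look roughly like the following sketch. The `target` type and the functions `time_package` and `packages_for` are hypothetical and not part of the mirage tool's API, and only the two implementations named above are covered.

```ocaml
(* Hypothetical types: the mirage tool's real target and package
   representations are richer than this. *)
type target = Unix | Solo5

(* Pick the mirage-time implementation for the configured target. *)
let time_package = function
  | Unix -> "mirage-time.unix"
  | Solo5 -> "mirage-time.solo5"

(* The package list a unikernel would end up with at configure time. *)
let packages_for target = [ "duration"; time_package target ]

let () = List.iter print_endline (packages_for Solo5)
```

The point of the sketch is that the choice is a plain lookup on the target, with no device to connect or initialize.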
The downside is that our "start" needs to be delayed so other top-level
functions have a chance to be set up (e.g. the Mirage_logs reporter). I
think we need a different solution here.
## Comments?
Please feel free to comment in this mail thread. If you have opinions
about the changed code, please comment on the individual PRs linked above.
I'm aware we have various other abstraction mechanisms and build system
tricks in OCaml (first class modules, conditional compilation, ...). My
current plan is to use dune variants since it looks like a nice
lightweight approach for the flexibility we need. Feel free to go down a
different path and let us then compare the results (but please refrain
from discussing what could be nicer with a different approach that
hasn't been tried out - too many unknowns won't help the discussion).
## Future II
Mirage-time was only the beginning ;) If there's positive feedback, I
have unfinished and unclean changes to remove Random and Clock as
functor arguments.
I believe it will make MirageOS more approachable and easier to
adopt for OCaml programmers (and others as well). From what I remember,
the plan with mirage 4 was to try out the dune variants -- so now, 2
years after the 4.0.0 release, here we are :)
My belief is that other packages, such as mirage-net and mirage-block,
could benefit from using dune variants as well. I'm happy to continue
that line of work and figure out which things are not doable with this
approach.
Thanks for reading until the end :D
Hannes