Hi,

Thanks for your answers. I’m trying to give more details below.

> I am a bit worried, since the mirage3 -> mirage4 change (change of 
> compilation strategy) had quite some impact on our reproducible build system 
> (https://builds.robur.coop, https://github.com/robur-coop/orb) - now when 
> opam-monorepo will be deprecated in favour of "dune pkg", my worry is that 
> there's again quite some work needed on our end. It turns out, only mirage 
> 4.2 was in a shape where we could use it for reproducible builds.

> But I'm sure you're aware of the issues that we encountered and pushed 
> upstream to e.g. mirage, and how all the bits (mirage/opam/orb/opam-monorepo) 
> currently fit together, and how "dune pkg" will fit in.

This is a legitimate worry. To avoid falling into the same traps as with 
opam-monorepo:
- the feature is designed by the Dune maintainers, with support and 
contributions from the opam-monorepo and opam maintainers. So the maintenance 
story is much clearer, and upstream is willing to make the changes needed for 
this to work properly.
- the feature is expected to be used by all OCaml users (not just the Mirage 
community and a few others). So there is a stronger incentive to make it 
succeed than there was for opam-monorepo.
- the goal is to teach Dune how to compile any opam package, even if it doesn’t 
use Dune (so no more need for an overlay repository or vendoring).

I expect the situation will probably be a little broken in the first alpha, as 
it’s a major change, but to improve quickly as users start to pick it up, and 
to be maintained consistently over time as it’s a first-class feature of the 
OCaml Platform UX.

You can browse the “package management” tag on GitHub to see progress: 
https://github.com/ocaml/dune/issues?q=label%3A%22package+management%22. There 
is a lot going on, but here are a few highlights:
- how Dune plans to build opam packages (with support for cross-compilation 
via Dune workspaces, like opam-monorepo does): 
https://github.com/ocaml/dune/issues/7096
- how all the opam features are integrated into Dune build plans: 
https://github.com/ocaml/dune/issues/8096
- how the tag/commit of the opam-repository remote is kept in the lock files 
for reproducibility: https://github.com/ocaml/dune/issues/8463
- how the Dune rules for building opam packages can be made reproducible: 
https://github.com/ocaml/dune/issues/8240
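
To give an idea of what this could look like in practice (the lock-file format 
is still in flux, so treat the following as a rough, hypothetical sketch rather 
than the final syntax): `dune pkg lock` materialises the solver output as a 
`dune.lock/` directory with one file per package, roughly:

```
; dune.lock/foo.pkg -- hypothetical package name, URL and checksum;
; the exact fields and pforms are assumptions, not final syntax
(version 1.2.3)

(source
 (fetch
  (url https://github.com/example/foo/archive/v1.2.3.tar.gz)
  (checksum md5=00000000000000000000000000000000)))

(build
 (run dune build -p %{pkg-self:name} @install))
```

Everything a rebuild needs (sources, checksums, build commands and, with issue 
8463 above, the opam-repository commit) would then live in plain files under 
version control, which is exactly the property a reproducible build system 
like orb wants.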

I don’t think the integration with Orb has been discussed in detail at this 
stage. However, `dune pkg` uses the opam library and already needs precise 
dependency specifications to make the caching of build rules work reliably. So 
I suspect that won’t be too difficult; but yes, I realize this means planning 
work to update orb. I’ve opened 
https://github.com/ocaml/dune/issues/9548 to discuss the scope of the work.

>> Regarding mirage/functoria, my general feeling is that while the CLI tool 
>> was initially valuable for gathering an ecosystem of libraries, nowadays, it 
>> is less clear if this is still required. Right now, most of the tool's 
>> complexity is handling the installation of packages needed for a specific 
>> target/combination of devices. This will no longer be needed if the build 
>> system can do this instead. Ideally, any OCaml application (following a few 
>> design principles) could be compiled to a unikernel simply using Dune, as 
>> envisioned by the [Workflow 
>> W12](https://ocaml.org/docs/platform-roadmap#w12-compile-to-mirageos) of the 
>> OCaml Roadmap. However, there is no existing design on how this should work 
>> at this stage. So, before starting this, is that the right direction for the 
>> mirage tool?
> My experience with Mirage - the tool - is that it does various things at once:
> - figuring out dependencies and requirements of MirageOS devices, putting 
> them into a boot order (what is generated as main.ml)
> - selecting target-specific implementations (still, I thought someone wanted 
> to revise this to use "dune variants", but haven't seen any demonstration 
> thereof)
> - command line arguments at configure and boot time (here, there has been 
> various recent discussions, https://github.com/mirage/mirage/issues/1422, and 
> a huge amount of reshuffling and implementation work by yourself -- which 
> unfortunately doesn't seem to be ready yet (merged onto the main branch, but 
> take a look at the regressions https://github.com/mirage/mirage/issues/1479 
> https://github.com/mirage/mirage/issues/1483), and looks slightly abandoned
> - cross-compiling/linking (using ocaml-solo5 with solo5 cross-compilation 
> shell-script, and opam-monorepo to construct a monorepo)
> 
> Out of these 4 items, I'm not sure what "dune -x mirage" will attempt to 
> solve. My goal is to make mirage - the tool - less smart about what it 
> attempts to achieve, but I don't think that moving these bits into dune would 
> be beneficial.

Good list :-)
1. I agree we need some kind of metadata here. But I’m not sure having a 
complex eDSL is the right approach anymore. We might as well extract the 
metadata (what packages exist, what devices they define, what parameters they 
take) and provide a simpler way to combine them. I’m not convinced we need to 
expose this to end-users in the way it is today.
2. Dune variants were indeed supposed to fix this. They have a few limitations 
(the main one being the loss of cross-module inlining), but my hope is that 
they are the way to do multi-platform development in OCaml in the longer term 
(see the first sketch after this list).
3. I am not convinced that exposing a complex CLI is the way to go. I’d be in 
favor of letting people write configuration files (so you can store them in 
your repo), either using a standard format (did I hear YAML? :p) or just 
putting this in your dune file (see the second sketch after this list). But 
this is long-term. In the short term, I plan to unblock my patches by finding 
some time over Christmas to work on this. Longer-term, we probably need a 
combination of macros/MetaOCaml instead of re-implementing our own magic. It 
would be nice to explore the design space a bit more here.
4. This is exactly what `dune -x mirage` will replace initially (with maybe 
some integration of the target/device parameters from point 3).
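
To make point 2 concrete, here is a minimal sketch of dune’s virtual 
libraries, the mechanism that “variants” builds on. The `virtual_modules` and 
`implements` fields are real dune features; all the library names below are 
hypothetical:

```
; a virtual library declaring the abstract device interface
(library
 (name net_device)               ; hypothetical name
 (virtual_modules net_device))

; one implementation per target (hypothetical names)
(library
 (name net_device_solo5)
 (implements net_device))

(library
 (name net_device_unix)
 (implements net_device))

; an executable selects an implementation simply by depending on it
(executable
 (name unikernel)
 (libraries net_device_solo5))
```

The inlining limitation mentioned above comes from this abstraction boundary: 
the compiler cannot inline calls across the virtual interface.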
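
And for point 3, what I have in mind (purely hypothetical: no such stanza 
exists today) is that what `mirage configure` currently takes on the command 
line could live in a file in your repository, for instance as a dune-flavoured 
stanza:

```
; purely hypothetical sketch: none of these fields exist today
(mirage
 (target hvt)           ; instead of `mirage configure -t hvt`
 (keys
  (ipv4 "10.0.0.2/24")  ; configure-time key
  (logs debug)))        ; boot-time default, overridable at boot
```

The point is less the concrete syntax than having the configuration versioned 
alongside the code instead of encoded in a command-line invocation.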

>> ### Targets
>> The principal target backend for MirageOS nowadays is Solo5. This is a solid 
>> backend, which has been audited and optimised for security. It is also 
>> relatively simple to add new devices given the by-design low-complexity 
>> approach of its device model. However, while solo5 is today the most secure 
>> unikernel "runtime", I also feel it has issues hindering potential changes. 
>> For one, it is slow -- the device model is not meant for high-speed I/O,
> There have been contributions and attempts to implement that - see 
> https://github.com/solo5-netmap/solo5/tree/netmap as example. I don't quite 
> get the "has issues hindering potential changes”.

I’ve listed a few of these issues later (slow I/O, lack of maintenance). I’m 
not saying fixing this is impossible; I’m just saying that I don’t feel we 
have the maintenance momentum to do it right now. But I’m happy to be proven 
wrong :-)

Regarding Netmap and other Solo5 extensions, I’m interested to hear what went 
well and what went wrong. Why wasn’t this merged? Takayuki, do you think the 
Netmap approach was the right way to go there? How does Netmap perform in a 
virtualized environment? I think people also discussed a DPDK/eBPF-based I/O 
at one point. How does that fit in? (I haven’t followed closely what was 
happening in this space.) A good topic to discuss during our next Mirage call 
:-)

>> and there is no support for SMP; for most use cases, it is not an issue, but 
>> for others we are looking at, it can be.
> There has been in the early days some fork from solo5 called hermitcore that 
> added SMP. I'm curious why you didn't pick that up for your mail.

I didn’t know that HermitCore was based on solo5! Nowadays they compile Rust 
to Unikraft.

>> The other one is that the device model is very simple (for good reasons) and 
>> challenging to extend to new devices (see below for more detail). In an 
>> ideal world, this could be fixable, but there are also very few courageous 
>> active maintainers, so any changes - like moving to OCaml5 - are complex to 
>> make.
> Hmm, one thing clearly is solo5 lacking maintainers and contributors. The 
> other thing is that ocaml-solo5 with its minimal (no)libc surely needs 
> adaption for the OCaml5 runtime rewrite (see the PR that has been around for 
> ages). Now, when you switch to unikraft, I'm not entirely sure what your 
> tradeoff is? Does unikraft support SMP? Did you evaluate in detail the 
> trusted code base differences between solo5 and unikraft?

There are two timelines here. In the short term, I fully agree we should start 
by moving solo5 to single-core OCaml 5; that PR has been lingering for too 
long. I’ll try to see if some people familiar with the OCaml 5 runtime could 
review it early next year.

And in the medium/longer term, I would like to explore alternative options to 
solo5 (and maybe come back to solo5 if the alternatives are not great).

So far, Unikraft has demonstrated nice momentum, with lots of maintainers (and 
lots of quality contributions). We also have funding to explore the 
integration with the Unikraft team (the grant we got accepted was in 
collaboration with UPB, where Unikraft is developed). And yes, there is 
support for SMP (this is pretty recent, so it’s unclear how stable it is); for 
instance, there is work happening here: 
https://github.com/unikraft/lib-pthread-embedded

I haven’t looked at their codebase directly yet, but I’ve heard lots of good 
comments regarding the general quality and robustness of their C code. 
However, I’ve also heard that their current focus is on portability and 
performance. The security roadmap is progressing 
(https://unikraft.org/docs/concepts/security#unikraft-security-features / 
https://github.com/orgs/unikraft/projects/32/views/1), but again it’s unclear 
what the ETA and quality will be. I would expect this to be part of evaluating 
whether Unikraft is a good fit for Mirage.

>> ### Devices and Libraries
>> There are three areas that we would like to focus on (or continue to focus 
>> on) in the next couple of years.
>> First, we still believe there are better abstractions for storing data than 
>> POSIX. Hence, we are continuing to develop and improve Irmin. We are 
>> currently porting `irmin-pack` to MirageOS (the backend of Irmin used by the 
>> Tezos blockchain to store its ledger history) via the 
>> [Notafs](https://github.com/tarides/notafs) project. Notafs is a pseudo 
>> filesystem for Mirage block devices. It can handle a small number of large 
>> files. While the limited number of filenames is unsatisfying for general 
>> usage, it can be used to run the irmin-pack backend of Irmin, which only 
>> requires a dozen huge files. By running Irmin, one gets for free a 
>> filesystem-on-steroid for MirageOS: it supports an arbitrarily large number 
>> of filenames; is optimised for small and large file contents; performs file 
>> deduplication; includes a git-like history with branching and merging, ... 
>> and it even provides a garbage collector to avoid running out of disk space 
>> (by deleting older commits). Since the Irmin filesystem is versioned by 
>> Merkle hashes, one can imagine deploying reproducible unikernels on 
>> reproducible filesystem states!
> Makes me curious what you try to achieve with it. A "reproducible filesystem" 
> means what exactly? What is the difference to a git repository of your 
> "irmin-pack" (so why use irmin/notafs instead of a git repository)? How do 
> you get a robust file system without "fsync"? What is the performance of 
> notafs compared to a git repository?

Irmin and Git have the same general data model: they both use a Merkle graph 
of objects. There are a few differences though:

- Git supports SHA-1 only (for now; although ocaml-git is functorised over the 
hash implementation, if you want to interface with actual Git repositories, 
like storing your data on GitHub, you don’t have much choice). Irmin-pack can 
use whatever hash you want; Tezos uses BLAKE2b, for instance.
- Git has limited support for large directories, as the space and speed cost 
of traversing a directory is linear in the number of files/sub-directories. 
This is problematic when you start updating files in these large 
sub-directories, as every write will duplicate the directory node and cause 
excessive space usage. Irmin-pack uses something that looks like 
(deterministic and well-distributed) inodes to represent directories, so the 
space and speed complexity is logarithmic: updating one file in a huge 
directory only rewrites the few inodes on the path to it, not the whole 
directory node.
- Both solutions have limited support for large files. But while storing a 
large file in Git leaves you stuck, with Irmin you can switch to an alternate 
way of representing large blobs (for instance, using a rope-like data 
structure).
- The storage strategy is also a bit different:
    - Git has an interesting storage model: it has a “minor heap” (recent 
objects, stored uncompressed) and a “major heap” (optimized pack files that 
store compressed objects). Running a GC compacts the minor heap into a new 
pack file. This is a “stop-the-world” operation, so you can’t read or write 
new objects concurrently. The GC can also trigger a repack, which compacts the 
major heap (unpacking and repacking the existing pack files to remove 
unreachable objects). This is again a “stop-the-world” and very costly I/O 
operation.
    - Irmin-pack has just one heap with a concurrent GC - you can continue to 
read and write efficiently in the store while the GC is running in the 
background, and the GC is efficient enough that it is actually not noticeable. 
This works very well if your history is mostly linear and if you want to keep 
the last X commits and discard the rest. If you want a different GC strategy, 
this won’t work so well.
- The in-memory caching story is also different:
   - Git (and ocaml-git) has no notion of a read or write cache: every 
operation goes directly through the store. You can decide to have an in-memory 
or an on-disk Git store, but doing a bit of both is complicated.
   - Irmin has an in-memory cache (Irmin.Tree) that lazily reads objects and 
caches written objects until the next commit. This is a great way to batch 
write operations and avoid writing garbage to disk (see the sketch after this 
list).
- Tangentially related, but we have ongoing experiments to parallelise 
irmin-pack and move it to direct style using Eio, which are quite promising, 
as the data model scales very well (I suspect that’s also the case for Git, 
but concurrent/parallel writes on the file system for the “minor heap” might 
not scale as well). We’re planning to talk more about this early next year, 
when https://github.com/mirage/irmin/tree/eio gets merged.
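
To make the Irmin.Tree point concrete, here is a minimal sketch of batched 
writes, assuming the Irmin 3.x-era Lwt API (module and function names may 
differ slightly across versions):

```ocaml
open Lwt.Syntax

(* An in-memory KV store over string contents. *)
module Store = Irmin_mem.KV.Make (Irmin.Contents.String)

let info () = Store.Info.v ~message:"batched update" 0L

let main () =
  let* repo = Store.Repo.v (Irmin_mem.config ()) in
  let* t = Store.main repo in
  (* Build up the changes in an in-memory tree: reads are lazy and
     writes are cached, nothing touches the store yet. *)
  let* tree = Store.Tree.add (Store.Tree.empty ()) [ "a"; "x" ] "1" in
  let* tree = Store.Tree.add tree [ "a"; "y" ] "2" in
  (* A single commit flushes the whole batch, so intermediate garbage
     never reaches the disk. *)
  Store.set_tree_exn t [] tree ~info

let () = Lwt_main.run (main ())
```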

And I’ll let the Notafs authors answer about the performance when they are 
back from holidays :-)

Regarding fsync:
- Irmin-pack always stores consistent data on disk, so if your computer 
crashes in the middle of some operations, you are supposed to be able to 
restart, maybe with an outdated version of your data but at least a consistent 
one. So at least we should have consistency (if not, that’s a bug).
- Durability is harder, as it’s pretty unclear what the actual semantics of 
block-device synchronization are, and I don’t think Mirage (or virtio, or the 
Linux kernel(s)) implements these semantics consistently. Maybe it’s a good 
time to resurrect the write barrier/discard patches in 
https://github.com/mirage/mirage-block/compare/main...g2p:mirage-block:barrier-and-discard
 but we first need an idea of how all the existing (virtual) block device 
implementations are supposed to behave (a sketch of what I mean is below). 
Happy to hear people’s opinions here.
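
To make this concrete, here is the kind of signature split I have in mind, 
inspired by the branch above. This is purely a hypothetical sketch: `barrier` 
and `flush` are not part of any existing mirage-block API; they just name the 
two guarantees that “fsync” conflates:

```ocaml
(* Hypothetical sketch: neither [barrier] nor [flush] exists today. *)
module type BLOCK_EXT = sig
  include Mirage_block.S

  val barrier : t -> (unit, write_error) result Lwt.t
  (** Ordering only: all writes issued before [barrier] are applied
      before any write issued after it; no durability promise. *)

  val flush : t -> (unit, write_error) result Lwt.t
  (** Durability: when [flush] completes, previously acknowledged
      writes have reached stable storage. *)
end
```

Knowing which of these the existing virtio/Xen/hvt block implementations can 
actually honour seems like a prerequisite for any durable irmin-pack story.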

Best,
Thomas
