Hello again,
our next meeting is on May 19th (in 2.5 weeks) -- live from the retreat
(let's see how that goes). We'll be at https://meet.jit.si/MirageOS --
see our shared pad at https://pad.data.coop/To6IOSeNSOK9kFVlgo7XWw?both#
for notes and agenda (add your talking points there) :)
Here are the notes from our meeting last Monday.
Have an awesome day,
Hannes
## Meeting April 28th 10:00 - 12:00 CEST
- Participants: Fabrice, Pierre, Reynir, Hannes, Sam, Romain
### Defunctorization work
Hannes worked on defunctorizing Mirage. It seems to work well, and there
have been no complaints.
Hannes: should we defunctorize the network stack next and the block devices?
Pierre: is very happy with the current state of defunctorization. For
the network stack, at least for Xen it will be tricky since we need two
network interfaces -- for backend and frontend.
Romain: Did an experiment about MirageOS and miou-solo5, and has a TCP
stack without functors -- still an experimental project
Reynir: If I understand correctly, it goes a bit further, with no functors at all
Romain: https://git.robur.coop/dinosaure/experiment-miou-solo5
Romain: and a little article here:
https://blog.robur.coop/articles/utcp_and_effects.html
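For readers who did not follow the defunctorization work, here is a minimal sketch of the difference being discussed. It is not the actual Mirage code: `CLOCK`, `Make_functorized`, `Clock`, and `Unikernel` are hypothetical names chosen for illustration only.

```ocaml
(* Hypothetical device signature, for illustration only. *)
module type CLOCK = sig
  val now_ns : unit -> int64
end

(* Functorized style: the unikernel is parameterised over its device,
   and can be instantiated several times with different implementations. *)
module Make_functorized (C : CLOCK) = struct
  let uptime_message () = Printf.sprintf "up for %Ld ns" (C.now_ns ())
end

(* Defunctorized style: the unikernel refers to one concrete module
   directly; which implementation sits behind that name is decided
   when the unikernel is built. This one is a stub returning 42. *)
module Clock : CLOCK = struct
  let now_ns () = 42L
end

module Unikernel = struct
  let uptime_message () = Printf.sprintf "up for %Ld ns" (Clock.now_ns ())
end
```

One plausible reading of Pierre's concern: with a single concrete module there is only one implementation behind the name, whereas a functor can be applied once per interface, which matters when a unikernel needs both a backend and a frontend network device.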
### Unikraft
Sam: the work on unikraft is close to being published
Sam: performance for network is pretty good, for the block device the
situation is less clear
Fabrice: block solo5 outperforms the unikraft
Hannes: from their website (unikraft), they claim much higher
performance than solo5
Fabrice: network is fine, it is a little faster
Sam: one of the next steps is to have a real benchmark with network
Romain: do you know the performance issues with block devices and unikraft?
Fabrice: it uses a virtio device, quite different from the network one.
It performs badly if you operate on one sector at a time -- much better
if you operate on multiple sectors
Fabrice: you can also have multiple operations in flight with
unikraft, which helps to mask the poor single-sector performance
Romain: did some performance benchmarks in terms of mirage-tcpip and
utcp, and there's a large gap in the benchmarks (using iperf3, we're
near 1Gb/s with mirage-tcpip -- and with utcp 900 Mb/s)
Fabrice: our network test served one file
Romain: we have a really old unikernel which is compatible with iperf2,
but the experiment above has iperf3 support
Romain: mirage-tcpip is at 1 Gbit/s and utcp is at 900 Mbit/s in our
experimentation. The main question is about scheduling now (where miou
differs from what lwt does)
Sam: for getting it released, we populate the different repositories
under the mirage organization, and open PRs for the repositories where
we have unikraft patches
### utcp https://github.com/robur-coop/utcp
Hannes: TCP/IP stack based on a formal model (HOL4, SML; manually
translated from SML to OCaml -- recently we found a mistranslation caused
by a missing set of parentheses).
Hannes: mirage-tcpip: it works very well, but as discussed on the
mailing list it has obscure semantics in certain cases. It is deeply
embedded in the Lwt monad and has memory leaks.
Hannes: Utcp has a pure, functional core with unit tests. Recently
worked on performance with Romain. µtcp still lacks congestion control
and newer features of TCP such as selective acknowledgement (which
mirage-tcpip also doesn't implement). µtcp started off several years
ago. We (in Robur) run it in production machines, and we have mostly
worked on correctness and resource usage, and we still have correctness
issues (failed assertions) and resource usage issues. Performance-wise,
µtcp tries to stick to congestion control and window sizes while
mirage-tcpip doesn't try to adhere to a specific congestion control
algorithm or bound the memory usage. The growing interest in µtcp is also
due to it not being tied to lwt, which allows for other schedulers.
Hannes: utcp is meant to replace only mirage-tcpip's src/tcp (ocamlfind
tcpip.tcp)
Romain: for the miou TCP/IP stack, we worked on a new IP stack (which is
different from mirage-tcpip's one)
### Mirage CI
Hannes: OCaml 5.4 support in ocaml-solo5 (ocaml-unikraft)? -- two
different repositories but shared patches, also 5.4 has most patches
upstream \o/ -- https://github.com/ocaml/ocaml/pull/13810 (maybe we can
ask Antonin, Gabriel, Florian at the retreat whether that can make it to
5.4)
Hannes: OCaml 5.3 is not yet tested in the Mirage CI
Hannes: there's a PR from Tim about fixing the OCaml 5.2.1 support
https://github.com/ocurrent/mirage-ci/pull/51
### Remove bigarray from Cstruct
Pierre: experimented with branches from Hannes that use cstruct where
the buffer is Bytes.t
Pierre: updated io-page to not rely on cstruct, but use bigarray directly
Pierre: this currently works with QubesOS, a hello world runs nicely
Pierre: ran into issues when running the network stack, mirage-tcpip is
tightly coupled with cstruct and relies on the fact that cstruct is
based on bigarray (esp. C stubs)
Pierre: may use utcp to check whether that'll be good enough / work
Pierre: need a careful review of the io-page API, since all its
dependencies need updates
Hannes: the only C code is the checksum code, no? in utcp we have pure
OCaml checksum code, and we could use that in mirage-tcpip (we use some
trick about bigarray to outperform the computation)
### Ownership of buffers on the IP level
Romain: another experiment with the IP stack, with mirage-tcpip you have
cstruct everywhere -- difficult to change to something else. with
cstruct you want to not copy when you have a fragmented packet -- with
bigarray and cstruct you can have a subview (without copying). the idea
is to have a bigarray directly when you have a defragmented packet, and
if it is fragmented you get a copy of the bigarrays
Romain: with mirage-tcpip you have the question about ownership and
fragmented/defragmented: do you have the ownership or not?
Romain: My intuition is at the IP level, we should have a variant
between a bigarray (defragmented) and a string (fragmented) -- if you
have a bigarray you should care about the ownership
Hannes: in practice, 99.9999% of IP packets are not fragmented
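A minimal sketch of the variant Romain describes, assuming the unfragmented case hands out a bigarray view and the fragmented case hands out a freshly copied string. The type and constructor names (`buffer`, `payload`, `Whole`, `Reassembled`) and the helper functions are hypothetical and not taken from any existing stack.

```ocaml
(* Byte buffer backed by a bigarray, as used underneath cstruct. *)
type buffer =
  (char, Bigarray.int8_unsigned_elt, Bigarray.c_layout) Bigarray.Array1.t

(* Hypothetical IP-level payload type. *)
type payload =
  | Whole of buffer
      (* Not fragmented: a view into the receive buffer, no copy.
         The receiver has to care about ownership: the data must be
         consumed or copied before the buffer is reused. *)
  | Reassembled of string
      (* Was fragmented: the pieces were copied during reassembly,
         so the result is immutable and owned by the receiver. *)

let length = function
  | Whole b -> Bigarray.Array1.dim b
  | Reassembled s -> String.length s

(* Copies only in the [Whole] case; [Reassembled] is already a string. *)
let to_string = function
  | Whole b ->
      String.init (Bigarray.Array1.dim b) (fun i -> Bigarray.Array1.get b i)
  | Reassembled s -> s
```

Since nearly all packets are unfragmented, the common path in this sketch stays copy-free, and the type makes it explicit where ownership needs attention.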
### Checksum code - performance investigations
Hannes: maybe we should measure the utcp checksum code (OCaml code) and
mirage-tcpip checksum code (C code)
Romain: it is tricky due to memory layout, and you also have to take
care that the OCaml 5 C FFI is different (and introduces a memory
barrier), so check what your environment is before doing the benchmark
(CPU cache)
Hannes: maybe the checksum code could then be in a separate, independent
package used by both utcp and mirage-tcpip
Hannes: question about the performance focus: arm? x86? 64 bits only?
also 32 bits?
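For reference, a plain OCaml version of the Internet checksum (RFC 1071) could look like the sketch below. It is neither utcp's nor mirage-tcpip's actual code and it omits the bigarray trick mentioned above; the function names are hypothetical. It assumes OCaml >= 4.13 (for `String.get_uint16_be`) and 63-bit integers, so the running sum cannot overflow for packet-sized inputs.

```ocaml
(* One's-complement sum of 16-bit big-endian words (RFC 1071).
   Assumes a 64-bit platform: on 32-bit OCaml the accumulator could
   overflow for large inputs and would need folding inside the loop. *)
let ones_complement_sum (buf : string) : int =
  let len = String.length buf in
  let sum = ref 0 in
  let i = ref 0 in
  while !i + 1 < len do
    sum := !sum + String.get_uint16_be buf !i;
    i := !i + 2
  done;
  (* An odd trailing byte is padded with a zero byte on the right. *)
  if len land 1 = 1 then
    sum := !sum + (Char.code buf.[len - 1] lsl 8);
  (* Fold the carries back into the low 16 bits. *)
  while !sum lsr 16 <> 0 do
    sum := (!sum land 0xffff) + (!sum lsr 16)
  done;
  !sum

(* The checksum is the one's complement of that sum, in 16 bits. *)
let checksum (buf : string) : int =
  lnot (ones_complement_sum buf) land 0xffff
```

A benchmark comparing this kind of code against a C stub would still need to control for the points Romain raises (memory layout, FFI cost, CPU cache).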
### tcpip handling of RST
Pierre: curious whether there was more communication about uTCP
Hannes: there wasn't
Pierre: I'll try to reach out to them; one of the advisors is in my
research group