The notes from yesterday's meeting are below. Thanks to everyone who
participated.
We discussed scheduling and found that Monday mornings are a better fit, so
the next meeting is on Nov 25th, 10:00 - 12:00 CET. Please add your
agenda items to https://pad.data.coop/To6IOSeNSOK9kFVlgo7XWw?both#
Best,
Hannes
- Topic: (Hannes, Pierre) cstruct and performance (see e.g.
https://github.com/mirage/mirage-net/pull/25)
- Participants: Pierre, Sam, Hannes, Anh
### Minutes
- Anh is a PhD student (advised by others & Pierre) who works on
opportunistic firewalling - using MirageOS as a firewall
- Also wants to work on an IDS in MirageOS, similar to suricata
- Pierre is an associate professor at University of Rennes, MirageOS
core team
- mainly works on the qubes-mirage-firewall
- joined MirageOS since he uses QubesOS on his main laptop
- Hannes is working full-time on MirageOS
- Works in Robur Coop with other people
- Push MirageOS into production
- Works on various applications for MirageOS (VPN, dnsmasq, ...)
- Has done some performance investigations, and would like to improve
the performance of MirageOS to convince more people to use it
- Sam has worked at Tarides for ~2 years
- has worked on MirageOS since last year
- mainly to get MirageOS working on OCaml 5
- also to get MirageOS working on unikraft (replacing solo5)
- since solo5 is somewhat lacking in maintenance, and also for performance
(unikraft has batch IO), maybe some day also multicore
- Cstruct & performance
- Cstructs are important for some backends (Xen) where non-moving
memory areas are shared among domains
- Cstructs are expensive to allocate (they use dlmalloc), and this works
against the GC (only the finalizer is used to free the memory)
- Some work (e.g. in mirage-crypto) has shown performance improvements
(2.5x - 3x) from the Cstruct->{string, bytes} swap
- Sam: for some operations we would need to copy
- Sam: for a packet receive / send ring, we need non-moving memory as
well
- Pierre: in the qubes-mirage-firewall we do a lot of copies anyways,
e.g. NAT
- Pierre: it is "probably"(TM) not too painful to move from Cstruct.t
to bytes
- Pierre: a big issue is the finalizer, it is unclear when it is
called, and the memory is fragmented a lot
- Hannes: API-wise, the mirage-net receive function takes a callback
(Cstruct.t -> unit Lwt.t), where mirage-net allocates a buffer and
passes it to the callback
- Hannes: and the send function gets a `size` hint, and allocates a
buffer to be filled
- Hannes: what about ownership? Should the mirage-net receive, once
the callback has finished, reclaim ownership and reuse the piece of memory?
- Pierre: maybe an opaque type is the way to go?
- Pierre: should the send be a write-only buffer, the receive a
read-only buffer? -- Hannes: there's Cstruct_cap that uses phantom types
for it
- Sam: maybe moving to an abstract API would help to benchmark the two
options and test them on real workloads
- Hannes: next to types (API), the question is about ownership (and
who is responsible for allocating / freeing the memory)
- Hannes: asking the question the other way around, from the
application: what should be done for a packet that is received at the
firewall?
- Hannes: from my point of view, the perfect firewall should not
copy: once a packet is received (given a ring buffer of received
packets), this packet should be copied (and possibly modified, if NAT
needs to be done) to an element of the send ring buffer -- there
shouldn't be an allocation of the entire packet in the code
- Pierre: we should avoid any allocations, and also all copies
- Pierre: started to use a bridge firewall which doesn't copy, and
avoiding copies is good for performance (e.g. for solo5); on xen we can't
really avoid the copy (on xen, you either need to copy or you would
need to reconfigure with which VM you share the memory)
- Hannes: with the ring buffer approach, we can't really avoid the
copying -- it would mean a lot of bureaucracy, and the ownership and
lifetime of a buffer in the ring buffer isn't well-specified anymore
- Hannes: given xen and uring, the ring buffer would need to contain
non-moving memory, for solo5-hvt/spt it shouldn't matter -- but is there
a difference between using bytes or bigarray for such a ring buffer?
- Hannes also tried to write a library that has an abstract type t
and is backed by either bytes or bigarray memory (and the implementation
can be selected at compile time - so no functorisation, but you get a
B.get_uint8 etc. directly), but the issue is that exposing the raw
memory from bigarray makes the OCaml runtime unhappy -> segmentation fault
- Florian mentioned each OCaml value needs to have a tag/header, so
we'd need to allocate a bit of memory before the page-aligned stuff, and
put the header in there
- so that may be a path to investigate
- Pierre: not sure how the Cstruct.split works, esp. with the header
-> it creates a new OCaml variable with the same starting buffer
address, and different offset and length
- Hannes: I can see multiple paths to investigate:
- virtio firewall using bytes/string vs cstruct
- virtio firewall using a ring buffer (i.e. allocate not for each
packet) vs allocate for each packet
- Hannes: also we (well, Romain) figured that allocations of < 255
bytes are very cheap if you allocate bytes/string (they are in the minor heap)
- so the high-performance MirageOS unikernel would allocate data in
chunks of < 255 bytes
- Hannes: Cstruct is more than the memory region: we have an offset
and length as well, thus replacing Cstruct.t by Bytes.t removes some
safety, since we don't carry around the offset anymore (and thus the
ethernet layer can hardly pass on the payload (Cstruct.shift buf 14))
- Hannes: what is the path forward? do we have a concrete application
that we want to use for performance investigations?
- Pierre: the qubes-mirage-firewall would be a great study, since
there are users (who sometimes complain about the performance)
- Pierre: a huge performance benefit in the qubes/xen setting
would be segmentation offload
- Pierre: started to measure the virtio (simple-fw) with no cstruct
(see https://github.com/palainp/simple-fw/tree/no-cstruct (doesn't yet
compile, needs some further work on mirage-tcpip))
- Pierre: likes the idea of not trusting the upper layer, an abstract
type would be great
- Hannes: the abstract type could as well be Cstruct.t, and have an
implementation that uses bytes instead of bigarray.t for switching ;)
- Pierre: we had this in 2022, but the qubes-mirage-firewall fails to
compile with it
- Hannes: let's use that cstruct-backed-by-bytes branch
(https://github.com/hannesm/ocaml-cstruct/tree/no-bigarray) and test the
simple-fw with virtio on it :) [and for now, ignore the
qubes-mirage-firewall compilation issues]
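
To illustrate the point above that Cstruct.t carries an offset and a length in addition to the memory region, here is a minimal sketch of a bounds-checked view over Bytes.t. The module name `View` and all of its functions are hypothetical and written only for illustration; this is neither the ocaml-cstruct API nor the code in the no-bigarray branch.

```ocaml
(* Hypothetical module: a bounds-checked view over Bytes.t that keeps the
   offset/length bookkeeping a bare Bytes.t would lose. *)
module View : sig
  type t
  val create : int -> t
  val of_bytes : ?off:int -> ?len:int -> Bytes.t -> t
  val length : t -> int
  val shift : t -> int -> t
  val get_uint8 : t -> int -> int
  val set_uint8 : t -> int -> int -> unit
end = struct
  type t = { buf : Bytes.t; off : int; len : int }

  let of_bytes ?(off = 0) ?len buf =
    let len = match len with Some l -> l | None -> Bytes.length buf - off in
    if off < 0 || len < 0 || off + len > Bytes.length buf then
      invalid_arg "View.of_bytes";
    { buf; off; len }

  (* A fresh zero-filled buffer, viewed in full. *)
  let create n = of_bytes (Bytes.make n '\000')

  let length t = t.len

  (* Drop the first [n] bytes of the view; the underlying memory is shared. *)
  let shift t n =
    if n < 0 || n > t.len then invalid_arg "View.shift";
    { t with off = t.off + n; len = t.len - n }

  let check t i =
    if i < 0 || i >= t.len then invalid_arg "View: index out of bounds"

  let get_uint8 t i = check t i; Bytes.get_uint8 t.buf (t.off + i)
  let set_uint8 t i v = check t i; Bytes.set_uint8 t.buf (t.off + i) v
end

let () =
  (* An ethernet layer can hand the payload to the next layer without
     copying, and the upper layer cannot reach back into the header. *)
  let frame = View.create 64 in
  View.set_uint8 frame 14 0x45;
  let payload = View.shift frame 14 in
  assert (View.length payload = 50);
  assert (View.get_uint8 payload 0 = 0x45)
```

Since `View.t` is abstract, the backing store could in principle be swapped (bytes or bigarray) without touching callers, which is the benchmarking setup Sam suggested.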
#### OCaml 5 and ocaml-solo5
- how should we move?
- OCaml 5 has a different GC with a different memory profile
- Virgile tested that PR on the mirage website, redirecting every other
flow to the OCaml 5 unikernel
- The behaviour was different under lots of stress (with aborted
connections) -- it would be interesting to see whether under normal
conditions there's a difference?
- With OCaml 5.3, there have been various GC fixes, and there are big users
like Coq/Frama-C, so maybe it is time to look into it again
- With OCaml 5, we need to call Gc.compact manually
- Pierre: it compiles fine for the qubes-mirage-firewall, but doesn't
have any long-term runs (only ~10 hours) - with a slightly improved
memory bandwidth
- Pierre: also tested on dns-resolver, which died due to memory
fragmentation (so we should move the bytes)
- Pierre: with OCaml 5, it uses more memory at startup
- Pierre plans to test simple-fw with bytes x bigarray on OCaml 4 x OCaml 5
- Sam: we should move forward to test it in real conditions
- Sam: we should merge and release, maybe something like the ocaml
compiler with release candidates - so it is available, but you have to ask
for it explicitly
- Hannes: maybe not even needed, since ocaml-solo5 depends on the OCaml
version of your switch, and so if you're using OCaml 5, you'll get the
ocaml-solo5 compatible with OCaml 5, and if you're using OCaml 4, you'll
get the ocaml-solo5 compatible with OCaml 4
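
Regarding the manual compaction mentioned above, here is a minimal sketch of how one could trigger it and observe the effect, using only the standard library `Gc` module. The function name `compact_and_report` is hypothetical; when and how often to call it (e.g. from a periodic task in the unikernel) is left open, and the reported numbers depend on the OCaml version and runtime.

```ocaml
(* Hypothetical helper: run a compaction and return the major heap size
   (in words) before and after, as reported by Gc.quick_stat. *)
let compact_and_report () =
  let before = (Gc.quick_stat ()).Gc.heap_words in
  Gc.compact ();
  let after = (Gc.quick_stat ()).Gc.heap_words in
  (before, after)

let () =
  (* Allocate and drop some data so there is something to reclaim. *)
  let junk = ref (List.init 100_000 (fun i -> Bytes.create (i mod 512))) in
  junk := [];
  let before, after = compact_and_report () in
  Printf.printf "heap words: %d -> %d\n" before after
```

Comparing such numbers between an OCaml 4 and an OCaml 5 build of the same unikernel would be one way to approach the memory-profile question raised above.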