On 18 Apr 2024, at 23:55, Andrew Warkentin <andreww...@gmail.com> wrote:
> 
> Somebody in the Zoom call the other day (IIRC it was Gernot) said that
> seL4 IPC is considerably faster than QNX IPC, but it seems that QNX is
> somehow faster in a simple benchmarking experiment in which a client
> repeatedly sends messages to a bit-bucket server, using the TSC to
> determine timing.
> 
> I'm running both QNX and seL4 under identical VirtualBox
> configurations with a single core. I'm using a QNX 6.3.1 x86-32 image
> i have in my collection (x86-64 support was only added in 7.0; this is
> the most recent image I have with dev tools installed). The seL4
> version I'm using is from February 2022 and is built for x86-64 with
> MCS enabled (the priority of the threads is 255, and both the
> timeslice and budget are the initial thread default of 5 ms).
> 
> My benchmark times a loop of 10000 RPCs; for QNX it's just using the
> regular read() function on /dev/null or a custom server that also acts
> like /dev/null but forces a copy of the message into a server-provided
> buffer, and for seL4 I'm directly calling Send()/Recv()/Reply(). I've
> also tested Linux's /dev/null, and QNX's implementation actually seems
> somewhat faster than it (for the system /dev/null; the custom server's
> best case seems only slightly worse than under Linux).
> 
> Under QNX and Linux I'm running a C implementation of the benchmark
> and under seL4 I tried both C and Rust implementations (which have
> very similar performance). The messages on QNX and Linux are 100
> bytes, and on seL4 they are 2 words, which should be hitting the fast
> path.
> 
> Here are the best and worst cases I got:
> seL4: 0.114-0.136s
> QNX, custom null server: 0.006-0.063s
> QNX, system null server: 0.002-0.022s
> Linux: 0.004-0.005s
> 
> Has anyone else benchmarked seL4 IPC against that of QNX and gotten
> better (or comparable) results for seL4, unlike what I got? Is
> synchronous IPC just less optimized under seL4 than QNX?

I’m not completely sure what you’re actually measuring here (actually I’m 
totally unsure 
what your setup really is, I can’t make sense of the above description in terms 
of
what the seL4 system is doing).
But measuring performance of a kernel inside a virtual machine is somewhat 
pointless at best.

In any case it seems clear that you’re *not* measuring raw seL4 IPC.
Your quoted cost of a round-trip on seL4 is say 120ms for 10,000 round trips,
or 12µs for one round trip. You don’t specify the clock rate, but that would be
somewhere between 12,000 and 50,000 cycles, when a round-trip IPC on x86 with 
MCS is <900 cycles

Note that using Send()/Recv()/Reply() is the wrong way to use seL4 IPC, see
https://microkerneldude.org/2019/03/07/how-to-and-how-not-to-use-sel4-ipc/
However, this will only be about a factor two slower than using the correct 
Call()/ReplyWait() incantation. 

In fact, while it is in principle possible to outperform seL4 in IPC, it is 
impossible to do so
by more than a few percent, as the seL4 IPC cost is only about 20% higher than 
the
pure hardware cost of kernel trap, minimal state save+restore and switching 
page tables.
This has been independently established in a peer-reviewed paper, see
https://dl.acm.org/doi/pdf/10.1145/3302424.3303946

Everything I’ve ever seen from QNX is that their IPC costs is many thousands of 
cycles 
(probably >10,000), and at least an order of magnitude slower than seL4’s.

Gernot

_______________________________________________
Devel mailing list -- devel@sel4.systems
To unsubscribe send an email to devel-leave@sel4.systems

Reply via email to