Hi,
tl;dr if you notice new weird behavior in rumprun, it's probably the new
scheduler. file an issue, please.
The thread scheduler rumprun inherited from the Xen Mini-OS was not
designed for a situation with many infrequently running threads. While
that was ok for Mini-OS, a rump kernel creates a large('ish) number of
kernel threads at bootstrap. Since we support variable-sized stacks,
those threads cost us only a few pages of memory each, except when the
scheduler ran. The old scheduler kept everything in a single runqueue
and moved threads to the end of the queue when they were run.
Therefore, the vast majority of scheduling operations involved walking
over a large number of blocked threads before finding actual work to
schedule.
I rewrote the scheduler to use separate run/block/timeout queues for
essentially O(1) scheduling. In a microbenchmark with two
sched_yield()ing threads, the new scheduler gives >50% better
performance for that scientifically "common workload" of incrementing
an integer by one every time a thread runs. It will probably translate
into a few %
better application performance in heavy network use scenarios, so not a
magical enough optimization to glow in the dark, but a few % is always a
few %. Getting closer to beating Linux performance ....
All tests pass on both baremetal and Xen (~200 total runs), but given
that a scheduler is finicky by nature and that our test suite could be
stronger, my confidence that the code will be perfect before a bug
shakedown period is not extremely high. If suspicious problems appear,
do not hesitate to speculatively blame me.
- antti