On Tuesday, November 28, 2023 at 1:20:06 AM UTC-5 Waldek Kozaczuk wrote:

Hi,

It is great to hear from you. Please see my answers below. 

I hope you do not mind that I am replying to the group, so others may add 
something extra or refine/correct my answers, as I am not an original 
developer/designer of OSv.

On Fri, Nov 24, 2023 at 8:50 AM Yueyang Pan <yueya...@epfl.ch> wrote:

Dear Waldemar Kozaczuk,
    I am Yueyang Pan from EPFL. Currently I am working on a project about 
remote memory and trying to develop a prototype based on OSv. I am also the 
person who raised the questions on the Google group several days ago. For 
that question, I made a workaround by adding my own stats class which 
records the sum and the count, because what I need is the average. Now I 
have some further questions. They are probably quite basic for you, but I 
would be very grateful if you could spend a little bit of time to give me 
some suggestions.


The tracepoints use ring buffers of fixed size, so eventually all old 
tracepoints get overwritten by new ones. I think you can either increase 
the buffer size or use the approach taken by the script *freq.py* (you 
need to add the module *httpserver-monitoring-api*). There is also a newly 
added (though still experimental) strace-like functionality (see 
https://github.com/cloudius-systems/osv/commit/7d7b6d0f1261b87b678c572068e39d482e2103e4).

Finally, you may find the comments on this issue relevant - 
https://github.com/cloudius-systems/osv/issues/1261#issuecomment-1722549524. 
I am also sure you have come across this wiki page - 
https://github.com/cloudius-systems/osv/wiki/Trace-analysis-using-trace.py.
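
In case it helps, here is a minimal sketch of what defining and hitting a 
custom tracepoint looks like (the TRACEPOINT macro lives in 
include/osv/trace.hh; trace_my_op/my_op are made-up names just for 
illustration):

    #include <osv/trace.hh>

    // Declares a tracepoint taking two arguments. Each hit is written into
    // the fixed-size trace ring buffer mentioned above, so old samples get
    // overwritten unless the buffer is enlarged.
    TRACEPOINT(trace_my_op, "addr=%p len=%d", void*, size_t);

    void my_op(void* addr, size_t len)
    {
        trace_my_op(addr, len); // records a sample when tracing is enabled
        // ... the actual work ...
    }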

    Now, after my profiling, I found the global tlb_flush_mutex to be hot 
in my benchmark, so I am trying to remove it, but that turns out to be a 
bit hard without understanding the threading model of OSv. So I would like 
to ask whether there is any high-level doc that describes what the 
scheduling policy of OSv is, how the priorities of threads are decided, 
whether we can disable preemption or not (the functionality of 
preempt_lock), and the design of the synchronisation primitives (for 
example, why it is not allowed to have preemption disabled inside 
lockfree::mutex). I am trying to understand by reading the code directly, 
but it would be really helpful if there is some material which describes 
the design.


If your "hot" spot is indeed around tlb_flush_mutex (used by 
flush_tlb_all()), then I am guessing your program does a lot of mmap/munmap 
(see the *unpopulate* class in core/memory.cc that uses *tlb_gather*). I am 
not familiar with the details of what tlb_gather exactly does, but it 
probably forces the TLB (Translation Lookaside Buffer) to flush stale 
virtual-to-physical mapping entries after unmapping. The 
mmu::*flush_tlb_all*() is actually used in more places.
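
In other words, a pattern like the following (churn is just a made-up 
illustration), repeated across many threads, would end up in 
flush_tlb_all() on every unmap and therefore serialize on that one global 
mutex:

    #include <sys/mman.h>

    void churn(size_t len)
    {
        void* p = mmap(nullptr, len, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        // ... use p ...
        munmap(p, len); // unpopulate/tlb_gather -> TLB shootdown on all CPUs
    }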

My wild suggestion would be to try to convert the tlb_flush_mutex to a 
spinlock (see include/osv/spinlock.h and core/spinlock.cc). It is a bit of 
a controversial idea, as OSv prides itself on lock-less structures and 
almost no spinlocks are used (the console initialization is the only place 
left). But in some places (see 
https://github.com/cloudius-systems/osv/issues/853#issuecomment-279215964, 
https://github.com/cloudius-systems/osv/commit/f8866c0dfd7ca1fcb4b2d9a280946878313a75d3 
and https://groups.google.com/g/osv-dev/c/4wMAHCs7_dk/m/1LHdvmoeBwAJ) we 
may benefit from one.

Please note that the lock-less *sched::thread::wait_until* at the end of 
flush_tlb_all() would need to be replaced with a "busy" wait/sleep, since a 
thread must not sleep while holding a spinlock.
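
Concretely, the experiment could look something like this (just a sketch - 
send_ipi_to_all_other_cpus/all_cpus_acked are made-up placeholders for what 
the real flush path does, and I am assuming WITH_LOCK works with any 
lockable type, which I believe it does):

    #include <osv/mutex.h>    // for WITH_LOCK
    #include <osv/spinlock.h>

    spinlock tlb_flush_mutex; // was: mutex tlb_flush_mutex;

    void flush_tlb_all()
    {
        WITH_LOCK(tlb_flush_mutex) {      // one TLB shootdown at a time
            send_ipi_to_all_other_cpus(); // placeholder for the real IPI code
            while (!all_cpus_acked()) {   // busy wait replacing the lock-less
                asm volatile("" ::: "memory"); // sched::thread::wait_until,
            }                             // since we cannot sleep while
        }                                 // holding the spinlock
    }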

Or, instead of a spinlock, you could use Nadav's "mutex with spinning" 
- https://groups.google.com/g/osv-dev/c/4wMAHCs7_dk/m/1LHdvmoeBwAJ - it may 
be a good fit here.
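
If I remember the idea correctly, the gist is to spin briefly in the hope 
that the owner releases the lock soon, and only fall back to the sleeping 
path of the mutex otherwise. A rough sketch (the spin count is an arbitrary 
tuning knob, not a value from that thread):

    #include <lockfree/mutex.hh>

    void lock_with_spinning(lockfree::mutex& m)
    {
        // Cheap when critical sections are short: avoids the cost of
        // blocking and waking a thread if the lock frees up quickly.
        for (int i = 0; i < 100; i++) {
            if (m.try_lock()) {
                return;
            }
            asm volatile("" ::: "memory"); // a pause/relax would also help
        }
        m.lock(); // give up spinning and sleep until the mutex is released
    }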


As far as information on mutexes and scheduling goes, the best source is 
the original OSv paper - 
https://www.usenix.org/conference/atc14/technical-sessions/presentation/kivity. 
See also https://github.com/cloudius-systems/osv/wiki/Components-of-OSv and 
many other wiki pages.

As for your preemption question - the lock-free mutex needs to have 
preemption on. Imagine we have a single CPU and the mutex ends up in the 
wait state while trying to acquire the lock: the waiting thread would 
eventually need to be switched out in favor of the thread that will release 
the lock. But if preemption is off, the scheduler will keep coming back to 
the same waiting thread on each timer event, so the lock holder never runs 
and our original thread would never acquire the lock.
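
To illustrate with a hypothetical single-CPU scenario (a sketch, not real 
OSv code; assume another runnable thread currently holds m):

    #include <osv/sched.hh>      // sched::preempt_lock
    #include <lockfree/mutex.hh>

    lockfree::mutex m;

    void deadlock_on_a_single_cpu()
    {
        WITH_LOCK(sched::preempt_lock) { // preemption now disabled
            m.lock(); // we must block so the owner can run and release m,
                      // but with preemption off we are never switched away,
                      // so the owner never runs and m is never released
        }
    }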

I hope all this helps.

Waldek

    Thanks in advance for any advice you can provide. The questions may be 
a bit basic, so pardon me if I am disturbing you.
    Best Wishes
    Pan
