On Tue, Nov 28, 2023 at 8:20 AM Waldek Kozaczuk <[email protected]> wrote:
> Hi,
>
> It is great to hear from you. Please see my answers below.
>
> I hope you do not mind that I reply to the group, so others may add
> something extra or refine/correct my answers, as I am not an original
> developer/designer of OSv.
>
> On Fri, Nov 24, 2023 at 8:50 AM Yueyang Pan <[email protected]> wrote:
>
>> Dear Waldemar Kozaczuk,
>> I am Yueyang Pan from EPFL. Currently I am working on a project about
>> remote memory and trying to develop a prototype based on OSv. I am the
>> guy who raised the questions on the google group several days ago as
>> well. For that question, I made a workaround by adding my own stats
>> class which records the sum and count, because what I need is the
>> average number. Now I have some further questions. Probably they are a
>> bit dumb for you but I will be very grateful if you could spend a
>> little bit of time to give me some suggestions.
>>
>
> The tracepoints use ring buffers of fixed size so eventually, all old
> tracepoints would be overwritten by new ones. I think you can either
> increase the size or use the approach used by the script *freq.py*
> (you need to add the module *httpserver-monitoring-api*).

Exactly. OSv's tracepoints have two modes. One is indeed to save them in
a ring buffer - so you'll see the last N traced events when you read that
buffer - but the other is a mode that just counts the events. What freq.py
does is retrieve the count at one second, then retrieve the count the
next second - and the difference is the average number of these events
per second.

If, instead of counting the events, you want a sum of, say, integers that
come from the event (e.g., the sum of packet lengths), we don't have
support for this at the moment - we only increment the count by 1. It
could be added as a feature, I guess. But you can always do something
ad hoc, like maintaining a global variable which you add to.
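To make the ad-hoc idea concrete, here is a minimal sketch of such a global accumulator, with the same "read twice and subtract" trick that freq.py applies to the counts. This is not part of OSv's tracepoint API; event_stats, packet_len_stats, take_snapshot, rate_per_second and average_value are all hypothetical names made up for illustration.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical ad-hoc accumulator (not an OSv API): keeps a running
// event count and a running sum of a value attached to each event.
struct event_stats {
    std::atomic<uint64_t> count{0};
    std::atomic<uint64_t> sum{0};

    // Call this wherever the event happens, e.g. with a packet length.
    // Relaxed ordering is enough for statistics; count and sum may be
    // momentarily out of step with each other when read concurrently.
    void record(uint64_t value) {
        count.fetch_add(1, std::memory_order_relaxed);
        sum.fetch_add(value, std::memory_order_relaxed);
    }
};

// The "global variable which you add to" (hypothetical name).
event_stats packet_len_stats;

// Plain copy of the counters at one point in time.
struct snapshot {
    uint64_t count;
    uint64_t sum;
};

snapshot take_snapshot(const event_stats& s) {
    return { s.count.load(std::memory_order_relaxed),
             s.sum.load(std::memory_order_relaxed) };
}

// Events per second between two snapshots taken `seconds` apart.
double rate_per_second(const snapshot& before, const snapshot& after,
                       double seconds) {
    return static_cast<double>(after.count - before.count) / seconds;
}

// Mean value per event in the interval, or 0 if no event occurred.
double average_value(const snapshot& before, const snapshot& after) {
    uint64_t n = after.count - before.count;
    if (n == 0) {
        return 0.0;
    }
    return static_cast<double>(after.sum - before.sum) / n;
}
```

You would call packet_len_stats.record(len) at the traced location, take one snapshot, take another a second later, and pass both to the two helper functions. Atomics are used here on the assumption that the event can fire on several CPUs at once; a per-CPU counter would be cheaper if the accumulator itself becomes hot.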
> There is also newly added (though experimental) strace-like
> functionality (see
> https://github.com/cloudius-systems/osv/commit/7d7b6d0f1261b87b678c572068e39d482e2103e4).
> Finally, you may find the comments on this issue relevant -
> https://github.com/cloudius-systems/osv/issues/1261#issuecomment-1722549524.
> I am also sure you have come across this wiki page -
> https://github.com/cloudius-systems/osv/wiki/Trace-analysis-using-trace.py.
>
>> Now after my profiling, I found the global mutex tlb_flush_mutex to
>> be hot in my benchmark so I am trying to remove it, but it turns out
>> to be a bit hard without understanding the thread model of OSv. So I
>> would like to ask whether there is any high-level doc that describes
>> what the scheduling policy of OSv is, how the priorities of the
>> threads are decided, whether we can disable preemption or not (the
>> functionality of preempt_lock) and the design of synchronisation
>> primitives (for example why it is not allowed to have preemption
>> disabled inside lockfree::mutex). I am trying to understand by reading
>> the code directly but it would be really helpful if there were some
>> material which describes the design.
>

There are a lot of questions here, and I'm not even sure answering them
will explain specifically why tlb_flush_mutex is highly contended in your
workload. Waldek suggested that you read the OSv paper from Usenix, which
is a good start for understanding the overall OSv architecture. The
scheduling policy and priority (how to decide which thread should run
next) are described in more detail in this document:
https://docs.google.com/document/d/1W7KCxOxP-1Fy5EyF2lbJGE2WuKmu5v0suYqoHas1jRM/edit

If you have specific questions, post them here and I'll try to answer.
But only a few at a time :-) You had a lot of questions above and I can't
answer them all in one mail :-)
