Hi,

On Thu, Mar 14, 2019 at 5:11 AM Waldek Kozaczuk <jwkozac...@gmail.com> wrote:
> I wonder if anyone has had chance to read this paper. I would like to see
> what others think about reasons OSv thread scheduler does not scale well
> with number of vCPUs.

I am not sure that the scheduler is to blame here. It is possible (even likely) that the applications are impacted by different problems.

For example, the paper does not mention whether the Python Flask-based application is configured to use multi-threading mode or not. The fact that the performance metrics don't change between single core and multicore suggests that Flask was perhaps run in the default single-threaded mode.

The paper mentions that the authors let Go manage the worker thread pool size itself, but it's not clear what APIs Go uses to discover the number of CPUs and whether OSv implements them correctly. If it does not, Go could end up using a thread pool that is too small or too large.

The Java application numbers are the most interesting because most of the performance work for OSv was done with native and JVM applications (for example, Apache Cassandra) and the combination is expected to work. Unsurprisingly, for the Java application you see some improvement with multicore, although OSv does not perform as well as Docker.

However, the most interesting part of the evaluation results is the following observation:

“It should be noted that in all cases the combined CPU load of all cores was between 40 and 50% during testing, indicating that a lot of cycles were being wasted on simply getting threads to run at all, and not enough on actually running them.”

The first thing I would look for here is whether there's a bottleneck in OSv's networking. In particular, I would check whether the Xen network driver and device multi-queue are enabled and whether packet steering (packet distribution to different CPUs) is working correctly. Furthermore, I would look for any locking in the networking stack that serializes packet receive or transmission.

Also, one area where JVM-based applications are often affected is locking.
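To illustrate why a serializing lock tends to show up as idle CPUs rather than as high load, here is a toy Python sketch. It is not OSv code and it is not taken from the paper: `run_workers`, `N_THREADS` and `HOLD` are names made up for this example, and `time.sleep` only stands in for real work such as handling a packet.

```python
import threading
import time


def run_workers(n_threads, hold_seconds, lock=None):
    """Run n_threads workers that each 'work' for hold_seconds.

    If a lock is given, the work is done while holding it, so the
    workers are forced to run one after another.
    Returns the wall-clock time until all workers have finished.
    """
    def worker():
        if lock is None:
            time.sleep(hold_seconds)      # independent work, can overlap
        else:
            with lock:                    # shared lock serializes the work
                time.sleep(hold_seconds)

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


# Hypothetical parameters, chosen only to keep the demo short.
N_THREADS = 4
HOLD = 0.05

parallel = run_workers(N_THREADS, HOLD)
serialized = run_workers(N_THREADS, HOLD, threading.Lock())
print(f"no shared lock: {parallel:.3f}s, shared lock: {serialized:.3f}s")
```

With the shared lock, the total time grows roughly with the number of threads, because only one thread makes progress at a time while the others wait. On a real system that kind of waiting would leave cores partly idle, which would at least be consistent with the 40 to 50% combined load quoted above.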
So it might make sense to look at OSv's pthread mutex implementation, for example.

It would be interesting for someone to attempt to first reproduce the results and then use OSv's tracing to gain more understanding of what is actually going on:

https://github.com/cloudius-systems/osv/wiki/Trace-analysis-using-trace.py

Regards,

- Pekka