On Mon, Mar 18, 2024 at 8:29 PM Darren L <[email protected]> wrote:
> Hello!
>
> Thank you for the suggestions. My testing environment is an i9-13900H,
> which has 20 total threads, of which I am allocating the first 8 to OSv and
> the next 4 to the client. These 12 in-use threads exist on 6 hyperthreaded
> cores.
>
> I wasn't sure how to measure the vmexits and/or process scheduling on the
> host.
>

https://access.redhat.com/solutions/6994095
Guest/hypervisor efficiency is often a function of how many times the guest
exits to the host. Fewer exits are better.

> I didn't see netperf on recent versions of OSv in the /tools, it seems to
> have existed in OSv v0.5 then afterwards was removed. I did run a similar
> benchmark on Python 3.10 and ran the same parallel tests, and I received
> much lower latency numbers compared to the Java 8 version. In this case, I
> received latency numbers of 45ms (min-max 18-63, std: 12) compared to a
> possibly confusing measurement for Linux VM of 837ms (min-max 39-1903, std:
> 817). If this does suggest that the JVM is the issue, what steps should I
> take to debug this problem? The application I am using must use Java 8 to
> run; it cannot be run on any other platform.
>

Eliminating Java is one option. Another is to use a recent JVM (17) with ZGC,
and hopefully there wouldn't be GC events (I'm not sure that is a real issue
here).

> I know these are not the exact details you requested, but I am more than
> happy to learn how I can capture the other details, if necessary. Thank you!
>
> On Wednesday, March 13, 2024 at 5:23:53 PM UTC-4 דור לאור wrote:
>
>> Lots of good details. It's not simple to figure out what the issue is,
>> since you have hypervisor, host, OS and JVM variables.
>>
>> How many threads does the host have? Make sure there are enough hardware
>> threads for the guest, virtio on the host and the client. This way all
>> of OSv's runnable threads will be schedulable.
>> You can also measure the number of vmexits and process scheduling on the
>> host.
>> There is a chance the JVM is an issue too, can you do the same with
>> netperf?
>>
>>
>> On Wed, Mar 13, 2024 at 9:14 PM Darren L <[email protected]> wrote:
>>
>>> Hello!
>>>
>>> I was wondering if I could get any pointers on why I am receiving
>>> significant latency issues using the virtio-net driver when processing
>>> multiple parallel clients. Hopefully I can explain my issue enough to be
>>> replicated.
>>>
>>> *Testing environment:*
>>> - Comparison: Ubuntu Server (Linux) VM and OSv (used the option "-nv" in
>>> the run.py script for tap networking)
>>> - In common: 4 CPU cores, 4GB of RAM, QEMU KVM, used "taskset" to pin to
>>> the same cores
>>> - Program: *java-httpserver* program from the apps directory, java8
>>> - What was sent: data of varying sizes (1KB to 1MB, 4MB, 8MB...) on the
>>> same machine to the VMs
>>>
>>> *Observations:*
>>> - With single-threaded requests and low data sizes, I was able to
>>> measure a latency on OSv that is lower than the Linux VM latency
>>>     - example: for 32KB I measured ~4ms for OSv and 9.8ms for the Linux
>>> VM
>>> - At high data sizes (256KB+), OSv started to measure a higher latency
>>> than the Linux VM
>>> - When I sent *multiple requests* at the same time, OSv suffered a much
>>> larger average latency penalty
>>>     - example, at 1MB data size and 16 parallel requests, average
>>> latency was:
>>>         - OSv: 120ms (min-max 14-225ms, std: 62ms)
>>>         - Linux VM: 82ms (min-max 24-144ms, std: 34ms)
>>>
>>> *Other notes:*
>>> - I've been using the OSv profiling tools and have seen that the hot
>>> spots typically were in virtio::virtio_driver::wait_for_queue and
>>> virtio::net::receiver, but I was unable to identify the exact cause of
>>> this latency
>>> - I also noticed when tracing the network layer (
>>> https://github.com/cloudius-systems/osv/wiki/Trace-analysis-using-trace.py#tracing-network-layer),
>>> there were a lot of *net_packet_handling* lines; about as many as there
>>> were *net_packet_in* lines for 1MB, which might indicate that the
>>> packets are not being processed fast enough and are delayed because they
>>> are put in a queue?
>>>
>>> Hope this is clear enough! I am hoping to understand whether I am
>>> misconfiguring OSv or something similar to figure out why this latency
>>> difference is occurring. Thank you for the help in advance, and happy to
>>> provide any more information as needed.
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "OSv Development" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to [email protected].
>>> To view this discussion on the web visit
>>> https://groups.google.com/d/msgid/osv-dev/0ace3980-5036-4df9-9e46-7396bb20ce9fn%40googlegroups.com
>>> <https://groups.google.com/d/msgid/osv-dev/0ace3980-5036-4df9-9e46-7396bb20ce9fn%40googlegroups.com?utm_medium=email&utm_source=footer>
>>> .
>>>
>>

-- 
You received this message because you are subscribed to the Google Groups "OSv Development" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To view this discussion on the web visit https://groups.google.com/d/msgid/osv-dev/CAKUaUn4g7okt9HCQFMXOCnzfw2kvDxentEaZfZu6R%2B%2BZy18ghw%40mail.gmail.com.
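For comparing runs across OSv and the Linux VM, it may help to measure every configuration with the same small client. Below is a minimal Python sketch of a parallel-latency measurement like the one described in this thread (N simultaneous requests of a fixed payload size, reported as mean, min-max and std). Everything in it is an assumption rather than the script actually used: the function names are hypothetical, the use of HTTP POST is a guess about how the data was sent, and the std is computed as a population standard deviation, which may not match the figures quoted earlier.

```python
import statistics
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def summarize(latencies_ms):
    """Return mean, min, max and population std of latencies (in ms)."""
    return {
        "mean": statistics.fmean(latencies_ms),
        "min": min(latencies_ms),
        "max": max(latencies_ms),
        # Assumption: population std; use statistics.stdev for sample std.
        "std": statistics.pstdev(latencies_ms),
    }


def timed_post(url, payload):
    """Send one POST request and return its wall-clock latency in ms.

    Assumption: the server under test accepts POST bodies; adapt this
    function if the data flows in the other direction.
    """
    start = time.perf_counter()
    request = urllib.request.Request(url, data=payload, method="POST")
    with urllib.request.urlopen(request) as response:
        response.read()  # drain the body so the full exchange is timed
    return (time.perf_counter() - start) * 1000.0


def run_parallel(url, payload_size, clients, send=timed_post):
    """Issue `clients` requests concurrently and summarize their latencies.

    `send` is injectable so the harness can be checked without a network.
    """
    payload = b"x" * payload_size
    with ThreadPoolExecutor(max_workers=clients) as pool:
        futures = [pool.submit(send, url, payload) for _ in range(clients)]
        return summarize([future.result() for future in futures])
```

A call such as `run_parallel("http://<guest-ip>:<port>/", 1024 * 1024, 16)` would correspond to the 1MB, 16-parallel-request case; the address and port are placeholders. Running the identical client against both guests, pinned to the same host threads, keeps the client side out of the comparison.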
