Lots of good details. It's not simple to figure out what the issue is, since
there are hypervisor, host, OS and JVM variables in play.

How many threads does the host have? Make sure there are enough hardware
threads for the guest, virtio on the host, and the client. That way all of
OSv's runnable threads will be schedulable.
You can also measure the number of vmexits and the process scheduling on the
host.
There is a chance the JVM is an issue too; can you do the same with netperf?
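
A rough way to check the thread budget and the host-side scheduling, as a
sketch only: the function names and the default of one host I/O thread and
one client thread are my assumptions, not something taken from your setup.
The two counters are the standard Linux fields in /proc/<pid>/status (vCPU
threads have their own copies under /proc/<pid>/task/<tid>/status).

```python
import os


def spare_hw_threads(host_threads, guest_vcpus,
                     host_io_threads=1, client_threads=1):
    """Hardware threads left after the guest vCPUs, the host-side virtio
    work and the client each get one. The defaults are assumptions."""
    return host_threads - (guest_vcpus + host_io_threads + client_threads)


def ctxt_switches(status_text):
    """Pull the context-switch counters out of the text of a
    /proc/<pid>/status (or /proc/<pid>/task/<tid>/status) file."""
    wanted = ("voluntary_ctxt_switches", "nonvoluntary_ctxt_switches")
    counts = {}
    for line in status_text.splitlines():
        key, _, value = line.partition(":")
        if key in wanted:
            counts[key] = int(value.strip())
    return counts


if __name__ == "__main__":
    host = os.cpu_count() or 1
    # 4 vCPUs matches the setup described in the original message.
    print("spare hardware threads:", spare_hw_threads(host, guest_vcpus=4))
```

A negative result from spare_hw_threads means something has to share a
hardware thread. If nonvoluntary_ctxt_switches on the QEMU threads grows
quickly between two samples taken during a run, the vCPUs are probably being
preempted on the host.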


On Wed, Mar 13, 2024 at 9:14 PM Darren L <lucernarr...@gmail.com> wrote:

> Hello!
>
> I was wondering if I could get any pointers on why I am seeing
> significant latency with the virtio-net driver when processing
> multiple parallel clients. Hopefully I can explain my issue enough to be
> replicated.
>
> *Testing environment:*
> - Comparison: Ubuntu Server (Linux) VM and OSv (used the option "-nv" in
> the run.py script for tap networking)
> - In common: 4 CPU cores, 4GB of RAM, QEMU KVM, used "taskset" to pin to
> the same cores
> - Program: *java-httpserver* program from the apps directory, java8
> - What was sent: data of varying sizes (1KB to 1MB, 4MB, 8MB...) on the
> same machine to the VMs
>
> *Observations:*
> - With single-threaded requests and low data sizes, I was able to measure
> a latency on OSv that is lower than the Linux VM latency
>     - example: for 32KB I measured ~4ms for OSv and 9.8ms for the Linux VM
> - At high data sizes (256KB+), OSv started to measure a higher latency
> than the Linux VM
> - When I sent *multiple requests* at the same time, OSv suffered a much
> larger average latency penalty
>     - example, at 1MB data size and 16 parallel requests, average latency
> was:
>         - OSv: 120ms (min-max 14-225ms, std: 62ms)
>         - Linux VM: 82ms (min-max 24-144ms, std: 34ms)
>
> *Other notes:*
> - I've been using the OSv profiling tools and have seen that the hot spots
> typically were in virtio::virtio_driver::wait_for_queue and
> virtio::net::receiver, but I was unable to identify the exact issue on why
> this latency is the case
> - I also noticed when tracing the network layer (
> https://github.com/cloudius-systems/osv/wiki/Trace-analysis-using-trace.py#tracing-network-layer),
> there were a lot of *net_packet_handling* lines; about as many as there
> were *net_packet_in* lines for 1MB, which might indicate that the packets
> are not being processed fast enough and are delayed because they are put in
> a queue?
>
> Hope this is clear enough! I am hoping to understand whether I am
> misconfiguring OSv or something similar to figure out why this latency
> difference is occurring. Thank you for the help in advance, and happy to
> provide any more information as needed.

-- 
You received this message because you are subscribed to the Google Groups "OSv 
Development" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to osv-dev+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/osv-dev/CAKUaUn7hAB6AqhXJ4ozOU-pHRdFTobfH3QFXdaUcz395C15yHw%40mail.gmail.com.
