On Tue, Jul 27, 2021 at 12:16:59PM +0100, Richard W.M. Jones wrote:
Hi Eric, a couple of questions below about nbdkit performance.

Modular virt-v2v will use disk pipelines everywhere.  The input
pipeline looks something like this:

  socket <- cow filter <- cache filter <- nbdkit curl|vddk

We found there's a notable slowdown in at least one case: when the
source plugin is very slow (eg. it's the curl plugin talking to a slow
and remote website, or VDDK in general), everything runs very slowly.

I made a simple test case to demonstrate this:

  $ virt-builder fedora-33
  $ time ./nbdkit --filter=cache --filter=delay file /var/tmp/fedora-33.img \
        delay-read=500ms \
        --run 'virt-inspector --format=raw -a "$uri" -vx'

This uses a local file with the delay filter on top injecting
half-second delays into every read.  It "feels" a lot like the slow
case we were observing.  Virt-v2v also does inspection as a first step
when converting an image, so using virt-inspector is somewhat
realistic.

Unfortunately this actually runs far too slowly for me to wait around
- at least 30 mins, and probably a lot longer.  This compares to only
7 seconds if you remove the delay filter.

Reducing the delay to 50ms means at least it finishes in a reasonable
time:

  $ time ./nbdkit --filter=cache --filter=delay file /var/tmp/fedora-33.img \
        delay-read=50ms \
        --run 'virt-inspector --format=raw -a "$uri"'

  real	5m16.298s
  user	0m0.509s
  sys	0m2.894s

In the above scenario the cache filter is not actually doing anything
(since virt-inspector does not write).  Adding cache-on-read=true lets
us cache the reads, avoiding going through the "slow" plugin in many
cases, and the result is a lot better:

  $ time ./nbdkit --filter=cache --filter=delay file /var/tmp/fedora-33.img \
        delay-read=50ms cache-on-read=true \
        --run 'virt-inspector --format=raw -a "$uri"'

  real	0m27.731s
  user	0m0.304s
  sys	0m1.771s

However this is still slower than the old method which used qcow2 +
qemu's copy-on-read.
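The effect of cache-on-read can be sketched in a few lines of Python.
This is only an illustration of the semantics, not nbdkit internals:
SlowPlugin, CacheOnReadFilter and BLOCK_SIZE are all invented names,
and the real cache filter works at its own block granularity.  The
point is that a block fetched once from the slow plugin is served
locally on every later read, so repeated inspection reads pay the
per-request delay only once:

```python
import time

BLOCK_SIZE = 4096  # illustrative granularity, not nbdkit's actual block size

class SlowPlugin:
    """Stands in for the slow curl/vddk plugin (a delay per read)."""
    def __init__(self, data, delay=0.0):
        self.data = data
        self.delay = delay
        self.reads = 0          # counts round trips to the "remote" source

    def pread(self, count, offset):
        self.reads += 1
        time.sleep(self.delay)
        return self.data[offset:offset + count]

class CacheOnReadFilter:
    """Read-through cache: every block read from the plugin is kept locally."""
    def __init__(self, plugin):
        self.plugin = plugin
        self.cache = {}         # block number -> bytes

    def pread(self, count, offset):
        out = bytearray()
        blk = offset // BLOCK_SIZE
        end = (offset + count + BLOCK_SIZE - 1) // BLOCK_SIZE
        while blk < end:
            if blk not in self.cache:   # miss: one slow round trip per block
                self.cache[blk] = self.plugin.pread(BLOCK_SIZE,
                                                    blk * BLOCK_SIZE)
            out += self.cache[blk]
            blk += 1
        skip = offset % BLOCK_SIZE
        return bytes(out[skip:skip + count])
```

With delay=0.05 each cache miss costs 50ms but every hit is free,
which matches the drop from 5m16s to 27s above: inspection re-reads
the same regions (partition table, superblocks, directories) many
times.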
It's harder to demonstrate this, but I modified virt-inspector to use
the copy-on-read setting (which it doesn't do normally).  On top of
nbdkit with 50ms delay and no other filters:

  qemu + copy-on-read backed by nbdkit delay-read=50ms file:

  real	0m23.251s

So 23s is the time to beat.  (I believe that with longer delays, the
gap between qemu and nbdkit increases in favour of qemu.)

Q1: What other ideas could we explore to improve performance?
First thing that came to mind: could it be that qemu's copy-on-read
caches bigger blocks, making it effectively do some small read-ahead
as well?
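That hypothesis is easy to quantify with a toy model (purely
illustrative; `round_trips` is an invented helper and the numbers say
nothing about qemu's actual cluster size): count how many round trips
to the slow source a sequence of small reads costs at different cache
granularities.

```python
def round_trips(reads, block_size):
    """Count distinct backend fetches needed to satisfy a sequence of
    (offset, count) reads through a read-through cache that fetches
    whole blocks of the given size."""
    fetched = set()
    for offset, count in reads:
        first = offset // block_size
        last = (offset + count - 1) // block_size
        fetched.update(range(first, last + 1))  # each block fetched once
    return len(fetched)
```

Scanning 64 KiB in 512-byte reads costs 16 round trips at 4 KiB
granularity but only 1 at 64 KiB granularity; at 50ms per round trip
that is 800ms versus 50ms, so a larger cache block really does act
like free read-ahead for sequential access patterns.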
- - -
In real scenarios we'll actually want to combine cow + cache, where
cow is caching writes, and cache is caching reads.
  socket <- cow filter <- cache filter <- nbdkit curl|vddk
                          (cache-on-read=true)
The cow filter is necessary to prevent changes being written back to
the pristine source image.
This is actually surprisingly efficient, making no noticeable
difference in this test:
time ./nbdkit --filter=cow --filter=cache --filter=delay \
file /var/tmp/fedora-33.img \
delay-read=50ms cache-on-read=true \
--run 'virt-inspector --format=raw -a "$uri"'
real 0m27.193s
user 0m0.283s
sys 0m1.776s
Q2: Should we consider a "cow-on-read" flag to the cow filter (thus
removing the need to use the cache filter at all)?
That would make at least some sense, since there is cow-on-cache
already (albeit a little confusing for me personally).  I presume it
would not increase the size of the difference (when using qemu-img
rebase) at all, right?

I do not see, however, how it would be faster than the existing:

  cow <- cache[cache-on-read]

Martin
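For what it's worth, the proposed semantics can be sketched like this
(a hypothetical cow-on-read mode, not existing nbdkit code; BLOCK and
the class name are invented).  One overlay both absorbs writes,
protecting the pristine source, and caches blocks pulled in by reads,
so the win over cow <- cache[cache-on-read] would be a simpler stack
rather than speed:

```python
BLOCK = 4096  # illustrative granularity

class CowOnReadFilter:
    """Sketch of a cow filter with a hypothetical cow-on-read flag:
    the same overlay serves as write destination and read cache."""
    def __init__(self, base):
        self.base = base        # bytes: stands in for the read-only source
        self.overlay = {}       # block number -> mutable local copy

    def _block(self, blk):
        if blk not in self.overlay:     # miss: copy in from the slow source
            self.overlay[blk] = bytearray(
                self.base[blk * BLOCK:(blk + 1) * BLOCK])
        return self.overlay[blk]

    def pread(self, count, offset):
        out = bytearray()
        for blk in range(offset // BLOCK, (offset + count - 1) // BLOCK + 1):
            out += self._block(blk)
        skip = offset % BLOCK
        return bytes(out[skip:skip + count])

    def pwrite(self, buf, offset):
        # byte-at-a-time for simplicity; writes never reach self.base
        for i, byte in enumerate(buf):
            pos = offset + i
            self._block(pos // BLOCK)[pos % BLOCK] = byte
```

Note that in this sketch every block touched by a read ends up in the
overlay, which is exactly Martin's point about the size of the
difference: the overlay would grow with reads as well as writes.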
Rich.

-- 
Richard Jones, Virtualization Group, Red Hat
http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-df lists disk usage of guests without needing to install any
software inside the virtual machine.  Supports Linux and Windows.
http://people.redhat.com/~rjones/virt-df/

_______________________________________________
Libguestfs mailing list
[email protected]
https://listman.redhat.com/mailman/listinfo/libguestfs
