On Mon, Apr 01, 2019 at 02:09:06PM +0100, Richard W.M. Jones wrote:
> We already have a cache filter so I have two ideas:
>
> (1) Modify nbdkit-cache-filter to add a readahead parameter.
There's a big problem that I didn't appreciate til now: the cache
filter ends up splitting large reads rather badly.  For example, if
the client is issuing 2M reads (not unreasonable for ‘qemu-img
convert’), the cache filter divides these into 4K requests to the
plugin.  Compare:

$ iso='https://download.fedoraproject.org/pub/fedora/linux/releases/29/Workstation/x86_64/iso/Fedora-Workstation-Live-x86_64-29-1.2.iso'
$ ./nbdkit -U - -fv \
    curl "$iso" \
    --run 'qemu-img convert -f raw -p $nbd /var/tmp/out'
nbdkit: curl[1]: debug: pread count=2097152 offset=0
nbdkit: curl[1]: debug: pread count=2097152 offset=2097152
nbdkit: curl[1]: debug: pread count=2097152 offset=4194304
nbdkit: curl[1]: debug: pread count=2097152 offset=6291456

$ ./nbdkit -U - -fv --filter=cache \
    curl "$iso" \
    --run 'qemu-img convert -f raw -p $nbd /var/tmp/out'
nbdkit: curl[1]: debug: cache: pread count=2097152 offset=0 flags=0x0
nbdkit: curl[1]: debug: cache: blk_read block 0 (offset 0) is not cached
nbdkit: curl[1]: debug: pread count=4096 offset=0
nbdkit: curl[1]: debug: cache: blk_read block 1 (offset 4096) is not cached
nbdkit: curl[1]: debug: pread count=4096 offset=4096
nbdkit: curl[1]: debug: cache: blk_read block 2 (offset 8192) is not cached
nbdkit: curl[1]: debug: pread count=4096 offset=8192
nbdkit: curl[1]: debug: cache: blk_read block 3 (offset 12288) is not cached
nbdkit: curl[1]: debug: pread count=4096 offset=12288
nbdkit: curl[1]: debug: cache: blk_read block 4 (offset 16384) is not cached
nbdkit: curl[1]: debug: pread count=4096 offset=16384

(FWIW we want reads of 64M or larger to get decent performance with
virt-v2v.)  Because each of those 4K requests pays a full round trip
to the web server, the cache filter kills performance dead.

This is a problem with the cache filter that we could likely solve
with a bit of effort, but let's go back and take another look at
option (2):

> (2) Add a new readahead filter which extends all pread requests

When I'm doing v2v / qemu-img convert I don't really need the cache
filter, except that it was a convenient place to save the prefetched
data.  A dumber readahead filter might help here.  Suppose it simply
stores the position of the last read and prefetches (and saves) a
certain amount of data following that read.  If the next read is
sequential, i.e. it matches the position pointer, return the saved
data; otherwise throw it away and do a normal read.  I believe this
would solve the readahead problem in this case, although I haven't
tested it yet.
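For concreteness, a minimal C sketch of that logic might look like
the following.  It deliberately ignores the real nbdkit filter API:
plugin_pread() is a hypothetical stand-in for the call into the
underlying plugin, READAHEAD_SIZE is an assumed 64M window, and
locking and clamping the prefetch to the size of the disk are
omitted.

#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define READAHEAD_SIZE (64 * 1024 * 1024)  /* assumed prefetch window */

/* Hypothetical call into the underlying plugin; returns 0 on
 * success, -1 on error. */
extern int plugin_pread (void *buf, uint32_t count, uint64_t offset);

static uint64_t ra_position;  /* file offset of ra_buffer[0] */
static char *ra_buffer;       /* saved readahead data, or NULL */
static uint32_t ra_length;    /* bytes valid in ra_buffer */

int
readahead_pread (void *buf, uint32_t count, uint64_t offset)
{
  /* Sequential read that fits in the saved data: serve it from the
   * buffer without calling into the plugin at all. */
  if (ra_buffer != NULL && offset == ra_position && count <= ra_length) {
    memcpy (buf, ra_buffer, count);
    memmove (ra_buffer, ra_buffer + count, ra_length - count);
    ra_position += count;
    ra_length -= count;
    return 0;
  }

  /* Not sequential (or larger than what we saved): throw the saved
   * data away and do a normal read. */
  free (ra_buffer);
  ra_buffer = NULL;
  ra_length = 0;
  if (plugin_pread (buf, count, offset) == -1)
    return -1;
  ra_position = offset + count;

  /* Prefetch and save the data following the read, for the next
   * sequential read.  A real filter would clamp this to the size of
   * the disk; prefetch failure is non-fatal since it is only an
   * optimization. */
  ra_buffer = malloc (READAHEAD_SIZE);
  if (ra_buffer == NULL)
    return 0;
  if (plugin_pread (ra_buffer, READAHEAD_SIZE, ra_position) == -1) {
    free (ra_buffer);
    ra_buffer = NULL;
    return 0;
  }
  ra_length = READAHEAD_SIZE;
  return 0;
}

The point is that a single position pointer and one saved buffer are
enough: a sequential reader like qemu-img convert is served from the
buffer, and any non-sequential access falls straight through to an
ordinary read.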
Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-df lists disk usage of guests without needing to install any
software inside the virtual machine.  Supports Linux and Windows.
http://people.redhat.com/~rjones/virt-df/