On Thu, Mar 04, 2021 at 12:12:51PM +0100, Stefano Garzarella wrote:
> On Thu, Mar 04, 2021 at 10:25:33AM +0000, Daniel P. Berrangé wrote:
> > On Thu, Mar 04, 2021 at 09:55:40AM +0100, Stefano Garzarella wrote:
> > > On Wed, Mar 03, 2021 at 01:47:06PM -0500, Jason Dillaman wrote:
> > > > On Wed, Mar 3, 2021 at 12:41 PM Stefano Garzarella
> > > > <sgarz...@redhat.com> wrote:
> > > > >
> > > > > Hi Jason,
> > > > > as reported in this BZ [1], when qemu-img creates a QCOW2 image on RBD
> > > > > writing data is very slow compared to a raw file.
> > > > >
> > > > > Comparing raw vs QCOW2 image creation with RBD I found that we use a
> > > > > different object size, for the raw file I see '4 MiB objects', for QCOW2
> > > > > I see '64 KiB objects' as reported on comment 14 [2].
> > > > > This should be the main issue of slowness, indeed forcing in the code 4
> > > > > MiB object size also for QCOW2 increased the speed a lot.
> > > > >
> > > > > Looking better I discovered that for raw files, we call rbd_create()
> > > > > with obj_order = 0 (if 'cluster_size' options is not defined), so the
> > > > > default object size is used.
> > > > > Instead for QCOW2, we use obj_order = 16, since the default
> > > > > 'cluster_size' defined for QCOW2, is 64 KiB.
> > > > >
> > > > > Using '-o cluster_size=2M' with qemu-img changed only the qcow2 cluster
> > > > > size, since in qcow2_co_create_opts() we remove the 'cluster_size' from
> > > > > QemuOpts calling qemu_opts_to_qdict_filtered().
> > > > > For some reason that I have yet to understand, after this deletion,
> > > > > however remains in QemuOpts the default value of 'cluster_size' for
> > > > > qcow2 (64 KiB), that it's used in qemu_rbd_co_create_opts()
> > > > >
> > > > > At this point my doubts are:
> > > > > Does it make sense to use the same cluster_size as qcow2 as object_size
> > > > > in RBD?
> > > >
> > > > No, not really.
> > > > But it also doesn't really make any sense to put a
> > > > QCOW2 image within an RBD image. To clarify from the BZ, OpenStack
> > > > does not put QCOW2 images on RBD, it converts QCOW2 images into raw
> > > > images to store in RBD.
> > >
> > > Yes, that was my doubt, thanks for the confirmation.
> > >
> > > Also Daniel (+CC) confirmed the same thing to me, but just to be complete he
> > > added that there is a case where OpenStack could use qcow2 on RBD, but in
> > > this case using in-kernel RBD, so the QEMU RBD is not involved.
> > >
> > > > >
> > > > > If we want to keep the 2 options separated, how can it be done? Should
> > > > > we rename the option in block/rbd.c?
> > > >
> > > > You can already pass overrides to the RBD block driver by just
> > > > appending them after the
> > > > "rbd:<filename>[:option1=value1[:option2=value2]]" portion, perhaps
> > > > that could be re-used.
> > >
> > > I see, we should extend qemu_rbd_parse_filename() to support it.
> >
> > We shouldn't really be extending the legacy filename syntax.
> > If we need extra options we want them in the QAPI schema for
> > blockdev.
>
> Got it.
>
> I'm still a bit confused about how QemuOpts are handled between format and
> protocol drivers.
>
> It seems that in this case the protocol tries to access some information
> from the format (BLOCK_OPT_CLUSTER_SIZE).
>
> Since the format removes this information from the QemuOpts passed to the
> protocol, this takes the default value of the format, even if a different
> value is specified.
>
> Is it correct for a protocol to access BLOCK_OPT_CLUSTER_SIZE?
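
For reference, the object sizes quoted above follow from simple power-of-two
arithmetic: an object order is log2 of the object size, so a 64 KiB
cluster_size gives order 16 and the 4 MiB objects seen for raw images
correspond to order 22. The sketch below only illustrates that arithmetic. It
is not the code in block/rbd.c, the helper names object_order() and
objects_needed() are made up for this example, and the 1 GiB image size is an
arbitrary assumption chosen to show how many more objects the smaller size
implies.

```python
# Illustrative arithmetic only; not taken from block/rbd.c.
# object_order() and objects_needed() are hypothetical helper names.

KiB = 1024
MiB = 1024 * KiB


def object_order(object_size: int) -> int:
    """Return log2(object_size); the size must be a positive power of two."""
    if object_size <= 0 or object_size & (object_size - 1):
        raise ValueError("object size must be a positive power of two")
    return object_size.bit_length() - 1


def objects_needed(image_size: int, object_size: int) -> int:
    """Number of objects needed to back image_size bytes (ceiling division)."""
    return -(-image_size // object_size)


qcow2_default_cluster = 64 * KiB  # qcow2 default cluster_size, per the thread
raw_object_size = 4 * MiB         # object size observed for raw, per the thread
image_size = 1024 * MiB           # assumed 1 GiB image, for illustration only

print(object_order(qcow2_default_cluster))               # 16
print(object_order(raw_object_size))                     # 22
print(objects_needed(image_size, qcow2_default_cluster))  # 16384
print(objects_needed(image_size, raw_object_size))        # 256
```

With these assumed numbers the same image is split into 64 times as many
objects at 64 KiB as at 4 MiB, which is consistent with the slowdown reported
in the BZ, though the sketch says nothing about where the time is actually
spent.
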

In a -blockdev world, the caller would be expected to set the values
explicitly at all layers that need it. You're talking about a scenario
that is non-blockdev though, and I'm not sure what the right answer
is here. Will need Kevin/Max to answer that one.

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|