Am 04.03.2021 um 18:32 hat Stefano Garzarella geschrieben: > On Thu, Mar 04, 2021 at 03:59:17PM +0100, Kevin Wolf wrote: > > Am 04.03.2021 um 15:08 hat Stefano Garzarella geschrieben: > > > On Thu, Mar 04, 2021 at 01:05:02PM +0100, Kevin Wolf wrote: > > > > Am 03.03.2021 um 18:40 hat Stefano Garzarella geschrieben: > > > > > Hi Jason, > > > > > as reported in this BZ [1], when qemu-img creates a QCOW2 image on RBD > > > > > writing data is very slow compared to a raw file. > > > > > > > > > > Comparing raw vs QCOW2 image creation with RBD I found that we use a > > > > > different object size, for the raw file I see '4 MiB objects', for > > > > > QCOW2 I > > > > > see '64 KiB objects' as reported on comment 14 [2]. > > > > > This should be the main issue of slowness, indeed forcing in the code > > > > > 4 MiB > > > > > object size also for QCOW2 increased the speed a lot. > > > > > > > > > > Looking better I discovered that for raw files, we call rbd_create() > > > > > with > > > > > obj_order = 0 (if 'cluster_size' options is not defined), so the > > > > > default > > > > > object size is used. > > > > > Instead for QCOW2, we use obj_order = 16, since the default > > > > > 'cluster_size' > > > > > defined for QCOW2, is 64 KiB. > > > > > > > > Hm, the QemuOpts-based image creation is messy, but why does the rbd > > > > driver even see the cluster_size option? > > > > > > > > The first thing qcow2_co_create_opts() does is splitting the passed > > > > QemuOpts into options it will process on the qcow2 layer and options > > > > that are passed to the protocol layer. So if you pass a cluster_size > > > > option, qcow2 should take it for itself and not pass it to rbd. > > > > > > > > If it is passed to rbd, I think that's a bug in the qcow2 driver. > > > > > > IIUC qcow2 properyl remove it, but when rbd uses > > > qemu_opt_get_size_del(opts, > > > BLOCK_OPT_CLUSTER_SIZE, 0) the default value of qcow2 format is returned. > > > > > > Going in depth in qemu_opt_get_size_helper(), I found that qemu_opt_find() > > > properly returns a NULL pointer, but then we call find_default_by_name() > > > that returns the default value of qcow2 format (64k). > > > > Ugh, I see why. We're passing the protocol driver a QemuOpts that was > > created for a QemuOptsList with the qcow2 default, not for its own > > QemuOptsList. This is wrong. > > > > Note that the QemuOptsList is not qcow2_create_opts itself, but a list > > that is created with qemu_opts_append() to combine qcow2 and rbd options > > into a new QemuOptsList. For overlapping options, the format wins. > > > > I don't think you can change the QemuOptsList of an existing QemuOpts, > > nor is there a clone operation that could just copy all options into a > > new QemuOpts created for the rbd QemuOptsList, so maybe the easiest > > hack^Wsolution would be converting to QDict and back... > > Do you mean something like this? (I'll send a proper patch when everything > is a little clearer to me :-) > > diff --git a/block.c b/block.c > index a1f3cecd75..74b02b32dc 100644 > --- a/block.c > +++ b/block.c > @@ -671,13 +671,33 @@ out: > int bdrv_create_file(const char *filename, QemuOpts *opts, Error **errp) > { > BlockDriver *drv; > + QemuOpts *new_opts; > + QDict *qdict; > + int ret; > > drv = bdrv_find_protocol(filename, true, errp); > if (drv == NULL) { > return -ENOENT; > } > > - return bdrv_create(drv, filename, opts, errp); > + if (!drv->create_opts) { > + error_setg(errp, "Driver '%s' does not support image creation", > + drv->format_name); > + return -ENOTSUP; > + } > + > + qdict = qemu_opts_to_qdict(opts, NULL); > + new_opts = qemu_opts_from_qdict(drv->create_opts, qdict, errp); > + if (new_opts == NULL) { > + ret = -EINVAL; > + goto out; > + } > + > + ret = bdrv_create(drv, filename, new_opts, errp); > +out: > + qemu_opts_del(new_opts); > + qobject_unref(qdict); > + return ret; > }
Something like this, yes. Does it work for you? Of course, in the real patch it could use a comment why we're doing these seemingly redundant conversions. Kevin