On Wed, 06/13 10:06, Kevin Wolf wrote:
> On 13.06.2018 at 09:46, Fam Zheng wrote:
> > Similar to the host_device's implementation, we check the requested
> > length against the namespace size.
> >
> > Truncation is necessary to make qcow2 creation work.
> >
> > Signed-off-by: Fam Zheng <f...@redhat.com>
>
> > +static int coroutine_fn nvme_co_create_opts(const char *filename, QemuOpts *opts,
> > +                                            Error **errp)
> > +{
> > +    int ret = 0;
> > +    BlockDriverState *bs = NULL;
> > +    int64_t size;
> > +
> > +    if (strncmp(filename, "nvme://", strlen("nvme://"))) {
> > +        error_setg(errp, "Invalid filename (must start with \"nvme://\")");
> > +        ret = -EINVAL;
> > +        goto out;
> > +    }
> > +
> > +    bs = bdrv_open(filename, NULL, NULL, BDRV_O_RDWR | BDRV_O_PROTOCOL,
> > +                   errp);
> > +    if (!bs) {
> > +        ret = -EINVAL;
> > +        goto out;
> > +    }
> > +
> > +    size = qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0);
> > +
> > +    if (size < 0 || bdrv_getlength(bs) < size) {
> > +        error_setg(errp, "Invalid image size");
> > +        ret = -EINVAL;
> > +    }
> > +
> > +out:
> > +    bdrv_unref(bs);
> > +    /* Hold breath for a little while before letting image format creation run.
> > +     * The problem is when testing with Intel P3700, the controller doesn't
> > +     * like the immediate open after close, as a result, nvme_init() will fail.
> > +     * This works around that.
> > +     **/
> > +    g_usleep(2000000);
>
> This suggests that nbd_init() is buggy.
>
> If we need to sleep here (for two whole seconds?!), I'm sure there are
> other cases that would have to sleep as well. So even if we can't find a
> solution other than sleeping - which feels horribly wrong - the sleep
> should probably be in nvme_init() rather than here.
>
> What kind of error are you running into without the sleep?

The error would be the "Timeout while waiting for device to start..." in
nvme_init(), which is raised 20 seconds after setting the device's enable
bit.

If we put a sleep in nvme_init(), it will hurt the blockdev-add command and
QEMU launch badly, whereas being here it hurts x-blockdev-create, qemu-img
create, etc. Both are really bad, but the first is worse.

BTW nvme_init() already has to spin for a few seconds waiting for bit 0 of
CSTS in this loop:

    while (!(le32_to_cpu(s->regs->csts) & 0x1)) {
        if (qemu_clock_get_ns(QEMU_CLOCK_REALTIME) > deadline) {
            error_setg(errp, "Timeout while waiting for device to start (%"
                             PRId64 " ms)",
                       timeout_ms);
            ret = -ETIMEDOUT;
            goto fail_queue;
        }
    }

(We should probably insert a g_usleep(100) in the loop body, but that
doesn't make nvme_init() return any faster.)

My wild guess is that the controller doesn't respond correctly to the
setting of the CC.EN (device enable) bit when it is still internally busy
after a previous reset in nvme_close(). But perhaps it is the cleanup in
nvme_close() that is lame in the first place, compared to the complex
de-init procedure we have in vfio_pci_reset(). Notably, unbinding the device
from the Linux nvme.ko driver takes exactly 2 seconds, while nvme_close()
takes close to 0.

What this suggests is that cleanly shutting down the device does take about
two seconds, but with the simplistic nvme_close(), the work is left to the
controller or kernel to finish asynchronously. I'll see if I can figure out
what is missing.

Fam
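
For illustration only, the idea of sleeping inside the wait loop instead of
busy-spinning could look roughly like the standalone sketch below. It is not
QEMU code: wait_for_ready(), now_ns(), sleep_us() and CSTS_RDY are
hypothetical names, POSIX clock_gettime()/nanosleep() stand in for
qemu_clock_get_ns()/g_usleep(), and the status register is modeled as a
plain host-endian uint32_t, so the le32_to_cpu() conversion is omitted.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <time.h>

#define CSTS_RDY 0x1u /* hypothetical name for bit 0 of the status register */

/* Hypothetical helper: monotonic time in nanoseconds. */
static int64_t now_ns(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/* Hypothetical helper: sleep for the given number of microseconds. */
static void sleep_us(long us)
{
    struct timespec ts = { us / 1000000L, (us % 1000000L) * 1000L };

    nanosleep(&ts, NULL);
}

/*
 * Poll the ready bit until it is set or timeout_ms has elapsed.
 * Returns 0 once the bit is set, -ETIMEDOUT otherwise.
 * Sleeping poll_us between reads only frees the CPU; the total
 * wait is still bounded by how long the device takes to respond.
 */
static int wait_for_ready(volatile const uint32_t *csts,
                          int64_t timeout_ms, long poll_us)
{
    int64_t deadline;

    assert(poll_us > 0);
    deadline = now_ns() + timeout_ms * 1000000LL;

    while (!(*csts & CSTS_RDY)) {
        if (now_ns() > deadline) {
            return -ETIMEDOUT;
        }
        sleep_us(poll_us);
    }
    return 0;
}
```

A caller would pass a pointer to the mapped status register, for example
wait_for_ready(&csts, 20000, 100) for a 20 second timeout with 100
microsecond polls. As noted above, this does not make the wait any shorter;
it only avoids burning a CPU while waiting.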