Re: [PATCH v2 0/1] qemu-nbd: Close inherited stderr
On 5/14/20 4:55 PM, Eric Blake wrote: > On 5/14/20 9:29 AM, Eric Blake wrote: > >>> WARNING: Block comments use a leading /* on a separate line >>> #20: FILE: qemu-nbd.c:919: >>> + /* Remember parents stderr only if the fork option is set. >>> > >> The comment could use some grammar help (s/parents/parent's/), and in >> truth, I don't think it adds much beyond what the code itself is >> already doing, so rather than adding another line to silence patchew, >> you could instead just eliminate the comment and life would still be >> fine. Or if you want a one-line comment, I might suggest: >> >> /* Remember parent's stderr if we will restoring it. */ > > It helps if I don't hit 'send' too early. > > /* Remember parent's stderr if we will be restoring it. */ > From e5749541494abcdcaa37d752172741e1bc38e984 Mon Sep 17 00:00:00 2001 From: Raphael Pour Date: Fri, 15 May 2020 08:30:50 +0200 Subject: [PATCH] qemu-nbd: Close inherited stderr Close inherited stderr of the parent if fork_process is false. Otherwise no one will close it. (introduced by e6df58a5) Signed-off-by: Raphael Pour --- qemu-nbd.c | 6 +- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/qemu-nbd.c b/qemu-nbd.c index 4aa005004e..306e44fb0a 100644 --- a/qemu-nbd.c +++ b/qemu-nbd.c @@ -916,7 +916,11 @@ int main(int argc, char **argv) } else if (pid == 0) { close(stderr_fd[0]); -old_stderr = dup(STDERR_FILENO); +/* Remember parent's stderr if we will be restoring it. */ +if (fork_process) { +old_stderr = dup(STDERR_FILENO); +} + ret = qemu_daemon(1, 0); /* Temporarily redirect stderr to the parent's pipe... */ -- 2.25.4 I fixed the issues and checkpatch gave me a 'ready for submission'. Is this the right way to submit a fixed patch or should it be a v3? -- Hetzner Online GmbH Am Datacenter-Park 1 08223 Falkenstein/Vogtland raphael.p...@hetzner.com www.hetzner.com Registergericht Ansbach, HRB 6089 Geschäftsführer: Martin Hetzner, Stephan Konvickova, Günther Müller signature.asc Description: OpenPGP digital signature
Re: [PATCH v6 04/14] block/amend: separate amend and create options for qemu-img
On 14.05.20 18:10, Eric Blake wrote: > On 5/14/20 7:28 AM, Max Reitz wrote: >> On 10.05.20 15:40, Maxim Levitsky wrote: >>> Some options are only useful for creation >>> (or hard to be amended, like cluster size for qcow2), while some other >>> options are only useful for amend, like upcoming keyslot management >>> options for luks >>> > >>> +#define QCOW_COMMON_OPTIONS \ >>> + { \ > >>> + .help = "Width of a reference count entry in bits", \ >>> + .def_value_str = "16" \ >>> + } \ >> >> I think the last line should have a comma in it (otherwise the final >> backslash doesn’t make much sense, because whenever we’d add a new >> option, we would need to modify the line anyway to insert a comma). > > Except that... > >> >> Speaking of adding option, this requires a rebase due to the >> compression_type option added (not trivial in the strict sense, but >> still straightforward to handle). >> >>> + >>> static QemuOptsList qcow2_create_opts = { >>> .name = "qcow2-create-opts", >>> .head = QTAILQ_HEAD_INITIALIZER(qcow2_create_opts.head), >>> .desc = { >>> >> >> [...] >> >>> + QCOW_COMMON_OPTIONS, >>> + { /* end of list */ } > > ...the intended usage is to use the macro name followed by a comma, so > including a trailing comma in the macro itself would lead to a syntax > error. But why is that the indended usage? Is there something in our coding style that forbids macros that don’t allow a separator to be placed after them? Max signature.asc Description: OpenPGP digital signature
Re: [PATCH v3 0/3] various: Remove unnecessary casts
Philippe Mathieu-Daudé writes: > Remove unnecessary casts using coccinelle scripts. > > The CPU()/OBJECT() patches don't introduce logical change, > The DEVICE() one removes various OBJECT_CHECK() calls. Queued, thanks! Managing expecations: I'm not a QOM maintainer, I don't want to become one, and I don't normally queue QOM patches :)
Re: [PATCH v2 0/5] fix migration with bitmaps and mirror
14.05.2020 21:29, Eric Blake wrote: reviving this thread... On 4/3/20 6:23 AM, Peter Krempa wrote: On Fri, Apr 03, 2020 at 14:02:47 +0300, Vladimir Sementsov-Ogievskiy wrote: 19.12.2019 13:36, Peter Krempa wrote: On Thu, Dec 19, 2019 at 11:51:01 +0300, Vladimir Sementsov-Ogievskiy wrote: Hi all! It's a continuation for "bitmap migration bug with -drive while block mirror runs" <315cff78-dcdb-a3ce-2742-da3cc9f0c...@redhat.com> https://lists.gnu.org/archive/html/qemu-devel/2019-09/msg07241.html The problem is that bitmaps migrated to node with same node-name or blk-parent name. And currently only the latter actually work in libvirt. And with mirror-top filter it doesn't work, because bdrv_get_device_or_node_name don't go through filters. I want to point out that since libvirt-5.10 we use -blockdev to configure the backend of storage devices with qemu-4.2 and later. This means unfortunately that the BlockBackend of the drive does not have a name any more and thus the above will not work even if you make the lookup code to see through filters. Not that this series doesn't make things worse, as it loops through named block backends when trying to use their name for migration. So with these patches applied, qemu will just work in more possible scenarios. Okay, if that's so it's fair enough in this case. I'm just very firmly against baking in the assumption that node names mean the same thing accross migration, because that will create a precedent situation and more stuff may be baked in on top of this in the future. It seems that it has already happened though and it's wrong. And the worst part is that it's never mentioned that this might occur. But again, don't do that and preferrably remove the matching of node names for bitmaps altogether until we can control it arbitrarily. We've also seen this already before with the backend name of memory devices being baked in to the migration stream which creates an unwanted dependancy. Max is trying to tackle the node-name issue: https://lists.gnu.org/archive/html/qemu-devel/2020-05/msg03358.html And trying to apply that patch after staging this series hits a conflict in mnigration/block-dirty-bitmap.c. Which one should go in first? My patches are needed to fix migration for the pre-blockdev configuration with mirror-top filter. We ofcrouse need them in Virtuozzo, but it's not hard to keep the in downstream-only.. And it will be not simple to use new command from Max in pre-blockdev libvirt configuration, with auto-generated node-names. How much we care about pre-blockdev libvirt now in upstream Qemu? If we don't care, than these series are only for downstreams, and we don't need to apply them upstream.. On the other hand, Max have to resend anyway, to handle old code, which uses device name instead of node-name. And if we don't want to drop now the code which can use device name (needed for old libvirt), why not to apply the series, which just make old code better? In other words: do we still support pre-blockdev libvirt (and any other pre-blockdev users)? If we support, than, as I said somewhere, I need to resend these series as I have updated version in our downstream. And I think, I can rebase Max's patch by myself and send together with this all, if no objections. -- Best regards, Vladimir
Re: [PATCH v2] bitmaps: Update maintainer
14.05.2020 21:00, Eric Blake wrote: Dirty bitmaps are important to incremental backups, including exposure over NBD where I'm already maintainer. Also, I'm aware that lately I have been doing as much code/review on bitmaps as John Snow who is trying to scale back in order to focus elsewhere; and many of the recent patches have come from Vladimir, who is also interested in taking on maintainer duties, but would like to start with co-maintainership. Therefore, it's time to revamp the ownership of this category, as agreed between the three of us. Signed-off-by: Eric Blake --- v2: further tweak to maintainership, update T: listing MAINTAINERS | 7 --- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/MAINTAINERS b/MAINTAINERS index d11f3cb97613..ae23062a51ac 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -2001,8 +2001,9 @@ F: qapi/transaction.json T: git https://repo.or.cz/qemu/armbru.git block-next Dirty Bitmaps -M: John Snow -R: Vladimir Sementsov-Ogievskiy +M: Eric Blake +M: Vladimir Sementsov-Ogievskiy +R: John Snow L: qemu-block@nongnu.org S: Supported F: include/qemu/hbitmap.h @@ -2013,7 +2014,7 @@ F: migration/block-dirty-bitmap.c F: util/hbitmap.c F: tests/test-hbitmap.c F: docs/interop/bitmaps.rst -T: git https://github.com/jnsnow/qemu.git bitmaps +T: git https://repo.or.cz/qemu/ericb.git bitmaps Character device backends M: Marc-André Lureau Reviewed-by: Vladimir Sementsov-Ogievskiy -- Best regards, Vladimir
Re: [PATCH v3 2/3] various: Remove unnecessary OBJECT() cast
Philippe Mathieu-Daudé writes: > The OBJECT() macro is defined as: > > #define OBJECT(obj) ((Object *)(obj)) > > Remove the unnecessary OBJECT() casts when we already know the > pointer is of Object type. > > Patch created mechanically using spatch with this script: > > @@ > typedef Object; > Object *o; > @@ > - OBJECT(o) > + o > > Acked-by: Cornelia Huck > Acked-by: Corey Minyard > Acked-by: John Snow > Reviewed-by: Richard Henderson > Signed-off-by: Philippe Mathieu-Daudé Reviewed-by: Markus Armbruster
Re: [PATCH v3 1/1] qemu_img: add cvtnum_full to print error reports
On 5/13/20 8:36 AM, Eyal Moscovici wrote: All calls to cvtnum check the return value and print the same error message more or less. And so error reporting moved to cvtnum_full to reduce code duplication and provide a single error message. Additionally, cvtnum now wraps cvtnum_full with the existing default range of 0 to MAX_INT64. Acked-by: Mark Kanda Signed-off-by: Eyal Moscovici --- -static int64_t cvtnum(const char *s) +static int64_t cvtnum_full(const char *name, const char *value, int64_t min, + int64_t max) { int err; -uint64_t value; - -err = qemu_strtosz(s, NULL, &value); -if (err < 0) { +uint64_t res; + +err = qemu_strtosz(value, NULL, &res); +if (err < 0 && err != -ERANGE) { +error_report("Invalid %s specified. You may use " + "k, M, G, T, P or E suffixes for ", name); +error_report("kilobytes, megabytes, gigabytes, terabytes, " + "petabytes and exabytes."); Consecutive error_report() calls each output a newline, which means your new output includes a trailing space. @@ -572,16 +584,8 @@ static int img_create(int argc, char **argv) if (optind < argc) { int64_t sval; -sval = cvtnum(argv[optind++]); +sval = cvtnum("image size", argv[optind++]); if (sval < 0) { -if (sval == -ERANGE) { -error_report("Image size must be less than 8 EiB!"); -} else { -error_report("Invalid image size specified! You may use k, M, " - "G, T, P or E suffixes for "); -error_report("kilobytes, megabytes, gigabytes, terabytes, " - "petabytes and exabytes."); -} True, that's what some of the old code was doing, but... +++ b/tests/qemu-iotests/049.out qemu-img create -f qcow2 TEST_DIR/t.qcow2 -- 1kilobyte -qemu-img: Invalid image size specified! You may use k, M, G, T, P or E suffixes for +qemu-img: Invalid image size specified. You may use k, M, G, T, P or E suffixes for where it gets hairy is that our iotests _intentionally_ strip trailing space before comparing to expected output, because it is such a pain to commit files with trailing spaces into the repository. We're better off making the expected output precisely match what qemu-img actually outputs, which means using this as an opportunity to fix qemu-img to not output trailing space in the first place. -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3226 Virtualization: qemu.org | libvirt.org
Re: [PATCH] hw/ide/ahci: Log lost IRQs
On 5/4/20 5:48 AM, Philippe Mathieu-Daudé wrote: > One might find interesting to look at AHCI IRQs. > > Signed-off-by: Philippe Mathieu-Daudé > --- > hw/ide/ahci.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/hw/ide/ahci.c b/hw/ide/ahci.c > index 13d91e109a..fc82cbd5f1 100644 > --- a/hw/ide/ahci.c > +++ b/hw/ide/ahci.c > @@ -1509,6 +1509,7 @@ static void ahci_cmd_done(IDEDMA *dma) > > static void ahci_irq_set(void *opaque, int n, int level) > { > +qemu_log_mask(LOG_UNIMP, "ahci: IRQ#%d level:%d\n", n, level); > } > > static const IDEDMAOps ahci_dma_ops = { > Reviewed-by: John Snow Sorry, just drowning in backlog. Thanks for the ping on IRC. Acked-by: John Snow ^ Feel free to take through trivial tree. --js
Re: [PATCH] hw/ide: Make IDEDMAOps handlers take a const IDEDMA pointer
On 5/12/20 3:49 PM, Philippe Mathieu-Daudé wrote: > Handlers don't need to modify the IDEDMA structure. > Make it const. > > Signed-off-by: Philippe Mathieu-Daudé I'll trust your judgment. As long as it still compiles and passes qtests, I'm happy if you're happy. Acked-by: John Snow > --- > include/hw/ide/internal.h | 12 ++-- > hw/ide/ahci.c | 18 +- > hw/ide/core.c | 6 +++--- > hw/ide/macio.c| 6 +++--- > hw/ide/pci.c | 12 ++-- > 5 files changed, 27 insertions(+), 27 deletions(-) > > diff --git a/include/hw/ide/internal.h b/include/hw/ide/internal.h > index 55da35d768..1a7869e85d 100644 > --- a/include/hw/ide/internal.h > +++ b/include/hw/ide/internal.h > @@ -322,12 +322,12 @@ typedef enum { IDE_HD, IDE_CD, IDE_CFATA } IDEDriveKind; > > typedef void EndTransferFunc(IDEState *); > > -typedef void DMAStartFunc(IDEDMA *, IDEState *, BlockCompletionFunc *); > -typedef void DMAVoidFunc(IDEDMA *); > -typedef int DMAIntFunc(IDEDMA *, bool); > -typedef int32_t DMAInt32Func(IDEDMA *, int32_t len); > -typedef void DMAu32Func(IDEDMA *, uint32_t); > -typedef void DMAStopFunc(IDEDMA *, bool); > +typedef void DMAStartFunc(const IDEDMA *, IDEState *, BlockCompletionFunc *); > +typedef void DMAVoidFunc(const IDEDMA *); > +typedef int DMAIntFunc(const IDEDMA *, bool); > +typedef int32_t DMAInt32Func(const IDEDMA *, int32_t len); > +typedef void DMAu32Func(const IDEDMA *, uint32_t); > +typedef void DMAStopFunc(const IDEDMA *, bool); > > struct unreported_events { > bool eject_request; > diff --git a/hw/ide/ahci.c b/hw/ide/ahci.c > index 13d91e109a..168d34e9f2 100644 > --- a/hw/ide/ahci.c > +++ b/hw/ide/ahci.c > @@ -44,7 +44,7 @@ static int handle_cmd(AHCIState *s, int port, uint8_t slot); > static void ahci_reset_port(AHCIState *s, int port); > static bool ahci_write_fis_d2h(AHCIDevice *ad); > static void ahci_init_d2h(AHCIDevice *ad); > -static int ahci_dma_prepare_buf(IDEDMA *dma, int32_t limit); > +static int ahci_dma_prepare_buf(const IDEDMA *dma, int32_t limit); > static bool ahci_map_clb_address(AHCIDevice *ad); > static bool ahci_map_fis_address(AHCIDevice *ad); > static void ahci_unmap_clb_address(AHCIDevice *ad); > @@ -1338,7 +1338,7 @@ out: > } > > /* Transfer PIO data between RAM and device */ > -static void ahci_pio_transfer(IDEDMA *dma) > +static void ahci_pio_transfer(const IDEDMA *dma) > { > AHCIDevice *ad = DO_UPCAST(AHCIDevice, dma, dma); > IDEState *s = &ad->port.ifs[0]; > @@ -1397,7 +1397,7 @@ out: > } > } > > -static void ahci_start_dma(IDEDMA *dma, IDEState *s, > +static void ahci_start_dma(const IDEDMA *dma, IDEState *s, > BlockCompletionFunc *dma_cb) > { > AHCIDevice *ad = DO_UPCAST(AHCIDevice, dma, dma); > @@ -1406,7 +1406,7 @@ static void ahci_start_dma(IDEDMA *dma, IDEState *s, > dma_cb(s, 0); > } > > -static void ahci_restart_dma(IDEDMA *dma) > +static void ahci_restart_dma(const IDEDMA *dma) > { > /* Nothing to do, ahci_start_dma already resets s->io_buffer_offset. */ > } > @@ -1415,7 +1415,7 @@ static void ahci_restart_dma(IDEDMA *dma) > * IDE/PIO restarts are handled by the core layer, but NCQ commands > * need an extra kick from the AHCI HBA. > */ > -static void ahci_restart(IDEDMA *dma) > +static void ahci_restart(const IDEDMA *dma) > { > AHCIDevice *ad = DO_UPCAST(AHCIDevice, dma, dma); > int i; > @@ -1432,7 +1432,7 @@ static void ahci_restart(IDEDMA *dma) > * Called in DMA and PIO R/W chains to read the PRDT. > * Not shared with NCQ pathways. > */ > -static int32_t ahci_dma_prepare_buf(IDEDMA *dma, int32_t limit) > +static int32_t ahci_dma_prepare_buf(const IDEDMA *dma, int32_t limit) > { > AHCIDevice *ad = DO_UPCAST(AHCIDevice, dma, dma); > IDEState *s = &ad->port.ifs[0]; > @@ -1453,7 +1453,7 @@ static int32_t ahci_dma_prepare_buf(IDEDMA *dma, > int32_t limit) > * Called via dma_buf_commit, for both DMA and PIO paths. > * sglist destruction is handled within dma_buf_commit. > */ > -static void ahci_commit_buf(IDEDMA *dma, uint32_t tx_bytes) > +static void ahci_commit_buf(const IDEDMA *dma, uint32_t tx_bytes) > { > AHCIDevice *ad = DO_UPCAST(AHCIDevice, dma, dma); > > @@ -1461,7 +1461,7 @@ static void ahci_commit_buf(IDEDMA *dma, uint32_t > tx_bytes) > ad->cur_cmd->status = cpu_to_le32(tx_bytes); > } > > -static int ahci_dma_rw_buf(IDEDMA *dma, bool is_write) > +static int ahci_dma_rw_buf(const IDEDMA *dma, bool is_write) > { > AHCIDevice *ad = DO_UPCAST(AHCIDevice, dma, dma); > IDEState *s = &ad->port.ifs[0]; > @@ -1486,7 +1486,7 @@ static int ahci_dma_rw_buf(IDEDMA *dma, bool is_write) > return 1; > } > > -static void ahci_cmd_done(IDEDMA *dma) > +static void ahci_cmd_done(const IDEDMA *dma) > { > AHCIDevice *ad = DO_UPCAST(AHCIDevice, dma, dma); > > diff --git a/hw/ide/core.c b/hw/ide/core.c > index 689bb36409..d997a7
[PATCH 3/3] iotests: Categorize NOTRUN messages as INFO, not WARNING
It's not really a warning; we don't want to see it if we're not running in at least a verbose mode. We can see it on the test summary just fine. Signed-off-by: John Snow --- tests/qemu-iotests/iotests.py | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py index 9231767acf..8e479e1c6f 100644 --- a/tests/qemu-iotests/iotests.py +++ b/tests/qemu-iotests/iotests.py @@ -986,7 +986,7 @@ def notrun(reason): seq = os.path.basename(sys.argv[0]) open('%s/%s.notrun' % (output_dir, seq), 'w').write(reason + '\n') -logger.warning("%s not run: %s", seq, reason) +logger.info("%s not run: %s", seq, reason) sys.exit(0) def case_notrun(reason): -- 2.21.1
[PATCH 2/3] iotests: log to stderr instead of stdout
Separate the streams; stdout is for test diff output, stderr is for control messages and things for the human to look at. (Cough, unfortunately, I didn't realize that ./check actually just always redirects both, so even on STDERR, you can't see warnings. Oh well...) Signed-off-by: John Snow --- tests/qemu-iotests/iotests.py | 6 +- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py index 1caa7812de..9231767acf 100644 --- a/tests/qemu-iotests/iotests.py +++ b/tests/qemu-iotests/iotests.py @@ -1171,7 +1171,11 @@ def execute_setup_common(supported_fmts: Sequence[str] = (), debug = '-d' in sys.argv if debug: sys.argv.remove('-d') -logging.basicConfig(level=(logging.DEBUG if debug else logging.WARN)) + +logging.basicConfig( +level=logging.DEBUG if debug else logging.WARN, +stream=sys.stderr, +) _verify_image_format(supported_fmts, unsupported_fmts) _verify_protocol(supported_protocols, unsupported_protocols) -- 2.21.1
[PATCH 0/3] iotests: enable logging prior to notrun() invocation
Hi, you can take just patch 1. patches 2-3 admittedly don't do a whole heck of a lot, because I didn't realize that ./check discards *all* output from either stdout or stderr. The changes are tiny, though, and maybe still worth doing in the long run? Hm. They are archived on the list now, anyway. --js John Snow (3): iotests: log messages from notrun() iotests: log to stderr instead of stdout iotests: Categorize NOTRUN messages as INFO, not WARNING tests/qemu-iotests/iotests.py | 17 ++--- 1 file changed, 10 insertions(+), 7 deletions(-) -- 2.21.1
[PATCH 1/3] iotests: log messages from notrun()
Shift the logging initialization up to occur prior to validation checks, so that notrun() messages still get printed to console. (Also, remove the "debugging messages active" message, because we don't need to see that hundreds of times per iotest suite run.) Signed-off-by: John Snow --- tests/qemu-iotests/iotests.py | 11 +-- 1 file changed, 5 insertions(+), 6 deletions(-) diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py index 6c0e781af7..1caa7812de 100644 --- a/tests/qemu-iotests/iotests.py +++ b/tests/qemu-iotests/iotests.py @@ -1168,18 +1168,17 @@ def execute_setup_common(supported_fmts: Sequence[str] = (), sys.stderr.write('Please run this test via the "check" script\n') sys.exit(os.EX_USAGE) +debug = '-d' in sys.argv +if debug: +sys.argv.remove('-d') +logging.basicConfig(level=(logging.DEBUG if debug else logging.WARN)) + _verify_image_format(supported_fmts, unsupported_fmts) _verify_protocol(supported_protocols, unsupported_protocols) _verify_platform(supported=supported_platforms) _verify_cache_mode(supported_cache_modes) _verify_aio_mode(supported_aio_modes) -debug = '-d' in sys.argv -if debug: -sys.argv.remove('-d') -logging.basicConfig(level=(logging.DEBUG if debug else logging.WARN)) -logger.debug("iotests debugging messages active") - return debug def execute_test(*args, test_function=None, **kwargs): -- 2.21.1
Re: [PATCH v10 14/14] iotests: use python logging for iotests.log()
On 5/14/20 6:06 AM, Kevin Wolf wrote: > Am 14.05.2020 um 08:24 hat John Snow geschrieben: >> On 3/31/20 9:44 AM, Kevin Wolf wrote: >>> Am 31.03.2020 um 02:00 hat John Snow geschrieben: We can turn logging on/off globally instead of per-function. Remove use_log from run_job, and use python logging to turn on diffable output when we run through a script entry point. iotest 245 changes output order due to buffering reasons. An extended note on python logging: A NullHandler is added to `qemu.iotests` to stop output from being generated if this code is used as a library without configuring logging. A NullHandler is only needed at the root, so a duplicate handler is not needed for `qemu.iotests.diff_io`. When logging is not configured, messages at the 'WARNING' levels or above are printed with default settings. The NullHandler stops this from occurring, which is considered good hygiene for code used as a library. See https://docs.python.org/3/howto/logging.html#library-config When logging is actually enabled (always at the behest of an explicit call by a client script), a root logger is implicitly created at the root, which allows messages to propagate upwards and be handled/emitted from the root logger with default settings. When we want iotest logging, we attach a handler to the qemu.iotests.diff_io logger and disable propagation to avoid possible double-printing. For more information on python logging infrastructure, I highly recommend downloading the pip package `logging_tree`, which provides convenient visualizations of the hierarchical logging configuration under different circumstances. See https://pypi.org/project/logging_tree/ for more information. Signed-off-by: John Snow Reviewed-by: Max Reitz >>> >>> Should we enable logger if -d is given? >>> >>> Previously we had: >>> >>> $ ./check -d -T -raw 281 >>> [...] >>> 281 not run: not suitable for this image format: raw >>> 281 not run[15:39:03] [15:39:04]not suitable >>> for this image format: raw >>> Not run: 281 >>> >>> After this series, the first line of output from notrun() is missing. >>> Not that I think it's important to have the line, but as long as we >>> bother to call logger.warning(), I thought that maybe we want to be able >>> to actually see the effect of it somehwere? >>> >>> Kevin >>> >> >> Uh, okay. So this is weirder than I thought it was going to be! >> >> So, if you move the debug configuration up above the _verify calls, >> you'll see the message printed out to the debug stream: >> >> DEBUG:qemu.iotests:iotests debugging messages active >> WARNING:qemu.iotests:281 not run: not suitable for this image format: raw >> >> ...but if you omit the `-d` flag, the message vanishes into a black >> hole. Did it always work like that ...? > > Yes, this is how it used to work. It's a result of ./check only printing > the test output with -d, and such log messages are basically just test > output. > > And I think it's exactly what we want: Without -d, you want only the > summary, i.e. a single line that says "pass", "fail" or "notrun", > potentially with a small note at the end of the line, but that's it. > OK, maybe. So I guess what happens here is that if you don't use -d, the output gets redirected to file, and that file is summarily deleted. Your phrase "but as long as we bother to call logger.warning(), I thought that maybe we want to be able to actually see the effect of it somewhere" stuck with me -- I think you're right. I kind of do expect that if I call a function called warning() that it's gonna do some damage. principle of least surprise, etc. So two things: (1) Maybe the iotest logger ought to always use stderr, and we should see any calls to warning() or error() even when debugging is off. (2) These skip notifications are not warnings, they are informational and can be disabled when `-d` is omitted. (Especially because they are represented through another channel.) --js (I'll send the fixup for the simpler thing first, and you can take or leave the second thing.)
Re: [PATCH v4 1/3] qmp.py: change event_wait to use a dict
On 5/14/20 11:59 AM, Kevin Wolf wrote: > Am 14.05.2020 um 17:07 hat John Snow geschrieben: >> >> >> On 5/14/20 10:47 AM, Kevin Wolf wrote: >>> Am 14.05.2020 um 04:25 hat John Snow geschrieben: It's easier to work with than a list of tuples, because we can check the keys for membership. Signed-off-by: John Snow --- python/qemu/machine.py| 10 +- tests/qemu-iotests/040| 12 ++-- tests/qemu-iotests/260| 5 +++-- tests/qemu-iotests/iotests.py | 16 4 files changed, 22 insertions(+), 21 deletions(-) >>> >>> I think you need to convert scripts/simplebench/bench_block_job.py, too. >> >> Oh, right -- that one is new since I did this. A lot of these scripts >> need to be moved over into the python/ directory and managed under the >> same pylint/mypy regime as everything else. >> >> A *ton* of our scripts are in various states of disrepair. > > Is python/ actually supposed to have executable files in it? I thought > it was more for libraries. > Welll. At the moment it's library only. but one of the things you can do with a library is define executable entry-points into that library. If you haven't cast an eye at that 32 patch series yet, it basically creates a structure like this: > ./python/qemu/lib/[qmp|machine|qtest|accel].py qemu/ forms a PEP420 namespace; the idea is to be able to modularly create and independently package subpackages. qemu/lib forms a proper python package in which there are no executables, just a library, as you say. My idea is that anything under python/*/ ought to form a properly formatted package. So we could, for instance, have a python/qemu/devtools namespace which packages and collects a bunch of our little scripts. Then we could make sure we hit them with the same mypy/pylint/flake8/whatever as the core libraries those scripts are using to keep them in sync better. And, ideally, if they are all using the same kind of paradigms for import and dependency management it will be easier to use them and keep them up to date, etc. For using them as a developer, you could, say, cd ~/src/qemu/python pip3 install --user -e . and install the source packages to your local environment and then have access to e.g. > qmp-shell right on your CLI, without having to fuss with PYTHONPATH or anything else. As you update the source repo, you'll get the new versions of the package living in your python environment automatically. Of course, this maybe has downsides too; so you can always use a virtual environment to adopt a context in which you have these tools. For that, > pip3 install --user pipenv # or use dnf, or apt, w/e. > cd ~/src/qemu/python > pipenv shell > pip install -e . And from here you'll have the dev package installed to a development venv that you can use. *cough* anyway, that's wildly off-topic. Generally, you want to format a library such that you have a callable entry point, maybe named main(). So you'd have some qmp-shell module and it has a main() function. Then, in the setup.py script, you'd define qemu.lib.qmp_shell:main() as an entry point and give it a name like 'qmp-shell'. When pip/setuptools processes your package installation, it'll create a shim for you in e.g. ~/.local/bin/qmp-shell that will just load the library and execute that entrypoint for you. I was thinking I'd do this for all of our python scripts so I could spend my energy on a pylint/mypy test infrastructure *once* and *in one place* and then it would be easier to detect regressions for scripts that don't actually run as part of the test suite. >>> diff --git a/python/qemu/machine.py b/python/qemu/machine.py index b9a98e2c86..eaedc25172 100644 --- a/python/qemu/machine.py +++ b/python/qemu/machine.py @@ -478,21 +478,21 @@ def event_wait(self, name, timeout=60.0, match=None): timeout: QEMUMonitorProtocol.pull_event timeout parameter. match: Optional match criteria. See event_match for details. """ -return self.events_wait([(name, match)], timeout) +return self.events_wait({name: match}, timeout) def events_wait(self, events, timeout=60.0): """ events_wait waits for and returns a named event from QMP with a timeout. -events: a sequence of (name, match_criteria) tuples. +events: a mapping containing {name: match_criteria}. The match criteria are optional and may be None. See event_match for details. timeout: QEMUMonitorProtocol.pull_event timeout parameter. """ def _match(event): -for name, match in events: -if event['event'] == name and self.event_match(event, match): -return True +name = event['event'] +
Re: [PATCH v4 3/3] iotests: modify test 040 to use JobRunner
On 5/14/20 11:53 AM, Kevin Wolf wrote: > Am 14.05.2020 um 04:25 hat John Snow geschrieben: >> Instead of having somewhat reproduced it for itself. >> >> Signed-off-by: John Snow > > I think you should pass auto_dismiss=True to the JobRunner, or (probably > preferable) change prepare_and_start_job() to start the job with > auto_dismiss=False. > > Kevin > okay, I'll try that out and see if I like it. Wild tangents, as is my normal: I also think it would be neat, in some sense, to provide a job creation abstraction where creating the QMP command in python also creates the runner with the right parameters based on how you initialized it. I've not given these even a proper three minutes think, but some generalized interface for managing the creation of jobs to use in concert with the job runner would be slick. (What reminds me of this is needing to remember and understand if I started something with auto_dismiss or not, which jobs it defaults to which for, etc. Streamlining the creation and runner could be slick for faster test-writing in normative cases.)
Re: [PATCH v4 2/3] iotests: add JobRunner class
On 5/14/20 11:40 AM, Kevin Wolf wrote: > Am 14.05.2020 um 04:25 hat John Snow geschrieben: >> The idea is that instead of increasing the arguments to job_run all the >> time, create a more general-purpose job runner that can be subclassed to >> do interesting things with. >> >> pylint note: the 'callbacks' option guards against unused warning >> arguments in functions designated as callbacks. It does not currently >> guard against "no-self-use" though; hence a once-off ignore. >> >> mypy note: QapiEvent is only a weak alias; it's fully interchangable >> with the type it's declared as. In the future, we may wish to tighten >> these types. For now, this communicates the rough shape of the type and >> (more importantly) the intent. >> >> Signed-off-by: John Snow > >> +# Listen for these events with these parameters: >> +self._events = { >> +'BLOCK_JOB_COMPLETED': match_device, >> +'BLOCK_JOB_CANCELLED': match_device, >> +'BLOCK_JOB_ERROR': match_device, >> +'BLOCK_JOB_READY': match_device, >> +'BLOCK_JOB_PENDING': match_id, >> +'JOB_STATUS_CHANGE': match_id >> +} > > The old code had a trailing comma here in case we need to add more > events later. Anyway: > > Reviewed-by: Kevin Wolf > Whoops. I favor those too, so I'll put it back.
Re: [PATCH v2] bitmaps: Update maintainer
On 5/14/20 2:00 PM, Eric Blake wrote: > Dirty bitmaps are important to incremental backups, including exposure > over NBD where I'm already maintainer. Also, I'm aware that lately I > have been doing as much code/review on bitmaps as John Snow who is > trying to scale back in order to focus elsewhere; and many of the > recent patches have come from Vladimir, who is also interested in > taking on maintainer duties, but would like to start with > co-maintainership. Therefore, it's time to revamp the ownership of > this category, as agreed between the three of us. Great! > > Signed-off-by: Eric Blake > --- > > v2: further tweak to maintainership, update T: listing > > MAINTAINERS | 7 --- > 1 file changed, 4 insertions(+), 3 deletions(-) > > diff --git a/MAINTAINERS b/MAINTAINERS > index d11f3cb97613..ae23062a51ac 100644 > --- a/MAINTAINERS > +++ b/MAINTAINERS > @@ -2001,8 +2001,9 @@ F: qapi/transaction.json > T: git https://repo.or.cz/qemu/armbru.git block-next > > Dirty Bitmaps > -M: John Snow > -R: Vladimir Sementsov-Ogievskiy > +M: Eric Blake > +M: Vladimir Sementsov-Ogievskiy > +R: John Snow > L: qemu-block@nongnu.org > S: Supported > F: include/qemu/hbitmap.h > @@ -2013,7 +2014,7 @@ F: migration/block-dirty-bitmap.c > F: util/hbitmap.c > F: tests/test-hbitmap.c > F: docs/interop/bitmaps.rst > -T: git https://github.com/jnsnow/qemu.git bitmaps > +T: git https://repo.or.cz/qemu/ericb.git bitmaps > > Character device backends > M: Marc-André Lureau > Acked-by: John Snow Reviewed-by: John Snow You'll want to work out repo access betwixt yourselves, but I'll leave that detail for you to work out. Thank you, --js
Re: [PATCH v2 4/5] migration/block-dirty-bitmap: fix bitmaps migration during mirror job
On 12/19/19 2:51 AM, Vladimir Sementsov-Ogievskiy wrote: Important thing for bitmap migration is to select destination block node to obtain the migrated bitmap. Prepatch, on source we use bdrv_get_device_or_node_name() to identify the node, and on target we do bdrv_lookup_bs. bdrv_get_device_or_node_name() returns blk name only for direct children of blk. So, bitmaps of direct children of blks are migrated by blk name and others - by node name. Libvirt currently is unprepared to bitmap migration by node-name, node-names are mostly auto-generated. So actually only migration by blk name works. It depends on whether -blockdev is in use. With -blockdev, libvirt should be supplying a node name everywhere, without, it is only device names. Now, consider classic libvirt migrations assisted by mirror block job: mirror block job inserts filter, so our source is not a direct child of blk, and bitmaps are migrated by node-names. And this just don't work. Does Max' work to improve seeing through filters fix this? Let's fix it by allowing use blk-name even if some implicit filters are inserted. Note, that we possibly want to allow explicit filters skipping too, but this is another story. Note2: we, of course, can't skip filters and use blk name to migrate bitmaps in filtered node by blk name for this blk if these filters have named bitmaps which should be migrated. Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1652424 That bug has been marked CLOSED in the meantime, but it appears to be only because libvirt is now using -blockdev rather than the older drive, while the problem affects drive usage. Signed-off-by: Vladimir Sementsov-Ogievskiy --- migration/block-dirty-bitmap.c | 39 +- 1 file changed, 38 insertions(+), 1 deletion(-) Okay, after reading some more history on this project (the curse of coming up to speed after volunteering to become a co-maintainer), it looks like Max's idea replaces this patch altogether. How much of the rest of the series is still important? -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3226 Virtualization: qemu.org | libvirt.org
Re: [PATCH v2 1/5] block: Mark commit and mirror as filter drivers
On 1/31/20 1:23 PM, Eric Blake wrote: On 12/19/19 2:51 AM, Vladimir Sementsov-Ogievskiy wrote: From: Max Reitz The commit and mirror block nodes are filters, so they should be marked as such. Signed-off-by: Max Reitz Signed-off-by: Vladimir Sementsov-Ogievskiy [squash comment fix from another Max's patch and adjust commit msg] --- include/block/block_int.h | 8 +--- block/commit.c | 2 ++ block/mirror.c | 2 ++ 3 files changed, 9 insertions(+), 3 deletions(-) Reviewed-by: Eric Blake It looks like this patch has been updated and is now on Kevin's block branch: https://lists.gnu.org/archive/html/qemu-devel/2020-05/msg03271.html -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3226 Virtualization: qemu.org | libvirt.org
Re: [PATCH v2 3/5] block/dirty-bitmap: add bdrv_has_named_bitmaps helper
On 12/19/19 2:51 AM, Vladimir Sementsov-Ogievskiy wrote: To be used for bitmap migration in further commit. Signed-off-by: Vladimir Sementsov-Ogievskiy --- include/block/dirty-bitmap.h | 1 + block/dirty-bitmap.c | 13 + 2 files changed, 14 insertions(+) Reviewed-by: Eric Blake -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3226 Virtualization: qemu.org | libvirt.org
Re: [PATCH v2 0/5] fix migration with bitmaps and mirror
reviving this thread... On 4/3/20 6:23 AM, Peter Krempa wrote: On Fri, Apr 03, 2020 at 14:02:47 +0300, Vladimir Sementsov-Ogievskiy wrote: 19.12.2019 13:36, Peter Krempa wrote: On Thu, Dec 19, 2019 at 11:51:01 +0300, Vladimir Sementsov-Ogievskiy wrote: Hi all! It's a continuation for "bitmap migration bug with -drive while block mirror runs" <315cff78-dcdb-a3ce-2742-da3cc9f0c...@redhat.com> https://lists.gnu.org/archive/html/qemu-devel/2019-09/msg07241.html The problem is that bitmaps migrated to node with same node-name or blk-parent name. And currently only the latter actually work in libvirt. And with mirror-top filter it doesn't work, because bdrv_get_device_or_node_name don't go through filters. I want to point out that since libvirt-5.10 we use -blockdev to configure the backend of storage devices with qemu-4.2 and later. This means unfortunately that the BlockBackend of the drive does not have a name any more and thus the above will not work even if you make the lookup code to see through filters. Not that this series doesn't make things worse, as it loops through named block backends when trying to use their name for migration. So with these patches applied, qemu will just work in more possible scenarios. Okay, if that's so it's fair enough in this case. I'm just very firmly against baking in the assumption that node names mean the same thing accross migration, because that will create a precedent situation and more stuff may be baked in on top of this in the future. It seems that it has already happened though and it's wrong. And the worst part is that it's never mentioned that this might occur. But again, don't do that and preferrably remove the matching of node names for bitmaps altogether until we can control it arbitrarily. We've also seen this already before with the backend name of memory devices being baked in to the migration stream which creates an unwanted dependancy. Max is trying to tackle the node-name issue: https://lists.gnu.org/archive/html/qemu-devel/2020-05/msg03358.html And trying to apply that patch after staging this series hits a conflict in mnigration/block-dirty-bitmap.c. Which one should go in first? -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3226 Virtualization: qemu.org | libvirt.org
[PATCH v2] bitmaps: Update maintainer
Dirty bitmaps are important to incremental backups, including exposure over NBD where I'm already maintainer. Also, I'm aware that lately I have been doing as much code/review on bitmaps as John Snow who is trying to scale back in order to focus elsewhere; and many of the recent patches have come from Vladimir, who is also interested in taking on maintainer duties, but would like to start with co-maintainership. Therefore, it's time to revamp the ownership of this category, as agreed between the three of us. Signed-off-by: Eric Blake --- v2: further tweak to maintainership, update T: listing MAINTAINERS | 7 --- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/MAINTAINERS b/MAINTAINERS index d11f3cb97613..ae23062a51ac 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -2001,8 +2001,9 @@ F: qapi/transaction.json T: git https://repo.or.cz/qemu/armbru.git block-next Dirty Bitmaps -M: John Snow -R: Vladimir Sementsov-Ogievskiy +M: Eric Blake +M: Vladimir Sementsov-Ogievskiy +R: John Snow L: qemu-block@nongnu.org S: Supported F: include/qemu/hbitmap.h @@ -2013,7 +2014,7 @@ F: migration/block-dirty-bitmap.c F: util/hbitmap.c F: tests/test-hbitmap.c F: docs/interop/bitmaps.rst -T: git https://github.com/jnsnow/qemu.git bitmaps +T: git https://repo.or.cz/qemu/ericb.git bitmaps Character device backends M: Marc-André Lureau -- 2.26.2
Re: [PATCH] bitmaps: Add myself as maintainer
14.05.2020 16:52, Eric Blake wrote: On 5/14/20 12:08 AM, John Snow wrote: On 5/14/20 12:49 AM, Vladimir Sementsov-Ogievskiy wrote: 13.05.2020 23:24, John Snow wrote: On 5/13/20 10:14 AM, Eric Blake wrote: Dirty bitmaps are important to incremental backups, including exposure over NBD where I'm already maintainer. Also, I'm aware that lately I have been doing as much code/review on bitmaps as John Snow, who is hoping to scale back on this front. Signed-off-by: Eric Blake --- Dirty Bitmaps M: John Snow +M: Eric Blake R: Vladimir Sementsov-Ogievskiy L: qemu-block@nongnu.org S: Supported I'd also like to point out that I wouldn't mind if Vladimir became an official maintainer, but I can't remember if he wanted the title when we last spoke at KVM Forum. Actually, it would be nice, I'd glad to get it, thanks :) I can send a separate patch, or we may s/R/M/ in this one? That would be very good! I'd be quite happy to be demoted to reviewer; it's about all the time I've been truthfully able to give lately. (I won't speak for Eric!) I can post a v2 that produces the following results: M: Vladimir M: Eric R: John Does that sound reasonable? Great! -- Best regards, Vladimir
Re: [PATCH 0/2] block/block-copy: Fix uninitialized variable in block_copy_task_entry
Am 07.05.2020 um 14:11 hat Philippe Mathieu-Daudé geschrieben: > block_copy_task_entry() might be written differently to avoid > the initialization. This silents the warning and let me build. Thanks, applied to the block branch. Kevin
Re: [PATCH v6 04/14] block/amend: separate amend and create options for qemu-img
On 5/14/20 7:28 AM, Max Reitz wrote: On 10.05.20 15:40, Maxim Levitsky wrote: Some options are only useful for creation (or hard to be amended, like cluster size for qcow2), while some other options are only useful for amend, like upcoming keyslot management options for luks +#define QCOW_COMMON_OPTIONS \ +{ \ +.help = "Width of a reference count entry in bits", \ +.def_value_str = "16" \ +} \ I think the last line should have a comma in it (otherwise the final backslash doesn’t make much sense, because whenever we’d add a new option, we would need to modify the line anyway to insert a comma). Except that... Speaking of adding option, this requires a rebase due to the compression_type option added (not trivial in the strict sense, but still straightforward to handle). + static QemuOptsList qcow2_create_opts = { .name = "qcow2-create-opts", .head = QTAILQ_HEAD_INITIALIZER(qcow2_create_opts.head), .desc = { [...] +QCOW_COMMON_OPTIONS, +{ /* end of list */ } ...the intended usage is to use the macro name followed by a comma, so including a trailing comma in the macro itself would lead to a syntax error. I think the better fix is to eliminate the trailing \ on the final line, and have '}' without a trailing comma in the macro. -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3226 Virtualization: qemu.org | libvirt.org
Re: [PATCH v6 13/14] block/qcow2: implement blockdev-amend
On 10.05.20 15:40, Maxim Levitsky wrote: > Currently the implementation only supports amending the encryption > options, unlike the qemu-img version > > Signed-off-by: Maxim Levitsky > Reviewed-by: Daniel P. Berrangé > --- > block/qcow2.c| 39 +++ > qapi/block-core.json | 16 +++- > 2 files changed, 54 insertions(+), 1 deletion(-) Reviewed-by: Max Reitz signature.asc Description: OpenPGP digital signature
Re: [PATCH v6 12/14] block/crypto: implement blockdev-amend
On 10.05.20 15:40, Maxim Levitsky wrote: > Signed-off-by: Maxim Levitsky > Reviewed-by: Daniel P. Berrangé > --- > block/crypto.c | 72 > qapi/block-core.json | 14 - > 2 files changed, 66 insertions(+), 20 deletions(-) Reviewed-by: Max Reitz signature.asc Description: OpenPGP digital signature
Re: [PATCH v4 1/3] qmp.py: change event_wait to use a dict
Am 14.05.2020 um 17:07 hat John Snow geschrieben: > > > On 5/14/20 10:47 AM, Kevin Wolf wrote: > > Am 14.05.2020 um 04:25 hat John Snow geschrieben: > >> It's easier to work with than a list of tuples, because we can check the > >> keys for membership. > >> > >> Signed-off-by: John Snow > >> --- > >> python/qemu/machine.py| 10 +- > >> tests/qemu-iotests/040| 12 ++-- > >> tests/qemu-iotests/260| 5 +++-- > >> tests/qemu-iotests/iotests.py | 16 > >> 4 files changed, 22 insertions(+), 21 deletions(-) > > > > I think you need to convert scripts/simplebench/bench_block_job.py, too. > > Oh, right -- that one is new since I did this. A lot of these scripts > need to be moved over into the python/ directory and managed under the > same pylint/mypy regime as everything else. > > A *ton* of our scripts are in various states of disrepair. Is python/ actually supposed to have executable files in it? I thought it was more for libraries. > > > >> diff --git a/python/qemu/machine.py b/python/qemu/machine.py > >> index b9a98e2c86..eaedc25172 100644 > >> --- a/python/qemu/machine.py > >> +++ b/python/qemu/machine.py > >> @@ -478,21 +478,21 @@ def event_wait(self, name, timeout=60.0, match=None): > >> timeout: QEMUMonitorProtocol.pull_event timeout parameter. > >> match: Optional match criteria. See event_match for details. > >> """ > >> -return self.events_wait([(name, match)], timeout) > >> +return self.events_wait({name: match}, timeout) > >> > >> def events_wait(self, events, timeout=60.0): > >> """ > >> events_wait waits for and returns a named event from QMP with a > >> timeout. > >> > >> -events: a sequence of (name, match_criteria) tuples. > >> +events: a mapping containing {name: match_criteria}. > >> The match criteria are optional and may be None. > >> See event_match for details. timeout: > >> QEMUMonitorProtocol.pull_event timeout parameter. > >> """ > >> def _match(event): > >> -for name, match in events: > >> -if event['event'] == name and self.event_match(event, > >> match): > >> -return True > >> +name = event['event'] > >> +if name in events: > >> +return self.event_match(event, events[name]) > > > > This part confused me a bit for a second. Of course, that's probably > > mostly just me, but I feel 'events' isn't a good name any more when the > > values of the dict are match strings rather than events. > > > > This is honestly a really bad function. When I was trying to type > everything, this one was at the bottom of the pile and it was the worst. > > It needs an overhaul. > > In my 32 patch series, I left the "match" types as "Any" pretty much > everywhere, because it's such a laissez-faire series of functions. It would require recursive types, which aren't supported yet. So I guess Any is the best we can do at the moment. > I'll keep the feedback in mind. > > >> return False > >> > >> # Search cached events > >> diff --git a/tests/qemu-iotests/040 b/tests/qemu-iotests/040 > >> index 32c82b4ec6..90b59081ff 100755 > >> --- a/tests/qemu-iotests/040 > >> +++ b/tests/qemu-iotests/040 > >> @@ -485,12 +485,12 @@ class TestErrorHandling(iotests.QMPTestCase): > >> > >> def run_job(self, expected_events, error_pauses_job=False): > >> match_device = {'data': {'device': 'job0'}} > >> -events = [ > >> -('BLOCK_JOB_COMPLETED', match_device), > >> -('BLOCK_JOB_CANCELLED', match_device), > >> -('BLOCK_JOB_ERROR', match_device), > >> -('BLOCK_JOB_READY', match_device), > >> -] > >> +events = { > >> +'BLOCK_JOB_COMPLETED': match_device, > >> +'BLOCK_JOB_CANCELLED': match_device, > >> +'BLOCK_JOB_ERROR': match_device, > >> +'BLOCK_JOB_READY': match_device, > >> +} > >> > >> completed = False > >> log = [] > >> diff --git a/tests/qemu-iotests/260 b/tests/qemu-iotests/260 > >> index 804a7addb9..729f031122 100755 > >> --- a/tests/qemu-iotests/260 > >> +++ b/tests/qemu-iotests/260 > >> @@ -67,8 +67,9 @@ def test(persistent, restart): > >> > >> vm.qmp_log('block-commit', device='drive0', top=top, > >> filters=[iotests.filter_qmp_testfiles]) > >> -ev = vm.events_wait((('BLOCK_JOB_READY', None), > >> - ('BLOCK_JOB_COMPLETED', None))) > >> +ev = vm.events_wait({ > >> +'BLOCK_JOB_READY': None, > >> +'BLOCK_JOB_COMPLETED': None }) > > > > So, I'm not sure if this is nitpicking or rather bikeshedding, but > > having the closing brackets on the next line would be more consistent > > with the other instances in this patch. > > > > Nah, it's fine. I'll
Re: [PATCH v4 3/3] iotests: modify test 040 to use JobRunner
Am 14.05.2020 um 04:25 hat John Snow geschrieben: > Instead of having somewhat reproduced it for itself. > > Signed-off-by: John Snow I think you should pass auto_dismiss=True to the JobRunner, or (probably preferable) change prepare_and_start_job() to start the job with auto_dismiss=False. Kevin
Re: [PATCH v6 11/14] block/core: add generic infrastructure for x-blockdev-amend qmp command
On 10.05.20 15:40, Maxim Levitsky wrote: > blockdev-amend will be used similiar to blockdev-create > to allow on the fly changes of the structure of the format based block > devices. > > Current plan is to first support encryption keyslot management for luks > based formats (raw and embedded in qcow2) > > Signed-off-by: Maxim Levitsky > Reviewed-by: Daniel P. Berrangé > --- > block/Makefile.objs | 2 +- > block/amend.c | 108 ++ > include/block/block_int.h | 21 +--- > qapi/block-core.json | 42 +++ > qapi/job.json | 4 +- > 5 files changed, 169 insertions(+), 8 deletions(-) > create mode 100644 block/amend.c [...] > diff --git a/block/amend.c b/block/amend.c > new file mode 100644 > index 00..4840c0ffef > --- /dev/null > +++ b/block/amend.c [...] > +void qmp_x_blockdev_amend(const char *job_id, > + const char *node_name, > + BlockdevAmendOptions *options, > + bool has_force, > + bool force, > + Error **errp) > +{ > +BlockdevAmendJob *s; > +const char *fmt = BlockdevDriver_str(options->driver); > +BlockDriver *drv = bdrv_find_format(fmt); > +BlockDriverState *bs = bdrv_find_node(node_name); > + > +/* > + * If the driver is in the schema, we know that it exists. I wonder why create.c then still checks whether drv == NULL. I wouldn’t count on the schema. For example, with modularized block driver I could imagine that a driver appears in the schema, but loading the module fails, so that drv still ends up NULL. > But it may not > + * be whitelisted. > + */ > +assert(drv); > +if (bdrv_uses_whitelist() && !bdrv_is_whitelisted(drv, false)) { > +error_setg(errp, "Driver is not whitelisted"); > +return; > +} [...] > diff --git a/include/block/block_int.h b/include/block/block_int.h > index 0a71357b50..fdb0cdbcdd 100644 > --- a/include/block/block_int.h > +++ b/include/block/block_int.h > @@ -133,12 +133,27 @@ struct BlockDriver { > int (*bdrv_file_open)(BlockDriverState *bs, QDict *options, int flags, >Error **errp); > void (*bdrv_close)(BlockDriverState *bs); > + > + > int coroutine_fn (*bdrv_co_create)(BlockdevCreateOptions *opts, > Error **errp); > int coroutine_fn (*bdrv_co_create_opts)(BlockDriver *drv, > const char *filename, > QemuOpts *opts, > Error **errp); > + > +int coroutine_fn (*bdrv_co_amend)(BlockDriverState *bs, > + BlockdevAmendOptions *opts, > + bool force, > + Error **errp); > + > +int (*bdrv_amend_options)(BlockDriverState *bs, > + QemuOpts *opts, > + BlockDriverAmendStatusCB *status_cb, > + void *cb_opaque, > + bool force, > + Error **errp); > + > int (*bdrv_make_empty)(BlockDriverState *bs); > > /* > @@ -433,12 +448,6 @@ struct BlockDriver { >BdrvCheckResult *result, >BdrvCheckMode fix); > > -int (*bdrv_amend_options)(BlockDriverState *bs, QemuOpts *opts, > - BlockDriverAmendStatusCB *status_cb, > - void *cb_opaque, > - bool force, > - Error **errp); > - No harm done, but why not just leave it where it was? > void (*bdrv_debug_event)(BlockDriverState *bs, BlkdebugEvent event); > > /* TODO Better pass a option string/QDict/QemuOpts to add any rule? */ > diff --git a/qapi/block-core.json b/qapi/block-core.json > index 943df1926a..74db515414 100644 > --- a/qapi/block-core.json > +++ b/qapi/block-core.json > @@ -4649,6 +4649,48 @@ >'data': { 'job-id': 'str', > 'options': 'BlockdevCreateOptions' } } > > +## > +# @BlockdevAmendOptions: > +# > +# Options for amending an image format > +# > +# @driver block driver that is suitable for the image Missing colon after @driver. Also, what does “suitable” mean? Shouldn’t it be exactly the node’s driver? (i.e. “Block driver of the node to amend”) Max signature.asc Description: OpenPGP digital signature
Re: [PATCH v4 2/3] iotests: add JobRunner class
Am 14.05.2020 um 04:25 hat John Snow geschrieben: > The idea is that instead of increasing the arguments to job_run all the > time, create a more general-purpose job runner that can be subclassed to > do interesting things with. > > pylint note: the 'callbacks' option guards against unused warning > arguments in functions designated as callbacks. It does not currently > guard against "no-self-use" though; hence a once-off ignore. > > mypy note: QapiEvent is only a weak alias; it's fully interchangable > with the type it's declared as. In the future, we may wish to tighten > these types. For now, this communicates the rough shape of the type and > (more importantly) the intent. > > Signed-off-by: John Snow > +# Listen for these events with these parameters: > +self._events = { > +'BLOCK_JOB_COMPLETED': match_device, > +'BLOCK_JOB_CANCELLED': match_device, > +'BLOCK_JOB_ERROR': match_device, > +'BLOCK_JOB_READY': match_device, > +'BLOCK_JOB_PENDING': match_id, > +'JOB_STATUS_CHANGE': match_id > +} The old code had a trailing comma here in case we need to add more events later. Anyway: Reviewed-by: Kevin Wolf
Re: [PATCH v4 1/3] qmp.py: change event_wait to use a dict
On 5/14/20 10:47 AM, Kevin Wolf wrote: > Am 14.05.2020 um 04:25 hat John Snow geschrieben: >> It's easier to work with than a list of tuples, because we can check the >> keys for membership. >> >> Signed-off-by: John Snow >> --- >> python/qemu/machine.py| 10 +- >> tests/qemu-iotests/040| 12 ++-- >> tests/qemu-iotests/260| 5 +++-- >> tests/qemu-iotests/iotests.py | 16 >> 4 files changed, 22 insertions(+), 21 deletions(-) > > I think you need to convert scripts/simplebench/bench_block_job.py, too. Oh, right -- that one is new since I did this. A lot of these scripts need to be moved over into the python/ directory and managed under the same pylint/mypy regime as everything else. A *ton* of our scripts are in various states of disrepair. > >> diff --git a/python/qemu/machine.py b/python/qemu/machine.py >> index b9a98e2c86..eaedc25172 100644 >> --- a/python/qemu/machine.py >> +++ b/python/qemu/machine.py >> @@ -478,21 +478,21 @@ def event_wait(self, name, timeout=60.0, match=None): >> timeout: QEMUMonitorProtocol.pull_event timeout parameter. >> match: Optional match criteria. See event_match for details. >> """ >> -return self.events_wait([(name, match)], timeout) >> +return self.events_wait({name: match}, timeout) >> >> def events_wait(self, events, timeout=60.0): >> """ >> events_wait waits for and returns a named event from QMP with a >> timeout. >> >> -events: a sequence of (name, match_criteria) tuples. >> +events: a mapping containing {name: match_criteria}. >> The match criteria are optional and may be None. >> See event_match for details. timeout: >> QEMUMonitorProtocol.pull_event timeout parameter. >> """ >> def _match(event): >> -for name, match in events: >> -if event['event'] == name and self.event_match(event, >> match): >> -return True >> +name = event['event'] >> +if name in events: >> +return self.event_match(event, events[name]) > > This part confused me a bit for a second. Of course, that's probably > mostly just me, but I feel 'events' isn't a good name any more when the > values of the dict are match strings rather than events. > This is honestly a really bad function. When I was trying to type everything, this one was at the bottom of the pile and it was the worst. It needs an overhaul. In my 32 patch series, I left the "match" types as "Any" pretty much everywhere, because it's such a laissez-faire series of functions. I'll keep the feedback in mind. >> return False >> >> # Search cached events >> diff --git a/tests/qemu-iotests/040 b/tests/qemu-iotests/040 >> index 32c82b4ec6..90b59081ff 100755 >> --- a/tests/qemu-iotests/040 >> +++ b/tests/qemu-iotests/040 >> @@ -485,12 +485,12 @@ class TestErrorHandling(iotests.QMPTestCase): >> >> def run_job(self, expected_events, error_pauses_job=False): >> match_device = {'data': {'device': 'job0'}} >> -events = [ >> -('BLOCK_JOB_COMPLETED', match_device), >> -('BLOCK_JOB_CANCELLED', match_device), >> -('BLOCK_JOB_ERROR', match_device), >> -('BLOCK_JOB_READY', match_device), >> -] >> +events = { >> +'BLOCK_JOB_COMPLETED': match_device, >> +'BLOCK_JOB_CANCELLED': match_device, >> +'BLOCK_JOB_ERROR': match_device, >> +'BLOCK_JOB_READY': match_device, >> +} >> >> completed = False >> log = [] >> diff --git a/tests/qemu-iotests/260 b/tests/qemu-iotests/260 >> index 804a7addb9..729f031122 100755 >> --- a/tests/qemu-iotests/260 >> +++ b/tests/qemu-iotests/260 >> @@ -67,8 +67,9 @@ def test(persistent, restart): >> >> vm.qmp_log('block-commit', device='drive0', top=top, >> filters=[iotests.filter_qmp_testfiles]) >> -ev = vm.events_wait((('BLOCK_JOB_READY', None), >> - ('BLOCK_JOB_COMPLETED', None))) >> +ev = vm.events_wait({ >> +'BLOCK_JOB_READY': None, >> +'BLOCK_JOB_COMPLETED': None }) > > So, I'm not sure if this is nitpicking or rather bikeshedding, but > having the closing brackets on the next line would be more consistent > with the other instances in this patch. > Nah, it's fine. I'll clean it up. This is pretty close to an RFC series anyway, so I didn't really polish it. (Or, I will try to clean it up. I probably won't work on it again in the near term. I think I just wanted to see if this seemed useful in general to people. As part of maybe moving the python library onto a package, I thought that maybe developing a JobRunner tool would be useful to have in that library. As you can see, I nestled it into iotests.py, though.) >> log(filter_qmp_event(ev)) >>
Re: [PATCH 0/2] block/block-copy: Fix uninitialized variable in block_copy_task_entry
On 5/7/20 2:11 PM, Philippe Mathieu-Daudé wrote: block_copy_task_entry() might be written differently to avoid the initialization. This silents the warning and let me build. Philippe Mathieu-Daudé (2): block/block-copy: Fix uninitialized variable in block_copy_task_entry block/block-copy: Simplify block_copy_do_copy() block/block-copy.c | 14 -- 1 file changed, 4 insertions(+), 10 deletions(-) ping?
Re: [PATCH v4 6/9] qemu-img: Add bitmap sub-command
14.05.2020 17:20, Eric Blake wrote: On 5/14/20 1:45 AM, Vladimir Sementsov-Ogievskiy wrote: 13.05.2020 04:16, Eric Blake wrote: Include actions for --add, --remove, --clear, --enable, --disable, and --merge (note that --clear is a bit of fluff, because the same can be accomplished by removing a bitmap and then adding a new one in its place, but it matches what QMP commands exist). Listing is omitted, because it does not require a bitmap name and because it was already possible with 'qemu-img info'. A single command line can play one or more bitmap commands in sequence on the same bitmap name (although all added bitmaps share the same granularity, and and all merged bitmaps come from the same source file). Merge defaults to other bitmaps in the primary image, but can also be told to merge bitmaps from a distinct image. I'm sorry for asking it only now on v4.. But still. Why do we need it? Ease of use. We can instead run qemu binary (or even new qemu-storage-daemon) and just use existing qmp commands. Is there a real benefit in developing qemu-img, maintaining two interfaces for the same thing? If it makes someone's life easier, and is not hard to maintain, then yes. A command line interface that calls into QMP is not hard to maintain. And _I_ certainly found it easier to write iotests with this patch in place, so it already has at least one client. Of-course, just run qmp commands from terminal is a lot less comfortable than just a qemu img command.. But may be we need some wrapper, which make it simple to run one qmp command on an image? It's simple to make a python wrapper working like qemu-qmp block-dirty-bitmap-add '{node: self, name: bitmap0, persistent: true}' /path/to/x.qcow2 This _IS_ such a wrapper. The whole point of this patch is that it is now simpler to run one (or more) QMP command on an offline image from the command line. Just because I wrote it in C instead of python, and attached it to an existing tool instead of writing a new tool, doesn't change the fact that it is just a wrapper around the existing QMP commands. OK, that's right. The thing that I didn't like is that we have to make cli-to-qapi interface mapping by hand. But I see now that interface you implementing is prepared to make several actions with same bitmap-name, which can't be achieved with some kind of automatic interface matching anyway, so my proposal don't match your needs, sorry for my haste) -- Best regards, Vladimir
Re: [PATCH v6 07/14] block/crypto: implement the encryption key management
On 14.05.20 16:14, Daniel P. Berrangé wrote: > On Thu, May 14, 2020 at 04:09:59PM +0200, Max Reitz wrote: >> On 10.05.20 15:40, Maxim Levitsky wrote: >>> This implements the encryption key management using the generic code in >>> qcrypto layer and exposes it to the user via qemu-img >>> >>> This code adds another 'write_func' because the initialization >>> write_func works directly on the underlying file, and amend >>> works on instance of luks device. >>> >>> This commit also adds a 'hack/workaround' I and Kevin Wolf (thanks) >>> made to make the driver both support write sharing (to avoid breaking the >>> users), >>> and be safe against concurrent metadata update (the keyslots) >>> >>> Eventually the write sharing for luks driver will be deprecated >>> and removed together with this hack. >>> >>> The hack is that we ask (as a format driver) for BLK_PERM_CONSISTENT_READ >>> and then when we want to update the keys, we unshare that permission. >>> So if someone else has the image open, even readonly, encryption >>> key update will fail gracefully. >>> >>> Also thanks to Daniel Berrange for the idea of >>> unsharing read, rather that write permission which allows >>> to avoid cases when the other user had opened the image read-only. >>> >>> Signed-off-by: Maxim Levitsky >>> Reviewed-by: Daniel P. Berrangé >>> --- >>> block/crypto.c | 127 +++-- >>> block/crypto.h | 34 + >>> 2 files changed, 158 insertions(+), 3 deletions(-) >>> >>> diff --git a/block/crypto.c b/block/crypto.c >>> index 2e16b62bdc..b14cb0ff06 100644 >>> --- a/block/crypto.c >>> +++ b/block/crypto.c >> >> [...] >> >>> +static void >>> +block_crypto_child_perms(BlockDriverState *bs, BdrvChild *c, >>> + const BdrvChildRole *role, >>> + BlockReopenQueue *reopen_queue, >>> + uint64_t perm, uint64_t shared, >>> + uint64_t *nperm, uint64_t *nshared) >>> +{ >>> + >>> +BlockCrypto *crypto = bs->opaque; >>> + >>> +bdrv_filter_default_perms(bs, c, role, reopen_queue, >>> +perm, shared, nperm, nshared); >>> +/* >>> + * Ask for consistent read permission so that if >>> + * someone else tries to open this image with this permission >>> + * neither will be able to edit encryption keys, since >>> + * we will unshare that permission while trying to >>> + * update the encryption keys >>> + */ >>> +if (!(bs->open_flags & BDRV_O_NO_IO)) { >>> +*nperm |= BLK_PERM_CONSISTENT_READ; >>> +} >> >> I’m not sure this is important, because this really means we won’t do >> I/O. Its only relevant use in this case is for qemu-img info. Do we >> really care if someone edits the key slots while qemu-img info is >> processing? > > FWIW, OpenStack runs qemu-img info in a periodic background job, so > it can be concurrent with anything else they are running. That might actually be a problem then, because this may cause sporadic failure when trying to change (amend) keyslots; while qemu-img info holds the CONSISTENT_READ permission, the amend process can’t unshare it. That might lead to hard-to-track-down bugs. > Having said > that due to previous QEMU bugs, they unconditonally pass the arg to > qemu-img to explicitly disable locking Well, then it doesn’t matter in this case. But still something to consider, probably. Max signature.asc Description: OpenPGP digital signature
Re: [PATCH v6 09/14] iotests: filter few more luks specific create options
On 10.05.20 15:40, Maxim Levitsky wrote: > This allows more tests to be able to have same output on both qcow2 luks > encrypted images > and raw luks images > > Signed-off-by: Maxim Levitsky > Reviewed-by: Daniel P. Berrangé > --- > tests/qemu-iotests/087.out | 6 ++--- > tests/qemu-iotests/134.out | 2 +- > tests/qemu-iotests/158.out | 4 +-- > tests/qemu-iotests/188.out | 2 +- > tests/qemu-iotests/189.out | 4 +-- > tests/qemu-iotests/198.out | 4 +-- > tests/qemu-iotests/263.out | 4 +-- > tests/qemu-iotests/274.out | 46 > tests/qemu-iotests/284.out | 6 ++--- > tests/qemu-iotests/common.filter | 6 +++-- > 10 files changed, 43 insertions(+), 41 deletions(-) [...] > diff --git a/tests/qemu-iotests/274.out b/tests/qemu-iotests/274.out > index 9d6fdeb1f7..59de176b99 100644 > --- a/tests/qemu-iotests/274.out > +++ b/tests/qemu-iotests/274.out > @@ -1,9 +1,9 @@ > == Commit tests == > -Formatting 'TEST_DIR/PID-base', fmt=qcow2 size=2097152 cluster_size=65536 > lazy_refcounts=off refcount_bits=16 > +Formatting 'TEST_DIR/PID-base', fmt=qcow2 cluster_size=65536 size=2097152 > lazy_refcounts=off refcount_bits=16 > > -Formatting 'TEST_DIR/PID-mid', fmt=qcow2 size=1048576 > backing_file=TEST_DIR/PID-base cluster_size=65536 lazy_refcounts=off > refcount_bits=16 > +Formatting 'TEST_DIR/PID-mid', fmt=qcow2 cluster_size=65536 size=1048576 > backing_file=TEST_DIR/PID-base lazy_refcounts=off refcount_bits=16 > > -Formatting 'TEST_DIR/PID-top', fmt=qcow2 size=2097152 > backing_file=TEST_DIR/PID-mid cluster_size=65536 lazy_refcounts=off > refcount_bits=16 > +Formatting 'TEST_DIR/PID-top', fmt=qcow2 cluster_size=65536 size=2097152 > backing_file=TEST_DIR/PID-mid lazy_refcounts=off refcount_bits=16 @size and @cluster_size swapping positions doesn’t look right for this patch. I think all hunks for 274.out should be in patch 5. Max signature.asc Description: OpenPGP digital signature
Re: [PATCH v2 0/1] qemu-nbd: Close inherited stderr
On 5/14/20 9:29 AM, Eric Blake wrote: WARNING: Block comments use a leading /* on a separate line #20: FILE: qemu-nbd.c:919: + /* Remember parents stderr only if the fork option is set. The comment could use some grammar help (s/parents/parent's/), and in truth, I don't think it adds much beyond what the code itself is already doing, so rather than adding another line to silence patchew, you could instead just eliminate the comment and life would still be fine. Or if you want a one-line comment, I might suggest: /* Remember parent's stderr if we will restoring it. */ It helps if I don't hit 'send' too early. /* Remember parent's stderr if we will be restoring it. */ -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3226 Virtualization: qemu.org | libvirt.org
Re: [PATCH v4 1/3] qmp.py: change event_wait to use a dict
Am 14.05.2020 um 04:25 hat John Snow geschrieben: > It's easier to work with than a list of tuples, because we can check the > keys for membership. > > Signed-off-by: John Snow > --- > python/qemu/machine.py| 10 +- > tests/qemu-iotests/040| 12 ++-- > tests/qemu-iotests/260| 5 +++-- > tests/qemu-iotests/iotests.py | 16 > 4 files changed, 22 insertions(+), 21 deletions(-) I think you need to convert scripts/simplebench/bench_block_job.py, too. > diff --git a/python/qemu/machine.py b/python/qemu/machine.py > index b9a98e2c86..eaedc25172 100644 > --- a/python/qemu/machine.py > +++ b/python/qemu/machine.py > @@ -478,21 +478,21 @@ def event_wait(self, name, timeout=60.0, match=None): > timeout: QEMUMonitorProtocol.pull_event timeout parameter. > match: Optional match criteria. See event_match for details. > """ > -return self.events_wait([(name, match)], timeout) > +return self.events_wait({name: match}, timeout) > > def events_wait(self, events, timeout=60.0): > """ > events_wait waits for and returns a named event from QMP with a > timeout. > > -events: a sequence of (name, match_criteria) tuples. > +events: a mapping containing {name: match_criteria}. > The match criteria are optional and may be None. > See event_match for details. timeout: > QEMUMonitorProtocol.pull_event timeout parameter. > """ > def _match(event): > -for name, match in events: > -if event['event'] == name and self.event_match(event, match): > -return True > +name = event['event'] > +if name in events: > +return self.event_match(event, events[name]) This part confused me a bit for a second. Of course, that's probably mostly just me, but I feel 'events' isn't a good name any more when the values of the dict are match strings rather than events. > return False > > # Search cached events > diff --git a/tests/qemu-iotests/040 b/tests/qemu-iotests/040 > index 32c82b4ec6..90b59081ff 100755 > --- a/tests/qemu-iotests/040 > +++ b/tests/qemu-iotests/040 > @@ -485,12 +485,12 @@ class TestErrorHandling(iotests.QMPTestCase): > > def run_job(self, expected_events, error_pauses_job=False): > match_device = {'data': {'device': 'job0'}} > -events = [ > -('BLOCK_JOB_COMPLETED', match_device), > -('BLOCK_JOB_CANCELLED', match_device), > -('BLOCK_JOB_ERROR', match_device), > -('BLOCK_JOB_READY', match_device), > -] > +events = { > +'BLOCK_JOB_COMPLETED': match_device, > +'BLOCK_JOB_CANCELLED': match_device, > +'BLOCK_JOB_ERROR': match_device, > +'BLOCK_JOB_READY': match_device, > +} > > completed = False > log = [] > diff --git a/tests/qemu-iotests/260 b/tests/qemu-iotests/260 > index 804a7addb9..729f031122 100755 > --- a/tests/qemu-iotests/260 > +++ b/tests/qemu-iotests/260 > @@ -67,8 +67,9 @@ def test(persistent, restart): > > vm.qmp_log('block-commit', device='drive0', top=top, > filters=[iotests.filter_qmp_testfiles]) > -ev = vm.events_wait((('BLOCK_JOB_READY', None), > - ('BLOCK_JOB_COMPLETED', None))) > +ev = vm.events_wait({ > +'BLOCK_JOB_READY': None, > +'BLOCK_JOB_COMPLETED': None }) So, I'm not sure if this is nitpicking or rather bikeshedding, but having the closing brackets on the next line would be more consistent with the other instances in this patch. > log(filter_qmp_event(ev)) > if (ev['event'] == 'BLOCK_JOB_COMPLETED'): > vm.shutdown() Kevin
Re: [RFC PATCH 0/3] block: Synchronous bdrv_*() from coroutine in different AioContext
Am 14.05.2020 um 15:21 hat Thomas Lamprecht geschrieben: > On 5/12/20 4:43 PM, Kevin Wolf wrote: > > Stefan (Reiter), after looking a bit closer at this, I think there is no > > bug in QEMU, but the bug is in your coroutine code that calls block > > layer functions without moving into the right AioContext first. I've > > written this series anyway as it potentially makes the life of callers > > easier and would probably make your buggy code correct. > > > However, it doesn't feel right to commit something like patch 2 without > > having a user for it. Is there a reason why you can't upstream your > > async snapshot code? > > I mean I understand what you mean, but it would make the interface IMO so > much easier to use, if one wants to explicit schedule it beforehand they > can still do. But that would open the way for two styles doing things, not > sure if this would seen as bad. The assert about from patch 3/3 would be > already really helping a lot, though. I think patches 1 and 3 are good to be committed either way if people think they are useful. They make sense without the async snapshot code. My concern with the interface in patch 2 is both that it could give people a false sense of security and that it would be tempting to write inefficient code. Usually, you won't have just a single call into the block layer for a given block node, but you'll perform multiple operations. Switching to the target context once rather than switching back and forth in every operation is obviously more efficient. But chances are that even if one of these function is bdrv_flush(), which now works correctly from a different thread, you might need another function that doesn't implement the same magic. So you always need to be aware which functions support cross-context calls which ones don't. I feel we'd see a few bugs related to this. > Regarding upstreaming, there was some historical attempt to upstream it > from Dietmar, but in the time frame of ~ 8 to 10 years ago or so. > I'm not quite sure why it didn't went through then, I see if I can get > some time searching the mailing list archive. > > We'd be naturally open and glad to upstream it, what it effectively > allow us to do is to not block the VM to much during snapshoting it > live. Yes, there is no doubt that this is useful functionality. There has been talk about this every now and then, but I don't think we ever got to a point where it actually could be implemented. Vladimir, I seem to remember you (or someone else from your team?) were interested in async snapshots as well a while ago? > I pushed a tree[0] with mostly just that specific code squashed together (hope > I did not break anything), most of the actual code is in commit [1]. > It'd be cleaned up a bit and checked for coding style issues, but works good > here. > > Anyway, thanks for your help and pointers! > > [0]: https://github.com/ThomasLamprecht/qemu/tree/savevm-async > [1]: > https://github.com/ThomasLamprecht/qemu/commit/ffb9531f370ef0073e4b6f6021f4c47ccd702121 It doesn't even look that bad in terms of patch size. I had imagined it a bit larger. But it seems this is not really just an async 'savevm' (which would save the VM state in a qcow2 file), but you store the state in a separate raw file. What is the difference between this and regular migration into a file? I remember people talking about how snapshotting can store things in a way that a normal migration stream can't do, like overwriting outdated RAM state instead of just appending the new state, but you don't seem to implement something like this. Kevin
Re: [PATCH v2 2/3] hw/mips/mips_int: Use qdev gpio rather than qemu_allocate_irqs()
On 5/14/20 3:24 PM, Peter Maydell wrote: On Tue, 12 May 2020 at 08:48, Philippe Mathieu-Daudé wrote: Switch to using the qdev gpio API which is preferred over qemu_allocate_irqs(). One step to eventually deprecate and remove qemu_allocate_irqs() one day. diff --git a/hw/mips/mips_int.c b/hw/mips/mips_int.c index 796730b11d..91788c51a9 100644 --- a/hw/mips/mips_int.c +++ b/hw/mips/mips_int.c @@ -74,14 +74,12 @@ static void cpu_mips_irq_request(void *opaque, int irq, int level) void cpu_mips_irq_init_cpu(MIPSCPU *cpu) { CPUMIPSState *env = &cpu->env; -qemu_irq *qi; int i; -qi = qemu_allocate_irqs(cpu_mips_irq_request, cpu, 8); +qdev_init_gpio_in(DEVICE(cpu), cpu_mips_irq_request, 8); for (i = 0; i < 8; i++) { -env->irq[i] = qi[i]; +env->irq[i] = qdev_get_gpio_in(DEVICE(cpu), i); } -g_free(qi); } Similar comments as with the openrisc patch apply here: * ideally this code should be in target/mips/, not in hw/mips * board code should call qdev_get_gpio_in() to get the IRQ line, rather than fishing env->irq[] out of the CPU object directly This is a bit more complicated than with openrisc, because there's more than just a single board model, and not all MIPS boards call cpu_mips_irq_init_cpu() so figuring out whether MIPS CPUs should always provide inbound CPU lines, or only those with some particular feature, or something else, would need some investigation or MIPS knowledge. Yes, I'm aware of at least 3 different to map interrupts to a MIPS core. QEMU models at least 2. As X86, MIPS code use old QEMU API, I don't see the codebase being upgraded anytime soon. This is another borderline case between architecture and hardware, as the cache units, or the ARM NVIC. Ideally we should keep target/* free of references to hw/* code. In my experience it often give troubles. But this is an OK first step anyway, so Reviewed-by: Peter Maydell Thanks. The idea to keep qemu_allocate_irqs() as internal as possible, and have devices use the qdev GPIO API. thanks -- PMM
Re: [PATCH v6 08/14] block/qcow2: extend qemu-img amend interface with crypto options
On 10.05.20 15:40, Maxim Levitsky wrote: > Now that we have all the infrastructure in place, > wire it in the qcow2 driver and expose this to the user. > > Signed-off-by: Maxim Levitsky > Reviewed-by: Daniel P. Berrangé > --- > block/qcow2.c | 72 +- > tests/qemu-iotests/082.out | 45 Again, some rebasing required because of compression_type. > 2 files changed, 108 insertions(+), 9 deletions(-) > > diff --git a/block/qcow2.c b/block/qcow2.c > index db86500839..4bb6e3fc8f 100644 > --- a/block/qcow2.c > +++ b/block/qcow2.c [...] > @@ -4744,17 +4757,11 @@ static BlockMeasureInfo *qcow2_measure(QemuOpts > *opts, BlockDriverState *in_bs, > g_free(optstr); > > if (has_luks) { > + Why? With this line dropped, and 082.out fixed up: Reviewed-by: Max Reitz signature.asc Description: OpenPGP digital signature
Re: [RFC PATCH 3/3] block: Assert we're running in the right thread
Am 14.05.2020 um 15:52 hat Stefan Reiter geschrieben: > On 5/12/20 4:43 PM, Kevin Wolf wrote: > > tracked_request_begin() is called for most I/O operations, so it's a > > good place to assert that we're indeed running in the home thread of the > > node's AioContext. > > > > Is this patch supposed to be always correct or only together with nr. 2? > > I changed our code to call bdrv_flush_all from the main AIO context and it > certainly works just fine (even without this series, so I suppose that would > be the 'correct' way to fix it you mention on the cover), though of course > it trips this assert without patch 2. Yes, I think this is a basic assumption that should always be true. This series shouldn't fix anything for upstream QEMU (at least I'm not aware of anything that needs it), so the assertion could be added even without the other patches. In fact, I'm currently thinking that committing just patch 1 (because it's a nice cleanup anyway) and patch 3 (because it will let us know when we mess up) might make sense. Kevin > > Signed-off-by: Kevin Wolf > > --- > > block/io.c | 5 - > > 1 file changed, 4 insertions(+), 1 deletion(-) > > > > diff --git a/block/io.c b/block/io.c > > index 7808e8bdc0..924bf5ba46 100644 > > --- a/block/io.c > > +++ b/block/io.c > > @@ -695,14 +695,17 @@ static void tracked_request_begin(BdrvTrackedRequest > > *req, > > uint64_t bytes, > > enum BdrvTrackedRequestType type) > > { > > +Coroutine *self = qemu_coroutine_self(); > > + > > assert(bytes <= INT64_MAX && offset <= INT64_MAX - bytes); > > +assert(bs->aio_context == qemu_coroutine_get_aio_context(self)); > > *req = (BdrvTrackedRequest){ > > .bs = bs, > > .offset = offset, > > .bytes = bytes, > > .type = type, > > -.co = qemu_coroutine_self(), > > +.co = self, > > .serialising= false, > > .overlap_offset = offset, > > .overlap_bytes = bytes, > > >
Re: [PATCH RFC 03/32] python//machine.py: remove bare except
On 5/14/20 9:55 AM, Eric Blake wrote: > On 5/14/20 12:53 AM, John Snow wrote: >> Catch only the timeout error; if there are other problems, allow the >> stack trace to be visible. >> > > A lot of patches in this series start with "python//" - is that > intentional, or should it be a single slash? > Was trying to imply an elided path structure, because "python/qemu/lib/qmp.py" et al is way too chatty. Feel free to suggest better subject names!
Re: [PATCH v2 0/1] qemu-nbd: Close inherited stderr
On 5/14/20 7:51 AM, no-re...@patchew.org wrote: Patchew URL: https://patchew.org/QEMU/20200514063112.1457125-1-raphael.p...@hetzner.com/ Hi, This series seems to have some coding style problems. See output below for more information: Message-id: 20200514063112.1457125-1-raphael.p...@hetzner.com Subject: [PATCH v2 0/1] qemu-nbd: Close inherited stderr Type: series === OUTPUT BEGIN === WARNING: Block comments use a leading /* on a separate line #20: FILE: qemu-nbd.c:919: +/* Remember parents stderr only if the fork option is set. ERROR: suspect code indent for conditional statements (12, 14) #23: FILE: qemu-nbd.c:922: +if (fork_process) { + old_stderr = dup(STDERR_FILENO); ERROR: Missing Signed-off-by: line(s) Patchew is correct. All three things should be fixed (the missing S-o-b is most important - I can't supply that; the missing spaces and comment formatting are something I could touch up). The comment could use some grammar help (s/parents/parent's/), and in truth, I don't think it adds much beyond what the code itself is already doing, so rather than adding another line to silence patchew, you could instead just eliminate the comment and life would still be fine. Or if you want a one-line comment, I might suggest: /* Remember parent's stderr if we will restoring it. */ -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3226 Virtualization: qemu.org | libvirt.org
Re: [PATCH] bitmaps: Add myself as maintainer
On 5/14/20 9:52 AM, Eric Blake wrote: > On 5/14/20 12:08 AM, John Snow wrote: >> >> >> On 5/14/20 12:49 AM, Vladimir Sementsov-Ogievskiy wrote: >>> 13.05.2020 23:24, John Snow wrote: On 5/13/20 10:14 AM, Eric Blake wrote: > Dirty bitmaps are important to incremental backups, including exposure > over NBD where I'm already maintainer. Also, I'm aware that lately I > have been doing as much code/review on bitmaps as John Snow, who is > hoping to scale back on this front. > > Signed-off-by: Eric Blake > > --- > > Dirty Bitmaps > M: John Snow > +M: Eric Blake > R: Vladimir Sementsov-Ogievskiy > L: qemu-block@nongnu.org > S: Supported > I'd also like to point out that I wouldn't mind if Vladimir became an official maintainer, but I can't remember if he wanted the title when we last spoke at KVM Forum. >>> >>> Actually, it would be nice, I'd glad to get it, thanks :) >>> I can send a separate patch, or we may s/R/M/ in this one? >>> >> >> That would be very good! >> >> I'd be quite happy to be demoted to reviewer; it's about all the time >> I've been truthfully able to give lately. >> >> (I won't speak for Eric!) > > I can post a v2 that produces the following results: > > M: Vladimir > M: Eric > R: John > > Does that sound reasonable? > > Yeah, I will approve it. I want to help as much as I can, but I don't want to artificially limit the rate of development here as I fear I have been. --js
Re: [PATCH v6 07/14] block/crypto: implement the encryption key management
On Thu, May 14, 2020 at 04:09:59PM +0200, Max Reitz wrote: > On 10.05.20 15:40, Maxim Levitsky wrote: > > This implements the encryption key management using the generic code in > > qcrypto layer and exposes it to the user via qemu-img > > > > This code adds another 'write_func' because the initialization > > write_func works directly on the underlying file, and amend > > works on instance of luks device. > > > > This commit also adds a 'hack/workaround' I and Kevin Wolf (thanks) > > made to make the driver both support write sharing (to avoid breaking the > > users), > > and be safe against concurrent metadata update (the keyslots) > > > > Eventually the write sharing for luks driver will be deprecated > > and removed together with this hack. > > > > The hack is that we ask (as a format driver) for BLK_PERM_CONSISTENT_READ > > and then when we want to update the keys, we unshare that permission. > > So if someone else has the image open, even readonly, encryption > > key update will fail gracefully. > > > > Also thanks to Daniel Berrange for the idea of > > unsharing read, rather that write permission which allows > > to avoid cases when the other user had opened the image read-only. > > > > Signed-off-by: Maxim Levitsky > > Reviewed-by: Daniel P. Berrangé > > --- > > block/crypto.c | 127 +++-- > > block/crypto.h | 34 + > > 2 files changed, 158 insertions(+), 3 deletions(-) > > > > diff --git a/block/crypto.c b/block/crypto.c > > index 2e16b62bdc..b14cb0ff06 100644 > > --- a/block/crypto.c > > +++ b/block/crypto.c > > [...] > > > +static void > > +block_crypto_child_perms(BlockDriverState *bs, BdrvChild *c, > > + const BdrvChildRole *role, > > + BlockReopenQueue *reopen_queue, > > + uint64_t perm, uint64_t shared, > > + uint64_t *nperm, uint64_t *nshared) > > +{ > > + > > +BlockCrypto *crypto = bs->opaque; > > + > > +bdrv_filter_default_perms(bs, c, role, reopen_queue, > > +perm, shared, nperm, nshared); > > +/* > > + * Ask for consistent read permission so that if > > + * someone else tries to open this image with this permission > > + * neither will be able to edit encryption keys, since > > + * we will unshare that permission while trying to > > + * update the encryption keys > > + */ > > +if (!(bs->open_flags & BDRV_O_NO_IO)) { > > +*nperm |= BLK_PERM_CONSISTENT_READ; > > +} > > I’m not sure this is important, because this really means we won’t do > I/O. Its only relevant use in this case is for qemu-img info. Do we > really care if someone edits the key slots while qemu-img info is > processing? FWIW, OpenStack runs qemu-img info in a periodic background job, so it can be concurrent with anything else they are running. Having said that due to previous QEMU bugs, they unconditonally pass the arg to qemu-img to explicitly disable locking Regards, Daniel -- |: https://berrange.com -o-https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o-https://fstop138.berrange.com :| |: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|
Re: [PATCH v4 6/9] qemu-img: Add bitmap sub-command
On 5/14/20 1:45 AM, Vladimir Sementsov-Ogievskiy wrote: 13.05.2020 04:16, Eric Blake wrote: Include actions for --add, --remove, --clear, --enable, --disable, and --merge (note that --clear is a bit of fluff, because the same can be accomplished by removing a bitmap and then adding a new one in its place, but it matches what QMP commands exist). Listing is omitted, because it does not require a bitmap name and because it was already possible with 'qemu-img info'. A single command line can play one or more bitmap commands in sequence on the same bitmap name (although all added bitmaps share the same granularity, and and all merged bitmaps come from the same source file). Merge defaults to other bitmaps in the primary image, but can also be told to merge bitmaps from a distinct image. I'm sorry for asking it only now on v4.. But still. Why do we need it? Ease of use. We can instead run qemu binary (or even new qemu-storage-daemon) and just use existing qmp commands. Is there a real benefit in developing qemu-img, maintaining two interfaces for the same thing? If it makes someone's life easier, and is not hard to maintain, then yes. A command line interface that calls into QMP is not hard to maintain. And _I_ certainly found it easier to write iotests with this patch in place, so it already has at least one client. Of-course, just run qmp commands from terminal is a lot less comfortable than just a qemu img command.. But may be we need some wrapper, which make it simple to run one qmp command on an image? It's simple to make a python wrapper working like qemu-qmp block-dirty-bitmap-add '{node: self, name: bitmap0, persistent: true}' /path/to/x.qcow2 This _IS_ such a wrapper. The whole point of this patch is that it is now simpler to run one (or more) QMP command on an offline image from the command line. Just because I wrote it in C instead of python, and attached it to an existing tool instead of writing a new tool, doesn't change the fact that it is just a wrapper around the existing QMP commands. -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3226 Virtualization: qemu.org | libvirt.org
Re: [PATCH v4 5/9] blockdev: Split off basic bitmap operations for qemu-img
On 5/14/20 1:21 AM, Vladimir Sementsov-Ogievskiy wrote: 13.05.2020 04:16, Eric Blake wrote: Upcoming patches want to add some basic bitmap manipulation abilities to qemu-img. But blockdev.o is too heavyweight to link into qemu-img (among other things, it would drag in block jobs and transaction support - qemu-img does offline manipulation, where atomicity is less important because there are no concurrent modifications to compete with), so it's time to split off the bare bones of what we will need into a new file block/monitor/bitmap-qmp-cmds.o. This is sufficient to expose 6 QMP commands for use by qemu-img (add, remove, clear, enable, disable, merge), as well as move the three helper functions touched in the previous patch. Regarding MAINTAINERS, the new file is automatically part of block core, but also makes sense as related to other dirty bitmap files. Signed-off-by: Eric Blake Reviewed-by: Max Reitz --- +++ b/block/monitor/bitmap-qmp-cmds.c Hmm, shouldn't transaction bitmap actions be moved here too? May be, not in these series.. No, the very reason that we split this file off of blockdev.c is that transactions are NOT needed in qemu-img. You have to have separate .o files for what qemu-img needs, vs. what the rest of qemu proper needs, and transactions are only needed in qemu proper at the moment. If we ever need transactions in Kevin's qemu-storage-daemon but not all of blockdev.c, then we'd need yet another .c file independent from either blockdev.c or this patch's new bitmap-qmp-cmds.c. @@ -0,0 +1,323 @@ +/* + * QEMU host block device bitmaps A bit conflicts with tha fact that they are not of block-device level but of node-level. May be just "Block dirty bitmap qmp commands" ? Copy-and-paste from blockdev.c, tweaked by adding one word. Your wording is also fine. + * + * Copyright (c) 2003-2008 Fabrice Bellard Does it really apply here? block-dirty-bitmap-add was added in 2015.. May be Red Hat and Virtuozzo copyrights would be more appropriate. When splitting a file, the safest course of action is to preserve ALL copyright from the original file into the new file. Adding additional copyright lines is okay as part of submitting new functionality, but in this case, I don't feel comfortable adding Red Hat copyright (my split isn't adding new functionality), and I don't have authorization to add Virtuozzo copyright (as that is not my employer). +#include "qemu/osdep.h" + +#include "sysemu/blockdev.h" +#include "block/block.h" +#include "block/block_int.h" +#include "qapi/qapi-commands-block.h" +#include "qapi/error.h" compiles for with only four: #include "qemu/osdep.h" #include "block/block_int.h" #include "qapi/qapi-commands-block.h" #include "qapi/error.h" Thanks. I blame rebase; an earlier version used a different header in patch 4/9; when I moved things to block_int.h in that patch, I didn't realize that this patch could be improved. with at least extra includes dropped: Reviewed-by: Vladimir Sementsov-Ogievskiy -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3226 Virtualization: qemu.org | libvirt.org
Re: [PATCH v4 4/9] blockdev: Promote several bitmap functions to non-static
On 5/14/20 6:45 AM, Vladimir Sementsov-Ogievskiy wrote: 13.05.2020 04:16, Eric Blake wrote: - HBitmap **backup, Error **errp) +BdrvDirtyBitmap *block_dirty_bitmap_merge(const char *node, const char *target, + BlockDirtyBitmapMergeSourceList *bitmaps, + HBitmap **backup, Error **errp) { BlockDriverState *bs; s/bitmaps/bms/ to match declaration and fit into 80 characters Can do, although it has ripple effects to 5/9 (as I wanted that to be pure code motion). -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3226 Virtualization: qemu.org | libvirt.org
Re: [PATCH v4 3/9] block: Make it easier to learn which BDS support bitmaps
On 5/14/20 12:19 AM, Vladimir Sementsov-Ogievskiy wrote: In the future, when we improve the ability to look up bitmaps through a filter, we will probably also want to teach the block layer to automatically let filters pass this request on through. Hm. I think that bitmap at filter bs is a valid thing (moreover I have a plan to use it for one issue), so I'm not sure that it's good idea to do any generic logic around bitmaps work through filters, better to always address the exact node you mean.. In the immediate term, the only user of this new function is qemu-img, and qemu-img is not setting up filters, so whether this callback can see through filters is moot (that is, we are always addressing the exact node the user passed on the command line, for whatever bitmap operations qemu-img does). In the long term, being able to see bitmaps through filters is better addressed after we finally include Max's series that include filter handling overall. Signed-off-by: Eric Blake Reviewed-by: Vladimir Sementsov-Ogievskiy Thanks. -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3226 Virtualization: qemu.org | libvirt.org
Re: [PATCH v6 07/14] block/crypto: implement the encryption key management
On 10.05.20 15:40, Maxim Levitsky wrote: > This implements the encryption key management using the generic code in > qcrypto layer and exposes it to the user via qemu-img > > This code adds another 'write_func' because the initialization > write_func works directly on the underlying file, and amend > works on instance of luks device. > > This commit also adds a 'hack/workaround' I and Kevin Wolf (thanks) > made to make the driver both support write sharing (to avoid breaking the > users), > and be safe against concurrent metadata update (the keyslots) > > Eventually the write sharing for luks driver will be deprecated > and removed together with this hack. > > The hack is that we ask (as a format driver) for BLK_PERM_CONSISTENT_READ > and then when we want to update the keys, we unshare that permission. > So if someone else has the image open, even readonly, encryption > key update will fail gracefully. > > Also thanks to Daniel Berrange for the idea of > unsharing read, rather that write permission which allows > to avoid cases when the other user had opened the image read-only. > > Signed-off-by: Maxim Levitsky > Reviewed-by: Daniel P. Berrangé > --- > block/crypto.c | 127 +++-- > block/crypto.h | 34 + > 2 files changed, 158 insertions(+), 3 deletions(-) > > diff --git a/block/crypto.c b/block/crypto.c > index 2e16b62bdc..b14cb0ff06 100644 > --- a/block/crypto.c > +++ b/block/crypto.c [...] > +static void > +block_crypto_child_perms(BlockDriverState *bs, BdrvChild *c, > + const BdrvChildRole *role, > + BlockReopenQueue *reopen_queue, > + uint64_t perm, uint64_t shared, > + uint64_t *nperm, uint64_t *nshared) > +{ > + > +BlockCrypto *crypto = bs->opaque; > + > +bdrv_filter_default_perms(bs, c, role, reopen_queue, > +perm, shared, nperm, nshared); > +/* > + * Ask for consistent read permission so that if > + * someone else tries to open this image with this permission > + * neither will be able to edit encryption keys, since > + * we will unshare that permission while trying to > + * update the encryption keys > + */ > +if (!(bs->open_flags & BDRV_O_NO_IO)) { > +*nperm |= BLK_PERM_CONSISTENT_READ; > +} I’m not sure this is important, because this really means we won’t do I/O. Its only relevant use in this case is for qemu-img info. Do we really care if someone edits the key slots while qemu-img info is processing? OTOH, I don’t think it’ll harm much. Well, apart from the fact that BDRV_O_NO_IO won’t do much for LUKS images. *shrug* Reviewed-by: Max Reitz signature.asc Description: OpenPGP digital signature
Re: [PATCH RFC 03/32] python//machine.py: remove bare except
On 5/14/20 12:53 AM, John Snow wrote: Catch only the timeout error; if there are other problems, allow the stack trace to be visible. A lot of patches in this series start with "python//" - is that intentional, or should it be a single slash? Signed-off-by: John Snow --- python/qemu/lib/machine.py | 33 + 1 file changed, 21 insertions(+), 12 deletions(-) -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3226 Virtualization: qemu.org | libvirt.org
Re: [PATCH] bitmaps: Add myself as maintainer
On 5/14/20 12:08 AM, John Snow wrote: On 5/14/20 12:49 AM, Vladimir Sementsov-Ogievskiy wrote: 13.05.2020 23:24, John Snow wrote: On 5/13/20 10:14 AM, Eric Blake wrote: Dirty bitmaps are important to incremental backups, including exposure over NBD where I'm already maintainer. Also, I'm aware that lately I have been doing as much code/review on bitmaps as John Snow, who is hoping to scale back on this front. Signed-off-by: Eric Blake --- Dirty Bitmaps M: John Snow +M: Eric Blake R: Vladimir Sementsov-Ogievskiy L: qemu-block@nongnu.org S: Supported I'd also like to point out that I wouldn't mind if Vladimir became an official maintainer, but I can't remember if he wanted the title when we last spoke at KVM Forum. Actually, it would be nice, I'd glad to get it, thanks :) I can send a separate patch, or we may s/R/M/ in this one? That would be very good! I'd be quite happy to be demoted to reviewer; it's about all the time I've been truthfully able to give lately. (I won't speak for Eric!) I can post a v2 that produces the following results: M: Vladimir M: Eric R: John Does that sound reasonable? -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3226 Virtualization: qemu.org | libvirt.org
Re: [RFC PATCH 3/3] block: Assert we're running in the right thread
On 5/12/20 4:43 PM, Kevin Wolf wrote: tracked_request_begin() is called for most I/O operations, so it's a good place to assert that we're indeed running in the home thread of the node's AioContext. Is this patch supposed to be always correct or only together with nr. 2? I changed our code to call bdrv_flush_all from the main AIO context and it certainly works just fine (even without this series, so I suppose that would be the 'correct' way to fix it you mention on the cover), though of course it trips this assert without patch 2. Signed-off-by: Kevin Wolf --- block/io.c | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/block/io.c b/block/io.c index 7808e8bdc0..924bf5ba46 100644 --- a/block/io.c +++ b/block/io.c @@ -695,14 +695,17 @@ static void tracked_request_begin(BdrvTrackedRequest *req, uint64_t bytes, enum BdrvTrackedRequestType type) { +Coroutine *self = qemu_coroutine_self(); + assert(bytes <= INT64_MAX && offset <= INT64_MAX - bytes); +assert(bs->aio_context == qemu_coroutine_get_aio_context(self)); *req = (BdrvTrackedRequest){ .bs = bs, .offset = offset, .bytes = bytes, .type = type, -.co = qemu_coroutine_self(), +.co = self, .serialising= false, .overlap_offset = offset, .overlap_bytes = bytes,
Re: [PATCH v3 0/4] Additional parameters for qemu_img map
On 5/13/20 5:13 PM, no-re...@patchew.org wrote: Patchew URL: https://patchew.org/QEMU/20200513133629.18508-1-eyal.moscov...@oracle.com/ Hi, This series failed the docker-mingw@fedora build test. Please find the testing commands and their output below. If you have Docker installed, you can probably reproduce it locally. === TEST SCRIPT BEGIN === #! /bin/bash export ARCH=x86_64 make docker-image-fedora V=1 NETWORK=1 time make docker-test-mingw@fedora J=14 NETWORK=1 === TEST SCRIPT END === CC x86_64-softmmu/ioport.o CC x86_64-softmmu/qtest.o /tmp/qemu-test/src/qemu-img.c: In function 'cvtnum_full': /tmp/qemu-test/src/qemu-img.c:488:63: error: format '%ld' expects argument of type 'long int', but argument 3 has type 'int64_t' {aka 'long long int'} [-Werror=format=] error_report("Invalid %s specified. Must be between %ld bytes " ~~^ %lld "to %ld bytes.", name, min, max); patchew is correct; printing int64_t values requires "%"PRId64 rather than "%ld". I'm fine with touching that up in my queue, unless you'd like to submit v4. -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3226 Virtualization: qemu.org | libvirt.org
Re: [PATCH v4 00/34] block: Introduce real BdrvChildRole
Am 13.05.2020 um 13:05 hat Max Reitz geschrieben: > Based-on: <20200429141126.85159-1-mre...@redhat.com> > (“block: Do not call BlockDriver.bdrv_make_empty() directly”) > > Branch: https://github.com/XanClic/qemu.git child-role-v4 > Branch: https://git.xanclic.moe/XanClic/qemu.git child-role-v4 Thanks, applied to the block branch (and squashed in Eric's English corrections). Kevin
Re: [PATCH v6 05/14] block/amend: refactor qcow2 amend options
On 10.05.20 15:40, Maxim Levitsky wrote: > Some qcow2 create options can't be used for amend. > Remove them from the qcow2 create options and add generic logic to detect > such options in qemu-img > > Signed-off-by: Maxim Levitsky > Reviewed-by: Daniel P. Berrangé > --- > block/qcow2.c | 108 ++--- > qemu-img.c | 18 +++- > tests/qemu-iotests/049.out | 102 ++-- > tests/qemu-iotests/061.out | 12 ++- > tests/qemu-iotests/079.out | 18 ++-- > tests/qemu-iotests/082.out | 149 > tests/qemu-iotests/085.out | 38 > tests/qemu-iotests/087.out | 6 +- > tests/qemu-iotests/115.out | 2 +- > tests/qemu-iotests/121.out | 4 +- > tests/qemu-iotests/125.out | 192 ++--- > tests/qemu-iotests/134.out | 2 +- > tests/qemu-iotests/144.out | 4 +- > tests/qemu-iotests/158.out | 4 +- > tests/qemu-iotests/182.out | 2 +- > tests/qemu-iotests/185.out | 8 +- > tests/qemu-iotests/188.out | 2 +- > tests/qemu-iotests/189.out | 4 +- > tests/qemu-iotests/198.out | 4 +- > tests/qemu-iotests/243.out | 16 ++-- > tests/qemu-iotests/250.out | 2 +- > tests/qemu-iotests/255.out | 8 +- > tests/qemu-iotests/259.out | 2 +- > tests/qemu-iotests/263.out | 4 +- > tests/qemu-iotests/280.out | 2 +- These reference output hunks need some rebasing due to the new compression_type option. > 25 files changed, 284 insertions(+), 429 deletions(-) > > diff --git a/block/qcow2.c b/block/qcow2.c > index fc494c7591..db86500839 100644 > --- a/block/qcow2.c > +++ b/block/qcow2.c [...] > @@ -5552,37 +5506,6 @@ void qcow2_signal_corruption(BlockDriverState *bs, > bool fatal, int64_t offset, > .help = "The external data file must stay valid " \ > "as a raw image"\ > }, \ > -{ \ > -.name = BLOCK_OPT_ENCRYPT, \ > -.type = QEMU_OPT_BOOL, \ > -.help = "Encrypt the image with format 'aes'. (Deprecated " \ > -"in favor of " BLOCK_OPT_ENCRYPT_FORMAT "=aes)",\ > -}, \ > -{ \ > -.name = BLOCK_OPT_ENCRYPT_FORMAT, \ > -.type = QEMU_OPT_STRING,\ > -.help = "Encrypt the image, format choices: 'aes', 'luks'", \ > -}, \ > -BLOCK_CRYPTO_OPT_DEF_KEY_SECRET("encrypt.", \ > -"ID of secret providing qcow AES key or LUKS passphrase"), \ > -BLOCK_CRYPTO_OPT_DEF_LUKS_CIPHER_ALG("encrypt."), \ > -BLOCK_CRYPTO_OPT_DEF_LUKS_CIPHER_MODE("encrypt."), \ > -BLOCK_CRYPTO_OPT_DEF_LUKS_IVGEN_ALG("encrypt."),\ > -BLOCK_CRYPTO_OPT_DEF_LUKS_IVGEN_HASH_ALG("encrypt."), \ > -BLOCK_CRYPTO_OPT_DEF_LUKS_HASH_ALG("encrypt."), \ > -BLOCK_CRYPTO_OPT_DEF_LUKS_ITER_TIME("encrypt."),\ > -{ \ > -.name = BLOCK_OPT_CLUSTER_SIZE, \ > -.type = QEMU_OPT_SIZE, \ > -.help = "qcow2 cluster size", \ > -.def_value_str = stringify(DEFAULT_CLUSTER_SIZE)\ > -}, \ > -{ \ > -.name = BLOCK_OPT_PREALLOC, \ > -.type = QEMU_OPT_STRING,\ > -.help = "Preallocation mode (allowed values: off, " \ > -"metadata, falloc, full)" \ > -}, \ > { \ > .name = BLOCK_OPT_LAZY_REFCOUNTS, \ > .type = QEMU_OPT_BOOL, \ > @@ -5600,6 +5523,37 @@ static QemuOptsList qcow2_create_opts = { > .name = "qcow2-create-opts", > .head = QTAILQ_HEAD_INITIALIZER(qcow2_create_opts.head), > .desc = { > +{ \ > +.name = BLOCK_OPT_ENCRYPT, \ > +.type = QEMU_OPT_BOOL, \ > +.help = "Encrypt the image with format 'aes'. (Deprecated " \ > +"in favor of " BLOCK_OPT_ENCRYP
Re: [PATCH v2 2/3] hw/mips/mips_int: Use qdev gpio rather than qemu_allocate_irqs()
On Tue, 12 May 2020 at 08:48, Philippe Mathieu-Daudé wrote: > > Switch to using the qdev gpio API which is preferred over > qemu_allocate_irqs(). One step to eventually deprecate and > remove qemu_allocate_irqs() one day. > diff --git a/hw/mips/mips_int.c b/hw/mips/mips_int.c > index 796730b11d..91788c51a9 100644 > --- a/hw/mips/mips_int.c > +++ b/hw/mips/mips_int.c > @@ -74,14 +74,12 @@ static void cpu_mips_irq_request(void *opaque, int irq, > int level) > void cpu_mips_irq_init_cpu(MIPSCPU *cpu) > { > CPUMIPSState *env = &cpu->env; > -qemu_irq *qi; > int i; > > -qi = qemu_allocate_irqs(cpu_mips_irq_request, cpu, 8); > +qdev_init_gpio_in(DEVICE(cpu), cpu_mips_irq_request, 8); > for (i = 0; i < 8; i++) { > -env->irq[i] = qi[i]; > +env->irq[i] = qdev_get_gpio_in(DEVICE(cpu), i); > } > -g_free(qi); > } Similar comments as with the openrisc patch apply here: * ideally this code should be in target/mips/, not in hw/mips * board code should call qdev_get_gpio_in() to get the IRQ line, rather than fishing env->irq[] out of the CPU object directly This is a bit more complicated than with openrisc, because there's more than just a single board model, and not all MIPS boards call cpu_mips_irq_init_cpu() so figuring out whether MIPS CPUs should always provide inbound CPU lines, or only those with some particular feature, or something else, would need some investigation or MIPS knowledge. But this is an OK first step anyway, so Reviewed-by: Peter Maydell thanks -- PMM
Re: [RFC PATCH 0/3] block: Synchronous bdrv_*() from coroutine in different AioContext
On 5/12/20 4:43 PM, Kevin Wolf wrote: > Stefan (Reiter), after looking a bit closer at this, I think there is no > bug in QEMU, but the bug is in your coroutine code that calls block > layer functions without moving into the right AioContext first. I've > written this series anyway as it potentially makes the life of callers > easier and would probably make your buggy code correct. > However, it doesn't feel right to commit something like patch 2 without > having a user for it. Is there a reason why you can't upstream your > async snapshot code? I mean I understand what you mean, but it would make the interface IMO so much easier to use, if one wants to explicit schedule it beforehand they can still do. But that would open the way for two styles doing things, not sure if this would seen as bad. The assert about from patch 3/3 would be already really helping a lot, though. Regarding upstreaming, there was some historical attempt to upstream it from Dietmar, but in the time frame of ~ 8 to 10 years ago or so. I'm not quite sure why it didn't went through then, I see if I can get some time searching the mailing list archive. We'd be naturally open and glad to upstream it, what it effectively allow us to do is to not block the VM to much during snapshoting it live. I pushed a tree[0] with mostly just that specific code squashed together (hope I did not break anything), most of the actual code is in commit [1]. It'd be cleaned up a bit and checked for coding style issues, but works good here. Anyway, thanks for your help and pointers! [0]: https://github.com/ThomasLamprecht/qemu/tree/savevm-async [1]: https://github.com/ThomasLamprecht/qemu/commit/ffb9531f370ef0073e4b6f6021f4c47ccd702121
Re: [PATCH v2 3/3] hw/openrisc/pic_cpu: Use qdev gpio rather than qemu_allocate_irqs()
On Tue, 12 May 2020 at 08:48, Philippe Mathieu-Daudé wrote: > > Coverity points out (CID 1421934) that we are leaking the > memory returned by qemu_allocate_irqs(). We can avoid this > leak by switching to using qdev_init_gpio_in(); the base > class finalize will free the irqs that this allocates under > the hood. > hw/openrisc/pic_cpu.c | 5 ++--- > 1 file changed, 2 insertions(+), 3 deletions(-) > > diff --git a/hw/openrisc/pic_cpu.c b/hw/openrisc/pic_cpu.c > index 36f9350830..4b0c92f842 100644 > --- a/hw/openrisc/pic_cpu.c > +++ b/hw/openrisc/pic_cpu.c > @@ -52,10 +52,9 @@ static void openrisc_pic_cpu_handler(void *opaque, int > irq, int level) > void cpu_openrisc_pic_init(OpenRISCCPU *cpu) > { > int i; > -qemu_irq *qi; > -qi = qemu_allocate_irqs(openrisc_pic_cpu_handler, cpu, NR_IRQS); > +qdev_init_gpio_in(DEVICE(cpu), openrisc_pic_cpu_handler, NR_IRQS); > > for (i = 0; i < NR_IRQS; i++) { > -cpu->env.irq[i] = qi[i]; > +cpu->env.irq[i] = qdev_get_gpio_in(DEVICE(cpu), i); > } > } This isn't wrong as such, but it's a bit weird, because it's code outside of a device adding GPIO lines to that device, and the handler function openrisc_pic_cpu_handler() is basically doing nothing but reaching into the internals of the CPU device and changing it. Ideally: * all this code should be in target/openrisc/cpu.c, the same way the arm CPU creates its own inbound IRQs with qdev_init_gpio_in() * cpu->env.irq[] should go away, and hw/openrisc/openrisc_sim.c should be calling qdev_get_gpio_in() to get each IRQ line it needs, rather than directly grabbing cpu->env.irq and then indexing into it But this change is an OK first step and we can build the other cleanup on top of it, so Reviewed-by: Peter Maydell thanks -- PMM
Re: [PATCH 2/2] aio-posix: disable fdmon-io_uring when GSource is used
On Mon, May 11, 2020 at 07:36:30PM +0100, Stefan Hajnoczi wrote: > The glib event loop does not call fdmon_io_uring_wait() so fd handlers > waiting to be submitted build up in the list. There is no benefit is > using io_uring when the glib GSource is being used, so disable it > instead of implementing a more complex fix. > > This fixes a memory leak where AioHandlers would build up and increasing > amounts of CPU time were spent iterating them in aio_pending(). The > symptom is that guests become slow when QEMU is built with io_uring > support. > > Buglink: https://bugs.launchpad.net/qemu/+bug/1877716 > Fixes: 73fd282e7b6dd4e4ea1c3bbb3d302c8db51e4ccf ("aio-posix: add io_uring fd > monitoring implementation") > Signed-off-by: Stefan Hajnoczi > --- > include/block/aio.h | 3 +++ > util/aio-posix.c| 12 > util/aio-win32.c| 4 > util/async.c| 1 + > 4 files changed, 20 insertions(+) > > diff --git a/include/block/aio.h b/include/block/aio.h > index 62ed954344..b2f703fa3f 100644 > --- a/include/block/aio.h > +++ b/include/block/aio.h > @@ -701,6 +701,9 @@ void aio_context_setup(AioContext *ctx); > */ > void aio_context_destroy(AioContext *ctx); > > +/* Used internally, do not call outside AioContext code */ > +void aio_context_use_g_source(AioContext *ctx); > + > /** > * aio_context_set_poll_params: > * @ctx: the aio context > diff --git a/util/aio-posix.c b/util/aio-posix.c > index 8af334ab19..1b2a3af65b 100644 > --- a/util/aio-posix.c > +++ b/util/aio-posix.c > @@ -682,6 +682,18 @@ void aio_context_destroy(AioContext *ctx) > aio_free_deleted_handlers(ctx); > } > > +void aio_context_use_g_source(AioContext *ctx) > +{ > +/* > + * Disable io_uring when the glib main loop is used because it doesn't > + * support mixed glib/aio_poll() usage. It relies on aio_poll() being > + * called regularly so that changes to the monitored file descriptors are > + * submitted, otherwise a list of pending fd handlers builds up. > + */ > +fdmon_io_uring_destroy(ctx); > +aio_free_deleted_handlers(ctx); > +} > + > void aio_context_set_poll_params(AioContext *ctx, int64_t max_ns, > int64_t grow, int64_t shrink, Error **errp) > { > diff --git a/util/aio-win32.c b/util/aio-win32.c > index 729d533faf..953c56ab48 100644 > --- a/util/aio-win32.c > +++ b/util/aio-win32.c > @@ -414,6 +414,10 @@ void aio_context_destroy(AioContext *ctx) > { > } > > +void aio_context_use_g_source(AioContext *ctx) > +{ > +} > + > void aio_context_set_poll_params(AioContext *ctx, int64_t max_ns, > int64_t grow, int64_t shrink, Error **errp) > { > diff --git a/util/async.c b/util/async.c > index 3165a28f2f..1319eee3bc 100644 > --- a/util/async.c > +++ b/util/async.c > @@ -362,6 +362,7 @@ static GSourceFuncs aio_source_funcs = { > > GSource *aio_get_g_source(AioContext *ctx) > { > +aio_context_use_g_source(ctx); > g_source_ref(&ctx->source); > return &ctx->source; > } Tested-by: Oleksandr Natalenko (run Windows 10 VM with storage accessible via io_uring on qemu v5.0.0 with these 2 patches) Thank you. -- Best regards, Oleksandr Natalenko (post-factum) Principal Software Maintenance Engineer
Re: [PATCH 1/2] aio-posix: don't duplicate fd handler deletion in fdmon_io_uring_destroy()
On Mon, May 11, 2020 at 07:36:29PM +0100, Stefan Hajnoczi wrote: > The io_uring file descriptor monitoring implementation has an internal > list of fd handlers that are pending submission to io_uring. > fdmon_io_uring_destroy() deletes all fd handlers on the list. > > Don't delete fd handlers directly in fdmon_io_uring_destroy() for two > reasons: > 1. This duplicates the aio-posix.c AioHandler deletion code and could >become outdated if the struct changes. > 2. Only handlers with the FDMON_IO_URING_REMOVE flag set are safe to >remove. If the flag is not set then something still has a pointer to >the fd handler. Let aio-posix.c and its user worry about that. In >practice this isn't an issue because fdmon_io_uring_destroy() is only >called when shutting down so all users have removed their fd >handlers, but the next patch will need this! > > Signed-off-by: Stefan Hajnoczi > --- > util/aio-posix.c | 1 + > util/fdmon-io_uring.c | 13 ++--- > 2 files changed, 11 insertions(+), 3 deletions(-) > > diff --git a/util/aio-posix.c b/util/aio-posix.c > index c3613d299e..8af334ab19 100644 > --- a/util/aio-posix.c > +++ b/util/aio-posix.c > @@ -679,6 +679,7 @@ void aio_context_destroy(AioContext *ctx) > { > fdmon_io_uring_destroy(ctx); > fdmon_epoll_disable(ctx); > +aio_free_deleted_handlers(ctx); > } > > void aio_context_set_poll_params(AioContext *ctx, int64_t max_ns, > diff --git a/util/fdmon-io_uring.c b/util/fdmon-io_uring.c > index d5a80ed6fb..1d14177df0 100644 > --- a/util/fdmon-io_uring.c > +++ b/util/fdmon-io_uring.c > @@ -342,11 +342,18 @@ void fdmon_io_uring_destroy(AioContext *ctx) > > io_uring_queue_exit(&ctx->fdmon_io_uring); > > -/* No need to submit these anymore, just free them. */ > +/* Move handlers due to be removed onto the deleted list */ > while ((node = QSLIST_FIRST_RCU(&ctx->submit_list))) { > +unsigned flags = atomic_fetch_and(&node->flags, > +~(FDMON_IO_URING_PENDING | > + FDMON_IO_URING_ADD | > + FDMON_IO_URING_REMOVE)); > + > +if (flags & FDMON_IO_URING_REMOVE) { > +QLIST_INSERT_HEAD_RCU(&ctx->deleted_aio_handlers, node, > node_deleted); > +} > + > QSLIST_REMOVE_HEAD_RCU(&ctx->submit_list, node_submitted); > -QLIST_REMOVE(node, node); > -g_free(node); > } > > ctx->fdmon_ops = &fdmon_poll_ops; Tested-by: Oleksandr Natalenko (run Windows 10 VM with storage accessible via io_uring on qemu v5.0.0 with these 2 patches) Thank you. -- Best regards, Oleksandr Natalenko (post-factum) Principal Software Maintenance Engineer
Re: [PATCH v2 0/4] block: Do not call BlockDriver.bdrv_make_empty() directly
Am 29.04.2020 um 16:11 hat Max Reitz geschrieben: > v1: https://lists.nongnu.org/archive/html/qemu-block/2020-04/msg01245.html > > Branch: https://github.com/XanClic/qemu.git fix-bdrv_make_empty-v2 > Branch: https://git.xanclic.moe/XanClic/qemu.git fix-bdrv_make_empty-v2 > > Based-on: <20200428192648.749066-1-ebl...@redhat.com> > (“qcow2: Allow resize of images with internal snapshots”) > > Hi, > > As described in v1’s cover letter (linked above), this series ensures > that all calls to BlockDriver.bdrv_make_empty() go through a wrapper > bdrv_make_empty() function that ensures the caller does have the > necessary permissions. Thanks, fixed up the test output in patch 4 and applied to the block branch. Kevin
Re: [PATCH v2 0/1] qemu-nbd: Close inherited stderr
Patchew URL: https://patchew.org/QEMU/20200514063112.1457125-1-raphael.p...@hetzner.com/ Hi, This series seems to have some coding style problems. See output below for more information: Message-id: 20200514063112.1457125-1-raphael.p...@hetzner.com Subject: [PATCH v2 0/1] qemu-nbd: Close inherited stderr Type: series === TEST SCRIPT BEGIN === #!/bin/bash git rev-parse base > /dev/null || exit 0 git config --local diff.renamelimit 0 git config --local diff.renames True git config --local diff.algorithm histogram ./scripts/checkpatch.pl --mailback base.. === TEST SCRIPT END === Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384 From https://github.com/patchew-project/qemu * [new tag] patchew/20200514122334.25089-1-kra...@redhat.com -> patchew/20200514122334.25089-1-kra...@redhat.com * [new tag] patchew/20200514123729.156283-1-fran...@linux.ibm.com -> patchew/20200514123729.156283-1-fran...@linux.ibm.com Switched to a new branch 'test' 76bb928 qemu-nbd: Close inherited stderr === OUTPUT BEGIN === WARNING: Block comments use a leading /* on a separate line #20: FILE: qemu-nbd.c:919: +/* Remember parents stderr only if the fork option is set. ERROR: suspect code indent for conditional statements (12, 14) #23: FILE: qemu-nbd.c:922: +if (fork_process) { + old_stderr = dup(STDERR_FILENO); ERROR: Missing Signed-off-by: line(s) total: 2 errors, 1 warnings, 14 lines checked Commit 76bb928d8364 (qemu-nbd: Close inherited stderr) has style problems, please review. If any of these errors are false positives report them to the maintainer, see CHECKPATCH in MAINTAINERS. === OUTPUT END === Test command exited with code: 1 The full log is available at http://patchew.org/logs/20200514063112.1457125-1-raphael.p...@hetzner.com/testing.checkpatch/?type=message. --- Email generated automatically by Patchew [https://patchew.org/]. Please send your feedback to patchew-de...@redhat.com
Re: [PATCH v6 04/14] block/amend: separate amend and create options for qemu-img
On 10.05.20 15:40, Maxim Levitsky wrote: > Some options are only useful for creation > (or hard to be amended, like cluster size for qcow2), while some other > options are only useful for amend, like upcoming keyslot management > options for luks > > Since currently only qcow2 supports amend, move all its options > to a common macro and then include it in each action option list. > > In future it might be useful to remove some options which are > not supported anyway from amend list, which currently > cause an error message if amended. > > Signed-off-by: Maxim Levitsky > Reviewed-by: Daniel P. Berrangé > --- > block/qcow2.c | 160 +- > include/block/block_int.h | 4 + > qemu-img.c| 18 ++--- > 3 files changed, 100 insertions(+), 82 deletions(-) > > diff --git a/block/qcow2.c b/block/qcow2.c > index 79fbad9d76..fc494c7591 100644 > --- a/block/qcow2.c > +++ b/block/qcow2.c > @@ -5520,83 +5520,96 @@ void qcow2_signal_corruption(BlockDriverState *bs, > bool fatal, int64_t offset, > s->signaled_corruption = true; > } > > +#define QCOW_COMMON_OPTIONS \ > +{ \ > +.name = BLOCK_OPT_SIZE, \ > +.type = QEMU_OPT_SIZE, \ > +.help = "Virtual disk size" \ > +}, \ > +{ \ > +.name = BLOCK_OPT_COMPAT_LEVEL, \ > +.type = QEMU_OPT_STRING,\ > +.help = "Compatibility level (v2 [0.10] or v3 [1.1])" \ > +}, \ > +{ \ > +.name = BLOCK_OPT_BACKING_FILE, \ > +.type = QEMU_OPT_STRING,\ > +.help = "File name of a base image" \ > +}, \ > +{ \ > +.name = BLOCK_OPT_BACKING_FMT, \ > +.type = QEMU_OPT_STRING,\ > +.help = "Image format of the base image"\ > +}, \ > +{ \ > +.name = BLOCK_OPT_DATA_FILE,\ > +.type = QEMU_OPT_STRING,\ > +.help = "File name of an external data file"\ > +}, \ > +{ \ > +.name = BLOCK_OPT_DATA_FILE_RAW,\ > +.type = QEMU_OPT_BOOL, \ > +.help = "The external data file must stay valid " \ > +"as a raw image"\ > +}, \ > +{ \ > +.name = BLOCK_OPT_ENCRYPT, \ > +.type = QEMU_OPT_BOOL, \ > +.help = "Encrypt the image with format 'aes'. (Deprecated " \ > +"in favor of " BLOCK_OPT_ENCRYPT_FORMAT "=aes)",\ > +}, \ > +{ \ > +.name = BLOCK_OPT_ENCRYPT_FORMAT, \ > +.type = QEMU_OPT_STRING,\ > +.help = "Encrypt the image, format choices: 'aes', 'luks'", \ > +}, \ > +BLOCK_CRYPTO_OPT_DEF_KEY_SECRET("encrypt.", \ > +"ID of secret providing qcow AES key or LUKS passphrase"), \ > +BLOCK_CRYPTO_OPT_DEF_LUKS_CIPHER_ALG("encrypt."), \ > +BLOCK_CRYPTO_OPT_DEF_LUKS_CIPHER_MODE("encrypt."), \ > +BLOCK_CRYPTO_OPT_DEF_LUKS_IVGEN_ALG("encrypt."),\ > +BLOCK_CRYPTO_OPT_DEF_LUKS_IVGEN_HASH_ALG("encrypt."), \ > +BLOCK_CRYPTO_OPT_DEF_LUKS_HASH_ALG("encrypt."), \ > +BLOCK_CRYPTO_OPT_DEF_LUKS_ITER_TIME("encrypt."),\ > +{ \ > +.name = BLOCK_OPT_CLUSTER_SIZE, \ > +.type
Re: [PATCH v6 03/14] block/amend: add 'force' option
On 10.05.20 15:40, Maxim Levitsky wrote: > 'force' option will be used for some unsafe amend operations. > > This includes things like erasing last keyslot in luks based formats > (which destroys the data, unless the master key is backed up > by external means), but that _might_ be desired result. > > Signed-off-by: Maxim Levitsky > Reviewed-by: Daniel P. Berrangé > --- > block.c | 4 +++- > block/qcow2.c | 1 + > docs/tools/qemu-img.rst | 5 - > include/block/block.h | 1 + > include/block/block_int.h | 1 + > qemu-img-cmds.hx | 4 ++-- > qemu-img.c| 8 +++- > 7 files changed, 19 insertions(+), 5 deletions(-) [...] > diff --git a/docs/tools/qemu-img.rst b/docs/tools/qemu-img.rst > index 0080f83a76..fc2dca6649 100644 > --- a/docs/tools/qemu-img.rst > +++ b/docs/tools/qemu-img.rst > @@ -249,11 +249,14 @@ Command description: > > .. program:: qemu-img-commands > > -.. option:: amend [--object OBJECTDEF] [--image-opts] [-p] [-q] [-f FMT] [-t > CACHE] -o OPTIONS FILENAME > +.. option:: amend [--object OBJECTDEF] [--image-opts] [-p] [-q] [-f FMT] [-t > CACHE] [--force] -o OPTIONS FILENAME > >Amends the image format specific *OPTIONS* for the image file >*FILENAME*. Not all file formats support this operation. > > + --force allows some unsafe operations. Currently for -f luks, it allows to > + erase last encryption key, and to overwrite an active encryption key. *erase the last encryption key With that fixed: Reviewed-by: Max Reitz signature.asc Description: OpenPGP digital signature
Re: [PATCH v6 02/14] qcrypto/luks: implement encryption key management
On 10.05.20 15:40, Maxim Levitsky wrote: > Next few patches will expose that functionality to the user. > > Signed-off-by: Maxim Levitsky > Reviewed-by: Daniel P. Berrangé > --- > crypto/block-luks.c | 395 +++- > qapi/crypto.json| 61 ++- > 2 files changed, 452 insertions(+), 4 deletions(-) > > diff --git a/crypto/block-luks.c b/crypto/block-luks.c > index 4861db810c..967d382882 100644 > --- a/crypto/block-luks.c > +++ b/crypto/block-luks.c [...] > @@ -1069,6 +1076,119 @@ qcrypto_block_luks_find_key(QCryptoBlock *block, > return -1; > } > > +/* > + * Returns true if a slot i is marked as active > + * (contains encrypted copy of the master key) > + */ > +static bool > +qcrypto_block_luks_slot_active(const QCryptoBlockLUKS *luks, > + unsigned int slot_idx) > +{ > +uint32_t val = luks->header.key_slots[slot_idx].active; > +return val == QCRYPTO_BLOCK_LUKS_KEY_SLOT_ENABLED; One space too many after the ==. [...] > +/* > + * Erases an keyslot given its index > + * Returns: > + *0 if the keyslot was erased successfully > + * -1 if a error occurred while erasing the keyslot > + * > + */ > +static int > +qcrypto_block_luks_erase_key(QCryptoBlock *block, > + unsigned int slot_idx, > + QCryptoBlockWriteFunc writefunc, > + void *opaque, > + Error **errp) > +{ > +QCryptoBlockLUKS *luks = block->opaque; > +QCryptoBlockLUKSKeySlot *slot = &luks->header.key_slots[slot_idx]; > +g_autofree uint8_t *garbagesplitkey = NULL; > +size_t splitkeylen = luks->header.master_key_len * slot->stripes; This accesses header.key_slots[slot_idx] before... > +size_t i; > +Error *local_err = NULL; > +int ret; > + > +assert(slot_idx < QCRYPTO_BLOCK_LUKS_NUM_KEY_SLOTS); ...we assert that slot_idx is in bounds. I suppose that’s kind of fine, because assertions aren’t meant to fire either, but this basically makes the assertion useless. > +assert(splitkeylen > 0); > +garbagesplitkey = g_new0(uint8_t, splitkeylen); > + > +/* Reset the key slot header */ > +memset(slot->salt, 0, QCRYPTO_BLOCK_LUKS_SALT_LEN); > +slot->iterations = 0; > +slot->active = QCRYPTO_BLOCK_LUKS_KEY_SLOT_DISABLED; > + > +ret = qcrypto_block_luks_store_header(block, writefunc, Superfluous space after the comma. > + opaque, &local_err); > + > +if (ret < 0) { > +error_propagate(errp, local_err); > +} error_propagate() is a no-op with local_err == NULL, so you could do without checking @ret (just calling error_propagate() unconditionally). (But who cares, we need to set @ret anyway, so we might as well check it.) [...] > +static int > +qcrypto_block_luks_amend_add_keyslot(QCryptoBlock *block, > + QCryptoBlockReadFunc readfunc, > + QCryptoBlockWriteFunc writefunc, > + void *opaque, > + QCryptoBlockAmendOptionsLUKS *opts_luks, > + bool force, > + Error **errp) > +{ > +QCryptoBlockLUKS *luks = block->opaque; > +uint64_t iter_time = opts_luks->has_iter_time ? > + opts_luks->iter_time : > + QCRYPTO_BLOCK_LUKS_DEFAULT_ITER_TIME_MS; > +int keyslot; > +g_autofree char *old_password = NULL; > +g_autofree char *new_password = NULL; > +g_autofree uint8_t *master_key = NULL; (I assume we don’t really care about purging secrets from memory after use) [...] > +static int > +qcrypto_block_luks_amend_erase_keyslots(QCryptoBlock *block, > +QCryptoBlockReadFunc readfunc, > +QCryptoBlockWriteFunc writefunc, > +void *opaque, > +QCryptoBlockAmendOptionsLUKS > *opts_luks, > +bool force, > +Error **errp) > +{ [...] > +for (i = 0; i < QCRYPTO_BLOCK_LUKS_NUM_KEY_SLOTS; i++) { > +int rv = qcrypto_block_luks_load_key(block, > + i, > + old_password, > + tmpkey, > + readfunc, > + opaque, > + errp); > +if (rv == -1) { > +return -1; > +} else if (rv == 1) { > +bitmap_set(&slots_to_erase_bitmap, i, 1); We should assert that QCRYPTO_BLOCK_LUKS_NUM_KEY_SLOTS <= sizeof(slots_to_eras
Re: [PATCH v4 4/9] blockdev: Promote several bitmap functions to non-static
13.05.2020 04:16, Eric Blake wrote: -HBitmap **backup, Error **errp) +BdrvDirtyBitmap *block_dirty_bitmap_merge(const char *node, const char *target, + BlockDirtyBitmapMergeSourceList *bitmaps, + HBitmap **backup, Error **errp) { BlockDriverState *bs; s/bitmaps/bms/ to match declaration and fit into 80 characters -- Best regards, Vladimir
Re: [RFC v2] migration: Add migrate-set-bitmap-node-mapping
Am 14.05.2020 um 09:13 hat Max Reitz geschrieben: > On 13.05.20 18:11, Eric Blake wrote: > > On 5/13/20 9:56 AM, Max Reitz wrote: > >> This command allows mapping block node names to aliases for the purpose > >> of block dirty bitmap migration. > >> > >> This way, management tools can use different node names on the source > >> and destination and pass the mapping of how bitmaps are to be > >> transferred to qemu (on the source, the destination, or even both with > >> arbitrary aliases in the migration stream). > >> > >> Suggested-by: Vladimir Sementsov-Ogievskiy > >> Signed-off-by: Max Reitz > >> --- > > > >> @@ -713,6 +731,44 @@ static bool dirty_bitmap_has_postcopy(void *opaque) > >> return true; > >> } > >> +void > >> qmp_migrate_set_bitmap_node_mapping(MigrationBlockNodeMappingList > >> *mapping, > >> + Error **errp) > >> +{ > >> + QDict *in_mapping = qdict_new(); > >> + QDict *out_mapping = qdict_new(); > >> + > >> + for (; mapping; mapping = mapping->next) { > >> + MigrationBlockNodeMapping *entry = mapping->value; > >> + > >> + if (qdict_haskey(out_mapping, entry->node_name)) { > >> + error_setg(errp, "Cannot map node name '%s' twice", > >> + entry->node_name); > >> + goto fail; > >> + } > > > > Can we call this command more than once? Is it cumulative (call it once > > to set mapping for "a", second time to also set mapping for "b"), or > > should it reset (second call wipes out all mappings from first call, any > > mappings that must exist must be passed in the final call)? > > I tried to make it clear in the documentation: > > > +# @mapping: The mapping; must be one-to-one, but not necessarily > > +# complete. Any mapping not given will be reset to the > > +# default (i.e. the identity mapping). > > So everything that isn’t set in the second call is reset. I thought > about what you proposed (because I guess that’s the most intuitive > idea), but after consideration I didn’t see why we’d need different > behavior, so it would only serve to make the code more complicated. Also, if it were cumulative, we would need a separate reset command because you probably don't want to use the same mapping you used for an incoming migration when you later migrate away again to a third host. Kevin signature.asc Description: PGP signature
Re: [RFC v2] migration: Add migrate-set-bitmap-node-mapping
Am 14.05.2020 um 11:09 hat Max Reitz geschrieben: > On 13.05.20 22:09, Vladimir Sementsov-Ogievskiy wrote: > [...] > > 1. So, you decided to make only node-mapping, not bitmap-mapping, so we > > can't rename bitmaps in-flight and can't migrate bitmaps from one node > > to several and visa-versa. I think it's OK, nothing good in such > > possibilities, and this simplifies things. > > On second thought, I wonder whether it would be useful to migrate > bitmaps from multiple nodes to a single one. But probably not, this > would only make sense for filters, really, and in such a case the > bitmaps should probably just be moved prior to migration. > > (And I can’t imagine any other case. When flattening backing chains, > the bitmaps from the dropped layers basically become useless.) I agree, you can always move the bitmaps in a separate step, either on the source before migration or on the destination afterwards. Migration is already complicated enough, let's not move things into it that don't necessarily have to be there. You would get complications like possibly conflicting bitmap names when you migrate the bitmaps of two nodes into a single one. Kevin signature.asc Description: PGP signature
Re: [PATCH 0/5] Introduce 'yank' oob qmp command to recover from hanging qemu
* Lukas Straub (lukasstra...@web.de) wrote: > Terminology: > instance = one (nbd) blockdev/one chardev/the single migrationstate > connection = one TCP connection > > Hello Everyone, > Having read all the comments, here is proposal v2: > Every instance registers itself with a unique name in the form > "blockdev:", "chardev:" and "migration" using > yank_register_instance which will do some sanity checks like checking if the > same name exists already. Then (multiple) yank functions can be registered as > needed with that single name. When the instance exits/is removed, it > unregisters all yank functions and unregisters it's name with > yank_unregister_instance which will check if all yank functions where > unregistered. > Every instance that supports the yank feature will register itself and the > yank functions unconditionally (No extra 'yank' option per instance). > The 'query-yank' oob qmp command lists the names of all registered instances. > The 'yank' oob qmp command takes a list of names and for every name calls all > yank functions registered with that name. Before doing anything, it will > check that all names exist. > > If the instance has multiple connections (say, migration with multifd), i > don't think it makes much sense to just shutdown one connection. Calling > 'yank' on a instance will always shutdown all connections of that instance. Agreed. > Yank functions are generic and in no way limited to connections. Say, if > migration is started to an 'exec:' address, migration could register a yank > function to kill that external command on yank (Won't be implemented yet > though). One thing we need to be care of is that the yank functions stay suitable for OOB calling. Dave > Regards, > Lukas Straub -- Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
Re: [RFC v2] migration: Add migrate-set-bitmap-node-mapping
14.05.2020 12:09, Max Reitz wrote: On 13.05.20 22:09, Vladimir Sementsov-Ogievskiy wrote: [...] 1. So, you decided to make only node-mapping, not bitmap-mapping, so we can't rename bitmaps in-flight and can't migrate bitmaps from one node to several and visa-versa. I think it's OK, nothing good in such possibilities, and this simplifies things. On second thought, I wonder whether it would be useful to migrate bitmaps from multiple nodes to a single one. But probably not, this would only make sense for filters, really, and in such a case the bitmaps should probably just be moved prior to migration. Yes, I think the moving before migration is enough. -- Best regards, Vladimir
Re: [PATCH 0/5] Introduce 'yank' oob qmp command to recover from hanging qemu
Am 13.05.2020 um 21:12 hat Lukas Straub geschrieben: > Terminology: > instance = one (nbd) blockdev/one chardev/the single migrationstate > connection = one TCP connection > > Hello Everyone, > Having read all the comments, here is proposal v2: Looks quite good to me. > Every instance registers itself with a unique name in the form > "blockdev:", "chardev:" and "migration" using > yank_register_instance which will do some sanity checks like checking > if the same name exists already. Then (multiple) yank functions can be > registered as needed with that single name. When the instance exits/is > removed, it unregisters all yank functions and unregisters it's name > with yank_unregister_instance which will check if all yank functions > where unregistered. Feels like nitpicking, but you say that functions are registered with a name that you have previously registered. If you mean literally passing a string, could this lead to callers forgetting to register it first? Maybe yank_register_instance() should return something like a YankInstance object that must be passed to yank_register_function(). Then you would probably also have a list of all functions registered for a single instance so that yank_unregister_instance() could automatically remove all of them instead of requiring the instance to do so. > Every instance that supports the yank feature will register itself and > the yank functions unconditionally (No extra 'yank' option per > instance). > The 'query-yank' oob qmp command lists the names of all registered > instances. > The 'yank' oob qmp command takes a list of names and for every name > calls all yank functions registered with that name. Before doing > anything, it will check that all names exist. > > If the instance has multiple connections (say, migration with > multifd), i don't think it makes much sense to just shutdown one > connection. Calling 'yank' on a instance will always shutdown all > connections of that instance. I think it depends. If shutting down one connection basically kills the functionality of the whole instance, there is no reason not to kill all connections. But if the instance could continue working on the second connection, maybe it shouldn't be killed. Anyway, we can extend things as needed later. I already mentioned elsewhere in this thread that block node-names have a restricted set of allowed character, so having a suffix to distinguish multiple yankable things is possible. I just checked chardev and it also restricts the allowed set of characters, so the same applies. 'migration' is only a fixed string, so it's trivially extensible. So we know a path forward if we ever need to yank individual connections, which is good enough for me. > Yank functions are generic and in no way limited to connections. Say, > if migration is started to an 'exec:' address, migration could > register a yank function to kill that external command on yank (Won't > be implemented yet though). Yes, this makes sense as a potential future use case. Kevin signature.asc Description: PGP signature
Re: [PATCH v10 14/14] iotests: use python logging for iotests.log()
Am 14.05.2020 um 08:24 hat John Snow geschrieben: > On 3/31/20 9:44 AM, Kevin Wolf wrote: > > Am 31.03.2020 um 02:00 hat John Snow geschrieben: > >> We can turn logging on/off globally instead of per-function. > >> > >> Remove use_log from run_job, and use python logging to turn on > >> diffable output when we run through a script entry point. > >> > >> iotest 245 changes output order due to buffering reasons. > >> > >> > >> An extended note on python logging: > >> > >> A NullHandler is added to `qemu.iotests` to stop output from being > >> generated if this code is used as a library without configuring logging. > >> A NullHandler is only needed at the root, so a duplicate handler is not > >> needed for `qemu.iotests.diff_io`. > >> > >> When logging is not configured, messages at the 'WARNING' levels or > >> above are printed with default settings. The NullHandler stops this from > >> occurring, which is considered good hygiene for code used as a library. > >> > >> See https://docs.python.org/3/howto/logging.html#library-config > >> > >> When logging is actually enabled (always at the behest of an explicit > >> call by a client script), a root logger is implicitly created at the > >> root, which allows messages to propagate upwards and be handled/emitted > >> from the root logger with default settings. > >> > >> When we want iotest logging, we attach a handler to the > >> qemu.iotests.diff_io logger and disable propagation to avoid possible > >> double-printing. > >> > >> For more information on python logging infrastructure, I highly > >> recommend downloading the pip package `logging_tree`, which provides > >> convenient visualizations of the hierarchical logging configuration > >> under different circumstances. > >> > >> See https://pypi.org/project/logging_tree/ for more information. > >> > >> Signed-off-by: John Snow > >> Reviewed-by: Max Reitz > > > > Should we enable logger if -d is given? > > > > Previously we had: > > > > $ ./check -d -T -raw 281 > > [...] > > 281 not run: not suitable for this image format: raw > > 281 not run[15:39:03] [15:39:04]not suitable > > for this image format: raw > > Not run: 281 > > > > After this series, the first line of output from notrun() is missing. > > Not that I think it's important to have the line, but as long as we > > bother to call logger.warning(), I thought that maybe we want to be able > > to actually see the effect of it somehwere? > > > > Kevin > > > > Uh, okay. So this is weirder than I thought it was going to be! > > So, if you move the debug configuration up above the _verify calls, > you'll see the message printed out to the debug stream: > > DEBUG:qemu.iotests:iotests debugging messages active > WARNING:qemu.iotests:281 not run: not suitable for this image format: raw > > ...but if you omit the `-d` flag, the message vanishes into a black > hole. Did it always work like that ...? Yes, this is how it used to work. It's a result of ./check only printing the test output with -d, and such log messages are basically just test output. And I think it's exactly what we want: Without -d, you want only the summary, i.e. a single line that says "pass", "fail" or "notrun", potentially with a small note at the end of the line, but that's it. Kevin
Re: [PULL 0/5] Block patches
On Wed, 13 May 2020 at 15:15, Max Reitz wrote: > > The following changes since commit d5c75ec500d96f1d93447f990cd5a4ef5ba27fae: > > Merge remote-tracking branch > 'remotes/stefanberger/tags/pull-tpm-2020-05-08-1' into staging (2020-05-12 > 17:00:10 +0100) > > are available in the Git repository at: > > https://github.com/XanClic/qemu.git tags/pull-block-2020-05-13 > > for you to fetch changes up to fc9aefc8c0d3c6392656ea661ce72c1583b70bbd: > > block/block-copy: fix use-after-free of task pointer (2020-05-13 14:20:31 > +0200) > > > Block patches: > - zstd compression for qcow2 > - Fix use-after-free Applied, thanks. Please update the changelog at https://wiki.qemu.org/ChangeLog/5.1 for any user-visible changes. -- PMM
Re: [RFC v2] migration: Add migrate-set-bitmap-node-mapping
* Max Reitz (mre...@redhat.com) wrote: > On 14.05.20 10:42, Dr. David Alan Gilbert wrote: > > * Max Reitz (mre...@redhat.com) wrote: > > > > > > > >> +void qmp_migrate_set_bitmap_node_mapping(MigrationBlockNodeMappingList > >> *mapping, > >> + Error **errp) > >> +{ > >> +QDict *in_mapping = qdict_new(); > >> +QDict *out_mapping = qdict_new(); > >> + > >> +for (; mapping; mapping = mapping->next) { > >> +MigrationBlockNodeMapping *entry = mapping->value; > >> + > >> +if (qdict_haskey(out_mapping, entry->node_name)) { > >> +error_setg(errp, "Cannot map node name '%s' twice", > >> + entry->node_name); > >> +goto fail; > >> +} > > > > I'm not too clear exactly which case this is protecting against; > > I think that's protecting against mapping > > > > 'src1'->'dst1' and 'src1'->'dst2' > > which is a good check.s (or maybe it's checking against dst2 twice?) > > This one is against mapping src1 twice. Both checks together check that > it’s a one-to-one bijective mapping. > > The technical reason why it needs to be one-to-one is because we base > two QDicts off of it, so the inverse mapping needs to work. > > > What about cases where there is no mapping - e.g. imagine > > that we have b1/b2 on the source and b2/b3 on the dest; now > > if we add just a mapping: > > > > b1->b2 > > > > then we end up with: > > > > b1 -> b2 > > b2 -> b2 (non-mapped) > > b3 > > > > so we have a clash there - are we protected against that? > > Oh, no, we aren’t. That wasn’t intentional. However, I’m not sure how > we’d protect against it. We can’t check it in > qmp_migrate_set_bitmap_node_mapping(), because we don’t know yet which > nodes will exist at the time of migration, and which of those will have > bitmaps. > > So we’d need to check it as part of the migration process (by looking up > any unmapped entries that default to the identity mapping in the > respective reverse mapping, to see whether anything maps to the same name). Yeh a once through check of all the nodes at the start of the migration would probably fix it. > OTOH, Vladimir proposed adding a parameter to > migrate-set-bitmap-node-mapping that would make migration fail if any > bitmaps should be migrated off of unmapped nodes, or if any incoming > alias is unmapped (instead of defaulting to the identity mapping). If > we just make that the only behavior, then we wouldn’t have a problem > with that at all, because all unmapped nodes would always throw an error. Yeh that would force you to put a full mapping table in place. > (And on the third hand, I wonder whether we should actually allow > migrating bitmaps from multiple nodes to a single one, but I suppose > that would require two separate commands, one for incoming and one for > outgoing...) Wouldn't that get very messy? Dave > Max > -- Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
Re: [RFC v2] migration: Add migrate-set-bitmap-node-mapping
On 13.05.20 22:09, Vladimir Sementsov-Ogievskiy wrote: [...] > 1. So, you decided to make only node-mapping, not bitmap-mapping, so we > can't rename bitmaps in-flight and can't migrate bitmaps from one node > to several and visa-versa. I think it's OK, nothing good in such > possibilities, and this simplifies things. On second thought, I wonder whether it would be useful to migrate bitmaps from multiple nodes to a single one. But probably not, this would only make sense for filters, really, and in such a case the bitmaps should probably just be moved prior to migration. (And I can’t imagine any other case. When flattening backing chains, the bitmaps from the dropped layers basically become useless.) Max signature.asc Description: OpenPGP digital signature
Re: [RFC v2] migration: Add migrate-set-bitmap-node-mapping
On 14.05.20 10:42, Dr. David Alan Gilbert wrote: > * Max Reitz (mre...@redhat.com) wrote: > > > >> +void qmp_migrate_set_bitmap_node_mapping(MigrationBlockNodeMappingList >> *mapping, >> + Error **errp) >> +{ >> +QDict *in_mapping = qdict_new(); >> +QDict *out_mapping = qdict_new(); >> + >> +for (; mapping; mapping = mapping->next) { >> +MigrationBlockNodeMapping *entry = mapping->value; >> + >> +if (qdict_haskey(out_mapping, entry->node_name)) { >> +error_setg(errp, "Cannot map node name '%s' twice", >> + entry->node_name); >> +goto fail; >> +} > > I'm not too clear exactly which case this is protecting against; > I think that's protecting against mapping > > 'src1'->'dst1' and 'src1'->'dst2' > which is a good check.s (or maybe it's checking against dst2 twice?) This one is against mapping src1 twice. Both checks together check that it’s a one-to-one bijective mapping. The technical reason why it needs to be one-to-one is because we base two QDicts off of it, so the inverse mapping needs to work. > What about cases where there is no mapping - e.g. imagine > that we have b1/b2 on the source and b2/b3 on the dest; now > if we add just a mapping: > > b1->b2 > > then we end up with: > > b1 -> b2 > b2 -> b2 (non-mapped) > b3 > > so we have a clash there - are we protected against that? Oh, no, we aren’t. That wasn’t intentional. However, I’m not sure how we’d protect against it. We can’t check it in qmp_migrate_set_bitmap_node_mapping(), because we don’t know yet which nodes will exist at the time of migration, and which of those will have bitmaps. So we’d need to check it as part of the migration process (by looking up any unmapped entries that default to the identity mapping in the respective reverse mapping, to see whether anything maps to the same name). OTOH, Vladimir proposed adding a parameter to migrate-set-bitmap-node-mapping that would make migration fail if any bitmaps should be migrated off of unmapped nodes, or if any incoming alias is unmapped (instead of defaulting to the identity mapping). If we just make that the only behavior, then we wouldn’t have a problem with that at all, because all unmapped nodes would always throw an error. (And on the third hand, I wonder whether we should actually allow migrating bitmaps from multiple nodes to a single one, but I suppose that would require two separate commands, one for incoming and one for outgoing...) Max signature.asc Description: OpenPGP digital signature
Re: [RFC v2] migration: Add migrate-set-bitmap-node-mapping
* Max Reitz (mre...@redhat.com) wrote: > +void qmp_migrate_set_bitmap_node_mapping(MigrationBlockNodeMappingList > *mapping, > + Error **errp) > +{ > +QDict *in_mapping = qdict_new(); > +QDict *out_mapping = qdict_new(); > + > +for (; mapping; mapping = mapping->next) { > +MigrationBlockNodeMapping *entry = mapping->value; > + > +if (qdict_haskey(out_mapping, entry->node_name)) { > +error_setg(errp, "Cannot map node name '%s' twice", > + entry->node_name); > +goto fail; > +} I'm not too clear exactly which case this is protecting against; I think that's protecting against mapping 'src1'->'dst1' and 'src1'->'dst2' which is a good check.s (or maybe it's checking against dst2 twice?) What about cases where there is no mapping - e.g. imagine that we have b1/b2 on the source and b2/b3 on the dest; now if we add just a mapping: b1->b2 then we end up with: b1 -> b2 b2 -> b2 (non-mapped) b3 so we have a clash there - are we protected against that? Dave > +if (qdict_haskey(in_mapping, entry->alias)) { > +error_setg(errp, "Cannot use alias '%s' twice", > + entry->alias); > +goto fail; > +} > + > +qdict_put_str(in_mapping, entry->alias, entry->node_name); > +qdict_put_str(out_mapping, entry->node_name, entry->alias); > +} > + > +qobject_unref(dirty_bitmap_mig_state.node_in_mapping); > +qobject_unref(dirty_bitmap_mig_state.node_out_mapping); > + > +dirty_bitmap_mig_state.node_in_mapping = in_mapping; > +dirty_bitmap_mig_state.node_out_mapping = out_mapping; > + > +return; > + > +fail: > +qobject_unref(in_mapping); > +qobject_unref(out_mapping); > +} > + > static SaveVMHandlers savevm_dirty_bitmap_handlers = { > .save_setup = dirty_bitmap_save_setup, > .save_live_complete_postcopy = dirty_bitmap_save_complete, > -- > 2.26.2 > -- Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
Re: [PATCH v6 19/20] hw/block/nvme: do cmb/pmr init as part of pci init
On 5/14/20 6:46 AM, Klaus Jensen wrote: From: Klaus Jensen Having the patch subject duplicated ease review (not all email client display email subject close to email content): "Do cmb/pmr init as part of pci init." Reviewed-by: Philippe Mathieu-Daudé Signed-off-by: Klaus Jensen Reviewed-by: Maxim Levitsky --- hw/block/nvme.c | 12 ++-- 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/hw/block/nvme.c b/hw/block/nvme.c index 7254b66ae199..2addcc86034a 100644 --- a/hw/block/nvme.c +++ b/hw/block/nvme.c @@ -1527,6 +1527,12 @@ static void nvme_init_pci(NvmeCtrl *n, PCIDevice *pci_dev) pci_register_bar(pci_dev, 0, PCI_BASE_ADDRESS_SPACE_MEMORY | PCI_BASE_ADDRESS_MEM_TYPE_64, &n->iomem); msix_init_exclusive_bar(pci_dev, n->params.max_ioqpairs + 1, 4, NULL); + +if (n->params.cmb_size_mb) { +nvme_init_cmb(n, pci_dev); +} else if (n->pmrdev) { +nvme_init_pmr(n, pci_dev); +} } static void nvme_realize(PCIDevice *pci_dev, Error **errp) @@ -1588,12 +1594,6 @@ static void nvme_realize(PCIDevice *pci_dev, Error **errp) n->bar.vs = 0x00010200; n->bar.intmc = n->bar.intms = 0; -if (n->params.cmb_size_mb) { -nvme_init_cmb(n, pci_dev); -} else if (n->pmrdev) { -nvme_init_pmr(n, pci_dev); -} - for (i = 0; i < n->num_namespaces; i++) { nvme_init_namespace(n, &n->namespaces[i], &local_err); if (local_err) {
Re: [PATCH v6 18/20] hw/block/nvme: factor out pmr setup
On 5/14/20 6:46 AM, Klaus Jensen wrote: From: Klaus Jensen Signed-off-by: Klaus Jensen Reviewed-by: Maxim Levitsky --- hw/block/nvme.c | 95 ++--- 1 file changed, 51 insertions(+), 44 deletions(-) diff --git a/hw/block/nvme.c b/hw/block/nvme.c index d71a5f142d51..7254b66ae199 100644 --- a/hw/block/nvme.c +++ b/hw/block/nvme.c @@ -58,6 +58,7 @@ #define NVME_REG_SIZE 0x1000 #define NVME_DB_SIZE 4 #define NVME_CMB_BIR 2 +#define NVME_PMR_BIR 2 #define NVME_GUEST_ERR(trace, fmt, ...) \ do { \ @@ -1463,6 +1464,55 @@ static void nvme_init_cmb(NvmeCtrl *n, PCIDevice *pci_dev) PCI_BASE_ADDRESS_MEM_PREFETCH, &n->ctrl_mem); } +static void nvme_init_pmr(NvmeCtrl *n, PCIDevice *pci_dev) +{ +/* Controller Capabilities register */ +NVME_CAP_SET_PMRS(n->bar.cap, 1); + +/* PMR Capabities register */ +n->bar.pmrcap = 0; +NVME_PMRCAP_SET_RDS(n->bar.pmrcap, 0); +NVME_PMRCAP_SET_WDS(n->bar.pmrcap, 0); +NVME_PMRCAP_SET_BIR(n->bar.pmrcap, NVME_PMR_BIR); +NVME_PMRCAP_SET_PMRTU(n->bar.pmrcap, 0); +/* Turn on bit 1 support */ +NVME_PMRCAP_SET_PMRWBM(n->bar.pmrcap, 0x02); +NVME_PMRCAP_SET_PMRTO(n->bar.pmrcap, 0); +NVME_PMRCAP_SET_CMSS(n->bar.pmrcap, 0); + +/* PMR Control register */ +n->bar.pmrctl = 0; +NVME_PMRCTL_SET_EN(n->bar.pmrctl, 0); + +/* PMR Status register */ +n->bar.pmrsts = 0; +NVME_PMRSTS_SET_ERR(n->bar.pmrsts, 0); +NVME_PMRSTS_SET_NRDY(n->bar.pmrsts, 0); +NVME_PMRSTS_SET_HSTS(n->bar.pmrsts, 0); +NVME_PMRSTS_SET_CBAI(n->bar.pmrsts, 0); + +/* PMR Elasticity Buffer Size register */ +n->bar.pmrebs = 0; +NVME_PMREBS_SET_PMRSZU(n->bar.pmrebs, 0); +NVME_PMREBS_SET_RBB(n->bar.pmrebs, 0); +NVME_PMREBS_SET_PMRWBZ(n->bar.pmrebs, 0); + +/* PMR Sustained Write Throughput register */ +n->bar.pmrswtp = 0; +NVME_PMRSWTP_SET_PMRSWTU(n->bar.pmrswtp, 0); +NVME_PMRSWTP_SET_PMRSWTV(n->bar.pmrswtp, 0); + +/* PMR Memory Space Control register */ +n->bar.pmrmsc = 0; +NVME_PMRMSC_SET_CMSE(n->bar.pmrmsc, 0); +NVME_PMRMSC_SET_CBA(n->bar.pmrmsc, 0); + +pci_register_bar(pci_dev, NVME_PMRCAP_BIR(n->bar.pmrcap), + PCI_BASE_ADDRESS_SPACE_MEMORY | + PCI_BASE_ADDRESS_MEM_TYPE_64 | + PCI_BASE_ADDRESS_MEM_PREFETCH, &n->pmrdev->mr); +} + static void nvme_init_pci(NvmeCtrl *n, PCIDevice *pci_dev) { uint8_t *pci_conf = pci_dev->config; @@ -1541,50 +1591,7 @@ static void nvme_realize(PCIDevice *pci_dev, Error **errp) if (n->params.cmb_size_mb) { nvme_init_cmb(n, pci_dev); } else if (n->pmrdev) { -/* Controller Capabilities register */ -NVME_CAP_SET_PMRS(n->bar.cap, 1); - -/* PMR Capabities register */ -n->bar.pmrcap = 0; -NVME_PMRCAP_SET_RDS(n->bar.pmrcap, 0); -NVME_PMRCAP_SET_WDS(n->bar.pmrcap, 0); -NVME_PMRCAP_SET_BIR(n->bar.pmrcap, 2); -NVME_PMRCAP_SET_PMRTU(n->bar.pmrcap, 0); -/* Turn on bit 1 support */ -NVME_PMRCAP_SET_PMRWBM(n->bar.pmrcap, 0x02); -NVME_PMRCAP_SET_PMRTO(n->bar.pmrcap, 0); -NVME_PMRCAP_SET_CMSS(n->bar.pmrcap, 0); - -/* PMR Control register */ -n->bar.pmrctl = 0; -NVME_PMRCTL_SET_EN(n->bar.pmrctl, 0); - -/* PMR Status register */ -n->bar.pmrsts = 0; -NVME_PMRSTS_SET_ERR(n->bar.pmrsts, 0); -NVME_PMRSTS_SET_NRDY(n->bar.pmrsts, 0); -NVME_PMRSTS_SET_HSTS(n->bar.pmrsts, 0); -NVME_PMRSTS_SET_CBAI(n->bar.pmrsts, 0); - -/* PMR Elasticity Buffer Size register */ -n->bar.pmrebs = 0; -NVME_PMREBS_SET_PMRSZU(n->bar.pmrebs, 0); -NVME_PMREBS_SET_RBB(n->bar.pmrebs, 0); -NVME_PMREBS_SET_PMRWBZ(n->bar.pmrebs, 0); - -/* PMR Sustained Write Throughput register */ -n->bar.pmrswtp = 0; -NVME_PMRSWTP_SET_PMRSWTU(n->bar.pmrswtp, 0); -NVME_PMRSWTP_SET_PMRSWTV(n->bar.pmrswtp, 0); - -/* PMR Memory Space Control register */ -n->bar.pmrmsc = 0; -NVME_PMRMSC_SET_CMSE(n->bar.pmrmsc, 0); -NVME_PMRMSC_SET_CBA(n->bar.pmrmsc, 0); - -pci_register_bar(pci_dev, NVME_PMRCAP_BIR(n->bar.pmrcap), -PCI_BASE_ADDRESS_SPACE_MEMORY | PCI_BASE_ADDRESS_MEM_TYPE_64 | -PCI_BASE_ADDRESS_MEM_PREFETCH, &n->pmrdev->mr); +nvme_init_pmr(n, pci_dev); } for (i = 0; i < n->num_namespaces; i++) { Reviewed-by: Philippe Mathieu-Daudé
Re: [RFC v2] migration: Add migrate-set-bitmap-node-mapping
On 13.05.20 22:09, Vladimir Sementsov-Ogievskiy wrote: > 13.05.2020 17:56, Max Reitz wrote: >> This command allows mapping block node names to aliases for the purpose >> of block dirty bitmap migration. >> >> This way, management tools can use different node names on the source >> and destination and pass the mapping of how bitmaps are to be >> transferred to qemu (on the source, the destination, or even both with >> arbitrary aliases in the migration stream). >> >> Suggested-by: Vladimir Sementsov-Ogievskiy >> Signed-off-by: Max Reitz >> --- >> Branch: https://github.com/XanClic/qemu.git >> migration-bitmap-mapping-rfc-v2 >> Branch: https://git.xanclic.moe/XanClic/qemu.git >> migration-bitmap-mapping-rfc-v2 >> >> (Sorry, v1 was just broken. This one should work better.) >> >> Vladimir has proposed something like this in April: >> https://lists.nongnu.org/archive/html/qemu-block/2020-04/msg00171.html >> >> Now I’ve been asked by my manager to look at this, so I decided to just >> write a patch to see how it’d play out. > > Great! Sometimes I remember about this thing, but never start > implementing :) > >> >> This is an RFC, because I’d like to tack on tests to the final version, >> but I’m not sure whether I can come up with something before the end of >> the week (and I’ll be on PTO for the next two weeks). >> >> Also, I don’t know whether migration/block-dirty-bitmap.c is the best >> place to put qmp_migrate_set_bitmap_mapping(), but it appears we already >> have some QMP handlers in migration/, so I suppose it isn’t too bad. >> --- >> qapi/migration.json | 36 >> migration/block-dirty-bitmap.c | 60 -- >> 2 files changed, 94 insertions(+), 2 deletions(-) >> >> diff --git a/qapi/migration.json b/qapi/migration.json >> index d5000558c6..97037ea635 100644 >> --- a/qapi/migration.json >> +++ b/qapi/migration.json >> @@ -1621,3 +1621,39 @@ >> ## >> { 'event': 'UNPLUG_PRIMARY', >> 'data': { 'device-id': 'str' } } >> + >> +## >> +# @MigrationBlockNodeMapping: >> +# >> +# Maps a block node name to an alias for migration. >> +# >> +# @node-name: A block node name. >> +# >> +# @alias: An alias name for migration (for example the node name on >> +# the opposite site). >> +# >> +# Since: 5.1 >> +## >> +{ 'struct': 'MigrationBlockNodeMapping', >> + 'data': { >> + 'node-name': 'str', >> + 'alias': 'str' >> + } } >> + >> +## >> +# @migrate-set-bitmap-node-mapping: >> +# >> +# Maps block node names to arbitrary aliases for the purpose of dirty >> +# bitmap migration. Such aliases may for example be the corresponding >> +# node names on the opposite site. >> +# >> +# By default, every node name is mapped to itself. >> +# >> +# @mapping: The mapping; must be one-to-one, but not necessarily >> +# complete. Any mapping not given will be reset to the >> +# default (i.e. the identity mapping). >> +# >> +# Since: 5.1 >> +## >> +{ 'command': 'migrate-set-bitmap-node-mapping', >> + 'data': { 'mapping': ['MigrationBlockNodeMapping'] } } > > Hm. I like it, it's simpler and clearer than what I was thinking about. > > 1. So, you decided to make only node-mapping, not bitmap-mapping, so we > can't rename bitmaps in-flight and can't migrate bitmaps from one node > to several and visa-versa. I think it's OK, nothing good in such > possibilities, and this simplifies things. If it turns out that we’d want it, I suppose we can also still always extend MigrationBlockNodeMapping by another mapping array for bitmaps. > 2. If I understand correctly, default to node-name matching doesn't make > real sense for libvirt.. But on the other hand, libvirt should not be > considered as the ony user of Qemu. Still, the default seems unsafe.. > Could we make it optional? Or add an option to disable this default for > absolutely strict behavior? It was my understanding that libvirt (which should know about all bitmaps on all nodes) would and could ensure itself that all nodes are mapped according to what it needs. (But that’s why Peter is CC’d, to get his input.) But your idea seems simple, so why not. > May be, add a parameter > > fallback: node-name | error | drop > > where > node-name: use node-name as an alias, if found bitmap on the node not > mentioned in @mapping [should not be useful for libvirt, but may be for > others] > error: just error-out if such bitmap found [libvirt should use it, I > propose it as a default value for @fallback] You mean error out during migration? Hm. I suppose that’s OK, if some mapping erroneously isn’t set and the node name doesn’t exist in the destination, we’ll error out, too, so... Shouldn’t be too difficult to implement, just put the enum in dirty_bitmap_mig_state, and then do what it says when no entry can be found in the mapping QDict. > drop: just ignore such bitmap - it will be lost [just and idea, I > doubt that it is really useful] > > === > > Also, we shou
Re: [PATCH v2 5/5] vhost: add device started check in migration set log
On 2020/5/13 下午5:47, Dima Stepanov wrote: case CHR_EVENT_CLOSED: /* a close event may happen during a read/write, but vhost * code assumes the vhost_dev remains setup, so delay the * stop & clear to idle. * FIXME: better handle failure in vhost code, remove bh */ if (s->watch) { AioContext *ctx = qemu_get_current_aio_context(); g_source_remove(s->watch); s->watch = 0; qemu_chr_fe_set_handlers(&s->chr, NULL, NULL, NULL, NULL, NULL, NULL, false); aio_bh_schedule_oneshot(ctx, chr_closed_bh, opaque); } break; I think it's time we dropped the FIXME and moved the handling to common code. Jason? Marc-André? I agree. Just to confirm, do you prefer bh or doing changes like what is done in this series? It looks to me bh can have more easier codes. Could it be a good idea just to make disconnect in the char device but postphone clean up in the vhost-user-blk (or any other vhost-user device) itself? So we are moving the postphone logic and decision from the char device to vhost-user device. One of the idea i have is as follows: - Put ourself in the INITIALIZATION state - Start these vhost-user "handshake" commands - If we got a disconnect error, perform disconnect, but don't clean up device (it will be clean up on the roll back). I can be done by checking the state in vhost_user_..._disconnect routine or smth like it Any issue you saw just using the aio bh as Michael posted above. Then we don't need to deal with the silent vhost_dev_stop() and we will have codes that is much more easier to understand. Thank - vhost-user command returns error back to the _start() routine - Rollback in one place in the start() routine, by calling this postphoned clean up for the disconnect
Re: [PATCH v2 4/5] vhost: check vring address before calling unmap
On 2020/5/13 下午5:36, Dima Stepanov wrote: On Wed, May 13, 2020 at 11:00:38AM +0800, Jason Wang wrote: On 2020/5/12 下午5:08, Dima Stepanov wrote: On Tue, May 12, 2020 at 11:26:11AM +0800, Jason Wang wrote: On 2020/5/11 下午5:11, Dima Stepanov wrote: On Mon, May 11, 2020 at 11:05:58AM +0800, Jason Wang wrote: On 2020/4/30 下午9:36, Dima Stepanov wrote: Since disconnect can happen at any time during initialization not all vring buffers (for instance used vring) can be intialized successfully. If the buffer was not initialized then vhost_memory_unmap call will lead to SIGSEGV. Add checks for the vring address value before calling unmap. Also add assert() in the vhost_memory_unmap() routine. Signed-off-by: Dima Stepanov --- hw/virtio/vhost.c | 27 +-- 1 file changed, 21 insertions(+), 6 deletions(-) diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c index ddbdc53..3ee50c4 100644 --- a/hw/virtio/vhost.c +++ b/hw/virtio/vhost.c @@ -314,6 +314,8 @@ static void vhost_memory_unmap(struct vhost_dev *dev, void *buffer, hwaddr len, int is_write, hwaddr access_len) { +assert(buffer); + if (!vhost_dev_has_iommu(dev)) { cpu_physical_memory_unmap(buffer, len, is_write, access_len); } @@ -1132,12 +1134,25 @@ static void vhost_virtqueue_stop(struct vhost_dev *dev, vhost_vq_index); } -vhost_memory_unmap(dev, vq->used, virtio_queue_get_used_size(vdev, idx), - 1, virtio_queue_get_used_size(vdev, idx)); -vhost_memory_unmap(dev, vq->avail, virtio_queue_get_avail_size(vdev, idx), - 0, virtio_queue_get_avail_size(vdev, idx)); -vhost_memory_unmap(dev, vq->desc, virtio_queue_get_desc_size(vdev, idx), - 0, virtio_queue_get_desc_size(vdev, idx)); +/* + * Since the vhost-user disconnect can happen during initialization + * check if vring was initialized, before making unmap. + */ +if (vq->used) { +vhost_memory_unmap(dev, vq->used, + virtio_queue_get_used_size(vdev, idx), + 1, virtio_queue_get_used_size(vdev, idx)); +} +if (vq->avail) { +vhost_memory_unmap(dev, vq->avail, + virtio_queue_get_avail_size(vdev, idx), + 0, virtio_queue_get_avail_size(vdev, idx)); +} +if (vq->desc) { +vhost_memory_unmap(dev, vq->desc, + virtio_queue_get_desc_size(vdev, idx), + 0, virtio_queue_get_desc_size(vdev, idx)); +} Any reason not checking hdev->started instead? vhost_dev_start() will set it to true if virtqueues were correctly mapped. Thanks Well i see it a little bit different: - vhost_dev_start() sets hdev->started to true before starting virtqueues - vhost_virtqueue_start() maps all the memory If we hit the vhost disconnect at the start of the vhost_virtqueue_start(), for instance for this call: r = dev->vhost_ops->vhost_set_vring_base(dev, &state); Then we will call vhost_user_blk_disconnect: vhost_user_blk_disconnect()-> vhost_user_blk_stop()-> vhost_dev_stop()-> vhost_virtqueue_stop() As a result we will come in this routine with the hdev->started still set to true, but if used/avail/desc fields still uninitialized and set to 0. I may miss something, but consider both vhost_dev_start() and vhost_user_blk_disconnect() were serialized in main loop. Can this really happen? Yes, consider the case when we start the vhost-user-blk device: vhost_dev_start-> vhost_virtqueue_start And we got a disconnect in the middle of vhost_virtqueue_start() routine, for instance: 1000 vq->num = state.num = virtio_queue_get_num(vdev, idx); 1001 r = dev->vhost_ops->vhost_set_vring_num(dev, &state); 1002 if (r) { 1003 VHOST_OPS_DEBUG("vhost_set_vring_num failed"); 1004 return -errno; 1005 } --> Here we got a disconnect <-- 1006 1007 state.num = virtio_queue_get_last_avail_idx(vdev, idx); 1008 r = dev->vhost_ops->vhost_set_vring_base(dev, &state); 1009 if (r) { 1010 VHOST_OPS_DEBUG("vhost_set_vring_base failed"); 1011 return -errno; 1012 } As a result call to vhost_set_vring_base will call the disconnect routine. The backtrace log for SIGSEGV is as follows: Thread 4 "qemu-system-x86" received signal SIGSEGV, Segmentation fault. [Switching to Thread 0x72ea9700 (LWP 183150)] 0x74d60840 in ?? () from /lib/x86_64-linux-gnu/libc.so.6 (gdb) bt #0 0x74d60840 in ?? () from /lib/x86_64-linux-gnu/libc.so.6 #1 0x5590fd90 in flatview_write_continue (fv=0x7fffec4a2600, addr=0, attrs=..., ptr=0x0, len=1028, addr1=0, l=1028, mr=0x56b1b310) at ./exec.c:3142 #2 0x5590fe98 in flatview_writ
Re: [RFC v2] migration: Add migrate-set-bitmap-node-mapping
On 13.05.20 18:11, Eric Blake wrote: > On 5/13/20 9:56 AM, Max Reitz wrote: >> This command allows mapping block node names to aliases for the purpose >> of block dirty bitmap migration. >> >> This way, management tools can use different node names on the source >> and destination and pass the mapping of how bitmaps are to be >> transferred to qemu (on the source, the destination, or even both with >> arbitrary aliases in the migration stream). >> >> Suggested-by: Vladimir Sementsov-Ogievskiy >> Signed-off-by: Max Reitz >> --- > >> @@ -713,6 +731,44 @@ static bool dirty_bitmap_has_postcopy(void *opaque) >> return true; >> } >> +void >> qmp_migrate_set_bitmap_node_mapping(MigrationBlockNodeMappingList >> *mapping, >> + Error **errp) >> +{ >> + QDict *in_mapping = qdict_new(); >> + QDict *out_mapping = qdict_new(); >> + >> + for (; mapping; mapping = mapping->next) { >> + MigrationBlockNodeMapping *entry = mapping->value; >> + >> + if (qdict_haskey(out_mapping, entry->node_name)) { >> + error_setg(errp, "Cannot map node name '%s' twice", >> + entry->node_name); >> + goto fail; >> + } > > Can we call this command more than once? Is it cumulative (call it once > to set mapping for "a", second time to also set mapping for "b"), or > should it reset (second call wipes out all mappings from first call, any > mappings that must exist must be passed in the final call)? I tried to make it clear in the documentation: > +# @mapping: The mapping; must be one-to-one, but not necessarily > +# complete. Any mapping not given will be reset to the > +# default (i.e. the identity mapping). So everything that isn’t set in the second call is reset. I thought about what you proposed (because I guess that’s the most intuitive idea), but after consideration I didn’t see why we’d need different behavior, so it would only serve to make the code more complicated. Max > The idea makes sense, and the interface seems usable. It's nice that > either source, destination, or both sides of migration can use it (which > helps in upgrade vs. downgrade scenarios). signature.asc Description: OpenPGP digital signature