date:20200514

Re: [PATCH v2 0/1] qemu-nbd: Close inherited stderr

2020-05-14 Thread Raphael Pour

On 5/14/20 4:55 PM, Eric Blake wrote:
> On 5/14/20 9:29 AM, Eric Blake wrote:
> 
>>> WARNING: Block comments use a leading /* on a separate line
>>> #20: FILE: qemu-nbd.c:919:
>>> +    /* Remember parents stderr only if the fork option is set.
>>>
> 
>> The comment could use some grammar help (s/parents/parent's/), and in
>> truth, I don't think it adds much beyond what the code itself is
>> already doing, so rather than adding another line to silence patchew,
>> you could instead just eliminate the comment and life would still be
>> fine.  Or if you want a one-line comment, I might suggest:
>>
>> /* Remember parent's stderr if we will restoring it. */
> 
> It helps if I don't hit 'send' too early.
> 
> /* Remember parent's stderr if we will be restoring it. */
> 

From e5749541494abcdcaa37d752172741e1bc38e984 Mon Sep 17 00:00:00 2001
From: Raphael Pour 
Date: Fri, 15 May 2020 08:30:50 +0200
Subject: [PATCH] qemu-nbd: Close inherited stderr

Close inherited stderr of the parent if fork_process is false.
Otherwise no one will close it. (introduced by e6df58a5)

Signed-off-by: Raphael Pour 
---
 qemu-nbd.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/qemu-nbd.c b/qemu-nbd.c
index 4aa005004e..306e44fb0a 100644
--- a/qemu-nbd.c
+++ b/qemu-nbd.c
@@ -916,7 +916,11 @@ int main(int argc, char **argv)
 } else if (pid == 0) {
 close(stderr_fd[0]);

-old_stderr = dup(STDERR_FILENO);
+/* Remember parent's stderr if we will be restoring it. */
+if (fork_process) {
+old_stderr = dup(STDERR_FILENO);
+}
+
 ret = qemu_daemon(1, 0);

 /* Temporarily redirect stderr to the parent's pipe...  */
-- 
2.25.4


I fixed the issues and checkpatch gave me a 'ready for submission'.
Is this the right way to submit a fixed patch or should it be a v3?


-- 
Hetzner Online GmbH
Am Datacenter-Park 1
08223 Falkenstein/Vogtland
raphael.p...@hetzner.com
www.hetzner.com

Registergericht Ansbach, HRB 6089
Geschäftsführer: Martin Hetzner, Stephan Konvickova, Günther Müller



signature.asc
Description: OpenPGP digital signature

Re: [PATCH v6 04/14] block/amend: separate amend and create options for qemu-img

2020-05-14 Thread Max Reitz

On 14.05.20 18:10, Eric Blake wrote:
> On 5/14/20 7:28 AM, Max Reitz wrote:
>> On 10.05.20 15:40, Maxim Levitsky wrote:
>>> Some options are only useful for creation
>>> (or hard to be amended, like cluster size for qcow2), while some other
>>> options are only useful for amend, like upcoming keyslot management
>>> options for luks
>>>
> 
>>>   +#define QCOW_COMMON_OPTIONS \
>>> +    {   \
> 
>>> +    .help = "Width of a reference count entry in bits", \
>>> +    .def_value_str = "16"   \
>>> +    }   \
>>
>> I think the last line should have a comma in it (otherwise the final
>> backslash doesn’t make much sense, because whenever we’d add a new
>> option, we would need to modify the line anyway to insert a comma).
> 
> Except that...
> 
>>
>> Speaking of adding option, this requires a rebase due to the
>> compression_type option added (not trivial in the strict sense, but
>> still straightforward to handle).
>>
>>> +
>>>   static QemuOptsList qcow2_create_opts = {
>>>   .name = "qcow2-create-opts",
>>>   .head = QTAILQ_HEAD_INITIALIZER(qcow2_create_opts.head),
>>>   .desc = {
>>>
>>
>> [...]
>>
>>> +    QCOW_COMMON_OPTIONS,
>>> +    { /* end of list */ }
> 
> ...the intended usage is to use the macro name followed by a comma, so
> including a trailing comma in the macro itself would lead to a syntax
> error.

But why is that the indended usage?  Is there something in our coding
style that forbids macros that don’t allow a separator to be placed
after them?

Max



signature.asc
Description: OpenPGP digital signature

Re: [PATCH v3 0/3] various: Remove unnecessary casts

2020-05-14 Thread Markus Armbruster

Philippe Mathieu-Daudé  writes:

> Remove unnecessary casts using coccinelle scripts.
>
> The CPU()/OBJECT() patches don't introduce logical change,
> The DEVICE() one removes various OBJECT_CHECK() calls.

Queued, thanks!

Managing expecations: I'm not a QOM maintainer, I don't want to become
one, and I don't normally queue QOM patches :)

Re: [PATCH v2 0/5] fix migration with bitmaps and mirror

2020-05-14 Thread Vladimir Sementsov-Ogievskiy


14.05.2020 21:29, Eric Blake wrote:

reviving this thread...

On 4/3/20 6:23 AM, Peter Krempa wrote:

On Fri, Apr 03, 2020 at 14:02:47 +0300, Vladimir Sementsov-Ogievskiy wrote:

19.12.2019 13:36, Peter Krempa wrote:

On Thu, Dec 19, 2019 at 11:51:01 +0300, Vladimir Sementsov-Ogievskiy wrote:

Hi all!

It's a continuation for
"bitmap migration bug with -drive while block mirror runs"
<315cff78-dcdb-a3ce-2742-da3cc9f0c...@redhat.com>
https://lists.gnu.org/archive/html/qemu-devel/2019-09/msg07241.html

The problem is that bitmaps migrated to node with same node-name or
blk-parent name. And currently only the latter actually work in libvirt.
And with mirror-top filter it doesn't work, because
bdrv_get_device_or_node_name don't go through filters.


I want to point out that since libvirt-5.10 we use -blockdev to
configure the backend of storage devices with qemu-4.2 and later. This
means unfortunately that the BlockBackend of the drive does not have a
name any more and thus the above will not work even if you make the
lookup code to see through filters.


Not that this series doesn't make things worse, as it loops through named
block backends when trying to use their name for migration. So with these
patches applied, qemu will just work in more possible scenarios.


Okay, if that's so it's fair enough in this case.

I'm just very firmly against baking in the assumption that
node names mean the same thing accross migration, because that will
create a precedent situation and more stuff may be baked in on top of
this in the future. It seems that it has already happened though and
it's wrong. And the worst part is that it's never mentioned that this
might occur. But again, don't do that and preferrably remove the
matching of node names for bitmaps altogether until we can control it
arbitrarily.

We've also seen this already before with the backend name of memory
devices being baked in to the migration stream which creates an unwanted
dependancy.


Max is trying to tackle the node-name issue:
https://lists.gnu.org/archive/html/qemu-devel/2020-05/msg03358.html

And trying to apply that patch after staging this series hits a conflict in 
mnigration/block-dirty-bitmap.c.  Which one should go in first?



My patches are needed to fix migration for the pre-blockdev configuration with 
mirror-top filter.

We ofcrouse need them in Virtuozzo, but it's not hard to keep the in 
downstream-only.. And it will be not simple to use new command from Max in 
pre-blockdev libvirt configuration, with auto-generated node-names.

How much we care about pre-blockdev libvirt now in upstream Qemu?

If we don't care, than these series are only for downstreams, and we don't need 
to apply them upstream..

On the other hand, Max have to resend anyway, to handle old code, which uses 
device name instead of node-name. And if we don't want to drop now the code 
which can use device name (needed for old libvirt), why not to apply the 
series, which just make old code better?



In other words: do we still support pre-blockdev libvirt (and any other 
pre-blockdev users)?

If we support, than, as I said somewhere, I need to resend these series as I 
have updated version in our downstream. And I think, I can rebase Max's patch 
by myself and send together with this all, if no objections.

--
Best regards,
Vladimir

Re: [PATCH v2] bitmaps: Update maintainer

2020-05-14 Thread Vladimir Sementsov-Ogievskiy


14.05.2020 21:00, Eric Blake wrote:

Dirty bitmaps are important to incremental backups, including exposure
over NBD where I'm already maintainer.  Also, I'm aware that lately I
have been doing as much code/review on bitmaps as John Snow who is
trying to scale back in order to focus elsewhere; and many of the
recent patches have come from Vladimir, who is also interested in
taking on maintainer duties, but would like to start with
co-maintainership.  Therefore, it's time to revamp the ownership of
this category, as agreed between the three of us.

Signed-off-by: Eric Blake 
---

v2: further tweak to maintainership, update T: listing

  MAINTAINERS | 7 ---
  1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index d11f3cb97613..ae23062a51ac 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2001,8 +2001,9 @@ F: qapi/transaction.json
  T: git https://repo.or.cz/qemu/armbru.git block-next

  Dirty Bitmaps
-M: John Snow 
-R: Vladimir Sementsov-Ogievskiy 
+M: Eric Blake 
+M: Vladimir Sementsov-Ogievskiy 
+R: John Snow 
  L: qemu-block@nongnu.org
  S: Supported
  F: include/qemu/hbitmap.h
@@ -2013,7 +2014,7 @@ F: migration/block-dirty-bitmap.c
  F: util/hbitmap.c
  F: tests/test-hbitmap.c
  F: docs/interop/bitmaps.rst
-T: git https://github.com/jnsnow/qemu.git bitmaps
+T: git https://repo.or.cz/qemu/ericb.git bitmaps

  Character device backends
  M: Marc-André Lureau 



Reviewed-by: Vladimir Sementsov-Ogievskiy 

--
Best regards,
Vladimir

Re: [PATCH v3 2/3] various: Remove unnecessary OBJECT() cast

2020-05-14 Thread Markus Armbruster

Philippe Mathieu-Daudé  writes:

> The OBJECT() macro is defined as:
>
>   #define OBJECT(obj) ((Object *)(obj))
>
> Remove the unnecessary OBJECT() casts when we already know the
> pointer is of Object type.
>
> Patch created mechanically using spatch with this script:
>
>   @@
>   typedef Object;
>   Object *o;
>   @@
>   -   OBJECT(o)
>   +   o
>
> Acked-by: Cornelia Huck 
> Acked-by: Corey Minyard 
> Acked-by: John Snow 
> Reviewed-by: Richard Henderson 
> Signed-off-by: Philippe Mathieu-Daudé 

Reviewed-by: Markus Armbruster

Re: [PATCH v3 1/1] qemu_img: add cvtnum_full to print error reports

2020-05-14 Thread Eric Blake


On 5/13/20 8:36 AM, Eyal Moscovici wrote:

All calls to cvtnum check the return value and print the same error message more
or less. And so error reporting moved to cvtnum_full to reduce code
duplication and provide a single error message. Additionally, cvtnum now wraps
cvtnum_full with the existing default range of 0 to MAX_INT64.

Acked-by: Mark Kanda 
Signed-off-by: Eyal Moscovici 
---



-static int64_t cvtnum(const char *s)
+static int64_t cvtnum_full(const char *name, const char *value, int64_t min,
+   int64_t max)
  {
  int err;
-uint64_t value;
-
-err = qemu_strtosz(s, NULL, &value);
-if (err < 0) {
+uint64_t res;
+
+err = qemu_strtosz(value, NULL, &res);
+if (err < 0 && err != -ERANGE) {
+error_report("Invalid %s specified. You may use "
+ "k, M, G, T, P or E suffixes for ", name);
+error_report("kilobytes, megabytes, gigabytes, terabytes, "
+ "petabytes and exabytes.");


Consecutive error_report() calls each output a newline, which means your 
new output includes a trailing space.



@@ -572,16 +584,8 @@ static int img_create(int argc, char **argv)
  if (optind < argc) {
  int64_t sval;
  
-sval = cvtnum(argv[optind++]);

+sval = cvtnum("image size", argv[optind++]);
  if (sval < 0) {
-if (sval == -ERANGE) {
-error_report("Image size must be less than 8 EiB!");
-} else {
-error_report("Invalid image size specified! You may use k, M, "
-  "G, T, P or E suffixes for ");
-error_report("kilobytes, megabytes, gigabytes, terabytes, "
- "petabytes and exabytes.");
-}


True, that's what some of the old code was doing, but...


+++ b/tests/qemu-iotests/049.out


  
  qemu-img create -f qcow2 TEST_DIR/t.qcow2 -- 1kilobyte

-qemu-img: Invalid image size specified! You may use k, M, G, T, P or E 
suffixes for
+qemu-img: Invalid image size specified. You may use k, M, G, T, P or E 
suffixes for


where it gets hairy is that our iotests _intentionally_ strip trailing 
space before comparing to expected output, because it is such a pain to 
commit files with trailing spaces into the repository.  We're better off 
making the expected output precisely match what qemu-img actually 
outputs, which means using this as an opportunity to fix qemu-img to not 
output trailing space in the first place.


--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org

Re: [PATCH] hw/ide/ahci: Log lost IRQs

2020-05-14 Thread John Snow




On 5/4/20 5:48 AM, Philippe Mathieu-Daudé wrote:
> One might find interesting to look at AHCI IRQs.
> 
> Signed-off-by: Philippe Mathieu-Daudé 
> ---
>  hw/ide/ahci.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/hw/ide/ahci.c b/hw/ide/ahci.c
> index 13d91e109a..fc82cbd5f1 100644
> --- a/hw/ide/ahci.c
> +++ b/hw/ide/ahci.c
> @@ -1509,6 +1509,7 @@ static void ahci_cmd_done(IDEDMA *dma)
>  
>  static void ahci_irq_set(void *opaque, int n, int level)
>  {
> +qemu_log_mask(LOG_UNIMP, "ahci: IRQ#%d level:%d\n", n, level);
>  }
>  
>  static const IDEDMAOps ahci_dma_ops = {
> 

Reviewed-by: John Snow 

Sorry, just drowning in backlog. Thanks for the ping on IRC.

Acked-by: John Snow 

^ Feel free to take through trivial tree.

--js

Re: [PATCH] hw/ide: Make IDEDMAOps handlers take a const IDEDMA pointer

2020-05-14 Thread John Snow




On 5/12/20 3:49 PM, Philippe Mathieu-Daudé wrote:
> Handlers don't need to modify the IDEDMA structure.
> Make it const.
> 
> Signed-off-by: Philippe Mathieu-Daudé 

I'll trust your judgment. As long as it still compiles and passes
qtests, I'm happy if you're happy.

Acked-by: John Snow 

> ---
>  include/hw/ide/internal.h | 12 ++--
>  hw/ide/ahci.c | 18 +-
>  hw/ide/core.c |  6 +++---
>  hw/ide/macio.c|  6 +++---
>  hw/ide/pci.c  | 12 ++--
>  5 files changed, 27 insertions(+), 27 deletions(-)
> 
> diff --git a/include/hw/ide/internal.h b/include/hw/ide/internal.h
> index 55da35d768..1a7869e85d 100644
> --- a/include/hw/ide/internal.h
> +++ b/include/hw/ide/internal.h
> @@ -322,12 +322,12 @@ typedef enum { IDE_HD, IDE_CD, IDE_CFATA } IDEDriveKind;
>  
>  typedef void EndTransferFunc(IDEState *);
>  
> -typedef void DMAStartFunc(IDEDMA *, IDEState *, BlockCompletionFunc *);
> -typedef void DMAVoidFunc(IDEDMA *);
> -typedef int DMAIntFunc(IDEDMA *, bool);
> -typedef int32_t DMAInt32Func(IDEDMA *, int32_t len);
> -typedef void DMAu32Func(IDEDMA *, uint32_t);
> -typedef void DMAStopFunc(IDEDMA *, bool);
> +typedef void DMAStartFunc(const IDEDMA *, IDEState *, BlockCompletionFunc *);
> +typedef void DMAVoidFunc(const IDEDMA *);
> +typedef int DMAIntFunc(const IDEDMA *, bool);
> +typedef int32_t DMAInt32Func(const IDEDMA *, int32_t len);
> +typedef void DMAu32Func(const IDEDMA *, uint32_t);
> +typedef void DMAStopFunc(const IDEDMA *, bool);
>  
>  struct unreported_events {
>  bool eject_request;
> diff --git a/hw/ide/ahci.c b/hw/ide/ahci.c
> index 13d91e109a..168d34e9f2 100644
> --- a/hw/ide/ahci.c
> +++ b/hw/ide/ahci.c
> @@ -44,7 +44,7 @@ static int handle_cmd(AHCIState *s, int port, uint8_t slot);
>  static void ahci_reset_port(AHCIState *s, int port);
>  static bool ahci_write_fis_d2h(AHCIDevice *ad);
>  static void ahci_init_d2h(AHCIDevice *ad);
> -static int ahci_dma_prepare_buf(IDEDMA *dma, int32_t limit);
> +static int ahci_dma_prepare_buf(const IDEDMA *dma, int32_t limit);
>  static bool ahci_map_clb_address(AHCIDevice *ad);
>  static bool ahci_map_fis_address(AHCIDevice *ad);
>  static void ahci_unmap_clb_address(AHCIDevice *ad);
> @@ -1338,7 +1338,7 @@ out:
>  }
>  
>  /* Transfer PIO data between RAM and device */
> -static void ahci_pio_transfer(IDEDMA *dma)
> +static void ahci_pio_transfer(const IDEDMA *dma)
>  {
>  AHCIDevice *ad = DO_UPCAST(AHCIDevice, dma, dma);
>  IDEState *s = &ad->port.ifs[0];
> @@ -1397,7 +1397,7 @@ out:
>  }
>  }
>  
> -static void ahci_start_dma(IDEDMA *dma, IDEState *s,
> +static void ahci_start_dma(const IDEDMA *dma, IDEState *s,
> BlockCompletionFunc *dma_cb)
>  {
>  AHCIDevice *ad = DO_UPCAST(AHCIDevice, dma, dma);
> @@ -1406,7 +1406,7 @@ static void ahci_start_dma(IDEDMA *dma, IDEState *s,
>  dma_cb(s, 0);
>  }
>  
> -static void ahci_restart_dma(IDEDMA *dma)
> +static void ahci_restart_dma(const IDEDMA *dma)
>  {
>  /* Nothing to do, ahci_start_dma already resets s->io_buffer_offset.  */
>  }
> @@ -1415,7 +1415,7 @@ static void ahci_restart_dma(IDEDMA *dma)
>   * IDE/PIO restarts are handled by the core layer, but NCQ commands
>   * need an extra kick from the AHCI HBA.
>   */
> -static void ahci_restart(IDEDMA *dma)
> +static void ahci_restart(const IDEDMA *dma)
>  {
>  AHCIDevice *ad = DO_UPCAST(AHCIDevice, dma, dma);
>  int i;
> @@ -1432,7 +1432,7 @@ static void ahci_restart(IDEDMA *dma)
>   * Called in DMA and PIO R/W chains to read the PRDT.
>   * Not shared with NCQ pathways.
>   */
> -static int32_t ahci_dma_prepare_buf(IDEDMA *dma, int32_t limit)
> +static int32_t ahci_dma_prepare_buf(const IDEDMA *dma, int32_t limit)
>  {
>  AHCIDevice *ad = DO_UPCAST(AHCIDevice, dma, dma);
>  IDEState *s = &ad->port.ifs[0];
> @@ -1453,7 +1453,7 @@ static int32_t ahci_dma_prepare_buf(IDEDMA *dma, 
> int32_t limit)
>   * Called via dma_buf_commit, for both DMA and PIO paths.
>   * sglist destruction is handled within dma_buf_commit.
>   */
> -static void ahci_commit_buf(IDEDMA *dma, uint32_t tx_bytes)
> +static void ahci_commit_buf(const IDEDMA *dma, uint32_t tx_bytes)
>  {
>  AHCIDevice *ad = DO_UPCAST(AHCIDevice, dma, dma);
>  
> @@ -1461,7 +1461,7 @@ static void ahci_commit_buf(IDEDMA *dma, uint32_t 
> tx_bytes)
>  ad->cur_cmd->status = cpu_to_le32(tx_bytes);
>  }
>  
> -static int ahci_dma_rw_buf(IDEDMA *dma, bool is_write)
> +static int ahci_dma_rw_buf(const IDEDMA *dma, bool is_write)
>  {
>  AHCIDevice *ad = DO_UPCAST(AHCIDevice, dma, dma);
>  IDEState *s = &ad->port.ifs[0];
> @@ -1486,7 +1486,7 @@ static int ahci_dma_rw_buf(IDEDMA *dma, bool is_write)
>  return 1;
>  }
>  
> -static void ahci_cmd_done(IDEDMA *dma)
> +static void ahci_cmd_done(const IDEDMA *dma)
>  {
>  AHCIDevice *ad = DO_UPCAST(AHCIDevice, dma, dma);
>  
> diff --git a/hw/ide/core.c b/hw/ide/core.c
> index 689bb36409..d997a7

[PATCH 3/3] iotests: Categorize NOTRUN messages as INFO, not WARNING

2020-05-14 Thread John Snow

It's not really a warning; we don't want to see it if we're not running
in at least a verbose mode. We can see it on the test summary just fine.

Signed-off-by: John Snow 
---
 tests/qemu-iotests/iotests.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
index 9231767acf..8e479e1c6f 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -986,7 +986,7 @@ def notrun(reason):
 seq = os.path.basename(sys.argv[0])
 
 open('%s/%s.notrun' % (output_dir, seq), 'w').write(reason + '\n')
-logger.warning("%s not run: %s", seq, reason)
+logger.info("%s not run: %s", seq, reason)
 sys.exit(0)
 
 def case_notrun(reason):
-- 
2.21.1

[PATCH 2/3] iotests: log to stderr instead of stdout

2020-05-14 Thread John Snow

Separate the streams; stdout is for test diff output, stderr is for
control messages and things for the human to look at.

(Cough, unfortunately, I didn't realize that ./check actually just
always redirects both, so even on STDERR, you can't see warnings.
Oh well...)

Signed-off-by: John Snow 
---
 tests/qemu-iotests/iotests.py | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
index 1caa7812de..9231767acf 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -1171,7 +1171,11 @@ def execute_setup_common(supported_fmts: Sequence[str] = 
(),
 debug = '-d' in sys.argv
 if debug:
 sys.argv.remove('-d')
-logging.basicConfig(level=(logging.DEBUG if debug else logging.WARN))
+
+logging.basicConfig(
+level=logging.DEBUG if debug else logging.WARN,
+stream=sys.stderr,
+)
 
 _verify_image_format(supported_fmts, unsupported_fmts)
 _verify_protocol(supported_protocols, unsupported_protocols)
-- 
2.21.1

[PATCH 0/3] iotests: enable logging prior to notrun() invocation

2020-05-14 Thread John Snow

Hi, you can take just patch 1. patches 2-3 admittedly don't do a whole
heck of a lot, because I didn't realize that ./check discards *all*
output from either stdout or stderr.

The changes are tiny, though, and maybe still worth doing in the long
run? Hm. They are archived on the list now, anyway.

--js

John Snow (3):
  iotests: log messages from notrun()
  iotests: log to stderr instead of stdout
  iotests: Categorize NOTRUN messages as INFO, not WARNING

 tests/qemu-iotests/iotests.py | 17 ++---
 1 file changed, 10 insertions(+), 7 deletions(-)

-- 
2.21.1

[PATCH 1/3] iotests: log messages from notrun()

2020-05-14 Thread John Snow

Shift the logging initialization up to occur prior to validation checks,
so that notrun() messages still get printed to console.

(Also, remove the "debugging messages active" message, because we don't
need to see that hundreds of times per iotest suite run.)

Signed-off-by: John Snow 
---
 tests/qemu-iotests/iotests.py | 11 +--
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
index 6c0e781af7..1caa7812de 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -1168,18 +1168,17 @@ def execute_setup_common(supported_fmts: Sequence[str] 
= (),
 sys.stderr.write('Please run this test via the "check" script\n')
 sys.exit(os.EX_USAGE)
 
+debug = '-d' in sys.argv
+if debug:
+sys.argv.remove('-d')
+logging.basicConfig(level=(logging.DEBUG if debug else logging.WARN))
+
 _verify_image_format(supported_fmts, unsupported_fmts)
 _verify_protocol(supported_protocols, unsupported_protocols)
 _verify_platform(supported=supported_platforms)
 _verify_cache_mode(supported_cache_modes)
 _verify_aio_mode(supported_aio_modes)
 
-debug = '-d' in sys.argv
-if debug:
-sys.argv.remove('-d')
-logging.basicConfig(level=(logging.DEBUG if debug else logging.WARN))
-logger.debug("iotests debugging messages active")
-
 return debug
 
 def execute_test(*args, test_function=None, **kwargs):
-- 
2.21.1

Re: [PATCH v10 14/14] iotests: use python logging for iotests.log()

2020-05-14 Thread John Snow

On 5/14/20 6:06 AM, Kevin Wolf wrote:
> Am 14.05.2020 um 08:24 hat John Snow geschrieben:
>> On 3/31/20 9:44 AM, Kevin Wolf wrote:
>>> Am 31.03.2020 um 02:00 hat John Snow geschrieben:
 We can turn logging on/off globally instead of per-function.

 Remove use_log from run_job, and use python logging to turn on
 diffable output when we run through a script entry point.

 iotest 245 changes output order due to buffering reasons.

 An extended note on python logging:

 A NullHandler is added to `qemu.iotests` to stop output from being
 generated if this code is used as a library without configuring logging.
 A NullHandler is only needed at the root, so a duplicate handler is not
 needed for `qemu.iotests.diff_io`.

 When logging is not configured, messages at the 'WARNING' levels or
 above are printed with default settings. The NullHandler stops this from
 occurring, which is considered good hygiene for code used as a library.

 See https://docs.python.org/3/howto/logging.html#library-config

 When logging is actually enabled (always at the behest of an explicit
 call by a client script), a root logger is implicitly created at the
 root, which allows messages to propagate upwards and be handled/emitted
 from the root logger with default settings.

 When we want iotest logging, we attach a handler to the
 qemu.iotests.diff_io logger and disable propagation to avoid possible
 double-printing.

 For more information on python logging infrastructure, I highly
 recommend downloading the pip package `logging_tree`, which provides
 convenient visualizations of the hierarchical logging configuration
 under different circumstances.

 See https://pypi.org/project/logging_tree/ for more information.

 Signed-off-by: John Snow 
 Reviewed-by: Max Reitz 
>>>
>>> Should we enable logger if -d is given?
>>>
>>> Previously we had:
>>>
>>> $ ./check -d -T -raw 281
>>> [...]
>>> 281 not run: not suitable for this image format: raw
>>> 281  not run[15:39:03] [15:39:04]not suitable 
>>> for this image format: raw
>>> Not run: 281
>>>
>>> After this series, the first line of output from notrun() is missing.
>>> Not that I think it's important to have the line, but as long as we
>>> bother to call logger.warning(), I thought that maybe we want to be able
>>> to actually see the effect of it somehwere?
>>>
>>> Kevin
>>>
>>
>> Uh, okay. So this is weirder than I thought it was going to be!
>>
>> So, if you move the debug configuration up above the _verify calls,
>> you'll see the message printed out to the debug stream:
>>
>> DEBUG:qemu.iotests:iotests debugging messages active
>> WARNING:qemu.iotests:281 not run: not suitable for this image format: raw
>>
>> ...but if you omit the `-d` flag, the message vanishes into a black
>> hole. Did it always work like that ...?
> 
> Yes, this is how it used to work. It's a result of ./check only printing
> the test output with -d, and such log messages are basically just test
> output.
> 
> And I think it's exactly what we want: Without -d, you want only the
> summary, i.e. a single line that says "pass", "fail" or "notrun",
> potentially with a small note at the end of the line, but that's it.
> 

OK, maybe. So I guess what happens here is that if you don't use -d, the
output gets redirected to file, and that file is summarily deleted.

Your phrase "but as long as we bother to call logger.warning(), I
thought that maybe we want to be able to actually see the effect of it
somewhere" stuck with me -- I think you're right.

I kind of do expect that if I call a function called warning() that it's
gonna do some damage. principle of least surprise, etc.

So two things:

(1) Maybe the iotest logger ought to always use stderr, and we should
see any calls to warning() or error() even when debugging is off.

(2) These skip notifications are not warnings, they are informational
and can be disabled when `-d` is omitted. (Especially because they are
represented through another channel.)

--js

(I'll send the fixup for the simpler thing first, and you can take or
leave the second thing.)

Re: [PATCH v4 1/3] qmp.py: change event_wait to use a dict

2020-05-14 Thread John Snow

On 5/14/20 11:59 AM, Kevin Wolf wrote:
> Am 14.05.2020 um 17:07 hat John Snow geschrieben:
>>
>>
>> On 5/14/20 10:47 AM, Kevin Wolf wrote:
>>> Am 14.05.2020 um 04:25 hat John Snow geschrieben:
 It's easier to work with than a list of tuples, because we can check the
 keys for membership.

 Signed-off-by: John Snow 
 ---
  python/qemu/machine.py| 10 +-
  tests/qemu-iotests/040| 12 ++--
  tests/qemu-iotests/260|  5 +++--
  tests/qemu-iotests/iotests.py | 16 
  4 files changed, 22 insertions(+), 21 deletions(-)
>>>
>>> I think you need to convert scripts/simplebench/bench_block_job.py, too.
>>
>> Oh, right -- that one is new since I did this. A lot of these scripts
>> need to be moved over into the python/ directory and managed under the
>> same pylint/mypy regime as everything else.
>>
>> A *ton* of our scripts are in various states of disrepair.
> 
> Is python/ actually supposed to have executable files in it? I thought
> it was more for libraries.
> 

Welll. At the moment it's library only. but one of the
things you can do with a library is define executable entry-points into
that library.

If you haven't cast an eye at that 32 patch series yet, it basically
creates a structure like this:

> ./python/qemu/lib/[qmp|machine|qtest|accel].py

qemu/ forms a PEP420 namespace; the idea is to be able to modularly
create and independently package subpackages.

qemu/lib forms a proper python package in which there are no
executables, just a library, as you say.

My idea is that anything under python/*/ ought to form a properly
formatted package. So we could, for instance, have a
python/qemu/devtools namespace which packages and collects a bunch of
our little scripts.

Then we could make sure we hit them with the same
mypy/pylint/flake8/whatever as the core libraries those scripts are
using to keep them in sync better.

And, ideally, if they are all using the same kind of paradigms for
import and dependency management it will be easier to use them and keep
them up to date, etc.

For using them as a developer, you could, say,
cd  ~/src/qemu/python
pip3 install --user -e .

and install the source packages to your local environment and then have
access to e.g.

> qmp-shell

right on your CLI, without having to fuss with PYTHONPATH or anything
else. As you update the source repo, you'll get the new versions of the
package living in your python environment automatically.

Of course, this maybe has downsides too; so you can always use a virtual
environment to adopt a context in which you have these tools. For that,

> pip3 install --user pipenv  # or use dnf, or apt, w/e.
> cd ~/src/qemu/python
> pipenv shell
> pip install -e .

And from here you'll have the dev package installed to a development
venv that you can use.

*cough* anyway, that's wildly off-topic.

Generally, you want to format a library such that you have a callable
entry point, maybe named main(). So you'd have some qmp-shell module and
it has a main() function.

Then, in the setup.py script, you'd define qemu.lib.qmp_shell:main() as
an entry point and give it a name like 'qmp-shell'. When pip/setuptools
processes your package installation, it'll create a shim for you in e.g.
~/.local/bin/qmp-shell that will just load the library and execute that
entrypoint for you.

I was thinking I'd do this for all of our python scripts so I could
spend my energy on a pylint/mypy test infrastructure *once* and *in one
place* and then it would be easier to detect regressions for scripts
that don't actually run as part of the test suite.

>>>
 diff --git a/python/qemu/machine.py b/python/qemu/machine.py
 index b9a98e2c86..eaedc25172 100644
 --- a/python/qemu/machine.py
 +++ b/python/qemu/machine.py
 @@ -478,21 +478,21 @@ def event_wait(self, name, timeout=60.0, match=None):
  timeout: QEMUMonitorProtocol.pull_event timeout parameter.
  match: Optional match criteria. See event_match for details.
  """
 -return self.events_wait([(name, match)], timeout)
 +return self.events_wait({name: match}, timeout)

  def events_wait(self, events, timeout=60.0):
  """
  events_wait waits for and returns a named event from QMP with a 
 timeout.

 -events: a sequence of (name, match_criteria) tuples.
 +events: a mapping containing {name: match_criteria}.
  The match criteria are optional and may be None.
  See event_match for details.  timeout:
  QEMUMonitorProtocol.pull_event timeout parameter.
  """
  def _match(event):
 -for name, match in events:
 -if event['event'] == name and self.event_match(event, 
 match):
 -return True
 +name = event['event']
 +

Re: [PATCH v4 3/3] iotests: modify test 040 to use JobRunner

2020-05-14 Thread John Snow

On 5/14/20 11:53 AM, Kevin Wolf wrote:
> Am 14.05.2020 um 04:25 hat John Snow geschrieben:
>> Instead of having somewhat reproduced it for itself.
>>
>> Signed-off-by: John Snow 
> 
> I think you should pass auto_dismiss=True to the JobRunner, or (probably
> preferable) change prepare_and_start_job() to start the job with
> auto_dismiss=False.
> 
> Kevin
> 

okay, I'll try that out and see if I like it.

Wild tangents, as is my normal:

I also think it would be neat, in some sense, to provide a job creation
abstraction where creating the QMP command in python also creates the
runner with the right parameters based on how you initialized it.

I've not given these even a proper three minutes think, but some
generalized interface for managing the creation of jobs to use in
concert with the job runner would be slick.

(What reminds me of this is needing to remember and understand if I
started something with auto_dismiss or not, which jobs it defaults to
which for, etc. Streamlining the creation and runner could be slick for
faster test-writing in normative cases.)

Re: [PATCH v4 2/3] iotests: add JobRunner class

2020-05-14 Thread John Snow




On 5/14/20 11:40 AM, Kevin Wolf wrote:
> Am 14.05.2020 um 04:25 hat John Snow geschrieben:
>> The idea is that instead of increasing the arguments to job_run all the
>> time, create a more general-purpose job runner that can be subclassed to
>> do interesting things with.
>>
>> pylint note: the 'callbacks' option guards against unused warning
>> arguments in functions designated as callbacks. It does not currently
>> guard against "no-self-use" though; hence a once-off ignore.
>>
>> mypy note: QapiEvent is only a weak alias; it's fully interchangable
>> with the type it's declared as. In the future, we may wish to tighten
>> these types. For now, this communicates the rough shape of the type and
>> (more importantly) the intent.
>>
>> Signed-off-by: John Snow 
> 
>> +# Listen for these events with these parameters:
>> +self._events = {
>> +'BLOCK_JOB_COMPLETED': match_device,
>> +'BLOCK_JOB_CANCELLED': match_device,
>> +'BLOCK_JOB_ERROR': match_device,
>> +'BLOCK_JOB_READY': match_device,
>> +'BLOCK_JOB_PENDING': match_id,
>> +'JOB_STATUS_CHANGE': match_id
>> +}
> 
> The old code had a trailing comma here in case we need to add more
> events later. Anyway:
> 
> Reviewed-by: Kevin Wolf 
> 

Whoops. I favor those too, so I'll put it back.

Re: [PATCH v2] bitmaps: Update maintainer

2020-05-14 Thread John Snow




On 5/14/20 2:00 PM, Eric Blake wrote:
> Dirty bitmaps are important to incremental backups, including exposure
> over NBD where I'm already maintainer.  Also, I'm aware that lately I
> have been doing as much code/review on bitmaps as John Snow who is
> trying to scale back in order to focus elsewhere; and many of the
> recent patches have come from Vladimir, who is also interested in
> taking on maintainer duties, but would like to start with
> co-maintainership.  Therefore, it's time to revamp the ownership of
> this category, as agreed between the three of us.

Great!

> 
> Signed-off-by: Eric Blake 
> ---
> 
> v2: further tweak to maintainership, update T: listing
> 
>  MAINTAINERS | 7 ---
>  1 file changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index d11f3cb97613..ae23062a51ac 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -2001,8 +2001,9 @@ F: qapi/transaction.json
>  T: git https://repo.or.cz/qemu/armbru.git block-next
> 
>  Dirty Bitmaps
> -M: John Snow 
> -R: Vladimir Sementsov-Ogievskiy 
> +M: Eric Blake 
> +M: Vladimir Sementsov-Ogievskiy 
> +R: John Snow 
>  L: qemu-block@nongnu.org
>  S: Supported
>  F: include/qemu/hbitmap.h
> @@ -2013,7 +2014,7 @@ F: migration/block-dirty-bitmap.c
>  F: util/hbitmap.c
>  F: tests/test-hbitmap.c
>  F: docs/interop/bitmaps.rst
> -T: git https://github.com/jnsnow/qemu.git bitmaps
> +T: git https://repo.or.cz/qemu/ericb.git bitmaps
> 
>  Character device backends
>  M: Marc-André Lureau 
> 

Acked-by: John Snow 
Reviewed-by: John Snow 

You'll want to work out repo access betwixt yourselves, but I'll leave
that detail for you to work out.

Thank you,
--js

Re: [PATCH v2 4/5] migration/block-dirty-bitmap: fix bitmaps migration during mirror job

2020-05-14 Thread Eric Blake


On 12/19/19 2:51 AM, Vladimir Sementsov-Ogievskiy wrote:

Important thing for bitmap migration is to select destination block
node to obtain the migrated bitmap.

Prepatch, on source we use bdrv_get_device_or_node_name() to identify
the node, and on target we do bdrv_lookup_bs.
bdrv_get_device_or_node_name() returns blk name only for direct
children of blk. So, bitmaps of direct children of blks are migrated by
blk name and others - by node name.

Libvirt currently is unprepared to bitmap migration by node-name,
node-names are mostly auto-generated. So actually only migration by blk
name works.


It depends on whether -blockdev is in use.  With -blockdev, libvirt 
should be supplying a node name everywhere, without, it is only device 
names.




Now, consider classic libvirt migrations assisted by mirror block job:
mirror block job inserts filter, so our source is not a direct child of
blk, and bitmaps are migrated by node-names. And this just don't work.


Does Max' work to improve seeing through filters fix this?



Let's fix it by allowing use blk-name even if some implicit filters are
inserted.

Note, that we possibly want to allow explicit filters skipping too, but
this is another story.

Note2: we, of course, can't skip filters and use blk name to migrate
bitmaps in filtered node by blk name for this blk if these filters have
named bitmaps which should be migrated.

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1652424


That bug has been marked CLOSED in the meantime, but it appears to be 
only because libvirt is now using -blockdev rather than the older drive, 
while the problem affects drive usage.



Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  migration/block-dirty-bitmap.c | 39 +-
  1 file changed, 38 insertions(+), 1 deletion(-)




Okay, after reading some more history on this project (the curse of 
coming up to speed after volunteering to become a co-maintainer), it 
looks like Max's idea replaces this patch altogether.  How much of the 
rest of the series is still important?


--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org

Re: [PATCH v2 1/5] block: Mark commit and mirror as filter drivers

2020-05-14 Thread Eric Blake


On 1/31/20 1:23 PM, Eric Blake wrote:

On 12/19/19 2:51 AM, Vladimir Sementsov-Ogievskiy wrote:

From: Max Reitz 

The commit and mirror block nodes are filters, so they should be marked
as such.

Signed-off-by: Max Reitz 
Signed-off-by: Vladimir Sementsov-Ogievskiy 
    [squash comment fix from another Max's patch and adjust commit msg]
---
  include/block/block_int.h | 8 +---
  block/commit.c    | 2 ++
  block/mirror.c    | 2 ++
  3 files changed, 9 insertions(+), 3 deletions(-)


Reviewed-by: Eric Blake 


It looks like this patch has been updated and is now on Kevin's block 
branch:

https://lists.gnu.org/archive/html/qemu-devel/2020-05/msg03271.html


--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org

Re: [PATCH v2 3/5] block/dirty-bitmap: add bdrv_has_named_bitmaps helper

2020-05-14 Thread Eric Blake


On 12/19/19 2:51 AM, Vladimir Sementsov-Ogievskiy wrote:

To be used for bitmap migration in further commit.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  include/block/dirty-bitmap.h |  1 +
  block/dirty-bitmap.c | 13 +
  2 files changed, 14 insertions(+)



Reviewed-by: Eric Blake 

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org

Re: [PATCH v2 0/5] fix migration with bitmaps and mirror

2020-05-14 Thread Eric Blake


reviving this thread...

On 4/3/20 6:23 AM, Peter Krempa wrote:

On Fri, Apr 03, 2020 at 14:02:47 +0300, Vladimir Sementsov-Ogievskiy wrote:

19.12.2019 13:36, Peter Krempa wrote:

On Thu, Dec 19, 2019 at 11:51:01 +0300, Vladimir Sementsov-Ogievskiy wrote:

Hi all!

It's a continuation for
"bitmap migration bug with -drive while block mirror runs"
<315cff78-dcdb-a3ce-2742-da3cc9f0c...@redhat.com>
https://lists.gnu.org/archive/html/qemu-devel/2019-09/msg07241.html

The problem is that bitmaps migrated to node with same node-name or
blk-parent name. And currently only the latter actually work in libvirt.
And with mirror-top filter it doesn't work, because
bdrv_get_device_or_node_name don't go through filters.


I want to point out that since libvirt-5.10 we use -blockdev to
configure the backend of storage devices with qemu-4.2 and later. This
means unfortunately that the BlockBackend of the drive does not have a
name any more and thus the above will not work even if you make the
lookup code to see through filters.


Not that this series doesn't make things worse, as it loops through named
block backends when trying to use their name for migration. So with these
patches applied, qemu will just work in more possible scenarios.


Okay, if that's so it's fair enough in this case.

I'm just very firmly against baking in the assumption that
node names mean the same thing accross migration, because that will
create a precedent situation and more stuff may be baked in on top of
this in the future. It seems that it has already happened though and
it's wrong. And the worst part is that it's never mentioned that this
might occur. But again, don't do that and preferrably remove the
matching of node names for bitmaps altogether until we can control it
arbitrarily.

We've also seen this already before with the backend name of memory
devices being baked in to the migration stream which creates an unwanted
dependancy.


Max is trying to tackle the node-name issue:
https://lists.gnu.org/archive/html/qemu-devel/2020-05/msg03358.html

And trying to apply that patch after staging this series hits a conflict 
in mnigration/block-dirty-bitmap.c.  Which one should go in first?


--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org

[PATCH v2] bitmaps: Update maintainer

2020-05-14 Thread Eric Blake

Dirty bitmaps are important to incremental backups, including exposure
over NBD where I'm already maintainer.  Also, I'm aware that lately I
have been doing as much code/review on bitmaps as John Snow who is
trying to scale back in order to focus elsewhere; and many of the
recent patches have come from Vladimir, who is also interested in
taking on maintainer duties, but would like to start with
co-maintainership.  Therefore, it's time to revamp the ownership of
this category, as agreed between the three of us.

Signed-off-by: Eric Blake 
---

v2: further tweak to maintainership, update T: listing

 MAINTAINERS | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index d11f3cb97613..ae23062a51ac 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2001,8 +2001,9 @@ F: qapi/transaction.json
 T: git https://repo.or.cz/qemu/armbru.git block-next

 Dirty Bitmaps
-M: John Snow 
-R: Vladimir Sementsov-Ogievskiy 
+M: Eric Blake 
+M: Vladimir Sementsov-Ogievskiy 
+R: John Snow 
 L: qemu-block@nongnu.org
 S: Supported
 F: include/qemu/hbitmap.h
@@ -2013,7 +2014,7 @@ F: migration/block-dirty-bitmap.c
 F: util/hbitmap.c
 F: tests/test-hbitmap.c
 F: docs/interop/bitmaps.rst
-T: git https://github.com/jnsnow/qemu.git bitmaps
+T: git https://repo.or.cz/qemu/ericb.git bitmaps

 Character device backends
 M: Marc-André Lureau 
-- 
2.26.2

Re: [PATCH] bitmaps: Add myself as maintainer

2020-05-14 Thread Vladimir Sementsov-Ogievskiy


14.05.2020 16:52, Eric Blake wrote:

On 5/14/20 12:08 AM, John Snow wrote:



On 5/14/20 12:49 AM, Vladimir Sementsov-Ogievskiy wrote:

13.05.2020 23:24, John Snow wrote:



On 5/13/20 10:14 AM, Eric Blake wrote:

Dirty bitmaps are important to incremental backups, including exposure
over NBD where I'm already maintainer.  Also, I'm aware that lately I
have been doing as much code/review on bitmaps as John Snow, who is
hoping to scale back on this front.

Signed-off-by: Eric Blake 

---



   Dirty Bitmaps
   M: John Snow 
+M: Eric Blake 
   R: Vladimir Sementsov-Ogievskiy 
   L: qemu-block@nongnu.org
   S: Supported



I'd also like to point out that I wouldn't mind if Vladimir became an
official maintainer, but I can't remember if he wanted the title when we
last spoke at KVM Forum.


Actually, it would be nice, I'd glad to get it, thanks :)
I can send a separate patch, or we may s/R/M/ in this one?



That would be very good!

I'd be quite happy to be demoted to reviewer; it's about all the time
I've been truthfully able to give lately.

(I won't speak for Eric!)


I can post a v2 that produces the following results:

M: Vladimir
M: Eric
R: John

Does that sound reasonable?



Great!


--
Best regards,
Vladimir

Re: [PATCH 0/2] block/block-copy: Fix uninitialized variable in block_copy_task_entry

2020-05-14 Thread Kevin Wolf

Am 07.05.2020 um 14:11 hat Philippe Mathieu-DaudÃ© geschrieben:
> block_copy_task_entry() might be written differently to avoid
> the initialization. This silents the warning and let me build.

Thanks, applied to the block branch.

Kevin

Re: [PATCH v6 04/14] block/amend: separate amend and create options for qemu-img

2020-05-14 Thread Eric Blake


On 5/14/20 7:28 AM, Max Reitz wrote:

On 10.05.20 15:40, Maxim Levitsky wrote:

Some options are only useful for creation
(or hard to be amended, like cluster size for qcow2), while some other
options are only useful for amend, like upcoming keyslot management
options for luks



  
+#define QCOW_COMMON_OPTIONS \

+{   \



+.help = "Width of a reference count entry in bits", \
+.def_value_str = "16"   \
+}   \


I think the last line should have a comma in it (otherwise the final
backslash doesn’t make much sense, because whenever we’d add a new
option, we would need to modify the line anyway to insert a comma).


Except that...



Speaking of adding option, this requires a rebase due to the
compression_type option added (not trivial in the strict sense, but
still straightforward to handle).


+
  static QemuOptsList qcow2_create_opts = {
  .name = "qcow2-create-opts",
  .head = QTAILQ_HEAD_INITIALIZER(qcow2_create_opts.head),
  .desc = {



[...]


+QCOW_COMMON_OPTIONS,
+{ /* end of list */ }


...the intended usage is to use the macro name followed by a comma, so 
including a trailing comma in the macro itself would lead to a syntax 
error.  I think the better fix is to eliminate the trailing \ on the 
final line, and have '}' without a trailing comma in the macro.


--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org

Re: [PATCH v6 13/14] block/qcow2: implement blockdev-amend

2020-05-14 Thread Max Reitz

On 10.05.20 15:40, Maxim Levitsky wrote:
> Currently the implementation only supports amending the encryption
> options, unlike the qemu-img version
> 
> Signed-off-by: Maxim Levitsky 
> Reviewed-by: Daniel P. Berrangé 
> ---
>  block/qcow2.c| 39 +++
>  qapi/block-core.json | 16 +++-
>  2 files changed, 54 insertions(+), 1 deletion(-)

Reviewed-by: Max Reitz 



signature.asc
Description: OpenPGP digital signature

Re: [PATCH v6 12/14] block/crypto: implement blockdev-amend

2020-05-14 Thread Max Reitz

On 10.05.20 15:40, Maxim Levitsky wrote:
> Signed-off-by: Maxim Levitsky 
> Reviewed-by: Daniel P. Berrangé 
> ---
>  block/crypto.c   | 72 
>  qapi/block-core.json | 14 -
>  2 files changed, 66 insertions(+), 20 deletions(-)

Reviewed-by: Max Reitz 



signature.asc
Description: OpenPGP digital signature

Re: [PATCH v4 1/3] qmp.py: change event_wait to use a dict

2020-05-14 Thread Kevin Wolf

Am 14.05.2020 um 17:07 hat John Snow geschrieben:
> 
> 
> On 5/14/20 10:47 AM, Kevin Wolf wrote:
> > Am 14.05.2020 um 04:25 hat John Snow geschrieben:
> >> It's easier to work with than a list of tuples, because we can check the
> >> keys for membership.
> >>
> >> Signed-off-by: John Snow 
> >> ---
> >>  python/qemu/machine.py| 10 +-
> >>  tests/qemu-iotests/040| 12 ++--
> >>  tests/qemu-iotests/260|  5 +++--
> >>  tests/qemu-iotests/iotests.py | 16 
> >>  4 files changed, 22 insertions(+), 21 deletions(-)
> > 
> > I think you need to convert scripts/simplebench/bench_block_job.py, too.
> 
> Oh, right -- that one is new since I did this. A lot of these scripts
> need to be moved over into the python/ directory and managed under the
> same pylint/mypy regime as everything else.
> 
> A *ton* of our scripts are in various states of disrepair.

Is python/ actually supposed to have executable files in it? I thought
it was more for libraries.

> > 
> >> diff --git a/python/qemu/machine.py b/python/qemu/machine.py
> >> index b9a98e2c86..eaedc25172 100644
> >> --- a/python/qemu/machine.py
> >> +++ b/python/qemu/machine.py
> >> @@ -478,21 +478,21 @@ def event_wait(self, name, timeout=60.0, match=None):
> >>  timeout: QEMUMonitorProtocol.pull_event timeout parameter.
> >>  match: Optional match criteria. See event_match for details.
> >>  """
> >> -return self.events_wait([(name, match)], timeout)
> >> +return self.events_wait({name: match}, timeout)
> >>  
> >>  def events_wait(self, events, timeout=60.0):
> >>  """
> >>  events_wait waits for and returns a named event from QMP with a 
> >> timeout.
> >>  
> >> -events: a sequence of (name, match_criteria) tuples.
> >> +events: a mapping containing {name: match_criteria}.
> >>  The match criteria are optional and may be None.
> >>  See event_match for details.  timeout:
> >>  QEMUMonitorProtocol.pull_event timeout parameter.
> >>  """
> >>  def _match(event):
> >> -for name, match in events:
> >> -if event['event'] == name and self.event_match(event, 
> >> match):
> >> -return True
> >> +name = event['event']
> >> +if name in events:
> >> +return self.event_match(event, events[name])
> > 
> > This part confused me a bit for a second. Of course, that's probably
> > mostly just me, but I feel 'events' isn't a good name any more when the
> > values of the dict are match strings rather than events.
> > 
> 
> This is honestly a really bad function. When I was trying to type
> everything, this one was at the bottom of the pile and it was the worst.
> 
> It needs an overhaul.
> 
> In my 32 patch series, I left the "match" types as "Any" pretty much
> everywhere, because it's such a laissez-faire series of functions.

It would require recursive types, which aren't supported yet. So I guess
Any is the best we can do at the moment.

> I'll keep the feedback in mind.
> 
> >>  return False
> >>  
> >>  # Search cached events
> >> diff --git a/tests/qemu-iotests/040 b/tests/qemu-iotests/040
> >> index 32c82b4ec6..90b59081ff 100755
> >> --- a/tests/qemu-iotests/040
> >> +++ b/tests/qemu-iotests/040
> >> @@ -485,12 +485,12 @@ class TestErrorHandling(iotests.QMPTestCase):
> >>  
> >>  def run_job(self, expected_events, error_pauses_job=False):
> >>  match_device = {'data': {'device': 'job0'}}
> >> -events = [
> >> -('BLOCK_JOB_COMPLETED', match_device),
> >> -('BLOCK_JOB_CANCELLED', match_device),
> >> -('BLOCK_JOB_ERROR', match_device),
> >> -('BLOCK_JOB_READY', match_device),
> >> -]
> >> +events = {
> >> +'BLOCK_JOB_COMPLETED': match_device,
> >> +'BLOCK_JOB_CANCELLED': match_device,
> >> +'BLOCK_JOB_ERROR': match_device,
> >> +'BLOCK_JOB_READY': match_device,
> >> +}
> >>  
> >>  completed = False
> >>  log = []
> >> diff --git a/tests/qemu-iotests/260 b/tests/qemu-iotests/260
> >> index 804a7addb9..729f031122 100755
> >> --- a/tests/qemu-iotests/260
> >> +++ b/tests/qemu-iotests/260
> >> @@ -67,8 +67,9 @@ def test(persistent, restart):
> >>  
> >>  vm.qmp_log('block-commit', device='drive0', top=top,
> >> filters=[iotests.filter_qmp_testfiles])
> >> -ev = vm.events_wait((('BLOCK_JOB_READY', None),
> >> - ('BLOCK_JOB_COMPLETED', None)))
> >> +ev = vm.events_wait({
> >> +'BLOCK_JOB_READY': None,
> >> +'BLOCK_JOB_COMPLETED': None })
> > 
> > So, I'm not sure if this is nitpicking or rather bikeshedding, but
> > having the closing brackets on the next line would be more consistent
> > with the other instances in this patch.
> > 
> 
> Nah, it's fine. I'll

Re: [PATCH v4 3/3] iotests: modify test 040 to use JobRunner

2020-05-14 Thread Kevin Wolf

Am 14.05.2020 um 04:25 hat John Snow geschrieben:
> Instead of having somewhat reproduced it for itself.
> 
> Signed-off-by: John Snow 

I think you should pass auto_dismiss=True to the JobRunner, or (probably
preferable) change prepare_and_start_job() to start the job with
auto_dismiss=False.

Kevin

Re: [PATCH v6 11/14] block/core: add generic infrastructure for x-blockdev-amend qmp command

2020-05-14 Thread Max Reitz

On 10.05.20 15:40, Maxim Levitsky wrote:
> blockdev-amend will be used similiar to blockdev-create
> to allow on the fly changes of the structure of the format based block 
> devices.
> 
> Current plan is to first support encryption keyslot management for luks
> based formats (raw and embedded in qcow2)
> 
> Signed-off-by: Maxim Levitsky 
> Reviewed-by: Daniel P. Berrangé 
> ---
>  block/Makefile.objs   |   2 +-
>  block/amend.c | 108 ++
>  include/block/block_int.h |  21 +---
>  qapi/block-core.json  |  42 +++
>  qapi/job.json |   4 +-
>  5 files changed, 169 insertions(+), 8 deletions(-)
>  create mode 100644 block/amend.c

[...]

> diff --git a/block/amend.c b/block/amend.c
> new file mode 100644
> index 00..4840c0ffef
> --- /dev/null
> +++ b/block/amend.c

[...]

> +void qmp_x_blockdev_amend(const char *job_id,
> +  const char *node_name,
> +  BlockdevAmendOptions *options,
> +  bool has_force,
> +  bool force,
> +  Error **errp)
> +{
> +BlockdevAmendJob *s;
> +const char *fmt = BlockdevDriver_str(options->driver);
> +BlockDriver *drv = bdrv_find_format(fmt);
> +BlockDriverState *bs = bdrv_find_node(node_name);
> +
> +/*
> + * If the driver is in the schema, we know that it exists.

I wonder why create.c then still checks whether drv == NULL.

I wouldn’t count on the schema.  For example, with modularized block
driver I could imagine that a driver appears in the schema, but loading
the module fails, so that drv still ends up NULL.

> But it may not
> + * be whitelisted.
> + */
> +assert(drv);
> +if (bdrv_uses_whitelist() && !bdrv_is_whitelisted(drv, false)) {
> +error_setg(errp, "Driver is not whitelisted");
> +return;
> +}

[...]

> diff --git a/include/block/block_int.h b/include/block/block_int.h
> index 0a71357b50..fdb0cdbcdd 100644
> --- a/include/block/block_int.h
> +++ b/include/block/block_int.h
> @@ -133,12 +133,27 @@ struct BlockDriver {
>  int (*bdrv_file_open)(BlockDriverState *bs, QDict *options, int flags,
>Error **errp);
>  void (*bdrv_close)(BlockDriverState *bs);
> +
> +
>  int coroutine_fn (*bdrv_co_create)(BlockdevCreateOptions *opts,
> Error **errp);
>  int coroutine_fn (*bdrv_co_create_opts)(BlockDriver *drv,
>  const char *filename,
>  QemuOpts *opts,
>  Error **errp);
> +
> +int coroutine_fn (*bdrv_co_amend)(BlockDriverState *bs,
> +  BlockdevAmendOptions *opts,
> +  bool force,
> +  Error **errp);
> +
> +int (*bdrv_amend_options)(BlockDriverState *bs,
> +  QemuOpts *opts,
> +  BlockDriverAmendStatusCB *status_cb,
> +  void *cb_opaque,
> +  bool force,
> +  Error **errp);
> +
>  int (*bdrv_make_empty)(BlockDriverState *bs);
>  
>  /*
> @@ -433,12 +448,6 @@ struct BlockDriver {
>BdrvCheckResult *result,
>BdrvCheckMode fix);
>  
> -int (*bdrv_amend_options)(BlockDriverState *bs, QemuOpts *opts,
> -  BlockDriverAmendStatusCB *status_cb,
> -  void *cb_opaque,
> -  bool force,
> -  Error **errp);
> -

No harm done, but why not just leave it where it was?

>  void (*bdrv_debug_event)(BlockDriverState *bs, BlkdebugEvent event);
>  
>  /* TODO Better pass a option string/QDict/QemuOpts to add any rule? */
> diff --git a/qapi/block-core.json b/qapi/block-core.json
> index 943df1926a..74db515414 100644
> --- a/qapi/block-core.json
> +++ b/qapi/block-core.json
> @@ -4649,6 +4649,48 @@
>'data': { 'job-id': 'str',
>  'options': 'BlockdevCreateOptions' } }
>  
> +##
> +# @BlockdevAmendOptions:
> +#
> +# Options for amending an image format
> +#
> +# @driver   block driver that is suitable for the image

Missing colon after @driver.  Also, what does “suitable” mean?
Shouldn’t it be exactly the node’s driver?  (i.e. “Block driver of the
node to amend”)

Max



signature.asc
Description: OpenPGP digital signature

Re: [PATCH v4 2/3] iotests: add JobRunner class

2020-05-14 Thread Kevin Wolf

Am 14.05.2020 um 04:25 hat John Snow geschrieben:
> The idea is that instead of increasing the arguments to job_run all the
> time, create a more general-purpose job runner that can be subclassed to
> do interesting things with.
> 
> pylint note: the 'callbacks' option guards against unused warning
> arguments in functions designated as callbacks. It does not currently
> guard against "no-self-use" though; hence a once-off ignore.
> 
> mypy note: QapiEvent is only a weak alias; it's fully interchangable
> with the type it's declared as. In the future, we may wish to tighten
> these types. For now, this communicates the rough shape of the type and
> (more importantly) the intent.
> 
> Signed-off-by: John Snow 

> +# Listen for these events with these parameters:
> +self._events = {
> +'BLOCK_JOB_COMPLETED': match_device,
> +'BLOCK_JOB_CANCELLED': match_device,
> +'BLOCK_JOB_ERROR': match_device,
> +'BLOCK_JOB_READY': match_device,
> +'BLOCK_JOB_PENDING': match_id,
> +'JOB_STATUS_CHANGE': match_id
> +}

The old code had a trailing comma here in case we need to add more
events later. Anyway:

Reviewed-by: Kevin Wolf

Re: [PATCH v4 1/3] qmp.py: change event_wait to use a dict

2020-05-14 Thread John Snow




On 5/14/20 10:47 AM, Kevin Wolf wrote:
> Am 14.05.2020 um 04:25 hat John Snow geschrieben:
>> It's easier to work with than a list of tuples, because we can check the
>> keys for membership.
>>
>> Signed-off-by: John Snow 
>> ---
>>  python/qemu/machine.py| 10 +-
>>  tests/qemu-iotests/040| 12 ++--
>>  tests/qemu-iotests/260|  5 +++--
>>  tests/qemu-iotests/iotests.py | 16 
>>  4 files changed, 22 insertions(+), 21 deletions(-)
> 
> I think you need to convert scripts/simplebench/bench_block_job.py, too.

Oh, right -- that one is new since I did this. A lot of these scripts
need to be moved over into the python/ directory and managed under the
same pylint/mypy regime as everything else.

A *ton* of our scripts are in various states of disrepair.

> 
>> diff --git a/python/qemu/machine.py b/python/qemu/machine.py
>> index b9a98e2c86..eaedc25172 100644
>> --- a/python/qemu/machine.py
>> +++ b/python/qemu/machine.py
>> @@ -478,21 +478,21 @@ def event_wait(self, name, timeout=60.0, match=None):
>>  timeout: QEMUMonitorProtocol.pull_event timeout parameter.
>>  match: Optional match criteria. See event_match for details.
>>  """
>> -return self.events_wait([(name, match)], timeout)
>> +return self.events_wait({name: match}, timeout)
>>  
>>  def events_wait(self, events, timeout=60.0):
>>  """
>>  events_wait waits for and returns a named event from QMP with a 
>> timeout.
>>  
>> -events: a sequence of (name, match_criteria) tuples.
>> +events: a mapping containing {name: match_criteria}.
>>  The match criteria are optional and may be None.
>>  See event_match for details.  timeout:
>>  QEMUMonitorProtocol.pull_event timeout parameter.
>>  """
>>  def _match(event):
>> -for name, match in events:
>> -if event['event'] == name and self.event_match(event, 
>> match):
>> -return True
>> +name = event['event']
>> +if name in events:
>> +return self.event_match(event, events[name])
> 
> This part confused me a bit for a second. Of course, that's probably
> mostly just me, but I feel 'events' isn't a good name any more when the
> values of the dict are match strings rather than events.
> 

This is honestly a really bad function. When I was trying to type
everything, this one was at the bottom of the pile and it was the worst.

It needs an overhaul.

In my 32 patch series, I left the "match" types as "Any" pretty much
everywhere, because it's such a laissez-faire series of functions.

I'll keep the feedback in mind.

>>  return False
>>  
>>  # Search cached events
>> diff --git a/tests/qemu-iotests/040 b/tests/qemu-iotests/040
>> index 32c82b4ec6..90b59081ff 100755
>> --- a/tests/qemu-iotests/040
>> +++ b/tests/qemu-iotests/040
>> @@ -485,12 +485,12 @@ class TestErrorHandling(iotests.QMPTestCase):
>>  
>>  def run_job(self, expected_events, error_pauses_job=False):
>>  match_device = {'data': {'device': 'job0'}}
>> -events = [
>> -('BLOCK_JOB_COMPLETED', match_device),
>> -('BLOCK_JOB_CANCELLED', match_device),
>> -('BLOCK_JOB_ERROR', match_device),
>> -('BLOCK_JOB_READY', match_device),
>> -]
>> +events = {
>> +'BLOCK_JOB_COMPLETED': match_device,
>> +'BLOCK_JOB_CANCELLED': match_device,
>> +'BLOCK_JOB_ERROR': match_device,
>> +'BLOCK_JOB_READY': match_device,
>> +}
>>  
>>  completed = False
>>  log = []
>> diff --git a/tests/qemu-iotests/260 b/tests/qemu-iotests/260
>> index 804a7addb9..729f031122 100755
>> --- a/tests/qemu-iotests/260
>> +++ b/tests/qemu-iotests/260
>> @@ -67,8 +67,9 @@ def test(persistent, restart):
>>  
>>  vm.qmp_log('block-commit', device='drive0', top=top,
>> filters=[iotests.filter_qmp_testfiles])
>> -ev = vm.events_wait((('BLOCK_JOB_READY', None),
>> - ('BLOCK_JOB_COMPLETED', None)))
>> +ev = vm.events_wait({
>> +'BLOCK_JOB_READY': None,
>> +'BLOCK_JOB_COMPLETED': None })
> 
> So, I'm not sure if this is nitpicking or rather bikeshedding, but
> having the closing brackets on the next line would be more consistent
> with the other instances in this patch.
> 

Nah, it's fine. I'll clean it up. This is pretty close to an RFC series
anyway, so I didn't really polish it.

(Or, I will try to clean it up. I probably won't work on it again in the
near term. I think I just wanted to see if this seemed useful in general
to people.

As part of maybe moving the python library onto a package, I thought
that maybe developing a JobRunner tool would be useful to have in that
library. As you can see, I nestled it into iotests.py, though.)

>>  log(filter_qmp_event(ev))
>>

Re: [PATCH 0/2] block/block-copy: Fix uninitialized variable in block_copy_task_entry

2020-05-14 Thread Philippe Mathieu-Daudé


On 5/7/20 2:11 PM, Philippe Mathieu-Daudé wrote:

block_copy_task_entry() might be written differently to avoid
the initialization. This silents the warning and let me build.

Philippe Mathieu-Daudé (2):
   block/block-copy: Fix uninitialized variable in block_copy_task_entry
   block/block-copy: Simplify block_copy_do_copy()

  block/block-copy.c | 14 --
  1 file changed, 4 insertions(+), 10 deletions(-)



ping?

Re: [PATCH v4 6/9] qemu-img: Add bitmap sub-command

2020-05-14 Thread Vladimir Sementsov-Ogievskiy


14.05.2020 17:20, Eric Blake wrote:

On 5/14/20 1:45 AM, Vladimir Sementsov-Ogievskiy wrote:

13.05.2020 04:16, Eric Blake wrote:

Include actions for --add, --remove, --clear, --enable, --disable, and
--merge (note that --clear is a bit of fluff, because the same can be
accomplished by removing a bitmap and then adding a new one in its
place, but it matches what QMP commands exist).  Listing is omitted,
because it does not require a bitmap name and because it was already
possible with 'qemu-img info'.  A single command line can play one or
more bitmap commands in sequence on the same bitmap name (although all
added bitmaps share the same granularity, and and all merged bitmaps
come from the same source file).  Merge defaults to other bitmaps in
the primary image, but can also be told to merge bitmaps from a
distinct image.





I'm sorry for asking it only now on v4.. But still. Why do we need it? 


Ease of use.


We can instead run qemu binary (or even new qemu-storage-daemon) and just use 
existing qmp commands. Is there a real benefit in developing qemu-img, 
maintaining two interfaces for the same thing?


If it makes someone's life easier, and is not hard to maintain, then yes.  A 
command line interface that calls into QMP is not hard to maintain.  And _I_ 
certainly found it easier to write iotests with this patch in place, so it 
already has at least one client.


Of-course, just run qmp commands from terminal is a lot less comfortable than 
just a qemu img command.. But may be we need some wrapper, which make it simple 
to run one qmp command on an image?

It's simple to make a python wrapper working like

qemu-qmp block-dirty-bitmap-add '{node: self, name: bitmap0, persistent: true}' 
/path/to/x.qcow2


This _IS_ such a wrapper.  The whole point of this patch is that it is now 
simpler to run one (or more) QMP command on an offline image from the command 
line.  Just because I wrote it in C instead of python, and attached it to an 
existing tool instead of writing a new tool, doesn't change the fact that it is 
just a wrapper around the existing QMP commands.



OK, that's right.

The thing that I didn't like is that we have to make cli-to-qapi interface 
mapping by hand. But I see now that interface you implementing is prepared to 
make several actions with same bitmap-name, which can't be achieved with some 
kind of automatic interface matching anyway, so my proposal don't match your 
needs, sorry for my haste)


--
Best regards,
Vladimir

Re: [PATCH v6 07/14] block/crypto: implement the encryption key management

2020-05-14 Thread Max Reitz

On 14.05.20 16:14, Daniel P. Berrangé wrote:
> On Thu, May 14, 2020 at 04:09:59PM +0200, Max Reitz wrote:
>> On 10.05.20 15:40, Maxim Levitsky wrote:
>>> This implements the encryption key management using the generic code in
>>> qcrypto layer and exposes it to the user via qemu-img
>>>
>>> This code adds another 'write_func' because the initialization
>>> write_func works directly on the underlying file, and amend
>>> works on instance of luks device.
>>>
>>> This commit also adds a 'hack/workaround' I and Kevin Wolf (thanks)
>>> made to make the driver both support write sharing (to avoid breaking the 
>>> users),
>>> and be safe against concurrent  metadata update (the keyslots)
>>>
>>> Eventually the write sharing for luks driver will be deprecated
>>> and removed together with this hack.
>>>
>>> The hack is that we ask (as a format driver) for BLK_PERM_CONSISTENT_READ
>>> and then when we want to update the keys, we unshare that permission.
>>> So if someone else has the image open, even readonly, encryption
>>> key update will fail gracefully.
>>>
>>> Also thanks to Daniel Berrange for the idea of
>>> unsharing read, rather that write permission which allows
>>> to avoid cases when the other user had opened the image read-only.
>>>
>>> Signed-off-by: Maxim Levitsky 
>>> Reviewed-by: Daniel P. Berrangé 
>>> ---
>>>  block/crypto.c | 127 +++--
>>>  block/crypto.h |  34 +
>>>  2 files changed, 158 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/block/crypto.c b/block/crypto.c
>>> index 2e16b62bdc..b14cb0ff06 100644
>>> --- a/block/crypto.c
>>> +++ b/block/crypto.c
>>
>> [...]
>>
>>> +static void
>>> +block_crypto_child_perms(BlockDriverState *bs, BdrvChild *c,
>>> + const BdrvChildRole *role,
>>> + BlockReopenQueue *reopen_queue,
>>> + uint64_t perm, uint64_t shared,
>>> + uint64_t *nperm, uint64_t *nshared)
>>> +{
>>> +
>>> +BlockCrypto *crypto = bs->opaque;
>>> +
>>> +bdrv_filter_default_perms(bs, c, role, reopen_queue,
>>> +perm, shared, nperm, nshared);
>>> +/*
>>> + * Ask for consistent read permission so that if
>>> + * someone else tries to open this image with this permission
>>> + * neither will be able to edit encryption keys, since
>>> + * we will unshare that permission while trying to
>>> + * update the encryption keys
>>> + */
>>> +if (!(bs->open_flags & BDRV_O_NO_IO)) {
>>> +*nperm |= BLK_PERM_CONSISTENT_READ;
>>> +}
>>
>> I’m not sure this is important, because this really means we won’t do
>> I/O.  Its only relevant use in this case is for qemu-img info.  Do we
>> really care if someone edits the key slots while qemu-img info is
>> processing?
> 
> FWIW, OpenStack runs  qemu-img info in a periodic background job, so
> it can be concurrent with anything else they are running.

That might actually be a problem then, because this may cause sporadic
failure when trying to change (amend) keyslots; while qemu-img info
holds the CONSISTENT_READ permission, the amend process can’t unshare
it.  That might lead to hard-to-track-down bugs.

> Having said
> that due to previous QEMU bugs, they unconditonally pass the arg to
> qemu-img to explicitly disable locking

Well, then it doesn’t matter in this case.  But still something to
consider, probably.

Max



signature.asc
Description: OpenPGP digital signature

Re: [PATCH v6 09/14] iotests: filter few more luks specific create options

2020-05-14 Thread Max Reitz

On 10.05.20 15:40, Maxim Levitsky wrote:
> This allows more tests to be able to have same output on both qcow2 luks 
> encrypted images
> and raw luks images
> 
> Signed-off-by: Maxim Levitsky 
> Reviewed-by: Daniel P. Berrangé 
> ---
>  tests/qemu-iotests/087.out   |  6 ++---
>  tests/qemu-iotests/134.out   |  2 +-
>  tests/qemu-iotests/158.out   |  4 +--
>  tests/qemu-iotests/188.out   |  2 +-
>  tests/qemu-iotests/189.out   |  4 +--
>  tests/qemu-iotests/198.out   |  4 +--
>  tests/qemu-iotests/263.out   |  4 +--
>  tests/qemu-iotests/274.out   | 46 
>  tests/qemu-iotests/284.out   |  6 ++---
>  tests/qemu-iotests/common.filter |  6 +++--
>  10 files changed, 43 insertions(+), 41 deletions(-)

[...]

> diff --git a/tests/qemu-iotests/274.out b/tests/qemu-iotests/274.out
> index 9d6fdeb1f7..59de176b99 100644
> --- a/tests/qemu-iotests/274.out
> +++ b/tests/qemu-iotests/274.out
> @@ -1,9 +1,9 @@
>  == Commit tests ==
> -Formatting 'TEST_DIR/PID-base', fmt=qcow2 size=2097152 cluster_size=65536 
> lazy_refcounts=off refcount_bits=16
> +Formatting 'TEST_DIR/PID-base', fmt=qcow2 cluster_size=65536 size=2097152 
> lazy_refcounts=off refcount_bits=16
>  
> -Formatting 'TEST_DIR/PID-mid', fmt=qcow2 size=1048576 
> backing_file=TEST_DIR/PID-base cluster_size=65536 lazy_refcounts=off 
> refcount_bits=16
> +Formatting 'TEST_DIR/PID-mid', fmt=qcow2 cluster_size=65536 size=1048576 
> backing_file=TEST_DIR/PID-base lazy_refcounts=off refcount_bits=16
>  
> -Formatting 'TEST_DIR/PID-top', fmt=qcow2 size=2097152 
> backing_file=TEST_DIR/PID-mid cluster_size=65536 lazy_refcounts=off 
> refcount_bits=16
> +Formatting 'TEST_DIR/PID-top', fmt=qcow2 cluster_size=65536 size=2097152 
> backing_file=TEST_DIR/PID-mid lazy_refcounts=off refcount_bits=16

@size and @cluster_size swapping positions doesn’t look right for this
patch.  I think all hunks for 274.out should be in patch 5.

Max



signature.asc
Description: OpenPGP digital signature

Re: [PATCH v2 0/1] qemu-nbd: Close inherited stderr

2020-05-14 Thread Eric Blake


On 5/14/20 9:29 AM, Eric Blake wrote:


WARNING: Block comments use a leading /* on a separate line
#20: FILE: qemu-nbd.c:919:
+    /* Remember parents stderr only if the fork option is set.



The comment could use some grammar help (s/parents/parent's/), and in 
truth, I don't think it adds much beyond what the code itself is already 
doing, so rather than adding another line to silence patchew, you could 
instead just eliminate the comment and life would still be fine.  Or if 
you want a one-line comment, I might suggest:


/* Remember parent's stderr if we will restoring it. */


It helps if I don't hit 'send' too early.

/* Remember parent's stderr if we will be restoring it. */

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org

Re: [PATCH v4 1/3] qmp.py: change event_wait to use a dict

2020-05-14 Thread Kevin Wolf

Am 14.05.2020 um 04:25 hat John Snow geschrieben:
> It's easier to work with than a list of tuples, because we can check the
> keys for membership.
> 
> Signed-off-by: John Snow 
> ---
>  python/qemu/machine.py| 10 +-
>  tests/qemu-iotests/040| 12 ++--
>  tests/qemu-iotests/260|  5 +++--
>  tests/qemu-iotests/iotests.py | 16 
>  4 files changed, 22 insertions(+), 21 deletions(-)

I think you need to convert scripts/simplebench/bench_block_job.py, too.

> diff --git a/python/qemu/machine.py b/python/qemu/machine.py
> index b9a98e2c86..eaedc25172 100644
> --- a/python/qemu/machine.py
> +++ b/python/qemu/machine.py
> @@ -478,21 +478,21 @@ def event_wait(self, name, timeout=60.0, match=None):
>  timeout: QEMUMonitorProtocol.pull_event timeout parameter.
>  match: Optional match criteria. See event_match for details.
>  """
> -return self.events_wait([(name, match)], timeout)
> +return self.events_wait({name: match}, timeout)
>  
>  def events_wait(self, events, timeout=60.0):
>  """
>  events_wait waits for and returns a named event from QMP with a 
> timeout.
>  
> -events: a sequence of (name, match_criteria) tuples.
> +events: a mapping containing {name: match_criteria}.
>  The match criteria are optional and may be None.
>  See event_match for details.  timeout:
>  QEMUMonitorProtocol.pull_event timeout parameter.
>  """
>  def _match(event):
> -for name, match in events:
> -if event['event'] == name and self.event_match(event, match):
> -return True
> +name = event['event']
> +if name in events:
> +return self.event_match(event, events[name])

This part confused me a bit for a second. Of course, that's probably
mostly just me, but I feel 'events' isn't a good name any more when the
values of the dict are match strings rather than events.

>  return False
>  
>  # Search cached events
> diff --git a/tests/qemu-iotests/040 b/tests/qemu-iotests/040
> index 32c82b4ec6..90b59081ff 100755
> --- a/tests/qemu-iotests/040
> +++ b/tests/qemu-iotests/040
> @@ -485,12 +485,12 @@ class TestErrorHandling(iotests.QMPTestCase):
>  
>  def run_job(self, expected_events, error_pauses_job=False):
>  match_device = {'data': {'device': 'job0'}}
> -events = [
> -('BLOCK_JOB_COMPLETED', match_device),
> -('BLOCK_JOB_CANCELLED', match_device),
> -('BLOCK_JOB_ERROR', match_device),
> -('BLOCK_JOB_READY', match_device),
> -]
> +events = {
> +'BLOCK_JOB_COMPLETED': match_device,
> +'BLOCK_JOB_CANCELLED': match_device,
> +'BLOCK_JOB_ERROR': match_device,
> +'BLOCK_JOB_READY': match_device,
> +}
>  
>  completed = False
>  log = []
> diff --git a/tests/qemu-iotests/260 b/tests/qemu-iotests/260
> index 804a7addb9..729f031122 100755
> --- a/tests/qemu-iotests/260
> +++ b/tests/qemu-iotests/260
> @@ -67,8 +67,9 @@ def test(persistent, restart):
>  
>  vm.qmp_log('block-commit', device='drive0', top=top,
> filters=[iotests.filter_qmp_testfiles])
> -ev = vm.events_wait((('BLOCK_JOB_READY', None),
> - ('BLOCK_JOB_COMPLETED', None)))
> +ev = vm.events_wait({
> +'BLOCK_JOB_READY': None,
> +'BLOCK_JOB_COMPLETED': None })

So, I'm not sure if this is nitpicking or rather bikeshedding, but
having the closing brackets on the next line would be more consistent
with the other instances in this patch.

>  log(filter_qmp_event(ev))
>  if (ev['event'] == 'BLOCK_JOB_COMPLETED'):
>  vm.shutdown()

Kevin

Re: [RFC PATCH 0/3] block: Synchronous bdrv_*() from coroutine in different AioContext

2020-05-14 Thread Kevin Wolf

Am 14.05.2020 um 15:21 hat Thomas Lamprecht geschrieben:
> On 5/12/20 4:43 PM, Kevin Wolf wrote:
> > Stefan (Reiter), after looking a bit closer at this, I think there is no
> > bug in QEMU, but the bug is in your coroutine code that calls block
> > layer functions without moving into the right AioContext first. I've
> > written this series anyway as it potentially makes the life of callers
> > easier and would probably make your buggy code correct.
> 
> > However, it doesn't feel right to commit something like patch 2 without
> > having a user for it. Is there a reason why you can't upstream your
> > async snapshot code?
> 
> I mean I understand what you mean, but it would make the interface IMO so
> much easier to use, if one wants to explicit schedule it beforehand they
> can still do. But that would open the way for two styles doing things, not
> sure if this would seen as bad. The assert about from patch 3/3 would be
> already really helping a lot, though.

I think patches 1 and 3 are good to be committed either way if people
think they are useful. They make sense without the async snapshot code.

My concern with the interface in patch 2 is both that it could give
people a false sense of security and that it would be tempting to write
inefficient code.

Usually, you won't have just a single call into the block layer for a
given block node, but you'll perform multiple operations. Switching to
the target context once rather than switching back and forth in every
operation is obviously more efficient.

But chances are that even if one of these function is bdrv_flush(),
which now works correctly from a different thread, you might need
another function that doesn't implement the same magic. So you always
need to be aware which functions support cross-context calls which
ones don't.

I feel we'd see a few bugs related to this.

> Regarding upstreaming, there was some historical attempt to upstream it
> from Dietmar, but in the time frame of ~ 8 to 10 years ago or so.
> I'm not quite sure why it didn't went through then, I see if I can get
> some time searching the mailing list archive.
> 
> We'd be naturally open and glad to upstream it, what it effectively
> allow us to do is to not block the VM to much during snapshoting it
> live.

Yes, there is no doubt that this is useful functionality. There has been
talk about this every now and then, but I don't think we ever got to a
point where it actually could be implemented.

Vladimir, I seem to remember you (or someone else from your team?) were
interested in async snapshots as well a while ago?

> I pushed a tree[0] with mostly just that specific code squashed together (hope
> I did not break anything), most of the actual code is in commit [1].
> It'd be cleaned up a bit and checked for coding style issues, but works good
> here.
> 
> Anyway, thanks for your help and pointers!
> 
> [0]: https://github.com/ThomasLamprecht/qemu/tree/savevm-async
> [1]: 
> https://github.com/ThomasLamprecht/qemu/commit/ffb9531f370ef0073e4b6f6021f4c47ccd702121

It doesn't even look that bad in terms of patch size. I had imagined it
a bit larger.

But it seems this is not really just an async 'savevm' (which would save
the VM state in a qcow2 file), but you store the state in a separate
raw file. What is the difference between this and regular migration into
a file?

I remember people talking about how snapshotting can store things in a
way that a normal migration stream can't do, like overwriting outdated
RAM state instead of just appending the new state, but you don't seem to
implement something like this.

Kevin

Re: [PATCH v2 2/3] hw/mips/mips_int: Use qdev gpio rather than qemu_allocate_irqs()

2020-05-14 Thread Philippe Mathieu-Daudé


On 5/14/20 3:24 PM, Peter Maydell wrote:

On Tue, 12 May 2020 at 08:48, Philippe Mathieu-Daudé  wrote:


Switch to using the qdev gpio API which is preferred over
qemu_allocate_irqs(). One step to eventually deprecate and
remove qemu_allocate_irqs() one day.



diff --git a/hw/mips/mips_int.c b/hw/mips/mips_int.c
index 796730b11d..91788c51a9 100644
--- a/hw/mips/mips_int.c
+++ b/hw/mips/mips_int.c
@@ -74,14 +74,12 @@ static void cpu_mips_irq_request(void *opaque, int irq, int 
level)
  void cpu_mips_irq_init_cpu(MIPSCPU *cpu)
  {
  CPUMIPSState *env = &cpu->env;
-qemu_irq *qi;
  int i;

-qi = qemu_allocate_irqs(cpu_mips_irq_request, cpu, 8);
+qdev_init_gpio_in(DEVICE(cpu), cpu_mips_irq_request, 8);
  for (i = 0; i < 8; i++) {
-env->irq[i] = qi[i];
+env->irq[i] = qdev_get_gpio_in(DEVICE(cpu), i);
  }
-g_free(qi);
  }


Similar comments as with the openrisc patch apply here:
  * ideally this code should be in target/mips/, not in hw/mips
  * board code should call qdev_get_gpio_in() to get the IRQ
line, rather than fishing env->irq[] out of the CPU object
directly
This is a bit more complicated than with openrisc, because there's
more than just a single board model, and not all MIPS boards call
cpu_mips_irq_init_cpu() so figuring out whether MIPS CPUs should
always provide inbound CPU lines, or only those with some
particular feature, or something else, would need some
investigation or MIPS knowledge.


Yes, I'm aware of at least 3 different to map interrupts to a MIPS core.
QEMU models at least 2.

As X86, MIPS code use old QEMU API, I don't see the codebase being 
upgraded anytime soon.


This is another borderline case between architecture and hardware, as 
the cache units, or the ARM NVIC. Ideally we should keep target/* free 
of references to hw/* code. In my experience it often give troubles.



But this is an OK first
step anyway, so

Reviewed-by: Peter Maydell 


Thanks. The idea to keep qemu_allocate_irqs() as internal as possible, 
and have devices use the qdev GPIO API.




thanks
-- PMM

Re: [PATCH v6 08/14] block/qcow2: extend qemu-img amend interface with crypto options

2020-05-14 Thread Max Reitz

On 10.05.20 15:40, Maxim Levitsky wrote:
> Now that we have all the infrastructure in place,
> wire it in the qcow2 driver and expose this to the user.
> 
> Signed-off-by: Maxim Levitsky 
> Reviewed-by: Daniel P. Berrangé 
> ---
>  block/qcow2.c  | 72 +-
>  tests/qemu-iotests/082.out | 45 

Again, some rebasing required because of compression_type.

>  2 files changed, 108 insertions(+), 9 deletions(-)
> 
> diff --git a/block/qcow2.c b/block/qcow2.c
> index db86500839..4bb6e3fc8f 100644
> --- a/block/qcow2.c
> +++ b/block/qcow2.c

[...]

> @@ -4744,17 +4757,11 @@ static BlockMeasureInfo *qcow2_measure(QemuOpts 
> *opts, BlockDriverState *in_bs,
>  g_free(optstr);
>  
>  if (has_luks) {
> +

Why?

With this line dropped, and 082.out fixed up:

Reviewed-by: Max Reitz 



signature.asc
Description: OpenPGP digital signature

Re: [RFC PATCH 3/3] block: Assert we're running in the right thread

2020-05-14 Thread Kevin Wolf

Am 14.05.2020 um 15:52 hat Stefan Reiter geschrieben:
> On 5/12/20 4:43 PM, Kevin Wolf wrote:
> > tracked_request_begin() is called for most I/O operations, so it's a
> > good place to assert that we're indeed running in the home thread of the
> > node's AioContext.
> > 
> 
> Is this patch supposed to be always correct or only together with nr. 2?
> 
> I changed our code to call bdrv_flush_all from the main AIO context and it
> certainly works just fine (even without this series, so I suppose that would
> be the 'correct' way to fix it you mention on the cover), though of course
> it trips this assert without patch 2.

Yes, I think this is a basic assumption that should always be true.
This series shouldn't fix anything for upstream QEMU (at least I'm not
aware of anything that needs it), so the assertion could be added even
without the other patches.

In fact, I'm currently thinking that committing just patch 1 (because
it's a nice cleanup anyway) and patch 3 (because it will let us know
when we mess up) might make sense.

Kevin

> > Signed-off-by: Kevin Wolf 
> > ---
> >   block/io.c | 5 -
> >   1 file changed, 4 insertions(+), 1 deletion(-)
> > 
> > diff --git a/block/io.c b/block/io.c
> > index 7808e8bdc0..924bf5ba46 100644
> > --- a/block/io.c
> > +++ b/block/io.c
> > @@ -695,14 +695,17 @@ static void tracked_request_begin(BdrvTrackedRequest 
> > *req,
> > uint64_t bytes,
> > enum BdrvTrackedRequestType type)
> >   {
> > +Coroutine *self = qemu_coroutine_self();
> > +
> >   assert(bytes <= INT64_MAX && offset <= INT64_MAX - bytes);
> > +assert(bs->aio_context == qemu_coroutine_get_aio_context(self));
> >   *req = (BdrvTrackedRequest){
> >   .bs = bs,
> >   .offset = offset,
> >   .bytes  = bytes,
> >   .type   = type,
> > -.co = qemu_coroutine_self(),
> > +.co = self,
> >   .serialising= false,
> >   .overlap_offset = offset,
> >   .overlap_bytes  = bytes,
> > 
>

Re: [PATCH RFC 03/32] python//machine.py: remove bare except

2020-05-14 Thread John Snow

On 5/14/20 9:55 AM, Eric Blake wrote:
> On 5/14/20 12:53 AM, John Snow wrote:
>> Catch only the timeout error; if there are other problems, allow the
>> stack trace to be visible.
>>
> 
> A lot of patches in this series start with "python//" - is that
> intentional, or should it be a single slash?
> 

Was trying to imply an elided path structure, because
"python/qemu/lib/qmp.py" et al is way too chatty.

Feel free to suggest better subject names!

Re: [PATCH v2 0/1] qemu-nbd: Close inherited stderr

2020-05-14 Thread Eric Blake


On 5/14/20 7:51 AM, no-re...@patchew.org wrote:

Patchew URL: 
https://patchew.org/QEMU/20200514063112.1457125-1-raphael.p...@hetzner.com/



Hi,

This series seems to have some coding style problems. See output below for
more information:

Message-id: 20200514063112.1457125-1-raphael.p...@hetzner.com
Subject: [PATCH v2 0/1] qemu-nbd: Close inherited stderr
Type: series





=== OUTPUT BEGIN ===
WARNING: Block comments use a leading /* on a separate line
#20: FILE: qemu-nbd.c:919:
+/* Remember parents stderr only if the fork option is set.

ERROR: suspect code indent for conditional statements (12, 14)
#23: FILE: qemu-nbd.c:922:
+if (fork_process) {
+  old_stderr = dup(STDERR_FILENO);

ERROR: Missing Signed-off-by: line(s)


Patchew is correct.  All three things should be fixed (the missing S-o-b 
is most important - I can't supply that; the missing spaces and comment 
formatting are something I could touch up).


The comment could use some grammar help (s/parents/parent's/), and in 
truth, I don't think it adds much beyond what the code itself is already 
doing, so rather than adding another line to silence patchew, you could 
instead just eliminate the comment and life would still be fine.  Or if 
you want a one-line comment, I might suggest:


/* Remember parent's stderr if we will restoring it. */

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org

Re: [PATCH] bitmaps: Add myself as maintainer

2020-05-14 Thread John Snow




On 5/14/20 9:52 AM, Eric Blake wrote:
> On 5/14/20 12:08 AM, John Snow wrote:
>>
>>
>> On 5/14/20 12:49 AM, Vladimir Sementsov-Ogievskiy wrote:
>>> 13.05.2020 23:24, John Snow wrote:


 On 5/13/20 10:14 AM, Eric Blake wrote:
> Dirty bitmaps are important to incremental backups, including exposure
> over NBD where I'm already maintainer.  Also, I'm aware that lately I
> have been doing as much code/review on bitmaps as John Snow, who is
> hoping to scale back on this front.
>
> Signed-off-by: Eric Blake 
>
> ---
> 
>    Dirty Bitmaps
>    M: John Snow 
> +M: Eric Blake 
>    R: Vladimir Sementsov-Ogievskiy 
>    L: qemu-block@nongnu.org
>    S: Supported
>

 I'd also like to point out that I wouldn't mind if Vladimir became an
 official maintainer, but I can't remember if he wanted the title
 when we
 last spoke at KVM Forum.
>>>
>>> Actually, it would be nice, I'd glad to get it, thanks :)
>>> I can send a separate patch, or we may s/R/M/ in this one?
>>>
>>
>> That would be very good!
>>
>> I'd be quite happy to be demoted to reviewer; it's about all the time
>> I've been truthfully able to give lately.
>>
>> (I won't speak for Eric!)
> 
> I can post a v2 that produces the following results:
> 
> M: Vladimir
> M: Eric
> R: John
> 
> Does that sound reasonable?
> 
> 

Yeah, I will approve it. I want to help as much as I can, but I don't
want to artificially limit the rate of development here as I fear I have
been.

--js

Re: [PATCH v6 07/14] block/crypto: implement the encryption key management

2020-05-14 Thread Daniel P . Berrangé

On Thu, May 14, 2020 at 04:09:59PM +0200, Max Reitz wrote:
> On 10.05.20 15:40, Maxim Levitsky wrote:
> > This implements the encryption key management using the generic code in
> > qcrypto layer and exposes it to the user via qemu-img
> > 
> > This code adds another 'write_func' because the initialization
> > write_func works directly on the underlying file, and amend
> > works on instance of luks device.
> > 
> > This commit also adds a 'hack/workaround' I and Kevin Wolf (thanks)
> > made to make the driver both support write sharing (to avoid breaking the 
> > users),
> > and be safe against concurrent  metadata update (the keyslots)
> > 
> > Eventually the write sharing for luks driver will be deprecated
> > and removed together with this hack.
> > 
> > The hack is that we ask (as a format driver) for BLK_PERM_CONSISTENT_READ
> > and then when we want to update the keys, we unshare that permission.
> > So if someone else has the image open, even readonly, encryption
> > key update will fail gracefully.
> > 
> > Also thanks to Daniel Berrange for the idea of
> > unsharing read, rather that write permission which allows
> > to avoid cases when the other user had opened the image read-only.
> > 
> > Signed-off-by: Maxim Levitsky 
> > Reviewed-by: Daniel P. Berrangé 
> > ---
> >  block/crypto.c | 127 +++--
> >  block/crypto.h |  34 +
> >  2 files changed, 158 insertions(+), 3 deletions(-)
> > 
> > diff --git a/block/crypto.c b/block/crypto.c
> > index 2e16b62bdc..b14cb0ff06 100644
> > --- a/block/crypto.c
> > +++ b/block/crypto.c
> 
> [...]
> 
> > +static void
> > +block_crypto_child_perms(BlockDriverState *bs, BdrvChild *c,
> > + const BdrvChildRole *role,
> > + BlockReopenQueue *reopen_queue,
> > + uint64_t perm, uint64_t shared,
> > + uint64_t *nperm, uint64_t *nshared)
> > +{
> > +
> > +BlockCrypto *crypto = bs->opaque;
> > +
> > +bdrv_filter_default_perms(bs, c, role, reopen_queue,
> > +perm, shared, nperm, nshared);
> > +/*
> > + * Ask for consistent read permission so that if
> > + * someone else tries to open this image with this permission
> > + * neither will be able to edit encryption keys, since
> > + * we will unshare that permission while trying to
> > + * update the encryption keys
> > + */
> > +if (!(bs->open_flags & BDRV_O_NO_IO)) {
> > +*nperm |= BLK_PERM_CONSISTENT_READ;
> > +}
> 
> I’m not sure this is important, because this really means we won’t do
> I/O.  Its only relevant use in this case is for qemu-img info.  Do we
> really care if someone edits the key slots while qemu-img info is
> processing?

FWIW, OpenStack runs  qemu-img info in a periodic background job, so
it can be concurrent with anything else they are running. Having said
that due to previous QEMU bugs, they unconditonally pass the arg to
qemu-img to explicitly disable locking


Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

Re: [PATCH v4 6/9] qemu-img: Add bitmap sub-command

2020-05-14 Thread Eric Blake


On 5/14/20 1:45 AM, Vladimir Sementsov-Ogievskiy wrote:

13.05.2020 04:16, Eric Blake wrote:

Include actions for --add, --remove, --clear, --enable, --disable, and
--merge (note that --clear is a bit of fluff, because the same can be
accomplished by removing a bitmap and then adding a new one in its
place, but it matches what QMP commands exist).  Listing is omitted,
because it does not require a bitmap name and because it was already
possible with 'qemu-img info'.  A single command line can play one or
more bitmap commands in sequence on the same bitmap name (although all
added bitmaps share the same granularity, and and all merged bitmaps
come from the same source file).  Merge defaults to other bitmaps in
the primary image, but can also be told to merge bitmaps from a
distinct image.





I'm sorry for asking it only now on v4.. But still. Why do we need it? 


Ease of use.

We can instead run qemu binary (or even new qemu-storage-daemon) and 
just use existing qmp commands. Is there a real benefit in developing 
qemu-img, maintaining two interfaces for the same thing?


If it makes someone's life easier, and is not hard to maintain, then 
yes.  A command line interface that calls into QMP is not hard to 
maintain.  And _I_ certainly found it easier to write iotests with this 
patch in place, so it already has at least one client.


Of-course, just 
run qmp commands from terminal is a lot less comfortable than just a 
qemu img command.. But may be we need some wrapper, which make it simple 
to run one qmp command on an image?


It's simple to make a python wrapper working like

qemu-qmp block-dirty-bitmap-add '{node: self, name: bitmap0, persistent: 
true}' /path/to/x.qcow2


This _IS_ such a wrapper.  The whole point of this patch is that it is 
now simpler to run one (or more) QMP command on an offline image from 
the command line.  Just because I wrote it in C instead of python, and 
attached it to an existing tool instead of writing a new tool, doesn't 
change the fact that it is just a wrapper around the existing QMP commands.


--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org

Re: [PATCH v4 5/9] blockdev: Split off basic bitmap operations for qemu-img

2020-05-14 Thread Eric Blake


On 5/14/20 1:21 AM, Vladimir Sementsov-Ogievskiy wrote:

13.05.2020 04:16, Eric Blake wrote:

Upcoming patches want to add some basic bitmap manipulation abilities
to qemu-img.  But blockdev.o is too heavyweight to link into qemu-img
(among other things, it would drag in block jobs and transaction
support - qemu-img does offline manipulation, where atomicity is less
important because there are no concurrent modifications to compete
with), so it's time to split off the bare bones of what we will need
into a new file block/monitor/bitmap-qmp-cmds.o.

This is sufficient to expose 6 QMP commands for use by qemu-img (add,
remove, clear, enable, disable, merge), as well as move the three
helper functions touched in the previous patch.  Regarding
MAINTAINERS, the new file is automatically part of block core, but
also makes sense as related to other dirty bitmap files.

Signed-off-by: Eric Blake 
Reviewed-by: Max Reitz 
---



+++ b/block/monitor/bitmap-qmp-cmds.c


Hmm, shouldn't transaction bitmap actions be moved here too? May be, not 
in these series..


No, the very reason that we split this file off of blockdev.c is that 
transactions are NOT needed in qemu-img.  You have to have separate .o 
files for what qemu-img needs, vs. what the rest of qemu proper needs, 
and transactions are only needed in qemu proper at the moment.  If we 
ever need transactions in Kevin's qemu-storage-daemon but not all of 
blockdev.c, then we'd need yet another .c file independent from either 
blockdev.c or this patch's new bitmap-qmp-cmds.c.





@@ -0,0 +1,323 @@
+/*
+ * QEMU host block device bitmaps


A bit conflicts with tha fact that they are not of block-device level 
but of node-level.


May be just "Block dirty bitmap qmp commands" ?


Copy-and-paste from blockdev.c, tweaked by adding one word.  Your 
wording is also fine.





+ *
+ * Copyright (c) 2003-2008 Fabrice Bellard


Does it really apply here? block-dirty-bitmap-add was added in 2015.. 
May be Red Hat and Virtuozzo copyrights would be more appropriate.


When splitting a file, the safest course of action is to preserve ALL 
copyright from the original file into the new file.


Adding additional copyright lines is okay as part of submitting new 
functionality, but in this case, I don't feel comfortable adding Red Hat 
copyright (my split isn't adding new functionality), and I don't have 
authorization to add Virtuozzo copyright (as that is not my employer).




+#include "qemu/osdep.h"
+
+#include "sysemu/blockdev.h"
+#include "block/block.h"
+#include "block/block_int.h"
+#include "qapi/qapi-commands-block.h"
+#include "qapi/error.h"


compiles for with only four:

   #include "qemu/osdep.h"
   #include "block/block_int.h"
   #include "qapi/qapi-commands-block.h"
   #include "qapi/error.h"


Thanks. I blame rebase; an earlier version used a different header in 
patch 4/9; when I moved things to block_int.h in that patch, I didn't 
realize that this patch could be improved.




with at least extra includes dropped:
Reviewed-by: Vladimir Sementsov-Ogievskiy 



--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org

Re: [PATCH v4 4/9] blockdev: Promote several bitmap functions to non-static

2020-05-14 Thread Eric Blake


On 5/14/20 6:45 AM, Vladimir Sementsov-Ogievskiy wrote:

13.05.2020 04:16, Eric Blake wrote:

-    HBitmap **backup, Error **errp)
+BdrvDirtyBitmap *block_dirty_bitmap_merge(const char *node, const 
char *target,
+  
BlockDirtyBitmapMergeSourceList *bitmaps,
+  HBitmap **backup, Error 
**errp)

  {
  BlockDriverState *bs;


s/bitmaps/bms/ to match declaration and fit into 80 characters


Can do, although it has ripple effects to 5/9 (as I wanted that to be 
pure code motion).


--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org

Re: [PATCH v4 3/9] block: Make it easier to learn which BDS support bitmaps

2020-05-14 Thread Eric Blake


On 5/14/20 12:19 AM, Vladimir Sementsov-Ogievskiy wrote:


In the future, when we improve the ability to look up bitmaps through
a filter, we will probably also want to teach the block layer to
automatically let filters pass this request on through.


Hm. I think that bitmap at filter bs is a valid thing (moreover I have a 
plan to use it for one issue), so I'm not sure that it's good idea to do 
any generic logic around bitmaps work through filters, better to always 
address the exact node you mean..


In the immediate term, the only user of this new function is qemu-img, 
and qemu-img is not setting up filters, so whether this callback can see 
through filters is moot (that is, we are always addressing the exact 
node the user passed on the command line, for whatever bitmap operations 
qemu-img does).  In the long term, being able to see bitmaps through 
filters is better addressed after we finally include Max's series that 
include filter handling overall.






Signed-off-by: Eric Blake 


Reviewed-by: Vladimir Sementsov-Ogievskiy 


Thanks.

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org

Re: [PATCH v6 07/14] block/crypto: implement the encryption key management

2020-05-14 Thread Max Reitz

On 10.05.20 15:40, Maxim Levitsky wrote:
> This implements the encryption key management using the generic code in
> qcrypto layer and exposes it to the user via qemu-img
> 
> This code adds another 'write_func' because the initialization
> write_func works directly on the underlying file, and amend
> works on instance of luks device.
> 
> This commit also adds a 'hack/workaround' I and Kevin Wolf (thanks)
> made to make the driver both support write sharing (to avoid breaking the 
> users),
> and be safe against concurrent  metadata update (the keyslots)
> 
> Eventually the write sharing for luks driver will be deprecated
> and removed together with this hack.
> 
> The hack is that we ask (as a format driver) for BLK_PERM_CONSISTENT_READ
> and then when we want to update the keys, we unshare that permission.
> So if someone else has the image open, even readonly, encryption
> key update will fail gracefully.
> 
> Also thanks to Daniel Berrange for the idea of
> unsharing read, rather that write permission which allows
> to avoid cases when the other user had opened the image read-only.
> 
> Signed-off-by: Maxim Levitsky 
> Reviewed-by: Daniel P. Berrangé 
> ---
>  block/crypto.c | 127 +++--
>  block/crypto.h |  34 +
>  2 files changed, 158 insertions(+), 3 deletions(-)
> 
> diff --git a/block/crypto.c b/block/crypto.c
> index 2e16b62bdc..b14cb0ff06 100644
> --- a/block/crypto.c
> +++ b/block/crypto.c

[...]

> +static void
> +block_crypto_child_perms(BlockDriverState *bs, BdrvChild *c,
> + const BdrvChildRole *role,
> + BlockReopenQueue *reopen_queue,
> + uint64_t perm, uint64_t shared,
> + uint64_t *nperm, uint64_t *nshared)
> +{
> +
> +BlockCrypto *crypto = bs->opaque;
> +
> +bdrv_filter_default_perms(bs, c, role, reopen_queue,
> +perm, shared, nperm, nshared);
> +/*
> + * Ask for consistent read permission so that if
> + * someone else tries to open this image with this permission
> + * neither will be able to edit encryption keys, since
> + * we will unshare that permission while trying to
> + * update the encryption keys
> + */
> +if (!(bs->open_flags & BDRV_O_NO_IO)) {
> +*nperm |= BLK_PERM_CONSISTENT_READ;
> +}

I’m not sure this is important, because this really means we won’t do
I/O.  Its only relevant use in this case is for qemu-img info.  Do we
really care if someone edits the key slots while qemu-img info is
processing?

OTOH, I don’t think it’ll harm much.  Well, apart from the fact that
BDRV_O_NO_IO won’t do much for LUKS images.

*shrug*

Reviewed-by: Max Reitz 



signature.asc
Description: OpenPGP digital signature

Re: [PATCH RFC 03/32] python//machine.py: remove bare except

2020-05-14 Thread Eric Blake


On 5/14/20 12:53 AM, John Snow wrote:

Catch only the timeout error; if there are other problems, allow the
stack trace to be visible.



A lot of patches in this series start with "python//" - is that 
intentional, or should it be a single slash?



Signed-off-by: John Snow 
---
  python/qemu/lib/machine.py | 33 +
  1 file changed, 21 insertions(+), 12 deletions(-)




--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org

Re: [PATCH] bitmaps: Add myself as maintainer

2020-05-14 Thread Eric Blake


On 5/14/20 12:08 AM, John Snow wrote:



On 5/14/20 12:49 AM, Vladimir Sementsov-Ogievskiy wrote:

13.05.2020 23:24, John Snow wrote:



On 5/13/20 10:14 AM, Eric Blake wrote:

Dirty bitmaps are important to incremental backups, including exposure
over NBD where I'm already maintainer.  Also, I'm aware that lately I
have been doing as much code/review on bitmaps as John Snow, who is
hoping to scale back on this front.

Signed-off-by: Eric Blake 

---



   Dirty Bitmaps
   M: John Snow 
+M: Eric Blake 
   R: Vladimir Sementsov-Ogievskiy 
   L: qemu-block@nongnu.org
   S: Supported



I'd also like to point out that I wouldn't mind if Vladimir became an
official maintainer, but I can't remember if he wanted the title when we
last spoke at KVM Forum.


Actually, it would be nice, I'd glad to get it, thanks :)
I can send a separate patch, or we may s/R/M/ in this one?



That would be very good!

I'd be quite happy to be demoted to reviewer; it's about all the time
I've been truthfully able to give lately.

(I won't speak for Eric!)


I can post a v2 that produces the following results:

M: Vladimir
M: Eric
R: John

Does that sound reasonable?


--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org

Re: [RFC PATCH 3/3] block: Assert we're running in the right thread

2020-05-14 Thread Stefan Reiter


On 5/12/20 4:43 PM, Kevin Wolf wrote:

tracked_request_begin() is called for most I/O operations, so it's a
good place to assert that we're indeed running in the home thread of the
node's AioContext.



Is this patch supposed to be always correct or only together with nr. 2?

I changed our code to call bdrv_flush_all from the main AIO context and 
it certainly works just fine (even without this series, so I suppose 
that would be the 'correct' way to fix it you mention on the cover), 
though of course it trips this assert without patch 2.



Signed-off-by: Kevin Wolf 
---
  block/io.c | 5 -
  1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/block/io.c b/block/io.c
index 7808e8bdc0..924bf5ba46 100644
--- a/block/io.c
+++ b/block/io.c
@@ -695,14 +695,17 @@ static void tracked_request_begin(BdrvTrackedRequest *req,
uint64_t bytes,
enum BdrvTrackedRequestType type)
  {
+Coroutine *self = qemu_coroutine_self();
+
  assert(bytes <= INT64_MAX && offset <= INT64_MAX - bytes);
+assert(bs->aio_context == qemu_coroutine_get_aio_context(self));
  
  *req = (BdrvTrackedRequest){

  .bs = bs,
  .offset = offset,
  .bytes  = bytes,
  .type   = type,
-.co = qemu_coroutine_self(),
+.co = self,
  .serialising= false,
  .overlap_offset = offset,
  .overlap_bytes  = bytes,

Re: [PATCH v3 0/4] Additional parameters for qemu_img map

2020-05-14 Thread Eric Blake


On 5/13/20 5:13 PM, no-re...@patchew.org wrote:

Patchew URL: 
https://patchew.org/QEMU/20200513133629.18508-1-eyal.moscov...@oracle.com/



Hi,

This series failed the docker-mingw@fedora build test. Please find the testing 
commands and
their output below. If you have Docker installed, you can probably reproduce it
locally.

=== TEST SCRIPT BEGIN ===
#! /bin/bash
export ARCH=x86_64
make docker-image-fedora V=1 NETWORK=1
time make docker-test-mingw@fedora J=14 NETWORK=1
=== TEST SCRIPT END ===

   CC  x86_64-softmmu/ioport.o
   CC  x86_64-softmmu/qtest.o
/tmp/qemu-test/src/qemu-img.c: In function 'cvtnum_full':
/tmp/qemu-test/src/qemu-img.c:488:63: error: format '%ld' expects argument of 
type 'long int', but argument 3 has type 'int64_t' {aka 'long long int'} 
[-Werror=format=]
  error_report("Invalid %s specified. Must be between %ld bytes "
  ~~^
  %lld
   "to %ld bytes.", name, min, max);


patchew is correct; printing int64_t values requires "%"PRId64 rather 
than "%ld".  I'm fine with touching that up in my queue, unless you'd 
like to submit v4.



--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org

Re: [PATCH v4 00/34] block: Introduce real BdrvChildRole

2020-05-14 Thread Kevin Wolf

Am 13.05.2020 um 13:05 hat Max Reitz geschrieben:
> Based-on: <20200429141126.85159-1-mre...@redhat.com>
> (“block: Do not call BlockDriver.bdrv_make_empty() directly”)
> 
> Branch: https://github.com/XanClic/qemu.git child-role-v4
> Branch: https://git.xanclic.moe/XanClic/qemu.git child-role-v4

Thanks, applied to the block branch (and squashed in Eric's English
corrections).

Kevin

Re: [PATCH v6 05/14] block/amend: refactor qcow2 amend options

2020-05-14 Thread Max Reitz

On 10.05.20 15:40, Maxim Levitsky wrote:
> Some qcow2 create options can't be used for amend.
> Remove them from the qcow2 create options and add generic logic to detect
> such options in qemu-img
> 
> Signed-off-by: Maxim Levitsky 
> Reviewed-by: Daniel P. Berrangé 
> ---
>  block/qcow2.c  | 108 ++---
>  qemu-img.c |  18 +++-
>  tests/qemu-iotests/049.out | 102 ++--
>  tests/qemu-iotests/061.out |  12 ++-
>  tests/qemu-iotests/079.out |  18 ++--
>  tests/qemu-iotests/082.out | 149 
>  tests/qemu-iotests/085.out |  38 
>  tests/qemu-iotests/087.out |   6 +-
>  tests/qemu-iotests/115.out |   2 +-
>  tests/qemu-iotests/121.out |   4 +-
>  tests/qemu-iotests/125.out | 192 ++---
>  tests/qemu-iotests/134.out |   2 +-
>  tests/qemu-iotests/144.out |   4 +-
>  tests/qemu-iotests/158.out |   4 +-
>  tests/qemu-iotests/182.out |   2 +-
>  tests/qemu-iotests/185.out |   8 +-
>  tests/qemu-iotests/188.out |   2 +-
>  tests/qemu-iotests/189.out |   4 +-
>  tests/qemu-iotests/198.out |   4 +-
>  tests/qemu-iotests/243.out |  16 ++--
>  tests/qemu-iotests/250.out |   2 +-
>  tests/qemu-iotests/255.out |   8 +-
>  tests/qemu-iotests/259.out |   2 +-
>  tests/qemu-iotests/263.out |   4 +-
>  tests/qemu-iotests/280.out |   2 +-

These reference output hunks need some rebasing due to the new
compression_type option.

>  25 files changed, 284 insertions(+), 429 deletions(-)
> 
> diff --git a/block/qcow2.c b/block/qcow2.c
> index fc494c7591..db86500839 100644
> --- a/block/qcow2.c
> +++ b/block/qcow2.c

[...]

> @@ -5552,37 +5506,6 @@ void qcow2_signal_corruption(BlockDriverState *bs, 
> bool fatal, int64_t offset,
>  .help = "The external data file must stay valid "   \
>  "as a raw image"\
>  },  \
> -{   \
> -.name = BLOCK_OPT_ENCRYPT,  \
> -.type = QEMU_OPT_BOOL,  \
> -.help = "Encrypt the image with format 'aes'. (Deprecated " \
> -"in favor of " BLOCK_OPT_ENCRYPT_FORMAT "=aes)",\
> -},  \
> -{   \
> -.name = BLOCK_OPT_ENCRYPT_FORMAT,   \
> -.type = QEMU_OPT_STRING,\
> -.help = "Encrypt the image, format choices: 'aes', 'luks'", \
> -},  \
> -BLOCK_CRYPTO_OPT_DEF_KEY_SECRET("encrypt.", \
> -"ID of secret providing qcow AES key or LUKS passphrase"),  \
> -BLOCK_CRYPTO_OPT_DEF_LUKS_CIPHER_ALG("encrypt."),   \
> -BLOCK_CRYPTO_OPT_DEF_LUKS_CIPHER_MODE("encrypt."),  \
> -BLOCK_CRYPTO_OPT_DEF_LUKS_IVGEN_ALG("encrypt."),\
> -BLOCK_CRYPTO_OPT_DEF_LUKS_IVGEN_HASH_ALG("encrypt."),   \
> -BLOCK_CRYPTO_OPT_DEF_LUKS_HASH_ALG("encrypt."), \
> -BLOCK_CRYPTO_OPT_DEF_LUKS_ITER_TIME("encrypt."),\
> -{   \
> -.name = BLOCK_OPT_CLUSTER_SIZE, \
> -.type = QEMU_OPT_SIZE,  \
> -.help = "qcow2 cluster size",   \
> -.def_value_str = stringify(DEFAULT_CLUSTER_SIZE)\
> -},  \
> -{   \
> -.name = BLOCK_OPT_PREALLOC, \
> -.type = QEMU_OPT_STRING,\
> -.help = "Preallocation mode (allowed values: off, " \
> -"metadata, falloc, full)"   \
> -},  \
>  {   \
>  .name = BLOCK_OPT_LAZY_REFCOUNTS,   \
>  .type = QEMU_OPT_BOOL,  \
> @@ -5600,6 +5523,37 @@ static QemuOptsList qcow2_create_opts = {
>  .name = "qcow2-create-opts",
>  .head = QTAILQ_HEAD_INITIALIZER(qcow2_create_opts.head),
>  .desc = {
> +{   \
> +.name = BLOCK_OPT_ENCRYPT,  \
> +.type = QEMU_OPT_BOOL,  \
> +.help = "Encrypt the image with format 'aes'. (Deprecated " \
> +"in favor of " BLOCK_OPT_ENCRYP

Re: [PATCH v2 2/3] hw/mips/mips_int: Use qdev gpio rather than qemu_allocate_irqs()

2020-05-14 Thread Peter Maydell

On Tue, 12 May 2020 at 08:48, Philippe Mathieu-Daudé  wrote:
>
> Switch to using the qdev gpio API which is preferred over
> qemu_allocate_irqs(). One step to eventually deprecate and
> remove qemu_allocate_irqs() one day.

> diff --git a/hw/mips/mips_int.c b/hw/mips/mips_int.c
> index 796730b11d..91788c51a9 100644
> --- a/hw/mips/mips_int.c
> +++ b/hw/mips/mips_int.c
> @@ -74,14 +74,12 @@ static void cpu_mips_irq_request(void *opaque, int irq, 
> int level)
>  void cpu_mips_irq_init_cpu(MIPSCPU *cpu)
>  {
>  CPUMIPSState *env = &cpu->env;
> -qemu_irq *qi;
>  int i;
>
> -qi = qemu_allocate_irqs(cpu_mips_irq_request, cpu, 8);
> +qdev_init_gpio_in(DEVICE(cpu), cpu_mips_irq_request, 8);
>  for (i = 0; i < 8; i++) {
> -env->irq[i] = qi[i];
> +env->irq[i] = qdev_get_gpio_in(DEVICE(cpu), i);
>  }
> -g_free(qi);
>  }

Similar comments as with the openrisc patch apply here:
 * ideally this code should be in target/mips/, not in hw/mips
 * board code should call qdev_get_gpio_in() to get the IRQ
   line, rather than fishing env->irq[] out of the CPU object
   directly
This is a bit more complicated than with openrisc, because there's
more than just a single board model, and not all MIPS boards call
cpu_mips_irq_init_cpu() so figuring out whether MIPS CPUs should
always provide inbound CPU lines, or only those with some
particular feature, or something else, would need some
investigation or MIPS knowledge. But this is an OK first
step anyway, so

Reviewed-by: Peter Maydell 

thanks
-- PMM

Re: [RFC PATCH 0/3] block: Synchronous bdrv_*() from coroutine in different AioContext

2020-05-14 Thread Thomas Lamprecht

On 5/12/20 4:43 PM, Kevin Wolf wrote:
> Stefan (Reiter), after looking a bit closer at this, I think there is no
> bug in QEMU, but the bug is in your coroutine code that calls block
> layer functions without moving into the right AioContext first. I've
> written this series anyway as it potentially makes the life of callers
> easier and would probably make your buggy code correct.

> However, it doesn't feel right to commit something like patch 2 without
> having a user for it. Is there a reason why you can't upstream your
> async snapshot code?

I mean I understand what you mean, but it would make the interface IMO so
much easier to use, if one wants to explicit schedule it beforehand they
can still do. But that would open the way for two styles doing things, not
sure if this would seen as bad. The assert about from patch 3/3 would be
already really helping a lot, though.

Regarding upstreaming, there was some historical attempt to upstream it
from Dietmar, but in the time frame of ~ 8 to 10 years ago or so.
I'm not quite sure why it didn't went through then, I see if I can get some
time searching the mailing list archive.

We'd be naturally open and glad to upstream it, what it effectively allow
us to do is to not block the VM to much during snapshoting it live.

I pushed a tree[0] with mostly just that specific code squashed together (hope
I did not break anything), most of the actual code is in commit [1].
It'd be cleaned up a bit and checked for coding style issues, but works good
here.

Anyway, thanks for your help and pointers!

[0]: https://github.com/ThomasLamprecht/qemu/tree/savevm-async
[1]: 
https://github.com/ThomasLamprecht/qemu/commit/ffb9531f370ef0073e4b6f6021f4c47ccd702121

Re: [PATCH v2 3/3] hw/openrisc/pic_cpu: Use qdev gpio rather than qemu_allocate_irqs()

2020-05-14 Thread Peter Maydell

On Tue, 12 May 2020 at 08:48, Philippe Mathieu-Daudé  wrote:
>
> Coverity points out (CID 1421934) that we are leaking the
> memory returned by qemu_allocate_irqs(). We can avoid this
> leak by switching to using qdev_init_gpio_in(); the base
> class finalize will free the irqs that this allocates under
> the hood.

>  hw/openrisc/pic_cpu.c | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/hw/openrisc/pic_cpu.c b/hw/openrisc/pic_cpu.c
> index 36f9350830..4b0c92f842 100644
> --- a/hw/openrisc/pic_cpu.c
> +++ b/hw/openrisc/pic_cpu.c
> @@ -52,10 +52,9 @@ static void openrisc_pic_cpu_handler(void *opaque, int 
> irq, int level)
>  void cpu_openrisc_pic_init(OpenRISCCPU *cpu)
>  {
>  int i;
> -qemu_irq *qi;
> -qi = qemu_allocate_irqs(openrisc_pic_cpu_handler, cpu, NR_IRQS);
> +qdev_init_gpio_in(DEVICE(cpu), openrisc_pic_cpu_handler, NR_IRQS);
>
>  for (i = 0; i < NR_IRQS; i++) {
> -cpu->env.irq[i] = qi[i];
> +cpu->env.irq[i] = qdev_get_gpio_in(DEVICE(cpu), i);
>  }
>  }

This isn't wrong as such, but it's a bit weird, because it's code
outside of a device adding GPIO lines to that device, and the
handler function openrisc_pic_cpu_handler() is basically doing
nothing but reaching into the internals of the CPU device and
changing it.

Ideally:
 * all this code should be in target/openrisc/cpu.c, the same
   way the arm CPU creates its own inbound IRQs with qdev_init_gpio_in()
 * cpu->env.irq[] should go away, and hw/openrisc/openrisc_sim.c
   should be calling qdev_get_gpio_in() to get each IRQ line
   it needs, rather than directly grabbing cpu->env.irq and then
   indexing into it

But this change is an OK first step and we can build the other
cleanup on top of it, so
Reviewed-by: Peter Maydell 

thanks
-- PMM

Re: [PATCH 2/2] aio-posix: disable fdmon-io_uring when GSource is used

2020-05-14 Thread Oleksandr Natalenko

On Mon, May 11, 2020 at 07:36:30PM +0100, Stefan Hajnoczi wrote:
> The glib event loop does not call fdmon_io_uring_wait() so fd handlers
> waiting to be submitted build up in the list. There is no benefit is
> using io_uring when the glib GSource is being used, so disable it
> instead of implementing a more complex fix.
> 
> This fixes a memory leak where AioHandlers would build up and increasing
> amounts of CPU time were spent iterating them in aio_pending(). The
> symptom is that guests become slow when QEMU is built with io_uring
> support.
> 
> Buglink: https://bugs.launchpad.net/qemu/+bug/1877716
> Fixes: 73fd282e7b6dd4e4ea1c3bbb3d302c8db51e4ccf ("aio-posix: add io_uring fd 
> monitoring implementation")
> Signed-off-by: Stefan Hajnoczi 
> ---
>  include/block/aio.h |  3 +++
>  util/aio-posix.c| 12 
>  util/aio-win32.c|  4 
>  util/async.c|  1 +
>  4 files changed, 20 insertions(+)
> 
> diff --git a/include/block/aio.h b/include/block/aio.h
> index 62ed954344..b2f703fa3f 100644
> --- a/include/block/aio.h
> +++ b/include/block/aio.h
> @@ -701,6 +701,9 @@ void aio_context_setup(AioContext *ctx);
>   */
>  void aio_context_destroy(AioContext *ctx);
>  
> +/* Used internally, do not call outside AioContext code */
> +void aio_context_use_g_source(AioContext *ctx);
> +
>  /**
>   * aio_context_set_poll_params:
>   * @ctx: the aio context
> diff --git a/util/aio-posix.c b/util/aio-posix.c
> index 8af334ab19..1b2a3af65b 100644
> --- a/util/aio-posix.c
> +++ b/util/aio-posix.c
> @@ -682,6 +682,18 @@ void aio_context_destroy(AioContext *ctx)
>  aio_free_deleted_handlers(ctx);
>  }
>  
> +void aio_context_use_g_source(AioContext *ctx)
> +{
> +/*
> + * Disable io_uring when the glib main loop is used because it doesn't
> + * support mixed glib/aio_poll() usage. It relies on aio_poll() being
> + * called regularly so that changes to the monitored file descriptors are
> + * submitted, otherwise a list of pending fd handlers builds up.
> + */
> +fdmon_io_uring_destroy(ctx);
> +aio_free_deleted_handlers(ctx);
> +}
> +
>  void aio_context_set_poll_params(AioContext *ctx, int64_t max_ns,
>   int64_t grow, int64_t shrink, Error **errp)
>  {
> diff --git a/util/aio-win32.c b/util/aio-win32.c
> index 729d533faf..953c56ab48 100644
> --- a/util/aio-win32.c
> +++ b/util/aio-win32.c
> @@ -414,6 +414,10 @@ void aio_context_destroy(AioContext *ctx)
>  {
>  }
>  
> +void aio_context_use_g_source(AioContext *ctx)
> +{
> +}
> +
>  void aio_context_set_poll_params(AioContext *ctx, int64_t max_ns,
>   int64_t grow, int64_t shrink, Error **errp)
>  {
> diff --git a/util/async.c b/util/async.c
> index 3165a28f2f..1319eee3bc 100644
> --- a/util/async.c
> +++ b/util/async.c
> @@ -362,6 +362,7 @@ static GSourceFuncs aio_source_funcs = {
>  
>  GSource *aio_get_g_source(AioContext *ctx)
>  {
> +aio_context_use_g_source(ctx);
>  g_source_ref(&ctx->source);
>  return &ctx->source;
>  }

Tested-by: Oleksandr Natalenko 

(run Windows 10 VM with storage accessible via io_uring on qemu v5.0.0
with these 2 patches)

Thank you.

-- 
  Best regards,
Oleksandr Natalenko (post-factum)
Principal Software Maintenance Engineer

Re: [PATCH 1/2] aio-posix: don't duplicate fd handler deletion in fdmon_io_uring_destroy()

2020-05-14 Thread Oleksandr Natalenko

On Mon, May 11, 2020 at 07:36:29PM +0100, Stefan Hajnoczi wrote:
> The io_uring file descriptor monitoring implementation has an internal
> list of fd handlers that are pending submission to io_uring.
> fdmon_io_uring_destroy() deletes all fd handlers on the list.
> 
> Don't delete fd handlers directly in fdmon_io_uring_destroy() for two
> reasons:
> 1. This duplicates the aio-posix.c AioHandler deletion code and could
>become outdated if the struct changes.
> 2. Only handlers with the FDMON_IO_URING_REMOVE flag set are safe to
>remove. If the flag is not set then something still has a pointer to
>the fd handler. Let aio-posix.c and its user worry about that. In
>practice this isn't an issue because fdmon_io_uring_destroy() is only
>called when shutting down so all users have removed their fd
>handlers, but the next patch will need this!
> 
> Signed-off-by: Stefan Hajnoczi 
> ---
>  util/aio-posix.c  |  1 +
>  util/fdmon-io_uring.c | 13 ++---
>  2 files changed, 11 insertions(+), 3 deletions(-)
> 
> diff --git a/util/aio-posix.c b/util/aio-posix.c
> index c3613d299e..8af334ab19 100644
> --- a/util/aio-posix.c
> +++ b/util/aio-posix.c
> @@ -679,6 +679,7 @@ void aio_context_destroy(AioContext *ctx)
>  {
>  fdmon_io_uring_destroy(ctx);
>  fdmon_epoll_disable(ctx);
> +aio_free_deleted_handlers(ctx);
>  }
>  
>  void aio_context_set_poll_params(AioContext *ctx, int64_t max_ns,
> diff --git a/util/fdmon-io_uring.c b/util/fdmon-io_uring.c
> index d5a80ed6fb..1d14177df0 100644
> --- a/util/fdmon-io_uring.c
> +++ b/util/fdmon-io_uring.c
> @@ -342,11 +342,18 @@ void fdmon_io_uring_destroy(AioContext *ctx)
>  
>  io_uring_queue_exit(&ctx->fdmon_io_uring);
>  
> -/* No need to submit these anymore, just free them. */
> +/* Move handlers due to be removed onto the deleted list */
>  while ((node = QSLIST_FIRST_RCU(&ctx->submit_list))) {
> +unsigned flags = atomic_fetch_and(&node->flags,
> +~(FDMON_IO_URING_PENDING |
> +  FDMON_IO_URING_ADD |
> +  FDMON_IO_URING_REMOVE));
> +
> +if (flags & FDMON_IO_URING_REMOVE) {
> +QLIST_INSERT_HEAD_RCU(&ctx->deleted_aio_handlers, node, 
> node_deleted);
> +}
> +
>  QSLIST_REMOVE_HEAD_RCU(&ctx->submit_list, node_submitted);
> -QLIST_REMOVE(node, node);
> -g_free(node);
>  }
>  
>  ctx->fdmon_ops = &fdmon_poll_ops;

Tested-by: Oleksandr Natalenko 

(run Windows 10 VM with storage accessible via io_uring on qemu v5.0.0
with these 2 patches)

Thank you.

-- 
  Best regards,
Oleksandr Natalenko (post-factum)
Principal Software Maintenance Engineer

Re: [PATCH v2 0/4] block: Do not call BlockDriver.bdrv_make_empty() directly

2020-05-14 Thread Kevin Wolf

Am 29.04.2020 um 16:11 hat Max Reitz geschrieben:
> v1: https://lists.nongnu.org/archive/html/qemu-block/2020-04/msg01245.html
> 
> Branch: https://github.com/XanClic/qemu.git fix-bdrv_make_empty-v2
> Branch: https://git.xanclic.moe/XanClic/qemu.git fix-bdrv_make_empty-v2
> 
> Based-on: <20200428192648.749066-1-ebl...@redhat.com>
>   (“qcow2: Allow resize of images with internal snapshots”)
> 
> Hi,
> 
> As described in v1’s cover letter (linked above), this series ensures
> that all calls to BlockDriver.bdrv_make_empty() go through a wrapper
> bdrv_make_empty() function that ensures the caller does have the
> necessary permissions.

Thanks, fixed up the test output in patch 4 and applied to the block
branch.

Kevin

Re: [PATCH v2 0/1] qemu-nbd: Close inherited stderr

2020-05-14 Thread no-reply

Patchew URL: 
https://patchew.org/QEMU/20200514063112.1457125-1-raphael.p...@hetzner.com/



Hi,

This series seems to have some coding style problems. See output below for
more information:

Message-id: 20200514063112.1457125-1-raphael.p...@hetzner.com
Subject: [PATCH v2 0/1] qemu-nbd: Close inherited stderr
Type: series

=== TEST SCRIPT BEGIN ===
#!/bin/bash
git rev-parse base > /dev/null || exit 0
git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram
./scripts/checkpatch.pl --mailback base..
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
From https://github.com/patchew-project/qemu
 * [new tag] patchew/20200514122334.25089-1-kra...@redhat.com -> 
patchew/20200514122334.25089-1-kra...@redhat.com
 * [new tag] patchew/20200514123729.156283-1-fran...@linux.ibm.com -> 
patchew/20200514123729.156283-1-fran...@linux.ibm.com
Switched to a new branch 'test'
76bb928 qemu-nbd: Close inherited stderr

=== OUTPUT BEGIN ===
WARNING: Block comments use a leading /* on a separate line
#20: FILE: qemu-nbd.c:919:
+/* Remember parents stderr only if the fork option is set.

ERROR: suspect code indent for conditional statements (12, 14)
#23: FILE: qemu-nbd.c:922:
+if (fork_process) {
+  old_stderr = dup(STDERR_FILENO);

ERROR: Missing Signed-off-by: line(s)

total: 2 errors, 1 warnings, 14 lines checked

Commit 76bb928d8364 (qemu-nbd: Close inherited stderr) has style problems, 
please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
=== OUTPUT END ===

Test command exited with code: 1


The full log is available at
http://patchew.org/logs/20200514063112.1457125-1-raphael.p...@hetzner.com/testing.checkpatch/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-de...@redhat.com

Re: [PATCH v6 04/14] block/amend: separate amend and create options for qemu-img

2020-05-14 Thread Max Reitz

On 10.05.20 15:40, Maxim Levitsky wrote:
> Some options are only useful for creation
> (or hard to be amended, like cluster size for qcow2), while some other
> options are only useful for amend, like upcoming keyslot management
> options for luks
> 
> Since currently only qcow2 supports amend, move all its options
> to a common macro and then include it in each action option list.
> 
> In future it might be useful to remove some options which are
> not supported anyway from amend list, which currently
> cause an error message if amended.
> 
> Signed-off-by: Maxim Levitsky 
> Reviewed-by: Daniel P. Berrangé 
> ---
>  block/qcow2.c | 160 +-
>  include/block/block_int.h |   4 +
>  qemu-img.c|  18 ++---
>  3 files changed, 100 insertions(+), 82 deletions(-)
> 
> diff --git a/block/qcow2.c b/block/qcow2.c
> index 79fbad9d76..fc494c7591 100644
> --- a/block/qcow2.c
> +++ b/block/qcow2.c
> @@ -5520,83 +5520,96 @@ void qcow2_signal_corruption(BlockDriverState *bs, 
> bool fatal, int64_t offset,
>  s->signaled_corruption = true;
>  }
>  
> +#define QCOW_COMMON_OPTIONS \
> +{   \
> +.name = BLOCK_OPT_SIZE, \
> +.type = QEMU_OPT_SIZE,  \
> +.help = "Virtual disk size" \
> +},  \
> +{   \
> +.name = BLOCK_OPT_COMPAT_LEVEL, \
> +.type = QEMU_OPT_STRING,\
> +.help = "Compatibility level (v2 [0.10] or v3 [1.1])"   \
> +},  \
> +{   \
> +.name = BLOCK_OPT_BACKING_FILE, \
> +.type = QEMU_OPT_STRING,\
> +.help = "File name of a base image" \
> +},  \
> +{   \
> +.name = BLOCK_OPT_BACKING_FMT,  \
> +.type = QEMU_OPT_STRING,\
> +.help = "Image format of the base image"\
> +},  \
> +{   \
> +.name = BLOCK_OPT_DATA_FILE,\
> +.type = QEMU_OPT_STRING,\
> +.help = "File name of an external data file"\
> +},  \
> +{   \
> +.name = BLOCK_OPT_DATA_FILE_RAW,\
> +.type = QEMU_OPT_BOOL,  \
> +.help = "The external data file must stay valid "   \
> +"as a raw image"\
> +},  \
> +{   \
> +.name = BLOCK_OPT_ENCRYPT,  \
> +.type = QEMU_OPT_BOOL,  \
> +.help = "Encrypt the image with format 'aes'. (Deprecated " \
> +"in favor of " BLOCK_OPT_ENCRYPT_FORMAT "=aes)",\
> +},  \
> +{   \
> +.name = BLOCK_OPT_ENCRYPT_FORMAT,   \
> +.type = QEMU_OPT_STRING,\
> +.help = "Encrypt the image, format choices: 'aes', 'luks'", \
> +},  \
> +BLOCK_CRYPTO_OPT_DEF_KEY_SECRET("encrypt.", \
> +"ID of secret providing qcow AES key or LUKS passphrase"),  \
> +BLOCK_CRYPTO_OPT_DEF_LUKS_CIPHER_ALG("encrypt."),   \
> +BLOCK_CRYPTO_OPT_DEF_LUKS_CIPHER_MODE("encrypt."),  \
> +BLOCK_CRYPTO_OPT_DEF_LUKS_IVGEN_ALG("encrypt."),\
> +BLOCK_CRYPTO_OPT_DEF_LUKS_IVGEN_HASH_ALG("encrypt."),   \
> +BLOCK_CRYPTO_OPT_DEF_LUKS_HASH_ALG("encrypt."), \
> +BLOCK_CRYPTO_OPT_DEF_LUKS_ITER_TIME("encrypt."),\
> +{   \
> +.name = BLOCK_OPT_CLUSTER_SIZE, \
> +.type

Re: [PATCH v6 03/14] block/amend: add 'force' option

2020-05-14 Thread Max Reitz

On 10.05.20 15:40, Maxim Levitsky wrote:
> 'force' option will be used for some unsafe amend operations.
> 
> This includes things like erasing last keyslot in luks based formats
> (which destroys the data, unless the master key is backed up
> by external means), but that _might_ be desired result.
> 
> Signed-off-by: Maxim Levitsky 
> Reviewed-by: Daniel P. Berrangé 
> ---
>  block.c   | 4 +++-
>  block/qcow2.c | 1 +
>  docs/tools/qemu-img.rst   | 5 -
>  include/block/block.h | 1 +
>  include/block/block_int.h | 1 +
>  qemu-img-cmds.hx  | 4 ++--
>  qemu-img.c| 8 +++-
>  7 files changed, 19 insertions(+), 5 deletions(-)

[...]

> diff --git a/docs/tools/qemu-img.rst b/docs/tools/qemu-img.rst
> index 0080f83a76..fc2dca6649 100644
> --- a/docs/tools/qemu-img.rst
> +++ b/docs/tools/qemu-img.rst
> @@ -249,11 +249,14 @@ Command description:
>  
>  .. program:: qemu-img-commands
>  
> -.. option:: amend [--object OBJECTDEF] [--image-opts] [-p] [-q] [-f FMT] [-t 
> CACHE] -o OPTIONS FILENAME
> +.. option:: amend [--object OBJECTDEF] [--image-opts] [-p] [-q] [-f FMT] [-t 
> CACHE] [--force] -o OPTIONS FILENAME
>  
>Amends the image format specific *OPTIONS* for the image file
>*FILENAME*. Not all file formats support this operation.
>  
> +  --force allows some unsafe operations. Currently for -f luks, it allows to
> +  erase last encryption key, and to overwrite an active encryption key.

*erase the last encryption key

With that fixed:

Reviewed-by: Max Reitz 



signature.asc
Description: OpenPGP digital signature

Re: [PATCH v6 02/14] qcrypto/luks: implement encryption key management

2020-05-14 Thread Max Reitz

On 10.05.20 15:40, Maxim Levitsky wrote:
> Next few patches will expose that functionality to the user.
> 
> Signed-off-by: Maxim Levitsky 
> Reviewed-by: Daniel P. Berrangé 
> ---
>  crypto/block-luks.c | 395 +++-
>  qapi/crypto.json|  61 ++-
>  2 files changed, 452 insertions(+), 4 deletions(-)
> 
> diff --git a/crypto/block-luks.c b/crypto/block-luks.c
> index 4861db810c..967d382882 100644
> --- a/crypto/block-luks.c
> +++ b/crypto/block-luks.c

[...]

> @@ -1069,6 +1076,119 @@ qcrypto_block_luks_find_key(QCryptoBlock *block,
>  return -1;
>  }
>  
> +/*
> + * Returns true if a slot i is marked as active
> + * (contains encrypted copy of the master key)
> + */
> +static bool
> +qcrypto_block_luks_slot_active(const QCryptoBlockLUKS *luks,
> +   unsigned int slot_idx)
> +{
> +uint32_t val = luks->header.key_slots[slot_idx].active;
> +return val ==  QCRYPTO_BLOCK_LUKS_KEY_SLOT_ENABLED;

One space too many after the ==.

[...]

> +/*
> + * Erases an keyslot given its index
> + * Returns:
> + *0 if the keyslot was erased successfully
> + *   -1 if a error occurred while erasing the keyslot
> + *
> + */
> +static int
> +qcrypto_block_luks_erase_key(QCryptoBlock *block,
> + unsigned int slot_idx,
> + QCryptoBlockWriteFunc writefunc,
> + void *opaque,
> + Error **errp)
> +{
> +QCryptoBlockLUKS *luks = block->opaque;
> +QCryptoBlockLUKSKeySlot *slot = &luks->header.key_slots[slot_idx];
> +g_autofree uint8_t *garbagesplitkey = NULL;
> +size_t splitkeylen = luks->header.master_key_len * slot->stripes;

This accesses header.key_slots[slot_idx] before...

> +size_t i;
> +Error *local_err = NULL;
> +int ret;
> +
> +assert(slot_idx < QCRYPTO_BLOCK_LUKS_NUM_KEY_SLOTS);

...we assert that slot_idx is in bounds.

I suppose that’s kind of fine, because assertions aren’t meant to fire
either, but this basically makes the assertion useless.

> +assert(splitkeylen > 0);
> +garbagesplitkey = g_new0(uint8_t, splitkeylen);
> +
> +/* Reset the key slot header */
> +memset(slot->salt, 0, QCRYPTO_BLOCK_LUKS_SALT_LEN);
> +slot->iterations = 0;
> +slot->active = QCRYPTO_BLOCK_LUKS_KEY_SLOT_DISABLED;
> +
> +ret = qcrypto_block_luks_store_header(block,  writefunc,

Superfluous space after the comma.

> +  opaque, &local_err);
> +
> +if (ret < 0) {
> +error_propagate(errp, local_err);
> +}

error_propagate() is a no-op with local_err == NULL, so you could do
without checking @ret (just calling error_propagate() unconditionally).

(But who cares, we need to set @ret anyway, so we might as well check it.)

[...]

> +static int
> +qcrypto_block_luks_amend_add_keyslot(QCryptoBlock *block,
> + QCryptoBlockReadFunc readfunc,
> + QCryptoBlockWriteFunc writefunc,
> + void *opaque,
> + QCryptoBlockAmendOptionsLUKS *opts_luks,
> + bool force,
> + Error **errp)
> +{
> +QCryptoBlockLUKS *luks = block->opaque;
> +uint64_t iter_time = opts_luks->has_iter_time ?
> + opts_luks->iter_time :
> + QCRYPTO_BLOCK_LUKS_DEFAULT_ITER_TIME_MS;
> +int keyslot;
> +g_autofree char *old_password = NULL;
> +g_autofree char *new_password = NULL;
> +g_autofree uint8_t *master_key = NULL;

(I assume we don’t really care about purging secrets from memory after use)

[...]

> +static int
> +qcrypto_block_luks_amend_erase_keyslots(QCryptoBlock *block,
> +QCryptoBlockReadFunc readfunc,
> +QCryptoBlockWriteFunc writefunc,
> +void *opaque,
> +QCryptoBlockAmendOptionsLUKS 
> *opts_luks,
> +bool force,
> +Error **errp)
> +{

[...]

> +for (i = 0; i < QCRYPTO_BLOCK_LUKS_NUM_KEY_SLOTS; i++) {
> +int rv = qcrypto_block_luks_load_key(block,
> + i,
> + old_password,
> + tmpkey,
> + readfunc,
> + opaque,
> + errp);
> +if (rv == -1) {
> +return -1;
> +} else if (rv == 1) {
> +bitmap_set(&slots_to_erase_bitmap, i, 1);

We should assert that QCRYPTO_BLOCK_LUKS_NUM_KEY_SLOTS <=
sizeof(slots_to_eras

Re: [PATCH v4 4/9] blockdev: Promote several bitmap functions to non-static

2020-05-14 Thread Vladimir Sementsov-Ogievskiy


13.05.2020 04:16, Eric Blake wrote:

-HBitmap **backup, Error **errp)
+BdrvDirtyBitmap *block_dirty_bitmap_merge(const char *node, const char *target,
+  BlockDirtyBitmapMergeSourceList 
*bitmaps,
+  HBitmap **backup, Error **errp)
  {
  BlockDriverState *bs;


s/bitmaps/bms/ to match declaration and fit into 80 characters

--
Best regards,
Vladimir

Re: [RFC v2] migration: Add migrate-set-bitmap-node-mapping

2020-05-14 Thread Kevin Wolf

Am 14.05.2020 um 09:13 hat Max Reitz geschrieben:
> On 13.05.20 18:11, Eric Blake wrote:
> > On 5/13/20 9:56 AM, Max Reitz wrote:
> >> This command allows mapping block node names to aliases for the purpose
> >> of block dirty bitmap migration.
> >>
> >> This way, management tools can use different node names on the source
> >> and destination and pass the mapping of how bitmaps are to be
> >> transferred to qemu (on the source, the destination, or even both with
> >> arbitrary aliases in the migration stream).
> >>
> >> Suggested-by: Vladimir Sementsov-Ogievskiy 
> >> Signed-off-by: Max Reitz 
> >> ---
> > 
> >> @@ -713,6 +731,44 @@ static bool dirty_bitmap_has_postcopy(void *opaque)
> >>   return true;
> >>   }
> >>   +void
> >> qmp_migrate_set_bitmap_node_mapping(MigrationBlockNodeMappingList
> >> *mapping,
> >> + Error **errp)
> >> +{
> >> +    QDict *in_mapping = qdict_new();
> >> +    QDict *out_mapping = qdict_new();
> >> +
> >> +    for (; mapping; mapping = mapping->next) {
> >> +    MigrationBlockNodeMapping *entry = mapping->value;
> >> +
> >> +    if (qdict_haskey(out_mapping, entry->node_name)) {
> >> +    error_setg(errp, "Cannot map node name '%s' twice",
> >> +   entry->node_name);
> >> +    goto fail;
> >> +    }
> > 
> > Can we call this command more than once?  Is it cumulative (call it once
> > to set mapping for "a", second time to also set mapping for "b"), or
> > should it reset (second call wipes out all mappings from first call, any
> > mappings that must exist must be passed in the final call)?
> 
> I tried to make it clear in the documentation:
> 
> > +# @mapping: The mapping; must be one-to-one, but not necessarily
> > +#   complete.  Any mapping not given will be reset to the
> > +#   default (i.e. the identity mapping).
> 
> So everything that isn’t set in the second call is reset.  I thought
> about what you proposed (because I guess that’s the most intuitive
> idea), but after consideration I didn’t see why we’d need different
> behavior, so it would only serve to make the code more complicated.

Also, if it were cumulative, we would need a separate reset command
because you probably don't want to use the same mapping you used for an
incoming migration when you later migrate away again to a third host.

Kevin


signature.asc
Description: PGP signature

Re: [RFC v2] migration: Add migrate-set-bitmap-node-mapping

2020-05-14 Thread Kevin Wolf

Am 14.05.2020 um 11:09 hat Max Reitz geschrieben:
> On 13.05.20 22:09, Vladimir Sementsov-Ogievskiy wrote:
> [...]
> > 1. So, you decided to make only node-mapping, not bitmap-mapping, so we
> > can't rename bitmaps in-flight and can't migrate bitmaps from one node
> > to several and visa-versa. I think it's OK, nothing good in such
> > possibilities, and this simplifies things.
> 
> On second thought, I wonder whether it would be useful to migrate
> bitmaps from multiple nodes to a single one.  But probably not, this
> would only make sense for filters, really, and in such a case the
> bitmaps should probably just be moved prior to migration.
> 
> (And I can’t imagine any other case.  When flattening backing chains,
> the bitmaps from the dropped layers basically become useless.)

I agree, you can always move the bitmaps in a separate step, either on
the source before migration or on the destination afterwards. Migration
is already complicated enough, let's not move things into it that don't
necessarily have to be there.

You would get complications like possibly conflicting bitmap names when
you migrate the bitmaps of two nodes into a single one.

Kevin

signature.asc
Description: PGP signature

Re: [PATCH 0/5] Introduce 'yank' oob qmp command to recover from hanging qemu

2020-05-14 Thread Dr. David Alan Gilbert

* Lukas Straub (lukasstra...@web.de) wrote:
> Terminology:
> instance = one (nbd) blockdev/one chardev/the single migrationstate
> connection = one TCP connection
> 
> Hello Everyone,
> Having read all the comments, here is proposal v2:
> Every instance registers itself with a unique name in the form 
> "blockdev:", "chardev:" and "migration" using 
> yank_register_instance which will do some sanity checks like checking if the 
> same name exists already. Then (multiple) yank functions can be registered as 
> needed with that single name. When the instance exits/is removed, it 
> unregisters all yank functions and unregisters it's name with 
> yank_unregister_instance which will check if all yank functions where 
> unregistered.
> Every instance that supports the yank feature will register itself and the 
> yank functions unconditionally (No extra 'yank' option per instance).
> The 'query-yank' oob qmp command lists the names of all registered instances.
> The 'yank' oob qmp command takes a list of names and for every name calls all 
> yank functions registered with that name. Before doing anything, it will 
> check that all names exist.
> 
> If the instance has multiple connections (say, migration with multifd), i 
> don't think it makes much sense to just shutdown one connection. Calling 
> 'yank' on a instance will always shutdown all connections of that instance.

Agreed.

> Yank functions are generic and in no way limited to connections. Say, if 
> migration is started to an 'exec:' address, migration could register a yank 
> function to kill that external command on yank (Won't be implemented yet 
> though).

One thing we need to be care of is that the yank functions stay suitable
for OOB calling.

Dave

> Regards,
> Lukas Straub


--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

Re: [RFC v2] migration: Add migrate-set-bitmap-node-mapping

2020-05-14 Thread Vladimir Sementsov-Ogievskiy


14.05.2020 12:09, Max Reitz wrote:

On 13.05.20 22:09, Vladimir Sementsov-Ogievskiy wrote:
[...]

1. So, you decided to make only node-mapping, not bitmap-mapping, so we
can't rename bitmaps in-flight and can't migrate bitmaps from one node
to several and visa-versa. I think it's OK, nothing good in such
possibilities, and this simplifies things.


On second thought, I wonder whether it would be useful to migrate
bitmaps from multiple nodes to a single one.  But probably not, this
would only make sense for filters, really, and in such a case the
bitmaps should probably just be moved prior to migration.


Yes, I think the moving before migration is enough.


--
Best regards,
Vladimir

Re: [PATCH 0/5] Introduce 'yank' oob qmp command to recover from hanging qemu

2020-05-14 Thread Kevin Wolf

Am 13.05.2020 um 21:12 hat Lukas Straub geschrieben:
> Terminology:
> instance = one (nbd) blockdev/one chardev/the single migrationstate
> connection = one TCP connection
> 
> Hello Everyone,
> Having read all the comments, here is proposal v2:

Looks quite good to me.

> Every instance registers itself with a unique name in the form
> "blockdev:", "chardev:" and "migration" using
> yank_register_instance which will do some sanity checks like checking
> if the same name exists already. Then (multiple) yank functions can be
> registered as needed with that single name. When the instance exits/is
> removed, it unregisters all yank functions and unregisters it's name
> with yank_unregister_instance which will check if all yank functions
> where unregistered.

Feels like nitpicking, but you say that functions are registered with a
name that you have previously registered. If you mean literally passing
a string, could this lead to callers forgetting to register it first?

Maybe yank_register_instance() should return something like a
YankInstance object that must be passed to yank_register_function().
Then you would probably also have a list of all functions registered for
a single instance so that yank_unregister_instance() could automatically
remove all of them instead of requiring the instance to do so.

> Every instance that supports the yank feature will register itself and
> the yank functions unconditionally (No extra 'yank' option per
> instance).
> The 'query-yank' oob qmp command lists the names of all registered
> instances.
> The 'yank' oob qmp command takes a list of names and for every name
> calls all yank functions registered with that name. Before doing
> anything, it will check that all names exist.
> 
> If the instance has multiple connections (say, migration with
> multifd), i don't think it makes much sense to just shutdown one
> connection. Calling 'yank' on a instance will always shutdown all
> connections of that instance.

I think it depends. If shutting down one connection basically kills the
functionality of the whole instance, there is no reason not to kill all
connections. But if the instance could continue working on the second
connection, maybe it shouldn't be killed.

Anyway, we can extend things as needed later. I already mentioned
elsewhere in this thread that block node-names have a restricted set of
allowed character, so having a suffix to distinguish multiple yankable
things is possible. I just checked chardev and it also restricts the
allowed set of characters, so the same applies. 'migration' is only a
fixed string, so it's trivially extensible.

So we know a path forward if we ever need to yank individual
connections, which is good enough for me.

> Yank functions are generic and in no way limited to connections. Say,
> if migration is started to an 'exec:' address, migration could
> register a yank function to kill that external command on yank (Won't
> be implemented yet though).

Yes, this makes sense as a potential future use case.

Kevin


signature.asc
Description: PGP signature

Re: [PATCH v10 14/14] iotests: use python logging for iotests.log()

2020-05-14 Thread Kevin Wolf

Am 14.05.2020 um 08:24 hat John Snow geschrieben:
> On 3/31/20 9:44 AM, Kevin Wolf wrote:
> > Am 31.03.2020 um 02:00 hat John Snow geschrieben:
> >> We can turn logging on/off globally instead of per-function.
> >>
> >> Remove use_log from run_job, and use python logging to turn on
> >> diffable output when we run through a script entry point.
> >>
> >> iotest 245 changes output order due to buffering reasons.
> >>
> >>
> >> An extended note on python logging:
> >>
> >> A NullHandler is added to `qemu.iotests` to stop output from being
> >> generated if this code is used as a library without configuring logging.
> >> A NullHandler is only needed at the root, so a duplicate handler is not
> >> needed for `qemu.iotests.diff_io`.
> >>
> >> When logging is not configured, messages at the 'WARNING' levels or
> >> above are printed with default settings. The NullHandler stops this from
> >> occurring, which is considered good hygiene for code used as a library.
> >>
> >> See https://docs.python.org/3/howto/logging.html#library-config
> >>
> >> When logging is actually enabled (always at the behest of an explicit
> >> call by a client script), a root logger is implicitly created at the
> >> root, which allows messages to propagate upwards and be handled/emitted
> >> from the root logger with default settings.
> >>
> >> When we want iotest logging, we attach a handler to the
> >> qemu.iotests.diff_io logger and disable propagation to avoid possible
> >> double-printing.
> >>
> >> For more information on python logging infrastructure, I highly
> >> recommend downloading the pip package `logging_tree`, which provides
> >> convenient visualizations of the hierarchical logging configuration
> >> under different circumstances.
> >>
> >> See https://pypi.org/project/logging_tree/ for more information.
> >>
> >> Signed-off-by: John Snow 
> >> Reviewed-by: Max Reitz 
> > 
> > Should we enable logger if -d is given?
> > 
> > Previously we had:
> > 
> > $ ./check -d -T -raw 281
> > [...]
> > 281 not run: not suitable for this image format: raw
> > 281  not run[15:39:03] [15:39:04]not suitable 
> > for this image format: raw
> > Not run: 281
> > 
> > After this series, the first line of output from notrun() is missing.
> > Not that I think it's important to have the line, but as long as we
> > bother to call logger.warning(), I thought that maybe we want to be able
> > to actually see the effect of it somehwere?
> > 
> > Kevin
> > 
> 
> Uh, okay. So this is weirder than I thought it was going to be!
> 
> So, if you move the debug configuration up above the _verify calls,
> you'll see the message printed out to the debug stream:
> 
> DEBUG:qemu.iotests:iotests debugging messages active
> WARNING:qemu.iotests:281 not run: not suitable for this image format: raw
> 
> ...but if you omit the `-d` flag, the message vanishes into a black
> hole. Did it always work like that ...?

Yes, this is how it used to work. It's a result of ./check only printing
the test output with -d, and such log messages are basically just test
output.

And I think it's exactly what we want: Without -d, you want only the
summary, i.e. a single line that says "pass", "fail" or "notrun",
potentially with a small note at the end of the line, but that's it.

Kevin

Re: [PULL 0/5] Block patches

2020-05-14 Thread Peter Maydell

On Wed, 13 May 2020 at 15:15, Max Reitz  wrote:
>
> The following changes since commit d5c75ec500d96f1d93447f990cd5a4ef5ba27fae:
>
>   Merge remote-tracking branch 
> 'remotes/stefanberger/tags/pull-tpm-2020-05-08-1' into staging (2020-05-12 
> 17:00:10 +0100)
>
> are available in the Git repository at:
>
>   https://github.com/XanClic/qemu.git tags/pull-block-2020-05-13
>
> for you to fetch changes up to fc9aefc8c0d3c6392656ea661ce72c1583b70bbd:
>
>   block/block-copy: fix use-after-free of task pointer (2020-05-13 14:20:31 
> +0200)
>
> 
> Block patches:
> - zstd compression for qcow2
> - Fix use-after-free


Applied, thanks.

Please update the changelog at https://wiki.qemu.org/ChangeLog/5.1
for any user-visible changes.

-- PMM

Re: [RFC v2] migration: Add migrate-set-bitmap-node-mapping

2020-05-14 Thread Dr. David Alan Gilbert

* Max Reitz (mre...@redhat.com) wrote:
> On 14.05.20 10:42, Dr. David Alan Gilbert wrote:
> > * Max Reitz (mre...@redhat.com) wrote:
> > 
> > 
> > 
> >> +void qmp_migrate_set_bitmap_node_mapping(MigrationBlockNodeMappingList 
> >> *mapping,
> >> + Error **errp)
> >> +{
> >> +QDict *in_mapping = qdict_new();
> >> +QDict *out_mapping = qdict_new();
> >> +
> >> +for (; mapping; mapping = mapping->next) {
> >> +MigrationBlockNodeMapping *entry = mapping->value;
> >> +
> >> +if (qdict_haskey(out_mapping, entry->node_name)) {
> >> +error_setg(errp, "Cannot map node name '%s' twice",
> >> +   entry->node_name);
> >> +goto fail;
> >> +}
> > 
> > I'm not too clear exactly which case this is protecting against;
> > I think that's protecting against mapping
> > 
> >   'src1'->'dst1' and 'src1'->'dst2'
> > which is a good check.s (or maybe it's checking against dst2 twice?)
> 
> This one is against mapping src1 twice.  Both checks together check that
> it’s a one-to-one bijective mapping.
> 
> The technical reason why it needs to be one-to-one is because we base
> two QDicts off of it, so the inverse mapping needs to work.
> 
> > What about cases where there is no mapping - e.g. imagine
> > that we have b1/b2 on the source and b2/b3 on the dest; now
> > if we add just a mapping:
> > 
> >   b1->b2
> > 
> > then we end up with:
> > 
> >   b1 -> b2
> >   b2 -> b2  (non-mapped)
> > b3
> > 
> > so we have a clash there - are we protected against that?
> 
> Oh, no, we aren’t.  That wasn’t intentional.  However, I’m not sure how
> we’d protect against it.  We can’t check it in
> qmp_migrate_set_bitmap_node_mapping(), because we don’t know yet which
> nodes will exist at the time of migration, and which of those will have
> bitmaps.
> 
> So we’d need to check it as part of the migration process (by looking up
> any unmapped entries that default to the identity mapping in the
> respective reverse mapping, to see whether anything maps to the same name).

Yeh a once through check of all the nodes at the start of the migration
would probably fix it.

> OTOH, Vladimir proposed adding a parameter to
> migrate-set-bitmap-node-mapping that would make migration fail if any
> bitmaps should be migrated off of unmapped nodes, or if any incoming
> alias is unmapped (instead of defaulting to the identity mapping).  If
> we just make that the only behavior, then we wouldn’t have a problem
> with that at all, because all unmapped nodes would always throw an error.

Yeh that would force you to put a full mapping table in place.

> (And on the third hand, I wonder whether we should actually allow
> migrating bitmaps from multiple nodes to a single one, but I suppose
> that would require two separate commands, one for incoming and one for
> outgoing...)

Wouldn't that get very messy?

Dave

> Max
> 



--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

Re: [RFC v2] migration: Add migrate-set-bitmap-node-mapping

2020-05-14 Thread Max Reitz

On 13.05.20 22:09, Vladimir Sementsov-Ogievskiy wrote:
[...]
> 1. So, you decided to make only node-mapping, not bitmap-mapping, so we
> can't rename bitmaps in-flight and can't migrate bitmaps from one node
> to several and visa-versa. I think it's OK, nothing good in such
> possibilities, and this simplifies things.

On second thought, I wonder whether it would be useful to migrate
bitmaps from multiple nodes to a single one.  But probably not, this
would only make sense for filters, really, and in such a case the
bitmaps should probably just be moved prior to migration.

(And I can’t imagine any other case.  When flattening backing chains,
the bitmaps from the dropped layers basically become useless.)

Max

signature.asc
Description: OpenPGP digital signature

Re: [RFC v2] migration: Add migrate-set-bitmap-node-mapping

2020-05-14 Thread Max Reitz

On 14.05.20 10:42, Dr. David Alan Gilbert wrote:
> * Max Reitz (mre...@redhat.com) wrote:
> 
> 
> 
>> +void qmp_migrate_set_bitmap_node_mapping(MigrationBlockNodeMappingList 
>> *mapping,
>> + Error **errp)
>> +{
>> +QDict *in_mapping = qdict_new();
>> +QDict *out_mapping = qdict_new();
>> +
>> +for (; mapping; mapping = mapping->next) {
>> +MigrationBlockNodeMapping *entry = mapping->value;
>> +
>> +if (qdict_haskey(out_mapping, entry->node_name)) {
>> +error_setg(errp, "Cannot map node name '%s' twice",
>> +   entry->node_name);
>> +goto fail;
>> +}
> 
> I'm not too clear exactly which case this is protecting against;
> I think that's protecting against mapping
> 
>   'src1'->'dst1' and 'src1'->'dst2'
> which is a good check.s (or maybe it's checking against dst2 twice?)

This one is against mapping src1 twice.  Both checks together check that
it’s a one-to-one bijective mapping.

The technical reason why it needs to be one-to-one is because we base
two QDicts off of it, so the inverse mapping needs to work.

> What about cases where there is no mapping - e.g. imagine
> that we have b1/b2 on the source and b2/b3 on the dest; now
> if we add just a mapping:
> 
>   b1->b2
> 
> then we end up with:
> 
>   b1 -> b2
>   b2 -> b2  (non-mapped)
> b3
> 
> so we have a clash there - are we protected against that?

Oh, no, we aren’t.  That wasn’t intentional.  However, I’m not sure how
we’d protect against it.  We can’t check it in
qmp_migrate_set_bitmap_node_mapping(), because we don’t know yet which
nodes will exist at the time of migration, and which of those will have
bitmaps.

So we’d need to check it as part of the migration process (by looking up
any unmapped entries that default to the identity mapping in the
respective reverse mapping, to see whether anything maps to the same name).

OTOH, Vladimir proposed adding a parameter to
migrate-set-bitmap-node-mapping that would make migration fail if any
bitmaps should be migrated off of unmapped nodes, or if any incoming
alias is unmapped (instead of defaulting to the identity mapping).  If
we just make that the only behavior, then we wouldn’t have a problem
with that at all, because all unmapped nodes would always throw an error.

(And on the third hand, I wonder whether we should actually allow
migrating bitmaps from multiple nodes to a single one, but I suppose
that would require two separate commands, one for incoming and one for
outgoing...)

Max

signature.asc
Description: OpenPGP digital signature

Re: [RFC v2] migration: Add migrate-set-bitmap-node-mapping

2020-05-14 Thread Dr. David Alan Gilbert

* Max Reitz (mre...@redhat.com) wrote:



> +void qmp_migrate_set_bitmap_node_mapping(MigrationBlockNodeMappingList 
> *mapping,
> + Error **errp)
> +{
> +QDict *in_mapping = qdict_new();
> +QDict *out_mapping = qdict_new();
> +
> +for (; mapping; mapping = mapping->next) {
> +MigrationBlockNodeMapping *entry = mapping->value;
> +
> +if (qdict_haskey(out_mapping, entry->node_name)) {
> +error_setg(errp, "Cannot map node name '%s' twice",
> +   entry->node_name);
> +goto fail;
> +}

I'm not too clear exactly which case this is protecting against;
I think that's protecting against mapping

  'src1'->'dst1' and 'src1'->'dst2'
which is a good check.s (or maybe it's checking against dst2 twice?)

What about cases where there is no mapping - e.g. imagine
that we have b1/b2 on the source and b2/b3 on the dest; now
if we add just a mapping:

  b1->b2

then we end up with:

  b1 -> b2
  b2 -> b2  (non-mapped)
b3

so we have a clash there - are we protected against that?

Dave

> +if (qdict_haskey(in_mapping, entry->alias)) {
> +error_setg(errp, "Cannot use alias '%s' twice",
> +   entry->alias);
> +goto fail;
> +}
> +
> +qdict_put_str(in_mapping, entry->alias, entry->node_name);
> +qdict_put_str(out_mapping, entry->node_name, entry->alias);
> +}
> +
> +qobject_unref(dirty_bitmap_mig_state.node_in_mapping);
> +qobject_unref(dirty_bitmap_mig_state.node_out_mapping);
> +
> +dirty_bitmap_mig_state.node_in_mapping = in_mapping;
> +dirty_bitmap_mig_state.node_out_mapping = out_mapping;
> +
> +return;
> +
> +fail:
> +qobject_unref(in_mapping);
> +qobject_unref(out_mapping);
> +}
> +
>  static SaveVMHandlers savevm_dirty_bitmap_handlers = {
>  .save_setup = dirty_bitmap_save_setup,
>  .save_live_complete_postcopy = dirty_bitmap_save_complete,
> -- 
> 2.26.2
> 
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

Re: [PATCH v6 19/20] hw/block/nvme: do cmb/pmr init as part of pci init

2020-05-14 Thread Philippe Mathieu-Daudé


On 5/14/20 6:46 AM, Klaus Jensen wrote:

From: Klaus Jensen 


Having the patch subject duplicated ease review (not all email client 
display email subject close to email content):


"Do cmb/pmr init as part of pci init."

Reviewed-by: Philippe Mathieu-Daudé 



Signed-off-by: Klaus Jensen 
Reviewed-by: Maxim Levitsky 
---
  hw/block/nvme.c | 12 ++--
  1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index 7254b66ae199..2addcc86034a 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -1527,6 +1527,12 @@ static void nvme_init_pci(NvmeCtrl *n, PCIDevice 
*pci_dev)
  pci_register_bar(pci_dev, 0, PCI_BASE_ADDRESS_SPACE_MEMORY |
   PCI_BASE_ADDRESS_MEM_TYPE_64, &n->iomem);
  msix_init_exclusive_bar(pci_dev, n->params.max_ioqpairs + 1, 4, NULL);
+
+if (n->params.cmb_size_mb) {
+nvme_init_cmb(n, pci_dev);
+} else if (n->pmrdev) {
+nvme_init_pmr(n, pci_dev);
+}
  }
  
  static void nvme_realize(PCIDevice *pci_dev, Error **errp)

@@ -1588,12 +1594,6 @@ static void nvme_realize(PCIDevice *pci_dev, Error 
**errp)
  n->bar.vs = 0x00010200;
  n->bar.intmc = n->bar.intms = 0;
  
-if (n->params.cmb_size_mb) {

-nvme_init_cmb(n, pci_dev);
-} else if (n->pmrdev) {
-nvme_init_pmr(n, pci_dev);
-}
-
  for (i = 0; i < n->num_namespaces; i++) {
  nvme_init_namespace(n, &n->namespaces[i], &local_err);
  if (local_err) {

Re: [PATCH v6 18/20] hw/block/nvme: factor out pmr setup

2020-05-14 Thread Philippe Mathieu-Daudé


On 5/14/20 6:46 AM, Klaus Jensen wrote:

From: Klaus Jensen 

Signed-off-by: Klaus Jensen 
Reviewed-by: Maxim Levitsky 
---
  hw/block/nvme.c | 95 ++---
  1 file changed, 51 insertions(+), 44 deletions(-)

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index d71a5f142d51..7254b66ae199 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -58,6 +58,7 @@
  #define NVME_REG_SIZE 0x1000
  #define NVME_DB_SIZE  4
  #define NVME_CMB_BIR 2
+#define NVME_PMR_BIR 2
  
  #define NVME_GUEST_ERR(trace, fmt, ...) \

  do { \
@@ -1463,6 +1464,55 @@ static void nvme_init_cmb(NvmeCtrl *n, PCIDevice 
*pci_dev)
   PCI_BASE_ADDRESS_MEM_PREFETCH, &n->ctrl_mem);
  }
  
+static void nvme_init_pmr(NvmeCtrl *n, PCIDevice *pci_dev)

+{
+/* Controller Capabilities register */
+NVME_CAP_SET_PMRS(n->bar.cap, 1);
+
+/* PMR Capabities register */
+n->bar.pmrcap = 0;
+NVME_PMRCAP_SET_RDS(n->bar.pmrcap, 0);
+NVME_PMRCAP_SET_WDS(n->bar.pmrcap, 0);
+NVME_PMRCAP_SET_BIR(n->bar.pmrcap, NVME_PMR_BIR);
+NVME_PMRCAP_SET_PMRTU(n->bar.pmrcap, 0);
+/* Turn on bit 1 support */
+NVME_PMRCAP_SET_PMRWBM(n->bar.pmrcap, 0x02);
+NVME_PMRCAP_SET_PMRTO(n->bar.pmrcap, 0);
+NVME_PMRCAP_SET_CMSS(n->bar.pmrcap, 0);
+
+/* PMR Control register */
+n->bar.pmrctl = 0;
+NVME_PMRCTL_SET_EN(n->bar.pmrctl, 0);
+
+/* PMR Status register */
+n->bar.pmrsts = 0;
+NVME_PMRSTS_SET_ERR(n->bar.pmrsts, 0);
+NVME_PMRSTS_SET_NRDY(n->bar.pmrsts, 0);
+NVME_PMRSTS_SET_HSTS(n->bar.pmrsts, 0);
+NVME_PMRSTS_SET_CBAI(n->bar.pmrsts, 0);
+
+/* PMR Elasticity Buffer Size register */
+n->bar.pmrebs = 0;
+NVME_PMREBS_SET_PMRSZU(n->bar.pmrebs, 0);
+NVME_PMREBS_SET_RBB(n->bar.pmrebs, 0);
+NVME_PMREBS_SET_PMRWBZ(n->bar.pmrebs, 0);
+
+/* PMR Sustained Write Throughput register */
+n->bar.pmrswtp = 0;
+NVME_PMRSWTP_SET_PMRSWTU(n->bar.pmrswtp, 0);
+NVME_PMRSWTP_SET_PMRSWTV(n->bar.pmrswtp, 0);
+
+/* PMR Memory Space Control register */
+n->bar.pmrmsc = 0;
+NVME_PMRMSC_SET_CMSE(n->bar.pmrmsc, 0);
+NVME_PMRMSC_SET_CBA(n->bar.pmrmsc, 0);
+
+pci_register_bar(pci_dev, NVME_PMRCAP_BIR(n->bar.pmrcap),
+ PCI_BASE_ADDRESS_SPACE_MEMORY |
+ PCI_BASE_ADDRESS_MEM_TYPE_64 |
+ PCI_BASE_ADDRESS_MEM_PREFETCH, &n->pmrdev->mr);
+}
+
  static void nvme_init_pci(NvmeCtrl *n, PCIDevice *pci_dev)
  {
  uint8_t *pci_conf = pci_dev->config;
@@ -1541,50 +1591,7 @@ static void nvme_realize(PCIDevice *pci_dev, Error 
**errp)
  if (n->params.cmb_size_mb) {
  nvme_init_cmb(n, pci_dev);
  } else if (n->pmrdev) {
-/* Controller Capabilities register */
-NVME_CAP_SET_PMRS(n->bar.cap, 1);
-
-/* PMR Capabities register */
-n->bar.pmrcap = 0;
-NVME_PMRCAP_SET_RDS(n->bar.pmrcap, 0);
-NVME_PMRCAP_SET_WDS(n->bar.pmrcap, 0);
-NVME_PMRCAP_SET_BIR(n->bar.pmrcap, 2);
-NVME_PMRCAP_SET_PMRTU(n->bar.pmrcap, 0);
-/* Turn on bit 1 support */
-NVME_PMRCAP_SET_PMRWBM(n->bar.pmrcap, 0x02);
-NVME_PMRCAP_SET_PMRTO(n->bar.pmrcap, 0);
-NVME_PMRCAP_SET_CMSS(n->bar.pmrcap, 0);
-
-/* PMR Control register */
-n->bar.pmrctl = 0;
-NVME_PMRCTL_SET_EN(n->bar.pmrctl, 0);
-
-/* PMR Status register */
-n->bar.pmrsts = 0;
-NVME_PMRSTS_SET_ERR(n->bar.pmrsts, 0);
-NVME_PMRSTS_SET_NRDY(n->bar.pmrsts, 0);
-NVME_PMRSTS_SET_HSTS(n->bar.pmrsts, 0);
-NVME_PMRSTS_SET_CBAI(n->bar.pmrsts, 0);
-
-/* PMR Elasticity Buffer Size register */
-n->bar.pmrebs = 0;
-NVME_PMREBS_SET_PMRSZU(n->bar.pmrebs, 0);
-NVME_PMREBS_SET_RBB(n->bar.pmrebs, 0);
-NVME_PMREBS_SET_PMRWBZ(n->bar.pmrebs, 0);
-
-/* PMR Sustained Write Throughput register */
-n->bar.pmrswtp = 0;
-NVME_PMRSWTP_SET_PMRSWTU(n->bar.pmrswtp, 0);
-NVME_PMRSWTP_SET_PMRSWTV(n->bar.pmrswtp, 0);
-
-/* PMR Memory Space Control register */
-n->bar.pmrmsc = 0;
-NVME_PMRMSC_SET_CMSE(n->bar.pmrmsc, 0);
-NVME_PMRMSC_SET_CBA(n->bar.pmrmsc, 0);
-
-pci_register_bar(pci_dev, NVME_PMRCAP_BIR(n->bar.pmrcap),
-PCI_BASE_ADDRESS_SPACE_MEMORY | PCI_BASE_ADDRESS_MEM_TYPE_64 |
-PCI_BASE_ADDRESS_MEM_PREFETCH, &n->pmrdev->mr);
+nvme_init_pmr(n, pci_dev);
  }
  
  for (i = 0; i < n->num_namespaces; i++) {




Reviewed-by: Philippe Mathieu-Daudé

Re: [RFC v2] migration: Add migrate-set-bitmap-node-mapping

2020-05-14 Thread Max Reitz

On 13.05.20 22:09, Vladimir Sementsov-Ogievskiy wrote:
> 13.05.2020 17:56, Max Reitz wrote:
>> This command allows mapping block node names to aliases for the purpose
>> of block dirty bitmap migration.
>>
>> This way, management tools can use different node names on the source
>> and destination and pass the mapping of how bitmaps are to be
>> transferred to qemu (on the source, the destination, or even both with
>> arbitrary aliases in the migration stream).
>>
>> Suggested-by: Vladimir Sementsov-Ogievskiy 
>> Signed-off-by: Max Reitz 
>> ---
>> Branch: https://github.com/XanClic/qemu.git
>> migration-bitmap-mapping-rfc-v2
>> Branch: https://git.xanclic.moe/XanClic/qemu.git
>> migration-bitmap-mapping-rfc-v2
>>
>> (Sorry, v1 was just broken.  This one should work better.)
>>
>> Vladimir has proposed something like this in April:
>> https://lists.nongnu.org/archive/html/qemu-block/2020-04/msg00171.html
>>
>> Now I’ve been asked by my manager to look at this, so I decided to just
>> write a patch to see how it’d play out.
> 
> Great! Sometimes I remember about this thing, but never start
> implementing :)
> 
>>
>> This is an RFC, because I’d like to tack on tests to the final version,
>> but I’m not sure whether I can come up with something before the end of
>> the week (and I’ll be on PTO for the next two weeks).
>>
>> Also, I don’t know whether migration/block-dirty-bitmap.c is the best
>> place to put qmp_migrate_set_bitmap_mapping(), but it appears we already
>> have some QMP handlers in migration/, so I suppose it isn’t too bad.
>> ---
>>   qapi/migration.json    | 36 
>>   migration/block-dirty-bitmap.c | 60 --
>>   2 files changed, 94 insertions(+), 2 deletions(-)
>>
>> diff --git a/qapi/migration.json b/qapi/migration.json
>> index d5000558c6..97037ea635 100644
>> --- a/qapi/migration.json
>> +++ b/qapi/migration.json
>> @@ -1621,3 +1621,39 @@
>>   ##
>>   { 'event': 'UNPLUG_PRIMARY',
>>     'data': { 'device-id': 'str' } }
>> +
>> +##
>> +# @MigrationBlockNodeMapping:
>> +#
>> +# Maps a block node name to an alias for migration.
>> +#
>> +# @node-name: A block node name.
>> +#
>> +# @alias: An alias name for migration (for example the node name on
>> +# the opposite site).
>> +#
>> +# Since: 5.1
>> +##
>> +{ 'struct': 'MigrationBlockNodeMapping',
>> +  'data': {
>> +  'node-name': 'str',
>> +  'alias': 'str'
>> +  } }
>> +
>> +##
>> +# @migrate-set-bitmap-node-mapping:
>> +#
>> +# Maps block node names to arbitrary aliases for the purpose of dirty
>> +# bitmap migration.  Such aliases may for example be the corresponding
>> +# node names on the opposite site.
>> +#
>> +# By default, every node name is mapped to itself.
>> +#
>> +# @mapping: The mapping; must be one-to-one, but not necessarily
>> +#   complete.  Any mapping not given will be reset to the
>> +#   default (i.e. the identity mapping).
>> +#
>> +# Since: 5.1
>> +##
>> +{ 'command': 'migrate-set-bitmap-node-mapping',
>> +  'data': { 'mapping': ['MigrationBlockNodeMapping'] } }
> 
> Hm. I like it, it's simpler and clearer than what I was thinking about.
> 
> 1. So, you decided to make only node-mapping, not bitmap-mapping, so we
> can't rename bitmaps in-flight and can't migrate bitmaps from one node
> to several and visa-versa. I think it's OK, nothing good in such
> possibilities, and this simplifies things.

If it turns out that we’d want it, I suppose we can also still always
extend MigrationBlockNodeMapping by another mapping array for bitmaps.

> 2. If I understand correctly, default to node-name matching doesn't make
> real sense for libvirt.. But on the other hand, libvirt should not be
> considered as the ony user of Qemu. Still, the default seems unsafe..
> Could we make it optional? Or add an option to disable this default for
> absolutely strict behavior?

It was my understanding that libvirt (which should know about all
bitmaps on all nodes) would and could ensure itself that all nodes are
mapped according to what it needs.  (But that’s why Peter is CC’d, to
get his input.)

But your idea seems simple, so why not.

> May be, add a parameter
> 
> fallback: node-name | error | drop
> 
> where
>   node-name: use node-name as an alias, if found bitmap on the node not
> mentioned in @mapping [should not be useful for libvirt, but may be for
> others]
>   error: just error-out if such bitmap found [libvirt should use it, I
> propose it as a default value for @fallback]

You mean error out during migration?  Hm.  I suppose that’s OK, if some
mapping erroneously isn’t set and the node name doesn’t exist in the
destination, we’ll error out, too, so...

Shouldn’t be too difficult to implement, just put the enum in
dirty_bitmap_mig_state, and then do what it says when no entry can be
found in the mapping QDict.

>   drop: just ignore such bitmap - it will be lost [just and idea, I
> doubt that it is really useful]
> 
> ===
> 
> Also, we shou

Re: [PATCH v2 5/5] vhost: add device started check in migration set log

2020-05-14 Thread Jason Wang




On 2020/5/13 下午5:47, Dima Stepanov wrote:

 case CHR_EVENT_CLOSED:
 /* a close event may happen during a read/write, but vhost
  * code assumes the vhost_dev remains setup, so delay the
  * stop & clear to idle.
  * FIXME: better handle failure in vhost code, remove bh
  */
 if (s->watch) {
 AioContext *ctx = qemu_get_current_aio_context();

 g_source_remove(s->watch);
 s->watch = 0;
 qemu_chr_fe_set_handlers(&s->chr, NULL, NULL, NULL, NULL,
  NULL, NULL, false);

 aio_bh_schedule_oneshot(ctx, chr_closed_bh, opaque);
 }
 break;

I think it's time we dropped the FIXME and moved the handling to common
code. Jason? Marc-André?

I agree. Just to confirm, do you prefer bh or doing changes like what is
done in this series? It looks to me bh can have more easier codes.

Could it be a good idea just to make disconnect in the char device but
postphone clean up in the vhost-user-blk (or any other vhost-user
device) itself? So we are moving the postphone logic and decision from
the char device to vhost-user device. One of the idea i have is as
follows:
   - Put ourself in the INITIALIZATION state
   - Start these vhost-user "handshake" commands
   - If we got a disconnect error, perform disconnect, but don't clean up
 device (it will be clean up on the roll back). I can be done by
 checking the state in vhost_user_..._disconnect routine or smth like it



Any issue you saw just using the aio bh as Michael posted above.

Then we don't need to deal with the silent vhost_dev_stop() and we will 
have codes that is much more easier to understand.


Thank



   - vhost-user command returns error back to the _start() routine
   - Rollback in one place in the start() routine, by calling this
 postphoned clean up for the disconnect

Re: [PATCH v2 4/5] vhost: check vring address before calling unmap

2020-05-14 Thread Jason Wang




On 2020/5/13 下午5:36, Dima Stepanov wrote:

On Wed, May 13, 2020 at 11:00:38AM +0800, Jason Wang wrote:

On 2020/5/12 下午5:08, Dima Stepanov wrote:

On Tue, May 12, 2020 at 11:26:11AM +0800, Jason Wang wrote:

On 2020/5/11 下午5:11, Dima Stepanov wrote:

On Mon, May 11, 2020 at 11:05:58AM +0800, Jason Wang wrote:

On 2020/4/30 下午9:36, Dima Stepanov wrote:

Since disconnect can happen at any time during initialization not all
vring buffers (for instance used vring) can be intialized successfully.
If the buffer was not initialized then vhost_memory_unmap call will lead
to SIGSEGV. Add checks for the vring address value before calling unmap.
Also add assert() in the vhost_memory_unmap() routine.

Signed-off-by: Dima Stepanov
---
  hw/virtio/vhost.c | 27 +--
  1 file changed, 21 insertions(+), 6 deletions(-)

diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index ddbdc53..3ee50c4 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -314,6 +314,8 @@ static void vhost_memory_unmap(struct vhost_dev *dev, void 
*buffer,
 hwaddr len, int is_write,
 hwaddr access_len)
  {
+assert(buffer);
+
  if (!vhost_dev_has_iommu(dev)) {
  cpu_physical_memory_unmap(buffer, len, is_write, access_len);
  }
@@ -1132,12 +1134,25 @@ static void vhost_virtqueue_stop(struct vhost_dev *dev,
  vhost_vq_index);
  }
-vhost_memory_unmap(dev, vq->used, virtio_queue_get_used_size(vdev, idx),
-   1, virtio_queue_get_used_size(vdev, idx));
-vhost_memory_unmap(dev, vq->avail, virtio_queue_get_avail_size(vdev, idx),
-   0, virtio_queue_get_avail_size(vdev, idx));
-vhost_memory_unmap(dev, vq->desc, virtio_queue_get_desc_size(vdev, idx),
-   0, virtio_queue_get_desc_size(vdev, idx));
+/*
+ * Since the vhost-user disconnect can happen during initialization
+ * check if vring was initialized, before making unmap.
+ */
+if (vq->used) {
+vhost_memory_unmap(dev, vq->used,
+   virtio_queue_get_used_size(vdev, idx),
+   1, virtio_queue_get_used_size(vdev, idx));
+}
+if (vq->avail) {
+vhost_memory_unmap(dev, vq->avail,
+   virtio_queue_get_avail_size(vdev, idx),
+   0, virtio_queue_get_avail_size(vdev, idx));
+}
+if (vq->desc) {
+vhost_memory_unmap(dev, vq->desc,
+   virtio_queue_get_desc_size(vdev, idx),
+   0, virtio_queue_get_desc_size(vdev, idx));
+}

Any reason not checking hdev->started instead? vhost_dev_start() will set it
to true if virtqueues were correctly mapped.

Thanks

Well i see it a little bit different:
  - vhost_dev_start() sets hdev->started to true before starting
virtqueues
  - vhost_virtqueue_start() maps all the memory
If we hit the vhost disconnect at the start of the
vhost_virtqueue_start(), for instance for this call:
   r = dev->vhost_ops->vhost_set_vring_base(dev, &state);
Then we will call vhost_user_blk_disconnect:
   vhost_user_blk_disconnect()->
 vhost_user_blk_stop()->
   vhost_dev_stop()->
 vhost_virtqueue_stop()
As a result we will come in this routine with the hdev->started still
set to true, but if used/avail/desc fields still uninitialized and set
to 0.

I may miss something, but consider both vhost_dev_start() and
vhost_user_blk_disconnect() were serialized in main loop. Can this really
happen?

Yes, consider the case when we start the vhost-user-blk device:
   vhost_dev_start->
 vhost_virtqueue_start
And we got a disconnect in the middle of vhost_virtqueue_start()
routine, for instance:
   1000 vq->num = state.num = virtio_queue_get_num(vdev, idx);
   1001 r = dev->vhost_ops->vhost_set_vring_num(dev, &state);
   1002 if (r) {
   1003 VHOST_OPS_DEBUG("vhost_set_vring_num failed");
   1004 return -errno;
   1005 }
   --> Here we got a disconnect <--
   1006
   1007 state.num = virtio_queue_get_last_avail_idx(vdev, idx);
   1008 r = dev->vhost_ops->vhost_set_vring_base(dev, &state);
   1009 if (r) {
   1010 VHOST_OPS_DEBUG("vhost_set_vring_base failed");
   1011 return -errno;
   1012 }
As a result call to vhost_set_vring_base will call the disconnect
routine. The backtrace log for SIGSEGV is as follows:
   Thread 4 "qemu-system-x86" received signal SIGSEGV, Segmentation fault.
   [Switching to Thread 0x72ea9700 (LWP 183150)]
   0x74d60840 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
   (gdb) bt
   #0  0x74d60840 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
   #1  0x5590fd90 in flatview_write_continue (fv=0x7fffec4a2600,
   addr=0, attrs=..., ptr=0x0, len=1028, addr1=0,
   l=1028, mr=0x56b1b310) at ./exec.c:3142
   #2  0x5590fe98 in flatview_writ

Re: [RFC v2] migration: Add migrate-set-bitmap-node-mapping

2020-05-14 Thread Max Reitz

On 13.05.20 18:11, Eric Blake wrote:
> On 5/13/20 9:56 AM, Max Reitz wrote:
>> This command allows mapping block node names to aliases for the purpose
>> of block dirty bitmap migration.
>>
>> This way, management tools can use different node names on the source
>> and destination and pass the mapping of how bitmaps are to be
>> transferred to qemu (on the source, the destination, or even both with
>> arbitrary aliases in the migration stream).
>>
>> Suggested-by: Vladimir Sementsov-Ogievskiy 
>> Signed-off-by: Max Reitz 
>> ---
> 
>> @@ -713,6 +731,44 @@ static bool dirty_bitmap_has_postcopy(void *opaque)
>>   return true;
>>   }
>>   +void
>> qmp_migrate_set_bitmap_node_mapping(MigrationBlockNodeMappingList
>> *mapping,
>> + Error **errp)
>> +{
>> +    QDict *in_mapping = qdict_new();
>> +    QDict *out_mapping = qdict_new();
>> +
>> +    for (; mapping; mapping = mapping->next) {
>> +    MigrationBlockNodeMapping *entry = mapping->value;
>> +
>> +    if (qdict_haskey(out_mapping, entry->node_name)) {
>> +    error_setg(errp, "Cannot map node name '%s' twice",
>> +   entry->node_name);
>> +    goto fail;
>> +    }
> 
> Can we call this command more than once?  Is it cumulative (call it once
> to set mapping for "a", second time to also set mapping for "b"), or
> should it reset (second call wipes out all mappings from first call, any
> mappings that must exist must be passed in the final call)?

I tried to make it clear in the documentation:

> +# @mapping: The mapping; must be one-to-one, but not necessarily
> +#   complete.  Any mapping not given will be reset to the
> +#   default (i.e. the identity mapping).

So everything that isn’t set in the second call is reset.  I thought
about what you proposed (because I guess that’s the most intuitive
idea), but after consideration I didn’t see why we’d need different
behavior, so it would only serve to make the code more complicated.

Max

> The idea makes sense, and the interface seems usable.  It's nice that
> either source, destination, or both sides of migration can use it (which
> helps in upgrade vs. downgrade scenarios).



signature.asc
Description: OpenPGP digital signature

86 matches

Mail list logo