multiple connections per block device with NBD?

2022-02-01 Thread Sam
Hello,

I am wondering whether NBD integration of qemu is able to establish multiple 
connections to remote NBD server for a single block device. nbd-client has this 
option:

-C, --connections
Use num connections to the server, to allow speeding up request handling, at 
the cost of higher resource usage on the server. Use of this option requires 
kernel support available first with Linux 4.9.

I am looking for equivalent functionality in qemu's direct NBD access as 
multiple connections in nbd-client does increase speeds many folds.

Regards,
Sam



Re: [PATCH v4 4/4] python/aqmp: add socket bind step to legacy.py

2022-02-01 Thread Kevin Wolf
Am 01.02.2022 um 19:32 hat John Snow geschrieben:
> On Tue, Feb 1, 2022 at 8:21 AM Kevin Wolf  wrote:
> >
> > Am 01.02.2022 um 05:11 hat John Snow geschrieben:
> > > The synchronous QMP library would bind to the server address during
> > > __init__(). The new library delays this to the accept() call, because
> > > binding occurs inside of the call to start_[unix_]server(), which is an
> > > async method -- so it cannot happen during __init__ anymore.
> > >
> > > Python 3.7+ adds the ability to create the server (and thus the bind()
> > > call) and begin the active listening in separate steps, but we don't
> > > have that functionality in 3.6, our current minimum.
> > >
> > > Therefore ... Add a temporary workaround that allows the synchronous
> > > version of the client to bind the socket in advance, guaranteeing that
> > > there will be a UNIX socket in the filesystem ready for the QEMU client
> > > to connect to without a race condition.
> > >
> > > (Yes, it's a bit ugly. Fixing it more nicely will have to wait until our
> > > minimum Python version is 3.7+.)
> > >
> > > Signed-off-by: John Snow 
> > > ---
> > >  python/qemu/aqmp/legacy.py   |  3 +++
> > >  python/qemu/aqmp/protocol.py | 41 +---
> > >  2 files changed, 41 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/python/qemu/aqmp/legacy.py b/python/qemu/aqmp/legacy.py
> > > index 0890f95b16..6baa5f3409 100644
> > > --- a/python/qemu/aqmp/legacy.py
> > > +++ b/python/qemu/aqmp/legacy.py
> > > @@ -56,6 +56,9 @@ def __init__(self, address: SocketAddrT,
> > >  self._address = address
> > >  self._timeout: Optional[float] = None
> > >
> > > +if server:
> > > +self._aqmp._bind_hack(address)  # pylint: 
> > > disable=protected-access
> >
> > I feel that this is the only part that really makes it ugly. Do you
> > really think this way is so bad that we can't make it an official public
> > interface in the library?
> >
> > Kevin
> >
> 
> Good question.
> 
> I felt like I'd rather use the 'start_serving' parameter of
> loop.create_server(...), added in python 3.7; see
> https://docs.python.org/3/library/asyncio-eventloop.html#asyncio.loop.create_server
> Python 3.6 is already EOL, but we still depend on it for our build and
> I wasn't prepared to write the series that forces us on to 3.7,
> because RHEL uses 3.6 as its native python. I'll have to update the
> docker files, etc -- and I'm sure people will be kind of unhappy with
> this, so I am putting it off. People were unhappy enough with the move
> to Python 3.6.

Yes, I understand. In theory - I haven't really thought much about it -
it feels like whether you create a separate socket in a first step and
pass it to the server that is created in the second step (for 3.6) or
you start an idle server in the first step and then let it start serving
in the second step (for 3.7+) should be an implementation detail if we
get the external API right.

> I also felt as though the async version has no real need for a
> separate bind step -- you can just start the server in a coroutine and
> wait until it yields, then proceed to launch QEMU. It's easy in that
> paradigm. If this bind step is only for the benefit of the legacy.py
> interface, I thought maybe it wasn't wise to commit to supporting it
> if it was something I wanted to get rid of soon anyway. There's also
> the ugliness that if you use the early bind step, the arguments passed
> to accept() are now ignored, which is kind of ugly, too. It's not a
> *great* interface. It doesn't spark joy.

Right, that's probably a sign that the interfaces aren't completely
right. I'm not sure which interfaces. As long as it's just _bind_hack()
and it's only used in an API that is going away in the future, that's no
problem. But it could also be a sign that the public interfaces aren't
flexible enough.

> I have some patches that are building a "sync.py" module that's meant
> to replace "legacy.py", and it's in the development of that module
> that I expect to either remove the bits I am unhappy with, or commit
> to supporting necessary infrastructure that's just simply required for
> a functional synchronous interface to work. I planned to start
> versioning the "qemu.qmp" package at 0.0.1, and the version that drops
> legacy.py I intend to version at 0.1.0.

If I understand these version numbers correctly, this also implies that
there is no compatibility promise yet. So maybe what to make a public
interface isn't even a big concern yet.

I guess the relevant question in the context of this patch is whether
sync.py will need a similar two-phase setup as legacy.py. Do you think
you can do without it when you have to reintroduce this behaviour here
to fix bugs?

> All that said, I'm fairly wishy-washy on it, so if you have some
> strong feelings, lemme know. There may be some real utility in just
> doubling down on always creating our own socket object. I just haven't
> thought through everything 

Re: [PATCH v2 15/25] block: fix FreeBSD build failure with fallocate

2022-02-01 Thread Kevin Wolf
Am 01.02.2022 um 19:20 hat Alex Bennée geschrieben:
> We already use the CONFIG_FALLOCATE_PUNCH_HOLE symbol elsewhere in the
> code so use it here.
> 
> Fixes: 4ca37a96a7 ("fuse: (Partially) implement fallocate()")
> Cc: Hanna Reitz 
> Signed-off-by: Alex Bennée 

I think this addresses the same issue as Phil's series?

https://lists.gnu.org/archive/html/qemu-block/2022-02/msg00016.html

It's in the pull request I sent earlier today.

Kevin




Re: [PULL 00/24] Block patches

2022-02-01 Thread Peter Maydell
On Tue, 1 Feb 2022 at 14:42, Hanna Reitz  wrote:
>
> The following changes since commit 804b30d25f8d70dc2dea951883ea92235274a50c:
>
>   Merge remote-tracking branch 'remotes/legoater/tags/pull-ppc-20220130' into 
> staging (2022-01-31 11:10:08 +)
>
> are available in the Git repository at:
>
>   https://gitlab.com/hreitz/qemu.git tags/pull-block-2022-02-01
>
> for you to fetch changes up to 751486c18555169ca4baf59440275d5831140822:
>
>   block.h: remove outdated comment (2022-02-01 13:28:53 +0100)
>
> 
> Block patches:
> - Add support to the iotests to test qcow2's zstd compression mode
> - Fix post-migration block node permissions
> - iotests fixes (051 and mirror-ready-cancel-error)
> - Remove an outdated comment
>


Applied, thanks.

Please update the changelog at https://wiki.qemu.org/ChangeLog/7.0
for any user-visible changes.

-- PMM



Re: [PATCH v4 1/4] python/aqmp: Fix negotiation with pre-"oob" QEMU

2022-02-01 Thread John Snow
On Tue, Feb 1, 2022 at 7:59 AM Kevin Wolf  wrote:
>
> Am 01.02.2022 um 05:11 hat John Snow geschrieben:
> > QEMU versions prior to the "oob" capability *also* can't accept the
> > "enable" keyword argument at all. Fix the handshake process with older
> > QEMU versions.
> >
> > Signed-off-by: John Snow 
> > Reviewed-by: Hanna Reitz 
> > ---
> >  python/qemu/aqmp/qmp_client.py | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/python/qemu/aqmp/qmp_client.py b/python/qemu/aqmp/qmp_client.py
> > index f1a845cc82..90a8737f03 100644
> > --- a/python/qemu/aqmp/qmp_client.py
> > +++ b/python/qemu/aqmp/qmp_client.py
> > @@ -292,9 +292,9 @@ async def _negotiate(self) -> None:
> >  """
> >  self.logger.debug("Negotiating capabilities ...")
> >
> > -arguments: Dict[str, List[str]] = {'enable': []}
> > +arguments: Dict[str, List[str]] = {}
> >  if self._greeting and 'oob' in self._greeting.QMP.capabilities:
> > -arguments['enable'].append('oob')
> > +arguments.setdefault('enable', []).append('oob')
> >  msg = self.make_execute_msg('qmp_capabilities', 
> > arguments=arguments)
>
> In case you have some interest in bike sheds:
>
> As long as we only ever append a single capability, it doesn't really
> make a difference and an explicit setdefault() when adding it is fine.
> But if we had more than one, maybe making arguments a defaultdict(list)
> would be nicer.
>
> Not worth respinning, of course, if you don't for another reason.
>
> Kevin
>

Nope, no reason. I just forget that there's a fancier doodad. I'll add
a patch to a less-important series to shine this up.

--js




[PATCH v2 15/25] block: fix FreeBSD build failure with fallocate

2022-02-01 Thread Alex Bennée
We already use the CONFIG_FALLOCATE_PUNCH_HOLE symbol elsewhere in the
code so use it here.

Fixes: 4ca37a96a7 ("fuse: (Partially) implement fallocate()")
Cc: Hanna Reitz 
Signed-off-by: Alex Bennée 
---
 block/export/fuse.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/block/export/fuse.c b/block/export/fuse.c
index 6710d8aed8..7ed69c4a05 100644
--- a/block/export/fuse.c
+++ b/block/export/fuse.c
@@ -625,6 +625,7 @@ static void fuse_fallocate(fuse_req_t req, fuse_ino_t 
inode, int mode,
 return;
 }
 
+#ifdef CONFIG_FALLOCATE_PUNCH_HOLE
 if (mode & FALLOC_FL_KEEP_SIZE) {
 length = MIN(length, blk_len - offset);
 }
@@ -643,6 +644,7 @@ static void fuse_fallocate(fuse_req_t req, fuse_ino_t 
inode, int mode,
 length -= size;
 } while (ret == 0 && length > 0);
 }
+#endif
 #ifdef CONFIG_FALLOCATE_ZERO_RANGE
 else if (mode & FALLOC_FL_ZERO_RANGE) {
 if (!(mode & FALLOC_FL_KEEP_SIZE) && offset + length > blk_len) {
-- 
2.30.2




Re: [PATCH v4 0/4] Python: Improvements for iotest 040,041

2022-02-01 Thread John Snow
On Tue, Feb 1, 2022 at 8:28 AM Kevin Wolf  wrote:
>
> Am 01.02.2022 um 05:11 hat John Snow geschrieben:
> > GitLab: https://gitlab.com/jsnow/qemu/-/commits/python-aqmp-fixes
> > CI: https://gitlab.com/jsnow/qemu/-/pipelines/455146881
> >
> > Fixes and improvements all relating to "iotest 040,041, intermittent
> > failure in netbsd VM"
> > https://lists.gnu.org/archive/html/qemu-devel/2022-01/msg01975.html
> >
> > See each patch for details.
>
> Thanks, the new output when QEMU fails to start looks really useful!
>
> The only thing we could probably still improve is detecting that the
> QEMU process has exited instead of waiting for the socket connection to
> time out. But since it only happens in case of failure, the additional
> seconds of waiting are probably only a bit annoying for debugging, but
> not a big problem.

That's absolutely on my radar, I assure you!

It's something that is easy to solve with asyncio:

async def launch(self, ...):
task1 = asyncio.create_task(self.qmp.accept(...))
task2 = asyncio.create_subprocess_exec(...)
ret = asyncio.wait_for((task1, task2))
...

If either task raises, then the wait_for will also end prematurely
(and cancel the other task). I'm sure it won't actually be this easy,
but that's the general idea.

Though, that's a pretty big upheaval to the Python code again, so it's
not something I can jam in quickly. I have some light sketches that
examine a "what-if" for an asyncio-native machine.py using the asyncio
QMP code, but there are some design concerns there -- if I go through
the effort of doing this, I will want to publish that python package
upstream as well, and if I do that, it needs to be carefully thought
through in terms of a supportable interface.

Well, I'm thinking about it, anyway. For right now I am interested in
getting the qemu.qmp project out the door and onto PyPI.org, and then
I'll worry about machine improvements. If in the meantime you have a
good idea for how to fix up the machine.py code we already have, I'm
happy to take patches. Otherwise, I'll try to get to it "soon".

>
> Reviewed-by: Kevin Wolf 
>

Thanks!




Re: [PATCH v4 4/4] python/aqmp: add socket bind step to legacy.py

2022-02-01 Thread John Snow
On Tue, Feb 1, 2022 at 8:21 AM Kevin Wolf  wrote:
>
> Am 01.02.2022 um 05:11 hat John Snow geschrieben:
> > The synchronous QMP library would bind to the server address during
> > __init__(). The new library delays this to the accept() call, because
> > binding occurs inside of the call to start_[unix_]server(), which is an
> > async method -- so it cannot happen during __init__ anymore.
> >
> > Python 3.7+ adds the ability to create the server (and thus the bind()
> > call) and begin the active listening in separate steps, but we don't
> > have that functionality in 3.6, our current minimum.
> >
> > Therefore ... Add a temporary workaround that allows the synchronous
> > version of the client to bind the socket in advance, guaranteeing that
> > there will be a UNIX socket in the filesystem ready for the QEMU client
> > to connect to without a race condition.
> >
> > (Yes, it's a bit ugly. Fixing it more nicely will have to wait until our
> > minimum Python version is 3.7+.)
> >
> > Signed-off-by: John Snow 
> > ---
> >  python/qemu/aqmp/legacy.py   |  3 +++
> >  python/qemu/aqmp/protocol.py | 41 +---
> >  2 files changed, 41 insertions(+), 3 deletions(-)
> >
> > diff --git a/python/qemu/aqmp/legacy.py b/python/qemu/aqmp/legacy.py
> > index 0890f95b16..6baa5f3409 100644
> > --- a/python/qemu/aqmp/legacy.py
> > +++ b/python/qemu/aqmp/legacy.py
> > @@ -56,6 +56,9 @@ def __init__(self, address: SocketAddrT,
> >  self._address = address
> >  self._timeout: Optional[float] = None
> >
> > +if server:
> > +self._aqmp._bind_hack(address)  # pylint: 
> > disable=protected-access
>
> I feel that this is the only part that really makes it ugly. Do you
> really think this way is so bad that we can't make it an official public
> interface in the library?
>
> Kevin
>

Good question.

I felt like I'd rather use the 'start_serving' parameter of
loop.create_server(...), added in python 3.7; see
https://docs.python.org/3/library/asyncio-eventloop.html#asyncio.loop.create_server
Python 3.6 is already EOL, but we still depend on it for our build and
I wasn't prepared to write the series that forces us on to 3.7,
because RHEL uses 3.6 as its native python. I'll have to update the
docker files, etc -- and I'm sure people will be kind of unhappy with
this, so I am putting it off. People were unhappy enough with the move
to Python 3.6.

I also felt as though the async version has no real need for a
separate bind step -- you can just start the server in a coroutine and
wait until it yields, then proceed to launch QEMU. It's easy in that
paradigm. If this bind step is only for the benefit of the legacy.py
interface, I thought maybe it wasn't wise to commit to supporting it
if it was something I wanted to get rid of soon anyway. There's also
the ugliness that if you use the early bind step, the arguments passed
to accept() are now ignored, which is kind of ugly, too. It's not a
*great* interface. It doesn't spark joy.

I have some patches that are building a "sync.py" module that's meant
to replace "legacy.py", and it's in the development of that module
that I expect to either remove the bits I am unhappy with, or commit
to supporting necessary infrastructure that's just simply required for
a functional synchronous interface to work. I planned to start
versioning the "qemu.qmp" package at 0.0.1, and the version that drops
legacy.py I intend to version at 0.1.0.

All that said, I'm fairly wishy-washy on it, so if you have some
strong feelings, lemme know. There may be some real utility in just
doubling down on always creating our own socket object. I just haven't
thought through everything here, admittedly.

--js




[PATCH v2 08/25] drop libxml2 checks since libxml is not actually used (for parallels)

2022-02-01 Thread Alex Bennée
From: Michael Tokarev 

For a long time, we assumed that libxml2 is necessary for parallels
block format support (block/parallels*). However, this format actually
does not use libxml [*]. Since this is the only user of libxml2 in
whole QEMU tree, we can drop all libxml2 checks and dependencies too.

It is even more: --enable-parallels configure option was the only
option which was silently ignored when it's (fake) dependency
(libxml2) isn't installed.

Drop all mentions of libxml2.

[*] Actually the basis for libxml use were introduced in commit
ed279a06c53 ("configure: add dependency") but the implementation
was never merged:

https://lore.kernel.org/qemu-devel/70227bbd-a517-70e9-714f-e6e0ec431...@openvz.org/

Signed-off-by: Michael Tokarev 
Reviewed-by: Stefan Hajnoczi 
Message-Id: <20220119090423.149315-1-...@msgid.tls.msk.ru>
Tested-by: Philippe Mathieu-Daudé 
Reviewed-by: Philippe Mathieu-Daudé 
[PMD: Updated description and adapted to use lcitool]
Reviewed-by: Thomas Huth 
Signed-off-by: Philippe Mathieu-Daudé 
Signed-off-by: Alex Bennée 
Message-Id: <20220121154134.315047-5-f4...@amsat.org>
Reviewed-by: Vladimir Sementsov-Ogievskiy 
Message-Id: <20220124201608.604599-9-alex.ben...@linaro.org>
---
 meson.build | 6 --
 block/meson.build   | 3 +--
 meson_options.txt   | 2 --
 scripts/checkpatch.pl   | 1 -
 scripts/ci/org.centos/stream/8/x86_64/configure | 1 -
 scripts/coverity-scan/coverity-scan.docker  | 1 -
 scripts/coverity-scan/run-coverity-scan | 2 +-
 scripts/meson-buildoptions.sh   | 3 ---
 8 files changed, 2 insertions(+), 17 deletions(-)

diff --git a/meson.build b/meson.build
index 5f43355071..82db1e7e74 100644
--- a/meson.build
+++ b/meson.build
@@ -453,11 +453,6 @@ if not get_option('linux_io_uring').auto() or have_block
   required: get_option('linux_io_uring'),
   method: 'pkg-config', kwargs: static_kwargs)
 endif
-libxml2 = not_found
-if not get_option('libxml2').auto() or have_block
-  libxml2 = dependency('libxml-2.0', required: get_option('libxml2'),
-   method: 'pkg-config', kwargs: static_kwargs)
-endif
 libnfs = not_found
 if not get_option('libnfs').auto() or have_block
   libnfs = dependency('libnfs', version: '>=1.9.3',
@@ -3496,7 +3491,6 @@ summary_info += {'bzip2 support': libbzip2}
 summary_info += {'lzfse support': liblzfse}
 summary_info += {'zstd support':  zstd}
 summary_info += {'NUMA host support': config_host.has_key('CONFIG_NUMA')}
-summary_info += {'libxml2':   libxml2}
 summary_info += {'capstone':  capstone_opt == 'internal' ? 
capstone_opt : capstone}
 summary_info += {'libpmem support':   libpmem}
 summary_info += {'libdaxctl support': libdaxctl}
diff --git a/block/meson.build b/block/meson.build
index deb73ca389..90dc9983e5 100644
--- a/block/meson.build
+++ b/block/meson.build
@@ -58,8 +58,7 @@ block_ss.add(when: 'CONFIG_QED', if_true: files(
   'qed-table.c',
   'qed.c',
 ))
-block_ss.add(when: [libxml2, 'CONFIG_PARALLELS'],
- if_true: files('parallels.c', 'parallels-ext.c'))
+block_ss.add(when: 'CONFIG_PARALLELS', if_true: files('parallels.c', 
'parallels-ext.c'))
 block_ss.add(when: 'CONFIG_WIN32', if_true: files('file-win32.c', 
'win32-aio.c'))
 block_ss.add(when: 'CONFIG_POSIX', if_true: [files('file-posix.c'), coref, 
iokit])
 block_ss.add(when: libiscsi, if_true: files('iscsi-opts.c'))
diff --git a/meson_options.txt b/meson_options.txt
index 921967eddb..95d527f773 100644
--- a/meson_options.txt
+++ b/meson_options.txt
@@ -113,8 +113,6 @@ option('libudev', type : 'feature', value : 'auto',
description: 'Use libudev to enumerate host devices')
 option('libusb', type : 'feature', value : 'auto',
description: 'libusb support for USB passthrough')
-option('libxml2', type : 'feature', value : 'auto',
-   description: 'libxml2 support for Parallels image format')
 option('linux_aio', type : 'feature', value : 'auto',
description: 'Linux AIO support')
 option('linux_io_uring', type : 'feature', value : 'auto',
diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index 5caa739db4..5e50111060 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -307,7 +307,6 @@ our @typeList = (
qr{target_(?:u)?long},
qr{hwaddr},
 # external libraries
-   qr{xml${Ident}},
qr{xen\w+_handle},
# Glib definitions
qr{gchar},
diff --git a/scripts/ci/org.centos/stream/8/x86_64/configure 
b/scripts/ci/org.centos/stream/8/x86_64/configure
index e05f2fddcc..9850dd 100755
--- a/scripts/ci/org.centos/stream/8/x86_64/configure
+++ b/scripts/ci/org.centos/stream/8/x86_64/configure
@@ -81,7 +81,6 @@
 --disable-libssh \
 --disable-libudev \
 --disable-libusb \
---disable-libxml2 \
 --disable-linux-aio \
 

Re: [PATCH v8 1/5] QIOChannel: Add flags on io_writev and introduce io_flush callback

2022-02-01 Thread Leonardo Bras Soares Passos
Hello Daniel, thanks for reviewing!

On Tue, Feb 1, 2022 at 6:35 AM Daniel P. Berrangé  wrote:
>
> On Tue, Feb 01, 2022 at 03:28:59AM -0300, Leonardo Bras wrote:
> > Add flags to io_writev and introduce io_flush as optional callback to
> > QIOChannelClass, allowing the implementation of zero copy writes by
> > subclasses.
> >
> > How to use them:
> > - Write data using 
> > qio_channel_writev*(...,QIO_CHANNEL_WRITE_FLAG_ZERO_COPY),
> > - Wait write completion with qio_channel_flush().
> >
> > Notes:
> > As some zero copy write implementations work asynchronously, it's
> > recommended to keep the write buffer untouched until the return of
> > qio_channel_flush(), to avoid the risk of sending an updated buffer
> > instead of the buffer state during write.
> >
> > As io_flush callback is optional, if a subclass does not implement it, then:
> > - io_flush will return 0 without changing anything.
> >
> > Also, some functions like qio_channel_writev_full_all() were adapted to
> > receive a flag parameter. That allows shared code between zero copy and
> > non-zero copy writev, and also an easier implementation on new flags.
> >
> > Signed-off-by: Leonardo Bras 
> > ---
> >  include/io/channel.h| 38 -
> >  chardev/char-io.c   |  2 +-
> >  hw/remote/mpqemu-link.c |  2 +-
> >  io/channel-buffer.c |  1 +
> >  io/channel-command.c|  1 +
> >  io/channel-file.c   |  1 +
> >  io/channel-socket.c |  2 ++
> >  io/channel-tls.c|  1 +
> >  io/channel-websock.c|  1 +
> >  io/channel.c| 53 +++--
> >  migration/rdma.c|  1 +
> >  scsi/pr-manager-helper.c|  2 +-
> >  tests/unit/test-io-channel-socket.c |  1 +
> >  13 files changed, 92 insertions(+), 14 deletions(-)
> >
> > diff --git a/io/channel.c b/io/channel.c
> > index e8b019dc36..b8b99fdc4c 100644
> > --- a/io/channel.c
> > +++ b/io/channel.c
> > @@ -72,18 +72,32 @@ ssize_t qio_channel_writev_full(QIOChannel *ioc,
> >  size_t niov,
> >  int *fds,
> >  size_t nfds,
> > +int flags,
> >  Error **errp)
> >  {
> >  QIOChannelClass *klass = QIO_CHANNEL_GET_CLASS(ioc);
> >
> > -if ((fds || nfds) &&
> > -!qio_channel_has_feature(ioc, QIO_CHANNEL_FEATURE_FD_PASS)) {
> > +if (fds || nfds) {
> > +if (!qio_channel_has_feature(ioc, QIO_CHANNEL_FEATURE_FD_PASS)) {
> > +error_setg_errno(errp, EINVAL,
> > + "Channel does not support file descriptor 
> > passing");
> > +return -1;
> > +}
> > +if (flags & QIO_CHANNEL_WRITE_FLAG_ZERO_COPY) {
> > +error_setg_errno(errp, EINVAL,
> > + "Zero Copy does not support file descriptor 
> > passing");
> > +return -1;
> > +}
>
> Here you gracefully reject FD passing when zero copy is requested
> which is good.
>
> > +}
> > +
>
> > @@ -235,10 +249,16 @@ int qio_channel_writev_full_all(QIOChannel *ioc,
> >iov, niov,
> >0, iov_size(iov, niov));
> >
> > +if (flags & QIO_CHANNEL_WRITE_FLAG_ZERO_COPY) {
> > +assert(fds == NULL && nfds == 0);
> > +}
>
> But here you  abort QEMU if FD passing is requested when zero copy
> is set.
>
> AFAICT, if you just delete this assert, the code to gracefully
> report errors will do the right thing.

Yeah, thatś right. This test is unnecessary since qio_channel_writev_full()
will be called and will return error if fds + zerocopy happens.

Good catch!

>
> Without the assert:
>
>   Reviewed-by: Daniel P. Berrangé 
>

Thanks!
I will wait for more feedback on other patches before sending the v9,
but it should not take too long this time.

Best regards,
Leo




Re: [RFC] block/nbd: Move s->ioc on AioContext change

2022-02-01 Thread Vladimir Sementsov-Ogievskiy

01.02.2022 19:14, Hanna Reitz wrote:

reconnect_delay_timer should exist only during IO request: it's created during 
request if we don't have a connection. And request will not finish until timer 
elapsed or connection established (timer should be removed in this case too). 
So, again, when attaching / detaching the context we should be in a drained 
sections, so no in-flight requests and no reconnect_delay_timer.


Got it.  FWIW, other block drivers rely on this, too (e.g. null-aio with 
latency-ns set creates a timer in every I/O request and settles the request 
once the timer expires).


Looks like the timer isn’t removed when the connection is reestablished.


Oops. And that's wrong.. If connection lost again when old timer is still not 
fired, assertion in reconnect_delay_timer_init() will fail.

I think we just need reconnect_delay_timer_del() call at the end of 
nbd_reconnect_attempt() function (after nbd_co_do_establish_connection() call).


--
Best regards,
Vladimir



Re: [RFC] block/nbd: Move s->ioc on AioContext change

2022-02-01 Thread Hanna Reitz

On 01.02.22 12:40, Hanna Reitz wrote:

On 01.02.22 12:18, Vladimir Sementsov-Ogievskiy wrote:

28.01.2022 18:51, Hanna Reitz wrote:

s->ioc must always be attached to the NBD node's AioContext.  If that
context changes, s->ioc must be attached to the new context.

Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1990835
Signed-off-by: Hanna Reitz 
---
This is an RFC because I believe there are some other things in the NBD
block driver that need attention on an AioContext change, too. Namely,
there are two timers (reconnect_delay_timer and open_timer) that are
also attached to the node's AioContext, and I'm afraid they need to be
handled, too.  Probably pause them on detach, and resume them on 
attach,

but I'm not sure, which is why I'm posting this as an RFC to get some
comments from that from someone who knows this code better than me. :)

(Also, in a real v1, of course I'd want to add a regression test.)
---
  block/nbd.c | 28 
  1 file changed, 28 insertions(+)

diff --git a/block/nbd.c b/block/nbd.c
index 63dbfa807d..119a774c04 100644
--- a/block/nbd.c
+++ b/block/nbd.c
@@ -2036,6 +2036,25 @@ static void 
nbd_cancel_in_flight(BlockDriverState *bs)

  nbd_co_establish_connection_cancel(s->conn);
  }
  +static void nbd_attach_aio_context(BlockDriverState *bs,
+   AioContext *new_context)
+{
+    BDRVNBDState *s = bs->opaque;
+
+    if (s->ioc) {
+    qio_channel_attach_aio_context(s->ioc, new_context);
+    }
+}
+
+static void nbd_detach_aio_context(BlockDriverState *bs)
+{
+    BDRVNBDState *s = bs->opaque;
+
+    if (s->ioc) {
+    qio_channel_detach_aio_context(s->ioc);
+    }
+}
+
  static BlockDriver bdrv_nbd = {
  .format_name    = "nbd",
  .protocol_name  = "nbd",
@@ -2059,6 +2078,9 @@ static BlockDriver bdrv_nbd = {
  .bdrv_dirname   = nbd_dirname,
  .strong_runtime_opts    = nbd_strong_runtime_opts,
  .bdrv_cancel_in_flight  = nbd_cancel_in_flight,
+
+    .bdrv_attach_aio_context    = nbd_attach_aio_context,
+    .bdrv_detach_aio_context    = nbd_detach_aio_context,
  };
    static BlockDriver bdrv_nbd_tcp = {
@@ -2084,6 +2106,9 @@ static BlockDriver bdrv_nbd_tcp = {
  .bdrv_dirname   = nbd_dirname,
  .strong_runtime_opts    = nbd_strong_runtime_opts,
  .bdrv_cancel_in_flight  = nbd_cancel_in_flight,
+
+    .bdrv_attach_aio_context    = nbd_attach_aio_context,
+    .bdrv_detach_aio_context    = nbd_detach_aio_context,
  };
    static BlockDriver bdrv_nbd_unix = {
@@ -2109,6 +2134,9 @@ static BlockDriver bdrv_nbd_unix = {
  .bdrv_dirname   = nbd_dirname,
  .strong_runtime_opts    = nbd_strong_runtime_opts,
  .bdrv_cancel_in_flight  = nbd_cancel_in_flight,
+
+    .bdrv_attach_aio_context    = nbd_attach_aio_context,
+    .bdrv_detach_aio_context    = nbd_detach_aio_context,
  };
    static void bdrv_nbd_init(void)




Hmm. I was so happy to remove these handlers together with 
connection-coroutine :) . But you are right, seems I've removed too 
much :(.



open_timer exists only during bdrv_open() handler, so, I hope on 
attach/detach it should not exist.


That’s… kind of surprising.  It’s good for me here, but as far as I 
can see it means that all of qemu blocks until the connection 
succeeds, right?  That doesn’t seem quite ideal...


Anyway, good for me. O:)

reconnect_delay_timer should exist only during IO request: it's 
created during request if we don't have a connection. And request 
will not finish until timer elapsed or connection established (timer 
should be removed in this case too). So, again, when attaching / 
detaching the context we should be in a drained sections, so no 
in-flight requests and no reconnect_delay_timer.


Got it.  FWIW, other block drivers rely on this, too (e.g. null-aio 
with latency-ns set creates a timer in every I/O request and settles 
the request once the timer expires).


Looks like the timer isn’t removed when the connection is 
reestablished.  When I add an `assert(!s->reconnect_delay_timer)` to 
`nbd_attach_aio_context()` (on top of this patch), then I get:


$ ./qemu-nbd \
    --fork \
    --pid-file=/tmp/nbd.pid \
    --socket=/tmp/nbd.sock \
    -f raw \
    null-co://

$ (echo '{"execute": "qmp_capabilities"}';
sleep 1;
kill $(cat /tmp/nbd.pid);
./qemu-nbd \
    --fork \
    --pid-file=/tmp/nbd.pid \
    --socket=/tmp/nbd.sock \
    -f raw \
    null-co://;
echo '{"execute": "human-monitor-command",
   "arguments": {"command-line": "qemu-io nbd \"write 0 64k\""}}';
echo '{"execute": "x-blockdev-set-iothread",
   "arguments": {"node-name": "nbd", "iothread": "iothr0"}}') \
| ./qemu-system-x86_64 \
    -qmp stdio \
    -blockdev '{
    "node-name": "nbd",
    "driver": "nbd",
    "reconnect-delay": 1,
    "server": {
    "type": "unix",
    "path": "/tmp/nbd.sock"
    } }' \
    -object 

[PULL 10/10] block/rbd: workaround for ceph issue #53784

2022-02-01 Thread Kevin Wolf
From: Peter Lieven 

librbd had a bug until early 2022 that affected all versions of ceph that
supported fast-diff. This bug results in reporting of incorrect offsets
if the offset parameter to rbd_diff_iterate2 is not object aligned.

This patch works around this bug for pre Quincy versions of librbd.

Fixes: 0347a8fd4c3faaedf119be04c197804be40a384b
Cc: qemu-sta...@nongnu.org
Signed-off-by: Peter Lieven 
Message-Id: <20220113144426.4036493-3...@kamp.de>
Reviewed-by: Ilya Dryomov 
Reviewed-by: Stefano Garzarella 
Tested-by: Stefano Garzarella 
Signed-off-by: Kevin Wolf 
---
 block/rbd.c | 42 --
 1 file changed, 40 insertions(+), 2 deletions(-)

diff --git a/block/rbd.c b/block/rbd.c
index 20bb896c4a..8f183eba2a 100644
--- a/block/rbd.c
+++ b/block/rbd.c
@@ -1320,6 +1320,7 @@ static int coroutine_fn 
qemu_rbd_co_block_status(BlockDriverState *bs,
 int status, r;
 RBDDiffIterateReq req = { .offs = offset };
 uint64_t features, flags;
+uint64_t head = 0;
 
 assert(offset + bytes <= s->image_size);
 
@@ -1347,7 +1348,43 @@ static int coroutine_fn 
qemu_rbd_co_block_status(BlockDriverState *bs,
 return status;
 }
 
-r = rbd_diff_iterate2(s->image, NULL, offset, bytes, true, true,
+#if LIBRBD_VERSION_CODE < LIBRBD_VERSION(1, 17, 0)
+/*
+ * librbd had a bug until early 2022 that affected all versions of ceph 
that
+ * supported fast-diff. This bug results in reporting of incorrect offsets
+ * if the offset parameter to rbd_diff_iterate2 is not object aligned.
+ * Work around this bug by rounding down the offset to object boundaries.
+ * This is OK because we call rbd_diff_iterate2 with whole_object = true.
+ * However, this workaround only works for non cloned images with default
+ * striping.
+ *
+ * See: https://tracker.ceph.com/issues/53784
+ */
+
+/* check if RBD image has non-default striping enabled */
+if (features & RBD_FEATURE_STRIPINGV2) {
+return status;
+}
+
+#pragma GCC diagnostic push
+#pragma GCC diagnostic ignored "-Wdeprecated-declarations"
+/*
+ * check if RBD image is a clone (= has a parent).
+ *
+ * rbd_get_parent_info is deprecated from Nautilus onwards, but the
+ * replacement rbd_get_parent is not present in Luminous and Mimic.
+ */
+if (rbd_get_parent_info(s->image, NULL, 0, NULL, 0, NULL, 0) != -ENOENT) {
+return status;
+}
+#pragma GCC diagnostic pop
+
+head = req.offs & (s->object_size - 1);
+req.offs -= head;
+bytes += head;
+#endif
+
+r = rbd_diff_iterate2(s->image, NULL, req.offs, bytes, true, true,
   qemu_rbd_diff_iterate_cb, );
 if (r < 0 && r != QEMU_RBD_EXIT_DIFF_ITERATE2) {
 return status;
@@ -1366,7 +1403,8 @@ static int coroutine_fn 
qemu_rbd_co_block_status(BlockDriverState *bs,
 status = BDRV_BLOCK_ZERO | BDRV_BLOCK_OFFSET_VALID;
 }
 
-*pnum = req.bytes;
+assert(req.bytes > head);
+*pnum = req.bytes - head;
 return status;
 }
 
-- 
2.31.1




[PULL 09/10] block/rbd: fix handling of holes in .bdrv_co_block_status

2022-02-01 Thread Kevin Wolf
From: Peter Lieven 

the assumption that we can't hit a hole if we do not diff against a snapshot 
was wrong.

We can see a hole in an image if we diff against base if there exists an older 
snapshot
of the image and we have discarded blocks in the image where the snapshot has 
data.

Fix this by simply handling a hole like an unallocated area. There are no 
callbacks
for unallocated areas so just bail out if we hit a hole.

Fixes: 0347a8fd4c3faaedf119be04c197804be40a384b
Suggested-by: Ilya Dryomov 
Cc: qemu-sta...@nongnu.org
Signed-off-by: Peter Lieven 
Message-Id: <20220113144426.4036493-2...@kamp.de>
Reviewed-by: Ilya Dryomov 
Reviewed-by: Stefano Garzarella 
Signed-off-by: Kevin Wolf 
---
 block/rbd.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/block/rbd.c b/block/rbd.c
index def96292e0..20bb896c4a 100644
--- a/block/rbd.c
+++ b/block/rbd.c
@@ -1279,11 +1279,11 @@ static int qemu_rbd_diff_iterate_cb(uint64_t offs, 
size_t len,
 RBDDiffIterateReq *req = opaque;
 
 assert(req->offs + req->bytes <= offs);
-/*
- * we do not diff against a snapshot so we should never receive a callback
- * for a hole.
- */
-assert(exists);
+
+/* treat a hole like an unallocated area and bail out */
+if (!exists) {
+return 0;
+}
 
 if (!req->exists && offs > req->offs) {
 /*
-- 
2.31.1




[PULL 08/10] qemu-img: Unify [-b [-F]] documentation

2022-02-01 Thread Kevin Wolf
From: Hanna Reitz 

qemu-img convert documents the backing file and backing format options
as follows:
[-B backing_file [-F backing_fmt]]
whereas qemu-img create has this:
[-b backing_file] [-F backing_fmt]

That is, for convert, we document that -F cannot be given without -B,
while for create, way say that they are independent.

Indeed, it is technically possible to give -F without -b, because it is
left to the block driver to decide whether this is an error or not, so
sometimes it is:

$ qemu-img create -f qed -F qed test.qed 64M
Formatting 'test.qed', fmt=qed size=67108864 backing_fmt=qed [...]

And sometimes it is not:

$ qemu-img create -f qcow2 -F qcow2 test.qcow2 64M
Formatting 'test.qcow2', fmt=qcow2 cluster_size=65536 [...]
qemu-img: test.qcow2: Backing format cannot be used without backing file

Generally, it does not make much sense, though, and users should only
give -F with -b, so document it that way, as we have already done for
qemu-img convert (commit 1899bf47375ad40555dcdff12ba49b4b8b82df38).

Reported-by: Tingting Mao 
Signed-off-by: Hanna Reitz 
Message-Id: <20220131135908.32393-1-hre...@redhat.com>
Signed-off-by: Kevin Wolf 
---
 docs/tools/qemu-img.rst | 2 +-
 qemu-img-cmds.hx| 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/docs/tools/qemu-img.rst b/docs/tools/qemu-img.rst
index d663dd92bd..8885ea11cf 100644
--- a/docs/tools/qemu-img.rst
+++ b/docs/tools/qemu-img.rst
@@ -463,7 +463,7 @@ Command description:
   ``--skip-broken-bitmaps`` is also specified to copy only the
   consistent bitmaps.
 
-.. option:: create [--object OBJECTDEF] [-q] [-f FMT] [-b BACKING_FILE] [-F 
BACKING_FMT] [-u] [-o OPTIONS] FILENAME [SIZE]
+.. option:: create [--object OBJECTDEF] [-q] [-f FMT] [-b BACKING_FILE [-F 
BACKING_FMT]] [-u] [-o OPTIONS] FILENAME [SIZE]
 
   Create the new disk image *FILENAME* of size *SIZE* and format
   *FMT*. Depending on the file format, you can add one or more *OPTIONS*
diff --git a/qemu-img-cmds.hx b/qemu-img-cmds.hx
index 72bcdcfbfa..1b1dab5b17 100644
--- a/qemu-img-cmds.hx
+++ b/qemu-img-cmds.hx
@@ -52,9 +52,9 @@ SRST
 ERST
 
 DEF("create", img_create,
-"create [--object objectdef] [-q] [-f fmt] [-b backing_file] [-F 
backing_fmt] [-u] [-o options] filename [size]")
+"create [--object objectdef] [-q] [-f fmt] [-b backing_file [-F 
backing_fmt]] [-u] [-o options] filename [size]")
 SRST
-.. option:: create [--object OBJECTDEF] [-q] [-f FMT] [-b BACKING_FILE] [-F 
BACKING_FMT] [-u] [-o OPTIONS] FILENAME [SIZE]
+.. option:: create [--object OBJECTDEF] [-q] [-f FMT] [-b BACKING_FILE [-F 
BACKING_FMT]] [-u] [-o OPTIONS] FILENAME [SIZE]
 ERST
 
 DEF("dd", img_dd,
-- 
2.31.1




Re: [PATCH v4 0/4] Python: Improvements for iotest 040,041

2022-02-01 Thread Kevin Wolf
Am 01.02.2022 um 05:11 hat John Snow geschrieben:
> GitLab: https://gitlab.com/jsnow/qemu/-/commits/python-aqmp-fixes
> CI: https://gitlab.com/jsnow/qemu/-/pipelines/455146881
> 
> Fixes and improvements all relating to "iotest 040,041, intermittent
> failure in netbsd VM"
> https://lists.gnu.org/archive/html/qemu-devel/2022-01/msg01975.html
> 
> See each patch for details.

Thanks, the new output when QEMU fails to start looks really useful!

The only thing we could probably still improve is detecting that the
QEMU process has exited instead of waiting for the socket connection to
time out. But since it only happens in case of failure, the additional
seconds of waiting are probably only a bit annoying for debugging, but
not a big problem.

Reviewed-by: Kevin Wolf 




Re: [PATCH v4 4/4] python/aqmp: add socket bind step to legacy.py

2022-02-01 Thread Kevin Wolf
Am 01.02.2022 um 05:11 hat John Snow geschrieben:
> The synchronous QMP library would bind to the server address during
> __init__(). The new library delays this to the accept() call, because
> binding occurs inside of the call to start_[unix_]server(), which is an
> async method -- so it cannot happen during __init__ anymore.
> 
> Python 3.7+ adds the ability to create the server (and thus the bind()
> call) and begin the active listening in separate steps, but we don't
> have that functionality in 3.6, our current minimum.
> 
> Therefore ... Add a temporary workaround that allows the synchronous
> version of the client to bind the socket in advance, guaranteeing that
> there will be a UNIX socket in the filesystem ready for the QEMU client
> to connect to without a race condition.
> 
> (Yes, it's a bit ugly. Fixing it more nicely will have to wait until our
> minimum Python version is 3.7+.)
> 
> Signed-off-by: John Snow 
> ---
>  python/qemu/aqmp/legacy.py   |  3 +++
>  python/qemu/aqmp/protocol.py | 41 +---
>  2 files changed, 41 insertions(+), 3 deletions(-)
> 
> diff --git a/python/qemu/aqmp/legacy.py b/python/qemu/aqmp/legacy.py
> index 0890f95b16..6baa5f3409 100644
> --- a/python/qemu/aqmp/legacy.py
> +++ b/python/qemu/aqmp/legacy.py
> @@ -56,6 +56,9 @@ def __init__(self, address: SocketAddrT,
>  self._address = address
>  self._timeout: Optional[float] = None
>  
> +if server:
> +self._aqmp._bind_hack(address)  # pylint: 
> disable=protected-access

I feel that this is the only part that really makes it ugly. Do you
really think this way is so bad that we can't make it an official public
interface in the library?

Kevin




[PULL 03/10] block/export: Fix vhost-user-blk shutdown with requests in flight

2022-02-01 Thread Kevin Wolf
The vhost-user-blk export runs requests asynchronously in their own
coroutine. When the vhost connection goes away and we want to stop the
vhost-user server, we need to wait for these coroutines to stop before
we can unmap the shared memory. Otherwise, they would still access the
unmapped memory and crash.

This introduces a refcount to VuServer which is increased when spawning
a new request coroutine and decreased before the coroutine exits. The
memory is only unmapped when the refcount reaches zero.

Signed-off-by: Kevin Wolf 
Message-Id: <20220125151435.48792-1-kw...@redhat.com>
Signed-off-by: Kevin Wolf 
---
 include/qemu/vhost-user-server.h |  5 +
 block/export/vhost-user-blk-server.c |  5 +
 util/vhost-user-server.c | 22 ++
 3 files changed, 32 insertions(+)

diff --git a/include/qemu/vhost-user-server.h b/include/qemu/vhost-user-server.h
index 121ea1dedf..cd43193b80 100644
--- a/include/qemu/vhost-user-server.h
+++ b/include/qemu/vhost-user-server.h
@@ -42,6 +42,8 @@ typedef struct {
 const VuDevIface *vu_iface;
 
 /* Protected by ctx lock */
+unsigned int refcount;
+bool wait_idle;
 VuDev vu_dev;
 QIOChannel *ioc; /* The I/O channel with the client */
 QIOChannelSocket *sioc; /* The underlying data channel with the client */
@@ -59,6 +61,9 @@ bool vhost_user_server_start(VuServer *server,
 
 void vhost_user_server_stop(VuServer *server);
 
+void vhost_user_server_ref(VuServer *server);
+void vhost_user_server_unref(VuServer *server);
+
 void vhost_user_server_attach_aio_context(VuServer *server, AioContext *ctx);
 void vhost_user_server_detach_aio_context(VuServer *server);
 
diff --git a/block/export/vhost-user-blk-server.c 
b/block/export/vhost-user-blk-server.c
index 1862563336..a129204c44 100644
--- a/block/export/vhost-user-blk-server.c
+++ b/block/export/vhost-user-blk-server.c
@@ -172,6 +172,7 @@ vu_blk_discard_write_zeroes(VuBlkExport *vexp, struct iovec 
*iov,
 return VIRTIO_BLK_S_IOERR;
 }
 
+/* Called with server refcount increased, must decrease before returning */
 static void coroutine_fn vu_blk_virtio_process_req(void *opaque)
 {
 VuBlkReq *req = opaque;
@@ -286,10 +287,12 @@ static void coroutine_fn vu_blk_virtio_process_req(void 
*opaque)
 }
 
 vu_blk_req_complete(req);
+vhost_user_server_unref(server);
 return;
 
 err:
 free(req);
+vhost_user_server_unref(server);
 }
 
 static void vu_blk_process_vq(VuDev *vu_dev, int idx)
@@ -310,6 +313,8 @@ static void vu_blk_process_vq(VuDev *vu_dev, int idx)
 
 Coroutine *co =
 qemu_coroutine_create(vu_blk_virtio_process_req, req);
+
+vhost_user_server_ref(server);
 qemu_coroutine_enter(co);
 }
 }
diff --git a/util/vhost-user-server.c b/util/vhost-user-server.c
index f68287e811..f66fbba710 100644
--- a/util/vhost-user-server.c
+++ b/util/vhost-user-server.c
@@ -74,6 +74,20 @@ static void panic_cb(VuDev *vu_dev, const char *buf)
 error_report("vu_panic: %s", buf);
 }
 
+void vhost_user_server_ref(VuServer *server)
+{
+assert(!server->wait_idle);
+server->refcount++;
+}
+
+void vhost_user_server_unref(VuServer *server)
+{
+server->refcount--;
+if (server->wait_idle && !server->refcount) {
+aio_co_wake(server->co_trip);
+}
+}
+
 static bool coroutine_fn
 vu_message_read(VuDev *vu_dev, int conn_fd, VhostUserMsg *vmsg)
 {
@@ -177,6 +191,14 @@ static coroutine_fn void vu_client_trip(void *opaque)
 /* Keep running */
 }
 
+if (server->refcount) {
+/* Wait for requests to complete before we can unmap the memory */
+server->wait_idle = true;
+qemu_coroutine_yield();
+server->wait_idle = false;
+}
+assert(server->refcount == 0);
+
 vu_deinit(vu_dev);
 
 /* vu_deinit() should have called remove_watch() */
-- 
2.31.1




Re: Block alignment of qcow2 compress driver

2022-02-01 Thread Vladimir Sementsov-Ogievskiy

28.01.2022 14:07, Richard W.M. Jones wrote:

The commands below set up a sparse RAM disk, with an allocated block
at offset 32K and another one at offset 1M-32K.  Then it tries to copy
this to a compressed qcow2 file using qemu-nbd + the qemu compress
filter:

   $ qemu-img create -f qcow2 output.qcow2 1M
   $ qemu-nbd -t --image-opts 
driver=compress,file.driver=qcow2,file.file.driver=file,file.file.filename=output.qcow2
 & sleep 1
   $ nbdkit -U - \
data '@32768 1*32768 @1015808 1*32768' \
--run 'nbdcopy $uri nbd://localhost -p'

The nbdcopy command fails when zeroing the first 32K with:

   nbd://localhost: nbd_aio_zero: request is unaligned: Invalid argument

This is a bug in nbdcopy because it ignores the minimum block size
being correctly declared by the compress filter:

   $ nbdinfo nbd://localhost
   protocol: newstyle-fixed without TLS
   export="":
export-size: 1048576 (1M)
uri: nbd://localhost:10809/
contexts:
   ...
block_size_minimum: 65536  <
block_size_preferred: 65536
block_size_maximum: 33554432

The compress filter sets the minimum block size to the the same as the
qcow2 cluster size here:

   
https://gitlab.com/qemu-project/qemu/-/blob/cfe63e46be0a1f8a7fd2fd5547222f8344a43279/block/filter-compress.c#L117

I patched qemu to force this to 4K:

-bs->bl.request_alignment = bdi.cluster_size;
+//bs->bl.request_alignment = bdi.cluster_size;
+bs->bl.request_alignment = 4096;

and the copy above works, and the output file is compressed!

So my question is, does the compress filter in qemu really need to
declare the large minimum block size?  I'm not especially concerned
about efficiency, I'd prefer it just worked, and changing nbdcopy to
understand block sizes is painful.

Is it already adjustable at run time?  (I tried using --image-opts
like compress.request_alignment=4096 but it seems like the filter
doesn't support anything I could think of, and I don't know how to
list the supported options.)




Hi!

I didn't read the whole thread, so in case it was not mentioned:

There is a limitation about compressed writes in qcow2 driver in Qemu: you 
can't do compressed write to the same cluster twice, the second write will 
fail. So, when we do some copying (or backup) and write cluster by cluster (or 
at least cluster-aligend) everything works. If you write partial cluster (with 
compress filter) it may work due to automatic RMW, but when you than try to 
write second part of same cluster it will fail.


--
Best regards,
Vladimir



[PULL 04/10] block/export/fuse: Rearrange if-else-if ladder in fuse_fallocate()

2022-02-01 Thread Kevin Wolf
From: Philippe Mathieu-Daudé 

In order to safely maintain a mixture of #ifdef'ry with if-else-if
ladder, rearrange the last statement (!mode) first. Since it is
mutually exclusive with the other conditions, checking it first
doesn't make any logical difference, but allows to add #ifdef'ry
around in a more cleanly way.

Suggested-by: Kevin Wolf 
Signed-off-by: Philippe Mathieu-Daudé 
Message-Id: <20220201112655.344373-2-f4...@amsat.org>
Signed-off-by: Kevin Wolf 
---
 block/export/fuse.c | 41 +
 1 file changed, 21 insertions(+), 20 deletions(-)

diff --git a/block/export/fuse.c b/block/export/fuse.c
index 6710d8aed8..d25e478c0a 100644
--- a/block/export/fuse.c
+++ b/block/export/fuse.c
@@ -629,7 +629,26 @@ static void fuse_fallocate(fuse_req_t req, fuse_ino_t 
inode, int mode,
 length = MIN(length, blk_len - offset);
 }
 
-if (mode & FALLOC_FL_PUNCH_HOLE) {
+if (!mode) {
+/* We can only fallocate at the EOF with a truncate */
+if (offset < blk_len) {
+fuse_reply_err(req, EOPNOTSUPP);
+return;
+}
+
+if (offset > blk_len) {
+/* No preallocation needed here */
+ret = fuse_do_truncate(exp, offset, true, PREALLOC_MODE_OFF);
+if (ret < 0) {
+fuse_reply_err(req, -ret);
+return;
+}
+}
+
+ret = fuse_do_truncate(exp, offset + length, true,
+   PREALLOC_MODE_FALLOC);
+}
+else if (mode & FALLOC_FL_PUNCH_HOLE) {
 if (!(mode & FALLOC_FL_KEEP_SIZE)) {
 fuse_reply_err(req, EINVAL);
 return;
@@ -665,25 +684,7 @@ static void fuse_fallocate(fuse_req_t req, fuse_ino_t 
inode, int mode,
 } while (ret == 0 && length > 0);
 }
 #endif /* CONFIG_FALLOCATE_ZERO_RANGE */
-else if (!mode) {
-/* We can only fallocate at the EOF with a truncate */
-if (offset < blk_len) {
-fuse_reply_err(req, EOPNOTSUPP);
-return;
-}
-
-if (offset > blk_len) {
-/* No preallocation needed here */
-ret = fuse_do_truncate(exp, offset, true, PREALLOC_MODE_OFF);
-if (ret < 0) {
-fuse_reply_err(req, -ret);
-return;
-}
-}
-
-ret = fuse_do_truncate(exp, offset + length, true,
-   PREALLOC_MODE_FALLOC);
-} else {
+else {
 ret = -EOPNOTSUPP;
 }
 
-- 
2.31.1




iotests failure on netbsd VM, no useful errors logged

2022-02-01 Thread Peter Maydell
Hi; I see an iotests error running 'make vm-build-netbsd' (ie running
a build and make check on a NetBSD VM), but there is no indication
of what's actually failing. Same thing seems to happen on OpenBSD too.

Full logfile at https://people.linaro.org/~peter.maydell/netbsd-log.txt ;
here's the relevant section:

===begin===
▶ 1/1 qcow2 qsd-jobsOK
1/1 qemu:block / qemu-iotests qcow2 ERROR  227.62s   exit status 1


Summary of Failures:

1/1 qemu:block / qemu-iotests qcow2 ERROR  227.62s   exit status 1


Ok: 0
Expected Fail:  0
Fail:   1
Unexpected Pass:0
Skipped:0
Timeout:0

Full log written to /home/qemu/qemu-test.z9hOGs/build/meson-logs/iotestslog.txt
===endit===


Would somebody like to investigate ? I'm not sure if this is an intermittent,
or a new persistent failure that I let slip into the tree by accident.

thanks
-- PMM



[PULL 01/24] tests/qemu-iotests: Fix 051 for binaries without 'lsi53c895a'

2022-02-01 Thread Hanna Reitz
From: Thomas Huth 

The lsi53c895a SCSI adaptor might not be enabled in each and every
x86 QEMU binary, e.g. it's disabled in the RHEL/CentOS build.
Thus let's add a check to the 051 test so that it does not fail if
this device is not available.

Signed-off-by: Thomas Huth 
Message-Id: <20211206143404.247032-1-th...@redhat.com>
Signed-off-by: Hanna Reitz 
---
 tests/qemu-iotests/051 | 4 
 1 file changed, 4 insertions(+)

diff --git a/tests/qemu-iotests/051 b/tests/qemu-iotests/051
index 1d2fa93a11..e9042a6214 100755
--- a/tests/qemu-iotests/051
+++ b/tests/qemu-iotests/051
@@ -45,6 +45,10 @@ _supported_proto file
 _unsupported_imgopts 'refcount_bits=\([^1]\|.\([^6]\|$\)\)' data_file
 _require_drivers nbd
 
+if [ "$QEMU_DEFAULT_MACHINE" = "pc" ]; then
+_require_devices lsi53c895a
+fi
+
 do_run_qemu()
 {
 echo Testing: "$@"
-- 
2.34.1




[PULL 06/10] block.h: remove outdated comment

2022-02-01 Thread Kevin Wolf
From: Emanuele Giuseppe Esposito 

The comment "disk I/O throttling" doesn't make any sense at all
any more. It was added in commit 0563e191516 to describe
bdrv_io_limits_enable()/disable(), which were removed in commit
97148076, so the comment is just a forgotten leftover.

Suggested-by: Kevin Wolf 
Signed-off-by: Emanuele Giuseppe Esposito 
Message-Id: <20220131125615.74612-1-eespo...@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Kevin Wolf 
---
 include/block/block.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/include/block/block.h b/include/block/block.h
index 9d4050220b..e1713ee306 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -344,7 +344,6 @@ typedef unsigned int BdrvChildRole;
 char *bdrv_perm_names(uint64_t perm);
 uint64_t bdrv_qapi_perm_to_blk_perm(BlockPermission qapi_perm);
 
-/* disk I/O throttling */
 void bdrv_init(void);
 void bdrv_init_with_whitelist(void);
 bool bdrv_uses_whitelist(void);
-- 
2.31.1




[PULL 00/10] Block layer patches

2022-02-01 Thread Kevin Wolf
The following changes since commit 804b30d25f8d70dc2dea951883ea92235274a50c:

  Merge remote-tracking branch 'remotes/legoater/tags/pull-ppc-20220130' into 
staging (2022-01-31 11:10:08 +)

are available in the Git repository at:

  https://gitlab.com/kmwolf/qemu.git tags/for-upstream

for you to fetch changes up to fc176116cdea816ceb8dd969080b2b95f58edbc0:

  block/rbd: workaround for ceph issue #53784 (2022-02-01 15:16:32 +0100)


Block layer patches

- rbd: fix handling of holes in .bdrv_co_block_status
- Fix potential crash in bdrv_set_backing_hd()
- vhost-user-blk export: Fix shutdown with requests in flight
- FUSE export: Fix build failure on FreeBSD
- Documentation improvements


Emanuele Giuseppe Esposito (1):
  block.h: remove outdated comment

Hanna Reitz (2):
  qsd: Document fuse's allow-other option
  qemu-img: Unify [-b [-F]] documentation

Kevin Wolf (2):
  qemu-storage-daemon: Fix typo in vhost-user-blk help
  block/export: Fix vhost-user-blk shutdown with requests in flight

Peter Lieven (2):
  block/rbd: fix handling of holes in .bdrv_co_block_status
  block/rbd: workaround for ceph issue #53784

Philippe Mathieu-Daudé (2):
  block/export/fuse: Rearrange if-else-if ladder in fuse_fallocate()
  block/export/fuse: Fix build failure on FreeBSD

Vladimir Sementsov-Ogievskiy (1):
  block: bdrv_set_backing_hd(): use drained section

 docs/tools/qemu-img.rst  |  2 +-
 docs/tools/qemu-storage-daemon.rst   |  9 +--
 include/block/block.h|  1 -
 include/qemu/vhost-user-server.h |  5 
 block.c  |  4 +++
 block/export/fuse.c  | 45 +--
 block/export/vhost-user-blk-server.c |  5 
 block/rbd.c  | 52 +++-
 storage-daemon/qemu-storage-daemon.c |  4 +--
 util/vhost-user-server.c | 22 +++
 qemu-img-cmds.hx |  4 +--
 11 files changed, 118 insertions(+), 35 deletions(-)




[PULL 05/10] block/export/fuse: Fix build failure on FreeBSD

2022-02-01 Thread Kevin Wolf
From: Philippe Mathieu-Daudé 

When building on FreeBSD we get:

  [816/6851] Compiling C object libblockdev.fa.p/block_export_fuse.c.o
  ../block/export/fuse.c:628:16: error: use of undeclared identifier 
'FALLOC_FL_KEEP_SIZE'
  if (mode & FALLOC_FL_KEEP_SIZE) {
 ^
  ../block/export/fuse.c:651:16: error: use of undeclared identifier 
'FALLOC_FL_PUNCH_HOLE'
  if (mode & FALLOC_FL_PUNCH_HOLE) {
 ^
  ../block/export/fuse.c:652:22: error: use of undeclared identifier 
'FALLOC_FL_KEEP_SIZE'
  if (!(mode & FALLOC_FL_KEEP_SIZE)) {
   ^
  3 errors generated.
  FAILED: libblockdev.fa.p/block_export_fuse.c.o

Meson indeed reported FALLOC_FL_PUNCH_HOLE is not available:

  C compiler for the host machine: cc (clang 10.0.1 "FreeBSD clang version 
10.0.1")
  Checking for function "fallocate" : NO
  Checking for function "posix_fallocate" : YES
  Header  has symbol "FALLOC_FL_PUNCH_HOLE" : NO
  Header  has symbol "FALLOC_FL_ZERO_RANGE" : NO
  ...

Similarly to commit 304332039 ("block/export/fuse.c: fix musl build"),
guard the code requiring FALLOC_FL_KEEP_SIZE / FALLOC_FL_PUNCH_HOLE
definitions under CONFIG_FALLOCATE_PUNCH_HOLE #ifdef'ry.

Signed-off-by: Philippe Mathieu-Daudé 
Message-Id: <20220201112655.344373-3-f4...@amsat.org>
Signed-off-by: Kevin Wolf 
---
 block/export/fuse.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/block/export/fuse.c b/block/export/fuse.c
index d25e478c0a..fdda8e3c81 100644
--- a/block/export/fuse.c
+++ b/block/export/fuse.c
@@ -625,9 +625,11 @@ static void fuse_fallocate(fuse_req_t req, fuse_ino_t 
inode, int mode,
 return;
 }
 
+#ifdef CONFIG_FALLOCATE_PUNCH_HOLE
 if (mode & FALLOC_FL_KEEP_SIZE) {
 length = MIN(length, blk_len - offset);
 }
+#endif /* CONFIG_FALLOCATE_PUNCH_HOLE */
 
 if (!mode) {
 /* We can only fallocate at the EOF with a truncate */
@@ -648,6 +650,7 @@ static void fuse_fallocate(fuse_req_t req, fuse_ino_t 
inode, int mode,
 ret = fuse_do_truncate(exp, offset + length, true,
PREALLOC_MODE_FALLOC);
 }
+#ifdef CONFIG_FALLOCATE_PUNCH_HOLE
 else if (mode & FALLOC_FL_PUNCH_HOLE) {
 if (!(mode & FALLOC_FL_KEEP_SIZE)) {
 fuse_reply_err(req, EINVAL);
@@ -662,6 +665,7 @@ static void fuse_fallocate(fuse_req_t req, fuse_ino_t 
inode, int mode,
 length -= size;
 } while (ret == 0 && length > 0);
 }
+#endif /* CONFIG_FALLOCATE_PUNCH_HOLE */
 #ifdef CONFIG_FALLOCATE_ZERO_RANGE
 else if (mode & FALLOC_FL_ZERO_RANGE) {
 if (!(mode & FALLOC_FL_KEEP_SIZE) && offset + length > blk_len) {
-- 
2.31.1




[PULL 07/10] qsd: Document fuse's allow-other option

2022-02-01 Thread Kevin Wolf
From: Hanna Reitz 

We did not add documentation to the storage daemon's man page for fuse's
allow-other option when it was introduced, so do that now.

Fixes: 8fc54f9428b9763f800 ("export/fuse: Add allow-other option")
Signed-off-by: Hanna Reitz 
Message-Id: <20220131103124.20325-1-hre...@redhat.com>
Reviewed-by: Eric Blake 
Signed-off-by: Kevin Wolf 
---
 docs/tools/qemu-storage-daemon.rst   | 9 +++--
 storage-daemon/qemu-storage-daemon.c | 2 +-
 2 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/docs/tools/qemu-storage-daemon.rst 
b/docs/tools/qemu-storage-daemon.rst
index 9b0eaba6e5..878e6a5c5c 100644
--- a/docs/tools/qemu-storage-daemon.rst
+++ b/docs/tools/qemu-storage-daemon.rst
@@ -76,7 +76,7 @@ Standard options:
 .. option:: --export 
[type=]nbd,id=,node-name=[,name=][,writable=on|off][,bitmap=]
   --export 
[type=]vhost-user-blk,id=,node-name=,addr.type=unix,addr.path=[,writable=on|off][,logical-block-size=][,num-queues=]
   --export 
[type=]vhost-user-blk,id=,node-name=,addr.type=fd,addr.str=[,writable=on|off][,logical-block-size=][,num-queues=]
-  --export 
[type=]fuse,id=,node-name=,mountpoint=[,growable=on|off][,writable=on|off]
+  --export 
[type=]fuse,id=,node-name=,mountpoint=[,growable=on|off][,writable=on|off][,allow-other=on|off|auto]
 
   is a block export definition. ``node-name`` is the block node that should be
   exported. ``writable`` determines whether or not the export allows write
@@ -103,7 +103,12 @@ Standard options:
   mounted). Consequently, applications that have opened the given file before
   the export became active will continue to see its original content. If
   ``growable`` is set, writes after the end of the exported file will grow the
-  block node to fit.
+  block node to fit.  The ``allow-other`` option controls whether users other
+  than the user running the process will be allowed to access the export.  Note
+  that enabling this option as a non-root user requires enabling the
+  user_allow_other option in the global fuse.conf configuration file.  Setting
+  ``allow-other`` to auto (the default) will try enabling this option, and on
+  error fall back to disabling it.
 
 .. option:: --monitor MONITORDEF
 
diff --git a/storage-daemon/qemu-storage-daemon.c 
b/storage-daemon/qemu-storage-daemon.c
index ec9aa79b55..504d33aa91 100644
--- a/storage-daemon/qemu-storage-daemon.c
+++ b/storage-daemon/qemu-storage-daemon.c
@@ -100,7 +100,7 @@ static void help(void)
 "\n"
 #ifdef CONFIG_FUSE
 "  --export [type=]fuse,id=,node-name=,mountpoint=\n"
-"   [,growable=on|off][,writable=on|off]\n"
+"   [,growable=on|off][,writable=on|off][,allow-other=on|off|auto]\n"
 " export the specified block node over FUSE\n"
 "\n"
 #endif /* CONFIG_FUSE */
-- 
2.31.1




[PULL 01/10] qemu-storage-daemon: Fix typo in vhost-user-blk help

2022-02-01 Thread Kevin Wolf
The syntax of the fd passing case misses the "addr.type=" key. Add it.

Signed-off-by: Kevin Wolf 
Message-Id: <20220125151514.49035-1-kw...@redhat.com>
Reviewed-by: Hanna Reitz 
Signed-off-by: Kevin Wolf 
---
 storage-daemon/qemu-storage-daemon.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/storage-daemon/qemu-storage-daemon.c 
b/storage-daemon/qemu-storage-daemon.c
index 9d76d1114d..ec9aa79b55 100644
--- a/storage-daemon/qemu-storage-daemon.c
+++ b/storage-daemon/qemu-storage-daemon.c
@@ -111,7 +111,7 @@ static void help(void)
 " export the specified block node as a\n"
 " vhost-user-blk device over UNIX domain socket\n"
 "  --export [type=]vhost-user-blk,id=,node-name=,\n"
-"   fd,addr.str=[,writable=on|off]\n"
+"   addr.type=fd,addr.str=[,writable=on|off]\n"
 "   [,logical-block-size=][,num-queues=]\n"
 " export the specified block node as a\n"
 " vhost-user-blk device over file descriptor\n"
-- 
2.31.1




[PULL 02/10] block: bdrv_set_backing_hd(): use drained section

2022-02-01 Thread Kevin Wolf
From: Vladimir Sementsov-Ogievskiy 

Graph modifications should be done in drained section. stream_prepare()
handler of block stream job call bdrv_set_backing_hd() without using
drained section and it's theoretically possible that some IO request
will interleave with graph modification and will use outdated pointers
to removed block nodes.

Some other callers use bdrv_set_backing_hd() not caring about drained
sections too. So it seems good to make a drained section exactly in
bdrv_set_backing_hd().

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Message-Id: <20220124173741.2984056-1-vsement...@virtuozzo.com>
Signed-off-by: Kevin Wolf 
---
 block.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/block.c b/block.c
index 7b3ce415d8..b54d59d1fa 100644
--- a/block.c
+++ b/block.c
@@ -3341,6 +3341,8 @@ int bdrv_set_backing_hd(BlockDriverState *bs, 
BlockDriverState *backing_hd,
 int ret;
 Transaction *tran = tran_new();
 
+bdrv_drained_begin(bs);
+
 ret = bdrv_set_backing_noperm(bs, backing_hd, tran, errp);
 if (ret < 0) {
 goto out;
@@ -3350,6 +3352,8 @@ int bdrv_set_backing_hd(BlockDriverState *bs, 
BlockDriverState *backing_hd,
 out:
 tran_finalize(tran, ret);
 
+bdrv_drained_end(bs);
+
 return ret;
 }
 
-- 
2.31.1




[PULL 22/24] block-backend: Retain permissions after migration

2022-02-01 Thread Hanna Reitz
After migration, the permissions the guest device wants to impose on its
BlockBackend are stored in blk->perm and blk->shared_perm.  In
blk_root_activate(), we take our permissions, but keep all shared
permissions open by calling `blk_set_perm(blk->perm, BLK_PERM_ALL)`.

Only afterwards (immediately or later, depending on the runstate) do we
restrict the shared permissions by calling
`blk_set_perm(blk->perm, blk->shared_perm)`.  Unfortunately, our first
call with shared_perm=BLK_PERM_ALL has overwritten blk->shared_perm to
be BLK_PERM_ALL, so this is a no-op and the set of shared permissions is
not restricted.

Fix this bug by saving the set of shared permissions before invoking
blk_set_perm() with BLK_PERM_ALL and restoring it afterwards.

Fixes: 5f7772c4d0cf32f4e779fcd5a69ae4dae24aeebf
   ("block-backend: Defer shared_perm tightening migration
   completion")
Reported-by: Peng Liang 
Signed-off-by: Hanna Reitz 
Message-Id: <20211125135317.186576-2-hre...@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé 
Tested-by: Peng Liang 
---
 block/block-backend.c | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/block/block-backend.c b/block/block-backend.c
index 23e727199b..4ff6b4d785 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -190,6 +190,7 @@ static void blk_root_activate(BdrvChild *child, Error 
**errp)
 {
 BlockBackend *blk = child->opaque;
 Error *local_err = NULL;
+uint64_t saved_shared_perm;
 
 if (!blk->disable_perm) {
 return;
@@ -197,12 +198,22 @@ static void blk_root_activate(BdrvChild *child, Error 
**errp)
 
 blk->disable_perm = false;
 
+/*
+ * blk->shared_perm contains the permissions we want to share once
+ * migration is really completely done.  For now, we need to share
+ * all; but we also need to retain blk->shared_perm, which is
+ * overwritten by a successful blk_set_perm() call.  Save it and
+ * restore it below.
+ */
+saved_shared_perm = blk->shared_perm;
+
 blk_set_perm(blk, blk->perm, BLK_PERM_ALL, _err);
 if (local_err) {
 error_propagate(errp, local_err);
 blk->disable_perm = true;
 return;
 }
+blk->shared_perm = saved_shared_perm;
 
 if (runstate_check(RUN_STATE_INMIGRATE)) {
 /* Activation can happen when migration process is still active, for
-- 
2.34.1




[PULL 09/24] iotest 303: explicit compression type

2022-02-01 Thread Hanna Reitz
From: Vladimir Sementsov-Ogievskiy 

The test prints qcow2 header fields which depends on chosen compression
type. So, let's be explicit in what compression type we want and
independent of IMGOPTS. Test both existing compression types.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Max Reitz 
Message-Id: <20211223160144.1097696-8-vsement...@virtuozzo.com>
Signed-off-by: Hanna Reitz 
---
 tests/qemu-iotests/303 | 25 -
 tests/qemu-iotests/303.out | 30 +-
 2 files changed, 45 insertions(+), 10 deletions(-)

diff --git a/tests/qemu-iotests/303 b/tests/qemu-iotests/303
index 475cb5428d..16c2e10827 100755
--- a/tests/qemu-iotests/303
+++ b/tests/qemu-iotests/303
@@ -54,12 +54,19 @@ def add_bitmap(num, begin, end, disabled):
 log('')
 
 
-qemu_img_create('-f', iotests.imgfmt, disk, '10M')
-
-add_bitmap(1, 0, 6, False)
-add_bitmap(2, 6, 8, True)
-dump = ['./qcow2.py', disk, 'dump-header']
-subprocess.run(dump)
-# Dump the metadata in JSON format
-dump.append('-j')
-subprocess.run(dump)
+def test(compression_type: str, json_output: bool) -> None:
+qemu_img_create('-f', iotests.imgfmt,
+'-o', f'compression_type={compression_type}',
+disk, '10M')
+add_bitmap(1, 0, 6, False)
+add_bitmap(2, 6, 8, True)
+
+cmd = ['./qcow2.py', disk, 'dump-header']
+if json_output:
+cmd.append('-j')
+
+subprocess.run(cmd)
+
+
+test('zlib', False)
+test('zstd', True)
diff --git a/tests/qemu-iotests/303.out b/tests/qemu-iotests/303.out
index 7c16998587..b3c70827b7 100644
--- a/tests/qemu-iotests/303.out
+++ b/tests/qemu-iotests/303.out
@@ -80,6 +80,34 @@ extra_data_size   0
 Bitmap table   typesize offset
 0  all-zeroes  00
 
+Add bitmap 1
+wrote 1048576/1048576 bytes at offset 0
+1 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+
+wrote 1048576/1048576 bytes at offset 1048576
+1 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+
+wrote 1048576/1048576 bytes at offset 2097152
+1 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+
+wrote 1048576/1048576 bytes at offset 3145728
+1 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+
+wrote 1048576/1048576 bytes at offset 4194304
+1 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+
+wrote 1048576/1048576 bytes at offset 5242880
+1 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+
+
+Add bitmap 2
+wrote 1048576/1048576 bytes at offset 6291456
+1 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+
+wrote 1048576/1048576 bytes at offset 7340032
+1 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+
+
 {
 "magic": 1363560955,
 "version": 3,
@@ -94,7 +122,7 @@ Bitmap table   typesize offset
 "refcount_table_clusters": 1,
 "nb_snapshots": 0,
 "snapshot_offset": 0,
-"incompatible_features": 0,
+"incompatible_features": 8,
 "compatible_features": 0,
 "autoclear_features": 1,
 "refcount_order": 4,
-- 
2.34.1




[PULL 17/24] iotest 39: use _qcow2_dump_header

2022-02-01 Thread Hanna Reitz
From: Vladimir Sementsov-Ogievskiy 

_qcow2_dump_header has filter for compression type, so this change
makes test pass with IMGOPTS='compression_type=zstd'.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Max Reitz 
Message-Id: <20211223160144.1097696-16-vsement...@virtuozzo.com>
Signed-off-by: Hanna Reitz 
---
 tests/qemu-iotests/039 | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tests/qemu-iotests/039 b/tests/qemu-iotests/039
index 8e783a8380..00d379cde2 100755
--- a/tests/qemu-iotests/039
+++ b/tests/qemu-iotests/039
@@ -142,7 +142,7 @@ $QEMU_IMG commit "$TEST_IMG"
 
 # The dirty bit must not be set
 _qcow2_dump_header | grep incompatible_features
-$PYTHON qcow2.py "$TEST_IMG".base dump-header | grep incompatible_features
+_qcow2_dump_header "$TEST_IMG".base | grep incompatible_features
 
 _check_test_img
 TEST_IMG="$TEST_IMG".base _check_test_img
-- 
2.34.1




[PULL 20/24] iotest 214: explicit compression type

2022-02-01 Thread Hanna Reitz
From: Vladimir Sementsov-Ogievskiy 

The test-case "Corrupted size field in compressed cluster descriptor"
heavily depends on zlib compression type. So, make it explicit. This
way test passes with IMGOPTS='compression_type=zstd'.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Max Reitz 
Message-Id: <20211223160144.1097696-19-vsement...@virtuozzo.com>
Signed-off-by: Hanna Reitz 
---
 tests/qemu-iotests/214 | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tests/qemu-iotests/214 b/tests/qemu-iotests/214
index 0889089d81..c66e246ba2 100755
--- a/tests/qemu-iotests/214
+++ b/tests/qemu-iotests/214
@@ -51,7 +51,7 @@ echo
 # The L2 entries of the two compressed clusters are located at
 # 0x80 and 0x88, their original values are 0x400800a0
 # and 0x400800a00802 (5 sectors for compressed data each).
-_make_test_img 8M -o cluster_size=2M
+_make_test_img 8M -o cluster_size=2M,compression_type=zlib
 $QEMU_IO -c "write -c -P 0x11 0 2M" -c "write -c -P 0x11 2M 2M" "$TEST_IMG" \
  2>&1 | _filter_qemu_io | _filter_testdir
 
-- 
2.34.1




[PULL 02/24] iotests/MRCE: Write data to source

2022-02-01 Thread Hanna Reitz
This test assumes that mirror flushes the source when entering the READY
state, and that the format level will pass that flush on to the protocol
level (where we intercept it with blkdebug).

However, apparently that does not happen when using a VMDK image with
zeroed_grain=on, which actually is the default set by testenv.py.  Right
now, Python tests ignore IMGOPTS, though, so this has no effect; but
Vladimir has a series that will change this, so we need to fix this test
before that series lands.

We can fix it by writing data to the source before we start the mirror
job; apparently that makes the (VMDK) format layer change its mind and
pass on the pre-READY flush to the protocol level, so the test passes
again.  (I presume, without any data written, mirror just does a 64M
zero write on the target, which VMDK with zeroed_grain=on basically just
ignores.)

Without this, we do not get a flush, and so blkdebug only sees a single
flush at the end of the job instead of two, and therefore does not
inject an error, which makes the block job complete instead of raising
an error.

Signed-off-by: Hanna Reitz 
Message-Id: <20211223165308.103793-1-hre...@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy 
---
 tests/qemu-iotests/tests/mirror-ready-cancel-error | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/tests/qemu-iotests/tests/mirror-ready-cancel-error 
b/tests/qemu-iotests/tests/mirror-ready-cancel-error
index f2dc1f..770ffca379 100755
--- a/tests/qemu-iotests/tests/mirror-ready-cancel-error
+++ b/tests/qemu-iotests/tests/mirror-ready-cancel-error
@@ -36,6 +36,11 @@ class TestMirrorReadyCancelError(iotests.QMPTestCase):
 assert iotests.qemu_img_create('-f', iotests.imgfmt, target,
str(image_size)) == 0
 
+# Ensure that mirror will copy something before READY so the
+# target format layer will forward the pre-READY flush to its
+# file child
+assert iotests.qemu_io_silent('-c', 'write -P 1 0 64k', source) == 0
+
 self.vm = iotests.VM()
 self.vm.launch()
 
@@ -97,7 +102,7 @@ class TestMirrorReadyCancelError(iotests.QMPTestCase):
 # Write something so will not leave the job immediately, but
 # flush first (which will fail, thanks to blkdebug)
 res = self.vm.qmp('human-monitor-command',
-  command_line='qemu-io mirror-top "write 0 64k"')
+  command_line='qemu-io mirror-top "write -P 2 0 64k"')
 self.assert_qmp(res, 'return', '')
 
 # Drain status change events
-- 
2.34.1




Re: [PATCH 00/14] block: cleanup backing and file handling

2022-02-01 Thread Vladimir Sementsov-Ogievskiy

Ping

03.12.2021 23:25, Vladimir Sementsov-Ogievskiy wrote:

Hi all!

I started this as a follow-up to
"block: Attempt on fixing 030-reported errors" by Hanna.

In Hanna's series I really like how bs->children handling moved to
.attach/.detach handlers.

.file and .backing are kind of "shortcuts" or "links" to some elementes
of this list, they duplicate the information. So they should be updated
in the same place to be in sync.

On the way to this target, I do also the following cleanups:

  - establish, which restrictions we have on how much children of
  different roles should node has, and which of the should be linked in
  .file / .backing. Add documentation and assertions.

  - drop all the complicated logic around passing pointer to bs->backing
  / bs->file  (BdrvChild **c), so that the field be automatically
  updated. Now they are natively automatically updated in
  .attach/.detach, so the rest of the code becomes simpler.

  - now .file / .backing are updated ONLY in .attach / .detach, no other
  code modify these fields

Vladimir Sementsov-Ogievskiy (14):
   block: BlockDriver: add .filtered_child_is_backing field
   block: introduce bdrv_open_file_child() helper
   block/blklogwrites: don't care to remove bs->file child on failure
   test-bdrv-graph-mod: update test_parallel_perm_update test case
   tests-bdrv-drain: bdrv_replace_test driver: declare supports_backing
   test-bdrv-graph-mod: fix filters to be filters
   block: document connection between child roles and
 bs->backing/bs->file
   block/snapshot: stress that we fallback to primary child
   Revert "block: Let replace_child_noperm free children"
   Revert "block: Let replace_child_tran keep indirect pointer"
   Revert "block: Restructure remove_file_or_backing_child()"
   Revert "block: Pass BdrvChild ** to replace_child_noperm"
   block: Manipulate bs->file / bs->backing pointers in .attach/.detach
   block/snapshot: drop indirection around bdrv_snapshot_fallback_ptr

  include/block/block.h|  45 +
  include/block/block_int.h|  30 ++-
  block.c  | 335 ++-
  block/blkdebug.c |   9 +-
  block/blklogwrites.c |  11 +-
  block/blkreplay.c|   7 +-
  block/blkverify.c|   9 +-
  block/bochs.c|   7 +-
  block/cloop.c|   7 +-
  block/commit.c   |   1 +
  block/copy-before-write.c|   9 +-
  block/copy-on-read.c |   9 +-
  block/crypto.c   |  11 +-
  block/dmg.c  |   7 +-
  block/filter-compress.c  |   6 +-
  block/mirror.c   |   1 +
  block/parallels.c|   7 +-
  block/preallocate.c  |   9 +-
  block/qcow.c |   6 +-
  block/qcow2.c|   8 +-
  block/qed.c  |   8 +-
  block/raw-format.c   |   4 +-
  block/replication.c  |   8 +-
  block/snapshot.c |  60 ++
  block/throttle.c |   8 +-
  block/vdi.c  |   7 +-
  block/vhdx.c |   7 +-
  block/vmdk.c |   7 +-
  block/vpc.c  |   7 +-
  tests/unit/test-bdrv-drain.c |  11 +-
  tests/unit/test-bdrv-graph-mod.c |  94 ++---
  31 files changed, 343 insertions(+), 412 deletions(-)




--
Best regards,
Vladimir



[PULL 03/24] iotests.py: img_info_log(): rename imgopts argument

2022-02-01 Thread Hanna Reitz
From: Vladimir Sementsov-Ogievskiy 

We are going to support IMGOPTS environment variable like in bash
tests. Corresponding global variable in iotests.py should be called
imgopts. So to not interfere with function argument, rename it in
advance.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Max Reitz 
Message-Id: <20211223160144.1097696-2-vsement...@virtuozzo.com>
Signed-off-by: Hanna Reitz 
---
 tests/qemu-iotests/210| 8 
 tests/qemu-iotests/iotests.py | 5 +++--
 2 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/tests/qemu-iotests/210 b/tests/qemu-iotests/210
index a4dcc5fe59..10b0a0b87c 100755
--- a/tests/qemu-iotests/210
+++ b/tests/qemu-iotests/210
@@ -62,7 +62,7 @@ with iotests.FilePath('t.luks') as disk_path, \
 'driver=luks,file.driver=file,file.filename=%s,key-secret=keysec0' % 
(disk_path),
 filter_path=disk_path,
 extra_args=['--object', 'secret,id=keysec0,data=foo'],
-imgopts=True)
+use_image_opts=True)
 
 #
 # Successful image creation (with non-default options)
@@ -96,7 +96,7 @@ with iotests.FilePath('t.luks') as disk_path, \
 'driver=luks,file.driver=file,file.filename=%s,key-secret=keysec0' % 
(disk_path),
 filter_path=disk_path,
 extra_args=['--object', 'secret,id=keysec0,data=foo'],
-imgopts=True)
+use_image_opts=True)
 
 #
 # Invalid BlockdevRef
@@ -132,7 +132,7 @@ with iotests.FilePath('t.luks') as disk_path, \
 'driver=luks,file.driver=file,file.filename=%s,key-secret=keysec0' % 
(disk_path),
 filter_path=disk_path,
 extra_args=['--object', 'secret,id=keysec0,data=foo'],
-imgopts=True)
+use_image_opts=True)
 
 #
 # Invalid sizes
@@ -176,4 +176,4 @@ with iotests.FilePath('t.luks') as disk_path, \
 'driver=luks,file.driver=file,file.filename=%s,key-secret=keysec0' % 
(disk_path),
 filter_path=disk_path,
 extra_args=['--object', 'secret,id=keysec0,data=foo'],
-imgopts=True)
+use_image_opts=True)
diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
index 1e2f2391d1..30a8837ea2 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -227,9 +227,10 @@ def qemu_img_log(*args):
 log(result, filters=[filter_testfiles])
 return result
 
-def img_info_log(filename, filter_path=None, imgopts=False, extra_args=()):
+def img_info_log(filename, filter_path=None, use_image_opts=False,
+ extra_args=()):
 args = ['info']
-if imgopts:
+if use_image_opts:
 args.append('--image-opts')
 else:
 args += ['-f', imgfmt]
-- 
2.34.1




[PULL 10/24] iotest 065: explicit compression type

2022-02-01 Thread Hanna Reitz
From: Vladimir Sementsov-Ogievskiy 

The test checks different options. It of course fails if set
IMGOPTS='compression_type=zstd'. So, let's be explicit in what
compression type we want and independent of IMGOPTS. Test both existing
compression types.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Hanna Reitz 
Message-Id: <20211223160144.1097696-9-vsement...@virtuozzo.com>
Signed-off-by: Hanna Reitz 
---
 tests/qemu-iotests/065 | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/tests/qemu-iotests/065 b/tests/qemu-iotests/065
index dc7716275f..f7c1b68dad 100755
--- a/tests/qemu-iotests/065
+++ b/tests/qemu-iotests/065
@@ -88,7 +88,7 @@ class TestQMP(TestImageInfoSpecific):
 
 class TestQCow2(TestQemuImgInfo):
 '''Testing a qcow2 version 2 image'''
-img_options = 'compat=0.10'
+img_options = 'compat=0.10,compression_type=zlib'
 json_compare = { 'compat': '0.10', 'refcount-bits': 16,
  'compression-type': 'zlib' }
 human_compare = [ 'compat: 0.10', 'compression type: zlib',
@@ -96,17 +96,17 @@ class TestQCow2(TestQemuImgInfo):
 
 class TestQCow3NotLazy(TestQemuImgInfo):
 '''Testing a qcow2 version 3 image with lazy refcounts disabled'''
-img_options = 'compat=1.1,lazy_refcounts=off'
+img_options = 'compat=1.1,lazy_refcounts=off,compression_type=zstd'
 json_compare = { 'compat': '1.1', 'lazy-refcounts': False,
  'refcount-bits': 16, 'corrupt': False,
- 'compression-type': 'zlib', 'extended-l2': False }
-human_compare = [ 'compat: 1.1', 'compression type: zlib',
+ 'compression-type': 'zstd', 'extended-l2': False }
+human_compare = [ 'compat: 1.1', 'compression type: zstd',
   'lazy refcounts: false', 'refcount bits: 16',
   'corrupt: false', 'extended l2: false' ]
 
 class TestQCow3Lazy(TestQemuImgInfo):
 '''Testing a qcow2 version 3 image with lazy refcounts enabled'''
-img_options = 'compat=1.1,lazy_refcounts=on'
+img_options = 'compat=1.1,lazy_refcounts=on,compression_type=zlib'
 json_compare = { 'compat': '1.1', 'lazy-refcounts': True,
  'refcount-bits': 16, 'corrupt': False,
  'compression-type': 'zlib', 'extended-l2': False }
@@ -117,7 +117,7 @@ class TestQCow3Lazy(TestQemuImgInfo):
 class TestQCow3NotLazyQMP(TestQMP):
 '''Testing a qcow2 version 3 image with lazy refcounts disabled, opening
with lazy refcounts enabled'''
-img_options = 'compat=1.1,lazy_refcounts=off'
+img_options = 'compat=1.1,lazy_refcounts=off,compression_type=zlib'
 qemu_options = 'lazy-refcounts=on'
 compare = { 'compat': '1.1', 'lazy-refcounts': False,
 'refcount-bits': 16, 'corrupt': False,
@@ -127,11 +127,11 @@ class TestQCow3NotLazyQMP(TestQMP):
 class TestQCow3LazyQMP(TestQMP):
 '''Testing a qcow2 version 3 image with lazy refcounts enabled, opening
with lazy refcounts disabled'''
-img_options = 'compat=1.1,lazy_refcounts=on'
+img_options = 'compat=1.1,lazy_refcounts=on,compression_type=zstd'
 qemu_options = 'lazy-refcounts=off'
 compare = { 'compat': '1.1', 'lazy-refcounts': True,
 'refcount-bits': 16, 'corrupt': False,
-'compression-type': 'zlib', 'extended-l2': False }
+'compression-type': 'zstd', 'extended-l2': False }
 
 TestImageInfoSpecific = None
 TestQemuImgInfo = None
-- 
2.34.1




Re: [PATCH 10/12] block.c: add subtree_drains where needed

2022-02-01 Thread Vladimir Sementsov-Ogievskiy

18.01.2022 19:27, Emanuele Giuseppe Esposito wrote:

Protect bdrv_replace_child_noperm, as it modifies the
graph by adding/removing elements to .children and .parents
list of a bs. Use the newly introduced
bdrv_subtree_drained_{begin/end}_unlocked drains to achieve
that and be free from the aiocontext lock.

One important criteria to keep in mind is that if the caller of
bdrv_replace_child_noperm creates a transaction, we need to make sure that the
whole transaction is under the same drain block. This is imperative, as having
multiple drains also in the .abort() class of functions causes discrepancies
in the drained counters (as nodes are put back into the original positions),
making it really hard to retourn all to zero and leaving the code very buggy.
See https://patchew.org/QEMU/20211213104014.69858-1-eespo...@redhat.com/
for more explanations.

Unfortunately we still need to have bdrv_subtree_drained_begin/end
in bdrv_detach_child() releasing and then holding the AioContext
lock, since it later invokes bdrv_try_set_aio_context() that is
not safe yet. Once all is cleaned up, we can also remove the
acquire/release locks in job_unref, artificially added because of this.

Signed-off-by: Emanuele Giuseppe Esposito 
---
  block.c | 50 --
  1 file changed, 44 insertions(+), 6 deletions(-)

diff --git a/block.c b/block.c
index fcc44a49a0..6196c95aae 100644
--- a/block.c
+++ b/block.c
@@ -3114,8 +3114,22 @@ static void bdrv_detach_child(BdrvChild **childp)
  BlockDriverState *old_bs = (*childp)->bs;
  
  assert(qemu_in_main_thread());

+if (old_bs) {
+/*
+ * TODO: this is called by job_unref with lock held, because
+ * afterwards it calls bdrv_try_set_aio_context.
+ * Once all of this is fixed, take care of removing
+ * the aiocontext lock and make this function _unlocked.
+ */
+bdrv_subtree_drained_begin(old_bs);
+}
+
  bdrv_replace_child_noperm(childp, NULL, true);
  
+if (old_bs) {

+bdrv_subtree_drained_end(old_bs);
+}
+
  if (old_bs) {
  /*
   * Update permissions for old node. We're just taking a parent away, 
so
@@ -3154,6 +3168,7 @@ BdrvChild *bdrv_root_attach_child(BlockDriverState 
*child_bs,
  Transaction *tran = tran_new();
  
  assert(qemu_in_main_thread());

+bdrv_subtree_drained_begin_unlocked(child_bs);
  
  ret = bdrv_attach_child_common(child_bs, child_name, child_class,

 child_role, perm, shared_perm, opaque,
@@ -3168,6 +3183,7 @@ out:
  tran_finalize(tran, ret);
  /* child is unset on failure by bdrv_attach_child_common_abort() */
  assert((ret < 0) == !child);
+bdrv_subtree_drained_end_unlocked(child_bs);
  
  bdrv_unref(child_bs);

  return child;
@@ -3197,6 +3213,9 @@ BdrvChild *bdrv_attach_child(BlockDriverState *parent_bs,
  
  assert(qemu_in_main_thread());
  
+bdrv_subtree_drained_begin_unlocked(parent_bs);

+bdrv_subtree_drained_begin_unlocked(child_bs);
+
  ret = bdrv_attach_child_noperm(parent_bs, child_bs, child_name, 
child_class,
 child_role, , tran, errp);
  if (ret < 0) {
@@ -3211,6 +3230,9 @@ BdrvChild *bdrv_attach_child(BlockDriverState *parent_bs,
  out:
  tran_finalize(tran, ret);
  /* child is unset on failure by bdrv_attach_child_common_abort() */
+bdrv_subtree_drained_end_unlocked(child_bs);
+bdrv_subtree_drained_end_unlocked(parent_bs);
+
  assert((ret < 0) == !child);
  
  bdrv_unref(child_bs);

@@ -3456,6 +3478,11 @@ int bdrv_set_backing_hd(BlockDriverState *bs, 
BlockDriverState *backing_hd,
  
  assert(qemu_in_main_thread());
  
+bdrv_subtree_drained_begin_unlocked(bs);

+if (backing_hd) {
+bdrv_subtree_drained_begin_unlocked(backing_hd);
+}
+
  ret = bdrv_set_backing_noperm(bs, backing_hd, tran, errp);
  if (ret < 0) {
  goto out;
@@ -3464,6 +3491,10 @@ int bdrv_set_backing_hd(BlockDriverState *bs, 
BlockDriverState *backing_hd,
  ret = bdrv_refresh_perms(bs, errp);
  out:
  tran_finalize(tran, ret);
+if (backing_hd) {
+bdrv_subtree_drained_end_unlocked(backing_hd);
+}
+bdrv_subtree_drained_end_unlocked(bs);
  
  return ret;

  }
@@ -5266,7 +5297,8 @@ static int bdrv_replace_node_common(BlockDriverState 
*from,
  
  assert(qemu_get_current_aio_context() == qemu_get_aio_context());

  assert(bdrv_get_aio_context(from) == bdrv_get_aio_context(to));
-bdrv_drained_begin(from);
+bdrv_subtree_drained_begin_unlocked(from);
+bdrv_subtree_drained_begin_unlocked(to);
  
  /*

   * Do the replacement without permission update.
@@ -5298,7 +5330,8 @@ static int bdrv_replace_node_common(BlockDriverState 
*from,
  out:
  tran_finalize(tran, ret);
  
-bdrv_drained_end(from);

+bdrv_subtree_drained_end_unlocked(to);
+

[PULL 00/24] Block patches

2022-02-01 Thread Hanna Reitz
The following changes since commit 804b30d25f8d70dc2dea951883ea92235274a50c:

  Merge remote-tracking branch 'remotes/legoater/tags/pull-ppc-20220130' into 
staging (2022-01-31 11:10:08 +)

are available in the Git repository at:

  https://gitlab.com/hreitz/qemu.git tags/pull-block-2022-02-01

for you to fetch changes up to 751486c18555169ca4baf59440275d5831140822:

  block.h: remove outdated comment (2022-02-01 13:28:53 +0100)


Block patches:
- Add support to the iotests to test qcow2's zstd compression mode
- Fix post-migration block node permissions
- iotests fixes (051 and mirror-ready-cancel-error)
- Remove an outdated comment


Emanuele Giuseppe Esposito (1):
  block.h: remove outdated comment

Hanna Reitz (3):
  iotests/MRCE: Write data to source
  block-backend: Retain permissions after migration
  iotests/migration-permissions: New test

Thomas Huth (1):
  tests/qemu-iotests: Fix 051 for binaries without 'lsi53c895a'

Vladimir Sementsov-Ogievskiy (19):
  iotests.py: img_info_log(): rename imgopts argument
  iotests.py: implement unsupported_imgopts
  iotests: specify some unsupported_imgopts for python iotests
  iotests.py: qemu_img*("create"): support
IMGOPTS='compression_type=zstd'
  iotests: drop qemu_img_verbose() helper
  iotests.py: rewrite default luks support in qemu_img
  iotest 303: explicit compression type
  iotest 065: explicit compression type
  iotests.py: filter out successful output of qemu-img create
  iotests.py: filter compression type out
  iotest 302: use img_info_log() helper
  qcow2: simple case support for downgrading of qcow2 images with zstd
  iotests/common.rc: introduce _qcow2_dump_header helper
  iotests: massive use _qcow2_dump_header
  iotest 39: use _qcow2_dump_header
  iotests: bash tests: filter compression type
  iotests 60: more accurate set dirty bit in qcow2 header
  iotest 214: explicit compression type
  iotests: declare lack of support for compresion_type in IMGOPTS

 include/block/block.h |   1 -
 block/block-backend.c |  11 ++
 block/qcow2.c |  58 +-
 tests/qemu-iotests/031|  11 +-
 tests/qemu-iotests/036|   6 +-
 tests/qemu-iotests/039|  22 ++--
 tests/qemu-iotests/044|   8 +-
 tests/qemu-iotests/044.out|   1 +
 tests/qemu-iotests/051|   9 +-
 tests/qemu-iotests/060|  22 ++--
 tests/qemu-iotests/060.out|   2 +-
 tests/qemu-iotests/061|  42 
 tests/qemu-iotests/061.out|  12 +--
 tests/qemu-iotests/065|  19 ++--
 tests/qemu-iotests/082.out|  14 +--
 tests/qemu-iotests/112|   3 +-
 tests/qemu-iotests/137|   2 +-
 tests/qemu-iotests/149.out|  21 
 tests/qemu-iotests/163|   3 +-
 tests/qemu-iotests/165|   3 +-
 tests/qemu-iotests/196|   3 +-
 tests/qemu-iotests/198.out|   4 +-
 tests/qemu-iotests/206.out|  10 +-
 tests/qemu-iotests/209|   7 +-
 tests/qemu-iotests/209.out|   2 +
 tests/qemu-iotests/210|   8 +-
 tests/qemu-iotests/214|   2 +-
 tests/qemu-iotests/237.out|   3 -
 tests/qemu-iotests/242|   3 +-
 tests/qemu-iotests/242.out|  10 +-
 tests/qemu-iotests/246|   3 +-
 tests/qemu-iotests/254|   3 +-
 tests/qemu-iotests/255.out|   4 -
 tests/qemu-iotests/260|   3 +-
 tests/qemu-iotests/274|   3 +-
 tests/qemu-iotests/274.out|  39 +--
 tests/qemu-iotests/280.out|   1 -
 tests/qemu-iotests/281|   3 +-
 tests/qemu-iotests/287|   8 +-
 tests/qemu-iotests/290|   2 +-
 tests/qemu-iotests/296.out|  10 +-
 tests/qemu-iotests/302|   4 +-
 tests/qemu-iotests/302.out|   7 +-
 tests/qemu-iotests/303|  26 +++--
 tests/qemu-iotests/303.out|  30 +-
 tests/qemu-iotests/common.filter  |   8 ++
 tests/qemu-iotests/common.rc  |  22 
 tests/qemu-iotests/iotests.py |  99 +++--
 .../tests/migrate-bitmaps-postcopy-test   |   3 +-
 tests/qemu-iotests/tests/migrate-bitmaps-test |   3 +-
 .../qemu-iotests/tests/migration-permissions  | 101 

[PULL 16/24] iotests: massive use _qcow2_dump_header

2022-02-01 Thread Hanna Reitz
From: Vladimir Sementsov-Ogievskiy 

We are going to add filtering in _qcow2_dump_header and want all tests
use it.

The patch is generated by commands:
  cd tests/qemu-iotests
  sed -ie 's/$PYTHON qcow2.py "$TEST_IMG" dump-header\($\| 
\)/_qcow2_dump_header\1/' ??? tests/*

(the difficulty is to avoid converting dump-header-exts)

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Max Reitz 
Message-Id: <20211223160144.1097696-15-vsement...@virtuozzo.com>
Signed-off-by: Hanna Reitz 
---
 tests/qemu-iotests/031 |  6 +++---
 tests/qemu-iotests/036 |  6 +++---
 tests/qemu-iotests/039 | 20 ++--
 tests/qemu-iotests/060 | 20 ++--
 tests/qemu-iotests/061 | 36 ++--
 tests/qemu-iotests/137 |  2 +-
 tests/qemu-iotests/287 |  8 
 7 files changed, 49 insertions(+), 49 deletions(-)

diff --git a/tests/qemu-iotests/031 b/tests/qemu-iotests/031
index 58b57a0ef2..648112f796 100755
--- a/tests/qemu-iotests/031
+++ b/tests/qemu-iotests/031
@@ -58,21 +58,21 @@ for compat in "compat=0.10" "compat=1.1"; do
 echo
 _make_test_img -o $compat 64M
 $PYTHON qcow2.py "$TEST_IMG" add-header-ext 0x12345678 "This is a test 
header extension"
-$PYTHON qcow2.py "$TEST_IMG" dump-header
+_qcow2_dump_header
 _check_test_img
 
 echo
 echo === Rewrite header with no backing file ===
 echo
 $QEMU_IMG rebase -u -b "" "$TEST_IMG"
-$PYTHON qcow2.py "$TEST_IMG" dump-header
+_qcow2_dump_header
 _check_test_img
 
 echo
 echo === Add a backing file and format ===
 echo
 $QEMU_IMG rebase -u -b "/some/backing/file/path" -F host_device "$TEST_IMG"
-$PYTHON qcow2.py "$TEST_IMG" dump-header
+_qcow2_dump_header
 done
 
 # success, all done
diff --git a/tests/qemu-iotests/036 b/tests/qemu-iotests/036
index 5e567012a8..f703605e44 100755
--- a/tests/qemu-iotests/036
+++ b/tests/qemu-iotests/036
@@ -58,7 +58,7 @@ $PYTHON qcow2.py "$TEST_IMG" set-feature-bit incompatible 63
 
 # Without feature table
 $PYTHON qcow2.py "$TEST_IMG" del-header-ext 0x6803f857
-$PYTHON qcow2.py "$TEST_IMG" dump-header | grep features
+_qcow2_dump_header | grep features
 $PYTHON qcow2.py "$TEST_IMG" dump-header-exts
 _img_info
 
@@ -107,7 +107,7 @@ echo === Create image with unknown autoclear feature bit ===
 echo
 _make_test_img 64M
 $PYTHON qcow2.py "$TEST_IMG" set-feature-bit autoclear 63
-$PYTHON qcow2.py "$TEST_IMG" dump-header | grep features
+_qcow2_dump_header | grep features
 $PYTHON qcow2.py "$TEST_IMG" dump-header-exts
 
 echo
@@ -115,7 +115,7 @@ echo === Repair image ===
 echo
 _check_test_img -r all
 
-$PYTHON qcow2.py "$TEST_IMG" dump-header | grep features
+_qcow2_dump_header | grep features
 $PYTHON qcow2.py "$TEST_IMG" dump-header-exts
 
 # success, all done
diff --git a/tests/qemu-iotests/039 b/tests/qemu-iotests/039
index 12b2c7fa7b..8e783a8380 100755
--- a/tests/qemu-iotests/039
+++ b/tests/qemu-iotests/039
@@ -59,7 +59,7 @@ _make_test_img -o "compat=1.1,lazy_refcounts=on" $size
 $QEMU_IO -c "write -P 0x5a 0 512" "$TEST_IMG" | _filter_qemu_io
 
 # The dirty bit must not be set
-$PYTHON qcow2.py "$TEST_IMG" dump-header | grep incompatible_features
+_qcow2_dump_header | grep incompatible_features
 _check_test_img
 
 echo
@@ -73,7 +73,7 @@ $QEMU_IO -c "write -P 0x5a 0 512" \
 | _filter_qemu_io
 
 # The dirty bit must be set
-$PYTHON qcow2.py "$TEST_IMG" dump-header | grep incompatible_features
+_qcow2_dump_header | grep incompatible_features
 _check_test_img
 
 echo
@@ -82,7 +82,7 @@ echo "== Read-only access must still work =="
 $QEMU_IO -r -c "read -P 0x5a 0 512" "$TEST_IMG" | _filter_qemu_io
 
 # The dirty bit must be set
-$PYTHON qcow2.py "$TEST_IMG" dump-header | grep incompatible_features
+_qcow2_dump_header | grep incompatible_features
 
 echo
 echo "== Repairing the image file must succeed =="
@@ -90,7 +90,7 @@ echo "== Repairing the image file must succeed =="
 _check_test_img -r all
 
 # The dirty bit must not be set
-$PYTHON qcow2.py "$TEST_IMG" dump-header | grep incompatible_features
+_qcow2_dump_header | grep incompatible_features
 
 echo
 echo "== Data should still be accessible after repair =="
@@ -108,12 +108,12 @@ $QEMU_IO -c "write -P 0x5a 0 512" \
 | _filter_qemu_io
 
 # The dirty bit must be set
-$PYTHON qcow2.py "$TEST_IMG" dump-header | grep incompatible_features
+_qcow2_dump_header | grep incompatible_features
 
 $QEMU_IO -c "write 0 512" "$TEST_IMG" | _filter_qemu_io
 
 # The dirty bit must not be set
-$PYTHON qcow2.py "$TEST_IMG" dump-header | grep incompatible_features
+_qcow2_dump_header | grep incompatible_features
 
 echo
 echo "== Creating an image file with lazy_refcounts=off =="
@@ -126,7 +126,7 @@ $QEMU_IO -c "write -P 0x5a 0 512" \
 | _filter_qemu_io
 
 # The dirty bit must not be set since lazy_refcounts=off
-$PYTHON qcow2.py "$TEST_IMG" dump-header | grep incompatible_features
+_qcow2_dump_header | grep incompatible_features
 _check_test_img
 
 echo
@@ -141,7 +141,7 

[PULL 19/24] iotests 60: more accurate set dirty bit in qcow2 header

2022-02-01 Thread Hanna Reitz
From: Vladimir Sementsov-Ogievskiy 

Don't touch other incompatible bits, like compression-type. This makes
the test pass with IMGOPTS='compression_type=zstd'.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Max Reitz 
Message-Id: <20211223160144.1097696-18-vsement...@virtuozzo.com>
Signed-off-by: Hanna Reitz 
---
 tests/qemu-iotests/060 | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tests/qemu-iotests/060 b/tests/qemu-iotests/060
index d1e3204d4e..df87d600f7 100755
--- a/tests/qemu-iotests/060
+++ b/tests/qemu-iotests/060
@@ -326,7 +326,7 @@ _make_test_img 64M
 # Let the refblock appear unaligned
 poke_file "$TEST_IMG" "$rt_offset""\x00\x00\x00\x00\xff\xff\x2a\x00"
 # Mark the image dirty, thus forcing an automatic check when opening it
-poke_file "$TEST_IMG" 72 "\x00\x00\x00\x00\x00\x00\x00\x01"
+$PYTHON qcow2.py "$TEST_IMG" set-feature-bit incompatible 0
 # Open the image (qemu should refuse to do so)
 $QEMU_IO -c close "$TEST_IMG" 2>&1 | _filter_testdir | _filter_imgfmt
 
-- 
2.34.1




[PULL 06/24] iotests.py: qemu_img*("create"): support IMGOPTS='compression_type=zstd'

2022-02-01 Thread Hanna Reitz
From: Vladimir Sementsov-Ogievskiy 

Adding support of IMGOPTS (like in bash tests) allows user to pass a
lot of different options. Still, some may require additional logic.

Now we want compression_type option, so add some smart logic around it:
ignore compression_type=zstd in IMGOPTS, if test want qcow2 in
compatibility mode. As well, ignore compression_type for non-qcow2
formats.

Note that we may instead add support only to qemu_img_create(), but
that works bad:

1. We'll have to update a lot of tests to use qemu_img_create instead
   of qemu_img('create'). (still, we may want do it anyway, but no
   reason to create a dependancy between task of supporting IMGOPTS and
   updating a lot of tests)

2. Some tests use qemu_img_pipe('create', ..) - even more work on
   updating

3. Even if we update all tests to go through qemu_img_create, we'll
   need a way to avoid creating new tests using qemu_img*('create') -
   add assertions.. That doesn't seem good.

So, let's add support of IMGOPTS to most generic
qemu_img_pipe_and_status().

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Hanna Reitz 
Message-Id: <20211223160144.1097696-5-vsement...@virtuozzo.com>
Signed-off-by: Hanna Reitz 
---
 tests/qemu-iotests/iotests.py | 27 ++-
 1 file changed, 26 insertions(+), 1 deletion(-)

diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
index 2fa5dcba76..740f8be36b 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -16,6 +16,7 @@
 # along with this program.  If not, see .
 #
 
+import argparse
 import atexit
 import bz2
 from collections import OrderedDict
@@ -161,11 +162,35 @@ def qemu_tool_pipe_and_status(tool: str, args: 
Sequence[str],
{-subp.returncode}: {cmd}\n')
 return (output, subp.returncode)
 
+def qemu_img_create_prepare_args(args: List[str]) -> List[str]:
+if not args or args[0] != 'create':
+return list(args)
+args = args[1:]
+
+p = argparse.ArgumentParser(allow_abbrev=False)
+p.add_argument('-f')
+parsed, remaining = p.parse_known_args(args)
+
+result = ['create']
+if parsed.f is not None:
+result += ['-f', parsed.f]
+
+# IMGOPTS most probably contain options specific for the selected format,
+# like extended_l2 or compression_type for qcow2. Test may want to create
+# additional images in other formats that doesn't support these options.
+# So, use IMGOPTS only for images created in imgfmt format.
+if parsed.f == imgfmt and 'IMGOPTS' in os.environ:
+result += ['-o', os.environ['IMGOPTS']]
+
+result += remaining
+
+return result
+
 def qemu_img_pipe_and_status(*args: str) -> Tuple[str, int]:
 """
 Run qemu-img and return both its output and its exit code
 """
-full_args = qemu_img_args + list(args)
+full_args = qemu_img_args + qemu_img_create_prepare_args(list(args))
 return qemu_tool_pipe_and_status('qemu-img', full_args)
 
 def qemu_img(*args: str) -> int:
-- 
2.34.1




[PULL 23/24] iotests/migration-permissions: New test

2022-02-01 Thread Hanna Reitz
This test checks that a raw image in use by a virtio-blk device does not
share the WRITE permission both before and after migration.

Signed-off-by: Hanna Reitz 
---
 .../qemu-iotests/tests/migration-permissions  | 101 ++
 .../tests/migration-permissions.out   |   5 +
 2 files changed, 106 insertions(+)
 create mode 100755 tests/qemu-iotests/tests/migration-permissions
 create mode 100644 tests/qemu-iotests/tests/migration-permissions.out

diff --git a/tests/qemu-iotests/tests/migration-permissions 
b/tests/qemu-iotests/tests/migration-permissions
new file mode 100755
index 00..6be02581c7
--- /dev/null
+++ b/tests/qemu-iotests/tests/migration-permissions
@@ -0,0 +1,101 @@
+#!/usr/bin/env python3
+# group: migration
+#
+# Copyright (C) 2021 Red Hat, Inc.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see .
+#
+
+import os
+import iotests
+from iotests import imgfmt, qemu_img_create, qemu_io
+
+
+test_img = os.path.join(iotests.test_dir, 'test.img')
+mig_sock = os.path.join(iotests.sock_dir, 'mig.sock')
+
+
+class TestMigrationPermissions(iotests.QMPTestCase):
+def setUp(self):
+qemu_img_create('-f', imgfmt, test_img, '1M')
+
+# Set up two VMs (source and destination) accessing the same raw
+# image file with a virtio-blk device; prepare the destination for
+# migration with .add_incoming() and enable migration events
+vms = [None, None]
+for i in range(2):
+vms[i] = iotests.VM(path_suffix=f'{i}')
+vms[i].add_blockdev(f'file,node-name=prot,filename={test_img}')
+vms[i].add_blockdev(f'{imgfmt},node-name=fmt,file=prot')
+vms[i].add_device('virtio-blk,drive=fmt')
+
+if i == 1:
+vms[i].add_incoming(f'unix:{mig_sock}')
+
+vms[i].launch()
+
+result = vms[i].qmp('migrate-set-capabilities',
+capabilities=[
+{'capability': 'events', 'state': True}
+])
+self.assert_qmp(result, 'return', {})
+
+self.vm_s = vms[0]
+self.vm_d = vms[1]
+
+def tearDown(self):
+self.vm_s.shutdown()
+self.vm_d.shutdown()
+try:
+os.remove(mig_sock)
+except FileNotFoundError:
+pass
+os.remove(test_img)
+
+# Migrate an image in use by a virtio-blk device to another VM and
+# verify that the WRITE permission is unshared both before and after
+# migration
+def test_post_migration_permissions(self):
+# Try to access the image R/W, which should fail because virtio-blk
+# has not been configured with share-rw=on
+log = qemu_io('-f', imgfmt, '-c', 'quit', test_img)
+if not log.strip():
+print('ERROR (pre-migration): qemu-io should not be able to '
+  'access this image, but it reported no error')
+else:
+# This is the expected output
+assert 'Is another process using the image' in log
+
+# Now migrate the VM
+self.vm_s.qmp('migrate', uri=f'unix:{mig_sock}')
+assert self.vm_s.wait_migration(None)
+assert self.vm_d.wait_migration(None)
+
+# Try the same qemu-io access again, verifying that the WRITE
+# permission remains unshared
+log = qemu_io('-f', imgfmt, '-c', 'quit', test_img)
+if not log.strip():
+print('ERROR (post-migration): qemu-io should not be able to '
+  'access this image, but it reported no error')
+else:
+# This is the expected output
+assert 'Is another process using the image' in log
+
+
+if __name__ == '__main__':
+# Only works with raw images because we are testing the
+# BlockBackend permissions; image format drivers may additionally
+# unshare permissions and thus tamper with the result
+iotests.main(supported_fmts=['raw'],
+ supported_protocols=['file'])
diff --git a/tests/qemu-iotests/tests/migration-permissions.out 
b/tests/qemu-iotests/tests/migration-permissions.out
new file mode 100644
index 00..ae1213e6f8
--- /dev/null
+++ b/tests/qemu-iotests/tests/migration-permissions.out
@@ -0,0 +1,5 @@
+.
+--
+Ran 1 tests
+
+OK
-- 
2.34.1




[PULL 11/24] iotests.py: filter out successful output of qemu-img create

2022-02-01 Thread Hanna Reitz
From: Vladimir Sementsov-Ogievskiy 

The only "feature" of this "Formatting ..." line is that we have to
update it every time we add new option. Let's drop it.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Hanna Reitz 
Message-Id: <20211223160144.1097696-10-vsement...@virtuozzo.com>
Signed-off-by: Hanna Reitz 
---
 tests/qemu-iotests/149.out| 21 -
 tests/qemu-iotests/237.out|  3 ---
 tests/qemu-iotests/255.out|  4 
 tests/qemu-iotests/274.out| 29 -
 tests/qemu-iotests/280.out|  1 -
 tests/qemu-iotests/296.out| 10 +++---
 tests/qemu-iotests/iotests.py | 10 --
 7 files changed, 11 insertions(+), 67 deletions(-)

diff --git a/tests/qemu-iotests/149.out b/tests/qemu-iotests/149.out
index 6877ab6c4a..ab879596ce 100644
--- a/tests/qemu-iotests/149.out
+++ b/tests/qemu-iotests/149.out
@@ -61,7 +61,6 @@ unlink TEST_DIR/luks-aes-256-xts-plain64-sha1.img
 # = qemu-img aes-256-xts-plain64-sha1 =
 # Create image
 qemu-img create -f luks --object secret,id=sec0,data=MTIzNDU2,format=base64 -o 
key-secret=sec0,iter-time=10,cipher-alg=aes-256,cipher-mode=xts,ivgen-alg=plain64,hash-alg=sha1
 TEST_DIR/luks-aes-256-xts-plain64-sha1.img 4194304M
-Formatting 'TEST_DIR/luks-aes-256-xts-plain64-sha1.img', fmt=luks 
size=4398046511104 key-secret=sec0 cipher-alg=aes-256 cipher-mode=xts 
ivgen-alg=plain64 hash-alg=sha1 iter-time=10
 
 # Open dev
 sudo cryptsetup -q -v luksOpen TEST_DIR/luks-aes-256-xts-plain64-sha1.img 
qiotest-145-aes-256-xts-plain64-sha1
@@ -181,7 +180,6 @@ unlink TEST_DIR/luks-twofish-256-xts-plain64-sha1.img
 # = qemu-img twofish-256-xts-plain64-sha1 =
 # Create image
 qemu-img create -f luks --object secret,id=sec0,data=MTIzNDU2,format=base64 -o 
key-secret=sec0,iter-time=10,cipher-alg=twofish-256,cipher-mode=xts,ivgen-alg=plain64,hash-alg=sha1
 TEST_DIR/luks-twofish-256-xts-plain64-sha1.img 4194304M
-Formatting 'TEST_DIR/luks-twofish-256-xts-plain64-sha1.img', fmt=luks 
size=4398046511104 key-secret=sec0 cipher-alg=twofish-256 cipher-mode=xts 
ivgen-alg=plain64 hash-alg=sha1 iter-time=10
 
 # Open dev
 sudo cryptsetup -q -v luksOpen TEST_DIR/luks-twofish-256-xts-plain64-sha1.img 
qiotest-145-twofish-256-xts-plain64-sha1
@@ -301,7 +299,6 @@ unlink TEST_DIR/luks-serpent-256-xts-plain64-sha1.img
 # = qemu-img serpent-256-xts-plain64-sha1 =
 # Create image
 qemu-img create -f luks --object secret,id=sec0,data=MTIzNDU2,format=base64 -o 
key-secret=sec0,iter-time=10,cipher-alg=serpent-256,cipher-mode=xts,ivgen-alg=plain64,hash-alg=sha1
 TEST_DIR/luks-serpent-256-xts-plain64-sha1.img 4194304M
-Formatting 'TEST_DIR/luks-serpent-256-xts-plain64-sha1.img', fmt=luks 
size=4398046511104 key-secret=sec0 cipher-alg=serpent-256 cipher-mode=xts 
ivgen-alg=plain64 hash-alg=sha1 iter-time=10
 
 # Open dev
 sudo cryptsetup -q -v luksOpen TEST_DIR/luks-serpent-256-xts-plain64-sha1.img 
qiotest-145-serpent-256-xts-plain64-sha1
@@ -421,7 +418,6 @@ unlink TEST_DIR/luks-cast5-128-cbc-plain64-sha1.img
 # = qemu-img cast5-128-cbc-plain64-sha1 =
 # Create image
 qemu-img create -f luks --object secret,id=sec0,data=MTIzNDU2,format=base64 -o 
key-secret=sec0,iter-time=10,cipher-alg=cast5-128,cipher-mode=cbc,ivgen-alg=plain64,hash-alg=sha1
 TEST_DIR/luks-cast5-128-cbc-plain64-sha1.img 4194304M
-Formatting 'TEST_DIR/luks-cast5-128-cbc-plain64-sha1.img', fmt=luks 
size=4398046511104 key-secret=sec0 cipher-alg=cast5-128 cipher-mode=cbc 
ivgen-alg=plain64 hash-alg=sha1 iter-time=10
 
 # Open dev
 sudo cryptsetup -q -v luksOpen TEST_DIR/luks-cast5-128-cbc-plain64-sha1.img 
qiotest-145-cast5-128-cbc-plain64-sha1
@@ -542,7 +538,6 @@ unlink TEST_DIR/luks-aes-256-cbc-plain-sha1.img
 # = qemu-img aes-256-cbc-plain-sha1 =
 # Create image
 qemu-img create -f luks --object secret,id=sec0,data=MTIzNDU2,format=base64 -o 
key-secret=sec0,iter-time=10,cipher-alg=aes-256,cipher-mode=cbc,ivgen-alg=plain,hash-alg=sha1
 TEST_DIR/luks-aes-256-cbc-plain-sha1.img 4194304M
-Formatting 'TEST_DIR/luks-aes-256-cbc-plain-sha1.img', fmt=luks 
size=4398046511104 key-secret=sec0 cipher-alg=aes-256 cipher-mode=cbc 
ivgen-alg=plain hash-alg=sha1 iter-time=10
 
 # Open dev
 sudo cryptsetup -q -v luksOpen TEST_DIR/luks-aes-256-cbc-plain-sha1.img 
qiotest-145-aes-256-cbc-plain-sha1
@@ -662,7 +657,6 @@ unlink TEST_DIR/luks-aes-256-cbc-plain64-sha1.img
 # = qemu-img aes-256-cbc-plain64-sha1 =
 # Create image
 qemu-img create -f luks --object secret,id=sec0,data=MTIzNDU2,format=base64 -o 
key-secret=sec0,iter-time=10,cipher-alg=aes-256,cipher-mode=cbc,ivgen-alg=plain64,hash-alg=sha1
 TEST_DIR/luks-aes-256-cbc-plain64-sha1.img 4194304M
-Formatting 'TEST_DIR/luks-aes-256-cbc-plain64-sha1.img', fmt=luks 
size=4398046511104 key-secret=sec0 cipher-alg=aes-256 cipher-mode=cbc 
ivgen-alg=plain64 

[PULL 14/24] qcow2: simple case support for downgrading of qcow2 images with zstd

2022-02-01 Thread Hanna Reitz
From: Vladimir Sementsov-Ogievskiy 

If image doesn't have any compressed cluster we can easily switch to
zlib compression, which may allow to downgrade the image.

That's mostly needed to support IMGOPTS='compression_type=zstd' in some
iotests which do qcow2 downgrade.

While being here also fix checkpatch complain against '#' in printf
formatting.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Max Reitz 
Message-Id: <20211223160144.1097696-13-vsement...@virtuozzo.com>
Signed-off-by: Hanna Reitz 
---
 block/qcow2.c | 58 +--
 1 file changed, 56 insertions(+), 2 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index d509016756..c8115e1cba 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -5279,6 +5279,38 @@ static int qcow2_load_vmstate(BlockDriverState *bs, 
QEMUIOVector *qiov,
 return bs->drv->bdrv_co_preadv_part(bs, offset, qiov->size, qiov, 0, 0);
 }
 
+static int qcow2_has_compressed_clusters(BlockDriverState *bs)
+{
+int64_t offset = 0;
+int64_t bytes = bdrv_getlength(bs);
+
+if (bytes < 0) {
+return bytes;
+}
+
+while (bytes != 0) {
+int ret;
+QCow2SubclusterType type;
+unsigned int cur_bytes = MIN(INT_MAX, bytes);
+uint64_t host_offset;
+
+ret = qcow2_get_host_offset(bs, offset, _bytes, _offset,
+);
+if (ret < 0) {
+return ret;
+}
+
+if (type == QCOW2_SUBCLUSTER_COMPRESSED) {
+return 1;
+}
+
+offset += cur_bytes;
+bytes -= cur_bytes;
+}
+
+return 0;
+}
+
 /*
  * Downgrades an image's version. To achieve this, any incompatible features
  * have to be removed.
@@ -5336,9 +5368,10 @@ static int qcow2_downgrade(BlockDriverState *bs, int 
target_version,
  * the first place; if that happens nonetheless, returning -ENOTSUP is the
  * best thing to do anyway */
 
-if (s->incompatible_features) {
+if (s->incompatible_features & ~QCOW2_INCOMPAT_COMPRESSION) {
 error_setg(errp, "Cannot downgrade an image with incompatible features 
"
-   "%#" PRIx64 " set", s->incompatible_features);
+   "0x%" PRIx64 " set",
+   s->incompatible_features & ~QCOW2_INCOMPAT_COMPRESSION);
 return -ENOTSUP;
 }
 
@@ -5356,6 +5389,27 @@ static int qcow2_downgrade(BlockDriverState *bs, int 
target_version,
 return ret;
 }
 
+if (s->incompatible_features & QCOW2_INCOMPAT_COMPRESSION) {
+ret = qcow2_has_compressed_clusters(bs);
+if (ret < 0) {
+error_setg(errp, "Failed to check block status");
+return -EINVAL;
+}
+if (ret) {
+error_setg(errp, "Cannot downgrade an image with zstd compression "
+   "type and existing compressed clusters");
+return -ENOTSUP;
+}
+/*
+ * No compressed clusters for now, so just chose default zlib
+ * compression.
+ */
+s->incompatible_features &= ~QCOW2_INCOMPAT_COMPRESSION;
+s->compression_type = QCOW2_COMPRESSION_TYPE_ZLIB;
+}
+
+assert(s->incompatible_features == 0);
+
 s->qcow_version = target_version;
 ret = qcow2_update_header(bs);
 if (ret < 0) {
-- 
2.34.1




[PULL 15/24] iotests/common.rc: introduce _qcow2_dump_header helper

2022-02-01 Thread Hanna Reitz
From: Vladimir Sementsov-Ogievskiy 

We'll use it in tests instead of explicit qcow2.py. Then we are going
to add some filtering in _qcow2_dump_header.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Max Reitz 
Message-Id: <20211223160144.1097696-14-vsement...@virtuozzo.com>
Signed-off-by: Hanna Reitz 
---
 tests/qemu-iotests/common.rc | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/tests/qemu-iotests/common.rc b/tests/qemu-iotests/common.rc
index d8582454de..5dea310ea0 100644
--- a/tests/qemu-iotests/common.rc
+++ b/tests/qemu-iotests/common.rc
@@ -996,5 +996,15 @@ _require_one_device_of()
 _notrun "$* not available"
 }
 
+_qcow2_dump_header()
+{
+img="$1"
+if [ -z "$img" ]; then
+img="$TEST_IMG"
+fi
+
+$PYTHON qcow2.py "$img" dump-header
+}
+
 # make sure this script returns success
 true
-- 
2.34.1




[PULL 05/24] iotests: specify some unsupported_imgopts for python iotests

2022-02-01 Thread Hanna Reitz
From: Vladimir Sementsov-Ogievskiy 

We are going to support IMGOPTS for python iotests. Still some iotests
will not work with common IMGOPTS used with bash iotests like
specifying refcount_bits and compat qcow2 options. So we
should define corresponding unsupported_imgopts for now.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Message-Id: <20211223160144.1097696-4-vsement...@virtuozzo.com>
Signed-off-by: Hanna Reitz 
---
 tests/qemu-iotests/044 | 3 ++-
 tests/qemu-iotests/065 | 3 ++-
 tests/qemu-iotests/163 | 3 ++-
 tests/qemu-iotests/165 | 3 ++-
 tests/qemu-iotests/196 | 3 ++-
 tests/qemu-iotests/242 | 3 ++-
 tests/qemu-iotests/246 | 3 ++-
 tests/qemu-iotests/254 | 3 ++-
 tests/qemu-iotests/260 | 3 ++-
 tests/qemu-iotests/274 | 3 ++-
 tests/qemu-iotests/281 | 3 ++-
 tests/qemu-iotests/303 | 3 ++-
 tests/qemu-iotests/tests/migrate-bitmaps-postcopy-test | 3 ++-
 tests/qemu-iotests/tests/migrate-bitmaps-test  | 3 ++-
 tests/qemu-iotests/tests/remove-bitmap-from-backing| 3 ++-
 15 files changed, 30 insertions(+), 15 deletions(-)

diff --git a/tests/qemu-iotests/044 b/tests/qemu-iotests/044
index 64b18eb7c8..d696e6442a 100755
--- a/tests/qemu-iotests/044
+++ b/tests/qemu-iotests/044
@@ -117,4 +117,5 @@ class TestRefcountTableGrowth(iotests.QMPTestCase):
 
 if __name__ == '__main__':
 iotests.main(supported_fmts=['qcow2'],
- supported_protocols=['file'])
+ supported_protocols=['file'],
+ unsupported_imgopts=['refcount_bits'])
diff --git a/tests/qemu-iotests/065 b/tests/qemu-iotests/065
index 3c2ca27627..dc7716275f 100755
--- a/tests/qemu-iotests/065
+++ b/tests/qemu-iotests/065
@@ -139,4 +139,5 @@ TestQMP = None
 
 if __name__ == '__main__':
 iotests.main(supported_fmts=['qcow2'],
- supported_protocols=['file'])
+ supported_protocols=['file'],
+ unsupported_imgopts=['refcount_bits'])
diff --git a/tests/qemu-iotests/163 b/tests/qemu-iotests/163
index dedce8ef43..b8bfc95358 100755
--- a/tests/qemu-iotests/163
+++ b/tests/qemu-iotests/163
@@ -169,4 +169,5 @@ ShrinkBaseClass = None
 
 if __name__ == '__main__':
 iotests.main(supported_fmts=['raw', 'qcow2'],
- supported_protocols=['file'])
+ supported_protocols=['file'],
+ unsupported_imgopts=['compat'])
diff --git a/tests/qemu-iotests/165 b/tests/qemu-iotests/165
index ce499946b8..e3ef28e2ee 100755
--- a/tests/qemu-iotests/165
+++ b/tests/qemu-iotests/165
@@ -157,4 +157,5 @@ class TestPersistentDirtyBitmap(iotests.QMPTestCase):
 
 if __name__ == '__main__':
 iotests.main(supported_fmts=['qcow2'],
- supported_protocols=['file'])
+ supported_protocols=['file'],
+ unsupported_imgopts=['compat'])
diff --git a/tests/qemu-iotests/196 b/tests/qemu-iotests/196
index 2451515094..76509a5ad1 100755
--- a/tests/qemu-iotests/196
+++ b/tests/qemu-iotests/196
@@ -65,4 +65,5 @@ class TestInvalidateAutoclear(iotests.QMPTestCase):
 
 if __name__ == '__main__':
 iotests.main(supported_fmts=['qcow2'],
- supported_protocols=['file'])
+ supported_protocols=['file'],
+ unsupported_imgopts=['compat'])
diff --git a/tests/qemu-iotests/242 b/tests/qemu-iotests/242
index a9b27668c2..96a30152b0 100755
--- a/tests/qemu-iotests/242
+++ b/tests/qemu-iotests/242
@@ -26,7 +26,8 @@ from iotests import qemu_img_create, qemu_io, qemu_img_pipe, \
 file_path, img_info_log, log, filter_qemu_io
 
 iotests.script_initialize(supported_fmts=['qcow2'],
-  supported_protocols=['file'])
+  supported_protocols=['file'],
+  unsupported_imgopts=['refcount_bits', 'compat'])
 
 disk = file_path('disk')
 chunk = 256 * 1024
diff --git a/tests/qemu-iotests/246 b/tests/qemu-iotests/246
index 5932a0e8a9..b009a78397 100755
--- a/tests/qemu-iotests/246
+++ b/tests/qemu-iotests/246
@@ -23,7 +23,8 @@
 import iotests
 from iotests import log
 
-iotests.script_initialize(supported_fmts=['qcow2'])
+iotests.script_initialize(supported_fmts=['qcow2'],
+  unsupported_imgopts=['compat'])
 size = 64 * 1024 * 1024 * 1024
 gran_small = 32 * 1024
 gran_large = 128 * 1024
diff --git a/tests/qemu-iotests/254 b/tests/qemu-iotests/254
index 108bf5f894..7ea098818c 100755
--- a/tests/qemu-iotests/254
+++ b/tests/qemu-iotests/254
@@ -22,7 +22,8 @@
 import iotests
 from iotests import qemu_img_create, file_path, log
 
-iotests.script_initialize(supported_fmts=['qcow2'])

[PULL 13/24] iotest 302: use img_info_log() helper

2022-02-01 Thread Hanna Reitz
From: Vladimir Sementsov-Ogievskiy 

Instead of qemu_img_log("info", ..) use generic helper img_info_log().

img_info_log() has smarter logic. For example it use filter_img_info()
to filter output, which in turns filter a compression type. So it will
help us in future when we implement a possibility to use zstd
compression by default (with help of some runtime config file or maybe
build option). For now to test you should recompile qemu with a small
addition into block/qcow2.c before
"if (qcow2_opts->has_compression_type":

if (!qcow2_opts->has_compression_type && version >= 3) {
qcow2_opts->has_compression_type = true;
qcow2_opts->compression_type = QCOW2_COMPRESSION_TYPE_ZSTD;
}

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Max Reitz 
Message-Id: <20211223160144.1097696-12-vsement...@virtuozzo.com>
Signed-off-by: Hanna Reitz 
---
 tests/qemu-iotests/302 | 4 +++-
 tests/qemu-iotests/302.out | 7 +++
 2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/tests/qemu-iotests/302 b/tests/qemu-iotests/302
index 5695af4914..a6d79e727b 100755
--- a/tests/qemu-iotests/302
+++ b/tests/qemu-iotests/302
@@ -34,6 +34,7 @@ from iotests import (
 qemu_img_measure,
 qemu_io,
 qemu_nbd_popen,
+img_info_log,
 )
 
 iotests.script_initialize(supported_fmts=["qcow2"])
@@ -88,6 +89,7 @@ with tarfile.open(tar_file, "w") as tar:
 tar_file):
 
 iotests.log("=== Target image info ===")
+# Not img_info_log as it enforces imgfmt, but now we print info on raw
 qemu_img_log("info", nbd_uri)
 
 qemu_img(
@@ -99,7 +101,7 @@ with tarfile.open(tar_file, "w") as tar:
 nbd_uri)
 
 iotests.log("=== Converted image info ===")
-qemu_img_log("info", nbd_uri)
+img_info_log(nbd_uri)
 
 iotests.log("=== Converted image check ===")
 qemu_img_log("check", nbd_uri)
diff --git a/tests/qemu-iotests/302.out b/tests/qemu-iotests/302.out
index e2f6077e83..3e7c281b91 100644
--- a/tests/qemu-iotests/302.out
+++ b/tests/qemu-iotests/302.out
@@ -6,14 +6,13 @@ virtual size: 448 KiB (458752 bytes)
 disk size: unavailable
 
 === Converted image info ===
-image: nbd+unix:///exp?socket=SOCK_DIR/PID-nbd-sock
-file format: qcow2
+image: TEST_IMG
+file format: IMGFMT
 virtual size: 1 GiB (1073741824 bytes)
-disk size: unavailable
 cluster_size: 65536
 Format specific information:
 compat: 1.1
-compression type: zlib
+compression type: COMPRESSION_TYPE
 lazy refcounts: false
 refcount bits: 16
 corrupt: false
-- 
2.34.1




[PULL 21/24] iotests: declare lack of support for compresion_type in IMGOPTS

2022-02-01 Thread Hanna Reitz
From: Vladimir Sementsov-Ogievskiy 

compression_type can't be used if we want to create image with
compat=0.10. So, skip these tests, not many of them.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Hanna Reitz 
Message-Id: <20211223160144.1097696-20-vsement...@virtuozzo.com>
Signed-off-by: Hanna Reitz 
---
 tests/qemu-iotests/031 | 5 +++--
 tests/qemu-iotests/051 | 5 +++--
 tests/qemu-iotests/061 | 6 +-
 tests/qemu-iotests/112 | 3 ++-
 tests/qemu-iotests/290 | 2 +-
 5 files changed, 14 insertions(+), 7 deletions(-)

diff --git a/tests/qemu-iotests/031 b/tests/qemu-iotests/031
index 648112f796..ee587b1606 100755
--- a/tests/qemu-iotests/031
+++ b/tests/qemu-iotests/031
@@ -42,8 +42,9 @@ trap "_cleanup; exit \$status" 0 1 2 3 15
 _supported_fmt qcow2
 _supported_proto file fuse
 # We want to test compat=0.10, which does not support external data
-# files or refcount widths other than 16
-_unsupported_imgopts data_file 'refcount_bits=\([^1]\|.\([^6]\|$\)\)'
+# files or refcount widths other than 16 or compression type
+_unsupported_imgopts data_file compression_type \
+'refcount_bits=\([^1]\|.\([^6]\|$\)\)'
 
 CLUSTER_SIZE=65536
 
diff --git a/tests/qemu-iotests/051 b/tests/qemu-iotests/051
index e9042a6214..f1a506518b 100755
--- a/tests/qemu-iotests/051
+++ b/tests/qemu-iotests/051
@@ -41,8 +41,9 @@ _supported_fmt qcow2
 _supported_proto file
 # A compat=0.10 image is created in this test which does not support anything
 # other than refcount_bits=16;
-# it also will not support an external data file
-_unsupported_imgopts 'refcount_bits=\([^1]\|.\([^6]\|$\)\)' data_file
+# it also will not support an external data file and compression type
+_unsupported_imgopts 'refcount_bits=\([^1]\|.\([^6]\|$\)\)' data_file \
+compression_type
 _require_drivers nbd
 
 if [ "$QEMU_DEFAULT_MACHINE" = "pc" ]; then
diff --git a/tests/qemu-iotests/061 b/tests/qemu-iotests/061
index 70edf1a163..513fbec14c 100755
--- a/tests/qemu-iotests/061
+++ b/tests/qemu-iotests/061
@@ -48,7 +48,11 @@ _supported_os Linux
 # not work with it;
 # we have explicit tests for various cluster sizes, the remaining tests
 # require the default 64k cluster
-_unsupported_imgopts 'refcount_bits=\([^1]\|.\([^6]\|$\)\)' data_file 
cluster_size
+# we don't have explicit tests for zstd qcow2 compression type, as zstd may be
+# not compiled in. And we can't create compat images with comression type
+# extension
+_unsupported_imgopts 'refcount_bits=\([^1]\|.\([^6]\|$\)\)' data_file \
+cluster_size compression_type
 
 echo
 echo "=== Testing version downgrade with zero expansion ==="
diff --git a/tests/qemu-iotests/112 b/tests/qemu-iotests/112
index 07ac74fb2c..5333212993 100755
--- a/tests/qemu-iotests/112
+++ b/tests/qemu-iotests/112
@@ -43,7 +43,8 @@ _supported_proto file fuse
 # This test will set refcount_bits on its own which would conflict with the
 # manual setting; compat will be overridden as well;
 # and external data files do not work well with our refcount testing
-_unsupported_imgopts refcount_bits 'compat=0.10' data_file
+# also, compression type is not supported with compat=0.10 used in test
+_unsupported_imgopts refcount_bits 'compat=0.10' data_file compression_type
 
 print_refcount_bits()
 {
diff --git a/tests/qemu-iotests/290 b/tests/qemu-iotests/290
index ed80da2685..776b59de1b 100755
--- a/tests/qemu-iotests/290
+++ b/tests/qemu-iotests/290
@@ -41,7 +41,7 @@ trap "_cleanup; exit \$status" 0 1 2 3 15
 _supported_fmt qcow2
 _supported_proto file fuse
 _supported_os Linux
-_unsupported_imgopts 'compat=0.10' refcount_bits data_file
+_unsupported_imgopts 'compat=0.10' refcount_bits data_file compression_type
 
 echo
 echo "### Test 'qemu-io -c discard' on a QCOW2 image without a backing file"
-- 
2.34.1




[PULL 18/24] iotests: bash tests: filter compression type

2022-02-01 Thread Hanna Reitz
From: Vladimir Sementsov-Ogievskiy 

We want iotests pass with both the default zlib compression and with
IMGOPTS='compression_type=zstd'.

Actually the only test that is interested in real compression type in
test output is 287 (test for qcow2 compression type), so implement
specific option for it.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Hanna Reitz 
Message-Id: <20211223160144.1097696-17-vsement...@virtuozzo.com>
Signed-off-by: Hanna Reitz 
---
 tests/qemu-iotests/060.out   |  2 +-
 tests/qemu-iotests/061.out   | 12 ++--
 tests/qemu-iotests/082.out   | 14 +++---
 tests/qemu-iotests/198.out   |  4 ++--
 tests/qemu-iotests/287   |  8 
 tests/qemu-iotests/common.filter |  8 
 tests/qemu-iotests/common.rc | 14 +-
 7 files changed, 41 insertions(+), 21 deletions(-)

diff --git a/tests/qemu-iotests/060.out b/tests/qemu-iotests/060.out
index b74540bafb..329977d9b9 100644
--- a/tests/qemu-iotests/060.out
+++ b/tests/qemu-iotests/060.out
@@ -17,7 +17,7 @@ virtual size: 64 MiB (67108864 bytes)
 cluster_size: 65536
 Format specific information:
 compat: 1.1
-compression type: zlib
+compression type: COMPRESSION_TYPE
 lazy refcounts: false
 refcount bits: 16
 corrupt: true
diff --git a/tests/qemu-iotests/061.out b/tests/qemu-iotests/061.out
index 7ecbd4dea8..139fc68177 100644
--- a/tests/qemu-iotests/061.out
+++ b/tests/qemu-iotests/061.out
@@ -525,7 +525,7 @@ virtual size: 64 MiB (67108864 bytes)
 cluster_size: 65536
 Format specific information:
 compat: 1.1
-compression type: zlib
+compression type: COMPRESSION_TYPE
 lazy refcounts: false
 refcount bits: 16
 data file: TEST_DIR/t.IMGFMT.data
@@ -552,7 +552,7 @@ virtual size: 64 MiB (67108864 bytes)
 cluster_size: 65536
 Format specific information:
 compat: 1.1
-compression type: zlib
+compression type: COMPRESSION_TYPE
 lazy refcounts: false
 refcount bits: 16
 data file: foo
@@ -567,7 +567,7 @@ virtual size: 64 MiB (67108864 bytes)
 cluster_size: 65536
 Format specific information:
 compat: 1.1
-compression type: zlib
+compression type: COMPRESSION_TYPE
 lazy refcounts: false
 refcount bits: 16
 data file raw: false
@@ -583,7 +583,7 @@ virtual size: 64 MiB (67108864 bytes)
 cluster_size: 65536
 Format specific information:
 compat: 1.1
-compression type: zlib
+compression type: COMPRESSION_TYPE
 lazy refcounts: false
 refcount bits: 16
 data file: TEST_DIR/t.IMGFMT.data
@@ -597,7 +597,7 @@ virtual size: 64 MiB (67108864 bytes)
 cluster_size: 65536
 Format specific information:
 compat: 1.1
-compression type: zlib
+compression type: COMPRESSION_TYPE
 lazy refcounts: false
 refcount bits: 16
 data file: TEST_DIR/t.IMGFMT.data
@@ -612,7 +612,7 @@ virtual size: 64 MiB (67108864 bytes)
 cluster_size: 65536
 Format specific information:
 compat: 1.1
-compression type: zlib
+compression type: COMPRESSION_TYPE
 lazy refcounts: false
 refcount bits: 16
 data file: TEST_DIR/t.IMGFMT.data
diff --git a/tests/qemu-iotests/082.out b/tests/qemu-iotests/082.out
index 077ed0f2c7..d0dd333117 100644
--- a/tests/qemu-iotests/082.out
+++ b/tests/qemu-iotests/082.out
@@ -17,7 +17,7 @@ virtual size: 128 MiB (134217728 bytes)
 cluster_size: 4096
 Format specific information:
 compat: 1.1
-compression type: zlib
+compression type: COMPRESSION_TYPE
 lazy refcounts: true
 refcount bits: 16
 corrupt: false
@@ -31,7 +31,7 @@ virtual size: 128 MiB (134217728 bytes)
 cluster_size: 8192
 Format specific information:
 compat: 1.1
-compression type: zlib
+compression type: COMPRESSION_TYPE
 lazy refcounts: true
 refcount bits: 16
 corrupt: false
@@ -329,7 +329,7 @@ virtual size: 128 MiB (134217728 bytes)
 cluster_size: 4096
 Format specific information:
 compat: 1.1
-compression type: zlib
+compression type: COMPRESSION_TYPE
 lazy refcounts: true
 refcount bits: 16
 corrupt: false
@@ -342,7 +342,7 @@ virtual size: 128 MiB (134217728 bytes)
 cluster_size: 8192
 Format specific information:
 compat: 1.1
-compression type: zlib
+compression type: COMPRESSION_TYPE
 lazy refcounts: true
 refcount bits: 16
 corrupt: false
@@ -639,7 +639,7 @@ virtual size: 128 MiB (134217728 bytes)
 cluster_size: 65536
 Format specific information:
 compat: 1.1
-compression type: zlib
+compression type: COMPRESSION_TYPE
 lazy refcounts: true
 refcount bits: 16
 corrupt: false
@@ -652,7 +652,7 @@ virtual size: 130 MiB (136314880 bytes)
 cluster_size: 65536
 Format specific information:
 compat: 1.1
-compression type: zlib
+compression type: COMPRESSION_TYPE
 lazy refcounts: false
 refcount bits: 16
 corrupt: false
@@ -665,7 +665,7 @@ virtual size: 132 MiB (138412032 bytes)
 cluster_size: 65536
 Format specific information:

[PULL 24/24] block.h: remove outdated comment

2022-02-01 Thread Hanna Reitz
From: Emanuele Giuseppe Esposito 

The comment "disk I/O throttling" doesn't make any sense at all
any more. It was added in commit 0563e191516 to describe
bdrv_io_limits_enable()/disable(), which were removed in commit
97148076, so the comment is just a forgotten leftover.

Suggested-by: Kevin Wolf 
Signed-off-by: Emanuele Giuseppe Esposito 
Message-Id: <20220131125615.74612-1-eespo...@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Hanna Reitz 
---
 include/block/block.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/include/block/block.h b/include/block/block.h
index 9d4050220b..e1713ee306 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -344,7 +344,6 @@ typedef unsigned int BdrvChildRole;
 char *bdrv_perm_names(uint64_t perm);
 uint64_t bdrv_qapi_perm_to_blk_perm(BlockPermission qapi_perm);
 
-/* disk I/O throttling */
 void bdrv_init(void);
 void bdrv_init_with_whitelist(void);
 bool bdrv_uses_whitelist(void);
-- 
2.34.1




[PULL 04/24] iotests.py: implement unsupported_imgopts

2022-02-01 Thread Hanna Reitz
From: Vladimir Sementsov-Ogievskiy 

We are going to support some addition IMGOPTS in python iotests like
in bash iotests. Similarly to bash iotests, we want a way to skip some
tests which can't work with specific IMGOPTS.

Globally for python iotests we will not support things like
'data_file=$TEST_IMG.ext_data_file' in IMGOPTS, so, forbid this
globally in iotests.py.

Suggested-by: Hanna Reitz 
Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Hanna Reitz 
Message-Id: <20211223160144.1097696-3-vsement...@virtuozzo.com>
Signed-off-by: Hanna Reitz 
---
 tests/qemu-iotests/iotests.py | 15 ++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
index 30a8837ea2..2fa5dcba76 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -1226,6 +1226,17 @@ def _verify_virtio_scsi_pci_or_ccw() -> None:
 notrun('Missing virtio-scsi-pci or virtio-scsi-ccw in QEMU binary')
 
 
+def _verify_imgopts(unsupported: Sequence[str] = ()) -> None:
+imgopts = os.environ.get('IMGOPTS')
+# One of usage examples for IMGOPTS is "data_file=$TEST_IMG.ext_data_file"
+# but it supported only for bash tests. We don't have a concept of global
+# TEST_IMG in iotests.py, not saying about somehow parsing $variables.
+# So, for simplicity let's just not support any IMGOPTS with '$' inside.
+unsup = list(unsupported) + ['$']
+if imgopts and any(x in imgopts for x in unsup):
+notrun(f'not suitable for this imgopts: {imgopts}')
+
+
 def supports_quorum():
 return 'quorum' in qemu_img_pipe('--help')
 
@@ -1402,7 +1413,8 @@ def execute_setup_common(supported_fmts: Sequence[str] = 
(),
  unsupported_fmts: Sequence[str] = (),
  supported_protocols: Sequence[str] = (),
  unsupported_protocols: Sequence[str] = (),
- required_fmts: Sequence[str] = ()) -> bool:
+ required_fmts: Sequence[str] = (),
+ unsupported_imgopts: Sequence[str] = ()) -> bool:
 """
 Perform necessary setup for either script-style or unittest-style tests.
 
@@ -1422,6 +1434,7 @@ def execute_setup_common(supported_fmts: Sequence[str] = 
(),
 _verify_aio_mode(supported_aio_modes)
 _verify_formats(required_fmts)
 _verify_virtio_blk()
+_verify_imgopts(unsupported_imgopts)
 
 return debug
 
-- 
2.34.1




Re: [PATCH V2 for-6.2 0/2] fixes for bdrv_co_block_status

2022-02-01 Thread Kevin Wolf
Am 13.01.2022 um 15:44 hat Peter Lieven geschrieben:
> V1->V2:
>  Patch 1: Treat a hole just like an unallocated area. [Ilya]
>  Patch 2: Apply workaround only for pre-Quincy librbd versions and
>   ensure default striping and non child images. [Ilya]
> 
> Peter Lieven (2):
>   block/rbd: fix handling of holes in .bdrv_co_block_status
>   block/rbd: workaround for ceph issue #53784

Thanks, applied to the block branch.

Kevin




[PULL 07/24] iotests: drop qemu_img_verbose() helper

2022-02-01 Thread Hanna Reitz
From: Vladimir Sementsov-Ogievskiy 

qemu_img_verbose() has a drawback of not going through generic
qemu_img_pipe_and_status(). qemu_img_verbose() is not very popular, so
update the only two users to qemu_img_log() and drop qemu_img_verbose()
at all.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Hanna Reitz 
Message-Id: <20211223160144.1097696-6-vsement...@virtuozzo.com>
Signed-off-by: Hanna Reitz 
---
 tests/qemu-iotests/044| 5 +++--
 tests/qemu-iotests/044.out| 1 +
 tests/qemu-iotests/209| 7 ---
 tests/qemu-iotests/209.out| 2 ++
 tests/qemu-iotests/iotests.py | 8 
 5 files changed, 10 insertions(+), 13 deletions(-)

diff --git a/tests/qemu-iotests/044 b/tests/qemu-iotests/044
index d696e6442a..a5ee9a7ded 100755
--- a/tests/qemu-iotests/044
+++ b/tests/qemu-iotests/044
@@ -24,7 +24,7 @@ import os
 import qcow2
 from qcow2 import QcowHeader
 import iotests
-from iotests import qemu_img, qemu_img_verbose, qemu_io
+from iotests import qemu_img, qemu_img_log, qemu_io
 import struct
 import subprocess
 import sys
@@ -112,10 +112,11 @@ class TestRefcountTableGrowth(iotests.QMPTestCase):
 
 def test_grow_refcount_table(self):
 qemu_io('-c', 'write 3800M 1M', test_img)
-qemu_img_verbose('check' , test_img)
+qemu_img_log('check' , test_img)
 pass
 
 if __name__ == '__main__':
+iotests.activate_logging()
 iotests.main(supported_fmts=['qcow2'],
  supported_protocols=['file'],
  unsupported_imgopts=['refcount_bits'])
diff --git a/tests/qemu-iotests/044.out b/tests/qemu-iotests/044.out
index 703cf3dee1..ff663b17d7 100644
--- a/tests/qemu-iotests/044.out
+++ b/tests/qemu-iotests/044.out
@@ -1,6 +1,7 @@
 No errors were found on the image.
 7292415/33554432 = 21.73% allocated, 0.00% fragmented, 0.00% compressed 
clusters
 Image end offset: 4296217088
+
 .
 --
 Ran 1 tests
diff --git a/tests/qemu-iotests/209 b/tests/qemu-iotests/209
index ff7efea11b..f6ad08ec42 100755
--- a/tests/qemu-iotests/209
+++ b/tests/qemu-iotests/209
@@ -20,8 +20,8 @@
 #
 
 import iotests
-from iotests import qemu_img_create, qemu_io, qemu_img_verbose, qemu_nbd, \
-file_path
+from iotests import qemu_img_create, qemu_io, qemu_img_log, qemu_nbd, \
+file_path, log
 
 iotests.script_initialize(supported_fmts=['qcow2'])
 
@@ -33,4 +33,5 @@ qemu_img_create('-f', iotests.imgfmt, disk, '1M')
 qemu_io('-f', iotests.imgfmt, '-c', 'write 0 512K', disk)
 
 qemu_nbd('-k', nbd_sock, '-x', 'exp', '-f', iotests.imgfmt, disk)
-qemu_img_verbose('map', '-f', 'raw', '--output=json', nbd_uri)
+qemu_img_log('map', '-f', 'raw', '--output=json', nbd_uri)
+log('done.')  # avoid new line at the end of output file
diff --git a/tests/qemu-iotests/209.out b/tests/qemu-iotests/209.out
index f27be3fa7b..515906ac7a 100644
--- a/tests/qemu-iotests/209.out
+++ b/tests/qemu-iotests/209.out
@@ -1,2 +1,4 @@
 [{ "start": 0, "length": 524288, "depth": 0, "present": true, "zero": false, 
"data": true, "offset": 0},
 { "start": 524288, "length": 524288, "depth": 0, "present": true, "zero": 
true, "data": false, "offset": 524288}]
+
+done.
diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
index 740f8be36b..cc4bbbcf7b 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -235,14 +235,6 @@ def qemu_img_measure(*args):
 def qemu_img_check(*args):
 return json.loads(qemu_img_pipe("check", "--output", "json", *args))
 
-def qemu_img_verbose(*args):
-'''Run qemu-img without suppressing its output and return the exit code'''
-exitcode = subprocess.call(qemu_img_args + list(args))
-if exitcode < 0:
-sys.stderr.write('qemu-img received signal %i: %s\n'
- % (-exitcode, ' '.join(qemu_img_args + list(args
-return exitcode
-
 def qemu_img_pipe(*args: str) -> str:
 '''Run qemu-img and return its output'''
 return qemu_img_pipe_and_status(*args)[0]
-- 
2.34.1




[PULL 08/24] iotests.py: rewrite default luks support in qemu_img

2022-02-01 Thread Hanna Reitz
From: Vladimir Sementsov-Ogievskiy 

Move the logic to more generic qemu_img_pipe_and_status(). Also behave
better when we have several -o options. And reuse argument parser of
course.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Hanna Reitz 
Message-Id: <20211223160144.1097696-7-vsement...@virtuozzo.com>
Signed-off-by: Hanna Reitz 
---
 tests/qemu-iotests/iotests.py | 36 +--
 1 file changed, 17 insertions(+), 19 deletions(-)

diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
index cc4bbbcf7b..c382c527c8 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -168,9 +168,13 @@ def qemu_img_create_prepare_args(args: List[str]) -> 
List[str]:
 args = args[1:]
 
 p = argparse.ArgumentParser(allow_abbrev=False)
+# -o option may be specified several times
+p.add_argument('-o', action='append', default=[])
 p.add_argument('-f')
 parsed, remaining = p.parse_known_args(args)
 
+opts_list = parsed.o
+
 result = ['create']
 if parsed.f is not None:
 result += ['-f', parsed.f]
@@ -179,8 +183,18 @@ def qemu_img_create_prepare_args(args: List[str]) -> 
List[str]:
 # like extended_l2 or compression_type for qcow2. Test may want to create
 # additional images in other formats that doesn't support these options.
 # So, use IMGOPTS only for images created in imgfmt format.
-if parsed.f == imgfmt and 'IMGOPTS' in os.environ:
-result += ['-o', os.environ['IMGOPTS']]
+imgopts = os.environ.get('IMGOPTS')
+if imgopts and parsed.f == imgfmt:
+opts_list.insert(0, imgopts)
+
+# default luks support
+if parsed.f == 'luks' and \
+all('key-secret' not in opts for opts in opts_list):
+result += ['--object', luks_default_secret_object]
+opts_list.append(luks_default_key_secret_opt)
+
+for opts in opts_list:
+result += ['-o', opts]
 
 result += remaining
 
@@ -211,23 +225,7 @@ def ordered_qmp(qmsg, conv_keys=True):
 return qmsg
 
 def qemu_img_create(*args):
-args = list(args)
-
-# default luks support
-if '-f' in args and args[args.index('-f') + 1] == 'luks':
-if '-o' in args:
-i = args.index('-o')
-if 'key-secret' not in args[i + 1]:
-args[i + 1].append(luks_default_key_secret_opt)
-args.insert(i + 2, '--object')
-args.insert(i + 3, luks_default_secret_object)
-else:
-args = ['-o', luks_default_key_secret_opt,
-'--object', luks_default_secret_object] + args
-
-args.insert(0, 'create')
-
-return qemu_img(*args)
+return qemu_img('create', *args)
 
 def qemu_img_measure(*args):
 return json.loads(qemu_img_pipe("measure", "--output", "json", *args))
-- 
2.34.1




Re: [PATCH v3 0/2] block/export/fuse: Fix build failure on FreeBSD

2022-02-01 Thread Kevin Wolf
Am 01.02.2022 um 12:26 hat Philippe Mathieu-Daudé geschrieben:
> Since v2:
> - Rearrange if-else-if ladder first (Kevin)
> 
> Philippe Mathieu-Daudé (2):
>   block/export/fuse: Rearrange if-else-if ladder in fuse_fallocate()
>   block/export/fuse: Fix build failure on FreeBSD

Thanks, applied to the block branch.

Kevin




[PULL 12/24] iotests.py: filter compression type out

2022-02-01 Thread Hanna Reitz
From: Vladimir Sementsov-Ogievskiy 

We want iotests pass with both the default zlib compression and with
IMGOPTS='compression_type=zstd'.

Actually the only test that is interested in real compression type in
test output is 287 (test for qcow2 compression type) and it's in bash.
So for now we can safely filter out compression type in all qcow2
tests.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Hanna Reitz 
Message-Id: <20211223160144.1097696-11-vsement...@virtuozzo.com>
Signed-off-by: Hanna Reitz 
---
 tests/qemu-iotests/206.out| 10 +-
 tests/qemu-iotests/242.out| 10 +-
 tests/qemu-iotests/274.out| 10 +-
 tests/qemu-iotests/iotests.py |  2 ++
 4 files changed, 17 insertions(+), 15 deletions(-)

diff --git a/tests/qemu-iotests/206.out b/tests/qemu-iotests/206.out
index 80cd274223..7e95694777 100644
--- a/tests/qemu-iotests/206.out
+++ b/tests/qemu-iotests/206.out
@@ -18,7 +18,7 @@ virtual size: 128 MiB (134217728 bytes)
 cluster_size: 65536
 Format specific information:
 compat: 1.1
-compression type: zlib
+compression type: COMPRESSION_TYPE
 lazy refcounts: false
 refcount bits: 16
 corrupt: false
@@ -42,7 +42,7 @@ virtual size: 64 MiB (67108864 bytes)
 cluster_size: 65536
 Format specific information:
 compat: 1.1
-compression type: zlib
+compression type: COMPRESSION_TYPE
 lazy refcounts: false
 refcount bits: 16
 corrupt: false
@@ -66,7 +66,7 @@ virtual size: 32 MiB (33554432 bytes)
 cluster_size: 2097152
 Format specific information:
 compat: 1.1
-compression type: zlib
+compression type: COMPRESSION_TYPE
 lazy refcounts: true
 refcount bits: 1
 corrupt: false
@@ -92,7 +92,7 @@ backing file: TEST_IMG.base
 backing file format: IMGFMT
 Format specific information:
 compat: 0.10
-compression type: zlib
+compression type: COMPRESSION_TYPE
 refcount bits: 16
 
 === Successful image creation (encrypted) ===
@@ -109,7 +109,7 @@ encrypted: yes
 cluster_size: 65536
 Format specific information:
 compat: 1.1
-compression type: zlib
+compression type: COMPRESSION_TYPE
 lazy refcounts: false
 refcount bits: 16
 encrypt:
diff --git a/tests/qemu-iotests/242.out b/tests/qemu-iotests/242.out
index 3759c99284..ce231424a7 100644
--- a/tests/qemu-iotests/242.out
+++ b/tests/qemu-iotests/242.out
@@ -12,7 +12,7 @@ virtual size: 1 MiB (1048576 bytes)
 cluster_size: 65536
 Format specific information:
 compat: 1.1
-compression type: zlib
+compression type: COMPRESSION_TYPE
 lazy refcounts: false
 refcount bits: 16
 corrupt: false
@@ -34,7 +34,7 @@ virtual size: 1 MiB (1048576 bytes)
 cluster_size: 65536
 Format specific information:
 compat: 1.1
-compression type: zlib
+compression type: COMPRESSION_TYPE
 lazy refcounts: false
 bitmaps:
 [0]:
@@ -68,7 +68,7 @@ virtual size: 1 MiB (1048576 bytes)
 cluster_size: 65536
 Format specific information:
 compat: 1.1
-compression type: zlib
+compression type: COMPRESSION_TYPE
 lazy refcounts: false
 bitmaps:
 [0]:
@@ -110,7 +110,7 @@ virtual size: 1 MiB (1048576 bytes)
 cluster_size: 65536
 Format specific information:
 compat: 1.1
-compression type: zlib
+compression type: COMPRESSION_TYPE
 lazy refcounts: false
 bitmaps:
 [0]:
@@ -161,7 +161,7 @@ virtual size: 1 MiB (1048576 bytes)
 cluster_size: 65536
 Format specific information:
 compat: 1.1
-compression type: zlib
+compression type: COMPRESSION_TYPE
 lazy refcounts: false
 bitmaps:
 [0]:
diff --git a/tests/qemu-iotests/274.out b/tests/qemu-iotests/274.out
index 1d2928e14d..1ce40d839a 100644
--- a/tests/qemu-iotests/274.out
+++ b/tests/qemu-iotests/274.out
@@ -50,7 +50,7 @@ backing file: TEST_DIR/PID-base
 backing file format: IMGFMT
 Format specific information:
 compat: 1.1
-compression type: zlib
+compression type: COMPRESSION_TYPE
 lazy refcounts: false
 refcount bits: 16
 corrupt: false
@@ -79,7 +79,7 @@ backing file: TEST_DIR/PID-base
 backing file format: IMGFMT
 Format specific information:
 compat: 1.1
-compression type: zlib
+compression type: COMPRESSION_TYPE
 lazy refcounts: false
 refcount bits: 16
 corrupt: false
@@ -114,7 +114,7 @@ backing file: TEST_DIR/PID-base
 backing file format: IMGFMT
 Format specific information:
 compat: 1.1
-compression type: zlib
+compression type: COMPRESSION_TYPE
 lazy refcounts: false
 refcount bits: 16
 corrupt: false
@@ -141,7 +141,7 @@ virtual size: 2 MiB (2097152 bytes)
 cluster_size: 65536
 Format specific information:
 compat: 1.1
-compression type: zlib
+compression type: COMPRESSION_TYPE
 lazy refcounts: false
 refcount bits: 16
 corrupt: false
@@ -176,7 +176,7 @@ backing file: TEST_DIR/PID-base
 backing file format: IMGFMT
 Format specific information:
 compat: 1.1
-

Re: [PATCH v3 2/2] block/export/fuse: Fix build failure on FreeBSD

2022-02-01 Thread Hanna Reitz

On 01.02.22 12:26, Philippe Mathieu-Daudé wrote:

When building on FreeBSD we get:

   [816/6851] Compiling C object libblockdev.fa.p/block_export_fuse.c.o
   ../block/export/fuse.c:628:16: error: use of undeclared identifier 
'FALLOC_FL_KEEP_SIZE'
   if (mode & FALLOC_FL_KEEP_SIZE) {
  ^
   ../block/export/fuse.c:651:16: error: use of undeclared identifier 
'FALLOC_FL_PUNCH_HOLE'
   if (mode & FALLOC_FL_PUNCH_HOLE) {
  ^
   ../block/export/fuse.c:652:22: error: use of undeclared identifier 
'FALLOC_FL_KEEP_SIZE'
   if (!(mode & FALLOC_FL_KEEP_SIZE)) {
^
   3 errors generated.
   FAILED: libblockdev.fa.p/block_export_fuse.c.o

Meson indeed reported FALLOC_FL_PUNCH_HOLE is not available:

   C compiler for the host machine: cc (clang 10.0.1 "FreeBSD clang version 
10.0.1")
   Checking for function "fallocate" : NO
   Checking for function "posix_fallocate" : YES
   Header  has symbol "FALLOC_FL_PUNCH_HOLE" : NO
   Header  has symbol "FALLOC_FL_ZERO_RANGE" : NO
   ...

Similarly to commit 304332039 ("block/export/fuse.c: fix musl build"),
guard the code requiring FALLOC_FL_KEEP_SIZE / FALLOC_FL_PUNCH_HOLE
definitions under CONFIG_FALLOCATE_PUNCH_HOLE #ifdef'ry.

Signed-off-by: Philippe Mathieu-Daudé 
---
  block/export/fuse.c | 4 
  1 file changed, 4 insertions(+)

diff --git a/block/export/fuse.c b/block/export/fuse.c
index d25e478c0a2..fdda8e3c818 100644
--- a/block/export/fuse.c
+++ b/block/export/fuse.c
@@ -625,9 +625,11 @@ static void fuse_fallocate(fuse_req_t req, fuse_ino_t 
inode, int mode,
  return;
  }
  
+#ifdef CONFIG_FALLOCATE_PUNCH_HOLE

  if (mode & FALLOC_FL_KEEP_SIZE) {
  length = MIN(length, blk_len - offset);
  }
+#endif /* CONFIG_FALLOCATE_PUNCH_HOLE */
  
  if (!mode) {

  /* We can only fallocate at the EOF with a truncate */
@@ -648,6 +650,7 @@ static void fuse_fallocate(fuse_req_t req, fuse_ino_t 
inode, int mode,
  ret = fuse_do_truncate(exp, offset + length, true,
 PREALLOC_MODE_FALLOC);
  }
+#ifdef CONFIG_FALLOCATE_PUNCH_HOLE
  else if (mode & FALLOC_FL_PUNCH_HOLE) {
  if (!(mode & FALLOC_FL_KEEP_SIZE)) {
  fuse_reply_err(req, EINVAL);
@@ -662,6 +665,7 @@ static void fuse_fallocate(fuse_req_t req, fuse_ino_t 
inode, int mode,
  length -= size;
  } while (ret == 0 && length > 0);
  }
+#endif /* CONFIG_FALLOCATE_PUNCH_HOLE */
  #ifdef CONFIG_FALLOCATE_ZERO_RANGE
  else if (mode & FALLOC_FL_ZERO_RANGE) {
  if (!(mode & FALLOC_FL_KEEP_SIZE) && offset + length > blk_len) {


I believe CONFIG_FALLOCATE_ZERO_RANGE only guarantees the presence of 
FALLOC_FL_ZERO_RANGE, so we probably should check for 
CONFIG_FALLOCATE_PUNCH_HOLE, too, here.


(Maybe in practice FALLOC_FL_ZERO_RANGE guarantees that 
FALLOC_FL_KEEP_SIZE exists, too, but then a check just wouldn’t hurt.)


Hanna




Re: [PATCH] qemu-img: Unify [-b [-F]] documentation

2022-02-01 Thread Vladimir Sementsov-Ogievskiy

31.01.2022 16:59, Hanna Reitz wrote:

qemu-img convert documents the backing file and backing format options
as follows:
 [-B backing_file [-F backing_fmt]]
whereas qemu-img create has this:
 [-b backing_file] [-F backing_fmt]

That is, for convert, we document that -F cannot be given without -B,
while for create, way say that they are independent.

Indeed, it is technically possible to give -F without -b, because it is
left to the block driver to decide whether this is an error or not, so
sometimes it is:

$ qemu-img create -f qed -F qed test.qed 64M
Formatting 'test.qed', fmt=qed size=67108864 backing_fmt=qed [...]

And sometimes it is not:

$ qemu-img create -f qcow2 -F qcow2 test.qcow2 64M
Formatting 'test.qcow2', fmt=qcow2 cluster_size=65536 [...]
qemu-img: test.qcow2: Backing format cannot be used without backing file

Generally, it does not make much sense, though, and users should only
give -F with -b, so document it that way, as we have already done for
qemu-img convert (commit 1899bf47375ad40555dcdff12ba49b4b8b82df38).

Reported-by: Tingting Mao
Signed-off-by: Hanna Reitz


Reviewed-by: Vladimir Sementsov-Ogievskiy 

--
Best regards,
Vladimir



Re: [PATCH v4 1/4] python/aqmp: Fix negotiation with pre-"oob" QEMU

2022-02-01 Thread Kevin Wolf
Am 01.02.2022 um 05:11 hat John Snow geschrieben:
> QEMU versions prior to the "oob" capability *also* can't accept the
> "enable" keyword argument at all. Fix the handshake process with older
> QEMU versions.
> 
> Signed-off-by: John Snow 
> Reviewed-by: Hanna Reitz 
> ---
>  python/qemu/aqmp/qmp_client.py | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/python/qemu/aqmp/qmp_client.py b/python/qemu/aqmp/qmp_client.py
> index f1a845cc82..90a8737f03 100644
> --- a/python/qemu/aqmp/qmp_client.py
> +++ b/python/qemu/aqmp/qmp_client.py
> @@ -292,9 +292,9 @@ async def _negotiate(self) -> None:
>  """
>  self.logger.debug("Negotiating capabilities ...")
>  
> -arguments: Dict[str, List[str]] = {'enable': []}
> +arguments: Dict[str, List[str]] = {}
>  if self._greeting and 'oob' in self._greeting.QMP.capabilities:
> -arguments['enable'].append('oob')
> +arguments.setdefault('enable', []).append('oob')
>  msg = self.make_execute_msg('qmp_capabilities', arguments=arguments)

In case you have some interest in bike sheds:

As long as we only ever append a single capability, it doesn't really
make a difference and an explicit setdefault() when adding it is fine.
But if we had more than one, maybe making arguments a defaultdict(list)
would be nicer.

Not worth respinning, of course, if you don't for another reason.

Kevin




Re: [RFC] block/nbd: Move s->ioc on AioContext change

2022-02-01 Thread Vladimir Sementsov-Ogievskiy

01.02.2022 14:40, Hanna Reitz wrote:

On 01.02.22 12:18, Vladimir Sementsov-Ogievskiy wrote:

28.01.2022 18:51, Hanna Reitz wrote:

s->ioc must always be attached to the NBD node's AioContext.  If that
context changes, s->ioc must be attached to the new context.

Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1990835
Signed-off-by: Hanna Reitz 
---
This is an RFC because I believe there are some other things in the NBD
block driver that need attention on an AioContext change, too. Namely,
there are two timers (reconnect_delay_timer and open_timer) that are
also attached to the node's AioContext, and I'm afraid they need to be
handled, too.  Probably pause them on detach, and resume them on attach,
but I'm not sure, which is why I'm posting this as an RFC to get some
comments from that from someone who knows this code better than me. :)

(Also, in a real v1, of course I'd want to add a regression test.)
---
  block/nbd.c | 28 
  1 file changed, 28 insertions(+)

diff --git a/block/nbd.c b/block/nbd.c
index 63dbfa807d..119a774c04 100644
--- a/block/nbd.c
+++ b/block/nbd.c
@@ -2036,6 +2036,25 @@ static void nbd_cancel_in_flight(BlockDriverState *bs)
  nbd_co_establish_connection_cancel(s->conn);
  }
  +static void nbd_attach_aio_context(BlockDriverState *bs,
+   AioContext *new_context)
+{
+    BDRVNBDState *s = bs->opaque;
+
+    if (s->ioc) {
+    qio_channel_attach_aio_context(s->ioc, new_context);
+    }
+}
+
+static void nbd_detach_aio_context(BlockDriverState *bs)
+{
+    BDRVNBDState *s = bs->opaque;
+
+    if (s->ioc) {
+    qio_channel_detach_aio_context(s->ioc);
+    }
+}
+
  static BlockDriver bdrv_nbd = {
  .format_name    = "nbd",
  .protocol_name  = "nbd",
@@ -2059,6 +2078,9 @@ static BlockDriver bdrv_nbd = {
  .bdrv_dirname   = nbd_dirname,
  .strong_runtime_opts    = nbd_strong_runtime_opts,
  .bdrv_cancel_in_flight  = nbd_cancel_in_flight,
+
+    .bdrv_attach_aio_context    = nbd_attach_aio_context,
+    .bdrv_detach_aio_context    = nbd_detach_aio_context,
  };
    static BlockDriver bdrv_nbd_tcp = {
@@ -2084,6 +2106,9 @@ static BlockDriver bdrv_nbd_tcp = {
  .bdrv_dirname   = nbd_dirname,
  .strong_runtime_opts    = nbd_strong_runtime_opts,
  .bdrv_cancel_in_flight  = nbd_cancel_in_flight,
+
+    .bdrv_attach_aio_context    = nbd_attach_aio_context,
+    .bdrv_detach_aio_context    = nbd_detach_aio_context,
  };
    static BlockDriver bdrv_nbd_unix = {
@@ -2109,6 +2134,9 @@ static BlockDriver bdrv_nbd_unix = {
  .bdrv_dirname   = nbd_dirname,
  .strong_runtime_opts    = nbd_strong_runtime_opts,
  .bdrv_cancel_in_flight  = nbd_cancel_in_flight,
+
+    .bdrv_attach_aio_context    = nbd_attach_aio_context,
+    .bdrv_detach_aio_context    = nbd_detach_aio_context,
  };
    static void bdrv_nbd_init(void)




Hmm. I was so happy to remove these handlers together with connection-coroutine 
:) . But you are right, seems I've removed too much :(.


open_timer exists only during bdrv_open() handler, so, I hope on attach/detach 
it should not exist.


That’s… kind of surprising.  It’s good for me here, but as far as I can see it 
means that all of qemu blocks until the connection succeeds, right?  That 
doesn’t seem quite ideal...


Right. Still the intended usage was for command-line, so we can wait for 
connection on Qemu start.

Using it in blockdev-add when vm is running is doubt-able. In v3 I had a patch 
to make blockdev-add a coroutine qmp command to solve this problem. But it 
raised a discussion and I decided that it's not a reason to block the whole 
feature.

https://patchwork.kernel.org/project/qemu-devel/patch/20210906190654.183421-3-vsement...@virtuozzo.com/



Anyway, good for me. O:)


reconnect_delay_timer should exist only during IO request: it's created during 
request if we don't have a connection. And request will not finish until timer 
elapsed or connection established (timer should be removed in this case too). 
So, again, when attaching / detaching the context we should be in a drained 
sections, so no in-flight requests and no reconnect_delay_timer.


Got it.  FWIW, other block drivers rely on this, too (e.g. null-aio with 
latency-ns set creates a timer in every I/O request and settles the request 
once the timer expires).



So, I think assertions that both timer pointers are NULL should be enough in 
attach / detach handlers.



Great!  I’ll cook up v1.




--
Best regards,
Vladimir



Re: [PATCH] qemu-img: Unify [-b [-F]] documentation

2022-02-01 Thread Kevin Wolf
Am 31.01.2022 um 14:59 hat Hanna Reitz geschrieben:
> qemu-img convert documents the backing file and backing format options
> as follows:
> [-B backing_file [-F backing_fmt]]
> whereas qemu-img create has this:
> [-b backing_file] [-F backing_fmt]
> 
> That is, for convert, we document that -F cannot be given without -B,
> while for create, way say that they are independent.
> 
> Indeed, it is technically possible to give -F without -b, because it is
> left to the block driver to decide whether this is an error or not, so
> sometimes it is:
> 
> $ qemu-img create -f qed -F qed test.qed 64M
> Formatting 'test.qed', fmt=qed size=67108864 backing_fmt=qed [...]
> 
> And sometimes it is not:
> 
> $ qemu-img create -f qcow2 -F qcow2 test.qcow2 64M
> Formatting 'test.qcow2', fmt=qcow2 cluster_size=65536 [...]
> qemu-img: test.qcow2: Backing format cannot be used without backing file
> 
> Generally, it does not make much sense, though, and users should only
> give -F with -b, so document it that way, as we have already done for
> qemu-img convert (commit 1899bf47375ad40555dcdff12ba49b4b8b82df38).
> 
> Reported-by: Tingting Mao 
> Signed-off-by: Hanna Reitz 

Thanks, applied to the block branch.

Kevin




Re: [PATCH v2] qsd: Document fuse's allow-other option

2022-02-01 Thread Kevin Wolf
Am 31.01.2022 um 11:31 hat Hanna Reitz geschrieben:
> We did not add documentation to the storage daemon's man page for fuse's
> allow-other option when it was introduced, so do that now.
> 
> Fixes: 8fc54f9428b9763f800 ("export/fuse: Add allow-other option")
> Signed-off-by: Hanna Reitz 

Thanks, applied to the block branch.

Kevin




Re: [PATCH] block.h: remove outdated comment

2022-02-01 Thread Hanna Reitz

On 31.01.22 13:56, Emanuele Giuseppe Esposito wrote:

The comment "disk I/O throttling" doesn't make any sense at all
any more. It was added in commit 0563e191516 to describe
bdrv_io_limits_enable()/disable(), which were removed in commit
97148076, so the comment is just a forgotten leftover.

Suggested-by: Kevin Wolf 
Signed-off-by: Emanuele Giuseppe Esposito 
---
  include/block/block.h | 1 -
  1 file changed, 1 deletion(-)


Thanks, applied to my block branch:

https://gitlab.com/hreitz/qemu/-/commits/block




Re: [PATCH] block.h: remove outdated comment

2022-02-01 Thread Kevin Wolf
Am 31.01.2022 um 13:56 hat Emanuele Giuseppe Esposito geschrieben:
> The comment "disk I/O throttling" doesn't make any sense at all
> any more. It was added in commit 0563e191516 to describe
> bdrv_io_limits_enable()/disable(), which were removed in commit
> 97148076, so the comment is just a forgotten leftover.
> 
> Suggested-by: Kevin Wolf 
> Signed-off-by: Emanuele Giuseppe Esposito 

Thanks, applied to the block branch.

Kevin




Re: [PATCH v6 01/33] main-loop.h: introduce qemu_in_main_thread()

2022-02-01 Thread Kevin Wolf
Am 01.02.2022 um 12:55 hat Kevin Wolf geschrieben:
> Am 31.01.2022 um 16:49 hat Paolo Bonzini geschrieben:
> > > > However, commit 5f50be9b5810293141bb53cfd0cb46e765367d56 changed the
> > > > stub to return false for a specific reason.
> > > 
> > > I must admit I don't really understand the reasoning there:
> > > 
> > >  With this change, the stub qemu_mutex_iothread_locked() must be 
> > > changed
> > >  from true to false.  The previous value of true was needed because 
> > > the
> > >  main thread did not have an AioContext in the thread-local variable,
> > >  but now it does have one.
> > > 
> > > This explains why it _can_ be changed to false for this caller, but not
> > > why it _must_ be changed.
> > 
> > See above: because it returns the wrong value for all threads except one,
> > and there are better ways to do a meaningful check in that one thread: using
> > qemu_get_current_aio_context(), which is what aio_co_enter did at the time
> > and qemu_in_main_thread() does with Emanuele's change.
> > 
> > > So is the problem with the unit tests really just that they never call
> > > qemu_init_main_loop() and therefore never set the AioContext for the
> > > main thread?
> > 
> > No, most of them do (and if some are missing it we can add it).
> 
> But if they do, why doesn't qemu_get_current_aio_context() already
> return the right result? In this case, my_aiocontext is set and it
> should never even call qemu_mutex_iothread_locked() - at least not in
> any case where qemu_in_main_thread() would return true.
> 
> So provided that qemu_init_main_loop() is called, both functions should
> be equivalent and we wouldn't need a second one.

Sorry, I was confused and comparing the wrong two functions.
qemu_get_current_aio_context() does return the right result and it's
exactly what qemu_in_main_thread() uses. So yes, it's right and it's
different from qemu_mutex_iothread_locked().

It would be less confusing if qemu_get_current_aio_context() used
qemu_in_main_thread() (with two different implementations of
qemu_in_main_thread() for the system emulator and tools) instead of the
other way around, but I guess that's harder to implement because we
would need a different way to figure out whether we're in the main
thread, at least as long as my_aiocontext is static and can't be
accessed in stubs. We could probably make it public, though.

Kevin




Re: [PATCH v6 01/33] main-loop.h: introduce qemu_in_main_thread()

2022-02-01 Thread Kevin Wolf
Am 31.01.2022 um 16:49 hat Paolo Bonzini geschrieben:
> > > However, commit 5f50be9b5810293141bb53cfd0cb46e765367d56 changed the
> > > stub to return false for a specific reason.
> > 
> > I must admit I don't really understand the reasoning there:
> > 
> >  With this change, the stub qemu_mutex_iothread_locked() must be changed
> >  from true to false.  The previous value of true was needed because the
> >  main thread did not have an AioContext in the thread-local variable,
> >  but now it does have one.
> > 
> > This explains why it _can_ be changed to false for this caller, but not
> > why it _must_ be changed.
> 
> See above: because it returns the wrong value for all threads except one,
> and there are better ways to do a meaningful check in that one thread: using
> qemu_get_current_aio_context(), which is what aio_co_enter did at the time
> and qemu_in_main_thread() does with Emanuele's change.
> 
> > So is the problem with the unit tests really just that they never call
> > qemu_init_main_loop() and therefore never set the AioContext for the
> > main thread?
> 
> No, most of them do (and if some are missing it we can add it).

But if they do, why doesn't qemu_get_current_aio_context() already
return the right result? In this case, my_aiocontext is set and it
should never even call qemu_mutex_iothread_locked() - at least not in
any case where qemu_in_main_thread() would return true.

So provided that qemu_init_main_loop() is called, both functions should
be equivalent and we wouldn't need a second one.

Kevin




Re: [PATCH] block/curl.c: Check error return from curl_easy_setopt()

2022-02-01 Thread Hanna Reitz

On 28.01.22 17:55, Peter Maydell wrote:

Coverity points out that we aren't checking the return value
from curl_easy_setopt() for any of the calls to it we make
in block/curl.c.

Fixes: Coverity CID 1459336, 1459482, 1460331
Signed-off-by: Peter Maydell 
---
Big fat disclaimer: tested only with 'make check', which I suspect
may not be exercising this block backend. Hints on how to test
more thoroughly are welcome.

  block/curl.c | 90 +---
  1 file changed, 58 insertions(+), 32 deletions(-)


One problem I see in general is that most of the setopt functions are 
(indirectly) called from `curl_open()`, which is supposed to return an 
error message.  Its `out` label seems to expect some error description 
in `state->errmsg`.  The error handling here doesn’t set such a description.


Then again, there are enough existing error paths that don’t set this 
description either, so it isn’t quite this patch’s duty to fix that 
situation.  I guess it would be nice if we had a wrapper for 
`curl_easy_setopt()` with an `Error **` parameter, so we could easily 
generate error messages that describe key and value (and then 
`curl_init_state()` should have an `Error **` parameter, too).


But this patch doesn’t make anything worse than it already is, so that’d 
rather be an idea for future clean-up.



diff --git a/block/curl.c b/block/curl.c
index 6a6cd729758..aaee1b17bef 100644
--- a/block/curl.c
+++ b/block/curl.c


[...]


@@ -879,7 +902,10 @@ static void curl_setup_preadv(BlockDriverState *bs, 
CURLAIOCB *acb)
  
  snprintf(state->range, 127, "%" PRIu64 "-%" PRIu64, start, end);

  trace_curl_setup_preadv(acb->bytes, start, state->range);
-curl_easy_setopt(state->curl, CURLOPT_RANGE, state->range);
+if (curl_easy_setopt(state->curl, CURLOPT_RANGE, state->range)) {
+curl_clean_state(state);
+goto out;


I think we need to mark the request as failed by setting `acb->ret` to a 
negative value (and probably also clear `state->acb[0]` like the error 
path below does).


Hanna


+}
  
  if (curl_multi_add_handle(s->multi, state->curl) != CURLM_OK) {

  state->acb[0] = NULL;





Re: [PATCH] block: bdrv_set_backing_hd(): use drained section

2022-02-01 Thread Vladimir Sementsov-Ogievskiy

28.01.2022 17:12, Emanuele Giuseppe Esposito wrote:



On 27/01/2022 15:13, Kevin Wolf wrote:

Am 25.01.2022 um 11:12 hat Vladimir Sementsov-Ogievskiy geschrieben:

25.01.2022 12:24, Paolo Bonzini wrote:

On 1/24/22 18:37, Vladimir Sementsov-Ogievskiy wrote:

Graph modifications should be done in drained section. stream_prepare()
handler of block stream job call bdrv_set_backing_hd() without using
drained section and it's theoretically possible that some IO request
will interleave with graph modification and will use outdated pointers
to removed block nodes.

Some other callers use bdrv_set_backing_hd() not caring about drained
sections too. So it seems good to make a drained section exactly in
bdrv_set_backing_hd().


Emanuele has a similar patch in his series to protect all graph
modifications with drains:

@@ -3456,6 +3478,11 @@ int bdrv_set_backing_hd(BlockDriverState *bs, 
BlockDriverState *backing_hd,

   assert(qemu_in_main_thread());

+    bdrv_subtree_drained_begin_unlocked(bs);
+    if (backing_hd) {
+    bdrv_subtree_drained_begin_unlocked(backing_hd);
+    }
+
   ret = bdrv_set_backing_noperm(bs, backing_hd, tran, errp);
   if (ret < 0) {
   goto out;
@@ -3464,6 +3491,10 @@ int bdrv_set_backing_hd(BlockDriverState *bs, 
BlockDriverState *backing_hd,
   ret = bdrv_refresh_perms(bs, errp);
   out:
   tran_finalize(tran, ret);
+    if (backing_hd) {
+    bdrv_subtree_drained_end_unlocked(backing_hd);
+    }
+    bdrv_subtree_drained_end_unlocked(bs);

   return ret;
   }

so the idea at least is correct.

I don't object to fixing this independently, but please check
1) if a subtree drain would be more appropriate, 2) whether
backing_hd should be drained as well, 3) whether we're guaranteed
to be holding the AioContext lock as required for
bdrv_drained_begin/end.



Hmm.

1. Subtree draining of backing_hd will not help, as bs is not drained,
we still may have in-fight request in bs, touching old bs->backing.


What do you mean bs is not drained? In my patch I drain both.



2. I think non-recursive drain of bs is enough. We modify only bs
node, so we should drain it. backing_hd itself is not modified. If
backing_hd participate in some other backing chain - it's not touched,
and in-flight requests in that other chain are not broken by
modification, so why to drain it? Same for old bs->backing and other
bs children. We are not interested in in-flight requests in subtree
which are not part of request in bs. So, if no inflight requests in
bs, we can modify bs and not care about requests in subtree.


I agree on both points. Emanuele's patch seems to be doing unnecessary
work there.


Wait, the point of my patches[*] is to protect
bdrv_replace_child_noperm(). See the cover letter of my series for more
info.

The reason for a subtree drain is that one callback of .attach() in
bdrv_replace_child_noperm() is bdrv_child_cb_attach().
This attaches the node in child->opaque->children list, so both nodes
pointed by the BdrvChild are modified (child->bs and child->opaque).
Simply draining on child->bs won't be enough to also get child->opaque
in my opinion[*].


So you mean that backing_hd is modified as its parent is changed, do I 
understand correctly?

Yes, it is modified in this point of view, but why this needs drain? If root bs 
is drained, we don't have in-flight requests touching this modified 
parent-child relationship.



Same applies with detach.
One interesting side note is what happens if we are moving the child
from one bs to another (old_bs and new_bs are both != NULL):
child->opaque will just lose and re-gain the same child.

Regarding this specific drain: I am missing something for sure here,
because if I try to follow the code I see that from
bdrv_set_backing_hd(bs, backing_hd)
the call stack eventually ends up to
bdrv_replace_child_noperm(child, new_bs /* -> backing_hd */)
and then the graph modification there is:
QLIST_INSERT_HEAD(_bs->parents, child, next_parent);

So why not protecting backing_hd?

Full stack:
bdrv_set_backing_hd(bs, backing_hd)
  bdrv_set_backing_noperm(bs, backing_hd)
   bdrv_set_file_or_backing_noperm(bs, backing_hd)
 bdrv_attach_child_noperm(parent_bs = bs, child_bs = backing_hd)
  bdrv_attach_child_common(child_bs = backing_hd, .., parent_bs = bs)
   new_child.bs = NULL;
   new_child.opaque = parent_bs = bs;
   bdrv_replace_child_noperm(new_child, child_bs = backing_hd)

[*] = BTW, I see that you understand this stuff way deeper than I do, so
feel free to review my drained-related series if you have time :)




3. Jobs are bound to aio context, so I believe that they care to hold
AioContext lock. For example, on path job_prepare may be called
through job_exit(), job_exit() does
aio_context_acquire(job->aio_context), or it may be called through
job_cancel(), which seems to be called under aio_context_acquire() as
well. So, seems in general we care about it, and of course
bdrv_set_backing_hd() must be called 

[PATCH v3 2/2] block/export/fuse: Fix build failure on FreeBSD

2022-02-01 Thread Philippe Mathieu-Daudé via
When building on FreeBSD we get:

  [816/6851] Compiling C object libblockdev.fa.p/block_export_fuse.c.o
  ../block/export/fuse.c:628:16: error: use of undeclared identifier 
'FALLOC_FL_KEEP_SIZE'
  if (mode & FALLOC_FL_KEEP_SIZE) {
 ^
  ../block/export/fuse.c:651:16: error: use of undeclared identifier 
'FALLOC_FL_PUNCH_HOLE'
  if (mode & FALLOC_FL_PUNCH_HOLE) {
 ^
  ../block/export/fuse.c:652:22: error: use of undeclared identifier 
'FALLOC_FL_KEEP_SIZE'
  if (!(mode & FALLOC_FL_KEEP_SIZE)) {
   ^
  3 errors generated.
  FAILED: libblockdev.fa.p/block_export_fuse.c.o

Meson indeed reported FALLOC_FL_PUNCH_HOLE is not available:

  C compiler for the host machine: cc (clang 10.0.1 "FreeBSD clang version 
10.0.1")
  Checking for function "fallocate" : NO
  Checking for function "posix_fallocate" : YES
  Header  has symbol "FALLOC_FL_PUNCH_HOLE" : NO
  Header  has symbol "FALLOC_FL_ZERO_RANGE" : NO
  ...

Similarly to commit 304332039 ("block/export/fuse.c: fix musl build"),
guard the code requiring FALLOC_FL_KEEP_SIZE / FALLOC_FL_PUNCH_HOLE
definitions under CONFIG_FALLOCATE_PUNCH_HOLE #ifdef'ry.

Signed-off-by: Philippe Mathieu-Daudé 
---
 block/export/fuse.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/block/export/fuse.c b/block/export/fuse.c
index d25e478c0a2..fdda8e3c818 100644
--- a/block/export/fuse.c
+++ b/block/export/fuse.c
@@ -625,9 +625,11 @@ static void fuse_fallocate(fuse_req_t req, fuse_ino_t 
inode, int mode,
 return;
 }
 
+#ifdef CONFIG_FALLOCATE_PUNCH_HOLE
 if (mode & FALLOC_FL_KEEP_SIZE) {
 length = MIN(length, blk_len - offset);
 }
+#endif /* CONFIG_FALLOCATE_PUNCH_HOLE */
 
 if (!mode) {
 /* We can only fallocate at the EOF with a truncate */
@@ -648,6 +650,7 @@ static void fuse_fallocate(fuse_req_t req, fuse_ino_t 
inode, int mode,
 ret = fuse_do_truncate(exp, offset + length, true,
PREALLOC_MODE_FALLOC);
 }
+#ifdef CONFIG_FALLOCATE_PUNCH_HOLE
 else if (mode & FALLOC_FL_PUNCH_HOLE) {
 if (!(mode & FALLOC_FL_KEEP_SIZE)) {
 fuse_reply_err(req, EINVAL);
@@ -662,6 +665,7 @@ static void fuse_fallocate(fuse_req_t req, fuse_ino_t 
inode, int mode,
 length -= size;
 } while (ret == 0 && length > 0);
 }
+#endif /* CONFIG_FALLOCATE_PUNCH_HOLE */
 #ifdef CONFIG_FALLOCATE_ZERO_RANGE
 else if (mode & FALLOC_FL_ZERO_RANGE) {
 if (!(mode & FALLOC_FL_KEEP_SIZE) && offset + length > blk_len) {
-- 
2.34.1




Re: [RFC] block/nbd: Move s->ioc on AioContext change

2022-02-01 Thread Hanna Reitz

On 01.02.22 12:18, Vladimir Sementsov-Ogievskiy wrote:

28.01.2022 18:51, Hanna Reitz wrote:

s->ioc must always be attached to the NBD node's AioContext.  If that
context changes, s->ioc must be attached to the new context.

Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1990835
Signed-off-by: Hanna Reitz 
---
This is an RFC because I believe there are some other things in the NBD
block driver that need attention on an AioContext change, too. Namely,
there are two timers (reconnect_delay_timer and open_timer) that are
also attached to the node's AioContext, and I'm afraid they need to be
handled, too.  Probably pause them on detach, and resume them on attach,
but I'm not sure, which is why I'm posting this as an RFC to get some
comments from that from someone who knows this code better than me. :)

(Also, in a real v1, of course I'd want to add a regression test.)
---
  block/nbd.c | 28 
  1 file changed, 28 insertions(+)

diff --git a/block/nbd.c b/block/nbd.c
index 63dbfa807d..119a774c04 100644
--- a/block/nbd.c
+++ b/block/nbd.c
@@ -2036,6 +2036,25 @@ static void 
nbd_cancel_in_flight(BlockDriverState *bs)

  nbd_co_establish_connection_cancel(s->conn);
  }
  +static void nbd_attach_aio_context(BlockDriverState *bs,
+   AioContext *new_context)
+{
+    BDRVNBDState *s = bs->opaque;
+
+    if (s->ioc) {
+    qio_channel_attach_aio_context(s->ioc, new_context);
+    }
+}
+
+static void nbd_detach_aio_context(BlockDriverState *bs)
+{
+    BDRVNBDState *s = bs->opaque;
+
+    if (s->ioc) {
+    qio_channel_detach_aio_context(s->ioc);
+    }
+}
+
  static BlockDriver bdrv_nbd = {
  .format_name    = "nbd",
  .protocol_name  = "nbd",
@@ -2059,6 +2078,9 @@ static BlockDriver bdrv_nbd = {
  .bdrv_dirname   = nbd_dirname,
  .strong_runtime_opts    = nbd_strong_runtime_opts,
  .bdrv_cancel_in_flight  = nbd_cancel_in_flight,
+
+    .bdrv_attach_aio_context    = nbd_attach_aio_context,
+    .bdrv_detach_aio_context    = nbd_detach_aio_context,
  };
    static BlockDriver bdrv_nbd_tcp = {
@@ -2084,6 +2106,9 @@ static BlockDriver bdrv_nbd_tcp = {
  .bdrv_dirname   = nbd_dirname,
  .strong_runtime_opts    = nbd_strong_runtime_opts,
  .bdrv_cancel_in_flight  = nbd_cancel_in_flight,
+
+    .bdrv_attach_aio_context    = nbd_attach_aio_context,
+    .bdrv_detach_aio_context    = nbd_detach_aio_context,
  };
    static BlockDriver bdrv_nbd_unix = {
@@ -2109,6 +2134,9 @@ static BlockDriver bdrv_nbd_unix = {
  .bdrv_dirname   = nbd_dirname,
  .strong_runtime_opts    = nbd_strong_runtime_opts,
  .bdrv_cancel_in_flight  = nbd_cancel_in_flight,
+
+    .bdrv_attach_aio_context    = nbd_attach_aio_context,
+    .bdrv_detach_aio_context    = nbd_detach_aio_context,
  };
    static void bdrv_nbd_init(void)




Hmm. I was so happy to remove these handlers together with 
connection-coroutine :) . But you are right, seems I've removed too 
much :(.



open_timer exists only during bdrv_open() handler, so, I hope on 
attach/detach it should not exist.


That’s… kind of surprising.  It’s good for me here, but as far as I can 
see it means that all of qemu blocks until the connection succeeds, 
right?  That doesn’t seem quite ideal...


Anyway, good for me. O:)

reconnect_delay_timer should exist only during IO request: it's 
created during request if we don't have a connection. And request will 
not finish until timer elapsed or connection established (timer should 
be removed in this case too). So, again, when attaching / detaching 
the context we should be in a drained sections, so no in-flight 
requests and no reconnect_delay_timer.


Got it.  FWIW, other block drivers rely on this, too (e.g. null-aio with 
latency-ns set creates a timer in every I/O request and settles the 
request once the timer expires).




So, I think assertions that both timer pointers are NULL should be 
enough in attach / detach handlers.




Great!  I’ll cook up v1.




Re: [RFC] block/nbd: Move s->ioc on AioContext change

2022-02-01 Thread Vladimir Sementsov-Ogievskiy

28.01.2022 18:51, Hanna Reitz wrote:

s->ioc must always be attached to the NBD node's AioContext.  If that
context changes, s->ioc must be attached to the new context.

Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1990835
Signed-off-by: Hanna Reitz 
---
This is an RFC because I believe there are some other things in the NBD
block driver that need attention on an AioContext change, too.  Namely,
there are two timers (reconnect_delay_timer and open_timer) that are
also attached to the node's AioContext, and I'm afraid they need to be
handled, too.  Probably pause them on detach, and resume them on attach,
but I'm not sure, which is why I'm posting this as an RFC to get some
comments from that from someone who knows this code better than me. :)

(Also, in a real v1, of course I'd want to add a regression test.)
---
  block/nbd.c | 28 
  1 file changed, 28 insertions(+)

diff --git a/block/nbd.c b/block/nbd.c
index 63dbfa807d..119a774c04 100644
--- a/block/nbd.c
+++ b/block/nbd.c
@@ -2036,6 +2036,25 @@ static void nbd_cancel_in_flight(BlockDriverState *bs)
  nbd_co_establish_connection_cancel(s->conn);
  }
  
+static void nbd_attach_aio_context(BlockDriverState *bs,

+   AioContext *new_context)
+{
+BDRVNBDState *s = bs->opaque;
+
+if (s->ioc) {
+qio_channel_attach_aio_context(s->ioc, new_context);
+}
+}
+
+static void nbd_detach_aio_context(BlockDriverState *bs)
+{
+BDRVNBDState *s = bs->opaque;
+
+if (s->ioc) {
+qio_channel_detach_aio_context(s->ioc);
+}
+}
+
  static BlockDriver bdrv_nbd = {
  .format_name= "nbd",
  .protocol_name  = "nbd",
@@ -2059,6 +2078,9 @@ static BlockDriver bdrv_nbd = {
  .bdrv_dirname   = nbd_dirname,
  .strong_runtime_opts= nbd_strong_runtime_opts,
  .bdrv_cancel_in_flight  = nbd_cancel_in_flight,
+
+.bdrv_attach_aio_context= nbd_attach_aio_context,
+.bdrv_detach_aio_context= nbd_detach_aio_context,
  };
  
  static BlockDriver bdrv_nbd_tcp = {

@@ -2084,6 +2106,9 @@ static BlockDriver bdrv_nbd_tcp = {
  .bdrv_dirname   = nbd_dirname,
  .strong_runtime_opts= nbd_strong_runtime_opts,
  .bdrv_cancel_in_flight  = nbd_cancel_in_flight,
+
+.bdrv_attach_aio_context= nbd_attach_aio_context,
+.bdrv_detach_aio_context= nbd_detach_aio_context,
  };
  
  static BlockDriver bdrv_nbd_unix = {

@@ -2109,6 +2134,9 @@ static BlockDriver bdrv_nbd_unix = {
  .bdrv_dirname   = nbd_dirname,
  .strong_runtime_opts= nbd_strong_runtime_opts,
  .bdrv_cancel_in_flight  = nbd_cancel_in_flight,
+
+.bdrv_attach_aio_context= nbd_attach_aio_context,
+.bdrv_detach_aio_context= nbd_detach_aio_context,
  };
  
  static void bdrv_nbd_init(void)





Hmm. I was so happy to remove these handlers together with connection-coroutine 
:) . But you are right, seems I've removed too much :(.


open_timer exists only during bdrv_open() handler, so, I hope on attach/detach 
it should not exist.

reconnect_delay_timer should exist only during IO request: it's created during 
request if we don't have a connection. And request will not finish until timer 
elapsed or connection established (timer should be removed in this case too). 
So, again, when attaching / detaching the context we should be in a drained 
sections, so no in-flight requests and no reconnect_delay_timer.

So, I think assertions that both timer pointers are NULL should be enough in 
attach / detach handlers.

--
Best regards,
Vladimir



[PATCH v3 1/2] block/export/fuse: Rearrange if-else-if ladder in fuse_fallocate()

2022-02-01 Thread Philippe Mathieu-Daudé via
In order to safely maintain a mixture of #ifdef'ry with if-else-if
ladder, rearrange the last statement (!mode) first. Since it is
mutually exclusive with the other conditions, checking it first
doesn't make any logical difference, but allows to add #ifdef'ry
around in a more cleanly way.

Suggested-by: Kevin Wolf 
Signed-off-by: Philippe Mathieu-Daudé 
---

One minor checkpatch error:

 ERROR: else should follow close brace '}'
 #29: FILE: block/export/fuse.c:651:
 +}
 +else if (mode & FALLOC_FL_PUNCH_HOLE) {
---
 block/export/fuse.c | 41 +
 1 file changed, 21 insertions(+), 20 deletions(-)

diff --git a/block/export/fuse.c b/block/export/fuse.c
index 6710d8aed86..d25e478c0a2 100644
--- a/block/export/fuse.c
+++ b/block/export/fuse.c
@@ -629,7 +629,26 @@ static void fuse_fallocate(fuse_req_t req, fuse_ino_t 
inode, int mode,
 length = MIN(length, blk_len - offset);
 }
 
-if (mode & FALLOC_FL_PUNCH_HOLE) {
+if (!mode) {
+/* We can only fallocate at the EOF with a truncate */
+if (offset < blk_len) {
+fuse_reply_err(req, EOPNOTSUPP);
+return;
+}
+
+if (offset > blk_len) {
+/* No preallocation needed here */
+ret = fuse_do_truncate(exp, offset, true, PREALLOC_MODE_OFF);
+if (ret < 0) {
+fuse_reply_err(req, -ret);
+return;
+}
+}
+
+ret = fuse_do_truncate(exp, offset + length, true,
+   PREALLOC_MODE_FALLOC);
+}
+else if (mode & FALLOC_FL_PUNCH_HOLE) {
 if (!(mode & FALLOC_FL_KEEP_SIZE)) {
 fuse_reply_err(req, EINVAL);
 return;
@@ -665,25 +684,7 @@ static void fuse_fallocate(fuse_req_t req, fuse_ino_t 
inode, int mode,
 } while (ret == 0 && length > 0);
 }
 #endif /* CONFIG_FALLOCATE_ZERO_RANGE */
-else if (!mode) {
-/* We can only fallocate at the EOF with a truncate */
-if (offset < blk_len) {
-fuse_reply_err(req, EOPNOTSUPP);
-return;
-}
-
-if (offset > blk_len) {
-/* No preallocation needed here */
-ret = fuse_do_truncate(exp, offset, true, PREALLOC_MODE_OFF);
-if (ret < 0) {
-fuse_reply_err(req, -ret);
-return;
-}
-}
-
-ret = fuse_do_truncate(exp, offset + length, true,
-   PREALLOC_MODE_FALLOC);
-} else {
+else {
 ret = -EOPNOTSUPP;
 }
 
-- 
2.34.1




[PATCH v3 0/2] block/export/fuse: Fix build failure on FreeBSD

2022-02-01 Thread Philippe Mathieu-Daudé via
Since v2:
- Rearrange if-else-if ladder first (Kevin)

Philippe Mathieu-Daudé (2):
  block/export/fuse: Rearrange if-else-if ladder in fuse_fallocate()
  block/export/fuse: Fix build failure on FreeBSD

 block/export/fuse.c | 45 +
 1 file changed, 25 insertions(+), 20 deletions(-)

-- 
2.34.1




Re: [RFC PATCH] block/export/fuse: Fix build failure on FreeBSD

2022-02-01 Thread Philippe Mathieu-Daudé via
On 1/27/22 17:15, Kevin Wolf wrote:
> Am 22.01.2022 um 14:49 hat Philippe Mathieu-Daudé geschrieben:
>> When building on FreeBSD we get:
>>
>>   [816/6851] Compiling C object libblockdev.fa.p/block_export_fuse.c.o
>>   ../block/export/fuse.c:628:16: error: use of undeclared identifier 
>> 'FALLOC_FL_KEEP_SIZE'
>>   if (mode & FALLOC_FL_KEEP_SIZE) {
>>  ^
>>   ../block/export/fuse.c:632:16: error: use of undeclared identifier 
>> 'FALLOC_FL_PUNCH_HOLE'
>>   if (mode & FALLOC_FL_PUNCH_HOLE) {
>>  ^
>>   ../block/export/fuse.c:633:22: error: use of undeclared identifier 
>> 'FALLOC_FL_KEEP_SIZE'
>>   if (!(mode & FALLOC_FL_KEEP_SIZE)) {
>>^
>>   3 errors generated.
>>   FAILED: libblockdev.fa.p/block_export_fuse.c.o
>>
>> Meson indeed reported FALLOC_FL_PUNCH_HOLE is not available:
>>
>>   C compiler for the host machine: cc (clang 10.0.1 "FreeBSD clang version 
>> 10.0.1")
>>   Checking for function "fallocate" : NO
>>   Checking for function "posix_fallocate" : YES
>>   Header  has symbol "FALLOC_FL_PUNCH_HOLE" : NO
>>   Header  has symbol "FALLOC_FL_ZERO_RANGE" : NO
>>   ...
>>
>> Similarly to commit 304332039 ("block/export/fuse.c: fix musl build"),
>> guard the code requiring FALLOC_FL_KEEP_SIZE / FALLOC_FL_PUNCH_HOLE
>> definitions under CONFIG_FALLOCATE_PUNCH_HOLE #ifdef'ry.
>>
>> Signed-off-by: Philippe Mathieu-Daudé 
>> ---
>> Fragile #ifdef'ry... Any clever idea?
> 
> I guess we could reorder things. The last case (!mode) is mutually
> exclusive with the other conditions, so checking it first doesn't make a
> difference, and then you can #ifdef things out more cleanly.

This is indeed clever, thanks! v3 soon.



Re: [PATCH v6 01/33] main-loop.h: introduce qemu_in_main_thread()

2022-02-01 Thread Paolo Bonzini

On 2/1/22 10:08, Emanuele Giuseppe Esposito wrote:


+ *
+ * This function should never be used in the block layer, please
+ * instead refer to qemu_in_main_thread().



This function should never be used in the block layer, because unit 
tests, block layer tools and qemu-storage-daemon do not have a BQL.

Please instead refer to qemu_in_main_thread().


+
+/**
+ * qemu_in_main_thread: same as qemu_mutex_iothread_locked when
+ * softmmu/cpus.c implementation is linked. Otherwise this function
+ * checks that the current AioContext is the global AioContext
+ * (main loop).
+ *
+ * This is useful when checking that the BQL is held as a
+ * replacement of qemu_mutex_iothread_locked() usege in the
+ * block layer, since the former returns false when invoked by
+ * unit tests or other users like qemu-storage-daemon that end
+ * up using the stubs/iothread-lock.c implementation.
+ *
+ * This function should only be used in the block layer.
+ * Use this function to determine whether it is possible to safely
+ * access the block layer's globals.
+ */
+bool qemu_in_main_thread(void);


I think the reference to qemu_mutex_iothread_locked() complicates 
things.  It's enough to explain the different policies in my opinion:


/**
 * qemu_in_main_thread: return whether it's possible to safely access
 * the global state of the block layer.
 *
 * Global state of the block layer is not accessible from I/O threads
 * or worker threads; only from threads that "own" the default
 * AioContext that qemu_get_aio_context() returns.  For tests, block
 * layer tools and qemu-storage-daemon there is a designated thread that
 * runs the event loop for qemu_get_aio_context(), and that is the
 * main thread.
 *
 * For emulators, however, any thread that holds the BQL can act
 * as the block layer main thread; this will be any of the actual
 * main thread, the vCPU threads or the RCU thread.
 *
 * For clarity, do not use this function outside the block layer.
 */

Paolo



Re: [PATCH v6 02/33] include/block/block: split header into I/O and global state API

2022-02-01 Thread Paolo Bonzini

On 2/1/22 10:45, Emanuele Giuseppe Esposito wrote:

That said, even if they are a different category, I think it makes sense
to leave them in the same header file as I/O functions, because I/O
functions are locked out between drained_begin and drained_end.


Proposed category description:
/*
  * "Global OR I/O" API functions. These functions can run without
  * the BQL, but only in one specific iothread/main loop.
  *
  * More specifically, these functions use BDRV_POLL_WHILE(bs), which
  * requires the caller to be either in the main thread and hold
  * the BlockdriverState (bs) AioContext lock, or directly in the
  * home thread that runs the bs AioContext. Calling them from
  * another thread in another AioContext would cause deadlocks.
  *
  * Therefore, these functions are not proper I/O, because they
  * can't run in *any* iothreads, but only in a specific one.
  */

Functions that will surely go under this category:

BDRV_POLL_WHILE
bdrv_parent_drained_begin_single
bdrv_parent_drained_end_single
bdrv_drain_poll
bdrv_drained_begin
bdrv_do_drained_begin_quiesce
bdrv_subtree_drained_begin
bdrv_drained_end
bdrv_drained_end_no_poll
bdrv_subtree_drained_end

(all generated_co_wrapper)
bdrv_truncate
bdrv_check
bdrv_invalidate_cache
bdrv_flush
bdrv_pdiscard
bdrv_readv_vmstate
bdrv_writev_vmstate


What I am not sure:

* bdrv_drain_all_begin - bdrv_drain_all_end - bdrv_drain_all: these were
classified as GS, because thay are always called from the main loop.
Should they go in this new category?


1) They look at the list of BDS's, and 2) you can't in general be sure 
that all BDS's are in *your* AioContext if you call them from a specific 
AioContext.


So they should be GS.


* how should I interpret "all the callers of BDRV_POLL_WHILE"?
Meaning, if I consider also the callers of the callers, we end up
covering much much more functions. Should I only consider the direct
callers (ie the above)?


In general it is safe to make a function GS even if it is potentially 
"GS or I/O", because that _reduces_ the number of places you can call it 
from.  It's likewise safe to make it I/O-only, but probably it makes 
less sense.


Paolo



Re: [PATCH v6 02/33] include/block/block: split header into I/O and global state API

2022-02-01 Thread Emanuele Giuseppe Esposito



On 31/01/2022 16:58, Paolo Bonzini wrote:
> On 1/31/22 15:54, Kevin Wolf wrote:
>> So I guess the decision depends on what you're going to use the
>> categories in the future. I get the feeling that we have one more
>> category than this series introduces:
>>
>> * Global State (must run from the main thread/hold the BQL)
>> * I/O (can be called in any thread under the AioContext lock, doesn't
>>    modify global state, drain waits for it to complete)
>> * Common (doesn't even touch global state)
>> * iothread dependent (can run without the BQL, but only in one specific
>>    iothread while holding its AioContext lock; this would cover at least
>>    AIO_WAIT_WHILE() and all of its indirect callers)
> 
> Yes, I agree.
> 
> bdrv_drained_begin and friends are somewhat like a coroutine-level
> lock/unlock primitive, so they need to be available in both the main
> thread and the iothread.  They could be called iothread dependent,
> AioContext dependent, or perhaps "global or I/O".
> 
> That said, even if they are a different category, I think it makes sense
> to leave them in the same header file as I/O functions, because I/O
> functions are locked out between drained_begin and drained_end.
> 
> Paolo
> 

Proposed category description:
/*
 * "Global OR I/O" API functions. These functions can run without
 * the BQL, but only in one specific iothread/main loop.
 *
 * More specifically, these functions use BDRV_POLL_WHILE(bs), which
 * requires the caller to be either in the main thread and hold
 * the BlockdriverState (bs) AioContext lock, or directly in the
 * home thread that runs the bs AioContext. Calling them from
 * another thread in another AioContext would cause deadlocks.
 *
 * Therefore, these functions are not proper I/O, because they
 * can't run in *any* iothreads, but only in a specific one.
 */

Functions that will surely go under this category:

BDRV_POLL_WHILE
bdrv_parent_drained_begin_single
bdrv_parent_drained_end_single
bdrv_drain_poll
bdrv_drained_begin
bdrv_do_drained_begin_quiesce
bdrv_subtree_drained_begin
bdrv_drained_end
bdrv_drained_end_no_poll
bdrv_subtree_drained_end

(all generated_co_wrapper)
bdrv_truncate
bdrv_check
bdrv_invalidate_cache
bdrv_flush
bdrv_pdiscard
bdrv_readv_vmstate
bdrv_writev_vmstate


What I am not sure:

* bdrv_drain_all_begin - bdrv_drain_all_end - bdrv_drain_all: these were
classified as GS, because thay are always called from the main loop.
Should they go in this new category?

* how should I interpret "all the callers of BDRV_POLL_WHILE"?
Meaning, if I consider also the callers of the callers, we end up
covering much much more functions. Should I only consider the direct
callers (ie the above)?

Emanuele




Re: [PATCH v8 1/5] QIOChannel: Add flags on io_writev and introduce io_flush callback

2022-02-01 Thread Daniel P . Berrangé
On Tue, Feb 01, 2022 at 03:28:59AM -0300, Leonardo Bras wrote:
> Add flags to io_writev and introduce io_flush as optional callback to
> QIOChannelClass, allowing the implementation of zero copy writes by
> subclasses.
> 
> How to use them:
> - Write data using qio_channel_writev*(...,QIO_CHANNEL_WRITE_FLAG_ZERO_COPY),
> - Wait write completion with qio_channel_flush().
> 
> Notes:
> As some zero copy write implementations work asynchronously, it's
> recommended to keep the write buffer untouched until the return of
> qio_channel_flush(), to avoid the risk of sending an updated buffer
> instead of the buffer state during write.
> 
> As io_flush callback is optional, if a subclass does not implement it, then:
> - io_flush will return 0 without changing anything.
> 
> Also, some functions like qio_channel_writev_full_all() were adapted to
> receive a flag parameter. That allows shared code between zero copy and
> non-zero copy writev, and also an easier implementation on new flags.
> 
> Signed-off-by: Leonardo Bras 
> ---
>  include/io/channel.h| 38 -
>  chardev/char-io.c   |  2 +-
>  hw/remote/mpqemu-link.c |  2 +-
>  io/channel-buffer.c |  1 +
>  io/channel-command.c|  1 +
>  io/channel-file.c   |  1 +
>  io/channel-socket.c |  2 ++
>  io/channel-tls.c|  1 +
>  io/channel-websock.c|  1 +
>  io/channel.c| 53 +++--
>  migration/rdma.c|  1 +
>  scsi/pr-manager-helper.c|  2 +-
>  tests/unit/test-io-channel-socket.c |  1 +
>  13 files changed, 92 insertions(+), 14 deletions(-)
> 
> diff --git a/io/channel.c b/io/channel.c
> index e8b019dc36..b8b99fdc4c 100644
> --- a/io/channel.c
> +++ b/io/channel.c
> @@ -72,18 +72,32 @@ ssize_t qio_channel_writev_full(QIOChannel *ioc,
>  size_t niov,
>  int *fds,
>  size_t nfds,
> +int flags,
>  Error **errp)
>  {
>  QIOChannelClass *klass = QIO_CHANNEL_GET_CLASS(ioc);
>  
> -if ((fds || nfds) &&
> -!qio_channel_has_feature(ioc, QIO_CHANNEL_FEATURE_FD_PASS)) {
> +if (fds || nfds) {
> +if (!qio_channel_has_feature(ioc, QIO_CHANNEL_FEATURE_FD_PASS)) {
> +error_setg_errno(errp, EINVAL,
> + "Channel does not support file descriptor 
> passing");
> +return -1;
> +}
> +if (flags & QIO_CHANNEL_WRITE_FLAG_ZERO_COPY) {
> +error_setg_errno(errp, EINVAL,
> + "Zero Copy does not support file descriptor 
> passing");
> +return -1;
> +}

Here you gracefully reject FD passing when zero copy is requested
which is good.

> +}
> +

> @@ -235,10 +249,16 @@ int qio_channel_writev_full_all(QIOChannel *ioc,
>iov, niov,
>0, iov_size(iov, niov));
>  
> +if (flags & QIO_CHANNEL_WRITE_FLAG_ZERO_COPY) {
> +assert(fds == NULL && nfds == 0);
> +}

But here you  abort QEMU if FD passing is requested when zero copy
is set.

AFAICT, if you just delete this assert, the code to gracefully
report errors will do the right thing.

Without the assert:

  Reviewed-by: Daniel P. Berrangé 

Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: [PATCH v6 01/33] main-loop.h: introduce qemu_in_main_thread()

2022-02-01 Thread Emanuele Giuseppe Esposito



On 31/01/2022 16:49, Paolo Bonzini wrote:
> On 1/31/22 15:33, Kevin Wolf wrote:
>> I feel "use this function if your code is going to be used by unit
>> tests, but use qemu_mutex_iothread_locked() otherwise" is a very strange
>> interface. I'm relatively sure that qemu_mutex_iothread_locked() isn't
>> primarily used to make unit tests crash.
> 
> qemu_mutex_iothread_locked() should never be used in the block layer,
> because programs that use the block layer might not call an iothread
> lock, and indeed only the emulator have an iothread lock.
> 
> Making it always true would be okay when those programs were
> single-threaded, but really they all had worker threads so it was
> changed to false by commit 5f50be9b58 ("async: the main AioContext is
> only "current" if under the BQL", 2021-06-18).
> 
>> Documentation and the definition of the interface of a function
>> shouldn't be about the caller, but about the semantics of the function
>> itself. So, which question does qemu_mutex_iothread_locked() answer, and
>> which question does qemu_in_main_thread() answer?
> 
> qemu_mutex_iothread_locked() -> Does the program have exclusive access
> to the emulator's globals.
> 
> qemu_in_main_thread() -> Does the program have access to the block
> layer's globals.  In emulators this is governed by the iothread lock,
> elsewhere they are accessible only from the home thread of the initial
> AioContext.
> 
> So, in emulators it is the same question, but in the block layer one of
> them is actually meaningless.
> 
> Really the "bug" is that qemu_mutex_iothread_locked() should really not
> be used outside emulators.  There are just two uses, but it's hard to
> remove them.
> 
> So why are two functions needed?  Two reasons:
> 
> - stubs/iothread-lock.c cannot define qemu_mutex_iothread_locked() as
> "return qemu_get_current_aio_context() == qemu_get_aio_context();"
> because it would cause infinite recursion with
> qemu_get_current_aio_context()
> 
> - even if it could, frankly qemu_mutex_iothread_locked() is a horrible
> name, and in the context of the block layer it conflicts especially
> badly with the "iothread" concept.
> 
> Maybe we should embrace the BQL name and rename the functions that refer
> to the "iothread lock".  But then I don't want to have code in the block
> layer that refers to a "big lock".

So based on Paolo's reply, I would modify the function comment in this way:

@@ -242,6 +242,9 @@ AioContext *iohandler_get_aio_context(void);
  * must always be taken outside other locks.  This function helps
  * functions take different paths depending on whether the current
  * thread is running within the main loop mutex.
+ *
+ * This function should never be used in the block layer, please
+ * instead refer to qemu_in_main_thread().
  */
 bool qemu_mutex_iothread_locked(void);

+
+/**
+ * qemu_in_main_thread: same as qemu_mutex_iothread_locked when
+ * softmmu/cpus.c implementation is linked. Otherwise this function
+ * checks that the current AioContext is the global AioContext
+ * (main loop).
+ *
+ * This is useful when checking that the BQL is held as a
+ * replacement of qemu_mutex_iothread_locked() usege in the
+ * block layer, since the former returns false when invoked by
+ * unit tests or other users like qemu-storage-daemon that end
+ * up using the stubs/iothread-lock.c implementation.
+ *
+ * This function should only be used in the block layer.
+ * Use this function to determine whether it is possible to safely
+ * access the block layer's globals.
+ */
+bool qemu_in_main_thread(void);

What do you think?

Emanuele




Re: [PATCH 2/5] docs: Only mention iscsi in the man page when available

2022-02-01 Thread Kevin Wolf
Am 31.01.2022 um 19:57 hat Peter Maydell geschrieben:
> On Mon, 31 Jan 2022 at 17:33, Kevin Wolf  wrote:
> >
> > If libiscsi is disabled in the build, the man page shouldn't contain
> > information on how to construct iscsi URLs etc.
> >
> > This patch is best viewed with whitespace changes ignored.
> >
> > Signed-off-by: Kevin Wolf 
> > ---
> 
> > -``iSCSI``
> > -   iSCSI support allows QEMU to access iSCSI resources directly and use
> > -   as images for the guest storage. Both disk and cdrom images are
> > -   supported.
> > +.. only:: not DISABLE_LIBISCSI
> 
> Unfortunately the Sphinx "only::" tag is unusably broken by design.
> It doesn't work the way you might expect (similar to ifdef, making
> the docs be built as if the markup disabled by only:: was not
> present in the source rst files). Instead it filters out generated
> doctree nodes very late. The effect is that documentation that you
> try to suppress with an 'only' directive will still show up in
> the table of contents, index and search.
> 
> Upstream bug, closed 'wontfix':
> https://github.com/sphinx-doc/sphinx/issues/1420
> 
> I ran into this when I was looking at whether there were nicer ways
> to structure how we generate some of our manpages etc. Unfortunately
> my conclusion was that the only safe approach was to avoid use
> of the 'only::' directive entirely :-(

Hm, that's very disappointing. :-(

Does it affect anything that is used in man pages, though? Otherwise I
guess we could have something like ::only (not man) or (not DISABLE*) to
make things conditional at least in the man pages even if we can't in
HTML.

Or I guess the alternative would be manually preprocessing docs (maybe
even just with cpp) before feeding them to sphinx...

In fact, a large part of the man pages is already preprocessed by
hxtool. But obviously, the changes I make in this patch are outside of
it for the most part.

Kevin