[PATCH 13/15] iotests: Accommodate async QMP Exception classes

2021-09-16 Thread John Snow
(But continue to support the old ones for now, too.)

There are very few cases of any user of QEMUMachine or a subclass
thereof relying on a QMP Exception type. If you'd like to check for
yourself, you want to grep for all of the derivatives of QMPError,
excluding 'AQMPError' and its derivatives. That'd be these:

- QMPError
- QMPConnectError
- QMPCapabilitiesError
- QMPTimeoutError
- QMPProtocolError
- QMPResponseError
- QMPBadPortError


Signed-off-by: John Snow 
---
 scripts/simplebench/bench_block_job.py| 3 ++-
 tests/qemu-iotests/tests/mirror-top-perms | 6 +-
 2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/scripts/simplebench/bench_block_job.py 
b/scripts/simplebench/bench_block_job.py
index 4f03c12169..a403c35b08 100755
--- a/scripts/simplebench/bench_block_job.py
+++ b/scripts/simplebench/bench_block_job.py
@@ -28,6 +28,7 @@
 sys.path.append(os.path.join(os.path.dirname(__file__), '..', '..', 'python'))
 from qemu.machine import QEMUMachine
 from qemu.qmp import QMPConnectError
+from qemu.aqmp import ConnectError
 
 
 def bench_block_job(cmd, cmd_args, qemu_args):
@@ -49,7 +50,7 @@ def bench_block_job(cmd, cmd_args, qemu_args):
 vm.launch()
 except OSError as e:
 return {'error': 'popen failed: ' + str(e)}
-except (QMPConnectError, socket.timeout):
+except (QMPConnectError, ConnectError, socket.timeout):
 return {'error': 'qemu failed: ' + str(vm.get_log())}
 
 try:
diff --git a/tests/qemu-iotests/tests/mirror-top-perms 
b/tests/qemu-iotests/tests/mirror-top-perms
index 451a0666f8..7d448f4d23 100755
--- a/tests/qemu-iotests/tests/mirror-top-perms
+++ b/tests/qemu-iotests/tests/mirror-top-perms
@@ -103,7 +103,11 @@ class TestMirrorTopPerms(iotests.QMPTestCase):
 print('ERROR: VM B launched successfully, this should not have '
   'happened')
 except qemu.qmp.QMPConnectError:
-assert 'Is another process using the image' in self.vm_b.get_log()
+pass
+except qemu.aqmp.ConnectError:
+pass
+
+assert 'Is another process using the image' in self.vm_b.get_log()
 
 result = self.vm.qmp('block-job-cancel',
  device='mirror')
-- 
2.31.1




[PATCH 11/15] python/aqmp: Create sync QMP wrapper for iotests

2021-09-16 Thread John Snow
This is a wrapper around the async QMPClient that mimics the old,
synchronous QEMUMonitorProtocol class. It is designed to be
interchangeable with the old implementation.

It does not, however, attempt to mimic Exception compatibility.

Signed-off-by: John Snow 
---
 python/qemu/aqmp/legacy.py | 131 +
 1 file changed, 131 insertions(+)
 create mode 100644 python/qemu/aqmp/legacy.py

diff --git a/python/qemu/aqmp/legacy.py b/python/qemu/aqmp/legacy.py
new file mode 100644
index 00..b8544083ad
--- /dev/null
+++ b/python/qemu/aqmp/legacy.py
@@ -0,0 +1,131 @@
+"""
+Sync QMP Wrapper
+
+This class pretends to be qemu.qmp.QEMUMonitorProtocol.
+"""
+
+import asyncio
+from typing import (
+Awaitable,
+List,
+Optional,
+TypeVar,
+Union,
+)
+
+import qemu.qmp
+from qemu.qmp import QMPMessage, QMPReturnValue, SocketAddrT
+
+from .qmp_client import QMPClient
+
+
+# pylint: disable=missing-docstring
+
+
+class QEMUMonitorProtocol(qemu.qmp.QEMUMonitorProtocol):
+def __init__(self, address: SocketAddrT,
+ server: bool = False,
+ nickname: Optional[str] = None):
+
+# pylint: disable=super-init-not-called
+self._aqmp = QMPClient(nickname)
+self._aloop = asyncio.get_event_loop()
+self._address = address
+self._timeout: Optional[float] = None
+
+_T = TypeVar('_T')
+
+def _sync(
+self, future: Awaitable[_T], timeout: Optional[float] = None
+) -> _T:
+return self._aloop.run_until_complete(
+asyncio.wait_for(future, timeout=timeout)
+)
+
+# __enter__ and __exit__ need no changes
+# parse_address needs no changes
+
+def connect(self, negotiate: bool = True) -> Optional[QMPMessage]:
+self._aqmp.await_greeting = negotiate
+self._aqmp.negotiate = negotiate
+
+self._sync(
+self._aqmp.connect(self._address)
+)
+
+if self._aqmp.greeting is not None:
+return dict(self._aqmp.greeting)
+return None
+
+def accept(self, timeout: Optional[float] = 15.0) -> QMPMessage:
+self._aqmp.await_greeting = True
+self._aqmp.negotiate = True
+
+self._sync(
+self._aqmp.accept(self._address),
+timeout
+)
+
+assert self._aqmp.greeting is not None
+return dict(self._aqmp.greeting)
+
+def cmd_obj(self, qmp_cmd: QMPMessage) -> QMPMessage:
+return dict(
+self._sync(
+# pylint: disable=protected-access
+
+# _raw() isn't a public API, because turning off
+# automatic ID assignment is discouraged. For
+# compatibility with iotests *only*, do it anyway.
+self._aqmp._raw(qmp_cmd, assign_id=False),
+self._timeout
+)
+)
+
+# Default impl of cmd() delegates to cmd_obj
+
+def command(self, cmd: str, **kwds: object) -> QMPReturnValue:
+return self._sync(
+self._aqmp.execute(cmd, kwds),
+self._timeout
+)
+
+def pull_event(self,
+   wait: Union[bool, float] = False) -> Optional[QMPMessage]:
+if wait is False:
+# Return None if there's no event ready to go
+if self._aqmp.events.empty():
+return None
+
+timeout = None
+if isinstance(wait, float):
+timeout = wait
+
+return dict(
+self._sync(
+self._aqmp.events.get(),
+timeout
+)
+)
+
+def get_events(self, wait: Union[bool, float] = False) -> List[QMPMessage]:
+events = [dict(x) for x in self._aqmp.events.clear()]
+if events:
+return events
+
+event = self.pull_event(wait)
+return [event] if event is not None else []
+
+def clear_events(self) -> None:
+self._aqmp.events.clear()
+
+def close(self) -> None:
+self._sync(
+self._aqmp.disconnect()
+)
+
+def settimeout(self, timeout: Optional[float]) -> None:
+self._timeout = timeout
+
+def send_fd_scm(self, fd: int) -> None:
+self._aqmp.send_fd_scm(fd)
-- 
2.31.1




[PATCH 08/15] python/aqmp: Create MessageModel and StandaloneModel classes

2021-09-16 Thread John Snow
This allows 'Greeting' to be subclass of 'Message'. We need the adapter
classes to avoid some typing problems that occur if we try to put too
much into the 'Model' class itself; the exact details of why are left as
an exercise to the reader.

Why bother? This makes 'Greeting' ⊆ 'Message', which is taxonomically
true; but the real motivation is to be able to inherit and abuse all of
the Mapping dunders so that we can call dict(greeting) or
bytes(greeting), for example.

Signed-off-by: John Snow 
---
 python/qemu/aqmp/models.py | 67 --
 1 file changed, 50 insertions(+), 17 deletions(-)

diff --git a/python/qemu/aqmp/models.py b/python/qemu/aqmp/models.py
index 24c94123ac..eaeeebc25c 100644
--- a/python/qemu/aqmp/models.py
+++ b/python/qemu/aqmp/models.py
@@ -15,23 +15,22 @@
 Sequence,
 )
 
+from .message import Message
+
 
 class Model:
 """
-Abstract data model, representing some QMP object of some kind.
-
-:param raw: The raw object to be validated.
-:raise KeyError: If any required fields are absent.
-:raise TypeError: If any required fields have the wrong type.
+Abstract data model, representing a QMP object of some kind.
 """
-def __init__(self, raw: Mapping[str, Any]):
-self._raw = raw
+@property
+def _raw(self) -> Mapping[str, Any]:
+raise NotImplementedError
 
 def _check_key(self, key: str) -> None:
 if key not in self._raw:
 raise KeyError(f"'{self._name}' object requires '{key}' member")
 
-def _check_value(self, key: str, type_: type, typestr: str) -> None:
+def _check_type(self, key: str, type_: type, typestr: str) -> None:
 assert key in self._raw
 if not isinstance(self._raw[key], type_):
 raise TypeError(
@@ -40,7 +39,7 @@ def _check_value(self, key: str, type_: type, typestr: str) 
-> None:
 
 def _check_member(self, key: str, type_: type, typestr: str) -> None:
 self._check_key(key)
-self._check_value(key, type_, typestr)
+self._check_type(key, type_, typestr)
 
 @property
 def _name(self) -> str:
@@ -50,7 +49,37 @@ def __repr__(self) -> str:
 return f"{self._name}({self._raw!r})"
 
 
-class Greeting(Model):
+class MessageModel(Message, Model):
+"""
+Abstract data model, representing a QMP Message of some sort.
+
+Adapter class that glues together `Model` and `Message`.
+"""
+def __init__(self, raw: Mapping[str, object]):
+super().__init__(raw)
+
+@property
+def _raw(self) -> Mapping[str, Any]:
+return self._object
+
+def __setitem__(self, key: str, value: object) -> None:
+# This is not good OO, but this turns off mutability here.
+raise RuntimeError("Cannot mutate MessageModel")
+
+
+class StandaloneModel(Model):
+"""
+Abstract data model, representing a (non-Message) QMP object of some sort.
+"""
+def __init__(self, raw: Mapping[str, object]):
+self._raw_mapping = raw
+
+@property
+def _raw(self) -> Mapping[str, Any]:
+return self._raw_mapping
+
+
+class Greeting(MessageModel):
 """
 Defined in qmp-spec.txt, section 2.2, "Server Greeting".
 
@@ -58,8 +87,9 @@ class Greeting(Model):
 :raise KeyError: If any required fields are absent.
 :raise TypeError: If any required fields have the wrong type.
 """
-def __init__(self, raw: Mapping[str, Any]):
+def __init__(self, raw: Mapping[str, object]):
 super().__init__(raw)
+
 #: 'QMP' member
 self.QMP: QMPGreeting  # pylint: disable=invalid-name
 
@@ -67,7 +97,7 @@ def __init__(self, raw: Mapping[str, Any]):
 self.QMP = QMPGreeting(self._raw['QMP'])
 
 
-class QMPGreeting(Model):
+class QMPGreeting(StandaloneModel):
 """
 Defined in qmp-spec.txt, section 2.2, "Server Greeting".
 
@@ -75,8 +105,9 @@ class QMPGreeting(Model):
 :raise KeyError: If any required fields are absent.
 :raise TypeError: If any required fields have the wrong type.
 """
-def __init__(self, raw: Mapping[str, Any]):
+def __init__(self, raw: Mapping[str, object]):
 super().__init__(raw)
+
 #: 'version' member
 self.version: Mapping[str, object]
 #: 'capabilities' member
@@ -89,7 +120,7 @@ def __init__(self, raw: Mapping[str, Any]):
 self.capabilities = self._raw['capabilities']
 
 
-class ErrorResponse(Model):
+class ErrorResponse(MessageModel):
 """
 Defined in qmp-spec.txt, section 2.4.2, "error".
 
@@ -97,8 +128,9 @@ class ErrorResponse(Model):
 :raise KeyError: If any required fields are absent.
 :raise TypeError: If any required fields have the wrong type.
 """
-def __init__(self, raw: Mapping[str, Any]):
+def __init__(self, raw: Mapping[str, object]):
 super().__init__(raw)
+
 #: 'error' member
 self.error: ErrorInfo
 #: 'id' member
@@ -111,7 +143,7 @@ def __init__(self, raw: 

[PATCH 09/15] python/machine: remove has_quit argument

2021-09-16 Thread John Snow
If we spy on the QMP commands instead, we don't need callers to remember
to pass it. Seems like a fair trade-off.

The one slightly weird bit is overloading this instance variable for
wait(), where we use it to mean "don't issue the qmp 'quit'
command". This means that wait() will "fail" if the QEMU process does
not terminate of its own accord.

In most cases, we probably did already actually issue quit -- some
iotests do this -- but in some others, we may be waiting for QEMU to
terminate for some other reason.

Signed-off-by: John Snow 
---
 python/qemu/machine/machine.py | 35 +++---
 tests/qemu-iotests/040 |  7 +--
 tests/qemu-iotests/218 |  2 +-
 tests/qemu-iotests/255 |  2 +-
 4 files changed, 23 insertions(+), 23 deletions(-)

diff --git a/python/qemu/machine/machine.py b/python/qemu/machine/machine.py
index 056d340e35..6e58d2f951 100644
--- a/python/qemu/machine/machine.py
+++ b/python/qemu/machine/machine.py
@@ -170,6 +170,7 @@ def __init__(self,
 self._console_socket: Optional[socket.socket] = None
 self._remove_files: List[str] = []
 self._user_killed = False
+self._has_quit = False
 
 def __enter__(self: _T) -> _T:
 return self
@@ -368,6 +369,7 @@ def _post_shutdown(self) -> None:
 command = ''
 LOG.warning(msg, -int(exitcode), command)
 
+self._has_quit = False
 self._user_killed = False
 self._launched = False
 
@@ -443,15 +445,13 @@ def _hard_shutdown(self) -> None:
 self._subp.kill()
 self._subp.wait(timeout=60)
 
-def _soft_shutdown(self, timeout: Optional[int],
-   has_quit: bool = False) -> None:
+def _soft_shutdown(self, timeout: Optional[int]) -> None:
 """
 Perform early cleanup, attempt to gracefully shut down the VM, and wait
 for it to terminate.
 
 :param timeout: Timeout in seconds for graceful shutdown.
 A value of None is an infinite wait.
-:param has_quit: When True, don't attempt to issue 'quit' QMP command
 
 :raise ConnectionReset: On QMP communication errors
 :raise subprocess.TimeoutExpired: When timeout is exceeded waiting for
@@ -460,21 +460,19 @@ def _soft_shutdown(self, timeout: Optional[int],
 self._early_cleanup()
 
 if self._qmp_connection:
-if not has_quit:
+if not self._has_quit:
 # Might raise ConnectionReset
-self._qmp.cmd('quit')
+self.qmp('quit')
 
 # May raise subprocess.TimeoutExpired
 self._subp.wait(timeout=timeout)
 
-def _do_shutdown(self, timeout: Optional[int],
- has_quit: bool = False) -> None:
+def _do_shutdown(self, timeout: Optional[int]) -> None:
 """
 Attempt to shutdown the VM gracefully; fallback to a hard shutdown.
 
 :param timeout: Timeout in seconds for graceful shutdown.
 A value of None is an infinite wait.
-:param has_quit: When True, don't attempt to issue 'quit' QMP command
 
 :raise AbnormalShutdown: When the VM could not be shut down gracefully.
 The inner exception will likely be ConnectionReset or
@@ -482,13 +480,13 @@ def _do_shutdown(self, timeout: Optional[int],
 may result in its own exceptions, likely subprocess.TimeoutExpired.
 """
 try:
-self._soft_shutdown(timeout, has_quit)
+self._soft_shutdown(timeout)
 except Exception as exc:
 self._hard_shutdown()
 raise AbnormalShutdown("Could not perform graceful shutdown") \
 from exc
 
-def shutdown(self, has_quit: bool = False,
+def shutdown(self,
  hard: bool = False,
  timeout: Optional[int] = 30) -> None:
 """
@@ -498,7 +496,6 @@ def shutdown(self, has_quit: bool = False,
 If the VM has not yet been launched, or shutdown(), wait(), or kill()
 have already been called, this method does nothing.
 
-:param has_quit: When true, do not attempt to issue 'quit' QMP command.
 :param hard: When true, do not attempt graceful shutdown, and
  suppress the SIGKILL warning log message.
 :param timeout: Optional timeout in seconds for graceful shutdown.
@@ -512,7 +509,7 @@ def shutdown(self, has_quit: bool = False,
 self._user_killed = True
 self._hard_shutdown()
 else:
-self._do_shutdown(timeout, has_quit)
+self._do_shutdown(timeout)
 finally:
 self._post_shutdown()
 
@@ -529,7 +526,9 @@ def wait(self, timeout: Optional[int] = 30) -> None:
 :param timeout: Optional timeout in seconds. Default 30 seconds.
 A value of `None` is an infinite wait.
 """
-

[PATCH 10/15] python/machine: Add support for AQMP backend

2021-09-16 Thread John Snow
To use the AQMP backend, Machine just needs to be a little more diligent
about what happens when closing a QMP connection. The operation is no
longer a freebie in the async world.

Because async QMP continues to check for messages asynchronously, it's
almost certainly likely that the loop will have exited due to EOF after
issuing the last 'quit' command. That error will ultimately be bubbled
up when attempting to close the QMP connection. The manager class here
then is free to discard it if it was expected.

Signed-off-by: John Snow 

---

Yes, I regret that this class has become quite a dumping ground for
complexity around the exit path. It's in need of a refactor to help
separate out the exception handling and cleanup mechanisms from the
VM-related stuff, but it's not a priority to do that just yet -- but
it's on the list.

---
 python/qemu/machine/machine.py | 51 ++
 1 file changed, 46 insertions(+), 5 deletions(-)

diff --git a/python/qemu/machine/machine.py b/python/qemu/machine/machine.py
index 6e58d2f951..8f5a6649e5 100644
--- a/python/qemu/machine/machine.py
+++ b/python/qemu/machine/machine.py
@@ -342,9 +342,15 @@ def _post_shutdown(self) -> None:
 # Comprehensive reset for the failed launch case:
 self._early_cleanup()
 
-if self._qmp_connection:
-self._qmp.close()
-self._qmp_connection = None
+try:
+self._close_qmp_connection()
+except Exception as err:  # pylint: disable=broad-except
+LOG.warning(
+"Exception closing QMP connection: %s",
+str(err) if str(err) else type(err).__name__
+)
+finally:
+assert self._qmp_connection is None
 
 self._close_qemu_log_file()
 
@@ -420,6 +426,31 @@ def _launch(self) -> None:
close_fds=False)
 self._post_launch()
 
+def _close_qmp_connection(self) -> None:
+"""
+Close the underlying QMP connection, if any.
+
+Dutifully report errors that occurred while closing, but assume
+that any error encountered indicates an abnormal termination
+process and not a failure to close.
+"""
+if self._qmp_connection is None:
+return
+
+try:
+self._qmp.close()
+except EOFError:
+# EOF can occur as an Exception here when using the Async
+# QMP backend. It indicates that the server closed the
+# stream. If we successfully issued 'quit' at any point,
+# then this was expected. If the remote went away without
+# our permission, it's worth reporting that as an abnormal
+# shutdown case.
+if not self._has_quit:
+raise
+finally:
+self._qmp_connection = None
+
 def _early_cleanup(self) -> None:
 """
 Perform any cleanup that needs to happen before the VM exits.
@@ -461,8 +492,18 @@ def _soft_shutdown(self, timeout: Optional[int]) -> None:
 
 if self._qmp_connection:
 if not self._has_quit:
-# Might raise ConnectionReset
-self.qmp('quit')
+try:
+# May raise ExecInterruptedError or StateError if the
+# connection dies or has already died.
+self.qmp('quit')
+finally:
+# If 'quit' fails, we'll still want to call close(),
+# which will raise any final errors that may have
+# occurred while trying to send 'quit'.
+self._close_qmp_connection()
+else:
+# Regardless, we want to tidy up the socket.
+self._close_qmp_connection()
 
 # May raise subprocess.TimeoutExpired
 self._subp.wait(timeout=timeout)
-- 
2.31.1




Re: [PATCH v3 00/16] python/iotests: Run iotest linters during Python CI

2021-09-16 Thread John Snow
On Thu, Sep 16, 2021 at 12:10 AM John Snow  wrote:

> GitLab: https://gitlab.com/jsnow/qemu/-/commits/python-package-iotest
> CI: https://gitlab.com/jsnow/qemu/-/pipelines/371611883
> Based-On: <20210915175318.853225-1-hre...@redhat.com>
>   "[PULL 00/32] Block patches"
>
> Since iotests are such a heavy and prominent user of the Python qemu.qmp
> and qemu.machine packages, it would be convenient if the Python linting
> suite also checked this client for any possible regressions introduced
> by shifting around signatures, types, or interfaces in these packages.
>
> (We'd eventually find those problems when iotest 297 ran, but with
> increasing distance between Python development and Block development,
> the risk of an accidental breakage in this regard increases. I,
> personally, know to run iotests (and especially 297) after changing
> Python code, but not everyone in the future might. Plus, I am lazy, and
> I like only having to push one button.)
>
> Add the ability for the Python CI to run the iotest linters too, which
> means that the iotest linters would be checked against:
>
> - Python 3.6, using a frozen set of linting packages at their oldest
>   supported versions, using 'pipenv'
> - Python 3.6 through Python 3.10 inclusive, using 'tox' and the latest
>   versions of mypy/pylint that happen to be installed during test
>   time. This CI test is allowed to fail with a warning, and can serve
>   as a bellwether for when new incompatible changes may disrupt the
>   linters. Testing against old and new Python interpreters alike can
>   help surface incompatibility issues we may need to be aware of.)
>
> Here are example outputs of those CI jobs with this series applied:
>  - "check-python-pipenv": https://gitlab.com/jsnow/qemu/-/jobs/1377735087
>  - "check-python-tox": https://gitlab.com/jsnow/qemu/-/jobs/1377735088
>
> You can also run these same tests locally from ./python, plus one more:
>
> - "make check-dev" to test against whatever python you have.
> - "make check-pipenv", if you have Python 3.6 and pipenv installed.
> - "make check-tox", to test against multiple python versions you have
> installed,
> from 3.6 to 3.10 inclusive. (CI tests against all 5.)
>
> See the old commit message for more sample output, etc.
>
> https://lists.gnu.org/archive/html/qemu-devel/2021-06/msg07056.html
>
> V3:
>  - Added patch 1 which has been submitted separately upstream,
>but was necessary for testing.
>  - Rebased on top of hreitz/block, which fixed some linting issues.
>  - Added a workaround for a rather nasty mypy bug ... >:(
>
> V2:
>  - Added patches 1-5 which do some more delinting.
>  - Added patch 8, which scans subdirs for tests to lint.
>  - Added patch 17, which improves the speed of mypy analysis.
>  - Patch 14 is different because of the new patch 8.
>
> John Snow (16):
>   python: Update for pylint 2.10
>   iotests/mirror-top-perms: Adjust imports
>   iotests/migrate-bitmaps-postcopy-test: declare instance variables
>   iotests/migrate-bitmaps-test: delint
>   iotests/297: modify is_python_file to work from any CWD
>   iotests/297: Add get_files() function
>   iotests/297: Don't rely on distro-specific linter binaries
>   iotests/297: Create main() function
>   iotests/297: Separate environment setup from test execution
>   iotests/297: Add 'directory' argument to run_linters
>   iotests/297: return error code from run_linters()
>   iotests/297: split linters.py off from 297
>   iotests/linters: Add entry point for Python CI linters
>   iotests/linters: Add workaround for mypy bug #9852
>   python: Add iotest linters to test suite
>   iotests/linters: check mypy files all at once
>
>  python/qemu/machine/machine.py|   9 +-
>  python/setup.cfg  |   1 +
>  python/tests/iotests.sh   |   4 +
>  tests/qemu-iotests/297|  81 ++-
>  tests/qemu-iotests/linters.py | 129 ++
>  .../tests/migrate-bitmaps-postcopy-test   |   3 +
>  tests/qemu-iotests/tests/migrate-bitmaps-test |  50 ---
>  tests/qemu-iotests/tests/mirror-top-perms |   7 +-
>  8 files changed, 186 insertions(+), 98 deletions(-)
>  create mode 100755 python/tests/iotests.sh
>  create mode 100755 tests/qemu-iotests/linters.py
>
> --
> 2.31.1
>
>
>
FWIW: I sent a new version of a pull request that adds pylint 2.10 *and*
2.11 support; the 2.11 release happened just yesterday, so I am going to
rebase this series.

Additionally, I found a new way to avoid sys.path hacking in all of our
test files entirely, so I will include that in this series, rebase, and
resend extremely soon. If you have difficulties applying this patchset or
testing it, sit tight for a refreshed version -- but most of these patches
can still be reviewed on their own merits in the meantime.

Thanks,
--js


[PATCH 14/15] python/aqmp: Remove scary message

2021-09-16 Thread John Snow
The scary message interferes with the iotests output. Coincidentally, if
iotests works by removing this, then it's good evidence that we don't
really need to scare people away from using it.

Signed-off-by: John Snow 
---
 python/qemu/aqmp/__init__.py | 14 --
 1 file changed, 14 deletions(-)

diff --git a/python/qemu/aqmp/__init__.py b/python/qemu/aqmp/__init__.py
index ab1782999c..4b7df53e00 100644
--- a/python/qemu/aqmp/__init__.py
+++ b/python/qemu/aqmp/__init__.py
@@ -21,8 +21,6 @@
 # This work is licensed under the terms of the GNU GPL, version 2.  See
 # the COPYING file in the top-level directory.
 
-import warnings
-
 from .error import AQMPError
 from .events import EventListener
 from .message import Message
@@ -30,18 +28,6 @@
 from .qmp_client import ExecInterruptedError, ExecuteError, QMPClient
 
 
-_WMSG = """
-
-The Asynchronous QMP library is currently in development and its API
-should be considered highly fluid and subject to change. It should
-not be used by any other scripts checked into the QEMU tree.
-
-Proceed with caution!
-"""
-
-warnings.warn(_WMSG, FutureWarning)
-
-
 # The order of these fields impact the Sphinx documentation order.
 __all__ = (
 # Classes, most to least important
-- 
2.31.1




[PATCH 15/15] python, iotests: replace qmp with aqmp

2021-09-16 Thread John Snow
Swap out the synchronous QEMUMonitorProtocol from qemu.qmp with the sync
wrapper from qemu.aqmp instead.

Add an escape hatch in the form of the environment variable
QEMU_PYTHON_LEGACY_QMP which allows you to cajole QEMUMachine into using
the old interface, proving that both implementations work concurrently.

Signed-off-by: John Snow 
---
 python/qemu/machine/machine.py | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/python/qemu/machine/machine.py b/python/qemu/machine/machine.py
index 8f5a6649e5..6b005dd5d1 100644
--- a/python/qemu/machine/machine.py
+++ b/python/qemu/machine/machine.py
@@ -41,7 +41,6 @@
 )
 
 from qemu.qmp import (  # pylint: disable=import-error
-QEMUMonitorProtocol,
 QMPMessage,
 QMPReturnValue,
 SocketAddrT,
@@ -50,6 +49,12 @@
 from . import console_socket
 
 
+if os.environ.get('QEMU_PYTHON_LEGACY_QMP'):
+from qemu.qmp import QEMUMonitorProtocol
+else:
+from qemu.aqmp.legacy import QEMUMonitorProtocol
+
+
 LOG = logging.getLogger(__name__)
 
 
-- 
2.31.1




[PATCH 12/15] iotests: Disable AQMP logging under non-debug modes

2021-09-16 Thread John Snow
Disable the aqmp logger, which likes to (at the moment) print out
intermediate warnings and errors that cause session termination; disable
them so they don't interfere with the job output.

Leave any "CRITICAL" warnings enabled though, those are ones that we
should never see, no matter what.

Signed-off-by: John Snow 
---
 tests/qemu-iotests/iotests.py | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
index 273d2777ae..47e5f9738b 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -1383,6 +1383,8 @@ def execute_setup_common(supported_fmts: Sequence[str] = 
(),
 if debug:
 sys.argv.remove('-d')
 logging.basicConfig(level=(logging.DEBUG if debug else logging.WARN))
+if not debug:
+logging.getLogger("qemu.aqmp.qmp_client").setLevel(logging.CRITICAL)
 
 _verify_image_format(supported_fmts, unsupported_fmts)
 _verify_protocol(supported_protocols, unsupported_protocols)
-- 
2.31.1




[PATCH 05/15] python/qmp: add send_fd_scm directly to QEMUMonitorProtocol

2021-09-16 Thread John Snow
It turns out you can do this directly from Python ... and because of
this, you don't need to worry about setting the inheritability of the
fds or spawning another process.

Doing this is helpful because it allows QEMUMonitorProtocol to keep its
file descriptor and socket object as private implementation details.

*that* is helpful in turn because it allows me to write a compatible,
 alternative implementation.

Signed-off-by: John Snow 
---
 python/qemu/machine/machine.py | 44 +++---
 python/qemu/qmp/__init__.py| 21 +++-
 2 files changed, 18 insertions(+), 47 deletions(-)

diff --git a/python/qemu/machine/machine.py b/python/qemu/machine/machine.py
index ae945ca3c9..1c6532a3d6 100644
--- a/python/qemu/machine/machine.py
+++ b/python/qemu/machine/machine.py
@@ -213,48 +213,22 @@ def add_fd(self: _T, fd: int, fdset: int,
 def send_fd_scm(self, fd: Optional[int] = None,
 file_path: Optional[str] = None) -> int:
 """
-Send an fd or file_path to socket_scm_helper.
+Send an fd or file_path to the remote via SCM_RIGHTS.
 
-Exactly one of fd and file_path must be given.
-If it is file_path, the helper will open that file and pass its own fd.
+Exactly one of fd and file_path must be given.  If it is
+file_path, the file will be opened read-only and the new file
+descriptor will be sent to the remote.
 """
-# In iotest.py, the qmp should always use unix socket.
-assert self._qmp.is_scm_available()
-if self._socket_scm_helper is None:
-raise QEMUMachineError("No path to socket_scm_helper set")
-if not os.path.exists(self._socket_scm_helper):
-raise QEMUMachineError("%s does not exist" %
-   self._socket_scm_helper)
-
-# This did not exist before 3.4, but since then it is
-# mandatory for our purpose
-if hasattr(os, 'set_inheritable'):
-os.set_inheritable(self._qmp.get_sock_fd(), True)
-if fd is not None:
-os.set_inheritable(fd, True)
-
-fd_param = ["%s" % self._socket_scm_helper,
-"%d" % self._qmp.get_sock_fd()]
-
 if file_path is not None:
 assert fd is None
-fd_param.append(file_path)
+with open(file_path, "rb") as passfile:
+fd = passfile.fileno()
+self._qmp.send_fd_scm(fd)
 else:
 assert fd is not None
-fd_param.append(str(fd))
+self._qmp.send_fd_scm(fd)
 
-proc = subprocess.run(
-fd_param,
-stdin=subprocess.DEVNULL,
-stdout=subprocess.PIPE,
-stderr=subprocess.STDOUT,
-check=False,
-close_fds=False,
-)
-if proc.stdout:
-LOG.debug(proc.stdout)
-
-return proc.returncode
+return 0
 
 @staticmethod
 def _remove_if_exists(path: str) -> None:
diff --git a/python/qemu/qmp/__init__.py b/python/qemu/qmp/__init__.py
index ba15668c25..c3b8a70ec9 100644
--- a/python/qemu/qmp/__init__.py
+++ b/python/qemu/qmp/__init__.py
@@ -21,6 +21,7 @@
 import json
 import logging
 import socket
+import struct
 from types import TracebackType
 from typing import (
 Any,
@@ -408,18 +409,14 @@ def settimeout(self, timeout: Optional[float]) -> None:
 raise ValueError(msg)
 self.__sock.settimeout(timeout)
 
-def get_sock_fd(self) -> int:
+def send_fd_scm(self, fd: int) -> None:
 """
-Get the socket file descriptor.
-
-@return The file descriptor number.
-"""
-return self.__sock.fileno()
-
-def is_scm_available(self) -> bool:
+Send a file descriptor to the remote via SCM_RIGHTS.
 """
-Check if the socket allows for SCM_RIGHTS.
+if self.__sock.family != socket.AF_UNIX:
+raise RuntimeError("Can't use SCM_RIGHTS on non-AF_UNIX socket.")
 
-@return True if SCM_RIGHTS is available, otherwise False.
-"""
-return self.__sock.family == socket.AF_UNIX
+self.__sock.sendmsg(
+[b' '],
+[(socket.SOL_SOCKET, socket.SCM_RIGHTS, struct.pack('@i', fd))]
+)
-- 
2.31.1




[PATCH 06/15] python, iotests: remove socket_scm_helper

2021-09-16 Thread John Snow
It's not used anymore, now.

Signed-off-by: John Snow 
---
 tests/qemu-iotests/socket_scm_helper.c | 136 -
 python/qemu/machine/machine.py |   3 -
 python/qemu/machine/qtest.py   |   2 -
 tests/Makefile.include |   1 -
 tests/meson.build  |   4 -
 tests/qemu-iotests/iotests.py  |   3 -
 tests/qemu-iotests/meson.build |   5 -
 tests/qemu-iotests/testenv.py  |   8 +-
 8 files changed, 1 insertion(+), 161 deletions(-)
 delete mode 100644 tests/qemu-iotests/socket_scm_helper.c
 delete mode 100644 tests/qemu-iotests/meson.build

diff --git a/tests/qemu-iotests/socket_scm_helper.c 
b/tests/qemu-iotests/socket_scm_helper.c
deleted file mode 100644
index eb76d31aa9..00
--- a/tests/qemu-iotests/socket_scm_helper.c
+++ /dev/null
@@ -1,136 +0,0 @@
-/*
- * SCM_RIGHTS with unix socket help program for test
- *
- * Copyright IBM, Inc. 2013
- *
- * Authors:
- *  Wenchao Xia
- *
- * This work is licensed under the terms of the GNU LGPL, version 2 or later.
- * See the COPYING.LIB file in the top-level directory.
- */
-
-#include "qemu/osdep.h"
-#include 
-#include 
-
-/* #define SOCKET_SCM_DEBUG */
-
-/*
- * @fd and @fd_to_send will not be checked for validation in this function,
- * a blank will be sent as iov data to notify qemu.
- */
-static int send_fd(int fd, int fd_to_send)
-{
-struct msghdr msg;
-struct iovec iov[1];
-int ret;
-char control[CMSG_SPACE(sizeof(int))];
-struct cmsghdr *cmsg;
-
-memset(, 0, sizeof(msg));
-memset(control, 0, sizeof(control));
-
-/* Send a blank to notify qemu */
-iov[0].iov_base = (void *)" ";
-iov[0].iov_len = 1;
-
-msg.msg_iov = iov;
-msg.msg_iovlen = 1;
-
-msg.msg_control = control;
-msg.msg_controllen = sizeof(control);
-
-cmsg = CMSG_FIRSTHDR();
-
-cmsg->cmsg_len = CMSG_LEN(sizeof(int));
-cmsg->cmsg_level = SOL_SOCKET;
-cmsg->cmsg_type = SCM_RIGHTS;
-memcpy(CMSG_DATA(cmsg), _to_send, sizeof(int));
-
-do {
-ret = sendmsg(fd, , 0);
-} while (ret < 0 && errno == EINTR);
-
-if (ret < 0) {
-fprintf(stderr, "Failed to send msg, reason: %s\n", strerror(errno));
-}
-
-return ret;
-}
-
-/* Convert string to fd number. */
-static int get_fd_num(const char *fd_str, bool silent)
-{
-int sock;
-char *err;
-
-errno = 0;
-sock = strtol(fd_str, , 10);
-if (errno) {
-if (!silent) {
-fprintf(stderr, "Failed in strtol for socket fd, reason: %s\n",
-strerror(errno));
-}
-return -1;
-}
-if (!*fd_str || *err || sock < 0) {
-if (!silent) {
-fprintf(stderr, "bad numerical value for socket fd '%s'\n", 
fd_str);
-}
-return -1;
-}
-
-return sock;
-}
-
-/*
- * To make things simple, the caller needs to specify:
- * 1. socket fd.
- * 2. path of the file to be sent.
- */
-int main(int argc, char **argv, char **envp)
-{
-int sock, fd, ret;
-
-#ifdef SOCKET_SCM_DEBUG
-int i;
-for (i = 0; i < argc; i++) {
-fprintf(stderr, "Parameter %d: %s\n", i, argv[i]);
-}
-#endif
-
-if (argc != 3) {
-fprintf(stderr,
-"Usage: %s < socket-fd > < file-path >\n",
-argv[0]);
-return EXIT_FAILURE;
-}
-
-
-sock = get_fd_num(argv[1], false);
-if (sock < 0) {
-return EXIT_FAILURE;
-}
-
-fd = get_fd_num(argv[2], true);
-if (fd < 0) {
-/* Now only open a file in readonly mode for test purpose. If more
-   precise control is needed, use python script in file operation, 
which
-   is supposed to fork and exec this program. */
-fd = open(argv[2], O_RDONLY);
-if (fd < 0) {
-fprintf(stderr, "Failed to open file '%s'\n", argv[2]);
-return EXIT_FAILURE;
-}
-}
-
-ret = send_fd(sock, fd);
-if (ret < 0) {
-close(fd);
-return EXIT_FAILURE;
-}
-
-close(fd);
-return EXIT_SUCCESS;
-}
diff --git a/python/qemu/machine/machine.py b/python/qemu/machine/machine.py
index 1c6532a3d6..056d340e35 100644
--- a/python/qemu/machine/machine.py
+++ b/python/qemu/machine/machine.py
@@ -98,7 +98,6 @@ def __init__(self,
  name: Optional[str] = None,
  base_temp_dir: str = "/var/tmp",
  monitor_address: Optional[SocketAddrT] = None,
- socket_scm_helper: Optional[str] = None,
  sock_dir: Optional[str] = None,
  drain_console: bool = False,
  console_log: Optional[str] = None,
@@ -113,7 +112,6 @@ def __init__(self,
 @param name: prefix for socket and log file names (default: qemu-PID)
 @param base_temp_dir: default location where temp files are created
 @param monitor_address: address for QMP monitor
-@param socket_scm_helper: helper program, required for 

[PATCH 04/15] python/qmp: clear events on get_events() call

2021-09-16 Thread John Snow
All callers in the tree *already* clear the events after a call to
get_events(). Do it automatically instead and update callsites to remove
the manual clear call.

These semantics are quite a bit easier to emulate with async QMP, and
nobody appears to be abusing some emergent properties of what happens if
you decide not to clear them, so let's dial down to the dumber, simpler
thing.

Specifically: callers of clear() right after a call to get_events() are
more likely expressing their desire to not see any events they just
retrieved, whereas callers of clear_events() not in relation to a recent
call to pull_event/get_events are likely expressing their desire to
simply drop *all* pending events straight onto the floor. In the sync
world, this is safe enough; in the async world it's nearly impossible to
promise that nothing happens between getting and clearing the
events.

Making the retrieval also clear the queue is vastly simpler.

Signed-off-by: John Snow 
---
 python/qemu/machine/machine.py | 1 -
 python/qemu/qmp/__init__.py| 4 +++-
 python/qemu/qmp/qmp_shell.py   | 1 -
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/python/qemu/machine/machine.py b/python/qemu/machine/machine.py
index 34131884a5..ae945ca3c9 100644
--- a/python/qemu/machine/machine.py
+++ b/python/qemu/machine/machine.py
@@ -631,7 +631,6 @@ def get_qmp_events(self, wait: bool = False) -> 
List[QMPMessage]:
 events = self._qmp.get_events(wait=wait)
 events.extend(self._events)
 del self._events[:]
-self._qmp.clear_events()
 return events
 
 @staticmethod
diff --git a/python/qemu/qmp/__init__.py b/python/qemu/qmp/__init__.py
index 269516a79b..ba15668c25 100644
--- a/python/qemu/qmp/__init__.py
+++ b/python/qemu/qmp/__init__.py
@@ -374,7 +374,9 @@ def get_events(self, wait: bool = False) -> 
List[QMPMessage]:
 @return The list of available QMP events.
 """
 self.__get_events(wait)
-return self.__events
+events = self.__events
+self.__events = []
+return events
 
 def clear_events(self) -> None:
 """
diff --git a/python/qemu/qmp/qmp_shell.py b/python/qemu/qmp/qmp_shell.py
index 337acfce2d..e7d7eb18f1 100644
--- a/python/qemu/qmp/qmp_shell.py
+++ b/python/qemu/qmp/qmp_shell.py
@@ -381,7 +381,6 @@ def read_exec_command(self) -> bool:
 if cmdline == '':
 for event in self.get_events():
 print(event)
-self.clear_events()
 return True
 
 return self._execute_cmd(cmdline)
-- 
2.31.1




[PATCH 07/15] python/aqmp: add send_fd_scm

2021-09-16 Thread John Snow
The single space is indeed required to successfully transmit the file
descriptor to QEMU.

Signed-off-by: John Snow 
---
 python/qemu/aqmp/qmp_client.py | 17 +
 1 file changed, 17 insertions(+)

diff --git a/python/qemu/aqmp/qmp_client.py b/python/qemu/aqmp/qmp_client.py
index d2ad7459f9..58f85990bc 100644
--- a/python/qemu/aqmp/qmp_client.py
+++ b/python/qemu/aqmp/qmp_client.py
@@ -9,6 +9,8 @@
 
 import asyncio
 import logging
+import socket
+import struct
 from typing import (
 Dict,
 List,
@@ -624,3 +626,18 @@ async def execute(self, cmd: str,
 """
 msg = self.make_execute_msg(cmd, arguments, oob=oob)
 return await self.execute_msg(msg)
+
+@upper_half
+@require(Runstate.RUNNING)
+def send_fd_scm(self, fd: int) -> None:
+"""
+Send a file descriptor to the remote via SCM_RIGHTS.
+"""
+assert self._writer is not None
+sock = self._writer.transport.get_extra_info('socket')
+
+# Python 3.9+ adds socket.send_fds(...)
+sock.sendmsg(
+[b' '],
+[(socket.SOL_SOCKET, socket.SCM_RIGHTS, struct.pack('@i', fd))]
+)
-- 
2.31.1




[PATCH 02/15] python/aqmp: add .empty() method to EventListener

2021-09-16 Thread John Snow
Synchronous clients may want to know if they're about to block waiting
for an event or not. A method such as this is necessary to implement a
compatible interface for the old QEMUMonitorProtocol using the new async
internals.

Signed-off-by: John Snow 
---
 python/qemu/aqmp/events.py | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/python/qemu/aqmp/events.py b/python/qemu/aqmp/events.py
index fb81d21610..271899f6b8 100644
--- a/python/qemu/aqmp/events.py
+++ b/python/qemu/aqmp/events.py
@@ -556,6 +556,12 @@ async def get(self) -> Message:
 """
 return await self._queue.get()
 
+def empty(self) -> bool:
+"""
+Return `True` if there are no pending events.
+"""
+return self._queue.empty()
+
 def clear(self) -> None:
 """
 Clear this listener of all pending events.
-- 
2.31.1




[PATCH 01/15] python/aqmp: add greeting property to QMPClient

2021-09-16 Thread John Snow
Expose the greeting as a read-only property of QMPClient so it can be
retrieved at-will.

Signed-off-by: John Snow 
---
 python/qemu/aqmp/qmp_client.py | 5 +
 1 file changed, 5 insertions(+)

diff --git a/python/qemu/aqmp/qmp_client.py b/python/qemu/aqmp/qmp_client.py
index 82e9dab124..d2ad7459f9 100644
--- a/python/qemu/aqmp/qmp_client.py
+++ b/python/qemu/aqmp/qmp_client.py
@@ -224,6 +224,11 @@ def __init__(self, name: Optional[str] = None) -> None:
 'asyncio.Queue[QMPClient._PendingT]'
 ] = {}
 
+@property
+def greeting(self) -> Optional[Greeting]:
+"""The `Greeting` from the QMP server, if any."""
+return self._greeting
+
 @upper_half
 async def _establish_session(self) -> None:
 """
-- 
2.31.1




[PATCH 00/15] Switch iotests to using Async QMP

2021-09-16 Thread John Snow
Based-on: <20210916220716.1353698-1-js...@redhat.com>
Based-on: <20210915162955.333025-1-js...@redhat.com>
  [PULL 0/2] Python patches
  [PATCH v4 00/27] python: introduce Asynchronous QMP package

Hiya,

This series continues where the first AQMP series left off and adds a
synchronous 'legacy' wrapper around the new AQMP interface, then drops
it straight into iotests to prove that AQMP is functional and totally
cool and fine.

In the event that a regression happens and I am not physically proximate
to inflict damage upon, one may set the QEMU_PYTHON_LEGACY_QMP variable
to any non-empty string as it pleases you to engage the QMP machinery
you are used to.

I'd like to try and get this committed early in the 6.2 development
cycle to give ample time to smooth over any possible regressions.
I've tested it locally and via gitlab CI and "worksforme".

John Snow (15):
  python/aqmp: add greeting property to QMPClient
  python/aqmp: add .empty() method to EventListener
  python/aqmp: Return cleared events from EventListener.clear()
  python/qmp: clear events on get_events() call
  python/qmp: add send_fd_scm directly to QEMUMonitorProtocol
  python, iotests: remove socket_scm_helper
  python/aqmp: add send_fd_scm
  python/aqmp: Create MessageModel and StandaloneModel classes
  python/machine: remove has_quit argument
  python/machine: Add support for AQMP backend
  python/aqmp: Create sync QMP wrapper for iotests
  iotests: Disable AQMP logging under non-debug modes
  iotests: Accommodate async QMP Exception classes
  python/aqmp: Remove scary message
  python, iotests: replace qmp with aqmp

 tests/qemu-iotests/socket_scm_helper.c| 136 -
 python/qemu/aqmp/__init__.py  |  14 ---
 python/qemu/aqmp/events.py|  15 ++-
 python/qemu/aqmp/legacy.py| 131 
 python/qemu/aqmp/models.py|  67 ---
 python/qemu/aqmp/qmp_client.py|  22 
 python/qemu/machine/machine.py| 139 +-
 python/qemu/machine/qtest.py  |   2 -
 python/qemu/qmp/__init__.py   |  25 ++--
 python/qemu/qmp/qmp_shell.py  |   1 -
 scripts/simplebench/bench_block_job.py|   3 +-
 tests/Makefile.include|   1 -
 tests/meson.build |   4 -
 tests/qemu-iotests/040|   7 +-
 tests/qemu-iotests/218|   2 +-
 tests/qemu-iotests/255|   2 +-
 tests/qemu-iotests/iotests.py |   5 +-
 tests/qemu-iotests/meson.build|   5 -
 tests/qemu-iotests/testenv.py |   8 +-
 tests/qemu-iotests/tests/mirror-top-perms |   6 +-
 20 files changed, 321 insertions(+), 274 deletions(-)
 delete mode 100644 tests/qemu-iotests/socket_scm_helper.c
 create mode 100644 python/qemu/aqmp/legacy.py
 delete mode 100644 tests/qemu-iotests/meson.build

-- 
2.31.1





[PATCH 03/15] python/aqmp: Return cleared events from EventListener.clear()

2021-09-16 Thread John Snow
This serves two purposes:

(1) It is now possible to discern whether or not clear() removed any
event(s) from the queue with absolute certainty, and

(2) It is now very easy to get a List of all pending events in one
chunk, which is useful for the sync bridge.

Signed-off-by: John Snow 
---
 python/qemu/aqmp/events.py | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/python/qemu/aqmp/events.py b/python/qemu/aqmp/events.py
index 271899f6b8..5f7150c78d 100644
--- a/python/qemu/aqmp/events.py
+++ b/python/qemu/aqmp/events.py
@@ -562,7 +562,7 @@ def empty(self) -> bool:
 """
 return self._queue.empty()
 
-def clear(self) -> None:
+def clear(self) -> List[Message]:
 """
 Clear this listener of all pending events.
 
@@ -570,17 +570,22 @@ def clear(self) -> None:
 pending FIFO queue synchronously. It can be also be used to
 manually clear any pending events, if desired.
 
+:return: The cleared events, if any.
+
 .. warning::
 Take care when discarding events. Cleared events will be
 silently tossed on the floor. All events that were ever
 accepted by this listener are visible in `history()`.
 """
+events = []
 while True:
 try:
-self._queue.get_nowait()
+events.append(self._queue.get_nowait())
 except asyncio.QueueEmpty:
 break
 
+return events
+
 def __aiter__(self) -> AsyncIterator[Message]:
 return self
 
-- 
2.31.1




Re: [RFC PATCH 0/4] block layer: split block APIs in graph and I/O

2021-09-16 Thread Paolo Bonzini
I think either -global or -global-state.

Paolo


Il gio 16 set 2021, 16:03 Emanuele Giuseppe Esposito 
ha scritto:

>
>
> On 15/09/2021 16:43, Stefan Hajnoczi wrote:
> > On Wed, Sep 15, 2021 at 02:11:41PM +0200, Paolo Bonzini wrote:
> >> On 13/09/21 15:10, Stefan Hajnoczi wrote:
> >>> On Wed, Sep 08, 2021 at 09:10:17AM -0400, Emanuele Giuseppe Esposito
> wrote:
>  Currently, block layer APIs like block-backend.h contain a mix of
>  functions that are either running in the main loop and under the
>  BQL, or are thread-safe functions and run in iothreads performing I/O.
>  The functions running under BQL also take care of modifying the
>  block graph, by using drain and/or aio_context_acquire/release.
>  This makes it very confusing to understand where each function
>  runs, and what assumptions it provided with regards to thread
>  safety.
> 
>  We call the functions running under BQL "graph API", and
>  distinguish them from the thread-safe "I/O API".
> >>>
> >>> Maybe "BQL" is clearer than "graph" because not all functions
> classified
> >>> as "graph" need to traverse/modify the graph.
> >>
> >> Bikeshedding, I like it! :)
> >>
> >> ... on the other hand, qemu-storage-daemon does not have a BQL (see
> patch
> >> 1); "graph API" functions run from the main (monitor) thread.
> >>
> >> The characteristic of the "graph API" is that they affect global state,
> so
> >> another possibility could be "global state API".  But is there any
> global
> >> state apart from the BlockDriverState graph and the associated
> >> BlockBackends?
> >
> > I would be happy with that name too.
> >
>
> Sounds good to me too, thanks.
> One more minor cosmetic thing: should I name the header
> block-backend-global-state.h or using block-backend-gs.h is
> straightforward enough?
>
> Thank you,
> Emanuele
>
>


Re: [PATCH RFC 02/13] hw/nvme: move zns helpers and types into zoned.h

2021-09-16 Thread Klaus Jensen
On Sep 16 09:06, Keith Busch wrote:
> On Tue, Sep 14, 2021 at 10:37:26PM +0200, Klaus Jensen wrote:
> > From: Klaus Jensen 
> > 
> > Move ZNS related helpers and types into zoned.h. Use a common prefix
> > (nvme_zoned or nvme_ns_zoned) for zns related functions.
> 
> Just a nitpicks on the naming, you can feel free to ignore.
> 
> Since we're within NVMe specific protocol here, using that terminology
> should be fine. I prefer "nvme_zns_" for the prefix.
> 
> And for function names like "nvme_zoned_zs()", the "zs" abbreviation
> expands to "zone_state", so "zone" is redunant. I think
> "nvme_zns_state()" is a good name for that one.

Naming is hard, so nitpicks are appreciated! I see your points, I'll fix
it up :)


signature.asc
Description: PGP signature


Re: [PATCH RFC 00/13] hw/nvme: experimental user-creatable objects

2021-09-16 Thread Klaus Jensen
On Sep 16 14:41, Kevin Wolf wrote:
> Am 14.09.2021 um 22:37 hat Klaus Jensen geschrieben:
> > From: Klaus Jensen 
> > 
> > Hi,
> > 
> > This is an attempt at adressing a bunch of issues that have presented
> > themselves since we added subsystem support. It's been brewing for a
> > while now.
> > 
> > Fundamentally, I've come to the conclusion that modeling namespaces and
> > subsystems as "devices" is wrong. They should have been user-creatable
> > objects. We've run into multiple issues with wrt. hotplugging due to how
> > namespaces hook up to the controller with a bus. The bus-based design
> > made a lot of sense when we didn't have subsystem support and it follows
> > the design of hw/scsi. But, the problem here is that the bus-based
> > design dictates a one parent relationship, and with shared namespaces,
> > that is just not true. If the namespaces are considered to have a single
> > parent, that parent is the subsystem, not any specific controller.
> > 
> > This series adds a set of experimental user-creatable objects:
> > 
> >   -object x-nvme-subsystem
> >   -object x-nvme-ns-nvm
> >   -object x-nvme-ns-zoned
> > 
> > It also adds a new controller device (-device x-nvme-ctrl) that supports
> > these new objects (and gets rid of a bunch of deprecated and confusing
> > parameters). This new approach has a bunch of benefits (other than just
> > fixing the hotplugging issues properly) - we also get support for some
> > nice introspection through some new dynamic properties:
> > 
> >   (qemu) qom-get /machine/peripheral/nvme-ctrl-1 attached-namespaces
> >   [
> >   "/objects/nvm-1",
> >   "/objects/zns-1"
> >   ]
> > 
> >   (qemu) qom-list /objects/zns-1
> >   type (string)
> >   subsys (link)
> >   nsid (uint32)
> >   uuid (string)
> >   attached-ctrls (str)
> >   eui64 (string)
> >   blockdev (string)
> >   pi-first (bool)
> >   pi-type (NvmeProtInfoType)
> >   extended-lba (bool)
> >   metadata-size (uint16)
> >   lba-size (size)
> >   zone-descriptor-extension-size (size)
> >   zone-cross-read (bool)
> >   zone-max-open (uint32)
> >   zone-capacity (size)
> >   zone-size (size)
> >   zone-max-active (uint32)
> > 
> >   (qemu) qom-get /objects/zns-1 pi-type
> >   "none"
> > 
> >   (qemu) qom-get /objects/zns-1 eui64
> >   "52:54:00:17:67:a0:40:15"
> > 
> >   (qemu) qom-get /objects/zns-1 zone-capacity
> >   12582912
> > 
> > Currently, there are no shortcuts, so you have to define the full
> > topology to get it up and running. Notice that the topology is explicit
> > (the 'subsys' and 'attached-ctrls' links). There are no 'nvme-bus'
> > anymore.
> > 
> >   -object x-nvme-subsystem,id=subsys0,subnqn=foo
> >   -device x-nvme-ctrl,id=nvme-ctrl-0,serial=foo,subsys=subsys0
> >   -device x-nvme-ctrl,id=nvme-ctrl-1,serial=bar,subsys=subsys0
> >   -drive  id=nvm-1,file=nvm-1.img,format=raw,if=none,discard=unmap
> >   -object 
> > x-nvme-ns-nvm,id=nvm-1,blockdev=nvm-1,nsid=1,subsys=subsys0,attached-ctrls=nvme-ctrl-1
> >   -drive  id=nvm-2,file=nvm-2.img,format=raw,if=none,discard=unmap
> >   -object 
> > x-nvme-ns-nvm,id=nvm-2,blockdev=nvm-2,nsid=2,subsys=subsys0,attached-ctrls=nvme-ctrl-0
> 
> I may be wrong here, but my first gut feeling when seeing this was that
> referencing the controller device in the namespace object feels
> backwards. Usually, we have objects that are created independently and
> then the devices reference them.
> 
> Your need to use a machine_done notifier is probably related to that,
> too, because it goes against the normal initialisation order, so you
> have to wait. Error handling also isn't really possible in the notifier
> any more, so this series seems to just print something to stderr, but
> ignore the error otherwise.
> 
> Did you consider passing a list of namespaces to the controller device
> instead?
> 
> I guess a problem that you have with both ways is that support for
> list options isn't great in QemuOpts, which is still used both for
> -object and -device in the system emulator...
> 

Heh. Exactly. The ability to better support lists with -object through
QAPI is why I did it like this...

Having the list of namespaces on the controller is preferable. I'll see
what I can come up with.

Thanks!


signature.asc
Description: PGP signature


Re: [PATCH RFC 02/13] hw/nvme: move zns helpers and types into zoned.h

2021-09-16 Thread Keith Busch
On Tue, Sep 14, 2021 at 10:37:26PM +0200, Klaus Jensen wrote:
> From: Klaus Jensen 
> 
> Move ZNS related helpers and types into zoned.h. Use a common prefix
> (nvme_zoned or nvme_ns_zoned) for zns related functions.

Just a nitpicks on the naming, you can feel free to ignore.

Since we're within NVMe specific protocol here, using that terminology
should be fine. I prefer "nvme_zns_" for the prefix.

And for function names like "nvme_zoned_zs()", the "zs" abbreviation
expands to "zone_state", so "zone" is redunant. I think
"nvme_zns_state()" is a good name for that one.



Re: [PATCH v3 02/16] iotests/mirror-top-perms: Adjust imports

2021-09-16 Thread Philippe Mathieu-Daudé
On 9/16/21 4:27 PM, John Snow wrote:
> On Thu, Sep 16, 2021 at 12:27 AM Philippe Mathieu-Daudé
> mailto:phi...@redhat.com>> wrote:
> 
> On 9/16/21 6:09 AM, John Snow wrote:
> > We need to import things from the qemu namespace; importing the
> > namespace alone doesn't bring the submodules with it -- unless someone
> > else (like iotests.py) imports them too.
> >
> > Adjust the imports.
> >
> > Signed-off-by: John Snow mailto:js...@redhat.com>>
> > ---
> >  tests/qemu-iotests/tests/mirror-top-perms | 7 ---
> >  1 file changed, 4 insertions(+), 3 deletions(-)
> >
> > diff --git a/tests/qemu-iotests/tests/mirror-top-perms
> b/tests/qemu-iotests/tests/mirror-top-perms
> > index 2fc8dd66e0..de18182590 100755
> > --- a/tests/qemu-iotests/tests/mirror-top-perms
> > +++ b/tests/qemu-iotests/tests/mirror-top-perms
> > @@ -25,7 +25,8 @@ from iotests import qemu_img
> > 
> >  # Import qemu after iotests.py has amended sys.path
> >  # pylint: disable=wrong-import-order
> > -import qemu
> > +from qemu import qmp
> > +from qemu.machine import machine
> 
> Not straight-forward import name...
> 
> 
> You mean the 'qemu.machine.machine' path? If so, I agree. It will be
> fixed when I refactor QEMUMachine. A/QMP happens first.

Good news!




Re: [PATCH v3 01/16] python: Update for pylint 2.10

2021-09-16 Thread John Snow
On Thu, Sep 16, 2021 at 9:30 AM Alex Bennée  wrote:

>
> John Snow  writes:
>
> > A few new annoyances. Of note is the new warning for an unspecified
> > encoding when opening a text file, which actually does indicate a
> > potentially real problem; see
> > https://www.python.org/dev/peps/pep-0597/#motivation
> >
> > Use LC_CTYPE to determine an encoding to use for interpreting QEMU's
> > terminal output. Note that Python states: "language code and encoding
> > may be None if their values cannot be determined" -- use a platform
> > default as a backup.
> >
> > Signed-off-by: John Snow 
>
> I've cherry-picked this one into my testing/next because it fixes a
> failure but in case you get it merged elsewhere:
>
> Tested-by: Alex Bennée 
> Reviewed-by: Alex Bennée 
>
> --
> Alex Bennée
>
>
Dan had some comments here:
https://lists.gnu.org/archive/html/qemu-devel/2021-09/msg04031.html

I'm waiting for his reply, but feel free to use this as a build fix in the
meantime for testing, I don't think it should hurt anything.


Re: [PATCH v3 02/16] iotests/mirror-top-perms: Adjust imports

2021-09-16 Thread John Snow
On Thu, Sep 16, 2021 at 12:27 AM Philippe Mathieu-Daudé 
wrote:

> On 9/16/21 6:09 AM, John Snow wrote:
> > We need to import things from the qemu namespace; importing the
> > namespace alone doesn't bring the submodules with it -- unless someone
> > else (like iotests.py) imports them too.
> >
> > Adjust the imports.
> >
> > Signed-off-by: John Snow 
> > ---
> >  tests/qemu-iotests/tests/mirror-top-perms | 7 ---
> >  1 file changed, 4 insertions(+), 3 deletions(-)
> >
> > diff --git a/tests/qemu-iotests/tests/mirror-top-perms
> b/tests/qemu-iotests/tests/mirror-top-perms
> > index 2fc8dd66e0..de18182590 100755
> > --- a/tests/qemu-iotests/tests/mirror-top-perms
> > +++ b/tests/qemu-iotests/tests/mirror-top-perms
> > @@ -25,7 +25,8 @@ from iotests import qemu_img
> >
> >  # Import qemu after iotests.py has amended sys.path
> >  # pylint: disable=wrong-import-order
> > -import qemu
> > +from qemu import qmp
> > +from qemu.machine import machine
>
> Not straight-forward import name...
>
>
You mean the 'qemu.machine.machine' path? If so, I agree. It will be fixed
when I refactor QEMUMachine. A/QMP happens first.


> Reviewed-by: Philippe Mathieu-Daudé 
>
>


Re: [RFC PATCH 0/4] block layer: split block APIs in graph and I/O

2021-09-16 Thread Emanuele Giuseppe Esposito




On 15/09/2021 16:43, Stefan Hajnoczi wrote:

On Wed, Sep 15, 2021 at 02:11:41PM +0200, Paolo Bonzini wrote:

On 13/09/21 15:10, Stefan Hajnoczi wrote:

On Wed, Sep 08, 2021 at 09:10:17AM -0400, Emanuele Giuseppe Esposito wrote:

Currently, block layer APIs like block-backend.h contain a mix of
functions that are either running in the main loop and under the
BQL, or are thread-safe functions and run in iothreads performing I/O.
The functions running under BQL also take care of modifying the
block graph, by using drain and/or aio_context_acquire/release.
This makes it very confusing to understand where each function
runs, and what assumptions it provided with regards to thread
safety.

We call the functions running under BQL "graph API", and
distinguish them from the thread-safe "I/O API".


Maybe "BQL" is clearer than "graph" because not all functions classified
as "graph" need to traverse/modify the graph.


Bikeshedding, I like it! :)

... on the other hand, qemu-storage-daemon does not have a BQL (see patch
1); "graph API" functions run from the main (monitor) thread.

The characteristic of the "graph API" is that they affect global state, so
another possibility could be "global state API".  But is there any global
state apart from the BlockDriverState graph and the associated
BlockBackends?


I would be happy with that name too.



Sounds good to me too, thanks.
One more minor cosmetic thing: should I name the header 
block-backend-global-state.h or using block-backend-gs.h is 
straightforward enough?


Thank you,
Emanuele




Re: [PATCH v3 01/16] python: Update for pylint 2.10

2021-09-16 Thread Alex Bennée


John Snow  writes:

> A few new annoyances. Of note is the new warning for an unspecified
> encoding when opening a text file, which actually does indicate a
> potentially real problem; see
> https://www.python.org/dev/peps/pep-0597/#motivation
>
> Use LC_CTYPE to determine an encoding to use for interpreting QEMU's
> terminal output. Note that Python states: "language code and encoding
> may be None if their values cannot be determined" -- use a platform
> default as a backup.
>
> Signed-off-by: John Snow 

I've cherry-picked this one into my testing/next because it fixes a
failure but in case you get it merged elsewhere:

Tested-by: Alex Bennée 
Reviewed-by: Alex Bennée 

-- 
Alex Bennée



Re: [PATCH 5/8] qdev: improve find_device_state() to distinguish simple not found case

2021-09-16 Thread Vladimir Sementsov-Ogievskiy

16.09.2021 13:48, Markus Armbruster wrote:

Vladimir Sementsov-Ogievskiy  writes:


We'll need this for realizing qdev_find_child() in the next commit.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  softmmu/qdev-monitor.c | 48 +-
  1 file changed, 33 insertions(+), 15 deletions(-)

diff --git a/softmmu/qdev-monitor.c b/softmmu/qdev-monitor.c
index 721dec2d82..0117989009 100644
--- a/softmmu/qdev-monitor.c
+++ b/softmmu/qdev-monitor.c
@@ -819,7 +819,12 @@ void qmp_device_add(QDict *qdict, QObject **ret_data, 
Error **errp)
  object_unref(OBJECT(dev));
  }
  
-static DeviceState *find_device_state(const char *id, Error **errp)

+/*
+ * Returns: 1 when found, @dev set
+ *  0 not found, @dev and @errp untouched
+ * <0 error, or id is ambiguous, @errp set
+ */
+static int find_device_state(const char *id, DeviceState **dev, Error **errp)
  {
  Object *obj;
  
@@ -835,17 +840,16 @@ static DeviceState *find_device_state(const char *id, Error **errp)

  }
  
  if (!obj) {

-error_set(errp, ERROR_CLASS_DEVICE_NOT_FOUND,
-  "Device '%s' not found", id);
-return NULL;
+return 0;
  }
  
  if (!object_dynamic_cast(obj, TYPE_DEVICE)) {

  error_setg(errp, "%s is not a hotpluggable device", id);
-return NULL;
+return -EINVAL;
  }
  
-return DEVICE(obj);

+*dev = DEVICE(obj);
+return 1;
  }
  
  void qdev_unplug(DeviceState *dev, Error **errp)

@@ -894,16 +898,25 @@ void qdev_unplug(DeviceState *dev, Error **errp)
  
  void qmp_device_del(const char *id, Error **errp)

  {
-DeviceState *dev = find_device_state(id, errp);
-if (dev != NULL) {
-if (dev->pending_deleted_event) {
-error_setg(errp, "Device %s is already in the "
- "process of unplug", id);
-return;
+int ret;
+DeviceState *dev;
+
+ret = find_device_state(id, , errp);
+if (ret <= 0) {
+if (ret == 0) {
+error_set(errp, ERROR_CLASS_DEVICE_NOT_FOUND,
+  "Device '%s' not found", id);
  }
+return;
+}
  
-qdev_unplug(dev, errp);

+if (dev->pending_deleted_event) {
+error_setg(errp, "Device %s is already in the "
+ "process of unplug", id);
+return;
  }
+
+qdev_unplug(dev, errp);
  }
  
  void hmp_device_add(Monitor *mon, const QDict *qdict)

@@ -925,11 +938,16 @@ void hmp_device_del(Monitor *mon, const QDict *qdict)
  
  BlockBackend *blk_by_qdev_id(const char *id, Error **errp)

  {
+int ret;
  DeviceState *dev;
  BlockBackend *blk;
  
-dev = find_device_state(id, errp);

-if (dev == NULL) {
+ret = find_device_state(id, , errp);
+if (ret <= 0) {
+if (ret == 0) {
+error_set(errp, ERROR_CLASS_DEVICE_NOT_FOUND,
+  "Device '%s' not found", id);
+}
  return NULL;
  }


Awkward.

Before, find_device_state() either finds something (and returns it) or
doesn't (and sets @errp).

Afterward, it can fail to find in two ways, and only one of it sets
@errp.  The existing callers laboriously fuse the two back together.
The next commit adds a caller that doesn't.

Failure modes that need to be handled differently are often the result
of a function doing too much.  Let's have a closer look at this one
before the patch:

 static DeviceState *find_device_state(const char *id, Error **errp)
 {
 Object *obj;

 if (id[0] == '/') {
 obj = object_resolve_path(id, NULL);

This interprets @id as a QOM path, and tries to resolve it.

On failure, @obj becomes NULL.  On success, it points to an object of
arbitrary type.

 } else {
 char *root_path = object_get_canonical_path(qdev_get_peripheral());
 char *path = g_strdup_printf("%s/%s", root_path, id);

 g_free(root_path);
 obj = object_resolve_path_type(path, TYPE_DEVICE, NULL);
 g_free(path);

This interprets @id as qdev ID, maps it to a QOM path, and tries to
resolve it to a TYPE_DEVICE.  Fails when the path doesn't resolve, and
when it resolves to something that isn't a TYPE_DEVICE.  The latter
can't happen as long as we put only devices under /machine/peripheral/.

On failure, @obj becomes NULL.  On success, it points to a TYPE_DEVICE
object.

 }

 if (!obj) {
 error_set(errp, ERROR_CLASS_DEVICE_NOT_FOUND,
   "Device '%s' not found", id);
 return NULL;
 }

 if (!object_dynamic_cast(obj, TYPE_DEVICE)) {
 error_setg(errp, "%s is not a hotpluggable device", id);
 return NULL;
 }

Unclean.

If we somehow ended up with a non-device /machine/peripheral/foo, then
find_device_state("foo", errp) would fail the first way, but
find_device_state("/machine/peripheral/foo", errp) would fail the 

Re: [PATCH RFC 00/13] hw/nvme: experimental user-creatable objects

2021-09-16 Thread Kevin Wolf
Am 14.09.2021 um 22:37 hat Klaus Jensen geschrieben:
> From: Klaus Jensen 
> 
> Hi,
> 
> This is an attempt at adressing a bunch of issues that have presented
> themselves since we added subsystem support. It's been brewing for a
> while now.
> 
> Fundamentally, I've come to the conclusion that modeling namespaces and
> subsystems as "devices" is wrong. They should have been user-creatable
> objects. We've run into multiple issues with wrt. hotplugging due to how
> namespaces hook up to the controller with a bus. The bus-based design
> made a lot of sense when we didn't have subsystem support and it follows
> the design of hw/scsi. But, the problem here is that the bus-based
> design dictates a one parent relationship, and with shared namespaces,
> that is just not true. If the namespaces are considered to have a single
> parent, that parent is the subsystem, not any specific controller.
> 
> This series adds a set of experimental user-creatable objects:
> 
>   -object x-nvme-subsystem
>   -object x-nvme-ns-nvm
>   -object x-nvme-ns-zoned
> 
> It also adds a new controller device (-device x-nvme-ctrl) that supports
> these new objects (and gets rid of a bunch of deprecated and confusing
> parameters). This new approach has a bunch of benefits (other than just
> fixing the hotplugging issues properly) - we also get support for some
> nice introspection through some new dynamic properties:
> 
>   (qemu) qom-get /machine/peripheral/nvme-ctrl-1 attached-namespaces
>   [
>   "/objects/nvm-1",
>   "/objects/zns-1"
>   ]
> 
>   (qemu) qom-list /objects/zns-1
>   type (string)
>   subsys (link)
>   nsid (uint32)
>   uuid (string)
>   attached-ctrls (str)
>   eui64 (string)
>   blockdev (string)
>   pi-first (bool)
>   pi-type (NvmeProtInfoType)
>   extended-lba (bool)
>   metadata-size (uint16)
>   lba-size (size)
>   zone-descriptor-extension-size (size)
>   zone-cross-read (bool)
>   zone-max-open (uint32)
>   zone-capacity (size)
>   zone-size (size)
>   zone-max-active (uint32)
> 
>   (qemu) qom-get /objects/zns-1 pi-type
>   "none"
> 
>   (qemu) qom-get /objects/zns-1 eui64
>   "52:54:00:17:67:a0:40:15"
> 
>   (qemu) qom-get /objects/zns-1 zone-capacity
>   12582912
> 
> Currently, there are no shortcuts, so you have to define the full
> topology to get it up and running. Notice that the topology is explicit
> (the 'subsys' and 'attached-ctrls' links). There are no 'nvme-bus'
> anymore.
> 
>   -object x-nvme-subsystem,id=subsys0,subnqn=foo
>   -device x-nvme-ctrl,id=nvme-ctrl-0,serial=foo,subsys=subsys0
>   -device x-nvme-ctrl,id=nvme-ctrl-1,serial=bar,subsys=subsys0
>   -drive  id=nvm-1,file=nvm-1.img,format=raw,if=none,discard=unmap
>   -object 
> x-nvme-ns-nvm,id=nvm-1,blockdev=nvm-1,nsid=1,subsys=subsys0,attached-ctrls=nvme-ctrl-1
>   -drive  id=nvm-2,file=nvm-2.img,format=raw,if=none,discard=unmap
>   -object 
> x-nvme-ns-nvm,id=nvm-2,blockdev=nvm-2,nsid=2,subsys=subsys0,attached-ctrls=nvme-ctrl-0

I may be wrong here, but my first gut feeling when seeing this was that
referencing the controller device in the namespace object feels
backwards. Usually, we have objects that are created independently and
then the devices reference them.

Your need to use a machine_done notifier is probably related to that,
too, because it goes against the normal initialisation order, so you
have to wait. Error handling also isn't really possible in the notifier
any more, so this series seems to just print something to stderr, but
ignore the error otherwise.

Did you consider passing a list of namespaces to the controller device
instead?

I guess a problem that you have with both ways is that support for list
options isn't great in QemuOpts, which is still used both for -object
and -device in the system emulator...

Kevin




Re: [PATCH v5 0/6] block/rbd: migrate to coroutines and add write zeroes support

2021-09-16 Thread Peter Lieven

Am 09.07.21 um 12:21 schrieb Kevin Wolf:

Am 08.07.2021 um 20:23 hat Peter Lieven geschrieben:

Am 08.07.2021 um 14:18 schrieb Kevin Wolf :

Am 07.07.2021 um 20:13 hat Peter Lieven geschrieben:

Am 06.07.2021 um 17:25 schrieb Kevin Wolf :
Am 06.07.2021 um 16:55 hat Peter Lieven geschrieben:

I will have a decent look after my vacation.

Sounds good, thanks. Enjoy your vacation!

As I had to fire up my laptop to look into another issue anyway, I
have sent two patches for updating MAINTAINERS and to fix the int vs.
bool mix for task->complete.

I think you need to reevaluate your definition of vacation. ;-)

Lets talk about this when the kids are grown up. Sometimes sending
patches can be quite relaxing :-)

Heh, fair enough. :-)


But thanks anyway.


As Paolos fix (5f50be9b5) is relatively new and there are maybe other
non obvious problems when removing the BH indirection and we are close
to soft freeze I would leave the BH removal change for 6.2.

Sure, code cleanups aren't urgent.

Isn’t the indirection also a slight performance drop?

Yeah, I guess technically it is, though I doubt it's measurable.



As promised I was trying to remove the indirection through the BH after Qemu 
6.1 release.

However, if I remove the BH I run into the following assertion while running 
some fio tests:


qemu-system-x86_64: ../block/block-backend.c:1197: blk_wait_while_drained: Assertion 
`blk->in_flight > 0' failed.


Any idea?


This is what I changed:


diff --git a/block/rbd.c b/block/rbd.c
index 3cb24f9981..bc1dbc20f7 100644
--- a/block/rbd.c
+++ b/block/rbd.c
@@ -1063,13 +1063,6 @@ static int qemu_rbd_resize(BlockDriverState *bs, 
uint64_t size)
 return 0;
 }

-static void qemu_rbd_finish_bh(void *opaque)
-{
-    RBDTask *task = opaque;
-    task->complete = true;
-    aio_co_wake(task->co);
-}
-
 /*
  * This is the completion callback function for all rbd aio calls
  * started from qemu_rbd_start_co().
@@ -1083,8 +1076,8 @@ static void qemu_rbd_completion_cb(rbd_completion_t c, 
RBDTask *task)
 {
 task->ret = rbd_aio_get_return_value(c);
 rbd_aio_release(c);
-    aio_bh_schedule_oneshot(bdrv_get_aio_context(task->bs),
-    qemu_rbd_finish_bh, task);
+    task->complete = true;
+    aio_co_wake(task->co);
 }


Peter






[PATCH V3] block/rbd: implement bdrv_co_block_status

2021-09-16 Thread Peter Lieven
the qemu rbd driver currently lacks support for bdrv_co_block_status.
This results mainly in incorrect progress during block operations (e.g.
qemu-img convert with an rbd image as source).

This patch utilizes the rbd_diff_iterate2 call from librbd to detect
allocated and unallocated (all zero areas).

To avoid querying the ceph OSDs for the answer this is only done if
the image has the fast-diff feature which depends on the object-map and
exclusive-lock features. In this case it is guaranteed that the information
is present in memory in the librbd client and thus very fast.

If fast-diff is not available all areas are reported to be allocated
which is the current behaviour if bdrv_co_block_status is not implemented.

Signed-off-by: Peter Lieven 
---
V2->V3:
- check rbd_flags every time (they can change during runtime) [Ilya]
- also check for fast-diff invalid flag [Ilya]
- *map and *file cant be NULL [Ilya]
- set ret = BDRV_BLOCK_ZERO | BDRV_BLOCK_OFFSET_VALID in case of an
  unallocated area [Ilya]
- typo: catched -> caught [Ilya]
- changed wording about fast-diff, object-map and exclusive lock in
  commit msg [Ilya]

V1->V2:
- add commit comment [Stefano]
- use failed_post_open [Stefano]
- remove redundant assert [Stefano]
- add macro+comment for the magic -9000 value [Stefano]
- always set *file if its non NULL [Stefano]

 block/rbd.c | 126 
 1 file changed, 126 insertions(+)

diff --git a/block/rbd.c b/block/rbd.c
index dcf82b15b8..3cb24f9981 100644
--- a/block/rbd.c
+++ b/block/rbd.c
@@ -1259,6 +1259,131 @@ static ImageInfoSpecific 
*qemu_rbd_get_specific_info(BlockDriverState *bs,
 return spec_info;
 }
 
+typedef struct rbd_diff_req {
+uint64_t offs;
+uint64_t bytes;
+int exists;
+} rbd_diff_req;
+
+/*
+ * rbd_diff_iterate2 allows to interrupt the exection by returning a negative
+ * value in the callback routine. Choose a value that does not conflict with
+ * an existing exitcode and return it if we want to prematurely stop the
+ * execution because we detected a change in the allocation status.
+ */
+#define QEMU_RBD_EXIT_DIFF_ITERATE2 -9000
+
+static int qemu_rbd_co_block_status_cb(uint64_t offs, size_t len,
+   int exists, void *opaque)
+{
+struct rbd_diff_req *req = opaque;
+
+assert(req->offs + req->bytes <= offs);
+
+if (req->exists && offs > req->offs + req->bytes) {
+/*
+ * we started in an allocated area and jumped over an unallocated area,
+ * req->bytes contains the length of the allocated area before the
+ * unallocated area. stop further processing.
+ */
+return QEMU_RBD_EXIT_DIFF_ITERATE2;
+}
+if (req->exists && !exists) {
+/*
+ * we started in an allocated area and reached a hole. req->bytes
+ * contains the length of the allocated area before the hole.
+ * stop further processing.
+ */
+return QEMU_RBD_EXIT_DIFF_ITERATE2;
+}
+if (!req->exists && exists && offs > req->offs) {
+/*
+ * we started in an unallocated area and hit the first allocated
+ * block. req->bytes must be set to the length of the unallocated area
+ * before the allocated area. stop further processing.
+ */
+req->bytes = offs - req->offs;
+return QEMU_RBD_EXIT_DIFF_ITERATE2;
+}
+
+/*
+ * assert that we caught all cases above and allocation state has not
+ * changed during callbacks.
+ */
+assert(exists == req->exists || !req->bytes);
+req->exists = exists;
+
+/*
+ * assert that we either return an unallocated block or have got callbacks
+ * for all allocated blocks present.
+ */
+assert(!req->exists || offs == req->offs + req->bytes);
+req->bytes = offs + len - req->offs;
+
+return 0;
+}
+
+static int coroutine_fn qemu_rbd_co_block_status(BlockDriverState *bs,
+ bool want_zero, int64_t 
offset,
+ int64_t bytes, int64_t *pnum,
+ int64_t *map,
+ BlockDriverState **file)
+{
+BDRVRBDState *s = bs->opaque;
+int ret, r;
+struct rbd_diff_req req = { .offs = offset };
+uint64_t features, flags;
+
+assert(offset + bytes <= s->image_size);
+
+/* default to all sectors allocated */
+ret = BDRV_BLOCK_DATA | BDRV_BLOCK_OFFSET_VALID;
+*map = offset;
+*file = bs;
+*pnum = bytes;
+
+/* check if RBD image supports fast-diff */
+r = rbd_get_features(s->image, );
+if (r < 0) {
+goto out;
+}
+if (!(features & RBD_FEATURE_FAST_DIFF)) {
+goto out;
+}
+
+/* check if RBD fast-diff result is valid */
+r = rbd_get_flags(s->image, );
+if (r < 0) {
+goto out;
+}
+if (flags & RBD_FLAG_FAST_DIFF_INVALID) {
+

Re: [PATCH 5/8] qdev: improve find_device_state() to distinguish simple not found case

2021-09-16 Thread Markus Armbruster
Vladimir Sementsov-Ogievskiy  writes:

> We'll need this for realizing qdev_find_child() in the next commit.
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
>  softmmu/qdev-monitor.c | 48 +-
>  1 file changed, 33 insertions(+), 15 deletions(-)
>
> diff --git a/softmmu/qdev-monitor.c b/softmmu/qdev-monitor.c
> index 721dec2d82..0117989009 100644
> --- a/softmmu/qdev-monitor.c
> +++ b/softmmu/qdev-monitor.c
> @@ -819,7 +819,12 @@ void qmp_device_add(QDict *qdict, QObject **ret_data, 
> Error **errp)
>  object_unref(OBJECT(dev));
>  }
>  
> -static DeviceState *find_device_state(const char *id, Error **errp)
> +/*
> + * Returns: 1 when found, @dev set
> + *  0 not found, @dev and @errp untouched
> + * <0 error, or id is ambiguous, @errp set
> + */
> +static int find_device_state(const char *id, DeviceState **dev, Error **errp)
>  {
>  Object *obj;
>  
> @@ -835,17 +840,16 @@ static DeviceState *find_device_state(const char *id, 
> Error **errp)
>  }
>  
>  if (!obj) {
> -error_set(errp, ERROR_CLASS_DEVICE_NOT_FOUND,
> -  "Device '%s' not found", id);
> -return NULL;
> +return 0;
>  }
>  
>  if (!object_dynamic_cast(obj, TYPE_DEVICE)) {
>  error_setg(errp, "%s is not a hotpluggable device", id);
> -return NULL;
> +return -EINVAL;
>  }
>  
> -return DEVICE(obj);
> +*dev = DEVICE(obj);
> +return 1;
>  }
>  
>  void qdev_unplug(DeviceState *dev, Error **errp)
> @@ -894,16 +898,25 @@ void qdev_unplug(DeviceState *dev, Error **errp)
>  
>  void qmp_device_del(const char *id, Error **errp)
>  {
> -DeviceState *dev = find_device_state(id, errp);
> -if (dev != NULL) {
> -if (dev->pending_deleted_event) {
> -error_setg(errp, "Device %s is already in the "
> - "process of unplug", id);
> -return;
> +int ret;
> +DeviceState *dev;
> +
> +ret = find_device_state(id, , errp);
> +if (ret <= 0) {
> +if (ret == 0) {
> +error_set(errp, ERROR_CLASS_DEVICE_NOT_FOUND,
> +  "Device '%s' not found", id);
>  }
> +return;
> +}
>  
> -qdev_unplug(dev, errp);
> +if (dev->pending_deleted_event) {
> +error_setg(errp, "Device %s is already in the "
> + "process of unplug", id);
> +return;
>  }
> +
> +qdev_unplug(dev, errp);
>  }
>  
>  void hmp_device_add(Monitor *mon, const QDict *qdict)
> @@ -925,11 +938,16 @@ void hmp_device_del(Monitor *mon, const QDict *qdict)
>  
>  BlockBackend *blk_by_qdev_id(const char *id, Error **errp)
>  {
> +int ret;
>  DeviceState *dev;
>  BlockBackend *blk;
>  
> -dev = find_device_state(id, errp);
> -if (dev == NULL) {
> +ret = find_device_state(id, , errp);
> +if (ret <= 0) {
> +if (ret == 0) {
> +error_set(errp, ERROR_CLASS_DEVICE_NOT_FOUND,
> +  "Device '%s' not found", id);
> +}
>  return NULL;
>  }

Awkward.

Before, find_device_state() either finds something (and returns it) or
doesn't (and sets @errp).

Afterward, it can fail to find in two ways, and only one of it sets
@errp.  The existing callers laboriously fuse the two back together.
The next commit adds a caller that doesn't.

Failure modes that need to be handled differently are often the result
of a function doing too much.  Let's have a closer look at this one
before the patch:

static DeviceState *find_device_state(const char *id, Error **errp)
{
Object *obj;

if (id[0] == '/') {
obj = object_resolve_path(id, NULL);

This interprets @id as a QOM path, and tries to resolve it.

On failure, @obj becomes NULL.  On success, it points to an object of
arbitrary type.

} else {
char *root_path = object_get_canonical_path(qdev_get_peripheral());
char *path = g_strdup_printf("%s/%s", root_path, id);

g_free(root_path);
obj = object_resolve_path_type(path, TYPE_DEVICE, NULL);
g_free(path);

This interprets @id as qdev ID, maps it to a QOM path, and tries to
resolve it to a TYPE_DEVICE.  Fails when the path doesn't resolve, and
when it resolves to something that isn't a TYPE_DEVICE.  The latter
can't happen as long as we put only devices under /machine/peripheral/.

On failure, @obj becomes NULL.  On success, it points to a TYPE_DEVICE
object.

}

if (!obj) {
error_set(errp, ERROR_CLASS_DEVICE_NOT_FOUND,
  "Device '%s' not found", id);
return NULL;
}

if (!object_dynamic_cast(obj, TYPE_DEVICE)) {
error_setg(errp, "%s is not a hotpluggable device", id);
return NULL;
}

Unclean.

If we somehow ended up with a non-device /machine/peripheral/foo, then
find_device_state("foo", errp) would fail the 

Re: [PATCH 2/8] block: add BlockParentClass class

2021-09-16 Thread Vladimir Sementsov-Ogievskiy

16.09.2021 11:34, Markus Armbruster wrote:

Vladimir Sementsov-Ogievskiy  writes:


Add a class that will unify block parents for blockdev-replace
functionality we are going to add.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  include/block/block-parent.h | 32 +
  block/block-parent.c | 66 
  block/meson.build|  1 +
  3 files changed, 99 insertions(+)
  create mode 100644 include/block/block-parent.h
  create mode 100644 block/block-parent.c

diff --git a/include/block/block-parent.h b/include/block/block-parent.h
new file mode 100644
index 00..bd9c9d91e6
--- /dev/null
+++ b/include/block/block-parent.h
@@ -0,0 +1,32 @@
+/*
+ * Block Parent class
+ *
+ * Copyright (c) 2021 Virtuozzo International GmbH.
+ *
+ * Authors:
+ *  Vladimir Sementsov-Ogievskiy 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#ifndef BLOCK_PARENT_H
+#define BLOCK_PARENT_H
+
+#include "block/block.h"
+
+typedef struct BlockParentClass {
+const char *name;
+
+int (*find_child)(const char *parent_id, const char *child_name,
+  BlockDriverState *child_bs, BdrvChild **child,
+  Error **errp);


Callbacks should come with a contract.


will add




+QTAILQ_ENTRY(BlockParentClass) next;
+} BlockParentClass;
+
+void block_parent_class_register(BlockParentClass *cls);
+
+BdrvChild *block_find_child(const char *parent_id, const char *child_name,
+BlockDriverState *child_bs, Error **errp);
+
+#endif /* BLOCK_PARENT_H */
diff --git a/block/block-parent.c b/block/block-parent.c
new file mode 100644
index 00..73b6026c42
--- /dev/null
+++ b/block/block-parent.c
@@ -0,0 +1,66 @@
+/*
+ * Block Parent class
+ *
+ * Copyright (c) 2021 Virtuozzo International GmbH.
+ *
+ * Authors:
+ *  Vladimir Sementsov-Ogievskiy 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#include "qemu/osdep.h"
+#include "block/block-parent.h"
+#include "qapi/error.h"
+
+static QTAILQ_HEAD(, BlockParentClass) block_parent_classes =
+QTAILQ_HEAD_INITIALIZER(block_parent_classes);
+
+void block_parent_class_register(BlockParentClass *cls)
+{
+QTAILQ_INSERT_HEAD(_parent_classes, cls, next);
+}
+
+BdrvChild *block_find_child(const char *parent_id, const char *child_name,
+BlockDriverState *child_bs, Error **errp)
+{
+BdrvChild *found_child = NULL;
+BlockParentClass *found_cls = NULL, *cls;
+
+QTAILQ_FOREACH(cls, _parent_classes, next) {
+int ret;
+BdrvChild *c;
+
+/*
+ * Note that .find_child must fail if parent is found but doesn't have
+ * corresponding child.
+ */
+ret = cls->find_child(parent_id, child_name, child_bs, , errp);
+if (ret < 0) {
+return NULL;
+}
+if (ret == 0) {
+continue;
+}
+
+if (!found_child) {
+found_cls = cls;
+found_child = c;
+continue;
+}
+
+error_setg(errp, "{%s, %s} parent-child pair is ambiguous: it match "
+   "both %s and %s", parent_id, child_name, found_cls->name,
+   cls->name);
+return NULL;
+}
+
+if (!found_child) {
+error_setg(errp, "{%s, %s} parent-child pair not found", parent_id,
+   child_name);
+return NULL;
+}
+
+return found_child;
+}


Especially when the callback can return NULL with and without setting an
error!


Hmm. Callback returns int.

And this block_find_child() function return NULL only together with errp set.




diff --git a/block/meson.build b/block/meson.build
index 0450914c7a..5200e0ffce 100644
--- a/block/meson.build
+++ b/block/meson.build
@@ -10,6 +10,7 @@ block_ss.add(files(
'blkverify.c',
'block-backend.c',
'block-copy.c',
+  'block-parent.c',
'commit.c',
'copy-on-read.c',
'preallocate.c',





--
Best regards,
Vladimir



Re: [PULL 00/32] Block patches

2021-09-16 Thread Peter Maydell
On Wed, 15 Sept 2021 at 18:53, Hanna Reitz  wrote:
>
> The following changes since commit 0b6206b9c6825619cd721085fe082d7a0abc9af4:
>
>   Merge remote-tracking branch 'remotes/rth-gitlab/tags/pull-tcg-20210914-4' 
> into staging (2021-09-15 13:27:49 +0100)
>
> are available in the Git repository at:
>
>   https://github.com/XanClic/qemu.git tags/pull-block-2021-09-15
>
> for you to fetch changes up to 1899bf47375ad40555dcdff12ba49b4b8b82df38:
>
>   qemu-img: Add -F shorthand to convert (2021-09-15 18:42:38 +0200)
>
> 
> Block patches:
> - Block-status cache for data regions
> - qcow2 optimization (when using subclusters)
> - iotests delinting, and let 297 (lint checker) cover named iotests
> - qcow2 check improvements
> - Added -F (target backing file format) option to qemu-img convert
> - Mirror job fix
> - Fix for when a migration is initiated while a backup job runs
> - Fix for uncached qemu-img convert to a volume with 4k sectors (for an
>   unaligned image)
> - Minor gluster driver fix
>


Applied, thanks.

Please update the changelog at https://wiki.qemu.org/ChangeLog/6.2
for any user-visible changes.

-- PMM



Re: [PATCH 2/8] block: add BlockParentClass class

2021-09-16 Thread Markus Armbruster
Vladimir Sementsov-Ogievskiy  writes:

> Add a class that will unify block parents for blockdev-replace
> functionality we are going to add.
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
>  include/block/block-parent.h | 32 +
>  block/block-parent.c | 66 
>  block/meson.build|  1 +
>  3 files changed, 99 insertions(+)
>  create mode 100644 include/block/block-parent.h
>  create mode 100644 block/block-parent.c
>
> diff --git a/include/block/block-parent.h b/include/block/block-parent.h
> new file mode 100644
> index 00..bd9c9d91e6
> --- /dev/null
> +++ b/include/block/block-parent.h
> @@ -0,0 +1,32 @@
> +/*
> + * Block Parent class
> + *
> + * Copyright (c) 2021 Virtuozzo International GmbH.
> + *
> + * Authors:
> + *  Vladimir Sementsov-Ogievskiy 
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or later.
> + * See the COPYING file in the top-level directory.
> + */
> +
> +#ifndef BLOCK_PARENT_H
> +#define BLOCK_PARENT_H
> +
> +#include "block/block.h"
> +
> +typedef struct BlockParentClass {
> +const char *name;
> +
> +int (*find_child)(const char *parent_id, const char *child_name,
> +  BlockDriverState *child_bs, BdrvChild **child,
> +  Error **errp);

Callbacks should come with a contract.

> +QTAILQ_ENTRY(BlockParentClass) next;
> +} BlockParentClass;
> +
> +void block_parent_class_register(BlockParentClass *cls);
> +
> +BdrvChild *block_find_child(const char *parent_id, const char *child_name,
> +BlockDriverState *child_bs, Error **errp);
> +
> +#endif /* BLOCK_PARENT_H */
> diff --git a/block/block-parent.c b/block/block-parent.c
> new file mode 100644
> index 00..73b6026c42
> --- /dev/null
> +++ b/block/block-parent.c
> @@ -0,0 +1,66 @@
> +/*
> + * Block Parent class
> + *
> + * Copyright (c) 2021 Virtuozzo International GmbH.
> + *
> + * Authors:
> + *  Vladimir Sementsov-Ogievskiy 
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or later.
> + * See the COPYING file in the top-level directory.
> + */
> +
> +#include "qemu/osdep.h"
> +#include "block/block-parent.h"
> +#include "qapi/error.h"
> +
> +static QTAILQ_HEAD(, BlockParentClass) block_parent_classes =
> +QTAILQ_HEAD_INITIALIZER(block_parent_classes);
> +
> +void block_parent_class_register(BlockParentClass *cls)
> +{
> +QTAILQ_INSERT_HEAD(_parent_classes, cls, next);
> +}
> +
> +BdrvChild *block_find_child(const char *parent_id, const char *child_name,
> +BlockDriverState *child_bs, Error **errp)
> +{
> +BdrvChild *found_child = NULL;
> +BlockParentClass *found_cls = NULL, *cls;
> +
> +QTAILQ_FOREACH(cls, _parent_classes, next) {
> +int ret;
> +BdrvChild *c;
> +
> +/*
> + * Note that .find_child must fail if parent is found but doesn't 
> have
> + * corresponding child.
> + */
> +ret = cls->find_child(parent_id, child_name, child_bs, , errp);
> +if (ret < 0) {
> +return NULL;
> +}
> +if (ret == 0) {
> +continue;
> +}
> +
> +if (!found_child) {
> +found_cls = cls;
> +found_child = c;
> +continue;
> +}
> +
> +error_setg(errp, "{%s, %s} parent-child pair is ambiguous: it match "
> +   "both %s and %s", parent_id, child_name, found_cls->name,
> +   cls->name);
> +return NULL;
> +}
> +
> +if (!found_child) {
> +error_setg(errp, "{%s, %s} parent-child pair not found", parent_id,
> +   child_name);
> +return NULL;
> +}
> +
> +return found_child;
> +}

Especially when the callback can return NULL with and without setting an
error!

> diff --git a/block/meson.build b/block/meson.build
> index 0450914c7a..5200e0ffce 100644
> --- a/block/meson.build
> +++ b/block/meson.build
> @@ -10,6 +10,7 @@ block_ss.add(files(
>'blkverify.c',
>'block-backend.c',
>'block-copy.c',
> +  'block-parent.c',
>'commit.c',
>'copy-on-read.c',
>'preallocate.c',




Re: [PATCH] qemu-storage-daemon: Only display FUSE help when FUSE is built-in

2021-09-16 Thread Kevin Wolf
Am 15.09.2021 um 23:36 hat Philippe Mathieu-Daudé geschrieben:
> ping & Cc'ing qemu-trivial@ (reviewed twice) ...
> 
> On 8/16/21 8:04 PM, Philippe Mathieu-Daudé wrote:
> > When configuring QEMU with --disable-fuse, the qemu-storage-daemon
> > still reports FUSE command line options in its help:
> > 
> >   $ qemu-storage-daemon -h
> >   Usage: qemu-storage-daemon [options]
> >   QEMU storage daemon
> > 
> > --export [type=]fuse,id=,node-name=,mountpoint=
> >  [,growable=on|off][,writable=on|off]
> >export the specified block node over FUSE
> > 
> > Remove this help message when FUSE is disabled, to avoid:
> > 
> >   $ qemu-storage-daemon --export fuse
> >   qemu-storage-daemon: --export fuse: Invalid parameter 'fuse'
> > 
> > Reported-by: Qing Wang 
> > Signed-off-by: Philippe Mathieu-Daudé 

Thanks, applied to the block branch.

Kevin