Re: [PULL 1/1] hw/ufs: Fix buffer overflow bug

2024-04-29 Thread Thomas Huth

On 30/04/2024 06.32, Thomas Huth wrote:

On 30/04/2024 02.17, Richard Henderson wrote:

On 4/28/24 20:25, Jeuk Kim wrote:

From: Jeuk Kim 

It fixes the buffer overflow vulnerability in the ufs device.
The bug was detected by sanitizers.

You can reproduce it by:

cat << EOF |\
qemu-system-x86_64 \
-display none -machine accel=qtest -m 512M -M q35 -nodefaults -drive \
file=null-co://,if=none,id=disk0 -device ufs,id=ufs_bus -device \
ufs-lu,drive=disk0,bus=ufs_bus -qtest stdio
outl 0xcf8 0x8810
outl 0xcfc 0xe000
outl 0xcf8 0x8804
outw 0xcfc 0x06
write 0xe058 0x1 0xa7
write 0xa 0x1 0x50
EOF

Resolves: #2299
Fixes: 329f16624499 ("hw/ufs: Support for Query Transfer Requests")
Reported-by: Zheyu Ma 
Signed-off-by: Jeuk Kim 
---
  hw/ufs/ufs.c | 8 
  1 file changed, 8 insertions(+)


For some reason this appears to cause failures on s390x:

   https://gitlab.com/qemu-project/qemu/-/jobs/6740883283

All of the timeouts are new with this patch alone applied,
and go away when reverted.

I wasn't aware that these tests used ufs, but I have no
other explanation...


I don't know for sure, but the test failure might instead be related to the 
problem that gets fixed by 
https://lore.kernel.org/qemu-devel/20240429075908.36302-1-th...@redhat.com/ 
... I'm preparing a pull request for that fix right now, so maybe you could 
try this ufs pull request afterwards again to see whether the problem is fixed?


Hmm, thinking about it twice, it cannot be the reason: That bug affects 
aarch64/arm only, and in above CI run, some other targets were failing. So 
the problem must be something else, indeed.


 Thomas




Re: [PULL 1/1] hw/ufs: Fix buffer overflow bug

2024-04-29 Thread Thomas Huth

On 30/04/2024 02.17, Richard Henderson wrote:

On 4/28/24 20:25, Jeuk Kim wrote:

From: Jeuk Kim 

It fixes the buffer overflow vulnerability in the ufs device.
The bug was detected by sanitizers.

You can reproduce it by:

cat << EOF |\
qemu-system-x86_64 \
-display none -machine accel=qtest -m 512M -M q35 -nodefaults -drive \
file=null-co://,if=none,id=disk0 -device ufs,id=ufs_bus -device \
ufs-lu,drive=disk0,bus=ufs_bus -qtest stdio
outl 0xcf8 0x8810
outl 0xcfc 0xe000
outl 0xcf8 0x8804
outw 0xcfc 0x06
write 0xe058 0x1 0xa7
write 0xa 0x1 0x50
EOF

Resolves: #2299
Fixes: 329f16624499 ("hw/ufs: Support for Query Transfer Requests")
Reported-by: Zheyu Ma 
Signed-off-by: Jeuk Kim 
---
  hw/ufs/ufs.c | 8 
  1 file changed, 8 insertions(+)


For some reason this appears to cause failures on s390x:

   https://gitlab.com/qemu-project/qemu/-/jobs/6740883283

All of the timeouts are new with this patch alone applied,
and go away when reverted.

I wasn't aware that these tests used ufs, but I have no
other explanation...


I don't know for sure, but the test failure might instead be related to the 
problem that gets fixed by 
https://lore.kernel.org/qemu-devel/20240429075908.36302-1-th...@redhat.com/ 
... I'm preparing a pull request for that fix right now, so maybe you could 
try this ufs pull request afterwards again to see whether the problem is fixed?


 Thomas





Re: [PULL 1/1] hw/ufs: Fix buffer overflow bug

2024-04-29 Thread Richard Henderson

On 4/28/24 20:25, Jeuk Kim wrote:

From: Jeuk Kim 

It fixes the buffer overflow vulnerability in the ufs device.
The bug was detected by sanitizers.

You can reproduce it by:

cat << EOF |\
qemu-system-x86_64 \
-display none -machine accel=qtest -m 512M -M q35 -nodefaults -drive \
file=null-co://,if=none,id=disk0 -device ufs,id=ufs_bus -device \
ufs-lu,drive=disk0,bus=ufs_bus -qtest stdio
outl 0xcf8 0x8810
outl 0xcfc 0xe000
outl 0xcf8 0x8804
outw 0xcfc 0x06
write 0xe058 0x1 0xa7
write 0xa 0x1 0x50
EOF

Resolves: #2299
Fixes: 329f16624499 ("hw/ufs: Support for Query Transfer Requests")
Reported-by: Zheyu Ma 
Signed-off-by: Jeuk Kim 
---
  hw/ufs/ufs.c | 8 
  1 file changed, 8 insertions(+)


For some reason this appears to cause failures on s390x:

  https://gitlab.com/qemu-project/qemu/-/jobs/6740883283

All of the timeouts are new with this patch alone applied,
and go away when reverted.

I wasn't aware that these tests used ufs, but I have no
other explanation...


r~



Re: [PATCH-for-9.1 v2 2/3] migration: Remove RDMA protocol handling

2024-04-29 Thread Michael Galaxy

Reviewed-by: Michael Galaxy 

Thanks Yu Zhang and Peter.

- Michael

On 4/29/24 15:45, Yu Zhang wrote:

Hello Michael and Peter,

We are very glad at your quick and kind reply about our plan to take
over the maintenance of your code. The message is for presenting our
plan and working together.
If we were able to obtain the maintainer's role, our plan is:

1. Create the necessary unit-test cases and get them integrated into
the current QEMU GitLab-CI pipeline
2. Review and test the code changes by other developers to ensure that
nothing is broken in the changed code before being merged by the
community
3. Based on our current practice and application scenario, look for
possible improvements when necessary

Besides that, a patch is attached to announce this change in the community.

With your generous support, we hope that the development community
will make a positive decision for us.

Kind regards,
Yu Zhang@ IONOS Cloud

On Mon, Apr 29, 2024 at 4:57 PM Peter Xu  wrote:

On Mon, Apr 29, 2024 at 08:08:10AM -0500, Michael Galaxy wrote:

Hi All (and Peter),

Hi, Michael,


My name is Michael Galaxy (formerly Hines). Yes, I changed my last name
(highly irregular for a male) and yes, that's my real last name:
https://urldefense.com/v3/__https://www.linkedin.com/in/mrgalaxy/__;!!GjvTz_vk!TZmnCE90EK692dSjZGr-2cpOEZBQTBsTO2bW5z3rSbpZgNVCexZkxwDXhmIOWG2GAKZAUovQ5xe5coQ$
 )

I'm the original author of the RDMA implementation. I've been discussing
with Yu Zhang for a little bit about potentially handing over maintainership
of the codebase to his team.

I simply have zero access to RoCE or Infiniband hardware at all,
unfortunately. so I've never been able to run tests or use what I wrote at
work, and as all of you know, if you don't have a way to test something,
then you can't maintain it.

Yu Zhang put a (very kind) proposal forward to me to ask the community if
they feel comfortable training his team to maintain the codebase (and run
tests) while they learn about it.

The "while learning" part is fine at least to me.  IMHO the "ownership" to
the code, or say, taking over the responsibility, may or may not need 100%
mastering the code base first.  There should still be some fundamental
confidence to work on the code though as a starting point, then it's about
serious use case to back this up, and careful testings while getting more
familiar with it.


If you don't mind, I'd like to let him send over his (very detailed)
proposal,

Yes please, it's exactly the time to share the plan.  The hope is we try to
reach a consensus before or around the middle of this release (9.1).
Normally QEMU has a 3~4 months window for each release and 9.1 schedule is
not yet out, but I think it means we make a decision before or around
middle of June.

Thanks,


- Michael

On 4/11/24 11:36, Yu Zhang wrote:

1) Either a CI test covering at least the major RDMA paths, or at least
  periodically tests for each QEMU release will be needed.

We use a batch of regression test cases for the stack, which covers the
test for QEMU. I did such test for most of the QEMU releases planned as
candidates for rollout.

The migration test needs a pair of (either physical or virtual) servers with
InfiniBand network, which makes it difficult to do on a single server. The
nested VM could be a possible approach, for which we may need virtual
InfiniBand network. Is SoftRoCE [1] a choice? I will try it and let you know.

[1]  
https://urldefense.com/v3/__https://enterprise-support.nvidia.com/s/article/howto-configure-soft-roce__;!!GjvTz_vk!VEqNfg3Kdf58Oh1FkGL6ErDLfvUXZXPwMTaXizuIQeIgJiywPzuwbqx8wM0KUsyopw_EYQxWvGHE3ig$

Thanks and best regards!

On Thu, Apr 11, 2024 at 4:20 PM Peter Xu  wrote:

On Wed, Apr 10, 2024 at 09:49:15AM -0400, Peter Xu wrote:

On Wed, Apr 10, 2024 at 02:28:59AM +, Zhijian Li (Fujitsu) via wrote:

on 4/10/2024 3:46 AM, Peter Xu wrote:


Is there document/link about the unittest/CI for migration tests, Why
are those tests missing?
Is it hard or very special to set up an environment for that? maybe we
can help in this regards.

See tests/qtest/migration-test.c.  We put most of our migration tests
there and that's covered in CI.

I think one major issue is CI systems don't normally have rdma devices.
Can rdma migration test be carried out without a real hardware?

Yeah,  RXE aka. SOFT-RoCE is able to emulate the RDMA, for example
$ sudo rdma link add rxe_eth0 type rxe netdev eth0  # on host
then we can get a new RDMA interface "rxe_eth0".
This new RDMA interface is able to do the QEMU RDMA migration.

Also, the loopback(lo) device is able to emulate the RDMA interface
"rxe_lo", however when
I tried(years ago) to do RDMA migration over this
interface(rdma:127.0.0.1:) , it got something wrong.
So i gave up enabling the RDMA migration qtest at that time.

Thanks, Zhijian.

I'm not sure adding an emu-link for rdma is doable for CI systems, though.
Maybe someone more familiar with how CI works can chim in.

Some people got dropped on

Re: [PATCH-for-9.1 v2 2/3] migration: Remove RDMA protocol handling

2024-04-29 Thread Yu Zhang
Hello Michael and Peter,

We are very glad at your quick and kind reply about our plan to take
over the maintenance of your code. The message is for presenting our
plan and working together.
If we were able to obtain the maintainer's role, our plan is:

1. Create the necessary unit-test cases and get them integrated into
the current QEMU GitLab-CI pipeline
2. Review and test the code changes by other developers to ensure that
nothing is broken in the changed code before being merged by the
community
3. Based on our current practice and application scenario, look for
possible improvements when necessary

Besides that, a patch is attached to announce this change in the community.

With your generous support, we hope that the development community
will make a positive decision for us.

Kind regards,
Yu Zhang@ IONOS Cloud

On Mon, Apr 29, 2024 at 4:57 PM Peter Xu  wrote:
>
> On Mon, Apr 29, 2024 at 08:08:10AM -0500, Michael Galaxy wrote:
> > Hi All (and Peter),
>
> Hi, Michael,
>
> >
> > My name is Michael Galaxy (formerly Hines). Yes, I changed my last name
> > (highly irregular for a male) and yes, that's my real last name:
> > https://www.linkedin.com/in/mrgalaxy/)
> >
> > I'm the original author of the RDMA implementation. I've been discussing
> > with Yu Zhang for a little bit about potentially handing over maintainership
> > of the codebase to his team.
> >
> > I simply have zero access to RoCE or Infiniband hardware at all,
> > unfortunately. so I've never been able to run tests or use what I wrote at
> > work, and as all of you know, if you don't have a way to test something,
> > then you can't maintain it.
> >
> > Yu Zhang put a (very kind) proposal forward to me to ask the community if
> > they feel comfortable training his team to maintain the codebase (and run
> > tests) while they learn about it.
>
> The "while learning" part is fine at least to me.  IMHO the "ownership" to
> the code, or say, taking over the responsibility, may or may not need 100%
> mastering the code base first.  There should still be some fundamental
> confidence to work on the code though as a starting point, then it's about
> serious use case to back this up, and careful testings while getting more
> familiar with it.
>
> >
> > If you don't mind, I'd like to let him send over his (very detailed)
> > proposal,
>
> Yes please, it's exactly the time to share the plan.  The hope is we try to
> reach a consensus before or around the middle of this release (9.1).
> Normally QEMU has a 3~4 months window for each release and 9.1 schedule is
> not yet out, but I think it means we make a decision before or around
> middle of June.
>
> Thanks,
>
> >
> > - Michael
> >
> > On 4/11/24 11:36, Yu Zhang wrote:
> > > > 1) Either a CI test covering at least the major RDMA paths, or at least
> > > >  periodically tests for each QEMU release will be needed.
> > > We use a batch of regression test cases for the stack, which covers the
> > > test for QEMU. I did such test for most of the QEMU releases planned as
> > > candidates for rollout.
> > >
> > > The migration test needs a pair of (either physical or virtual) servers 
> > > with
> > > InfiniBand network, which makes it difficult to do on a single server. The
> > > nested VM could be a possible approach, for which we may need virtual
> > > InfiniBand network. Is SoftRoCE [1] a choice? I will try it and let you 
> > > know.
> > >
> > > [1]  
> > > https://urldefense.com/v3/__https://enterprise-support.nvidia.com/s/article/howto-configure-soft-roce__;!!GjvTz_vk!VEqNfg3Kdf58Oh1FkGL6ErDLfvUXZXPwMTaXizuIQeIgJiywPzuwbqx8wM0KUsyopw_EYQxWvGHE3ig$
> > >
> > > Thanks and best regards!
> > >
> > > On Thu, Apr 11, 2024 at 4:20 PM Peter Xu  wrote:
> > > > On Wed, Apr 10, 2024 at 09:49:15AM -0400, Peter Xu wrote:
> > > > > On Wed, Apr 10, 2024 at 02:28:59AM +, Zhijian Li (Fujitsu) via 
> > > > > wrote:
> > > > > >
> > > > > > on 4/10/2024 3:46 AM, Peter Xu wrote:
> > > > > >
> > > > > > > > Is there document/link about the unittest/CI for migration 
> > > > > > > > tests, Why
> > > > > > > > are those tests missing?
> > > > > > > > Is it hard or very special to set up an environment for that? 
> > > > > > > > maybe we
> > > > > > > > can help in this regards.
> > > > > > > See tests/qtest/migration-test.c.  We put most of our migration 
> > > > > > > tests
> > > > > > > there and that's covered in CI.
> > > > > > >
> > > > > > > I think one major issue is CI systems don't normally have rdma 
> > > > > > > devices.
> > > > > > > Can rdma migration test be carried out without a real hardware?
> > > > > > Yeah,  RXE aka. SOFT-RoCE is able to emulate the RDMA, for example
> > > > > > $ sudo rdma link add rxe_eth0 type rxe netdev eth0  # on host
> > > > > > then we can get a new RDMA interface "rxe_eth0".
> > > > > > This new RDMA interface is able to do the QEMU RDMA migration.
> > > > > >
> > > > > > Also, the loopback(lo) device is able to emulate the RDMA interface
> > > > > > "rxe_lo", however whe

Re: [PULL 6/6] iotests: add backup-discard-source

2024-04-29 Thread Vladimir Sementsov-Ogievskiy

[Add John]

On 29.04.24 17:18, Richard Henderson wrote:

On 4/29/24 04:51, Vladimir Sementsov-Ogievskiy wrote:

Add test for a new backup option: discard-source.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Fiona Ebner 
Tested-by: Fiona Ebner 
Message-Id: <20240313152822.626493-6-vsement...@yandex-team.ru>
Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  .../qemu-iotests/tests/backup-discard-source  | 152 ++
  .../tests/backup-discard-source.out   |   5 +
  2 files changed, 157 insertions(+)
  create mode 100755 tests/qemu-iotests/tests/backup-discard-source
  create mode 100644 tests/qemu-iotests/tests/backup-discard-source.out


This fails check-python-minreqs

   https://gitlab.com/qemu-project/qemu/-/jobs/6739551782

It appears to be a pylint issue.




tests/export-incoming-iothread:1:0: R0801: Similar lines in 2 files
==tests.backup-discard-source:[52:61]
==tests.copy-before-write:[54:63]
'file': {
'driver': iotests.imgfmt,
'file': {
'driver': 'file',
'filename': source_img,
}
},
'target': {
'driver': iotests.imgfmt, (duplicate-code)



Hmm. I don't think, that something should be changed in code. splitting out 
part of this json object to a function? That's a test for QMP command, and it's 
good that we see the command as is in one place. I can swap some lines or 
rename variables to hack the linter, but I'd prefer not doing so:)


For me that looks like a false-positive: yes it's a duplication, but it should 
better be duplication, than complicating raw json objects by reusing common 
parts.


What to do? As described in 22305c2a081b8b6 "python: Reduce strictness of pylint's 
duplicate-code check", this check could be either be disabled completely, or we can 
increase min-similarity-lines config value.

I'd just disable it completely. Any thoughts?


--
Best regards,
Vladimir




Re: [PATCH-for-9.1 v2 2/3] migration: Remove RDMA protocol handling

2024-04-29 Thread Peter Xu
On Mon, Apr 29, 2024 at 08:08:10AM -0500, Michael Galaxy wrote:
> Hi All (and Peter),

Hi, Michael,

> 
> My name is Michael Galaxy (formerly Hines). Yes, I changed my last name
> (highly irregular for a male) and yes, that's my real last name:
> https://www.linkedin.com/in/mrgalaxy/)
> 
> I'm the original author of the RDMA implementation. I've been discussing
> with Yu Zhang for a little bit about potentially handing over maintainership
> of the codebase to his team.
> 
> I simply have zero access to RoCE or Infiniband hardware at all,
> unfortunately. so I've never been able to run tests or use what I wrote at
> work, and as all of you know, if you don't have a way to test something,
> then you can't maintain it.
> 
> Yu Zhang put a (very kind) proposal forward to me to ask the community if
> they feel comfortable training his team to maintain the codebase (and run
> tests) while they learn about it.

The "while learning" part is fine at least to me.  IMHO the "ownership" to
the code, or say, taking over the responsibility, may or may not need 100%
mastering the code base first.  There should still be some fundamental
confidence to work on the code though as a starting point, then it's about
serious use case to back this up, and careful testings while getting more
familiar with it.

> 
> If you don't mind, I'd like to let him send over his (very detailed)
> proposal,

Yes please, it's exactly the time to share the plan.  The hope is we try to
reach a consensus before or around the middle of this release (9.1).
Normally QEMU has a 3~4 months window for each release and 9.1 schedule is
not yet out, but I think it means we make a decision before or around
middle of June.

Thanks,

> 
> - Michael
> 
> On 4/11/24 11:36, Yu Zhang wrote:
> > > 1) Either a CI test covering at least the major RDMA paths, or at least
> > >  periodically tests for each QEMU release will be needed.
> > We use a batch of regression test cases for the stack, which covers the
> > test for QEMU. I did such test for most of the QEMU releases planned as
> > candidates for rollout.
> > 
> > The migration test needs a pair of (either physical or virtual) servers with
> > InfiniBand network, which makes it difficult to do on a single server. The
> > nested VM could be a possible approach, for which we may need virtual
> > InfiniBand network. Is SoftRoCE [1] a choice? I will try it and let you 
> > know.
> > 
> > [1]  
> > https://urldefense.com/v3/__https://enterprise-support.nvidia.com/s/article/howto-configure-soft-roce__;!!GjvTz_vk!VEqNfg3Kdf58Oh1FkGL6ErDLfvUXZXPwMTaXizuIQeIgJiywPzuwbqx8wM0KUsyopw_EYQxWvGHE3ig$
> > 
> > Thanks and best regards!
> > 
> > On Thu, Apr 11, 2024 at 4:20 PM Peter Xu  wrote:
> > > On Wed, Apr 10, 2024 at 09:49:15AM -0400, Peter Xu wrote:
> > > > On Wed, Apr 10, 2024 at 02:28:59AM +, Zhijian Li (Fujitsu) via 
> > > > wrote:
> > > > > 
> > > > > on 4/10/2024 3:46 AM, Peter Xu wrote:
> > > > > 
> > > > > > > Is there document/link about the unittest/CI for migration tests, 
> > > > > > > Why
> > > > > > > are those tests missing?
> > > > > > > Is it hard or very special to set up an environment for that? 
> > > > > > > maybe we
> > > > > > > can help in this regards.
> > > > > > See tests/qtest/migration-test.c.  We put most of our migration 
> > > > > > tests
> > > > > > there and that's covered in CI.
> > > > > > 
> > > > > > I think one major issue is CI systems don't normally have rdma 
> > > > > > devices.
> > > > > > Can rdma migration test be carried out without a real hardware?
> > > > > Yeah,  RXE aka. SOFT-RoCE is able to emulate the RDMA, for example
> > > > > $ sudo rdma link add rxe_eth0 type rxe netdev eth0  # on host
> > > > > then we can get a new RDMA interface "rxe_eth0".
> > > > > This new RDMA interface is able to do the QEMU RDMA migration.
> > > > > 
> > > > > Also, the loopback(lo) device is able to emulate the RDMA interface
> > > > > "rxe_lo", however when
> > > > > I tried(years ago) to do RDMA migration over this
> > > > > interface(rdma:127.0.0.1:) , it got something wrong.
> > > > > So i gave up enabling the RDMA migration qtest at that time.
> > > > Thanks, Zhijian.
> > > > 
> > > > I'm not sure adding an emu-link for rdma is doable for CI systems, 
> > > > though.
> > > > Maybe someone more familiar with how CI works can chim in.
> > > Some people got dropped on the cc list for unknown reason, I'm adding them
> > > back (Fabiano, Peter Maydell, Phil).  Let's make sure nobody is dropped by
> > > accident.
> > > 
> > > I'll try to summarize what is still missing, and I think these will be
> > > greatly helpful if we don't want to deprecate rdma migration:
> > > 
> > >1) Either a CI test covering at least the major RDMA paths, or at least
> > >   periodically tests for each QEMU release will be needed.
> > > 
> > >2) Some performance tests between modern RDMA and NIC devices are
> > >   welcomed.  The current knowledge is modern NIC can work similarly to
> > 

Re: [PATCH] block/copy-before-write: use uint64_t for timeout in nanoseconds

2024-04-29 Thread Philippe Mathieu-Daudé

Hi Fiona,

On 29/4/24 16:19, Fiona Ebner wrote:

Not everybody uses an email client that shows the patch content just
after the subject (your first lines wasn't making sense at first).

Simply duplicating the subject helps to understand:

  Use uint64_t for timeout in nanoseconds ...


rather than the uint32_t for which the maximum is slightly more than 4
seconds and larger values would overflow. The QAPI interface allows
specifying the number of seconds, so only values 0 to 4 are safe right
now, other values lead to a much lower timeout than a user expects.

The block_copy() call where this is used already takes a uint64_t for
the timeout, so no change required there.

Fixes: 6db7fd1ca9 ("block/copy-before-write: implement cbw-timeout option")
Reported-by: Friedrich Weber 
Signed-off-by: Fiona Ebner 
---
  block/copy-before-write.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/block/copy-before-write.c b/block/copy-before-write.c
index 8aba27a71d..026fa9840f 100644
--- a/block/copy-before-write.c
+++ b/block/copy-before-write.c
@@ -43,7 +43,7 @@ typedef struct BDRVCopyBeforeWriteState {
  BlockCopyState *bcs;
  BdrvChild *target;
  OnCbwError on_cbw_error;
-uint32_t cbw_timeout_ns;
+uint64_t cbw_timeout_ns;


Reviewed-by: Philippe Mathieu-Daudé 




Re: [PATCH] block/copy-before-write: use uint64_t for timeout in nanoseconds

2024-04-29 Thread Vladimir Sementsov-Ogievskiy

On 29.04.24 17:46, Fiona Ebner wrote:

Am 29.04.24 um 16:36 schrieb Philippe Mathieu-Daudé:

Hi Fiona,

On 29/4/24 16:19, Fiona Ebner wrote:

Not everybody uses an email client that shows the patch content just
after the subject (your first lines wasn't making sense at first).

Simply duplicating the subject helps to understand:

   Use uint64_t for timeout in nanoseconds ...



Oh, sorry. I'll try to remember that for the future. Should I re-send as
a v2?



Not necessary, I can touch up the message when applying to my block branch.

Reviewed-by: Vladimir Sementsov-Ogievskiy 

--
Best regards,
Vladimir




Re: [PATCH v3 4/5] qapi: introduce device-sync-config

2024-04-29 Thread Vladimir Sementsov-Ogievskiy

On 29.04.24 16:04, Markus Armbruster wrote:

Vladimir Sementsov-Ogievskiy  writes:


On 29.04.24 13:51, Markus Armbruster wrote:

Vladimir Sementsov-Ogievskiy  writes:


On 24.04.24 14:48, Markus Armbruster wrote:

Vladimir Sementsov-Ogievskiy  writes:


Add command to sync config from vhost-user backend to the device. It
may be helpful when VHOST_USER_SLAVE_CONFIG_CHANGE_MSG failed or not
triggered interrupt to the guest or just not available (not supported
by vhost-user server).

Command result is racy if allow it during migration. Let's allow the
sync only in RUNNING state.

Signed-off-by: Vladimir Sementsov-Ogievskiy 


[...]


diff --git a/include/sysemu/runstate.h b/include/sysemu/runstate.h
index 0117d243c4..296af52322 100644
--- a/include/sysemu/runstate.h
+++ b/include/sysemu/runstate.h
@@ -5,6 +5,7 @@
#include "qemu/notify.h"

bool runstate_check(RunState state);

+const char *current_run_state_str(void);
void runstate_set(RunState new_state);
RunState runstate_get(void);
bool runstate_is_running(void);
diff --git a/qapi/qdev.json b/qapi/qdev.json
index facaa0bc6a..e8be79c3d5 100644
--- a/qapi/qdev.json
+++ b/qapi/qdev.json
@@ -161,3 +161,24 @@
##
{ 'event': 'DEVICE_UNPLUG_GUEST_ERROR',
  'data': { '*device': 'str', 'path': 'str' } }
+
+##
+# @device-sync-config:
+#
+# Synchronize config from backend to the guest. The command notifies
+# re-read the device config from the backend and notifies the guest
+# to re-read the config. The command may be used to notify the guest
+# about block device capcity change. Currently only vhost-user-blk
+# device supports this.


I'm not sure I understand this.  To work towards an understanding, I
rephrase it, and you point out the errors.

Synchronize device configuration from host to guest part.  First,
copy the configuration from the host part (backend) to the guest
part (frontend).  Then notify guest software that device
configuration changed.


Correct, thanks


Perhaps

Synchronize guest-visible device configuration with the backend's
configuration, and notify guest software that device configuration
changed.

This may be useful to notify the guest of a block device capacity
change.  Currenrly, only vhost-user-blk devices support this.


Sounds good


Except I fat-fingered "Currently".



Next question: what happens when the device *doesn't* support this?


An error "device-sync-config is not supported ..."


Okay.


I wonder how configuration can get out of sync.  Can you explain?



The example (and the original feature, which triggered developing this) is 
vhost disk resize. If vhost-server (backend) doesn't support 
VHOST_USER_SLAVE_CONFIG_CHANGE_MSG, neither QEMU nor guest will know that disk 
capacity changed.


Sounds like we wouldn't need this command if we could make the
vhost-server support VHOST_USER_SLAVE_CONFIG_CHANGE_MSG.  Is making it
support it impractical?  Or are there other uses for this command?


Qemu's internal vhost-server do support it. But that's not the only vhost-user server) So 
the command is useful for those servers which doesn't support 
VHOST_USER_SLAVE_CONFIG_CHANGE_MSG. Note, that this message requires setting up additional 
channel of server -> client communication. That was the reason, why the 
"change-msg" solution was rejected in our downstream: it's safer to reuse existing 
channel (QMP), than to add and support an additional channel.

Also, the command may help to debug the system, when 
VHOST_USER_SLAVE_CONFIG_CHANGE_MSG doesn't work for some reason.


Suggest to work this into the commit message.


+#
+# @id: the device's ID or QOM path
+#
+# Features:
+#
+# @unstable: The command is experimental.
+#
+# Since: 9.1
+##
+{ 'command': 'device-sync-config',
+  'features': [ 'unstable' ],
+  'data': {'id': 'str'} }
diff --git a/system/qdev-monitor.c b/system/qdev-monitor.c
index 7e075d91c1..cb35ea0b86 100644
--- a/system/qdev-monitor.c
+++ b/system/qdev-monitor.c
@@ -23,6 +23,7 @@
   #include "monitor/monitor.h"
   #include "monitor/qdev.h"
   #include "sysemu/arch_init.h"
+#include "sysemu/runstate.h"
   #include "qapi/error.h"
   #include "qapi/qapi-commands-qdev.h"
   #include "qapi/qmp/dispatch.h"
@@ -969,6 +970,52 @@ void qmp_device_del(const char *id, Error **errp)
}
}

+int qdev_sync_config(DeviceState *dev, Error **errp)

+{
+DeviceClass *dc = DEVICE_GET_CLASS(dev);
+
+if (!dc->sync_config) {
+error_setg(errp, "device-sync-config is not supported for '%s'",
+   object_get_typename(OBJECT(dev)));
+return -ENOTSUP;
+}
+
+return dc->sync_config(dev, errp);
+}
+
+void qmp_device_sync_config(const char *id, Error **errp)
+{
+DeviceState *dev;
+
+/*
+ * During migration there is a race between syncing`config and
+ * migrating it, so let's just not allow it.


Can you briefly explain the race?


If at the moment of qmp command, corresponding config alre

Re: [PATCH-for-9.1 v2 2/3] migration: Remove RDMA protocol handling

2024-04-29 Thread Michael Galaxy

Hi All (and Peter),

My name is Michael Galaxy (formerly Hines). Yes, I changed my last name 
(highly irregular for a male) and yes, that's my real last name: 
https://www.linkedin.com/in/mrgalaxy/)


I'm the original author of the RDMA implementation. I've been discussing 
with Yu Zhang for a little bit about potentially handing over 
maintainership of the codebase to his team.


I simply have zero access to RoCE or Infiniband hardware at all, 
unfortunately. so I've never been able to run tests or use what I wrote 
at work, and as all of you know, if you don't have a way to test 
something, then you can't maintain it.


Yu Zhang put a (very kind) proposal forward to me to ask the community 
if they feel comfortable training his team to maintain the codebase (and 
run tests) while they learn about it.


If you don't mind, I'd like to let him send over his (very detailed) 
proposal,


- Michael

On 4/11/24 11:36, Yu Zhang wrote:

1) Either a CI test covering at least the major RDMA paths, or at least
 periodically tests for each QEMU release will be needed.

We use a batch of regression test cases for the stack, which covers the
test for QEMU. I did such test for most of the QEMU releases planned as
candidates for rollout.

The migration test needs a pair of (either physical or virtual) servers with
InfiniBand network, which makes it difficult to do on a single server. The
nested VM could be a possible approach, for which we may need virtual
InfiniBand network. Is SoftRoCE [1] a choice? I will try it and let you know.

[1]  
https://urldefense.com/v3/__https://enterprise-support.nvidia.com/s/article/howto-configure-soft-roce__;!!GjvTz_vk!VEqNfg3Kdf58Oh1FkGL6ErDLfvUXZXPwMTaXizuIQeIgJiywPzuwbqx8wM0KUsyopw_EYQxWvGHE3ig$

Thanks and best regards!

On Thu, Apr 11, 2024 at 4:20 PM Peter Xu  wrote:

On Wed, Apr 10, 2024 at 09:49:15AM -0400, Peter Xu wrote:

On Wed, Apr 10, 2024 at 02:28:59AM +, Zhijian Li (Fujitsu) via wrote:


on 4/10/2024 3:46 AM, Peter Xu wrote:


Is there document/link about the unittest/CI for migration tests, Why
are those tests missing?
Is it hard or very special to set up an environment for that? maybe we
can help in this regards.

See tests/qtest/migration-test.c.  We put most of our migration tests
there and that's covered in CI.

I think one major issue is CI systems don't normally have rdma devices.
Can rdma migration test be carried out without a real hardware?

Yeah,  RXE aka. SOFT-RoCE is able to emulate the RDMA, for example
$ sudo rdma link add rxe_eth0 type rxe netdev eth0  # on host
then we can get a new RDMA interface "rxe_eth0".
This new RDMA interface is able to do the QEMU RDMA migration.

Also, the loopback(lo) device is able to emulate the RDMA interface
"rxe_lo", however when
I tried(years ago) to do RDMA migration over this
interface(rdma:127.0.0.1:) , it got something wrong.
So i gave up enabling the RDMA migration qtest at that time.

Thanks, Zhijian.

I'm not sure adding an emu-link for rdma is doable for CI systems, though.
Maybe someone more familiar with how CI works can chim in.

Some people got dropped on the cc list for unknown reason, I'm adding them
back (Fabiano, Peter Maydell, Phil).  Let's make sure nobody is dropped by
accident.

I'll try to summarize what is still missing, and I think these will be
greatly helpful if we don't want to deprecate rdma migration:

   1) Either a CI test covering at least the major RDMA paths, or at least
  periodically tests for each QEMU release will be needed.

   2) Some performance tests between modern RDMA and NIC devices are
  welcomed.  The current knowledge is modern NIC can work similarly to
  RDMA in performance, then it's debatable why we still maintain so much
  rdma specific code.

   3) No need to be soild patchsets for this one, but some plan to improve
  RDMA migration code so that it is not almost isolated from the rest
  protocols.

   4) Someone to look after this code for real.

For 2) and 3) more info is here:

https://urldefense.com/v3/__https://lore.kernel.org/r/ZhWa0YeAb9ySVKD1@x1n__;!!GjvTz_vk!VEqNfg3Kdf58Oh1FkGL6ErDLfvUXZXPwMTaXizuIQeIgJiywPzuwbqx8wM0KUsyopw_EYQxWpIWYBhQ$

Here 4) can be the most important as Markus pointed out.  We just didn't
get there yet on the discussions, but maybe Markus is right that we should
talk that first.

Thanks,

--
Peter Xu





Re: [PATCH] block/copy-before-write: use uint64_t for timeout in nanoseconds

2024-04-29 Thread Fiona Ebner
Am 29.04.24 um 16:36 schrieb Philippe Mathieu-Daudé:
> Hi Fiona,
> 
> On 29/4/24 16:19, Fiona Ebner wrote:
> 
> Not everybody uses an email client that shows the patch content just
> after the subject (your first lines wasn't making sense at first).
> 
> Simply duplicating the subject helps to understand:
> 
>   Use uint64_t for timeout in nanoseconds ...
> 

Oh, sorry. I'll try to remember that for the future. Should I re-send as
a v2?

Best Regards,
Fiona




[PATCH] block/copy-before-write: use uint64_t for timeout in nanoseconds

2024-04-29 Thread Fiona Ebner
rather than the uint32_t for which the maximum is slightly more than 4
seconds and larger values would overflow. The QAPI interface allows
specifying the number of seconds, so only values 0 to 4 are safe right
now, other values lead to a much lower timeout than a user expects.

The block_copy() call where this is used already takes a uint64_t for
the timeout, so no change required there.

Fixes: 6db7fd1ca9 ("block/copy-before-write: implement cbw-timeout option")
Reported-by: Friedrich Weber 
Signed-off-by: Fiona Ebner 
---
 block/copy-before-write.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/block/copy-before-write.c b/block/copy-before-write.c
index 8aba27a71d..026fa9840f 100644
--- a/block/copy-before-write.c
+++ b/block/copy-before-write.c
@@ -43,7 +43,7 @@ typedef struct BDRVCopyBeforeWriteState {
 BlockCopyState *bcs;
 BdrvChild *target;
 OnCbwError on_cbw_error;
-uint32_t cbw_timeout_ns;
+uint64_t cbw_timeout_ns;
 
 /*
  * @lock: protects access to @access_bitmap, @done_bitmap and
-- 
2.39.2





Re: [PULL 6/6] iotests: add backup-discard-source

2024-04-29 Thread Richard Henderson

On 4/29/24 04:51, Vladimir Sementsov-Ogievskiy wrote:

Add test for a new backup option: discard-source.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Fiona Ebner 
Tested-by: Fiona Ebner 
Message-Id: <20240313152822.626493-6-vsement...@yandex-team.ru>
Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  .../qemu-iotests/tests/backup-discard-source  | 152 ++
  .../tests/backup-discard-source.out   |   5 +
  2 files changed, 157 insertions(+)
  create mode 100755 tests/qemu-iotests/tests/backup-discard-source
  create mode 100644 tests/qemu-iotests/tests/backup-discard-source.out


This fails check-python-minreqs

  https://gitlab.com/qemu-project/qemu/-/jobs/6739551782

It appears to be a pylint issue.


r~



Re: [PULL 0/1] ufs queue

2024-04-29 Thread Richard Henderson

On 4/29/24 06:41, Stefan Hajnoczi wrote:

On Mon, Apr 29, 2024 at 12:25:37PM +0900, Jeuk Kim wrote:

From: Jeuk Kim 

The following changes since commit fd87be1dada5672f877e03c2ca8504458292c479:

   Merge tag 'accel-20240426' of https://github.com/philmd/qemu into staging 
(2024-04-26 15:28:13 -0700)

are available in the Git repository at:

   https://gitlab.com/jeuk20.kim/qemu.git tags/pull-ufs-20240429

for you to fetch changes up to f2c8aeb1afefcda92054c448b21fc59cdd99db30:

   hw/ufs: Fix buffer overflow bug (2024-04-29 12:13:35 +0900)


ufs queue

- Fix ufs sanitizer vulnerability


Jeuk Kim (1):
   hw/ufs: Fix buffer overflow bug

  hw/ufs/ufs.c | 8 
  1 file changed, 8 insertions(+)



Thanks, applied to my block tree:
https://gitlab.com/stefanha/qemu/commits/block

It will be included in my next block pull request.

You are welcome to send pull requests directly to the qemu.git/master
maintainer (Richard Henderson is on duty for this release cycle). If you
do that, make sure to GPG sign your pull request.


He did. I have

Merge tag 'pull-ufs-20240429' of https://gitlab.com/jeuk20.kim/qemu into 
staging

ufs queue

# -BEGIN PGP SIGNATURE-
#
# iQIzBAABCgAdFiEEUBfYMVl8eKPZB+73EuIgTA5dtgIFAmYvEScACgkQEuIgTA5d
# tgL3Qg//R3IcISQqqDaJ/ySzKGmkyohJSc6ySLYvla4Aki7PV+um2Dx/XNS7uG2b
# d3Qz4m6QaOKsocLfldRTn2FxVK238Rp5HNny5vc0kGRdwpR514B7aU0FhpT7qObS
# wbbgRdDddIBIiCFLhtXtg5/TK2h32VxGrVI6llX4gmd2VzqM0e4xeG1Oj8rZseOY
# SAgvDv68s1YwlO1p1vPvst/H+mUKYkqtPN1mjfCIn5tM6ss8kCLUnKjqGAg1BnSN
# xwaGrqqOlzQK2+aV02eiItiow8evU/h+c9eiTnBo/EvBwjoBn6flNXABWXFENnmP
# JjVIFeiNzSFhBPDzO23GXviuEt96j5lrcGYR48HYMZfEbJNpblXzWvEGMZWnXNgx
# Q3cpcarZ4vSWIflR9OnCSQaGLA0Ny6YqLbmrM/oD+v67EITafKKc+flmiF7DBASB
# fUoEsdffdA37LDtygJb7hfUhvPQWWAujmGzZ1cDP8Oa0MhT7aiD0Z/WqhhjVQbM0
# iLiCDDD0cc0pmT3vw3EnEjKjnSkY3H62Q7pnYHiQgij4Ls/Rdd/P7OkSd0aI82t0
# TooWGZJnyf8rjAzY2cEB1Twrhmhuyt9NnGxip9W8JsQBZMLabD2CahOm83zsk7jZ
# 3fOONz6XrW2ttFkLZcRd4x4YjKONjEXsSX2ZrXTZ5t3USz/VNvY=
# =Vwyi
# -END PGP SIGNATURE-
# gpg: Signature made Sun 28 Apr 2024 08:16:55 PM PDT
# gpg:using RSA key 5017D831597C78A3D907EEF712E2204C0E5DB602
# gpg: Good signature from "Jeuk Kim " [unknown]
# gpg: aka "Jeuk Kim " [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:  There is no indication that the signature belongs to the 
owner.
# Primary key fingerprint: 5017 D831 597C 78A3 D907  EEF7 12E2 204C 0E5D 
B602

queued for the next merge.


r~



[PULL 1/1] hw/ufs: Fix buffer overflow bug

2024-04-29 Thread Stefan Hajnoczi
From: Jeuk Kim 

It fixes the buffer overflow vulnerability in the ufs device.
The bug was detected by sanitizers.

You can reproduce it by:

cat << EOF |\
qemu-system-x86_64 \
-display none -machine accel=qtest -m 512M -M q35 -nodefaults -drive \
file=null-co://,if=none,id=disk0 -device ufs,id=ufs_bus -device \
ufs-lu,drive=disk0,bus=ufs_bus -qtest stdio
outl 0xcf8 0x8810
outl 0xcfc 0xe000
outl 0xcf8 0x8804
outw 0xcfc 0x06
write 0xe058 0x1 0xa7
write 0xa 0x1 0x50
EOF

Resolves: #2299
Fixes: 329f16624499 ("hw/ufs: Support for Query Transfer Requests")
Reported-by: Zheyu Ma 
Signed-off-by: Jeuk Kim 
Signed-off-by: Stefan Hajnoczi 
Message-ID: 

---
 hw/ufs/ufs.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/hw/ufs/ufs.c b/hw/ufs/ufs.c
index eccdb852a0..bac78a32bb 100644
--- a/hw/ufs/ufs.c
+++ b/hw/ufs/ufs.c
@@ -126,6 +126,10 @@ static MemTxResult ufs_dma_read_req_upiu(UfsRequest *req)
 copy_size = sizeof(UtpUpiuHeader) + UFS_TRANSACTION_SPECIFIC_FIELD_SIZE +
 data_segment_length;
 
+if (copy_size > sizeof(req->req_upiu)) {
+copy_size = sizeof(req->req_upiu);
+}
+
 ret = ufs_addr_read(u, req_upiu_base_addr, &req->req_upiu, copy_size);
 if (ret) {
 trace_ufs_err_dma_read_req_upiu(req->slot, req_upiu_base_addr);
@@ -225,6 +229,10 @@ static MemTxResult ufs_dma_write_rsp_upiu(UfsRequest *req)
 copy_size = rsp_upiu_byte_len;
 }
 
+if (copy_size > sizeof(req->rsp_upiu)) {
+copy_size = sizeof(req->rsp_upiu);
+}
+
 ret = ufs_addr_write(u, rsp_upiu_base_addr, &req->rsp_upiu, copy_size);
 if (ret) {
 trace_ufs_err_dma_write_rsp_upiu(req->slot, rsp_upiu_base_addr);
-- 
2.44.0




[PULL 0/1] Block patches

2024-04-29 Thread Stefan Hajnoczi
The following changes since commit fd87be1dada5672f877e03c2ca8504458292c479:

  Merge tag 'accel-20240426' of https://github.com/philmd/qemu into staging 
(2024-04-26 15:28:13 -0700)

are available in the Git repository at:

  https://gitlab.com/stefanha/qemu.git tags/block-pull-request

for you to fetch changes up to d1c4580662bf75bf6875bb5e1ad446b300816ac7:

  hw/ufs: Fix buffer overflow bug (2024-04-29 09:33:06 -0400)


Pull request

Buffer overflow fix for Universal Flash Storage (UFS) emulation.



Jeuk Kim (1):
  hw/ufs: Fix buffer overflow bug

 hw/ufs/ufs.c | 8 
 1 file changed, 8 insertions(+)

-- 
2.44.0




Re: [PULL 0/1] ufs queue

2024-04-29 Thread Stefan Hajnoczi
On Mon, Apr 29, 2024 at 12:25:37PM +0900, Jeuk Kim wrote:
> From: Jeuk Kim 
> 
> The following changes since commit fd87be1dada5672f877e03c2ca8504458292c479:
> 
>   Merge tag 'accel-20240426' of https://github.com/philmd/qemu into staging 
> (2024-04-26 15:28:13 -0700)
> 
> are available in the Git repository at:
> 
>   https://gitlab.com/jeuk20.kim/qemu.git tags/pull-ufs-20240429
> 
> for you to fetch changes up to f2c8aeb1afefcda92054c448b21fc59cdd99db30:
> 
>   hw/ufs: Fix buffer overflow bug (2024-04-29 12:13:35 +0900)
> 
> 
> ufs queue
> 
> - Fix ufs sanitizer vulnerability
> 
> 
> Jeuk Kim (1):
>   hw/ufs: Fix buffer overflow bug
> 
>  hw/ufs/ufs.c | 8 
>  1 file changed, 8 insertions(+)
> 

Thanks, applied to my block tree:
https://gitlab.com/stefanha/qemu/commits/block

It will be included in my next block pull request.

You are welcome to send pull requests directly to the qemu.git/master
maintainer (Richard Henderson is on duty for this release cycle). If you
do that, make sure to GPG sign your pull request.

Stefan


signature.asc
Description: PGP signature


Re: [PATCH v3 4/5] qapi: introduce device-sync-config

2024-04-29 Thread Markus Armbruster
Vladimir Sementsov-Ogievskiy  writes:

> On 29.04.24 13:51, Markus Armbruster wrote:
>> Vladimir Sementsov-Ogievskiy  writes:
>> 
>>> On 24.04.24 14:48, Markus Armbruster wrote:
 Vladimir Sementsov-Ogievskiy  writes:

> Add command to sync config from vhost-user backend to the device. It
> may be helpful when VHOST_USER_SLAVE_CONFIG_CHANGE_MSG failed or not
> triggered interrupt to the guest or just not available (not supported
> by vhost-user server).
>
> Command result is racy if allow it during migration. Let's allow the
> sync only in RUNNING state.
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
>> 
>> [...]
>> 
> diff --git a/include/sysemu/runstate.h b/include/sysemu/runstate.h
> index 0117d243c4..296af52322 100644
> --- a/include/sysemu/runstate.h
> +++ b/include/sysemu/runstate.h
> @@ -5,6 +5,7 @@
>#include "qemu/notify.h"
>
>bool runstate_check(RunState state);
> +const char *current_run_state_str(void);
>void runstate_set(RunState new_state);
>RunState runstate_get(void);
>bool runstate_is_running(void);
> diff --git a/qapi/qdev.json b/qapi/qdev.json
> index facaa0bc6a..e8be79c3d5 100644
> --- a/qapi/qdev.json
> +++ b/qapi/qdev.json
> @@ -161,3 +161,24 @@
>##
>{ 'event': 'DEVICE_UNPLUG_GUEST_ERROR',
>  'data': { '*device': 'str', 'path': 'str' } }
> +
> +##
> +# @device-sync-config:
> +#
> +# Synchronize config from backend to the guest. The command notifies
> +# re-read the device config from the backend and notifies the guest
> +# to re-read the config. The command may be used to notify the guest
> +# about block device capcity change. Currently only vhost-user-blk
> +# device supports this.

 I'm not sure I understand this.  To work towards an understanding, I
 rephrase it, and you point out the errors.

Synchronize device configuration from host to guest part.  First,
copy the configuration from the host part (backend) to the guest
part (frontend).  Then notify guest software that device
configuration changed.
>>>
>>> Correct, thanks
>> 
>> Perhaps
>> 
>>Synchronize guest-visible device configuration with the backend's
>>configuration, and notify guest software that device configuration
>>changed.
>> 
>>This may be useful to notify the guest of a block device capacity
>>change.  Currenrly, only vhost-user-blk devices support this.
>
> Sounds good

Except I fat-fingered "Currently".

>> 
>> Next question: what happens when the device *doesn't* support this?
>
> An error "device-sync-config is not supported ..."

Okay.

 I wonder how configuration can get out of sync.  Can you explain?

>>>
>>> The example (and the original feature, which triggered developing this) is 
>>> vhost disk resize. If vhost-server (backend) doesn't support 
>>> VHOST_USER_SLAVE_CONFIG_CHANGE_MSG, neither QEMU nor guest will know that 
>>> disk capacity changed.
>> 
>> Sounds like we wouldn't need this command if we could make the
>> vhost-server support VHOST_USER_SLAVE_CONFIG_CHANGE_MSG.  Is making it
>> support it impractical?  Or are there other uses for this command?
>
> Qemu's internal vhost-server do support it. But that's not the only 
> vhost-user server) So the command is useful for those servers which doesn't 
> support VHOST_USER_SLAVE_CONFIG_CHANGE_MSG. Note, that this message requires 
> setting up additional channel of server -> client communication. That was the 
> reason, why the "change-msg" solution was rejected in our downstream: it's 
> safer to reuse existing channel (QMP), than to add and support an additional 
> channel.
>
> Also, the command may help to debug the system, when 
> VHOST_USER_SLAVE_CONFIG_CHANGE_MSG doesn't work for some reason.

Suggest to work this into the commit message.

> +#
> +# @id: the device's ID or QOM path
> +#
> +# Features:
> +#
> +# @unstable: The command is experimental.
> +#
> +# Since: 9.1
> +##
> +{ 'command': 'device-sync-config',
> +  'features': [ 'unstable' ],
> +  'data': {'id': 'str'} }
> diff --git a/system/qdev-monitor.c b/system/qdev-monitor.c
> index 7e075d91c1..cb35ea0b86 100644
> --- a/system/qdev-monitor.c
> +++ b/system/qdev-monitor.c
> @@ -23,6 +23,7 @@
>   #include "monitor/monitor.h"
>   #include "monitor/qdev.h"
>   #include "sysemu/arch_init.h"
> +#include "sysemu/runstate.h"
>   #include "qapi/error.h"
>   #include "qapi/qapi-commands-qdev.h"
>   #include "qapi/qmp/dispatch.h"
> @@ -969,6 +970,52 @@ void qmp_device_del(const char *id, Error **errp)
>}
>}
>
> +int qdev_sync_config(DeviceState *dev, Error **errp)
> +{
> +DeviceClass *dc = DEVICE_GET_CLASS(dev);
> +
> +if (!dc->sync_config) {
> +

Re: [PATCH v3 4/5] qapi: introduce device-sync-config

2024-04-29 Thread Vladimir Sementsov-Ogievskiy

On 29.04.24 13:51, Markus Armbruster wrote:

Vladimir Sementsov-Ogievskiy  writes:


On 24.04.24 14:48, Markus Armbruster wrote:

Vladimir Sementsov-Ogievskiy  writes:


Add command to sync config from vhost-user backend to the device. It
may be helpful when VHOST_USER_SLAVE_CONFIG_CHANGE_MSG failed or not
triggered interrupt to the guest or just not available (not supported
by vhost-user server).

Command result is racy if allow it during migration. Let's allow the
sync only in RUNNING state.

Signed-off-by: Vladimir Sementsov-Ogievskiy 


[...]


diff --git a/include/sysemu/runstate.h b/include/sysemu/runstate.h
index 0117d243c4..296af52322 100644
--- a/include/sysemu/runstate.h
+++ b/include/sysemu/runstate.h
@@ -5,6 +5,7 @@
   #include "qemu/notify.h"
   
   bool runstate_check(RunState state);

+const char *current_run_state_str(void);
   void runstate_set(RunState new_state);
   RunState runstate_get(void);
   bool runstate_is_running(void);
diff --git a/qapi/qdev.json b/qapi/qdev.json
index facaa0bc6a..e8be79c3d5 100644
--- a/qapi/qdev.json
+++ b/qapi/qdev.json
@@ -161,3 +161,24 @@
   ##
   { 'event': 'DEVICE_UNPLUG_GUEST_ERROR',
 'data': { '*device': 'str', 'path': 'str' } }
+
+##
+# @device-sync-config:
+#
+# Synchronize config from backend to the guest. The command notifies
+# re-read the device config from the backend and notifies the guest
+# to re-read the config. The command may be used to notify the guest
+# about block device capcity change. Currently only vhost-user-blk
+# device supports this.


I'm not sure I understand this.  To work towards an understanding, I
rephrase it, and you point out the errors.

   Synchronize device configuration from host to guest part.  First,
   copy the configuration from the host part (backend) to the guest
   part (frontend).  Then notify guest software that device
   configuration changed.


Correct, thanks


Perhaps

   Synchronize guest-visible device configuration with the backend's
   configuration, and notify guest software that device configuration
   changed.

   This may be useful to notify the guest of a block device capacity
   change.  Currenrly, only vhost-user-blk devices support this.


Sounds good



Next question: what happens when the device *doesn't* support this?


An error "device-sync-config is not supported ..."




I wonder how configuration can get out of sync.  Can you explain?



The example (and the original feature, which triggered developing this) is 
vhost disk resize. If vhost-server (backend) doesn't support 
VHOST_USER_SLAVE_CONFIG_CHANGE_MSG, neither QEMU nor guest will know that disk 
capacity changed.


Sounds like we wouldn't need this command if we could make the
vhost-server support VHOST_USER_SLAVE_CONFIG_CHANGE_MSG.  Is making it
support it impractical?  Or are there other uses for this command?


Qemu's internal vhost-server do support it. But that's not the only vhost-user server) So 
the command is useful for those servers which doesn't support 
VHOST_USER_SLAVE_CONFIG_CHANGE_MSG. Note, that this message requires setting up additional 
channel of server -> client communication. That was the reason, why the 
"change-msg" solution was rejected in our downstream: it's safer to reuse existing 
channel (QMP), than to add and support an additional channel.

Also, the command may help to debug the system, when 
VHOST_USER_SLAVE_CONFIG_CHANGE_MSG doesn't work for some reason.




+#
+# @id: the device's ID or QOM path
+#
+# Features:
+#
+# @unstable: The command is experimental.
+#
+# Since: 9.1
+##
+{ 'command': 'device-sync-config',
+  'features': [ 'unstable' ],
+  'data': {'id': 'str'} }
diff --git a/system/qdev-monitor.c b/system/qdev-monitor.c
index 7e075d91c1..cb35ea0b86 100644
--- a/system/qdev-monitor.c
+++ b/system/qdev-monitor.c
@@ -23,6 +23,7 @@
  #include "monitor/monitor.h"
  #include "monitor/qdev.h"
  #include "sysemu/arch_init.h"
+#include "sysemu/runstate.h"
  #include "qapi/error.h"
  #include "qapi/qapi-commands-qdev.h"
  #include "qapi/qmp/dispatch.h"
@@ -969,6 +970,52 @@ void qmp_device_del(const char *id, Error **errp)
   }
   }
   
+int qdev_sync_config(DeviceState *dev, Error **errp)

+{
+DeviceClass *dc = DEVICE_GET_CLASS(dev);
+
+if (!dc->sync_config) {
+error_setg(errp, "device-sync-config is not supported for '%s'",
+   object_get_typename(OBJECT(dev)));
+return -ENOTSUP;
+}
+
+return dc->sync_config(dev, errp);
+}
+
+void qmp_device_sync_config(const char *id, Error **errp)
+{
+DeviceState *dev;
+
+/*
+ * During migration there is a race between syncing`config and
+ * migrating it, so let's just not allow it.


Can you briefly explain the race?


If at the moment of qmp command, corresponding config already migrated to the 
target, we'll change only the config on source, but on the target we'll still 
have outdated config.


For RAM, dirty tracking ensures the change gets sent.  But this is
device mem

Re: [PULL 0/6] Block jobs patches for 2024-04-29

2024-04-29 Thread Vladimir Sementsov-Ogievskiy

Sorry for too much CC-ing, I've mistakenly added 
--cc-cmd=./scripts/get_maintainer.pl


On 29.04.24 14:51, Vladimir Sementsov-Ogievskiy wrote:

The following changes since commit fd87be1dada5672f877e03c2ca8504458292c479:

   Merge tag 'accel-20240426' of https://github.com/philmd/qemu into staging 
(2024-04-26 15:28:13 -0700)

are available in the Git repository at:

   https://gitlab.com/vsementsov/qemu.git tags/pull-block-jobs-2024-04-29

for you to fetch changes up to 2ca7608c6b8d57fd6347b11af12a0f035263efef:

   iotests: add backup-discard-source (2024-04-29 13:35:30 +0300)


Block jobs patches for 2024-04-29

- backup: discard-source parameter
- blockcommit: Reopen base image as RO after abort


Alexander Ivanov (1):
   blockcommit: Reopen base image as RO after abort

Vladimir Sementsov-Ogievskiy (5):
   block/copy-before-write: fix permission
   block/copy-before-write: support unligned snapshot-discard
   block/copy-before-write: create block_copy bitmap in filter node
   qapi: blockdev-backup: add discard-source parameter
   iotests: add backup-discard-source

  block/backup.c |   5 +-
  block/block-copy.c |  12 +-
  block/copy-before-write.c  |  39 +--
  block/copy-before-write.h  |   1 +
  block/mirror.c |  11 +-
  block/replication.c|   4 +-
  blockdev.c |   2 +-
  include/block/block-common.h   |   2 +
  include/block/block-copy.h |   2 +
  include/block/block_int-global-state.h |   2 +-
  qapi/block-core.json   |   4 +
  tests/qemu-iotests/257.out | 112 +-
  tests/qemu-iotests/tests/backup-discard-source | 152 
+
  tests/qemu-iotests/tests/backup-discard-source.out |   5 +
  14 files changed, 281 insertions(+), 72 deletions(-)
  create mode 100755 tests/qemu-iotests/tests/backup-discard-source
  create mode 100644 tests/qemu-iotests/tests/backup-discard-source.out

Alexander Ivanov (1):
   blockcommit: Reopen base image as RO after abort

Vladimir Sementsov-Ogievskiy (5):
   block/copy-before-write: fix permission
   block/copy-before-write: support unligned snapshot-discard
   block/copy-before-write: create block_copy bitmap in filter node
   qapi: blockdev-backup: add discard-source parameter
   iotests: add backup-discard-source

  block/backup.c|   5 +-
  block/block-copy.c|  12 +-
  block/copy-before-write.c |  39 -
  block/copy-before-write.h |   1 +
  block/mirror.c|  11 +-
  block/replication.c   |   4 +-
  blockdev.c|   2 +-
  include/block/block-common.h  |   2 +
  include/block/block-copy.h|   2 +
  include/block/block_int-global-state.h|   2 +-
  qapi/block-core.json  |   4 +
  tests/qemu-iotests/257.out| 112 ++---
  .../qemu-iotests/tests/backup-discard-source  | 152 ++
  .../tests/backup-discard-source.out   |   5 +
  14 files changed, 281 insertions(+), 72 deletions(-)
  create mode 100755 tests/qemu-iotests/tests/backup-discard-source
  create mode 100644 tests/qemu-iotests/tests/backup-discard-source.out



--
Best regards,
Vladimir




[PULL 6/6] iotests: add backup-discard-source

2024-04-29 Thread Vladimir Sementsov-Ogievskiy
Add test for a new backup option: discard-source.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Fiona Ebner 
Tested-by: Fiona Ebner 
Message-Id: <20240313152822.626493-6-vsement...@yandex-team.ru>
Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 .../qemu-iotests/tests/backup-discard-source  | 152 ++
 .../tests/backup-discard-source.out   |   5 +
 2 files changed, 157 insertions(+)
 create mode 100755 tests/qemu-iotests/tests/backup-discard-source
 create mode 100644 tests/qemu-iotests/tests/backup-discard-source.out

diff --git a/tests/qemu-iotests/tests/backup-discard-source 
b/tests/qemu-iotests/tests/backup-discard-source
new file mode 100755
index 00..2391b12acd
--- /dev/null
+++ b/tests/qemu-iotests/tests/backup-discard-source
@@ -0,0 +1,152 @@
+#!/usr/bin/env python3
+#
+# Test backup discard-source parameter
+#
+# Copyright (c) Virtuozzo International GmbH.
+# Copyright (c) Yandex
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see .
+#
+
+import os
+
+import iotests
+from iotests import qemu_img_create, qemu_img_map, qemu_io
+
+
+temp_img = os.path.join(iotests.test_dir, 'temp')
+source_img = os.path.join(iotests.test_dir, 'source')
+target_img = os.path.join(iotests.test_dir, 'target')
+size = '1M'
+
+
+def get_actual_size(vm, node_name):
+nodes = vm.cmd('query-named-block-nodes', flat=True)
+node = next(n for n in nodes if n['node-name'] == node_name)
+return node['image']['actual-size']
+
+
+class TestBackup(iotests.QMPTestCase):
+def setUp(self):
+qemu_img_create('-f', iotests.imgfmt, source_img, size)
+qemu_img_create('-f', iotests.imgfmt, temp_img, size)
+qemu_img_create('-f', iotests.imgfmt, target_img, size)
+qemu_io('-c', 'write 0 1M', source_img)
+
+self.vm = iotests.VM()
+self.vm.launch()
+
+self.vm.cmd('blockdev-add', {
+'node-name': 'cbw',
+'driver': 'copy-before-write',
+'file': {
+'driver': iotests.imgfmt,
+'file': {
+'driver': 'file',
+'filename': source_img,
+}
+},
+'target': {
+'driver': iotests.imgfmt,
+'discard': 'unmap',
+'node-name': 'temp',
+'file': {
+'driver': 'file',
+'filename': temp_img
+}
+}
+})
+
+self.vm.cmd('blockdev-add', {
+'node-name': 'access',
+'discard': 'unmap',
+'driver': 'snapshot-access',
+'file': 'cbw'
+})
+
+self.vm.cmd('blockdev-add', {
+'driver': iotests.imgfmt,
+'node-name': 'target',
+'file': {
+'driver': 'file',
+'filename': target_img
+}
+})
+
+self.assertLess(get_actual_size(self.vm, 'temp'), 512 * 1024)
+
+def tearDown(self):
+# That should fail, because region is discarded
+self.vm.hmp_qemu_io('access', 'read 0 1M')
+
+self.vm.shutdown()
+
+self.assertTrue('read failed: Permission denied' in self.vm.get_log())
+
+# Final check that temp image is empty
+mapping = qemu_img_map(temp_img)
+self.assertEqual(len(mapping), 1)
+self.assertEqual(mapping[0]['start'], 0)
+self.assertEqual(mapping[0]['length'], 1024 * 1024)
+self.assertEqual(mapping[0]['data'], False)
+
+os.remove(temp_img)
+os.remove(source_img)
+os.remove(target_img)
+
+def do_backup(self):
+self.vm.cmd('blockdev-backup', device='access',
+sync='full', target='target',
+job_id='backup0',
+discard_source=True)
+
+self.vm.event_wait(name='BLOCK_JOB_COMPLETED')
+
+def test_discard_written(self):
+"""
+1. Guest writes
+2. copy-before-write operation, data is stored to temp
+3. start backup(discard_source=True), check that data is
+   removed from temp
+"""
+# Trigger copy-before-write operation
+result = self.vm.hmp_qemu_io('cbw', 'write 0 1M')
+self.assert_qmp(result, 'return', '')
+
+# Check that data is written to temporary image
+self.assertGreater(get_actual_size(self

[PULL 5/6] qapi: blockdev-backup: add discard-source parameter

2024-04-29 Thread Vladimir Sementsov-Ogievskiy
Add a parameter that enables discard-after-copy. That is mostly useful
in "push backup with fleecing" scheme, when source is snapshot-access
format driver node, based on copy-before-write filter snapshot-access
API:

[guest]  [snapshot-access] ~~ blockdev-backup ~~> [backup target]
   ||
   | root   | file
   vv
[copy-before-write]
   | |
   | file| target
   v v
[active disk]   [temp.img]

In this case discard-after-copy does two things:

 - discard data in temp.img to save disk space
 - avoid further copy-before-write operation in discarded area

Note that we have to declare WRITE permission on source in
copy-before-write filter, for discard to work. Still we can't take it
unconditionally, as it will break normal backup from RO source. So, we
have to add a parameter and pass it thorough bdrv_open flags.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Fiona Ebner 
Tested-by: Fiona Ebner 
Acked-by: Markus Armbruster 
Message-Id: <20240313152822.626493-5-vsement...@yandex-team.ru>
Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 block/backup.c |  5 +++--
 block/block-copy.c |  9 +
 block/copy-before-write.c  | 15 +--
 block/copy-before-write.h  |  1 +
 block/replication.c|  4 ++--
 blockdev.c |  2 +-
 include/block/block-common.h   |  2 ++
 include/block/block-copy.h |  1 +
 include/block/block_int-global-state.h |  2 +-
 qapi/block-core.json   |  4 
 10 files changed, 37 insertions(+), 8 deletions(-)

diff --git a/block/backup.c b/block/backup.c
index ec29d6b810..3dd2e229d2 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -356,7 +356,7 @@ BlockJob *backup_job_create(const char *job_id, 
BlockDriverState *bs,
   BlockDriverState *target, int64_t speed,
   MirrorSyncMode sync_mode, BdrvDirtyBitmap *sync_bitmap,
   BitmapSyncMode bitmap_mode,
-  bool compress,
+  bool compress, bool discard_source,
   const char *filter_node_name,
   BackupPerf *perf,
   BlockdevOnError on_source_error,
@@ -457,7 +457,8 @@ BlockJob *backup_job_create(const char *job_id, 
BlockDriverState *bs,
 goto error;
 }
 
-cbw = bdrv_cbw_append(bs, target, filter_node_name, &bcs, errp);
+cbw = bdrv_cbw_append(bs, target, filter_node_name, discard_source,
+  &bcs, errp);
 if (!cbw) {
 goto error;
 }
diff --git a/block/block-copy.c b/block/block-copy.c
index 8fca2c3698..7e3b378528 100644
--- a/block/block-copy.c
+++ b/block/block-copy.c
@@ -137,6 +137,7 @@ typedef struct BlockCopyState {
 CoMutex lock;
 int64_t in_flight_bytes;
 BlockCopyMethod method;
+bool discard_source;
 BlockReqList reqs;
 QLIST_HEAD(, BlockCopyCallState) calls;
 /*
@@ -353,6 +354,7 @@ static int64_t 
block_copy_calculate_cluster_size(BlockDriverState *target,
 BlockCopyState *block_copy_state_new(BdrvChild *source, BdrvChild *target,
  BlockDriverState *copy_bitmap_bs,
  const BdrvDirtyBitmap *bitmap,
+ bool discard_source,
  Error **errp)
 {
 ERRP_GUARD();
@@ -418,6 +420,7 @@ BlockCopyState *block_copy_state_new(BdrvChild *source, 
BdrvChild *target,
 cluster_size),
 };
 
+s->discard_source = discard_source;
 block_copy_set_copy_opts(s, false, false);
 
 ratelimit_init(&s->rate_limit);
@@ -589,6 +592,12 @@ static coroutine_fn int block_copy_task_entry(AioTask 
*task)
 co_put_to_shres(s->mem, t->req.bytes);
 block_copy_task_end(t, ret);
 
+if (s->discard_source && ret == 0) {
+int64_t nbytes =
+MIN(t->req.offset + t->req.bytes, s->len) - t->req.offset;
+bdrv_co_pdiscard(s->source, t->req.offset, nbytes);
+}
+
 return ret;
 }
 
diff --git a/block/copy-before-write.c b/block/copy-before-write.c
index ed2c228da7..cd65524e26 100644
--- a/block/copy-before-write.c
+++ b/block/copy-before-write.c
@@ -44,6 +44,7 @@ typedef struct BDRVCopyBeforeWriteState {
 BdrvChild *target;
 OnCbwError on_cbw_error;
 uint32_t cbw_timeout_ns;
+bool discard_source;
 
 /*
  * @lock: protects access to @access_bitmap, @done_bitmap and
@@ -357,6 +358,8 @@ cbw_child_perm(BlockDriverState *bs, BdrvChild *c, 
BdrvChildRole role,
uint64_t perm, uint64_t shared,
uint64_t *nperm, uint64_t *nshared)
 {
+BDRVCopyBeforeWriteState *s = bs->opaque;
+
 if (!(role & BDRV_CHILD_FILTERED)) {
 /*
  * Target child
@@ -381,6 +384,10 @@ cbw_child_perm(BlockDriverState *bs, BdrvChild *c, 
BdrvChildRole role,
  * start

[PULL 1/6] blockcommit: Reopen base image as RO after abort

2024-04-29 Thread Vladimir Sementsov-Ogievskiy
From: Alexander Ivanov 

If a blockcommit is aborted the base image remains in RW mode, that leads
to a fail of subsequent live migration.

How to reproduce:
  $ virsh snapshot-create-as vm snp1 --disk-only

  *** write something to the disk inside the guest ***

  $ virsh blockcommit vm vda --active --shallow && virsh blockjob vm vda --abort
  $ lsof /vzt/vm.qcow2
  COMMAND  PID USER   FD   TYPE DEVICE   SIZE/OFF NODE NAME
  qemu-syst 433203 root   45u   REG  253,0 1724776448  133 /vzt/vm.qcow2
  $ cat /proc/433203/fdinfo/45
  pos:0
  flags:  02140002 < The last 2 means RW mode

If the base image is in RW mode at the end of blockcommit and was in RO
mode before blockcommit, reopen the base BDS in RO.

Signed-off-by: Alexander Ivanov 
Reviewed-by: Vladimir Sementsov-Ogievskiy 
Message-Id: <20240404091136.129811-1-alexander.iva...@virtuozzo.com>
Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 block/mirror.c | 11 +--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/block/mirror.c b/block/mirror.c
index 1bdce3b657..61f0a717b7 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -93,6 +93,7 @@ typedef struct MirrorBlockJob {
 int64_t active_write_bytes_in_flight;
 bool prepared;
 bool in_drain;
+bool base_ro;
 } MirrorBlockJob;
 
 typedef struct MirrorBDSOpaque {
@@ -794,6 +795,10 @@ static int mirror_exit_common(Job *job)
 bdrv_replace_node(mirror_top_bs, mirror_top_bs->backing->bs, &error_abort);
 bdrv_graph_wrunlock();
 
+if (abort && s->base_ro && !bdrv_is_read_only(target_bs)) {
+bdrv_reopen_set_read_only(target_bs, true, NULL);
+}
+
 bdrv_drained_end(target_bs);
 bdrv_unref(target_bs);
 
@@ -1717,6 +1722,7 @@ static BlockJob *mirror_start_job(
  bool is_none_mode, BlockDriverState *base,
  bool auto_complete, const char *filter_node_name,
  bool is_mirror, MirrorCopyMode copy_mode,
+ bool base_ro,
  Error **errp)
 {
 MirrorBlockJob *s;
@@ -1800,6 +1806,7 @@ static BlockJob *mirror_start_job(
 bdrv_unref(mirror_top_bs);
 
 s->mirror_top_bs = mirror_top_bs;
+s->base_ro = base_ro;
 
 /* No resize for the target either; while the mirror is still running, a
  * consistent read isn't necessarily possible. We could possibly allow
@@ -2029,7 +2036,7 @@ void mirror_start(const char *job_id, BlockDriverState 
*bs,
  speed, granularity, buf_size, backing_mode, zero_target,
  on_source_error, on_target_error, unmap, NULL, NULL,
  &mirror_job_driver, is_none_mode, base, false,
- filter_node_name, true, copy_mode, errp);
+ filter_node_name, true, copy_mode, false, errp);
 }
 
 BlockJob *commit_active_start(const char *job_id, BlockDriverState *bs,
@@ -2058,7 +2065,7 @@ BlockJob *commit_active_start(const char *job_id, 
BlockDriverState *bs,
  on_error, on_error, true, cb, opaque,
  &commit_active_job_driver, false, base, auto_complete,
  filter_node_name, false, MIRROR_COPY_MODE_BACKGROUND,
- errp);
+ base_read_only, errp);
 if (!job) {
 goto error_restore_flags;
 }
-- 
2.34.1




[PULL 4/6] block/copy-before-write: create block_copy bitmap in filter node

2024-04-29 Thread Vladimir Sementsov-Ogievskiy
Currently block_copy creates copy_bitmap in source node. But that is in
bad relation with .independent_close=true of copy-before-write filter:
source node may be detached and removed before .bdrv_close() handler
called, which should call block_copy_state_free(), which in turn should
remove copy_bitmap.

That's all not ideal: it would be better if internal bitmap of
block-copy object is not attached to any node. But that is not possible
now.

The simplest solution is just create copy_bitmap in filter node, where
anyway two other bitmaps are created.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Fiona Ebner 
Tested-by: Fiona Ebner 
Message-Id: <20240313152822.626493-4-vsement...@yandex-team.ru>
Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 block/block-copy.c |   3 +-
 block/copy-before-write.c  |   2 +-
 include/block/block-copy.h |   1 +
 tests/qemu-iotests/257.out | 112 ++---
 4 files changed, 60 insertions(+), 58 deletions(-)

diff --git a/block/block-copy.c b/block/block-copy.c
index 9ee3dd7ef5..8fca2c3698 100644
--- a/block/block-copy.c
+++ b/block/block-copy.c
@@ -351,6 +351,7 @@ static int64_t 
block_copy_calculate_cluster_size(BlockDriverState *target,
 }
 
 BlockCopyState *block_copy_state_new(BdrvChild *source, BdrvChild *target,
+ BlockDriverState *copy_bitmap_bs,
  const BdrvDirtyBitmap *bitmap,
  Error **errp)
 {
@@ -367,7 +368,7 @@ BlockCopyState *block_copy_state_new(BdrvChild *source, 
BdrvChild *target,
 return NULL;
 }
 
-copy_bitmap = bdrv_create_dirty_bitmap(source->bs, cluster_size, NULL,
+copy_bitmap = bdrv_create_dirty_bitmap(copy_bitmap_bs, cluster_size, NULL,
errp);
 if (!copy_bitmap) {
 return NULL;
diff --git a/block/copy-before-write.c b/block/copy-before-write.c
index 6d89af0b29..ed2c228da7 100644
--- a/block/copy-before-write.c
+++ b/block/copy-before-write.c
@@ -468,7 +468,7 @@ static int cbw_open(BlockDriverState *bs, QDict *options, 
int flags,
 ((BDRV_REQ_FUA | BDRV_REQ_MAY_UNMAP | BDRV_REQ_NO_FALLBACK) &
  bs->file->bs->supported_zero_flags);
 
-s->bcs = block_copy_state_new(bs->file, s->target, bitmap, errp);
+s->bcs = block_copy_state_new(bs->file, s->target, bs, bitmap, errp);
 if (!s->bcs) {
 error_prepend(errp, "Cannot create block-copy-state: ");
 return -EINVAL;
diff --git a/include/block/block-copy.h b/include/block/block-copy.h
index 0700953ab8..8b41643bfa 100644
--- a/include/block/block-copy.h
+++ b/include/block/block-copy.h
@@ -25,6 +25,7 @@ typedef struct BlockCopyState BlockCopyState;
 typedef struct BlockCopyCallState BlockCopyCallState;
 
 BlockCopyState *block_copy_state_new(BdrvChild *source, BdrvChild *target,
+ BlockDriverState *copy_bitmap_bs,
  const BdrvDirtyBitmap *bitmap,
  Error **errp);
 
diff --git a/tests/qemu-iotests/257.out b/tests/qemu-iotests/257.out
index aa76131ca9..c33dd7f3a9 100644
--- a/tests/qemu-iotests/257.out
+++ b/tests/qemu-iotests/257.out
@@ -120,16 +120,16 @@ write -P0x67 0x3fe 0x2
 "granularity": 65536,
 "persistent": false,
 "recording": false
-  }
-],
-"drive0": [
+  },
   {
 "busy": false,
 "count": 0,
 "granularity": 65536,
 "persistent": false,
 "recording": false
-  },
+  }
+],
+"drive0": [
   {
 "busy": false,
 "count": 458752,
@@ -596,16 +596,16 @@ write -P0x67 0x3fe 0x2
 "granularity": 65536,
 "persistent": false,
 "recording": false
-  }
-],
-"drive0": [
+  },
   {
 "busy": false,
 "count": 0,
 "granularity": 65536,
 "persistent": false,
 "recording": false
-  },
+  }
+],
+"drive0": [
   {
 "busy": false,
 "count": 458752,
@@ -865,16 +865,16 @@ write -P0x67 0x3fe 0x2
 "granularity": 65536,
 "persistent": false,
 "recording": false
-  }
-],
-"drive0": [
+  },
   {
 "busy": false,
 "count": 0,
 "granularity": 65536,
 "persistent": false,
 "recording": false
-  },
+  }
+],
+"drive0": [
   {
 "busy": false,
 "count": 458752,
@@ -1341,16 +1341,16 @@ write -P0x67 0x3fe 0x2
 "granularity": 65536,
 "persistent": false,
 "recording": false
-  }
-],
-"drive0": [
+  },
   {
 "busy": false,
 "count": 0,
 "granularity": 65536,
 "persistent": false,
 "recording": false
-  },
+  }
+],
+"drive0": [
   {
 "busy": false,
 "count": 458752,

[PULL 0/6] Block jobs patches for 2024-04-29

2024-04-29 Thread Vladimir Sementsov-Ogievskiy
The following changes since commit fd87be1dada5672f877e03c2ca8504458292c479:

  Merge tag 'accel-20240426' of https://github.com/philmd/qemu into staging 
(2024-04-26 15:28:13 -0700)

are available in the Git repository at:

  https://gitlab.com/vsementsov/qemu.git tags/pull-block-jobs-2024-04-29

for you to fetch changes up to 2ca7608c6b8d57fd6347b11af12a0f035263efef:

  iotests: add backup-discard-source (2024-04-29 13:35:30 +0300)


Block jobs patches for 2024-04-29

- backup: discard-source parameter
- blockcommit: Reopen base image as RO after abort


Alexander Ivanov (1):
  blockcommit: Reopen base image as RO after abort

Vladimir Sementsov-Ogievskiy (5):
  block/copy-before-write: fix permission
  block/copy-before-write: support unligned snapshot-discard
  block/copy-before-write: create block_copy bitmap in filter node
  qapi: blockdev-backup: add discard-source parameter
  iotests: add backup-discard-source

 block/backup.c |   5 +-
 block/block-copy.c |  12 +-
 block/copy-before-write.c  |  39 +--
 block/copy-before-write.h  |   1 +
 block/mirror.c |  11 +-
 block/replication.c|   4 +-
 blockdev.c |   2 +-
 include/block/block-common.h   |   2 +
 include/block/block-copy.h |   2 +
 include/block/block_int-global-state.h |   2 +-
 qapi/block-core.json   |   4 +
 tests/qemu-iotests/257.out | 112 +-
 tests/qemu-iotests/tests/backup-discard-source | 152 
+
 tests/qemu-iotests/tests/backup-discard-source.out |   5 +
 14 files changed, 281 insertions(+), 72 deletions(-)
 create mode 100755 tests/qemu-iotests/tests/backup-discard-source
 create mode 100644 tests/qemu-iotests/tests/backup-discard-source.out

Alexander Ivanov (1):
  blockcommit: Reopen base image as RO after abort

Vladimir Sementsov-Ogievskiy (5):
  block/copy-before-write: fix permission
  block/copy-before-write: support unligned snapshot-discard
  block/copy-before-write: create block_copy bitmap in filter node
  qapi: blockdev-backup: add discard-source parameter
  iotests: add backup-discard-source

 block/backup.c|   5 +-
 block/block-copy.c|  12 +-
 block/copy-before-write.c |  39 -
 block/copy-before-write.h |   1 +
 block/mirror.c|  11 +-
 block/replication.c   |   4 +-
 blockdev.c|   2 +-
 include/block/block-common.h  |   2 +
 include/block/block-copy.h|   2 +
 include/block/block_int-global-state.h|   2 +-
 qapi/block-core.json  |   4 +
 tests/qemu-iotests/257.out| 112 ++---
 .../qemu-iotests/tests/backup-discard-source  | 152 ++
 .../tests/backup-discard-source.out   |   5 +
 14 files changed, 281 insertions(+), 72 deletions(-)
 create mode 100755 tests/qemu-iotests/tests/backup-discard-source
 create mode 100644 tests/qemu-iotests/tests/backup-discard-source.out

-- 
2.34.1




[PULL 3/6] block/copy-before-write: support unligned snapshot-discard

2024-04-29 Thread Vladimir Sementsov-Ogievskiy
First thing that crashes on unligned access here is
bdrv_reset_dirty_bitmap(). Correct way is to align-down the
snapshot-discard request.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Fiona Ebner 
Tested-by: Fiona Ebner 
Message-Id: <20240313152822.626493-3-vsement...@yandex-team.ru>
Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 block/copy-before-write.c | 16 +---
 1 file changed, 13 insertions(+), 3 deletions(-)

diff --git a/block/copy-before-write.c b/block/copy-before-write.c
index 3e3af30c08..6d89af0b29 100644
--- a/block/copy-before-write.c
+++ b/block/copy-before-write.c
@@ -325,14 +325,24 @@ static int coroutine_fn GRAPH_RDLOCK
 cbw_co_pdiscard_snapshot(BlockDriverState *bs, int64_t offset, int64_t bytes)
 {
 BDRVCopyBeforeWriteState *s = bs->opaque;
+uint32_t cluster_size = block_copy_cluster_size(s->bcs);
+int64_t aligned_offset = QEMU_ALIGN_UP(offset, cluster_size);
+int64_t aligned_end = QEMU_ALIGN_DOWN(offset + bytes, cluster_size);
+int64_t aligned_bytes;
+
+if (aligned_end <= aligned_offset) {
+return 0;
+}
+aligned_bytes = aligned_end - aligned_offset;
 
 WITH_QEMU_LOCK_GUARD(&s->lock) {
-bdrv_reset_dirty_bitmap(s->access_bitmap, offset, bytes);
+bdrv_reset_dirty_bitmap(s->access_bitmap, aligned_offset,
+aligned_bytes);
 }
 
-block_copy_reset(s->bcs, offset, bytes);
+block_copy_reset(s->bcs, aligned_offset, aligned_bytes);
 
-return bdrv_co_pdiscard(s->target, offset, bytes);
+return bdrv_co_pdiscard(s->target, aligned_offset, aligned_bytes);
 }
 
 static void GRAPH_RDLOCK cbw_refresh_filename(BlockDriverState *bs)
-- 
2.34.1




[PULL 2/6] block/copy-before-write: fix permission

2024-04-29 Thread Vladimir Sementsov-Ogievskiy
In case when source node does not have any parents, the condition still
works as required: backup job do create the parent by

  block_job_create -> block_job_add_bdrv -> bdrv_root_attach_child

Still, in this case checking @perm variable doesn't work, as backup job
creates the root blk with empty permissions (as it rely on CBW filter
to require correct permissions and don't want to create extra
conflicts).

So, we should not check @perm.

The hack may be dropped entirely when transactional insertion of
filter (when we don't try to recalculate permissions in intermediate
state, when filter does conflict with original parent of the source
node) merged (old big series
"[PATCH v5 00/45] Transactional block-graph modifying API"[1] and it's
current in-flight part is "[PATCH v8 0/7] blockdev-replace"[2])

[1] https://patchew.org/QEMU/20220330212902.590099-1-vsement...@openvz.org/
[2] https://patchew.org/QEMU/2023101718.932733-1-vsement...@yandex-team.ru/

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Tested-by: Fiona Ebner 
Message-Id: <20240313152822.626493-2-vsement...@yandex-team.ru>
Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 block/copy-before-write.c | 10 +++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/block/copy-before-write.c b/block/copy-before-write.c
index 8aba27a71d..3e3af30c08 100644
--- a/block/copy-before-write.c
+++ b/block/copy-before-write.c
@@ -364,9 +364,13 @@ cbw_child_perm(BlockDriverState *bs, BdrvChild *c, 
BdrvChildRole role,
perm, shared, nperm, nshared);
 
 if (!QLIST_EMPTY(&bs->parents)) {
-if (perm & BLK_PERM_WRITE) {
-*nperm = *nperm | BLK_PERM_CONSISTENT_READ;
-}
+/*
+ * Note, that source child may be shared with backup job. Backup 
job
+ * does create own blk parent on copy-before-write node, so this
+ * works even if source node does not have any parents before 
backup
+ * start
+ */
+*nperm = *nperm | BLK_PERM_CONSISTENT_READ;
 *nshared &= ~(BLK_PERM_WRITE | BLK_PERM_RESIZE);
 }
 }
-- 
2.34.1




Re: [PULL 1/1] hw/ufs: Fix buffer overflow bug

2024-04-29 Thread Michael Tokarev

29.04.2024 06:25, Jeuk Kim wrote:

From: Jeuk Kim 

It fixes the buffer overflow vulnerability in the ufs device.
The bug was detected by sanitizers.


...

Resolves: #2299
Fixes: 329f16624499 ("hw/ufs: Support for Query Transfer Requests")
Reported-by: Zheyu Ma 
Signed-off-by: Jeuk Kim 


Cc: qemu-stable@ for 8.2 and 9.0 series.

Please do not forget to Cc qemu-stable@ for relevant changes.

Thanks,

/mjt



Re: [PATCH v3 4/5] qapi: introduce device-sync-config

2024-04-29 Thread Markus Armbruster
Vladimir Sementsov-Ogievskiy  writes:

> On 24.04.24 14:48, Markus Armbruster wrote:
>> Vladimir Sementsov-Ogievskiy  writes:
>> 
>>> Add command to sync config from vhost-user backend to the device. It
>>> may be helpful when VHOST_USER_SLAVE_CONFIG_CHANGE_MSG failed or not
>>> triggered interrupt to the guest or just not available (not supported
>>> by vhost-user server).
>>>
>>> Command result is racy if allow it during migration. Let's allow the
>>> sync only in RUNNING state.
>>>
>>> Signed-off-by: Vladimir Sementsov-Ogievskiy 

[...]

>>> diff --git a/include/sysemu/runstate.h b/include/sysemu/runstate.h
>>> index 0117d243c4..296af52322 100644
>>> --- a/include/sysemu/runstate.h
>>> +++ b/include/sysemu/runstate.h
>>> @@ -5,6 +5,7 @@
>>>   #include "qemu/notify.h"
>>>   
>>>   bool runstate_check(RunState state);
>>> +const char *current_run_state_str(void);
>>>   void runstate_set(RunState new_state);
>>>   RunState runstate_get(void);
>>>   bool runstate_is_running(void);
>>> diff --git a/qapi/qdev.json b/qapi/qdev.json
>>> index facaa0bc6a..e8be79c3d5 100644
>>> --- a/qapi/qdev.json
>>> +++ b/qapi/qdev.json
>>> @@ -161,3 +161,24 @@
>>>   ##
>>>   { 'event': 'DEVICE_UNPLUG_GUEST_ERROR',
>>> 'data': { '*device': 'str', 'path': 'str' } }
>>> +
>>> +##
>>> +# @device-sync-config:
>>> +#
>>> +# Synchronize config from backend to the guest. The command notifies
>>> +# re-read the device config from the backend and notifies the guest
>>> +# to re-read the config. The command may be used to notify the guest
>>> +# about block device capcity change. Currently only vhost-user-blk
>>> +# device supports this.
>> 
>> I'm not sure I understand this.  To work towards an understanding, I
>> rephrase it, and you point out the errors.
>> 
>>   Synchronize device configuration from host to guest part.  First,
>>   copy the configuration from the host part (backend) to the guest
>>   part (frontend).  Then notify guest software that device
>>   configuration changed.
>
> Correct, thanks

Perhaps

  Synchronize guest-visible device configuration with the backend's
  configuration, and notify guest software that device configuration
  changed.

  This may be useful to notify the guest of a block device capacity
  change.  Currenrly, only vhost-user-blk devices support this.

Next question: what happens when the device *doesn't* support this?

>> I wonder how configuration can get out of sync.  Can you explain?
>> 
>
> The example (and the original feature, which triggered developing this) is 
> vhost disk resize. If vhost-server (backend) doesn't support 
> VHOST_USER_SLAVE_CONFIG_CHANGE_MSG, neither QEMU nor guest will know that 
> disk capacity changed.

Sounds like we wouldn't need this command if we could make the
vhost-server support VHOST_USER_SLAVE_CONFIG_CHANGE_MSG.  Is making it
support it impractical?  Or are there other uses for this command?

>>> +#
>>> +# @id: the device's ID or QOM path
>>> +#
>>> +# Features:
>>> +#
>>> +# @unstable: The command is experimental.
>>> +#
>>> +# Since: 9.1
>>> +##
>>> +{ 'command': 'device-sync-config',
>>> +  'features': [ 'unstable' ],
>>> +  'data': {'id': 'str'} }
>>> diff --git a/system/qdev-monitor.c b/system/qdev-monitor.c
>>> index 7e075d91c1..cb35ea0b86 100644
>>> --- a/system/qdev-monitor.c
>>> +++ b/system/qdev-monitor.c
>>> @@ -23,6 +23,7 @@
>>>  #include "monitor/monitor.h"
>>>  #include "monitor/qdev.h"
>>>  #include "sysemu/arch_init.h"
>>> +#include "sysemu/runstate.h"
>>>  #include "qapi/error.h"
>>>  #include "qapi/qapi-commands-qdev.h"
>>>  #include "qapi/qmp/dispatch.h"
>>> @@ -969,6 +970,52 @@ void qmp_device_del(const char *id, Error **errp)
>>>   }
>>>   }
>>>   
>>> +int qdev_sync_config(DeviceState *dev, Error **errp)
>>> +{
>>> +DeviceClass *dc = DEVICE_GET_CLASS(dev);
>>> +
>>> +if (!dc->sync_config) {
>>> +error_setg(errp, "device-sync-config is not supported for '%s'",
>>> +   object_get_typename(OBJECT(dev)));
>>> +return -ENOTSUP;
>>> +}
>>> +
>>> +return dc->sync_config(dev, errp);
>>> +}
>>> +
>>> +void qmp_device_sync_config(const char *id, Error **errp)
>>> +{
>>> +DeviceState *dev;
>>> +
>>> +/*
>>> + * During migration there is a race between syncing`config and
>>> + * migrating it, so let's just not allow it.
>> 
>> Can you briefly explain the race?
>
> If at the moment of qmp command, corresponding config already migrated to the 
> target, we'll change only the config on source, but on the target we'll still 
> have outdated config.

For RAM, dirty tracking ensures the change gets sent.  But this is
device memory.  Correct?

>>> + *
>>> + * Moreover, let's not rely on setting up interrupts in paused
>>> + * state, which may be a part of migration process.
>> 
>> What dependence exactly are you avoiding?  Config synchronization
>> depending on guest interrupt delivery?
>
> Right, guest is notified by pci_set_irq.

If we allowed it in pau

Re: [PATCH 0/3] Make it possible to compile the x86 binaries without FDC

2024-04-29 Thread Kevin Wolf
[ Cc: qemu-block ]

Am 25.04.2024 um 20:43 hat Thomas Huth geschrieben:
> For downstream versions of QEMU, we'd like to be able to compile QEMU
> without the FDC code included (since it's not required for modern VMs
> anymore and the FDC code has rather a bad reputation, see the VENOM CVE).
> 
> The q35 machine can already be instantiated without FDC, but for being
> able to link a binary without the FDC code, the Kconfig file needs some
> tweaks and there are two spots in the pc code that directly call functions
> from the FDC code - those need to be disabled via #ifdefs.
> 
> The third patch changes the i440fx and isapc machine types so that
> they can work without the FDC device, too, in case it has not been
> compiled into the binary. It's marked as RFC since I assume that the
> FDC was originally a fix compononent of these motherboards, so I'm
> unsure whether we should allow the disablement there. OTOH, it seems
> to work fine, and the FDC is only disabled when it is not available
> in the binary, so I hope this patch is fine, too.
> 
> Thomas Huth (3):
>   hw/i386/pc: Allow to compile without CONFIG_FDC_ISA
>   hw/i386/Kconfig: Allow to compile Q35 without FDC_ISA
>   hw/i386: Add the possibility to use i440fx and isapc without FDC
> 
>  hw/i386/pc.c  | 13 +
>  hw/i386/pc_piix.c |  6 --
>  hw/i386/Kconfig   |  2 +-
>  3 files changed, 14 insertions(+), 7 deletions(-)
> 
> -- 
> 2.44.0
> 
> 




[PATCH v4 2/3] vhost-user-blk: split vhost_user_blk_sync_config()

2024-04-29 Thread Vladimir Sementsov-Ogievskiy
Split vhost_user_blk_sync_config() out from
vhost_user_blk_handle_config_change(), to be reused in the following
commit.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 hw/block/vhost-user-blk.c | 26 +++---
 1 file changed, 19 insertions(+), 7 deletions(-)

diff --git a/hw/block/vhost-user-blk.c b/hw/block/vhost-user-blk.c
index 9e6bbc6950..091d2c6acf 100644
--- a/hw/block/vhost-user-blk.c
+++ b/hw/block/vhost-user-blk.c
@@ -88,27 +88,39 @@ static void vhost_user_blk_set_config(VirtIODevice *vdev, 
const uint8_t *config)
 s->blkcfg.wce = blkcfg->wce;
 }
 
+static int vhost_user_blk_sync_config(DeviceState *dev, Error **errp)
+{
+int ret;
+VirtIODevice *vdev = VIRTIO_DEVICE(dev);
+VHostUserBlk *s = VHOST_USER_BLK(vdev);
+
+ret = vhost_dev_get_config(&s->dev, (uint8_t *)&s->blkcfg,
+   vdev->config_len, errp);
+if (ret < 0) {
+return ret;
+}
+
+memcpy(vdev->config, &s->blkcfg, vdev->config_len);
+virtio_notify_config(vdev);
+
+return 0;
+}
+
 static int vhost_user_blk_handle_config_change(struct vhost_dev *dev)
 {
 int ret;
-VirtIODevice *vdev = dev->vdev;
-VHostUserBlk *s = VHOST_USER_BLK(dev->vdev);
 Error *local_err = NULL;
 
 if (!dev->started) {
 return 0;
 }
 
-ret = vhost_dev_get_config(dev, (uint8_t *)&s->blkcfg,
-   vdev->config_len, &local_err);
+ret = vhost_user_blk_sync_config(DEVICE(dev->vdev), &local_err);
 if (ret < 0) {
 error_report_err(local_err);
 return ret;
 }
 
-memcpy(dev->vdev->config, &s->blkcfg, vdev->config_len);
-virtio_notify_config(dev->vdev);
-
 return 0;
 }
 
-- 
2.34.1




[PATCH v4 0/3] vhost-user-blk: live resize additional APIs

2024-04-29 Thread Vladimir Sementsov-Ogievskiy
v4:
Fixes 01-02 from v3 are already merged.
02: new, split out from 03
03: refacting vhost_user_blk_handle_config_change() split out to 02
drop current_run_state_str() helper
some rewordings (Markus)

Vladimir Sementsov-Ogievskiy (3):
  qdev-monitor: add option to report GenericError from find_device_state
  vhost-user-blk: split vhost_user_blk_sync_config()
  qapi: introduce device-sync-config

 hw/block/vhost-user-blk.c | 27 -
 hw/virtio/virtio-pci.c|  9 ++
 include/hw/qdev-core.h|  3 ++
 qapi/qdev.json| 23 ++
 system/qdev-monitor.c | 63 ---
 5 files changed, 114 insertions(+), 11 deletions(-)

-- 
2.34.1




[PATCH v4 3/3] qapi: introduce device-sync-config

2024-04-29 Thread Vladimir Sementsov-Ogievskiy
Add command to sync config from vhost-user backend to the device. It
may be helpful when VHOST_USER_SLAVE_CONFIG_CHANGE_MSG failed or not
triggered interrupt to the guest or just not available (not supported
by vhost-user server).

Command result is racy if allow it during migration. Let's allow the
sync only in RUNNING state.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 hw/block/vhost-user-blk.c |  1 +
 hw/virtio/virtio-pci.c|  9 
 include/hw/qdev-core.h|  3 +++
 qapi/qdev.json| 23 +++
 system/qdev-monitor.c | 48 +++
 5 files changed, 84 insertions(+)

diff --git a/hw/block/vhost-user-blk.c b/hw/block/vhost-user-blk.c
index 091d2c6acf..2f301f380c 100644
--- a/hw/block/vhost-user-blk.c
+++ b/hw/block/vhost-user-blk.c
@@ -588,6 +588,7 @@ static void vhost_user_blk_class_init(ObjectClass *klass, 
void *data)
 
 device_class_set_props(dc, vhost_user_blk_properties);
 dc->vmsd = &vmstate_vhost_user_blk;
+dc->sync_config = vhost_user_blk_sync_config;
 set_bit(DEVICE_CATEGORY_STORAGE, dc->categories);
 vdc->realize = vhost_user_blk_device_realize;
 vdc->unrealize = vhost_user_blk_device_unrealize;
diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
index b1d02f4b3d..0d91e8b5dc 100644
--- a/hw/virtio/virtio-pci.c
+++ b/hw/virtio/virtio-pci.c
@@ -2351,6 +2351,14 @@ static void virtio_pci_dc_realize(DeviceState *qdev, 
Error **errp)
 vpciklass->parent_dc_realize(qdev, errp);
 }
 
+static int virtio_pci_sync_config(DeviceState *dev, Error **errp)
+{
+VirtIOPCIProxy *proxy = VIRTIO_PCI(dev);
+VirtIODevice *vdev = virtio_bus_get_device(&proxy->bus);
+
+return qdev_sync_config(DEVICE(vdev), errp);
+}
+
 static void virtio_pci_class_init(ObjectClass *klass, void *data)
 {
 DeviceClass *dc = DEVICE_CLASS(klass);
@@ -2367,6 +2375,7 @@ static void virtio_pci_class_init(ObjectClass *klass, 
void *data)
 device_class_set_parent_realize(dc, virtio_pci_dc_realize,
 &vpciklass->parent_dc_realize);
 rc->phases.hold = virtio_pci_bus_reset_hold;
+dc->sync_config = virtio_pci_sync_config;
 }
 
 static const TypeInfo virtio_pci_info = {
diff --git a/include/hw/qdev-core.h b/include/hw/qdev-core.h
index 9228e96c87..87135bdcdf 100644
--- a/include/hw/qdev-core.h
+++ b/include/hw/qdev-core.h
@@ -95,6 +95,7 @@ typedef void (*DeviceUnrealize)(DeviceState *dev);
 typedef void (*DeviceReset)(DeviceState *dev);
 typedef void (*BusRealize)(BusState *bus, Error **errp);
 typedef void (*BusUnrealize)(BusState *bus);
+typedef int (*DeviceSyncConfig)(DeviceState *dev, Error **errp);
 
 /**
  * struct DeviceClass - The base class for all devices.
@@ -162,6 +163,7 @@ struct DeviceClass {
 DeviceReset reset;
 DeviceRealize realize;
 DeviceUnrealize unrealize;
+DeviceSyncConfig sync_config;
 
 /**
  * @vmsd: device state serialisation description for
@@ -546,6 +548,7 @@ bool qdev_hotplug_allowed(DeviceState *dev, Error **errp);
  */
 HotplugHandler *qdev_get_hotplug_handler(DeviceState *dev);
 void qdev_unplug(DeviceState *dev, Error **errp);
+int qdev_sync_config(DeviceState *dev, Error **errp);
 void qdev_simple_device_unplug_cb(HotplugHandler *hotplug_dev,
   DeviceState *dev, Error **errp);
 void qdev_machine_creation_done(void);
diff --git a/qapi/qdev.json b/qapi/qdev.json
index facaa0bc6a..fc5e125a45 100644
--- a/qapi/qdev.json
+++ b/qapi/qdev.json
@@ -161,3 +161,26 @@
 ##
 { 'event': 'DEVICE_UNPLUG_GUEST_ERROR',
   'data': { '*device': 'str', 'path': 'str' } }
+
+##
+# @device-sync-config:
+#
+# Synchronize device configuration from host to guest part.  First,
+# copy the configuration from the host part (backend) to the guest
+# part (frontend).  Then notify guest software that device
+# configuration changed.
+# The command may be used to notify the guest about block device
+# capcity change.  Currently only vhost-user-blk device supports
+# this.
+#
+# @id: the device's ID or QOM path
+#
+# Features:
+#
+# @unstable: The command is experimental.
+#
+# Since: 9.1
+##
+{ 'command': 'device-sync-config',
+  'features': [ 'unstable' ],
+  'data': {'id': 'str'} }
diff --git a/system/qdev-monitor.c b/system/qdev-monitor.c
index 264978aa40..47bfc0506e 100644
--- a/system/qdev-monitor.c
+++ b/system/qdev-monitor.c
@@ -23,6 +23,7 @@
 #include "monitor/monitor.h"
 #include "monitor/qdev.h"
 #include "sysemu/arch_init.h"
+#include "sysemu/runstate.h"
 #include "qapi/error.h"
 #include "qapi/qapi-commands-qdev.h"
 #include "qapi/qmp/dispatch.h"
@@ -971,6 +972,53 @@ void qmp_device_del(const char *id, Error **errp)
 }
 }
 
+int qdev_sync_config(DeviceState *dev, Error **errp)
+{
+DeviceClass *dc = DEVICE_GET_CLASS(dev);
+
+if (!dc->sync_config) {
+error_setg(errp, "device-sync-config is not supported for '%s'",
+   object_get_typename(OBJECT(dev)));
+return -ENOTSUP;

[PATCH v4 1/3] qdev-monitor: add option to report GenericError from find_device_state

2024-04-29 Thread Vladimir Sementsov-Ogievskiy
Here we just prepare for the following patch, making possible to report
GenericError as recommended.

This patch doesn't aim to prevent further use of DeviceNotFound by
future interfaces:

 - find_device_state() is used in blk_by_qdev_id() and qmp_get_blk()
   functions, which may lead to spread of DeviceNotFound anyway
 - also, nothing prevent simply copy-pasting find_device_state() calls
   with false argument

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 system/qdev-monitor.c | 15 +++
 1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/system/qdev-monitor.c b/system/qdev-monitor.c
index 6af6ef7d66..264978aa40 100644
--- a/system/qdev-monitor.c
+++ b/system/qdev-monitor.c
@@ -879,13 +879,20 @@ void qmp_device_add(QDict *qdict, QObject **ret_data, 
Error **errp)
 object_unref(OBJECT(dev));
 }
 
-static DeviceState *find_device_state(const char *id, Error **errp)
+/*
+ * Note that creating new APIs using error classes other than GenericError is
+ * not recommended. Set use_generic_error=true for new interfaces.
+ */
+static DeviceState *find_device_state(const char *id, bool use_generic_error,
+  Error **errp)
 {
 Object *obj = object_resolve_path_at(qdev_get_peripheral(), id);
 DeviceState *dev;
 
 if (!obj) {
-error_set(errp, ERROR_CLASS_DEVICE_NOT_FOUND,
+error_set(errp,
+  (use_generic_error ?
+   ERROR_CLASS_GENERIC_ERROR : ERROR_CLASS_DEVICE_NOT_FOUND),
   "Device '%s' not found", id);
 return NULL;
 }
@@ -950,7 +957,7 @@ void qdev_unplug(DeviceState *dev, Error **errp)
 
 void qmp_device_del(const char *id, Error **errp)
 {
-DeviceState *dev = find_device_state(id, errp);
+DeviceState *dev = find_device_state(id, false, errp);
 if (dev != NULL) {
 if (dev->pending_deleted_event &&
 (dev->pending_deleted_expires_ms == 0 ||
@@ -1070,7 +1077,7 @@ BlockBackend *blk_by_qdev_id(const char *id, Error **errp)
 
 GLOBAL_STATE_CODE();
 
-dev = find_device_state(id, errp);
+dev = find_device_state(id, false, errp);
 if (dev == NULL) {
 return NULL;
 }
-- 
2.34.1




Re: [PATCH v3 5/5] qapi: introduce CONFIG_READ event

2024-04-29 Thread Vladimir Sementsov-Ogievskiy

On 24.04.24 15:11, Markus Armbruster wrote:

Vladimir Sementsov-Ogievskiy  writes:


Send a new event when guest reads virtio-pci config after
virtio_notify_config() call.

That's useful to check that guest fetched modified config, for example
after resizing disk backend.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  hw/virtio/virtio-pci.c |  9 +
  include/monitor/qdev.h |  2 ++
  monitor/monitor.c  |  1 +
  qapi/qdev.json | 33 +
  stubs/qdev.c   |  6 ++
  system/qdev-monitor.c  |  6 ++
  6 files changed, 57 insertions(+)

diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
index 92afbae71c..c0c158dae2 100644
--- a/hw/virtio/virtio-pci.c
+++ b/hw/virtio/virtio-pci.c
@@ -23,6 +23,7 @@
  #include "hw/boards.h"
  #include "hw/virtio/virtio.h"
  #include "migration/qemu-file-types.h"
+#include "monitor/qdev.h"
  #include "hw/pci/pci.h"
  #include "hw/pci/pci_bus.h"
  #include "hw/qdev-properties.h"
@@ -530,6 +531,10 @@ static uint64_t virtio_pci_config_read(void *opaque, 
hwaddr addr,
  }
  addr -= config;
  
+if (vdev->generation > 0) {

+qdev_virtio_config_read_event(DEVICE(proxy));
+}
+
  switch (size) {
  case 1:
  val = virtio_config_readb(vdev, addr);
@@ -1884,6 +1889,10 @@ static uint64_t virtio_pci_device_read(void *opaque, 
hwaddr addr,
  return UINT64_MAX;
  }
  
+if (vdev->generation > 0) {

+qdev_virtio_config_read_event(DEVICE(proxy));
+}
+
  switch (size) {
  case 1:
  val = virtio_config_modern_readb(vdev, addr);
diff --git a/include/monitor/qdev.h b/include/monitor/qdev.h
index 1d57bf6577..fc9a834dca 100644
--- a/include/monitor/qdev.h
+++ b/include/monitor/qdev.h
@@ -36,4 +36,6 @@ DeviceState *qdev_device_add_from_qdict(const QDict *opts,
   */
  const char *qdev_set_id(DeviceState *dev, char *id, Error **errp);
  
+void qdev_virtio_config_read_event(DeviceState *dev);

+
  #endif
diff --git a/monitor/monitor.c b/monitor/monitor.c
index 01ede1babd..5b06146503 100644
--- a/monitor/monitor.c
+++ b/monitor/monitor.c
@@ -316,6 +316,7 @@ static MonitorQAPIEventConf 
monitor_qapi_event_conf[QAPI_EVENT__MAX] = {
  [QAPI_EVENT_VSERPORT_CHANGE]   = { 1000 * SCALE_MS },
  [QAPI_EVENT_MEMORY_DEVICE_SIZE_CHANGE] = { 1000 * SCALE_MS },
  [QAPI_EVENT_HV_BALLOON_STATUS_REPORT] = { 1000 * SCALE_MS },
+[QAPI_EVENT_VIRTIO_CONFIG_READ] = { 300 * SCALE_MS },


All the other rate-limited events use 1s.  Why 0.3s for this one?


No actual reason, just seemed to me that 1s is too much. Should be better to 
keep all limits to be the same, until no concrete reason to break it.




  };
  
  /*

diff --git a/qapi/qdev.json b/qapi/qdev.json
index e8be79c3d5..29a4f47360 100644
--- a/qapi/qdev.json
+++ b/qapi/qdev.json
@@ -182,3 +182,36 @@
  { 'command': 'device-sync-config',
'features': [ 'unstable' ],
'data': {'id': 'str'} }
+
+##
+# @VIRTIO_CONFIG_READ:
+#
+# Emitted whenever guest reads virtio device configuration after
+# configuration change.


Is it emitted whenever the guest reads, or only when it reads after a
configuration change?


Hmm, it's emitted only when vdev->generation > 0, which generally mean that 
there was at least one call to virtio_notify_config()... That's not the logic, which 
could be simply described here.


Actually, now I think that event was a premature improvement. In our final 
downstream solution only the command device-sync-config is used, not the event. 
I see that the concept of the event is objectionable, I think, I'll better just 
drop it in v4.




+#
+# The event may be used in pair with device-sync-config. It shows
+# that guest has re-read updated configuration. It doesn't
+# guarantee that guest successfully handled it and updated the
+# view of the device for the user, but still it's a kind of
+# success indicator.


The event is virtio-only.  device-sync-config isn't.  Why?


+#
+# @device: device name
+#
+# @path: device path
+#
+# Features:
+#
+# @unstable: The event is experimental.
+#


Missing:

# Note: This event is rate-limited.
#


+# Since: 9.1
+#
+# Example:
+#
+# <- { "event": "VIRTIO_CONFIG_READ",
+#  "data": { "device": "virtio-net-pci-0",
+#"path": "/machine/peripheral/virtio-net-pci-0" },
+#  "timestamp": { "seconds": 1265044230, "microseconds": 450486 } }
+##
+{ 'event': 'VIRTIO_CONFIG_READ',
+  'features': [ 'unstable' ],
+  'data': { '*device': 'str', 'path': 'str' } }
diff --git a/stubs/qdev.c b/stubs/qdev.c
index 6869f6f90a..ab6c4afe0b 100644
--- a/stubs/qdev.c
+++ b/stubs/qdev.c
@@ -26,3 +26,9 @@ void qapi_event_send_device_unplug_guest_error(const char 
*device,
  {
  /* Nothing to do. */
  }
+
+void qapi_event_send_virtio_config_read(const char *device,
+const char *path)
+{
+/* Nothing to do. */
+}
diff --git a/system/qdev-monitor.c b/system/qdev-monitor.c
index cb35ea0b

Re: [PATCH v3 4/5] qapi: introduce device-sync-config

2024-04-29 Thread Vladimir Sementsov-Ogievskiy

On 24.04.24 14:48, Markus Armbruster wrote:

Vladimir Sementsov-Ogievskiy  writes:


Add command to sync config from vhost-user backend to the device. It
may be helpful when VHOST_USER_SLAVE_CONFIG_CHANGE_MSG failed or not
triggered interrupt to the guest or just not available (not supported
by vhost-user server).

Command result is racy if allow it during migration. Let's allow the
sync only in RUNNING state.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  hw/block/vhost-user-blk.c | 27 --
  hw/virtio/virtio-pci.c|  9 
  include/hw/qdev-core.h|  3 +++
  include/sysemu/runstate.h |  1 +
  qapi/qdev.json| 21 +
  system/qdev-monitor.c | 47 +++
  system/runstate.c |  5 +
  7 files changed, 106 insertions(+), 7 deletions(-)

diff --git a/hw/block/vhost-user-blk.c b/hw/block/vhost-user-blk.c
index 9e6bbc6950..2f301f380c 100644
--- a/hw/block/vhost-user-blk.c
+++ b/hw/block/vhost-user-blk.c
@@ -88,27 +88,39 @@ static void vhost_user_blk_set_config(VirtIODevice *vdev, 
const uint8_t *config)
  s->blkcfg.wce = blkcfg->wce;
  }
  
+static int vhost_user_blk_sync_config(DeviceState *dev, Error **errp)

+{
+int ret;
+VirtIODevice *vdev = VIRTIO_DEVICE(dev);
+VHostUserBlk *s = VHOST_USER_BLK(vdev);
+
+ret = vhost_dev_get_config(&s->dev, (uint8_t *)&s->blkcfg,
+   vdev->config_len, errp);
+if (ret < 0) {
+return ret;
+}
+
+memcpy(vdev->config, &s->blkcfg, vdev->config_len);
+virtio_notify_config(vdev);
+
+return 0;
+}
+
  static int vhost_user_blk_handle_config_change(struct vhost_dev *dev)
  {
  int ret;
-VirtIODevice *vdev = dev->vdev;
-VHostUserBlk *s = VHOST_USER_BLK(dev->vdev);
  Error *local_err = NULL;
  
  if (!dev->started) {

  return 0;
  }
  
-ret = vhost_dev_get_config(dev, (uint8_t *)&s->blkcfg,

-   vdev->config_len, &local_err);
+ret = vhost_user_blk_sync_config(DEVICE(dev->vdev), &local_err);
  if (ret < 0) {
  error_report_err(local_err);
  return ret;
  }
  
-memcpy(dev->vdev->config, &s->blkcfg, vdev->config_len);

-virtio_notify_config(dev->vdev);
-
  return 0;
  }


This factors vhost_user_blk_sync_config() out of
vhost_user_blk_handle_config_change() for reuse.  Correct?


Yes. Will split to a separate patch in v4



  
@@ -576,6 +588,7 @@ static void vhost_user_blk_class_init(ObjectClass *klass, void *data)
  
  device_class_set_props(dc, vhost_user_blk_properties);

  dc->vmsd = &vmstate_vhost_user_blk;
+dc->sync_config = vhost_user_blk_sync_config;
  set_bit(DEVICE_CATEGORY_STORAGE, dc->categories);
  vdc->realize = vhost_user_blk_device_realize;
  vdc->unrealize = vhost_user_blk_device_unrealize;
diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
index eaaf86402c..92afbae71c 100644
--- a/hw/virtio/virtio-pci.c
+++ b/hw/virtio/virtio-pci.c
@@ -2501,6 +2501,14 @@ static void virtio_pci_dc_realize(DeviceState *qdev, 
Error **errp)
  vpciklass->parent_dc_realize(qdev, errp);
  }
  
+static int virtio_pci_sync_config(DeviceState *dev, Error **errp)

+{
+VirtIOPCIProxy *proxy = VIRTIO_PCI(dev);
+VirtIODevice *vdev = virtio_bus_get_device(&proxy->bus);
+
+return qdev_sync_config(DEVICE(vdev), errp);
+}
+
  static void virtio_pci_class_init(ObjectClass *klass, void *data)
  {
  DeviceClass *dc = DEVICE_CLASS(klass);
@@ -2517,6 +2525,7 @@ static void virtio_pci_class_init(ObjectClass *klass, 
void *data)
  device_class_set_parent_realize(dc, virtio_pci_dc_realize,
  &vpciklass->parent_dc_realize);
  rc->phases.hold = virtio_pci_bus_reset_hold;
+dc->sync_config = virtio_pci_sync_config;
  }
  


I tried to follow the callbacks, but quickly gave up.  Leaving to a
reviewer who understands virtio.


  static const TypeInfo virtio_pci_info = {
diff --git a/include/hw/qdev-core.h b/include/hw/qdev-core.h
index 9228e96c87..87135bdcdf 100644
--- a/include/hw/qdev-core.h
+++ b/include/hw/qdev-core.h
@@ -95,6 +95,7 @@ typedef void (*DeviceUnrealize)(DeviceState *dev);
  typedef void (*DeviceReset)(DeviceState *dev);
  typedef void (*BusRealize)(BusState *bus, Error **errp);
  typedef void (*BusUnrealize)(BusState *bus);
+typedef int (*DeviceSyncConfig)(DeviceState *dev, Error **errp);
  
  /**

   * struct DeviceClass - The base class for all devices.
@@ -162,6 +163,7 @@ struct DeviceClass {
  DeviceReset reset;
  DeviceRealize realize;
  DeviceUnrealize unrealize;
+DeviceSyncConfig sync_config;
  
  /**

   * @vmsd: device state serialisation description for
@@ -546,6 +548,7 @@ bool qdev_hotplug_allowed(DeviceState *dev, Error **errp);
   */
  HotplugHandler *qdev_get_hotplug_handler(DeviceState *dev);
  void qdev_unplug(DeviceState *dev, Error **errp);
+int qdev_sync_config(DeviceState *dev