Re: [PATCH for-9.0 v3 0/2] qtest/virtio-9p-test.c: fix slow tests

2024-03-27 Thread Greg Kurz
On Wed, 27 Mar 2024 11:20:09 -0300
Daniel Henrique Barboza  wrote:

> Hi,
> 
> In this new version we took a different approach after the discussions
> we had in [1]. The tests are now untouched, and we're addressing the root
> cause directly: the fact that we have a single temp dir for all the test
> execution in qos-test.
> 
> We're now creating and cleaning temp dirs for each individual test by
> calling virtio_9p_create_local_test_dir() in the .before callback for
> the local 9p tests (assign_9p_local_driver()). In this same callback we
> queue the cleanup function that will erase the created temp dir. The
> cleanup will run after the test ran successfully.
> 
> This approach is similar to what other qtests do (in fact this design was
> taken from vhost-user-test.c) so it's not like we're doing something
> novel.
> 
> I kept the revert of the slow test gate because Gitlab seems to approve
> it:
> 
> https://gitlab.com/danielhb/qemu/-/pipelines/1229836634
> 
> Feel free to take just patch 1 if we're not sure about re-enabling these
> tests in Gitlab.
> 
> 
> Changes from v2:
> - patches 1 to 6: dropped
> - patch 1 (new):
>   - create and remove temporary dirs on each test
> - v2 link: https://mail.gnu.org/archive/html/qemu-devel/2024-03/msg06335.html
> 
> [1] https://mail.gnu.org/archive/html/qemu-devel/2024-03/msg06400.html
> 
> Daniel Henrique Barboza (2):
>   qtest/virtio-9p-test.c: create/remove temp dirs after each test
>   qtest/virtio-9p-test.c: remove g_test_slow() gate
> 
>  tests/qtest/virtio-9p-test.c | 32 +++-
>  1 file changed, 11 insertions(+), 21 deletions(-)
> 

Definitely better !

Full series,

Reviewed-by: Greg Kurz

Cheers,

-- 
Greg
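
The per-test temp dir approach described in the cover letter can be sketched roughly as follows. This is a minimal standalone illustration using plain POSIX calls, not the qtest code: demo_before(), demo_test_body(), demo_cleanup() and demo_run_once() are hypothetical names, and the cleanup only removes the one directory the sketch creates instead of erasing the tree recursively.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical per-test state: the temp dir owned by a single test. */
typedef struct DemoTestDir {
    char root[64];
} DemoTestDir;

/* ".before"-style hook: create a fresh temp dir for this test only. */
static int demo_before(DemoTestDir *d)
{
    int n = snprintf(d->root, sizeof(d->root), "/tmp/demo-9p-local-XXXXXX");

    assert(n > 0 && (size_t)n < sizeof(d->root));
    return mkdtemp(d->root) != NULL ? 0 : -1;
}

/* Stand-in for a test body that creates directory "01". */
static int demo_test_body(const DemoTestDir *d)
{
    char path[128];
    int n = snprintf(path, sizeof(path), "%s/01", d->root);

    assert(n > 0 && (size_t)n < sizeof(path));
    return mkdir(path, 0700);
}

/* Queued cleanup: remove what the test created, then the temp dir itself. */
static int demo_cleanup(const DemoTestDir *d)
{
    char path[128];
    int n = snprintf(path, sizeof(path), "%s/01", d->root);

    assert(n > 0 && (size_t)n < sizeof(path));
    rmdir(path);              /* may already be gone, so ignore the result */
    return rmdir(d->root);
}

/* One full test cycle: before hook, test body, cleanup. */
static int demo_run_once(void)
{
    DemoTestDir d;
    int rc;

    if (demo_before(&d) != 0) {
        return -1;
    }
    rc = demo_test_body(&d);
    if (demo_cleanup(&d) != 0) {
        return -1;
    }
    return rc;
}
```

Because every cycle gets its own directory, calling demo_run_once() twice in a row succeeds both times, which is the property the virtio-9p-device and virtio-9p-pci runs need.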



Re: [PATCH for-9.0 1/3] qtest/virtio-9p-test.c: consolidate create dir, file and symlink tests

2024-03-27 Thread Greg Kurz
On Wed, 27 Mar 2024 13:26:45 +0100
Christian Schoenebeck  wrote:

> On Wednesday, March 27, 2024 12:28:17 PM CET Daniel Henrique Barboza wrote:
> > On 3/27/24 07:14, Christian Schoenebeck wrote:
> > > On Wednesday, March 27, 2024 10:33:27 AM CET Daniel Henrique Barboza 
> > > wrote:
> > >> On 3/27/24 05:47, Christian Schoenebeck wrote:
> > >>> On Tuesday, March 26, 2024 6:47:17 PM CET Daniel Henrique Barboza wrote:
> > >>>> On 3/26/24 14:05, Greg Kurz wrote:
> > >>>>> On Tue, 26 Mar 2024 10:26:04 -0300
> > >>>>> Daniel Henrique Barboza  wrote:
> > >>>>>
> > >>>>>> The local 9p driver in virtio-9p-test.c creates its temporary dir right at
> > >>>>>> the
> > >>>>>> start of qos-test (via virtio_9p_create_local_test_dir()) and only
> > >>>>>> deletes it after qos-test is finished (via
> > >>>>>> virtio_9p_remove_local_test_dir()).
> > >>>>>>
> > >>>>>> This means that any qos-test machine that ends up running 
> > >>>>>> virtio-9p-test local
> > >>>>>> tests more than once will end up re-using the same temp dir. This is
> > >>>>>> what's happening in [1] after we introduced the riscv machine nodes: 
> > >>>>>> if
> > >>>>>> we enable slow tests with the '-m slow' flag using 
> > >>>>>> qemu-system-riscv64,
> > >>>>>> this is what happens:
> > >>>>>>
> > >>>>>> - a temp dir is created, e.g. qtest-9p-local-WZLDL2;
> > >>>>>>
> > >>>>>> - virtio-9p-device tests will run virtio-9p-test successfully;
> > >>>>>>
> > >>>>>> - virtio-9p-pci tests will run virtio-9p-test, and fail right at the
> > >>>>>>  first slow test at fs_create_dir() because the "01" file was 
> > >>>>>> already
> > >>>>>>  created by fs_create_dir() test when running with the 
> > >>>>>> virtio-9p-device.
> > >>>>>>
> > >>>>>> We can fix it by making every test clean up its changes in the
> > >>>>>> filesystem after it's done. But we don't need every test either:
> > >>>>>> what fs_create_dir() does is already exercised in fs_unlinkat_dir(),
> > >>>>>> i.e. a dir is created, verified to be created, and then removed.
> > >>>>>> Fixing fs_create_dir() would turn it into fs_unlinkat_dir(), so we
> > >>>>>> don't need both. The same theme follows every test in
> > >>>>>> virtio-9p-test.c, where the 'unlinkat' variant does the same thing
> > >>>>>> the 'create' variant does but with some cleanup at the end.
> > >>>>>>
> > >>>>>> Consolidate some tests as follows:
> > >>>>>>
> > >>>>>> - fs_create_dir() is removed. fs_unlinkat_dir() is renamed to
> > >>>>>>  fs_create_unlinkat_dir();
> > >>>>>>
> > >>>>>> - fs_create_file() is removed. fs_unlinkat_file() is renamed to
> > >>>>>>  fs_create_unlinkat_file(). The "04" dir it uses is now being 
> > >>>>>> removed;
> > >>>>>>
> > >>>>>> - fs_symlink_file() is removed. fs_unlinkat_symlink() is renamed to
> > >>>>>>  fs_create_unlinkat_symlink(). Both "real_file" and the "06" dir
> > >>>>>>  it creates are now being removed.
> > >>>>>>
> > >>>>>
> > >>>>> The  change looks good functionally but it breaks the legitimate 
> > >>>>> assumption
> > >>>>> that files "06/*" come from test #6 and so on... I think you should 
> > >>>>> consider
> > >>>>> renumbering to avoid confusion when debugging logs.
> > >>>>>
> > >>>>> Since this will bring more hunks, please split this in enough 
> > >>>>> reviewable
> > >>>>> patches.
> > >>>>

Re: [PATCH for-9.0 1/3] qtest/virtio-9p-test.c: consolidate create dir, file and symlink tests

2024-03-26 Thread Greg Kurz
On Tue, 26 Mar 2024 10:26:04 -0300
Daniel Henrique Barboza  wrote:

> The local 9p driver in virtio-9p-test.c creates its temporary dir right at the
> start of qos-test (via virtio_9p_create_local_test_dir()) and only
> deletes it after qos-test is finished (via
> virtio_9p_remove_local_test_dir()).
> 
> This means that any qos-test machine that ends up running virtio-9p-test local
> tests more than once will end up re-using the same temp dir. This is
> what's happening in [1] after we introduced the riscv machine nodes: if
> we enable slow tests with the '-m slow' flag using qemu-system-riscv64,
> this is what happens:
> 
> - a temp dir is created, e.g. qtest-9p-local-WZLDL2;
> 
> - virtio-9p-device tests will run virtio-9p-test successfully;
> 
> - virtio-9p-pci tests will run virtio-9p-test, and fail right at the
>   first slow test at fs_create_dir() because the "01" file was already
>   created by fs_create_dir() test when running with the virtio-9p-device.
> 
> We can fix it by making every test clean up its changes in the
> filesystem after it's done. But we don't need every test either:
> what fs_create_dir() does is already exercised in fs_unlinkat_dir(),
> i.e. a dir is created, verified to be created, and then removed. Fixing
> fs_create_dir() would turn it into fs_unlinkat_dir(), so we don't need
> both. The same theme follows every test in virtio-9p-test.c, where the
> 'unlinkat' variant does the same thing the 'create' variant does but with
> some cleanup at the end.
> 
> Consolidate some tests as follows:
> 
> - fs_create_dir() is removed. fs_unlinkat_dir() is renamed to
>   fs_create_unlinkat_dir();
> 
> - fs_create_file() is removed. fs_unlinkat_file() is renamed to
>   fs_create_unlinkat_file(). The "04" dir it uses is now being removed;
> 
> - fs_symlink_file() is removed. fs_unlinkat_symlink() is renamed to
>   fs_create_unlinkat_symlink(). Both "real_file" and the "06" dir it
>   creates are now being removed.
> 

The change looks good functionally but it breaks the legitimate assumption
that files "06/*" come from test #6 and so on... I think you should consider
renumbering to avoid confusion when debugging logs.

Since this will bring more hunks, please split this in enough reviewable
patches.

Cheers,

--
Greg

> We're still missing the 'hardlink' tests. We'll do it in the next patch
> since it's less trivial to consolidate than these.
> 
> [1] https://mail.gnu.org/archive/html/qemu-devel/2024-03/msg05807.html
> 
> Reported-by: Thomas Huth 
> Signed-off-by: Daniel Henrique Barboza 
> ---
>  tests/qtest/virtio-9p-test.c | 97 +++-
>  1 file changed, 29 insertions(+), 68 deletions(-)
> 
> diff --git a/tests/qtest/virtio-9p-test.c b/tests/qtest/virtio-9p-test.c
> index 65e69491e5..cdbe3e78ea 100644
> --- a/tests/qtest/virtio-9p-test.c
> +++ b/tests/qtest/virtio-9p-test.c
> @@ -506,26 +506,8 @@ static void fs_readdir_split_512(void *obj, void *data,
>  
>  /* tests using the 9pfs 'local' fs driver */
>  
> -static void fs_create_dir(void *obj, void *data, QGuestAllocator *t_alloc)
> -{
> -QVirtio9P *v9p = obj;
> -v9fs_set_allocator(t_alloc);
> -struct stat st;
> -g_autofree char *root_path = virtio_9p_test_path("");
> -g_autofree char *new_dir = virtio_9p_test_path("01");
> -
> -g_assert(root_path != NULL);
> -
> -tattach({ .client = v9p });
> -tmkdir({ .client = v9p, .atPath = "/", .name = "01" });
> -
> -/* check if created directory really exists now ... */
> -g_assert(stat(new_dir, &st) == 0);
> -/* ... and is actually a directory */
> -g_assert((st.st_mode & S_IFMT) == S_IFDIR);
> -}
> -
> -static void fs_unlinkat_dir(void *obj, void *data, QGuestAllocator *t_alloc)
> +static void fs_create_unlinkat_dir(void *obj, void *data,
> +   QGuestAllocator *t_alloc)
>  {
>  QVirtio9P *v9p = obj;
>  v9fs_set_allocator(t_alloc);
> @@ -551,28 +533,13 @@ static void fs_unlinkat_dir(void *obj, void *data, 
> QGuestAllocator *t_alloc)
>  g_assert(stat(new_dir, &st) != 0);
>  }
>  
> -static void fs_create_file(void *obj, void *data, QGuestAllocator *t_alloc)
> -{
> -QVirtio9P *v9p = obj;
> -v9fs_set_allocator(t_alloc);
> -struct stat st;
> -g_autofree char *new_file = virtio_9p_test_path("03/1st_file");
> -
> -tattach({ .client = v9p });
> -tmkdir({ .client = v9p, .atPath = "/", .name = "03" });
> -tlcreate({ .client = v9p, .atPath = "03", .name = "1st_file" });
> -
> -/* check if created file exists now ... */
> -g_assert(stat(new_file, &st) == 0);
> -/* ... and is a regular file */
> -g_assert((st.st_mode & S_IFMT) == S_IFREG);
> -}
> -
> -static void fs_unlinkat_file(void *obj, void *data, QGuestAllocator *t_alloc)
> +static void fs_create_unlinkat_file(void *obj, void *data,
> +QGuestAllocator *t_alloc)
>  {
>  QVirtio9P *v9p = obj;
>  v9fs_set_allocator(t_alloc);
>  struct stat st;
> +g_autofree 

Re: [PATCH for-9.0 0/3] qtest/virtio-9p-test.c: fix slow tests

2024-03-26 Thread Greg Kurz
Bom dia Daniel !

On Tue, 26 Mar 2024 10:26:03 -0300
Daniel Henrique Barboza  wrote:

> Hi,
> 
> Thomas reported in [1] a problem that happened with the RISC-V machine
> where some tests from virtio-9p-test.c were failing with '-m slow', i.e.
> enabling slow tests.
> 
> In the end it wasn't a RISC-V specific problem. It just so happens that
> the recently added riscv machine nodes run the tests from
> virtio-9p-test two times for each qos-test run: one with the
> virtio-9p-device device and another with the virtio-9p-pci. The temp dir
> for these tests is being created at the start of qos-test and removed
> only at the end of qos-test, and the tests are leaving dirs and files
> behind. The virtio-9p-device tests run first and create stuff in the temp
> dir, then when the virtio-9p-pci tests run they fail because the previous
> run left created dirs and files in the same temp dir. Here's a run that
> exemplifies the problem:
> 
> $ MALLOC_PERTURB_=21 V=2 QTEST_QEMU_BINARY=./qemu-system-riscv64 ./tests/qtest/qos-test -m slow
> (...)
> # starting QEMU: exec ./qemu-system-riscv64 -qtest 
> unix:/tmp/qtest-621710.sock -qtest-log /dev/null -chardev 
> socket,path=/tmp/qtest-621710.qmp,id=char0 -mon chardev=char0,mode=control 
> -display none -audio none -M virt,aclint=on,aia=aplic-imsic -fsdev 
> local,id=fsdev0,path='/home/danielhb/work/qemu/build/qtest-9p-local-7E16K2',security_model=mapped-xattr
>  -device virtio-9p-device,fsdev=fsdev0,mount_tag=qtest -accel qtest
> ( goes ok ...)
> # starting QEMU: exec ./qemu-system-riscv64 -qtest 
> unix:/tmp/qtest-621710.sock -qtest-log /dev/null -chardev 
> socket,path=/tmp/qtest-621710.qmp,id=char0 -mon chardev=char0,mode=control 
> -display none -audio none -M virt,aclint=on,aia=aplic-imsic -fsdev 
> local,id=fsdev0,path='/home/danielhb/work/qemu/build/qtest-9p-local-7E16K2',security_model=mapped-xattr
>  -device virtio-9p-pci,fsdev=fsdev0,addr=04.0,mount_tag=qtest -accel qtest
> ok 168 
> /riscv64/virt/generic-pcihost/pci-bus-generic/pci-bus/virtio-9p-pci/virtio-9p/virtio-9p-tests/local/config
> Received response 7 (RLERROR) instead of 73 (RMKDIR)
> Rlerror has errno 17 (File exists)
> **
> ERROR:../tests/qtest/libqos/virtio-9p-client.c:275:v9fs_req_recv: assertion 
> failed (hdr.id == id): (7 == 73)
> 
> As we can see we're running both 'virtio-9p-device' tests and 'virtio-9p-pci'
> tests using the same '/home/danielhb/work/qemu/build/qtest-9p-local-7E16K2'
> temp dir. 
> 


Good catch ! I'll try to find some time to review.

> The quick fix I came up with was to make each test clean up after itself
> on each run. The tests were also consolidated, i.e. fewer tests with the
> same coverage, because the 'unlinkat' tests were doing the same thing the
> 'create' tests were doing but removing stuff after. Might as well keep just
> the 'unlinkat' tests.
> 

As long as coverage is preserved, I'm fine with consolidation of the
checks. In any case, last call goes to Christian.

> I also went ahead and reverted 558f5c42efd ("tests/9pfs: Mark "local"
> tests as "slow"") after realizing that the problem I was fixing is also
> the same problem that this patch was trying to work around with the
> skip [2]. I validated this change in this Gitlab pipeline:
> 

Are you sure about that ? The issues look very similar indeed, but not
exactly the same.

Cheers,

--
Greg

> https://gitlab.com/danielhb/qemu/-/pipelines/1227953967
> 
> [1] https://mail.gnu.org/archive/html/qemu-devel/2024-03/msg05807.html
> [2] https://lists.nongnu.org/archive/html/qemu-devel/2020-11/msg05510.html
> 
> Daniel Henrique Barboza (3):
>   qtest/virtio-9p-test.c: consolidate create dir, file and symlink tests
>   qtest/virtio-9p-test.c: consolidate hardlink tests
>   qtest/virtio-9p-test.c: remove g_test_slow() gate
> 
>  tests/qtest/virtio-9p-test.c | 155 +++
>  1 file changed, 48 insertions(+), 107 deletions(-)
> 
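
The "errno 17 (File exists)" failure in the log above can be reproduced outside QEMU with a small sketch. It assumes nothing about the qtest harness: demo_create_dir_test() and demo_second_run_in_shared_dir() are hypothetical names, and the plain mkdir() call only stands in for the Tmkdir request the real test sends.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical stand-in for a "create dir" test: create "01" under root.
 * Returns 0 on success or the errno value reported by mkdir().
 */
static int demo_create_dir_test(const char *root)
{
    char path[128];
    int n = snprintf(path, sizeof(path), "%s/01", root);

    assert(n > 0 && (size_t)n < sizeof(path));
    return mkdir(path, 0700) == 0 ? 0 : errno;
}

/*
 * Run the same test twice against one shared temp dir, as happens when two
 * device flavours reuse it. Returns the result of the second run, or -1 if
 * setup or the first run failed.
 */
static int demo_second_run_in_shared_dir(void)
{
    char root[] = "/tmp/demo-9p-shared-XXXXXX";
    char path[128];
    int first, second;

    if (mkdtemp(root) == NULL) {
        return -1;
    }
    first = demo_create_dir_test(root);
    second = demo_create_dir_test(root);

    /* Tidy up so the sketch leaves nothing behind. */
    snprintf(path, sizeof(path), "%s/01", root);
    rmdir(path);
    rmdir(root);

    return first == 0 ? second : -1;
}
```

The second run reports EEXIST, which is the same condition the 9p server returns to the client as Rlerror.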



Re: [PATCH 1/3] hw/i386: Add `\n` to hint message

2024-01-30 Thread Greg Kurz
On Tue, 30 Jan 2024 21:43:27 +0530
Ani Sinha  wrote:

> 
> 
> > On 30-Jan-2024, at 21:26, Greg Kurz  wrote:
> > 
> > error_fprintf() doesn't add newlines.
> 
> ^
> 
> Should be error_printf(). Ditto for other patches.
> 

Thanks. Posted a v2.

> > 
> > Signed-off-by: Greg Kurz 
> > ---
> > hw/i386/acpi-build.c | 4 ++--
> > 1 file changed, 2 insertions(+), 2 deletions(-)
> > 
> > diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
> > index edc979379c03..e990b0ae927f 100644
> > --- a/hw/i386/acpi-build.c
> > +++ b/hw/i386/acpi-build.c
> > @@ -2697,7 +2697,7 @@ void acpi_build(AcpiBuildTables *tables, MachineState 
> > *machine)
> > " migration may not work",
> > tables_blob->len, legacy_table_size);
> > error_printf("Try removing CPUs, NUMA nodes, memory slots"
> > - " or PCI bridges.");
> > + " or PCI bridges.\n");
> > }
> > g_array_set_size(tables_blob, legacy_table_size);
> > } else {
> > @@ -2709,7 +2709,7 @@ void acpi_build(AcpiBuildTables *tables, MachineState 
> > *machine)
> > " migration may not work",
> > tables_blob->len, ACPI_BUILD_TABLE_SIZE / 2);
> > error_printf("Try removing CPUs, NUMA nodes, memory slots"
> > - " or PCI bridges.");
> > + " or PCI bridges.\n");
> > }
> > acpi_align_size(tables_blob, ACPI_BUILD_TABLE_SIZE);
> > }
> > -- 
> > 2.43.0
> > 
> 



-- 
Greg



[PATCH v2 1/3] hw/i386: Add `\n` to hint message

2024-01-30 Thread Greg Kurz
error_printf() doesn't add newlines.

Signed-off-by: Greg Kurz 
---
 hw/i386/acpi-build.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index edc979379c03..e990b0ae927f 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -2697,7 +2697,7 @@ void acpi_build(AcpiBuildTables *tables, MachineState 
*machine)
 " migration may not work",
 tables_blob->len, legacy_table_size);
 error_printf("Try removing CPUs, NUMA nodes, memory slots"
- " or PCI bridges.");
+ " or PCI bridges.\n");
 }
 g_array_set_size(tables_blob, legacy_table_size);
 } else {
@@ -2709,7 +2709,7 @@ void acpi_build(AcpiBuildTables *tables, MachineState 
*machine)
 " migration may not work",
 tables_blob->len, ACPI_BUILD_TABLE_SIZE / 2);
 error_printf("Try removing CPUs, NUMA nodes, memory slots"
- " or PCI bridges.");
+ " or PCI bridges.\n");
 }
 acpi_align_size(tables_blob, ACPI_BUILD_TABLE_SIZE);
 }
-- 
2.43.0




[PATCH v2 0/3] acpi-build: Fix hint messages

2024-01-30 Thread Greg Kurz
ACPI build for ARM, i386 and Loongarch all have the
same warning report with a hint for the user. The
hint is printed with error_printf() as expected but
it lacks the terminating '\n'.

v2:
- s/error_fprintf/error_printf in commit logs (Ani)

Greg Kurz (3):
  hw/i386: Add `\n` to hint message
  hw/loongarch: Add `\n` to hint message
  hw/arm: Add `\n` to hint message

 hw/arm/virt-acpi-build.c  | 2 +-
 hw/i386/acpi-build.c  | 4 ++--
 hw/loongarch/acpi-build.c | 2 +-
 3 files changed, 4 insertions(+), 4 deletions(-)

-- 
2.43.0
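
Why the hint needs an explicit '\n' can be shown with a small sketch. It does not use QEMU's error_printf(); demo_report() is a hypothetical printf-style helper that, like any such function, emits exactly what the format string produces and nothing more.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical printf-style reporter: formats into 'out' and appends
 * nothing on its own. Returns the number of characters written.
 */
static size_t demo_report(char *out, size_t out_len, const char *fmt, ...)
{
    va_list ap;
    int n;

    assert(out != NULL && out_len > 0);
    va_start(ap, fmt);
    n = vsnprintf(out, out_len, fmt, ap);
    va_end(ap);
    assert(n >= 0 && (size_t)n < out_len);
    return (size_t)n;
}

/* True if the message is terminated by a newline. */
static int demo_ends_with_newline(const char *s)
{
    size_t len = strlen(s);

    return len > 0 && s[len - 1] == '\n';
}
```

With the original format string the message stays unterminated, so whatever is printed next lands on the same line; with the patched string the line is complete.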





[PATCH v2 3/3] hw/arm: Add `\n` to hint message

2024-01-30 Thread Greg Kurz
error_printf() doesn't add newlines.

Signed-off-by: Greg Kurz 
---
 hw/arm/virt-acpi-build.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index 17aeec7a6f56..48febde1ccd1 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -1008,7 +1008,7 @@ void virt_acpi_build(VirtMachineState *vms, 
AcpiBuildTables *tables)
 " migration may not work",
 tables_blob->len, ACPI_BUILD_TABLE_SIZE / 2);
 error_printf("Try removing CPUs, NUMA nodes, memory slots"
- " or PCI bridges.");
+ " or PCI bridges.\n");
 }
 acpi_align_size(tables_blob, ACPI_BUILD_TABLE_SIZE);
 
-- 
2.43.0




[PATCH v2 2/3] hw/loongarch: Add `\n` to hint message

2024-01-30 Thread Greg Kurz
error_printf() doesn't add newlines.

Signed-off-by: Greg Kurz 
---
 hw/loongarch/acpi-build.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/loongarch/acpi-build.c b/hw/loongarch/acpi-build.c
index 730bc4a748c4..a1c419874123 100644
--- a/hw/loongarch/acpi-build.c
+++ b/hw/loongarch/acpi-build.c
@@ -509,7 +509,7 @@ static void acpi_build(AcpiBuildTables *tables, 
MachineState *machine)
 " migration may not work",
 tables_blob->len, ACPI_BUILD_TABLE_SIZE / 2);
 error_printf("Try removing CPUs, NUMA nodes, memory slots"
- " or PCI bridges.");
+ " or PCI bridges.\n");
 }
 
 acpi_align_size(tables->linker->cmd_blob, ACPI_BUILD_ALIGN_SIZE);
-- 
2.43.0




[PATCH 3/3] hw/arm: Add `\n` to hint message

2024-01-30 Thread Greg Kurz
error_fprintf() doesn't add newlines.

Signed-off-by: Greg Kurz 
---
 hw/arm/virt-acpi-build.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index 17aeec7a6f56..48febde1ccd1 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -1008,7 +1008,7 @@ void virt_acpi_build(VirtMachineState *vms, 
AcpiBuildTables *tables)
 " migration may not work",
 tables_blob->len, ACPI_BUILD_TABLE_SIZE / 2);
 error_printf("Try removing CPUs, NUMA nodes, memory slots"
- " or PCI bridges.");
+ " or PCI bridges.\n");
 }
 acpi_align_size(tables_blob, ACPI_BUILD_TABLE_SIZE);
 
-- 
2.43.0




[PATCH 0/3] acpi-build: Fix hint messages

2024-01-30 Thread Greg Kurz
ACPI build for ARM, i386 and Loongarch all have the
same warning report with a hint for the user. The
hint is printed with error_printf() as expected but
it lacks the terminating '\n'.

Greg Kurz (3):
  hw/i386: Add `\n` to hint message
  hw/loongarch: Add `\n` to hint message
  hw/arm: Add `\n` to hint message

 hw/arm/virt-acpi-build.c  | 2 +-
 hw/i386/acpi-build.c  | 4 ++--
 hw/loongarch/acpi-build.c | 2 +-
 3 files changed, 4 insertions(+), 4 deletions(-)

-- 
2.43.0





[PATCH 1/3] hw/i386: Add `\n` to hint message

2024-01-30 Thread Greg Kurz
error_fprintf() doesn't add newlines.

Signed-off-by: Greg Kurz 
---
 hw/i386/acpi-build.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index edc979379c03..e990b0ae927f 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -2697,7 +2697,7 @@ void acpi_build(AcpiBuildTables *tables, MachineState 
*machine)
 " migration may not work",
 tables_blob->len, legacy_table_size);
 error_printf("Try removing CPUs, NUMA nodes, memory slots"
- " or PCI bridges.");
+ " or PCI bridges.\n");
 }
 g_array_set_size(tables_blob, legacy_table_size);
 } else {
@@ -2709,7 +2709,7 @@ void acpi_build(AcpiBuildTables *tables, MachineState 
*machine)
 " migration may not work",
 tables_blob->len, ACPI_BUILD_TABLE_SIZE / 2);
 error_printf("Try removing CPUs, NUMA nodes, memory slots"
- " or PCI bridges.");
+ " or PCI bridges.\n");
 }
 acpi_align_size(tables_blob, ACPI_BUILD_TABLE_SIZE);
 }
-- 
2.43.0




[PATCH 2/3] hw/loongarch: Add `\n` to hint message

2024-01-30 Thread Greg Kurz
error_fprintf() doesn't add newlines.

Signed-off-by: Greg Kurz 
---
 hw/loongarch/acpi-build.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/loongarch/acpi-build.c b/hw/loongarch/acpi-build.c
index 730bc4a748c4..a1c419874123 100644
--- a/hw/loongarch/acpi-build.c
+++ b/hw/loongarch/acpi-build.c
@@ -509,7 +509,7 @@ static void acpi_build(AcpiBuildTables *tables, 
MachineState *machine)
 " migration may not work",
 tables_blob->len, ACPI_BUILD_TABLE_SIZE / 2);
 error_printf("Try removing CPUs, NUMA nodes, memory slots"
- " or PCI bridges.");
+ " or PCI bridges.\n");
 }
 
 acpi_align_size(tables->linker->cmd_blob, ACPI_BUILD_ALIGN_SIZE);
-- 
2.43.0




Re: [PATCH 2/6] hw/9pfs/9p-synth: Use RCU_READ macro

2024-01-23 Thread Greg Kurz
On Wed, 24 Jan 2024 08:41:57 +0100
Philippe Mathieu-Daudé  wrote:

> Replace the manual rcu_read_(un)lock calls by the
> WITH_RCU_READ_LOCK_GUARD macro (See commit ef46ae67ba
> "docs/style: call out the use of GUARD macros").
> 
> Signed-off-by: Philippe Mathieu-Daudé 
> ---
>  hw/9pfs/9p-synth.c | 24 
>  1 file changed, 12 insertions(+), 12 deletions(-)
> 

Acked-by: Greg Kurz 

> diff --git a/hw/9pfs/9p-synth.c b/hw/9pfs/9p-synth.c
> index 0ac79a500b..419ea69e3a 100644
> --- a/hw/9pfs/9p-synth.c
> +++ b/hw/9pfs/9p-synth.c
> @@ -241,15 +241,15 @@ static struct dirent *synth_get_dentry(V9fsSynthNode 
> *dir,
>  int i = 0;
>  V9fsSynthNode *node;
>  
> -rcu_read_lock();
> -QLIST_FOREACH(node, &dir->child, sibling) {
> -/* This is the off child of the directory */
> -if (i == off) {
> -break;
> +WITH_RCU_READ_LOCK_GUARD() {
> +QLIST_FOREACH(node, &dir->child, sibling) {
> +/* This is the off child of the directory */
> +if (i == off) {
> +break;
> +}
> +i++;
>  }
> -i++;
>  }
> -rcu_read_unlock();
>  if (!node) {
>  /* end of directory */
>  return NULL;
> @@ -494,13 +494,13 @@ static int synth_name_to_path(FsContext *ctx, V9fsPath 
> *dir_path,
>  goto out;
>  }
>  /* search for the name in the children */
> -rcu_read_lock();
> -QLIST_FOREACH(node, &dir_node->child, sibling) {
> -if (!strcmp(node->name, name)) {
> -break;
> +WITH_RCU_READ_LOCK_GUARD() {
> +QLIST_FOREACH(node, &dir_node->child, sibling) {
> +if (!strcmp(node->name, name)) {
> +break;
> +}
>  }
>  }
> -rcu_read_unlock();
>  
>  if (!node) {
>  errno = ENOENT;



-- 
Greg



Re: [PATCH 20/71] hw/9pfs: Constify VMState

2023-11-05 Thread Greg Kurz
On Sun,  5 Nov 2023 22:57:36 -0800
Richard Henderson  wrote:

> Signed-off-by: Richard Henderson 
> ---

Acked-by: Greg Kurz 

>  hw/9pfs/virtio-9p-device.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/hw/9pfs/virtio-9p-device.c b/hw/9pfs/virtio-9p-device.c
> index 5f522e68e9..efa41cfd73 100644
> --- a/hw/9pfs/virtio-9p-device.c
> +++ b/hw/9pfs/virtio-9p-device.c
> @@ -237,7 +237,7 @@ static const VMStateDescription vmstate_virtio_9p = {
>  .name = "virtio-9p",
>  .minimum_version_id = 1,
>  .version_id = 1,
> -.fields = (VMStateField[]) {
> +.fields = (const VMStateField[]) {
>  VMSTATE_VIRTIO_DEVICE,
>  VMSTATE_END_OF_LIST()
>  },



-- 
Greg
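
The effect of the added const qualifier can be illustrated with a reduced sketch. DemoField and DemoDescription are hypothetical types that only mimic the shape of the real structures; the point is that a const-qualified compound literal at file scope is an array of const elements, so the compiler is free to place it in read-only storage.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced field descriptor. */
typedef struct DemoField {
    const char *name;
    int size;
} DemoField;

/* Hypothetical description holding a pointer to a constant field table. */
typedef struct DemoDescription {
    const char *name;
    const DemoField *fields;
} DemoDescription;

static const DemoDescription demo_desc = {
    .name = "demo",
    /* const-qualified compound literal: the table itself is immutable */
    .fields = (const DemoField[]) {
        { "first", 4 },
        { "second", 8 },
        { NULL, 0 },          /* end-of-list marker */
    },
};

/* Count entries up to the end-of-list marker. */
static int demo_count_fields(const DemoDescription *d)
{
    int count = 0;

    assert(d != NULL && d->fields != NULL);
    while (d->fields[count].name != NULL) {
        count++;
    }
    return count;
}
```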



Re: [PATCH 07/13] RFC migration: icp/server is a mess

2023-10-20 Thread Greg Kurz
On Fri, 20 Oct 2023 17:49:38 +1000
"Nicholas Piggin"  wrote:

> On Fri Oct 20, 2023 at 7:39 AM AEST, Greg Kurz wrote:
> > On Thu, 19 Oct 2023 21:08:25 +0200
> > Juan Quintela  wrote:
> >
> > > Current code does:
> > > - register pre_2_10_vmstate_dummy_icp with "icp/server" and instance
> > >   depending on cpu number
> > > - for newer machines, it registers vmstate_icp with "icp/server" name
> > >   and instance 0
> > > - now it unregisters "icp/server" for the 1st instance.
> > > 
> > > This is wrong at many levels:
> > > - we shouldn't have two VMSTATEDescriptions with the same name
> > > - In case this is the only solution that we can come up with, it needs to
> > >   be:
> > >   * register pre_2_10_vmstate_dummy_icp
> > >   * unregister pre_2_10_vmstate_dummy_icp
> > >   * register real vmstate_icp
> > > 
> > > As the initialization of this machine is already complex enough, I
> > > need help from PPC maintainers to fix this.
> > > 
> > > Volunteers?
> > > 
> > > CC: Cedric Le Goater 
> > > CC: Daniel Henrique Barboza 
> > > CC: David Gibson 
> > > CC: Greg Kurz 
> > > 
> > > Signed-off-by: Juan Quintela 
> > > ---
> > >  hw/ppc/spapr.c | 7 ++-
> > >  1 file changed, 6 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > > index cb840676d3..8531d13492 100644
> > > --- a/hw/ppc/spapr.c
> > > +++ b/hw/ppc/spapr.c
> > > @@ -143,7 +143,12 @@ static bool pre_2_10_vmstate_dummy_icp_needed(void 
> > > *opaque)
> > >  }
> > >  
> > >  static const VMStateDescription pre_2_10_vmstate_dummy_icp = {
> > > -.name = "icp/server",
> > > +/*
> > > + * Hack ahead.  We can't have two devices with the same name and
> > > + * instance id.  So I rename this to pass make check.
> > > + * Real help from people who knows the hardware is needed.
> > > + */
> > > +.name = "pre-2.10-icp/server",
> > >  .version_id = 1,
> > >  .minimum_version_id = 1,
> > >  .needed = pre_2_10_vmstate_dummy_icp_needed,
> >
> > I guess this fix is acceptable as well and a lot simpler than
> > reverting the hack actually. Outcome is the same : drop
> > compat with pseries-2.9 and older.
> >
> > Reviewed-by: Greg Kurz 
> 
> So the reason we can't have duplicate names registered, aside from it
> surely going bad if we actually send or receive a stream at the point
> they are registered, is the duplicate check introduced in patch 9? But
> before that, this hack does seem to actually work because the duplicate
> is unregistered right away.
> 

Correct.

> If I understand the workaround, there is an asymmetry in the migration
> sequence in that receiving an unexpected object would cause a failure,
> but going from newer to older would just skip some "expected" objects
> and that didn't cause a problem. So you only have to deal with ignoring
> the unexpected ones going from older to newer.
> 

Correct.

> Side question, is it possible to flag the problem of *not* receiving
> an object that you did expect? That might be a source of bugs too.
> 

AFAICR we try to only migrate state that differs from reset : the
destination cannot really assume it will receive anything for a
given device.

> Anyway, I wonder if we could fix this spapr problem by adding a special
> case wild card instance matcher to ignore it? It's still a bit hacky
> but maybe a bit nicer. I don't mind deprecating the machine soon if
> you want to clear the wildcard hack away soon, but it would be nice to
> separate the deprecation and removal from the fix, if possible.
> 
> This patch is not tested but hopefully helps illustrate the idea.
> 

I'm not sure this will fly with older QEMUs that don't know about
VMSTATE_INSTANCE_ID_WILD... but I'll let Juan comment on that.

> Thanks,
> Nick
> 

Cheers,

--
Greg

> diff --git a/include/migration/vmstate.h b/include/migration/vmstate.h
> index 1a31fb7293..8ce03edefa 100644
> --- a/include/migration/vmstate.h
> +++ b/include/migration/vmstate.h
> @@ -1205,6 +1205,7 @@ int vmstate_save_state_v(QEMUFile *f, const 
> VMStateDescription *vmsd,
>  bool vmstate_save_needed(const VMStateDescription *vmsd, void *opaque);
>  
>  #define  VMSTATE_INSTANCE_ID_ANY  -1
> +#define  VMSTATE_INSTANCE_ID_WILD -2
>  
>  /* Returns: 0 on success, -1 on failure */
>  int vmstate_register_with_alia

Re: [PATCH 07/13] RFC migration: icp/server is a mess

2023-10-20 Thread Greg Kurz
On Fri, 20 Oct 2023 09:30:44 +0200
Juan Quintela  wrote:

> Greg Kurz  wrote:
> > On Thu, 19 Oct 2023 21:08:25 +0200
> > Juan Quintela  wrote:
> >
> >> Current code does:
> >> - register pre_2_10_vmstate_dummy_icp with "icp/server" and instance
> >>   depending on cpu number
> >> - for newer machines, it registers vmstate_icp with "icp/server" name
> >>   and instance 0
> >> - now it unregisters "icp/server" for the 1st instance.
> >> 
> >> This is wrong at many levels:
> >> - we shouldn't have two VMSTATEDescriptions with the same name
> >> - In case this is the only solution that we can come up with, it needs to
> >>   be:
> >>   * register pre_2_10_vmstate_dummy_icp
> >>   * unregister pre_2_10_vmstate_dummy_icp
> >>   * register real vmstate_icp
> >> 
> >> As the initialization of this machine is already complex enough, I
> >> need help from PPC maintainers to fix this.
> >> 
> >> Volunteers?
> >> 
> >> CC: Cedric Le Goater 
> >> CC: Daniel Henrique Barboza 
> >> CC: David Gibson 
> >> CC: Greg Kurz 
> >> 
> >> Signed-off-by: Juan Quintela 
> >> ---
> >>  hw/ppc/spapr.c | 7 ++-
> >>  1 file changed, 6 insertions(+), 1 deletion(-)
> >> 
> >> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> >> index cb840676d3..8531d13492 100644
> >> --- a/hw/ppc/spapr.c
> >> +++ b/hw/ppc/spapr.c
> >> @@ -143,7 +143,12 @@ static bool pre_2_10_vmstate_dummy_icp_needed(void 
> >> *opaque)
> >>  }
> >>  
> >>  static const VMStateDescription pre_2_10_vmstate_dummy_icp = {
> >> -.name = "icp/server",
> >> +/*
> >> + * Hack ahead.  We can't have two devices with the same name and
> >> + * instance id.  So I rename this to pass make check.
> >> + * Real help from people who knows the hardware is needed.
> >> + */
> >> +.name = "pre-2.10-icp/server",
> >>  .version_id = 1,
> >>  .minimum_version_id = 1,
> >>  .needed = pre_2_10_vmstate_dummy_icp_needed,
> >
> > I guess this fix is acceptable as well and a lot simpler than
> > reverting the hack actually. Outcome is the same : drop
> > compat with pseries-2.9 and older.
> >
> > Reviewed-by: Greg Kurz 
> 
> I fully agree with you here.
> The other option given in this thread is to deprecate those machines, but I
> would like to have this series sooner than 2 releases.

Yeah and, especially, the deprecation of all these machine types is
itself a massive chunk of work as it will require identifying and
removing other related workarounds as well. Given that pretty much
everyone working in PPC/PAPR moved away, can the community handle
such a big change ?

>  And what ppc is
> doing here is (and has always been) a hack and an abuse of how
> vmstate registration is supposed to work.
> 

Sorry again... We should have involved migration experts at the time. :-)

> Thanks, Juan.
> 

Cheers,

-- 
Greg



Re: [PATCH 07/13] RFC migration: icp/server is a mess

2023-10-19 Thread Greg Kurz
On Thu, 19 Oct 2023 21:08:25 +0200
Juan Quintela  wrote:

> Current code does:
> - register pre_2_10_vmstate_dummy_icp with "icp/server" and instance
>   depending on cpu number
> - for newer machines, it registers vmstate_icp with "icp/server" name
>   and instance 0
> - now it unregisters "icp/server" for the 1st instance.
> 
> This is wrong at many levels:
> - we shouldn't have two VMSTATEDescriptions with the same name
> - In case this is the only solution that we can come up with, it needs to
>   be:
>   * register pre_2_10_vmstate_dummy_icp
>   * unregister pre_2_10_vmstate_dummy_icp
>   * register real vmstate_icp
> 
> As the initialization of this machine is already complex enough, I
> need help from PPC maintainers to fix this.
> 
> Volunteers?
> 
> CC: Cedric Le Goater 
> CC: Daniel Henrique Barboza 
> CC: David Gibson 
> CC: Greg Kurz 
> 
> Signed-off-by: Juan Quintela 
> ---
>  hw/ppc/spapr.c | 7 ++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index cb840676d3..8531d13492 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -143,7 +143,12 @@ static bool pre_2_10_vmstate_dummy_icp_needed(void *opaque)
>  }
>  
>  static const VMStateDescription pre_2_10_vmstate_dummy_icp = {
> -.name = "icp/server",
> +/*
> + * Hack ahead.  We can't have two devices with the same name and
> + * instance id.  So I rename this to pass make check.
> + * Real help from people who knows the hardware is needed.
> + */
> +.name = "pre-2.10-icp/server",
>  .version_id = 1,
>  .minimum_version_id = 1,
>  .needed = pre_2_10_vmstate_dummy_icp_needed,

I guess this fix is acceptable as well and a lot simpler than
reverting the hack actually. Outcome is the same : drop
compat with pseries-2.9 and older.

Reviewed-by: Greg Kurz 

-- 
Greg



Re: [PATCH 07/13] RFC migration: icp/server is a mess

2023-10-19 Thread Greg Kurz
Hi Juan,

On Thu, 19 Oct 2023 21:08:25 +0200
Juan Quintela  wrote:

> Current code does:
> - register pre_2_10_vmstate_dummy_icp with "icp/server" and instance
>   depending on cpu number
> - for newer machines, it registers vmstate_icp with "icp/server" name
>   and instance 0
> - now it unregisters "icp/server" for the 1st instance.
> 

Heh I remember about this hack... it was caused by some rework in
the interrupt controller that broke migration.

> This is wrong at many levels:
> - we shouldn't have two VMStateDescriptions with the same name

I don't know how bad it is. The idea here is to send extra
state in the stream because older QEMU expect it (but won't use
it), so it made sense to keep the same name.
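
To illustrate the intent, here is a standalone sketch only, not the spapr
code: MockVMSD, MockMachine and count_sections() are hypothetical names made
up for this example. The dummy entries share the real name and are gated by
a .needed-style callback, so they only reach the stream for machine types
whose older peers expect them:

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, heavily simplified stand-in for a VMStateDescription. */
typedef struct {
    const char *name;
    int instance_id;
    bool (*needed)(void *opaque);   /* NULL means "always sent" */
} MockVMSD;

/* Hypothetical machine state: models a compat flag of old machine types. */
typedef struct {
    bool pre_2_10_has_unused_icps;
} MockMachine;

/* The dummy state is only sent when the machine type asks for it. */
static bool dummy_icp_needed(void *opaque)
{
    const MockMachine *m = opaque;
    return m->pre_2_10_has_unused_icps;
}

/* Count the sections that would be emitted under a given name. */
static size_t count_sections(const MockVMSD *v, size_t n, void *opaque,
                             const char *name)
{
    size_t count = 0;

    for (size_t i = 0; i < n; i++) {
        if (strcmp(v[i].name, name) == 0 &&
            (!v[i].needed || v[i].needed(opaque))) {
            count++;
        }
    }
    return count;
}
```

With one real entry and two dummies, an old-style machine would emit three
"icp/server" sections and a newer one only the real one.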

> - In case this is the only solution that we can come up with, it needs to
>   be:
>   * register pre_2_10_vmstate_dummy_icp
>   * unregister pre_2_10_vmstate_dummy_icp
>   * register real vmstate_icp
> 
> As the initialization of this machine is already complex enough, I
> need help from PPC maintainers to fix this.
> 

What about dropping all this code, i.e. basically reverting 46f7afa37096
("spapr: fix migration of ICPState objects from/to older QEMU") ?

Unless we still care to migrate pseries machine types from 2017 of
course...

> Volunteers?
> 

Having not worked on PPC for almost two years, I certainly have neither the
time nor the motivation to fix this. I might be able to answer some questions or to
review someone else's patch that gets rid of the offending code, at best.

Cheers,

--
Greg


> CC: Cedric Le Goater 
> CC: Daniel Henrique Barboza 
> CC: David Gibson 
> CC: Greg Kurz 
> 
> Signed-off-by: Juan Quintela 
> ---
>  hw/ppc/spapr.c | 7 ++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index cb840676d3..8531d13492 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -143,7 +143,12 @@ static bool pre_2_10_vmstate_dummy_icp_needed(void *opaque)
>  }
>  
>  static const VMStateDescription pre_2_10_vmstate_dummy_icp = {
> -.name = "icp/server",
> +/*
> + * Hack ahead.  We can't have two devices with the same name and
> + * instance id.  So I rename this to pass make check.
> + * Real help from people who knows the hardware is needed.
> + */
> +.name = "pre-2.10-icp/server",
>  .version_id = 1,
>  .minimum_version_id = 1,
>  .needed = pre_2_10_vmstate_dummy_icp_needed,



-- 
Greg



Re: [PATCH] MAINTAINERS: Nick Piggin PPC maintainer, other PPC changes

2023-09-15 Thread Greg Kurz
On Fri, 15 Sep 2023 08:05:07 -0300
Daniel Henrique Barboza  wrote:

> Update all relevant PowerPC entries as follows:
> 
> - Nick Piggin is promoted to Maintainer in all qemu-ppc subsystems.
>   Nick has been a solid contributor for the last couple of years and
>   has the required knowledge and motivation to drive the boat.
> 

Have a good trip Nick ! :-)

> - Greg Kurz is being removed from all qemu-ppc entries. Greg has moved
>   to other areas of interest and will retire from qemu-ppc.  Thanks Mr
>   Kurz for all the years of service.
> 

My pleasure !

> - David Gibson was removed as 'Reviewer' from PowerPC TCG CPUs and PPC
>   KVM CPUs. Change done per his request.
> 
> - Daniel Barboza downgraded from 'Maintainer' to 'Reviewer' in sPAPR and
>   PPC KVM CPUs. It has been a long time since I last touched those areas and
>   it's not justified to be kept as maintainer in them.
> 
> - Cedric Le Goater and Daniel Barboza removed as 'Reviewer' in VOF. We
>   don't have the required knowledge to justify it.
> 
> - VOF support downgraded from 'Maintained' to 'Odd Fixes' since it
>   better reflects the current state of the subsystem.
> 
> Acked-by: Cédric Le Goater 
> Signed-off-by: Daniel Henrique Barboza 
> ---

Acked-by: Greg Kurz 

Cheers,

-- 
Greg

>  MAINTAINERS | 20 +++-
>  1 file changed, 7 insertions(+), 13 deletions(-)
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 00562f924f..c4aa1c1c9f 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -298,11 +298,9 @@ F: hw/openrisc/
>  F: tests/tcg/openrisc/
>  
>  PowerPC TCG CPUs
> +M: Nicholas Piggin 
>  M: Daniel Henrique Barboza 
>  R: Cédric Le Goater 
> -R: David Gibson 
> -R: Greg Kurz 
> -R: Nicholas Piggin 
>  L: qemu-...@nongnu.org
>  S: Odd Fixes
>  F: target/ppc/
> @@ -438,10 +436,9 @@ F: target/mips/kvm*
>  F: target/mips/sysemu/
>  
>  PPC KVM CPUs
> -M: Daniel Henrique Barboza 
> +M: Nicholas Piggin 
> +R: Daniel Henrique Barboza 
>  R: Cédric Le Goater 
> -R: David Gibson 
> -R: Greg Kurz 
>  S: Odd Fixes
>  F: target/ppc/kvm.c
>  
> @@ -1430,10 +1427,10 @@ F: include/hw/rtc/m48t59.h
>  F: tests/avocado/ppc_prep_40p.py
>  
>  sPAPR (pseries)
> -M: Daniel Henrique Barboza 
> +M: Nicholas Piggin 
> +R: Daniel Henrique Barboza 
>  R: Cédric Le Goater 
>  R: David Gibson 
> -R: Greg Kurz 
>  R: Harsh Prateek Bora 
>  L: qemu-...@nongnu.org
>  S: Odd Fixes
> @@ -1452,8 +1449,8 @@ F: tests/avocado/ppc_pseries.py
>  
>  PowerNV (Non-Virtualized)
>  M: Cédric Le Goater 
> +M: Nicholas Piggin 
>  R: Frédéric Barrat 
> -R: Nicholas Piggin 
>  L: qemu-...@nongnu.org
>  S: Odd Fixes
>  F: docs/system/ppc/powernv.rst
> @@ -1497,12 +1494,9 @@ F: include/hw/pci-host/mv64361.h
>  
>  Virtual Open Firmware (VOF)
>  M: Alexey Kardashevskiy 
> -R: Cédric Le Goater 
> -R: Daniel Henrique Barboza 
>  R: David Gibson 
> -R: Greg Kurz 
>  L: qemu-...@nongnu.org
> -S: Maintained
> +S: Odd Fixes
>  F: hw/ppc/spapr_vof*
>  F: hw/ppc/vof*
>  F: include/hw/ppc/vof*



Re: [PATCH v5 8/9] fsdev: Use ThrottleDirection instread of bool is_write

2023-08-07 Thread Greg Kurz
On Fri, 28 Jul 2023 10:20:05 +0800
zhenwei pi  wrote:

> The 'bool is_write' style is obsolete in the throttle framework; adapt
> fsdev to the new style.
> 
> Cc: Greg Kurz 
> Reviewed-by: Hanna Czenczek 
> Signed-off-by: zhenwei pi 

Reviewed-by: Greg Kurz 
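
For readers following along, a minimal standalone sketch of the idea, not
the fsdev code: the enum below is redefined locally on the assumption that
it matches the throttle framework's layout, and SketchThrottle and
sketch_enqueue() are hypothetical names. Indexing the per-direction array
with the enum keeps the same two slots the old bool used (false/true map to
0/1) while making the direction explicit at call sites:

```c
#include <assert.h>
#include <stddef.h>

/* Assumed to mirror the enum named in the patch; redefined so this stands alone. */
typedef enum {
    THROTTLE_READ = 0,
    THROTTLE_WRITE,
    THROTTLE_MAX
} ThrottleDirection;

/* Hypothetical stand-in for the per-direction queue array in FsThrottle. */
typedef struct {
    size_t pending[THROTTLE_MAX];
} SketchThrottle;

/* Record one pending request in the slot selected by the direction. */
static void sketch_enqueue(SketchThrottle *t, ThrottleDirection direction)
{
    /* Same guard as the patch: reject anything that is not a real direction. */
    assert(direction < THROTTLE_MAX);
    t->pending[direction]++;
}
```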

> ---
>  fsdev/qemu-fsdev-throttle.c | 14 +++---
>  fsdev/qemu-fsdev-throttle.h |  4 ++--
>  hw/9pfs/cofile.c|  4 ++--
>  3 files changed, 11 insertions(+), 11 deletions(-)
> 
> diff --git a/fsdev/qemu-fsdev-throttle.c b/fsdev/qemu-fsdev-throttle.c
> index 1c137d6f0f..d912da906d 100644
> --- a/fsdev/qemu-fsdev-throttle.c
> +++ b/fsdev/qemu-fsdev-throttle.c
> @@ -94,22 +94,22 @@ void fsdev_throttle_init(FsThrottle *fst)
>  }
>  }
>  
> -void coroutine_fn fsdev_co_throttle_request(FsThrottle *fst, bool is_write,
> +void coroutine_fn fsdev_co_throttle_request(FsThrottle *fst,
> +ThrottleDirection direction,
>  struct iovec *iov, int iovcnt)
>  {
> -ThrottleDirection direction = is_write ? THROTTLE_WRITE : THROTTLE_READ;
> -
> +assert(direction < THROTTLE_MAX);
>  if (throttle_enabled(&fst->cfg)) {
>  if (throttle_schedule_timer(&fst->ts, &fst->tt, direction) ||
> -!qemu_co_queue_empty(&fst->throttled_reqs[is_write])) {
> -qemu_co_queue_wait(&fst->throttled_reqs[is_write], NULL);
> +!qemu_co_queue_empty(&fst->throttled_reqs[direction])) {
> +qemu_co_queue_wait(&fst->throttled_reqs[direction], NULL);
>  }
>  
>  throttle_account(&fst->ts, direction, iov_size(iov, iovcnt));
>  
> -if (!qemu_co_queue_empty(&fst->throttled_reqs[is_write]) &&
> +if (!qemu_co_queue_empty(&fst->throttled_reqs[direction]) &&
>  !throttle_schedule_timer(&fst->ts, &fst->tt, direction)) {
> -qemu_co_queue_next(&fst->throttled_reqs[is_write]);
> +qemu_co_queue_next(&fst->throttled_reqs[direction]);
>  }
>  }
>  }
> diff --git a/fsdev/qemu-fsdev-throttle.h b/fsdev/qemu-fsdev-throttle.h
> index a21aecddc7..daa8ca2494 100644
> --- a/fsdev/qemu-fsdev-throttle.h
> +++ b/fsdev/qemu-fsdev-throttle.h
> @@ -23,14 +23,14 @@ typedef struct FsThrottle {
>  ThrottleState ts;
>  ThrottleTimers tt;
>  ThrottleConfig cfg;
> -CoQueue  throttled_reqs[2];
> +CoQueue  throttled_reqs[THROTTLE_MAX];
>  } FsThrottle;
>  
>  int fsdev_throttle_parse_opts(QemuOpts *, FsThrottle *, Error **);
>  
>  void fsdev_throttle_init(FsThrottle *);
>  
> -void coroutine_fn fsdev_co_throttle_request(FsThrottle *, bool ,
> +void coroutine_fn fsdev_co_throttle_request(FsThrottle *, ThrottleDirection ,
>  struct iovec *, int);
>  
>  void fsdev_throttle_cleanup(FsThrottle *);
> diff --git a/hw/9pfs/cofile.c b/hw/9pfs/cofile.c
> index 9c5344039e..71174c3e4a 100644
> --- a/hw/9pfs/cofile.c
> +++ b/hw/9pfs/cofile.c
> @@ -252,7 +252,7 @@ int coroutine_fn v9fs_co_pwritev(V9fsPDU *pdu, 
> V9fsFidState *fidp,
>  if (v9fs_request_cancelled(pdu)) {
>  return -EINTR;
>  }
> -fsdev_co_throttle_request(s->ctx.fst, true, iov, iovcnt);
> +fsdev_co_throttle_request(s->ctx.fst, THROTTLE_WRITE, iov, iovcnt);
>  v9fs_co_run_in_worker(
>  {
>  err = s->ops->pwritev(&s->ctx, &fidp->fs, iov, iovcnt, offset);
> @@ -272,7 +272,7 @@ int coroutine_fn v9fs_co_preadv(V9fsPDU *pdu, 
> V9fsFidState *fidp,
>  if (v9fs_request_cancelled(pdu)) {
>  return -EINTR;
>  }
> -fsdev_co_throttle_request(s->ctx.fst, false, iov, iovcnt);
> +fsdev_co_throttle_request(s->ctx.fst, THROTTLE_READ, iov, iovcnt);
>  v9fs_co_run_in_worker(
>  {
>  err = s->ops->preadv(&s->ctx, &fidp->fs, iov, iovcnt, offset);



-- 
Greg



Re: [PATCH v6] ppc: Enable 2nd DAWR support on p10

2023-07-08 Thread Greg Kurz
On Fri, 7 Jul 2023 21:31:47 +0530
Shivaprasad G Bhat  wrote:

> On 7/7/23 17:52, Daniel Henrique Barboza wrote:
> >
> >
> > On 7/7/23 08:59, Greg Kurz wrote:
> >> Hi Daniel and Shiva !
> >>
> >> On Fri, 7 Jul 2023 08:09:47 -0300
> >> Daniel Henrique Barboza  wrote:
> >>
> >>> This one was a buzzer shot.
> >>>
> >>
> >> Indeed ! :-) I would have appreciated some more time to re-assess
> >> my R-b tag on this 2 year old bug though ;-)
> >
> > My bad! I never thought it was that old. Never occurred to me to check when
> > the previous version was sent.
> >
> > Folks, please bear in mind that a Reviewed-by is given in the context of when
> > the patch was sent. A handful of months? Keep the R-bs. 6 months, from one
> > release to the other? Things start to get a little murky. 2 years? hahaha c'mon
> 
> 
> Apologies, since v5 didn't need any rework I retained the Reviewed-bys.
> 
> I agree, I should have been explicit in changelog about how old it is.
> 
> 
> > At the very least you need to point out that the acks are old.
> >
> >
> >>
> >> My concerns were that the DAWR1 spapr cap was still not enabled by
> >> default but I guess it is because POWER9 is still the default cpu
> >> type. Related, the apply function should probably spit a warning
> >> with TCG instead of failing, like already done for some other
> >> TCG limitations (e.g. cap_safe_bounds_check_apply()). This will
> >> be needed for `make test` to succeed when DAWR1 is eventually
> >> enabled by default. Not needed right now.
> >>
> Thanks Greg, I will convert the errors to warnings for the DAWR1 cap checks
> in the next version. However, I don't see any new "make test" failures
> with the patch.
> 
> Here are the logs "make test",
> 
> With patch - 
> https://gist.github.com/shivaprasadbhat/859f7f4a0c105ac1232b7ab5d8e161e8#file-gistfile1-txt
> 
> Without patch - 
> https://gist.github.com/shivaprasadbhat/25e5db9254cbe3292017f16adf41ecc1#file-gistfile1-txt
> 

"make test" failures will happen only when DAWR1 is enabled by default.
Retry your test with this change in spapr_machine_class_init() :

-    smc->default_caps.caps[SPAPR_CAP_DAWR1] = SPAPR_CAP_OFF;
+    smc->default_caps.caps[SPAPR_CAP_DAWR1] = SPAPR_CAP_ON;

> 
> >> My R-b still stands then ! :-)
> >
> > This patch got lucky then. If you/Cedric remove your acks I would simply
> > drop the patch and re-send the PR with the greatest of ease, no remorse
> > whatsoever.
> >
> >
> > Thanks,
> >
> > Daniel
> >
> >>
> >> Cheers,
> >>
> >> -- 
> >> Greg
> >>
> >>>
> >>> Queued in gitlab.com/danielhb/qemu/tree/ppc-next. Thanks,
> >>>
> >>>
> >>> Daniel
> >>>
> >>>
> >>> On 7/7/23 05:47, Shivaprasad G Bhat wrote:
> >>>> From: Ravi Bangoria 
> >>>>
> >>>> As per the PAPR, bit 0 of byte 64 in pa-features property
> >>>> indicates availability of 2nd DAWR registers. i.e. If this bit is set, 2nd
> >>>> DAWR is present, otherwise not. Use KVM_CAP_PPC_DAWR1 capability to find
> >>>> whether kvm supports 2nd DAWR or not. If it's supported, allow user to set
> >>>> the pa-feature bit in guest DT using cap-dawr1 machine capability. Though,
> >>>> watchpoint on powerpc TCG guest is not supported and thus 2nd DAWR is not
> >>>> enabled for TCG mode.
> >>>>
> >>>> Signed-off-by: Ravi Bangoria 
> >>>> Reviewed-by: Greg Kurz 
> >>>> Reviewed-by: Cédric Le Goater 
> >>>> Signed-off-by: Shivaprasad G Bhat 
> >>>> ---
> >>>> Changelog:
> >>>> v5: 
> >>>> https://lore.kernel.org/all/20210412114433.129702-1-ravi.bango...@linux.ibm.com/
> >>>> v5->v6:
> >>>>     - The other patches in the original series already merged.
> >>>>     - Rebased to the top of the tree. So, the 
> >>>> gen_spr_book3s_310_dbg() is renamed
> >>>>   to register_book3s_310_dbg_sprs() and moved to cpu_init.c 
> >>>> accordingly.
> >>>>     - No functional changes.
> >>>>
> >>>> v4: 
> >>>> https://lore.kern

Re: [PATCH v6] ppc: Enable 2nd DAWR support on p10

2023-07-07 Thread Greg Kurz
Hi Daniel and Shiva !

On Fri, 7 Jul 2023 08:09:47 -0300
Daniel Henrique Barboza  wrote:

> This one was a buzzer shot.
> 

Indeed ! :-) I would have appreciated some more time to re-assess
my R-b tag on this 2 year old bug though ;-)

My concerns were that the DAWR1 spapr cap was still not enabled by
default but I guess it is because POWER9 is still the default cpu
type. Related, the apply function should probably spit a warning
with TCG instead of failing, like already done for some other
TCG limitations (e.g. cap_safe_bounds_check_apply()). This will
be needed for `make test` to succeed when DAWR1 is eventually
enabled by default. Not needed right now.
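
To make the suggestion concrete, a standalone sketch under stated
assumptions: this is not the spapr_caps.c code, and ApplyResult and
cap_dawr1_apply_sketch() are hypothetical names. The point is only the
ordering of the checks: with TCG the request is downgraded to a warning,
while a missing KVM capability remains a hard error.

```c
#include <stdbool.h>
#include <stdio.h>

typedef enum { APPLY_OK, APPLY_WARNED, APPLY_FAILED } ApplyResult;

/*
 * Hypothetical model of a capability "apply" hook.
 *   cap_on        - the user (or the machine default) asked for cap-dawr1=on
 *   tcg           - the guest runs under TCG rather than KVM
 *   kvm_has_dawr1 - the host kernel reports the DAWR1 capability
 */
static ApplyResult cap_dawr1_apply_sketch(bool cap_on, bool tcg,
                                          bool kvm_has_dawr1)
{
    if (!cap_on) {
        return APPLY_OK;            /* nothing requested, nothing to check */
    }
    if (tcg) {
        /* TCG limitation: warn and carry on so default-on machines still start */
        fprintf(stderr, "warning: TCG doesn't support a 2nd DAWR\n");
        return APPLY_WARNED;
    }
    if (!kvm_has_dawr1) {
        /* With KVM, a missing host capability stays fatal */
        fprintf(stderr, "error: KVM lacks 2nd DAWR support\n");
        return APPLY_FAILED;
    }
    return APPLY_OK;
}
```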

My R-b still stands then ! :-)

Cheers,

--
Greg

> 
> Queued in gitlab.com/danielhb/qemu/tree/ppc-next. Thanks,
> 
> 
> Daniel
> 
> 
> On 7/7/23 05:47, Shivaprasad G Bhat wrote:
> > From: Ravi Bangoria 
> > 
> > As per the PAPR, bit 0 of byte 64 in pa-features property
> > indicates availability of 2nd DAWR registers. i.e. If this bit is set, 2nd
> > DAWR is present, otherwise not. Use KVM_CAP_PPC_DAWR1 capability to find
> > whether kvm supports 2nd DAWR or not. If it's supported, allow user to set
> > the pa-feature bit in guest DT using cap-dawr1 machine capability. Though,
> > watchpoint on powerpc TCG guest is not supported and thus 2nd DAWR is not
> > enabled for TCG mode.
> > 
> > Signed-off-by: Ravi Bangoria 
> > Reviewed-by: Greg Kurz 
> > Reviewed-by: Cédric Le Goater 
> > Signed-off-by: Shivaprasad G Bhat 
> > ---
> > Changelog:
> > v5: 
> > https://lore.kernel.org/all/20210412114433.129702-1-ravi.bango...@linux.ibm.com/
> > v5->v6:
> >- The other patches in the original series already merged.
> > >- Rebased to the top of the tree. So, gen_spr_book3s_310_dbg() is renamed
> > >  to register_book3s_310_dbg_sprs() and moved to cpu_init.c accordingly.
> >- No functional changes.
> > 
> > v4: 
> > https://lore.kernel.org/r/20210406053833.282907-1-ravi.bango...@linux.ibm.com
> > v3->v4:
> >- Make error message more proper.
> > 
> > v3: 
> > https://lore.kernel.org/r/20210330095350.36309-1-ravi.bango...@linux.ibm.com
> > v3->v4:
> >- spapr_dt_pa_features(): POWER10 processor is compatible with 3.0
> >  (PCR_COMPAT_3_00). No need to ppc_check_compat(3_10) for now as
> > >  ppc_check_compat(3_00) will also be true. ppc_check_compat(3_10)
> >  can be added while introducing pa_features_310 in future.
> >- Use error_append_hint() for hints. Also add ERRP_GUARD().
> >- Add kvmppc_set_cap_dawr1() stub function for CONFIG_KVM=n.
> > 
> > v2: 
> > https://lore.kernel.org/r/20210329041906.213991-1-ravi.bango...@linux.ibm.com
> > v2->v3:
> >- Don't introduce pa_features_310[], instead, reuse pa_features_300[]
> >  for 3.1 guests, as there is no difference between initial values of
> >  them atm.
> >- Call gen_spr_book3s_310_dbg() from init_proc_POWER10() instead of
> >  init_proc_POWER8(). Also, Don't call gen_spr_book3s_207_dbg() from
> >  gen_spr_book3s_310_dbg() as init_proc_POWER10() already calls it.
> > 
> > v1: 
> > https://lore.kernel.org/r/20200723104220.314671-1-ravi.bango...@linux.ibm.com
> > v1->v2:
> >- Introduce machine capability cap-dawr1 to enable/disable
> >  the feature. By default, 2nd DAWR is OFF for guests even
> >  when host kvm supports it. User has to manually enable it
> >  with -machine cap-dawr1=on if he wishes to use it.
> >- Split the header file changes into separate patch. (Sync
> >  headers from v5.12-rc3)
> > 
> > [1] https://git.kernel.org/torvalds/c/bd1de1a0e6eff
> > 
> >   hw/ppc/spapr.c |7 ++-
> >   hw/ppc/spapr_caps.c|   32 
> >   include/hw/ppc/spapr.h |6 +-
> >   target/ppc/cpu.h   |2 ++
> >   target/ppc/cpu_init.c  |   15 +++
> >   target/ppc/kvm.c   |   12 
> >   target/ppc/kvm_ppc.h   |   12 
> >   7 files changed, 84 insertions(+), 2 deletions(-)
> > 
> > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > index 54dbfd7fe9..1e54e0c719 100644
> > --- a/hw/ppc/spapr.c
> > +++ b/hw/ppc/spapr.c
> > @@ -241,7 +241,7 @@ static void spapr_dt_pa_features(SpaprMachineState 
> > *spapr,
> >   0x80, 0x00, 0x80, 0x00, 0x80, 0x00, /* 48 - 53 */
> >   /* 54: DecFP, 56: DecI, 58: SHA */
> >   0x80, 0x00, 0x80, 0x00, 0x80, 0x00, /* 54 - 59 */
> > -/* 60: NM atomic, 62: RNG */

Re: [PATCH] MAINTAINERS: raise status of 9p to 'Maintained'

2023-07-03 Thread Greg Kurz
On Mon, 3 Jul 2023 16:34:17 +0200
Christian Schoenebeck  wrote:

> Change status of 9p from 'Odd Fixes' to 'Maintained', as this better
> reflects the current situation. I have already been taking care of 9p
> patches for a while, which included new features as well.
> 

Thanks for the good work ! :-)

> Based-on: 
> Signed-off-by: Christian Schoenebeck 
> ---

Reviewed-by: Greg Kurz 

>  MAINTAINERS | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index e8a3205eb4..71f2479ec5 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -2120,7 +2120,7 @@ F: include/sysemu/balloon.h
>  virtio-9p
>  M: Greg Kurz 
>  M: Christian Schoenebeck 
> -S: Odd Fixes
> +S: Maintained
>  W: https://wiki.qemu.org/Documentation/9p
>  F: hw/9pfs/
>  X: hw/9pfs/xen-9p*



-- 
Greg



Re: [PATCH v4] target: ppc: Use MSR_HVB bit to get the target endianness for memory dump

2023-07-03 Thread Greg Kurz
On Fri, 23 Jun 2023 03:25:06 -0400
Narayana Murty N  wrote:

> Currently on PPC64 qemu always dumps the guest memory in
> Big Endian (BE) format even when the guest is running in Little Endian
> (LE) mode. So the crash tool fails to load the dump as illustrated below:
> 
> Log :
> $ virsh dump DOMAIN --memory-only dump.file
> 
> Domain 'DOMAIN' dumped to dump.file
> 
> $ crash vmlinux dump.file
> 
> 
> crash 8.0.2-1.el9
> 
> WARNING: endian mismatch:
>   crash utility: little-endian
>   dump.file: big-endian
> 
> WARNING: machine type mismatch:
>   crash utility: PPC64
>   dump.file: (unknown)
> 
> crash: dump.file: not a supported file format
> 
> 
> This happens because cpu_get_dump_info() passes cpu->env.has_hv_mode
> to ppc_interrupts_little_endian(); cpu->env.has_hv_mode is always set
> for powerNV even when the guest is not running in hv mode.
> The hv mode should be taken from the msr_mask MSR_HVB bit
> (cpu->env.msr_mask & MSR_HVB). This patch fixes the issue by passing
> the MSR_HVB value to ppc_interrupts_little_endian() in order to determine
> the guest endianness.
> 
> The crash tool also expects the guest kernel endianness to match the
> endianness of the dump.
> 
> The patch was tested on POWER9 box booted with Linux as host in
> following cases:
> 
> Host-Endianness  Qemu-Target-Machine                   Qemu-Generated-Guest
>                                                        Memory-Dump-Format
> BE               powernv(OPAL/PowerNV)                 LE
> BE               powernv(OPAL/PowerNV)                 BE
> LE               powernv(OPAL/PowerNV)                 LE
> LE               powernv(OPAL/PowerNV)                 BE
> LE               pseries(OPAL/PowerNV/pSeries) KVMHV   LE
> LE               pseries TCG                           LE
> 
> Fixes: 5609400a4228 ("target/ppc: Set the correct endianness for powernv memory dumps")
> Signed-off-by: Narayana Murty N 
> ---

Thanks !

Reviewed-by: Greg Kurz 
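
As a side note, a tiny standalone sketch of the mask test used by the fix.
The SKETCH_* names are hypothetical, and the bit position (60) is an
assumption made for this example rather than something stated in the patch.
The `!!` collapses the masked 64-bit value to exactly 0 or 1 before it is
handed to a bool parameter:

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed position of the HV bit, for illustration only. */
#define SKETCH_MSR_HV_SHIFT 60
#define SKETCH_MSR_HVB (UINT64_C(1) << SKETCH_MSR_HV_SHIFT)

/* True when the HV bit is present in the given MSR mask. */
static bool sketch_mask_has_hv(uint64_t msr_mask)
{
    return !!(msr_mask & SKETCH_MSR_HVB);
}
```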

> Changes since V3:
> commit message modified as per feedback from Greg Kurz, Cédric Le
> Goater and Nicholas Piggin.
> Changes since V2:
> commit message modified as per feedback from Nicholas Piggin.
> Changes since V1:
> https://lore.kernel.org/qemu-devel/20230420145055.10196-1-nnmli...@linux.ibm.com/
> The approach to solve the issue was changed based on feedback from
> Fabiano Rosas on patch V1.
> ---
>  target/ppc/arch_dump.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/target/ppc/arch_dump.c b/target/ppc/arch_dump.c
> index f58e6359d5..a8315659d9 100644
> --- a/target/ppc/arch_dump.c
> +++ b/target/ppc/arch_dump.c
> @@ -237,7 +237,7 @@ int cpu_get_dump_info(ArchDumpInfo *info,
>  info->d_machine = PPC_ELF_MACHINE;
>  info->d_class = ELFCLASS;
>  
> -if (ppc_interrupts_little_endian(cpu, cpu->env.has_hv_mode)) {
> +if (ppc_interrupts_little_endian(cpu, !!(cpu->env.msr_mask & MSR_HVB))) {
>  info->d_endian = ELFDATA2LSB;
>  } else {
>  info->d_endian = ELFDATA2MSB;



-- 
Greg



Re: [PATCH] ppc: spapr: Fix device tree entries in absence of XIVE native mode

2023-06-30 Thread Greg Kurz
On Fri, 30 Jun 2023 11:00:56 +0530
Gautam Menghani  wrote:

> Currently, XIVE native exploitation mode is not supported in nested
> guests. When we boot up a nested guest on PowerNV platform, we observe 
> the following entries in the device tree of nested guest:
> 
> ```
> device_type = "power-ivpe";
> compatible = "ibm,power-ivpe";
> ```
> 
> But as per LoPAR section B.5.9[1], these entries should only be present
> when XIVE native exploitation mode is being used. Presently, there is no 
> support for nested virtualization in the context of XIVE, and hence, DT 
> shouldn't advertise support for XIVE interrupt controller to a nested guest. 
> 
> Also, according to the present behaviour, when we boot a nested KVM
> guest, the following QEMU warnings are reported   :
> ```
> Calling ibm,client-architecture-support...qemu-system-ppc64: warning: 
> kernel_irqchip allowed but unavailable: IRQ_XIVE capability must be present 
> for KVM
> Falling back to kernel-irqchip=off

This is expected since the XIVE native mode is only available
on bare metal... arguably the warning could be silenced but
it was deemed informative

> .
> .
> .
> [0.00][T0] xive: Using IRQ range [0-0]
> [0.00][T0] xive: Interrupt handling initialized with spapr backend
> [0.00][T0] xive: Using priority 6 for all interrupts
> [0.00][T0] xive: Using 64kB queues
> ```
> 
> With this patch, the above warnings are no longer observed in nested guest's 
> dmesg and also the device tree contains the following entries:
> ```
> device_type = "PowerPC-External-Interrupt-Presentation";
> compatible = "IBM,ppc-xicp";
> ```
> 

Hmm... this is a behavior change : same QEMU invocation that would run XIVE
before will now run XICS... :-\

> Also add an additional check to handle the scenarios where
> ic-mode= is explicitly specified by user - make the code error out
> when XIVE native capability is not there and user specifies
> ic-mode=xive.
> 
> Testing:
> 1. This patch has been tested on a P9 PowerNV machine by spinning up both a
> KVM guest and nested KVM guest. The guest can use XIVE native mode just fine
> with correct DT entries and for nested guest, interrupt emulation is being
> used and the DT contains correct entries.
> 
> 2. This patch also has been tested on KVM on PowerVM platform. In this
> scenario, we can boot up a KVM guest on top of a Power Hypervisor guest.
> Kernel patches - 
> lore.kernel.org/linuxppc-dev/20230605064848.12319-1-...@linux.vnet.ibm.com
> QEMU tree to test - github.com/mikey/qemu/tree/kvm-papr
> 
> [1] : https://files.openpower.foundation/s/ZmtZyCGiJ2oJHim
> 
> Signed-off-by: Gautam Menghani 
> ---
>  hw/ppc/spapr.c |  2 +-
>  hw/ppc/spapr_irq.c | 14 +-
>  2 files changed, 14 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 54dbfd7fe9..6434742369 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -2840,7 +2840,7 @@ static void spapr_machine_init(MachineState *machine)
>  spapr_ovec_set(spapr->ov5, OV5_DRMEM_V2);
>  
>  /* advertise XIVE on POWER9 machines */
> -if (spapr->irq->xive) {
> +if (kvmppc_has_cap_xive() && spapr->irq->xive) {

Nak. The behavior of the machine is only dictated by the
command line arguments passed to QEMU, not by the host
capabilities. This is to guarantee migrability.

If the machine is expected to expose XIVE but the host
doesn't support it, e.g. boston P9 machines, then QEMU
falls back on emulating it.
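
Roughly, and only as a standalone sketch (the Sketch* types and sketch_*()
helpers are hypothetical, not the spapr_irq.c code): what the guest is told
depends on the command line alone, while the host capability only selects
where the controller is implemented.

```c
#include <stdbool.h>

typedef enum { SKETCH_IC_XICS, SKETCH_IC_XIVE, SKETCH_IC_DUAL } SketchIcMode;
typedef enum { SKETCH_BACKEND_KERNEL, SKETCH_BACKEND_EMULATED } SketchBackend;

/* Guest-visible decision: a function of the requested ic-mode only. */
static bool sketch_advertise_xive(SketchIcMode ic_mode)
{
    return ic_mode == SKETCH_IC_XIVE || ic_mode == SKETCH_IC_DUAL;
}

/* Host-side decision: in-kernel when possible, otherwise emulate in QEMU. */
static SketchBackend sketch_xive_backend(bool kernel_irqchip_allowed,
                                         bool host_has_xive_cap)
{
    if (kernel_irqchip_allowed && host_has_xive_cap) {
        return SKETCH_BACKEND_KERNEL;
    }
    return SKETCH_BACKEND_EMULATED;
}
```

Two hosts with different capabilities then present the same machine to the
guest, which is what keeps it migratable between them.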

>  spapr_ovec_set(spapr->ov5, OV5_XIVE_EXPLOIT);
>  }
>  
> diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
> index a0d1e1298e..856bba042a 100644
> --- a/hw/ppc/spapr_irq.c
> +++ b/hw/ppc/spapr_irq.c
> @@ -20,6 +20,7 @@
>  #include "hw/qdev-properties.h"
>  #include "cpu-models.h"
>  #include "sysemu/kvm.h"
> +#include "kvm_ppc.h"
>  
>  #include "trace.h"
>  
> @@ -294,6 +295,7 @@ uint32_t spapr_irq_nr_msis(SpaprMachineState *spapr)
>  void spapr_irq_init(SpaprMachineState *spapr, Error **errp)
>  {
>  SpaprMachineClass *smc = SPAPR_MACHINE_GET_CLASS(spapr);
> +bool cap_xive = kvmppc_has_cap_xive();
>  
>  if (kvm_enabled() && kvm_kernel_irqchip_split()) {
>  error_setg(errp, "kernel_irqchip split mode not supported on 
> pseries");
> @@ -304,6 +306,16 @@ void spapr_irq_init(SpaprMachineState *spapr, Error 
> **errp)
>  return;
>  }
>  
> +/*
> + * Check for valid ic-mode - XIVE native won't work if hypervisor doesn't
> + * have support
> + */
> +if (!cap_xive && !spapr->irq->xics) {
> +error_setg(errp,
> +"XIVE native mode not available, don't use ic-mode=xive");
> +return;
> +}
> +
>  /* Initialize the MSI IRQ allocator. */
>  spapr_irq_msi_init(spapr);
>  
> @@ -323,7 +335,7 @@ void spapr_irq_init(SpaprMachineState *spapr, Error 
> **errp)
>  spapr->ics = ICS_SPAPR(obj);
>  }
>  
> -if (spapr->irq->xive) {
> +if (cap_xive && 

Re: [PATCH v3 6/6] target/ppc: Remove pointless checks of CONFIG_USER_ONLY in 'kvm_ppc.h'

2023-06-29 Thread Greg Kurz
On Tue, 27 Jun 2023 13:51:24 +0200
Philippe Mathieu-Daudé  wrote:

> Signed-off-by: Philippe Mathieu-Daudé 
> ---

Reviewed-by: Greg Kurz 

>  target/ppc/kvm_ppc.h | 5 -
>  1 file changed, 5 deletions(-)
> 
> diff --git a/target/ppc/kvm_ppc.h b/target/ppc/kvm_ppc.h
> index 901e188c9a..6a4dd9c560 100644
> --- a/target/ppc/kvm_ppc.h
> +++ b/target/ppc/kvm_ppc.h
> @@ -42,7 +42,6 @@ int kvmppc_booke_watchdog_enable(PowerPCCPU *cpu);
>  target_ulong kvmppc_configure_v3_mmu(PowerPCCPU *cpu,
>   bool radix, bool gtse,
>   uint64_t proc_tbl);
> -#ifndef CONFIG_USER_ONLY
>  bool kvmppc_spapr_use_multitce(void);
>  int kvmppc_spapr_enable_inkernel_multitce(void);
>  void *kvmppc_create_spapr_tce(uint32_t liobn, uint32_t page_shift,
> @@ -52,7 +51,6 @@ int kvmppc_remove_spapr_tce(void *table, int pfd, uint32_t 
> window_size);
>  int kvmppc_reset_htab(int shift_hint);
>  uint64_t kvmppc_vrma_limit(unsigned int hash_shift);
>  bool kvmppc_has_cap_spapr_vfio(void);
> -#endif /* !CONFIG_USER_ONLY */
>  bool kvmppc_has_cap_epr(void);
>  int kvmppc_define_rtas_kernel_token(uint32_t token, const char *function);
>  int kvmppc_get_htab_fd(bool write, uint64_t index, Error **errp);
> @@ -262,7 +260,6 @@ static inline void kvmppc_set_reg_tb_offset(PowerPCCPU 
> *cpu, int64_t tb_offset)
>  {
>  }
>  
> -#ifndef CONFIG_USER_ONLY
>  static inline bool kvmppc_spapr_use_multitce(void)
>  {
>  return false;
> @@ -322,8 +319,6 @@ static inline void kvmppc_write_hpte(hwaddr ptex, 
> uint64_t pte0, uint64_t pte1)
>  abort();
>  }
>  
> -#endif /* !CONFIG_USER_ONLY */
> -
>  static inline bool kvmppc_has_cap_epr(void)
>  {
>  return false;



-- 
Greg



Re: [PATCH v3 4/6] target/ppc: Define TYPE_HOST_POWERPC_CPU in cpu-qom.h

2023-06-29 Thread Greg Kurz
On Tue, 27 Jun 2023 13:51:22 +0200
Philippe Mathieu-Daudé  wrote:

> TYPE_HOST_POWERPC_CPU is used in various places of cpu_init.c,
> in order to restrict "kvm_ppc.h" to sysemu, move this QOM-related
> definition to cpu-qom.h.
> 
> Signed-off-by: Philippe Mathieu-Daudé 
> ---

Reviewed-by: Greg Kurz 

>  target/ppc/cpu-qom.h | 2 ++
>  target/ppc/kvm_ppc.h | 2 --
>  2 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/target/ppc/cpu-qom.h b/target/ppc/cpu-qom.h
> index c2bff349cc..4e4061068e 100644
> --- a/target/ppc/cpu-qom.h
> +++ b/target/ppc/cpu-qom.h
> @@ -36,6 +36,8 @@ OBJECT_DECLARE_CPU_TYPE(PowerPCCPU, PowerPCCPUClass, 
> POWERPC_CPU)
>  #define CPU_RESOLVING_TYPE TYPE_POWERPC_CPU
>  #define cpu_list ppc_cpu_list
>  
> +#define TYPE_HOST_POWERPC_CPU POWERPC_CPU_TYPE_NAME("host")
> +
>  ObjectClass *ppc_cpu_class_by_name(const char *name);
>  
>  typedef struct CPUArchState CPUPPCState;
> diff --git a/target/ppc/kvm_ppc.h b/target/ppc/kvm_ppc.h
> index 49954a300b..901e188c9a 100644
> --- a/target/ppc/kvm_ppc.h
> +++ b/target/ppc/kvm_ppc.h
> @@ -13,8 +13,6 @@
>  #include "exec/hwaddr.h"
>  #include "cpu.h"
>  
> -#define TYPE_HOST_POWERPC_CPU POWERPC_CPU_TYPE_NAME("host")
> -
>  #ifdef CONFIG_KVM
>  
>  uint32_t kvmppc_get_tbfreq(void);



-- 
Greg



Re: [PATCH v3 5/6] target/ppc: Restrict 'kvm_ppc.h' to sysemu in cpu_init.c

2023-06-29 Thread Greg Kurz
On Tue, 27 Jun 2023 13:51:23 +0200
Philippe Mathieu-Daudé  wrote:

> User emulation shouldn't need any of the KVM prototypes
> declared in "kvm_ppc.h".
> 
> Signed-off-by: Philippe Mathieu-Daudé 
> ---

Reviewed-by: Greg Kurz 

>  target/ppc/cpu_init.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/target/ppc/cpu_init.c b/target/ppc/cpu_init.c
> index aeff71d063..f2afb539eb 100644
> --- a/target/ppc/cpu_init.c
> +++ b/target/ppc/cpu_init.c
> @@ -21,7 +21,6 @@
>  #include "qemu/osdep.h"
>  #include "disas/dis-asm.h"
>  #include "gdbstub/helpers.h"
> -#include "kvm_ppc.h"
>  #include "sysemu/cpus.h"
>  #include "sysemu/hw_accel.h"
>  #include "sysemu/tcg.h"
> @@ -49,6 +48,7 @@
>  #ifndef CONFIG_USER_ONLY
>  #include "hw/boards.h"
>  #include "hw/intc/intc.h"
> +#include "kvm_ppc.h"
>  #endif
>  
>  /* #define PPC_DEBUG_SPR */



-- 
Greg



Re: [PATCH v3 3/6] target/ppc: Move CPU QOM definitions to cpu-qom.h

2023-06-28 Thread Greg Kurz
On Tue, 27 Jun 2023 13:51:21 +0200
Philippe Mathieu-Daudé  wrote:

> Signed-off-by: Philippe Mathieu-Daudé 
> ---
>  target/ppc/cpu-qom.h | 5 +
>  target/ppc/cpu.h | 6 --
>  2 files changed, 5 insertions(+), 6 deletions(-)
> 
> diff --git a/target/ppc/cpu-qom.h b/target/ppc/cpu-qom.h
> index 9666f54f65..c2bff349cc 100644
> --- a/target/ppc/cpu-qom.h
> +++ b/target/ppc/cpu-qom.h
> @@ -31,6 +31,11 @@
>  
>  OBJECT_DECLARE_CPU_TYPE(PowerPCCPU, PowerPCCPUClass, POWERPC_CPU)
>  
> +#define POWERPC_CPU_TYPE_SUFFIX "-" TYPE_POWERPC_CPU
> +#define POWERPC_CPU_TYPE_NAME(model) model POWERPC_CPU_TYPE_SUFFIX
> +#define CPU_RESOLVING_TYPE TYPE_POWERPC_CPU
> +#define cpu_list ppc_cpu_list
> +
>  ObjectClass *ppc_cpu_class_by_name(const char *name);
>  
>  typedef struct CPUArchState CPUPPCState;
> diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h
> index af12c93ebc..e91e1774e5 100644
> --- a/target/ppc/cpu.h
> +++ b/target/ppc/cpu.h
> @@ -1468,12 +1468,6 @@ static inline uint64_t ppc_dump_gpr(CPUPPCState *env, 
> int gprn)
>  int ppc_dcr_read(ppc_dcr_t *dcr_env, int dcrn, uint32_t *valp);
>  int ppc_dcr_write(ppc_dcr_t *dcr_env, int dcrn, uint32_t val);
>  
> -#define POWERPC_CPU_TYPE_SUFFIX "-" TYPE_POWERPC_CPU
> -#define POWERPC_CPU_TYPE_NAME(model) model POWERPC_CPU_TYPE_SUFFIX
> -#define CPU_RESOLVING_TYPE TYPE_POWERPC_CPU
> -

These seem appropriate to be moved to "cpu-qom.h".

> -#define cpu_list ppc_cpu_list

This one is much older according to git blame :

c913706581460 target/ppc/cpu.h (Igor Mammedov 2017-08-30 1469) #define POWERPC_CPU_TYPE_SUFFIX "-" TYPE_POWERPC_CPU
c913706581460 target/ppc/cpu.h (Igor Mammedov 2017-08-30 1470) #define POWERPC_CPU_TYPE_NAME(model) model POWERPC_CPU_TYPE_SUFFIX
0dacec874fa3b target/ppc/cpu.h (Igor Mammedov 2018-02-07 1471) #define CPU_RESOLVING_TYPE TYPE_POWERPC_CPU
c913706581460 target/ppc/cpu.h (Igor Mammedov 2017-08-30 1472) 
c732abe222795 target-ppc/cpu.h (Jocelyn Mayer 2007-10-12 1473) #define cpu_list ppc_cpu_list

It is some plumbing used for `-cpu help`, not exactly QOM stuff.
Maybe keep it in "cpu.h" as all other targets do ?
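
As a side note on the type-name macros discussed above: they rely on
the compiler merging adjacent string literals. The sketch below only
illustrates that expansion. The value given to TYPE_POWERPC_CPU is an
assumption made for the example, and demo_host_type_name() is a
hypothetical helper, not something found in QEMU.

```c
/*
 * Assumed value, for illustration only; in QEMU the real string
 * depends on the target being built.
 */
#define TYPE_POWERPC_CPU "powerpc64-cpu"

/* Same definitions as in the quoted hunk. */
#define POWERPC_CPU_TYPE_SUFFIX "-" TYPE_POWERPC_CPU
#define POWERPC_CPU_TYPE_NAME(model) model POWERPC_CPU_TYPE_SUFFIX

/* Hypothetical helper: returns the type name for the "host" model. */
static const char *demo_host_type_name(void)
{
    /* Expands to "host" "-" "powerpc64-cpu", merged at compile time. */
    return POWERPC_CPU_TYPE_NAME("host");
}
```

Because the macro argument must itself be a string literal, these
macros only work for model names known at compile time.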

> -
>  /* MMU modes definitions */
>  #define MMU_USER_IDX 0
>  static inline int cpu_mmu_index(CPUPPCState *env, bool ifetch)



-- 
Greg



Re: [PATCH v3 2/6] target/ppc: Reorder #ifdef'ry in kvm_ppc.h

2023-06-28 Thread Greg Kurz
On Tue, 27 Jun 2023 13:51:20 +0200
Philippe Mathieu-Daudé  wrote:

> Keep a single if/else/endif block checking CONFIG_KVM.
> 
> Signed-off-by: Philippe Mathieu-Daudé 
> ---
>  target/ppc/kvm_ppc.h | 62 
>  1 file changed, 28 insertions(+), 34 deletions(-)
> 
> diff --git a/target/ppc/kvm_ppc.h b/target/ppc/kvm_ppc.h
> index 2e395416f0..49954a300b 100644
> --- a/target/ppc/kvm_ppc.h
> +++ b/target/ppc/kvm_ppc.h
> @@ -93,7 +93,34 @@ void kvmppc_set_reg_tb_offset(PowerPCCPU *cpu, int64_t 
> tb_offset);
>  
>  int kvm_handle_nmi(PowerPCCPU *cpu, struct kvm_run *run);
>  
> -#else
> +#define kvmppc_eieio() \
> +do {  \
> +if (kvm_enabled()) {  \
> +asm volatile("eieio" : : : "memory"); \
> +} \
> +} while (0)
> +
> +/* Store data cache blocks back to memory */
> +static inline void kvmppc_dcbst_range(PowerPCCPU *cpu, uint8_t *addr, int 
> len)
> +{
> +uint8_t *p;
> +
> +for (p = addr; p < addr + len; p += cpu->env.dcache_line_size) {
> +asm volatile("dcbst 0,%0" : : "r"(p) : "memory");
> +}
> +}
> +
> +/* Invalidate instruction cache blocks */
> +static inline void kvmppc_icbi_range(PowerPCCPU *cpu, uint8_t *addr, int len)
> +{
> +uint8_t *p;
> +
> +for (p = addr; p < addr + len; p += cpu->env.icache_line_size) {
> +asm volatile("icbi 0,%0" : : "r"(p));
> +}
> +}
> +
> +#else /* !CONFIG_KVM */
>  
>  static inline uint32_t kvmppc_get_tbfreq(void)
>  {
> @@ -440,10 +467,6 @@ static inline bool 
> kvmppc_pvr_workaround_required(PowerPCCPU *cpu)
>  return false;
>  }
>  
> -#endif
> -
> -#ifndef CONFIG_KVM
> -
>  #define kvmppc_eieio() do { } while (0)
>  
>  static inline void kvmppc_dcbst_range(PowerPCCPU *cpu, uint8_t *addr, int 
> len)
> @@ -454,35 +477,6 @@ static inline void kvmppc_icbi_range(PowerPCCPU *cpu, 
> uint8_t *addr, int len)
>  {
>  }
>  
> -#else   /* CONFIG_KVM */
> -
> -#define kvmppc_eieio() \

Arguably the kvm and non-kvm implementations will now come from
different commits in git blame. I personally favor keeping the
git blame consistency over bare code movement that doesn't fix
any actual bug.

Also this patch doesn't seem to be strictly needed to reach the
goal of kicking "kvm_ppc.h" out of user emulation.
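
For readers who want the layout under discussion made concrete, here
is a minimal sketch of a header that keeps a single #ifdef/#else/#endif
block, with the real declarations in one branch and no-op stubs in the
other. All demo_* names and DEMO_CONFIG_KVM are hypothetical stand-ins,
and the stub return value of 0 is an assumption of the example.

```c
#include <stdint.h>

/*
 * DEMO_CONFIG_KVM is a hypothetical stand-in for CONFIG_KVM. It is left
 * undefined here, so the stub branch is the one that gets compiled.
 */
#ifdef DEMO_CONFIG_KVM

/* Real implementations would be declared or defined in this branch. */
uint32_t demo_get_tbfreq(void);
#define demo_eieio() do { /* barrier instruction would go here */ } while (0)

#else /* !DEMO_CONFIG_KVM */

/* Stubs: same names, no effect, so callers need no #ifdef of their own. */
static inline uint32_t demo_get_tbfreq(void)
{
    return 0;
}
#define demo_eieio() do { } while (0)

#endif /* DEMO_CONFIG_KVM */

/* Hypothetical caller that compiles unchanged in both configurations. */
static inline uint32_t demo_caller(void)
{
    demo_eieio();
    return demo_get_tbfreq();
}
```

Whether grouping both variants under one block is worth the churn in
git blame is exactly the trade-off raised above; the sketch takes no
side on that.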

> -do {  \
> -if (kvm_enabled()) {  \
> -asm volatile("eieio" : : : "memory"); \
> -} \
> -} while (0)
> -
> -/* Store data cache blocks back to memory */
> -static inline void kvmppc_dcbst_range(PowerPCCPU *cpu, uint8_t *addr, int 
> len)
> -{
> -uint8_t *p;
> -
> -for (p = addr; p < addr + len; p += cpu->env.dcache_line_size) {
> -asm volatile("dcbst 0,%0" : : "r"(p) : "memory");
> -}
> -}
> -
> -/* Invalidate instruction cache blocks */
> -static inline void kvmppc_icbi_range(PowerPCCPU *cpu, uint8_t *addr, int len)
> -{
> -uint8_t *p;
> -
> -for (p = addr; p < addr + len; p += cpu->env.icache_line_size) {
> -asm volatile("icbi 0,%0" : : "r"(p));
> -}
> -}
> -
>  #endif  /* CONFIG_KVM */
>  
>  #endif /* KVM_PPC_H */



-- 
Greg



Re: [PATCH v3 1/6] target/ppc: Have 'kvm_ppc.h' include 'sysemu/kvm.h'

2023-06-28 Thread Greg Kurz
On Tue, 27 Jun 2023 13:51:19 +0200
Philippe Mathieu-Daudé  wrote:

> "kvm_ppc.h" declares:
> 
>   int kvm_handle_nmi(PowerPCCPU *cpu, struct kvm_run *run);
> 
> 'struct kvm_run' is declared in "sysemu/kvm.h", include it.
> 
> Signed-off-by: Philippe Mathieu-Daudé 
> ---

Reviewed-by: Greg Kurz 

>  target/ppc/kvm_ppc.h | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/target/ppc/kvm_ppc.h b/target/ppc/kvm_ppc.h
> index 611debc3ce..2e395416f0 100644
> --- a/target/ppc/kvm_ppc.h
> +++ b/target/ppc/kvm_ppc.h
> @@ -9,6 +9,7 @@
>  #ifndef KVM_PPC_H
>  #define KVM_PPC_H
>  
> +#include "sysemu/kvm.h"
>  #include "exec/hwaddr.h"
>  #include "cpu.h"
>  



-- 
Greg



Re: [PATCH] target/ppc: Only generate decodetree files when TCG is enabled

2023-06-27 Thread Greg Kurz
On Mon, 26 Jun 2023 16:01:00 +0200
Philippe Mathieu-Daudé  wrote:

> No need to generate TCG-specific decodetree files
> when TCG is disabled.
> 
> Signed-off-by: Philippe Mathieu-Daudé 
> ---

Reviewed-by: Greg Kurz 

>  target/ppc/meson.build | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/target/ppc/meson.build b/target/ppc/meson.build
> index a69f174f41..4c2635039e 100644
> --- a/target/ppc/meson.build
> +++ b/target/ppc/meson.build
> @@ -28,7 +28,7 @@ gen = [
>   extra_args: ['--static-decode=decode_insn64',
>'--insnwidth=64']),
>  ]
> -ppc_ss.add(gen)
> +ppc_ss.add(when: 'CONFIG_TCG', if_true: gen)
>  
>  ppc_ss.add(when: 'CONFIG_KVM', if_true: files('kvm.c'), if_false: 
> files('kvm-stub.c'))
>  ppc_ss.add(when: 'CONFIG_USER_ONLY', if_true: files('user_only_helper.c'))



-- 
Greg



Re: [SPAM] [PATCH v4] 9pfs: deprecate 'proxy' backend

2023-06-26 Thread Greg Kurz
ng with this daemon, in a future version of QEMU!
> +
>  Pass-through security model in QEMU 9p server needs root privilege to do
>  few file operations (like chown, chmod to any mode/uid:gid).  There are two
>  issues in pass-through security model:
> diff --git a/fsdev/qemu-fsdev.c b/fsdev/qemu-fsdev.c
> index 3da64e9f72..9a50ee370b 100644
> --- a/fsdev/qemu-fsdev.c
> +++ b/fsdev/qemu-fsdev.c
> @@ -133,6 +133,14 @@ int qemu_fsdev_add(QemuOpts *opts, Error **errp)
>  }
>  
>  if (fsdriver) {
> +if (strncmp(fsdriver, "proxy", 5) == 0) {
> +warn_report(
> +"'-fsdev proxy' and '-virtfs proxy' are deprecated, use "
> +"'local' instead of 'proxy, or consider deploying virtiofsd "
> +"instead"

Ditto.

LGTM.

Reviewed-by: Greg Kurz 

> +);
> +}
> +
>  for (i = 0; i < ARRAY_SIZE(FsDrivers); i++) {
>  if (strcmp(FsDrivers[i].name, fsdriver) == 0) {
>  break;
> diff --git a/fsdev/virtfs-proxy-helper.c b/fsdev/virtfs-proxy-helper.c
> index d9511f429c..144aaf585a 100644
> --- a/fsdev/virtfs-proxy-helper.c
> +++ b/fsdev/virtfs-proxy-helper.c
> @@ -9,6 +9,11 @@
>   * the COPYING file in the top-level directory.
>   */
>  
> +/*
> + * NOTE: The 9p 'proxy' backend is deprecated (since QEMU 8.1) and will be
> + * removed in a future version of QEMU!
> + */
> +
>  #include "qemu/osdep.h"
>  #include 
>  #include 
> @@ -1057,6 +1062,10 @@ int main(int argc, char **argv)
>  struct statfs st_fs;
>  #endif
>  
> +fprintf(stderr, "NOTE: The 9p 'proxy' backend is deprecated (since "
> +"QEMU 8.1) and will be removed in a future version of "
> +"QEMU!\n");
> +
>  prog_name = g_path_get_basename(argv[0]);
>  
>  is_daemon = true;
> diff --git a/hw/9pfs/9p-proxy.c b/hw/9pfs/9p-proxy.c
> index 99d115ff0d..905cae6992 100644
> --- a/hw/9pfs/9p-proxy.c
> +++ b/hw/9pfs/9p-proxy.c
> @@ -15,6 +15,11 @@
>   * https://wiki.qemu.org/Documentation/9p
>   */
>  
> +/*
> + * NOTE: The 9p 'proxy' backend is deprecated (since QEMU 8.1) and will be
> + * removed in a future version of QEMU!
> + */
> +
>  #include "qemu/osdep.h"
>  #include 
>  #include 
> diff --git a/hw/9pfs/9p-proxy.h b/hw/9pfs/9p-proxy.h
> index b84301d001..9be4718d3e 100644
> --- a/hw/9pfs/9p-proxy.h
> +++ b/hw/9pfs/9p-proxy.h
> @@ -10,6 +10,11 @@
>   * the COPYING file in the top-level directory.
>   */
>  
> +/*
> + * NOTE: The 9p 'proxy' backend is deprecated (since QEMU 8.1) and will be
> + * removed in a future version of QEMU!
> + */
> +
>  #ifndef QEMU_9P_PROXY_H
>  #define QEMU_9P_PROXY_H
>  
> diff --git a/meson.build b/meson.build
> index 34306a6205..05c01b72bb 100644
> --- a/meson.build
> +++ b/meson.build
> @@ -4170,7 +4170,7 @@ if have_block
>summary_info += {'Block whitelist (ro)': 
> get_option('block_drv_ro_whitelist')}
>summary_info += {'Use block whitelist in tools': 
> get_option('block_drv_whitelist_in_tools')}
>summary_info += {'VirtFS (9P) support':have_virtfs}
> -  summary_info += {'VirtFS (9P) Proxy Helper support': 
> have_virtfs_proxy_helper}
> +  summary_info += {'VirtFS (9P) Proxy Helper support (deprecated)': 
> have_virtfs_proxy_helper}
>summary_info += {'Live block migration': 
> config_host_data.get('CONFIG_LIVE_BLOCK_MIGRATION')}
>summary_info += {'replication support': 
> config_host_data.get('CONFIG_REPLICATION')}
>summary_info += {'bochs support': get_option('bochs').allowed()}
> diff --git a/qemu-options.hx b/qemu-options.hx
> index b57489d7ca..3a6c7d3ef9 100644
> --- a/qemu-options.hx
> +++ b/qemu-options.hx
> @@ -1735,7 +1735,9 @@ SRST
>  Accesses to the filesystem are done by QEMU.
>  
>  ``proxy``
> -Accesses to the filesystem are done by virtfs-proxy-helper(1).
> +Accesses to the filesystem are done by virtfs-proxy-helper(1). This
> +option is deprecated (since QEMU 8.1) and will be removed in a future
> +version of QEMU. Use ``local`` instead.
>  
>  ``synth``
>  Synthetic filesystem, only used by QTests.
> @@ -1867,6 +1869,8 @@ SRST
>  
>  ``proxy``
>  Accesses to the filesystem are done by virtfs-proxy-helper(1).
> +This option is deprecated (since QEMU 8.1) and will be removed in a
> +future version of QEMU. Use ``local`` instead.
>  
>  ``synth``
>  Synthetic filesystem, only used by QTests.

-- 
Greg



Re: [PATCH v3] 9pfs: deprecate 'proxy' backend

2023-06-21 Thread Greg Kurz
On Wed, 21 Jun 2023 16:16:46 +0200
Christian Schoenebeck  wrote:

> On Wednesday, June 21, 2023 3:41:36 PM CEST Greg Kurz wrote:
> > On Wed, 21 Jun 2023 15:32:39 +0200
> > Christian Schoenebeck  wrote:
> > 
> > > On Thursday, June 15, 2023 11:35:05 AM CEST Christian Schoenebeck wrote:
> > > > On Saturday, June 10, 2023 3:39:44 PM CEST Christian Schoenebeck wrote:
> > > > > As recent CVE-2023-2861 once again showed, the 9p 'proxy' fs driver is in
> > > > > bad shape. Using the 'proxy' backend was already discouraged for safety
> > > > > reasons before and we recommended to use the 'local' backend instead,
> > > > > but now it is time to officially deprecate the 'proxy' backend.
> > > > > 
> > > > > Signed-off-by: Christian Schoenebeck 
> > > 
> > > Ping
> > > 
> > 
> > It seems you missed the review I posted last week :
> > 
> > https://patchew.org/QEMU/e1q7ytt-0005fl...@lizzy.crudebyte.com/#20230612165742.ea08@bahia
> 
> Oh, I never received your email. I'll check the logs what happened there.
> 
> I'll send a v4 with your suggestions, they make sense.
> 
> Do you want me to add your "for the records" comments as well in the
> deprecation notice?
> 

This was more targeting qemu-devel archives or git log, but feel
free to provide relevant details in the deprecation notice.

I agree with Daniel that virtiofsd should also be mentioned as
an alternative.

> Best regards,
> Christian Schoenebeck
> 
> 

Cheers,

-- 
Greg



Re: [PATCH v3] 9pfs: deprecate 'proxy' backend

2023-06-21 Thread Greg Kurz
On Wed, 21 Jun 2023 15:32:39 +0200
Christian Schoenebeck  wrote:

> On Thursday, June 15, 2023 11:35:05 AM CEST Christian Schoenebeck wrote:
> > On Saturday, June 10, 2023 3:39:44 PM CEST Christian Schoenebeck wrote:
> > > As recent CVE-2023-2861 once again showed, the 9p 'proxy' fs driver is in
> > > bad shape. Using the 'proxy' backend was already discouraged for safety
> > > reasons before and we recommended to use the 'local' backend instead,
> > > but now it is time to officially deprecate the 'proxy' backend.
> > > 
> > > Signed-off-by: Christian Schoenebeck 
> 
> Ping
> 

It seems you missed the review I posted last week :

https://patchew.org/QEMU/e1q7ytt-0005fl...@lizzy.crudebyte.com/#20230612165742.ea08@bahia

> > > ---
> > >  v2 -> v3:
> > >  * Fix copy wasted typo (-> 'backend').
> > > 
> > >  MAINTAINERS|  7 +++
> > >  docs/about/deprecated.rst  | 17 +
> > >  docs/tools/virtfs-proxy-helper.rst |  3 +++
> > >  fsdev/qemu-fsdev.c |  5 +
> > >  fsdev/virtfs-proxy-helper.c|  5 +
> > >  hw/9pfs/9p-proxy.c |  5 +
> > >  hw/9pfs/9p-proxy.h |  5 +
> > >  meson.build|  2 +-
> > >  qemu-options.hx|  6 +-
> > >  softmmu/vl.c   |  5 +
> > >  10 files changed, 58 insertions(+), 2 deletions(-)
> > 
> > Or would it be better to split this up, e.g. into 3 separate patches 
> > (runtime
> > messages, docs, MAINTAINERS)?
> > 
> > > diff --git a/MAINTAINERS b/MAINTAINERS
> > > index 436b3f0afe..185d694b2e 100644
> > > --- a/MAINTAINERS
> > > +++ b/MAINTAINERS
> > > @@ -2118,13 +2118,20 @@ S: Odd Fixes
> > >  W: https://wiki.qemu.org/Documentation/9p
> > >  F: hw/9pfs/
> > >  X: hw/9pfs/xen-9p*
> > > +X: hw/9pfs/9p-proxy*
> > >  F: fsdev/
> > > +X: fsdev/virtfs-proxy-helper.c
> > >  F: docs/tools/virtfs-proxy-helper.rst
> > 
> > I missed virtfs-proxy-helper.rst here. That should be moved to the new 
> > 'proxy'
> > section below as well.
> > 
> > >  F: tests/qtest/virtio-9p-test.c
> > >  F: tests/qtest/libqos/virtio-9p*
> > >  T: git https://gitlab.com/gkurz/qemu.git 9p-next
> > >  T: git https://github.com/cschoenebeck/qemu.git 9p.next
> > >  
> > > +virtio-9p-proxy
> > > +F: hw/9pfs/9p-proxy*
> > > +F: fsdev/virtfs-proxy-helper.c
> > > +S: Obsolete
> > > +
> > >  virtio-blk
> > >  M: Stefan Hajnoczi 
> > >  L: qemu-bl...@nongnu.org
> > > diff --git a/docs/about/deprecated.rst b/docs/about/deprecated.rst
> > > index 0743459862..9b2c780365 100644
> > > --- a/docs/about/deprecated.rst
> > > +++ b/docs/about/deprecated.rst
> > > @@ -343,6 +343,23 @@ the addition of volatile memory support, it is now 
> > > necessary to distinguish
> > >  between persistent and volatile memory backends.  As such, memdev is 
> > > deprecated
> > >  in favor of persistent-memdev.
> > >  
> > > +``-fsdev proxy`` and ``-virtfs proxy`` (since 8.1)
> > > +^^
> > > +
> > > +The 9p ``proxy`` filesystem backend driver has been deprecated and will 
> > > be
> > > +removed in a future version of QEMU. Please use ``-fsdev local`` or
> > > +``-virtfs local`` for using the ``local`` 9p filesystem backend instead.
> > > +
> > > +The 9p ``proxy`` backend was originally developed as an alternative to 
> > > the 9p
> > > +``local`` backend. The idea was to enhance security by dispatching 
> > > actual low
> > > +level filesystem operations from 9p server (QEMU process) over to a 
> > > separate
> > > +process (the virtfs-proxy-helper binary). However this alternative never 
> > > gained
> > > +momentum. The proxy backend is much slower than the local backend, 
> > > hasn't seen
> > > +any development in years, and showed to be less secure, especially due 
> > > to the
> > > +fact that its helper daemon must be run as root, whereas with the local 
> > > backend
> > > +QEMU is typically run as unprivileged user and allows to tighten 
> > > behaviour by
> > > +mapping permissions et al.
> > > +
> > >  
> > >  Block device options
> > >  
> > > diff --git a/docs/tools/virtfs-proxy-helper.rst 
> > > b/docs/tools/virtfs-proxy-helper.rst
> > > index 6cdeedf8e9..bd310ebb07 100644
> > > --- a/docs/tools/virtfs-proxy-helper.rst
> > > +++ b/docs/tools/virtfs-proxy-helper.rst
> > > @@ -9,6 +9,9 @@ Synopsis
> > >  Description
> > >  ---
> > >  
> > > +NOTE: The 9p 'proxy' backend is deprecated (since QEMU 8.1) and will be
> > > +removed, along with this daemon, in a future version of QEMU!
> > > +
> > >  Pass-through security model in QEMU 9p server needs root privilege to do
> > >  few file operations (like chown, chmod to any mode/uid:gid).  There are 
> > > two
> > >  issues in pass-through security model:
> > > diff --git a/fsdev/qemu-fsdev.c b/fsdev/qemu-fsdev.c
> > > index 3da64e9f72..242f54ab49 100644
> > > --- a/fsdev/qemu-fsdev.c
> > > +++ b/fsdev/qemu-fsdev.c
> > > @@ -133,6 

Re: [PATCH] hw/ppc/spapr: Test whether TCG is enabled with tcg_enabled()

2023-06-20 Thread Greg Kurz
On Tue, 20 Jun 2023 09:55:49 +0200
Claudio Fontana  wrote:

> On 6/20/23 09:48, Philippe Mathieu-Daudé wrote:
> > Although the PPC target only supports the TCG and KVM
> > accelerators, QEMU supports more. We cannot assume that
> > '!kvm == tcg', so test for the correct accelerator. This
> > also eases code review, because here we don't care about
> > KVM, we really want to test for TCG.
> > 
> > Signed-off-by: Philippe Mathieu-Daudé 
> 
> I don't remember anymore, but what about qtest ? It is usually the forgotten 
> case in these kinds of tests... so much complexity :-)
> 

This check was added with TCG in mind because it is a known limitation.
I don't see any reason to prevent qtest from being used with the rest
of this function though.
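
To illustrate why '!kvm' and 'tcg' are not the same test once a third
accelerator such as qtest exists, here is a minimal sketch. The enum
and the demo_* helpers are hypothetical stand-ins and not the QEMU
accelerator API; they only model the boolean logic of the two checks.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for QEMU's accelerator state; not the real API. */
enum demo_accel { DEMO_ACCEL_KVM, DEMO_ACCEL_TCG, DEMO_ACCEL_QTEST };

static bool demo_kvm_enabled(enum demo_accel a)
{
    return a == DEMO_ACCEL_KVM;
}

static bool demo_tcg_enabled(enum demo_accel a)
{
    return a == DEMO_ACCEL_TCG;
}

/* Old check: rejects every non-KVM accelerator, qtest included. */
static bool demo_old_check_rejects(enum demo_accel a, unsigned int smp_threads)
{
    return !demo_kvm_enabled(a) && smp_threads > 1;
}

/* New check: rejects only TCG, the accelerator with the known limitation. */
static bool demo_new_check_rejects(enum demo_accel a, unsigned int smp_threads)
{
    return demo_tcg_enabled(a) && smp_threads > 1;
}
```

With two threads per core, the old check rejects the qtest case while
the new one lets it through, which matches the intent described above.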

> Ciao,
> 
> Claudio
> 
> 
> > ---
> >  hw/ppc/spapr.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > index dcb7f1c70a..c4b666587b 100644
> > --- a/hw/ppc/spapr.c
> > +++ b/hw/ppc/spapr.c
> > @@ -2524,7 +2524,7 @@ static void spapr_set_vsmt_mode(SpaprMachineState 
> > *spapr, Error **errp)
> >  int ret;
> >  unsigned int smp_threads = ms->smp.threads;
> >  
> > -if (!kvm_enabled() && (smp_threads > 1)) {
> > +if (tcg_enabled() && (smp_threads > 1)) {

Bonjour Philippe,

Please drop the unneeded parens in the second check.

With this fixed,

Reviewed-by: Greg Kurz 

Cheers,

--
Greg

> >  error_setg(errp, "TCG cannot support more than 1 thread/core "
> > "on a pseries machine");
> >  return;
> 



-- 
Greg



Re: [SPAM] [PATCH v3] 9pfs: deprecate 'proxy' backend

2023-06-12 Thread Greg Kurz
Hi Christian,

On Sat, 10 Jun 2023 15:39:44 +0200
Christian Schoenebeck  wrote:

> As recent CVE-2023-2861 once again showed, the 9p 'proxy' fs driver is in
> bad shape. Using the 'proxy' backend was already discouraged for safety
> reasons before and we recommended to use the 'local' backend instead,
> but now it is time to officially deprecate the 'proxy' backend.
> 

For the record :

The 'proxy' backend is an old thing that predates vhost-user. It
really turns QEMU into a proxy : all requests go to QEMU, which
forwards them to the helper and the other way around for responses.
Data is copied both ways. All of this severely damages latency.

If someone really wants to offload 9p stuff to a separate process,
they should come up with a vhost-user-9p implementation. The whole
server would be offloaded, possibly sharing most of the code with
the in-QEMU server, QEMU wouldn't be involved anymore in the I/Os.
No more copies.

> Signed-off-by: Christian Schoenebeck 
> ---
>  v2 -> v3:
>  * Fix copy wasted typo (-> 'backend').
> 
>  MAINTAINERS|  7 +++
>  docs/about/deprecated.rst  | 17 +
>  docs/tools/virtfs-proxy-helper.rst |  3 +++
>  fsdev/qemu-fsdev.c |  5 +
>  fsdev/virtfs-proxy-helper.c|  5 +
>  hw/9pfs/9p-proxy.c |  5 +
>  hw/9pfs/9p-proxy.h |  5 +
>  meson.build|  2 +-
>  qemu-options.hx|  6 +-
>  softmmu/vl.c   |  5 +
>  10 files changed, 58 insertions(+), 2 deletions(-)
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 436b3f0afe..185d694b2e 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -2118,13 +2118,20 @@ S: Odd Fixes
>  W: https://wiki.qemu.org/Documentation/9p
>  F: hw/9pfs/
>  X: hw/9pfs/xen-9p*
> +X: hw/9pfs/9p-proxy*
>  F: fsdev/
> +X: fsdev/virtfs-proxy-helper.c
>  F: docs/tools/virtfs-proxy-helper.rst
>  F: tests/qtest/virtio-9p-test.c
>  F: tests/qtest/libqos/virtio-9p*
>  T: git https://gitlab.com/gkurz/qemu.git 9p-next
>  T: git https://github.com/cschoenebeck/qemu.git 9p.next
>  
> +virtio-9p-proxy
> +F: hw/9pfs/9p-proxy*
> +F: fsdev/virtfs-proxy-helper.c
> +S: Obsolete
> +
>  virtio-blk
>  M: Stefan Hajnoczi 
>  L: qemu-bl...@nongnu.org
> diff --git a/docs/about/deprecated.rst b/docs/about/deprecated.rst
> index 0743459862..9b2c780365 100644
> --- a/docs/about/deprecated.rst
> +++ b/docs/about/deprecated.rst
> @@ -343,6 +343,23 @@ the addition of volatile memory support, it is now 
> necessary to distinguish
>  between persistent and volatile memory backends.  As such, memdev is 
> deprecated
>  in favor of persistent-memdev.
>  
> +``-fsdev proxy`` and ``-virtfs proxy`` (since 8.1)
> +^^
> +
> +The 9p ``proxy`` filesystem backend driver has been deprecated and will be
> +removed in a future version of QEMU. Please use ``-fsdev local`` or
> +``-virtfs local`` for using the ``local`` 9p filesystem backend instead.
> +
> +The 9p ``proxy`` backend was originally developed as an alternative to the 9p
> +``local`` backend. The idea was to enhance security by dispatching actual low
> +level filesystem operations from 9p server (QEMU process) over to a separate
> +process (the virtfs-proxy-helper binary). However this alternative never 
> gained
> +momentum. The proxy backend is much slower than the local backend, hasn't 
> seen
> +any development in years, and showed to be less secure, especially due to the
> +fact that its helper daemon must be run as root, whereas with the local 
> backend
> +QEMU is typically run as unprivileged user and allows to tighten behaviour by
> +mapping permissions et al.
> +
>  
>  Block device options
>  
> diff --git a/docs/tools/virtfs-proxy-helper.rst 
> b/docs/tools/virtfs-proxy-helper.rst
> index 6cdeedf8e9..bd310ebb07 100644
> --- a/docs/tools/virtfs-proxy-helper.rst
> +++ b/docs/tools/virtfs-proxy-helper.rst
> @@ -9,6 +9,9 @@ Synopsis
>  Description
>  ---
>  
> +NOTE: The 9p 'proxy' backend is deprecated (since QEMU 8.1) and will be
> +removed, along with this daemon, in a future version of QEMU!
> +
>  Pass-through security model in QEMU 9p server needs root privilege to do
>  few file operations (like chown, chmod to any mode/uid:gid).  There are two
>  issues in pass-through security model:
> diff --git a/fsdev/qemu-fsdev.c b/fsdev/qemu-fsdev.c
> index 3da64e9f72..242f54ab49 100644
> --- a/fsdev/qemu-fsdev.c
> +++ b/fsdev/qemu-fsdev.c
> @@ -133,6 +133,11 @@ int qemu_fsdev_add(QemuOpts *opts, Error **errp)
>  }
>  
>  if (fsdriver) {
> +if (strncmp(fsdriver, "proxy", 5) == 0) {
> +warn_report("'-fsdev proxy' is deprecated, use '-fsdev local' "
> +"instead");
> +}
> +
>  for (i = 0; i < ARRAY_SIZE(FsDrivers); i++) {
>  if (strcmp(FsDrivers[i].name, fsdriver) == 0) {
>  break;
> diff 

Re: [PATCH v3] 9pfs: prevent opening special files (CVE-2023-2861)

2023-06-07 Thread Greg Kurz
On Wed, 7 Jun 2023 15:50:01 +0200
Christian Schoenebeck  wrote:

> The 9p protocol does not specifically define how server shall behave when
> client tries to open a special file, however from security POV it does
> make sense for 9p server to prohibit opening any special file on host side
> in general. A sane Linux 9p client for instance would never attempt to
> open a special file on host side, it would always handle those exclusively
> on its guest side. A malicious client however could potentially escape
> from the exported 9p tree by creating and opening a device file on host
> side.
> 
> With QEMU this could only be exploited in the following unsafe setups:
> 
>   - Running QEMU binary as root AND 9p 'local' fs driver AND 'passthrough'
> security model.
> 
> or
> 
>   - Using 9p 'proxy' fs driver (which is running its helper daemon as
> root).
> 
> These setups were already discouraged for safety reasons before,
> however for obvious reasons we are now tightening behaviour on this.
> 
> Fixes: CVE-2023-2861
> Reported-by: Yanwu Shen 
> Reported-by: Jietao Xiao 
> Reported-by: Jinku Li 
> Reported-by: Wenbo Shen 
> Signed-off-by: Christian Schoenebeck 
> ---
>  v2 -> v3:
>  - Drop O_CREAT check and its comment.
>  - Eliminate code duplication.
> 
>  fsdev/virtfs-proxy-helper.c | 26 --
>  hw/9pfs/9p-util.h   | 33 +
>  2 files changed, 57 insertions(+), 2 deletions(-)
> 
> diff --git a/fsdev/virtfs-proxy-helper.c b/fsdev/virtfs-proxy-helper.c
> index 5cafcd7703..256d7bfcec 100644
> --- a/fsdev/virtfs-proxy-helper.c
> +++ b/fsdev/virtfs-proxy-helper.c
> @@ -26,6 +26,7 @@
>  #include "qemu/xattr.h"
>  #include "9p-iov-marshal.h"
>  #include "hw/9pfs/9p-proxy.h"
> +#include "hw/9pfs/9p-util.h"
>  #include "fsdev/9p-iov-marshal.h"
>  
>  #define PROGNAME "virtfs-proxy-helper"
> @@ -338,6 +339,27 @@ static void resetugid(int suid, int sgid)
>  }
>  }
>  
> +/*
> + * Open regular file or directory. Attempts to open any special file are
> + * rejected.
> + *
> + * returns file descriptor or -1 on error
> + */
> +static int open_regular(const char *pathname, int flags, mode_t mode) {
> +int fd;
> +
> +fd = open(pathname, flags, mode);
> +if (fd < 0) {
> +return fd;
> +}
> +
> +if (check_is_regular_file_or_dir(fd) < 0) {
> +return -1;
> +}
> +
> +return fd;
> +}
> +
>  /*
>   * send response in two parts
>   * 1) ProxyHeader
> @@ -682,7 +704,7 @@ static int do_create(struct iovec *iovec)
>  if (ret < 0) {
>  goto unmarshal_err_out;
>  }
> -ret = open(path.data, flags, mode);
> +ret = open_regular(path.data, flags, mode);
>  if (ret < 0) {
>  ret = -errno;
>  }
> @@ -707,7 +729,7 @@ static int do_open(struct iovec *iovec)
>  if (ret < 0) {
>  goto err_out;
>  }
> -ret = open(path.data, flags);
> +ret = open_regular(path.data, flags, 0);
>  if (ret < 0) {
>  ret = -errno;
>  }
> diff --git a/hw/9pfs/9p-util.h b/hw/9pfs/9p-util.h
> index c314cf381d..9b0a9e5878 100644
> --- a/hw/9pfs/9p-util.h
> +++ b/hw/9pfs/9p-util.h
> @@ -13,6 +13,8 @@
>  #ifndef QEMU_9P_UTIL_H
>  #define QEMU_9P_UTIL_H
>  
> +#include "qemu/error-report.h"
> +
>  #ifdef O_PATH
>  #define O_PATH_9P_UTIL O_PATH
>  #else
> @@ -95,6 +97,7 @@ static inline int errno_to_dotl(int err) {
>  #endif
>  
>  #define qemu_openat openat
> +#define qemu_fstat  fstat
>  #define qemu_fstatatfstatat
>  #define qemu_mkdiratmkdirat
>  #define qemu_renameat   renameat
> @@ -108,6 +111,32 @@ static inline void close_preserve_errno(int fd)
>  errno = serrno;
>  }
>  
> +/* CVE-2023-2861: Prohibit opening any special file directly on host
> + * (especially device files), as a compromised client could potentially gain
> + * access outside exported tree under certain, unsafe setups. We expect
> + * client to handle I/O on special files exclusively on guest side.
> + */
> +static inline int check_is_regular_file_or_dir(int fd)
> +{
> +struct stat stbuf;
> +
> +if (qemu_fstat(fd, &stbuf) < 0) {
> +close_preserve_errno(fd);

It may be worth mentioning somewhere that this function not only
checks but also closes the fd if it doesn't point to a regular
file or directory. Or maybe change the name, e.g.
filter_out_special_files() ?

Anyway the fix is fine enough to address the CVE.

Reviewed-by: Greg Kurz 
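
To make the point about the helper closing the fd concrete, here is a
standalone POSIX sketch of the open-then-filter pattern. It is not the
patch itself: all demo_* names are hypothetical, plain fstat() is used
in place of qemu_fstat(), and returning ENXIO for special files is a
choice made for this sketch only.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Close fd without clobbering the errno value set by an earlier failure. */
static void demo_close_preserve_errno(int fd)
{
    int saved = errno;

    close(fd);
    errno = saved;
}

/*
 * Hypothetical name. Returns 0 if fd refers to a regular file or a
 * directory. Otherwise closes fd and returns -1 with errno set, so the
 * caller must not use or close fd again after a failure.
 */
static int demo_filter_out_special_files(int fd)
{
    struct stat stbuf;

    if (fstat(fd, &stbuf) < 0) {
        demo_close_preserve_errno(fd);
        return -1;
    }
    if (!S_ISREG(stbuf.st_mode) && !S_ISDIR(stbuf.st_mode)) {
        close(fd);
        errno = ENXIO; /* error code chosen for this sketch */
        return -1;
    }
    return 0;
}

/* Open a regular file or directory; special files are rejected. */
static int demo_open_regular(const char *pathname, int flags, mode_t mode)
{
    int fd = open(pathname, flags, mode);

    if (fd < 0) {
        return fd;
    }
    if (demo_filter_out_special_files(fd) < 0) {
        return -1; /* fd was already closed by the filter */
    }
    return fd;
}
```

Naming the helper after what it does to the descriptor makes the
ownership rule visible at the call site: on failure there is nothing
left for the caller to close.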

> +return -1;
> +}
> +if (!S_ISREG(stbuf.s

Re: [PATCH v2] 9pfs: prevent opening special files (CVE-2023-2861)

2023-06-07 Thread Greg Kurz
On Wed, 07 Jun 2023 13:02:17 +0200
Christian Schoenebeck  wrote:

> On Tuesday, June 6, 2023 6:00:28 PM CEST Greg Kurz wrote:
> > Hi Christian,
> > 
> > On Tue, 06 Jun 2023 15:57:50 +0200
> > Christian Schoenebeck  wrote:
> > 
> > > The 9p protocol does not specifically define how server shall behave when
> > > client tries to open a special file, however from security POV it does
> > > make sense for 9p server to prohibit opening any special file on host side
> > > in general. A sane Linux 9p client for instance would never attempt to
> > > open a special file on host side, it would always handle those exclusively
> > > on its guest side. A malicious client however could potentially escape
> > > from the exported 9p tree by creating and opening a device file on host
> > > side.
> > > 
> > > With QEMU this could only be exploited in the following unsafe setups:
> > > 
> > >   - Running QEMU binary as root AND 9p 'local' fs driver AND 'passthrough'
> > > security model.
> > > 
> > > or
> > > 
> > >   - Using 9p 'proxy' fs driver (which is running its helper daemon as
> > > root).
> > > 
> > > These setups were already discouraged for safety reasons before,
> > > however for obvious reasons we are now tightening behaviour on this.
> > > 
> > > Fixes: CVE-2023-2861
> > > Reported-by: Yanwu Shen 
> > > Reported-by: Jietao Xiao 
> > > Reported-by: Jinku Li 
> > > Reported-by: Wenbo Shen 
> > > Signed-off-by: Christian Schoenebeck 
> > > ---
> > >  v1 -> v2:
> > >  - Add equivalent fix for 'proxy' fs driver.
> > >  - Minor adjustments on commit log.
> > > 
> > 
> > Note that this might be a bit confusing for reviewers since
> > v1 was never posted to qemu-devel. Technically, this should
> > have been posted without the v2 tag.
> 
> I felt it wouldn't make it any better, as it might otherwise confuse those who
> already got the previous two patch emails.
> 

No big deal.

> > >  fsdev/virtfs-proxy-helper.c | 48 +++--
> > >  hw/9pfs/9p-util.h   | 29 ++
> > >  2 files changed, 75 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/fsdev/virtfs-proxy-helper.c b/fsdev/virtfs-proxy-helper.c
> > > index 5cafcd7703..f311519fa3 100644
> > > --- a/fsdev/virtfs-proxy-helper.c
> > > +++ b/fsdev/virtfs-proxy-helper.c
> > > @@ -26,6 +26,7 @@
> > >  #include "qemu/xattr.h"
> > >  #include "9p-iov-marshal.h"
> > >  #include "hw/9pfs/9p-proxy.h"
> > > +#include "hw/9pfs/9p-util.h"
> > >  #include "fsdev/9p-iov-marshal.h"
> > >  
> > >  #define PROGNAME "virtfs-proxy-helper"
> > > @@ -338,6 +339,49 @@ static void resetugid(int suid, int sgid)
> > >  }
> > >  }
> > >  
> > > +/*
> > > + * Open regular file or directory. Attempts to open any special file are
> > > + * rejected.
> > > + *
> > > + * returns file descriptor or -1 on error
> > > + */
> > > +static int open_regular(const char *pathname, int flags, mode_t mode) {
> > > +int fd;
> > > +struct stat stbuf;
> > > +
> > > +fd = open(pathname, flags, mode);
> > > +if (fd < 0) {
> > > +return fd;
> > > +}
> > > +
> > > +/* CVE-2023-2861: Prohibit opening any special file directly on host
> > > + * (especially device files), as a compromised client could 
> > > potentially
> > > + * gain access outside exported tree under certain, unsafe setups. We
> > > + * expect client to handle I/O on special files exclusively on guest 
> > > side.
> > > + */
> > > > +if (qemu_fstat(fd, &stbuf) < 0) {
> > > +close_preserve_errno(fd);
> > > +return -1;
> > > +}
> > > +if (!S_ISREG(stbuf.st_mode) && !S_ISDIR(stbuf.st_mode)) {
> > > +/* Tcreate and Tlcreate 9p messages mandate to immediately open 
> > > the
> > > + * created file for I/O. So this is not (necessarily) due to a 
> > > broken
> > > + * client, and hence no error message is to be reported in this 
> > > case.
> > > + */
> > > +if (!(flags & O_CREAT)) {
> > 
> > Tlcreate is explicitly about creating regular files

Re: [PATCH v2] 9pfs: prevent opening special files (CVE-2023-2861)

2023-06-06 Thread Greg Kurz
Hi Christian,

On Tue, 06 Jun 2023 15:57:50 +0200
Christian Schoenebeck  wrote:

> The 9p protocol does not specifically define how server shall behave when
> client tries to open a special file, however from security POV it does
> make sense for 9p server to prohibit opening any special file on host side
> in general. A sane Linux 9p client for instance would never attempt to
> open a special file on host side, it would always handle those exclusively
> on its guest side. A malicious client however could potentially escape
> from the exported 9p tree by creating and opening a device file on host
> side.
> 
> With QEMU this could only be exploited in the following unsafe setups:
> 
>   - Running QEMU binary as root AND 9p 'local' fs driver AND 'passthrough'
> security model.
> 
> or
> 
>   - Using 9p 'proxy' fs driver (which is running its helper daemon as
> root).
> 
> These setups were already discouraged for safety reasons before,
> however for obvious reasons we are now tightening behaviour on this.
> 
> Fixes: CVE-2023-2861
> Reported-by: Yanwu Shen 
> Reported-by: Jietao Xiao 
> Reported-by: Jinku Li 
> Reported-by: Wenbo Shen 
> Signed-off-by: Christian Schoenebeck 
> ---
>  v1 -> v2:
>  - Add equivalent fix for 'proxy' fs driver.
>  - Minor adjustments on commit log.
> 

Note that this might be a bit confusing for reviewers since
v1 was never posted to qemu-devel. Technically, this should
have been posted without the v2 tag.

>  fsdev/virtfs-proxy-helper.c | 48 +++--
>  hw/9pfs/9p-util.h   | 29 ++
>  2 files changed, 75 insertions(+), 2 deletions(-)
> 
> diff --git a/fsdev/virtfs-proxy-helper.c b/fsdev/virtfs-proxy-helper.c
> index 5cafcd7703..f311519fa3 100644
> --- a/fsdev/virtfs-proxy-helper.c
> +++ b/fsdev/virtfs-proxy-helper.c
> @@ -26,6 +26,7 @@
>  #include "qemu/xattr.h"
>  #include "9p-iov-marshal.h"
>  #include "hw/9pfs/9p-proxy.h"
> +#include "hw/9pfs/9p-util.h"
>  #include "fsdev/9p-iov-marshal.h"
>  
>  #define PROGNAME "virtfs-proxy-helper"
> @@ -338,6 +339,49 @@ static void resetugid(int suid, int sgid)
>  }
>  }
>  
> +/*
> + * Open regular file or directory. Attempts to open any special file are
> + * rejected.
> + *
> + * returns file descriptor or -1 on error
> + */
> +static int open_regular(const char *pathname, int flags, mode_t mode) {
> +int fd;
> +struct stat stbuf;
> +
> +fd = open(pathname, flags, mode);
> +if (fd < 0) {
> +return fd;
> +}
> +
> +/* CVE-2023-2861: Prohibit opening any special file directly on host
> + * (especially device files), as a compromised client could potentially
> + * gain access outside exported tree under certain, unsafe setups. We
> + * expect client to handle I/O on special files exclusively on guest side.
> + */
> +if (qemu_fstat(fd, &stbuf) < 0) {
> +close_preserve_errno(fd);
> +return -1;
> +}
> +if (!S_ISREG(stbuf.st_mode) && !S_ISDIR(stbuf.st_mode)) {
> +/* Tcreate and Tlcreate 9p messages mandate to immediately open the
> + * created file for I/O. So this is not (necessarily) due to a broken
> + * client, and hence no error message is to be reported in this case.
> + */
> +if (!(flags & O_CREAT)) {

Tlcreate is explicitly about creating regular files only (see [1] and
v9fs_lcreate()) and I don't quite see how open() could successfully
create a regular file and the resulting fd is fstat'ed as something
else.

Tcreate seems to cover more types but again only regular files (with O_CREAT)
or directories (without O_CREAT) are expected here (see v9fs_create()).

Unless I'm missing something, it seems that the comment and the O_CREAT
check should be removed.

[1] 
https://github.com/chaos/diod/blob/master/protocol.md#lcreatecreate-regular-file
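
For illustration, here is a minimal standalone sketch of the check with the
O_CREAT special case and the error report dropped, as suggested above. It is
not the code from the patch: open_regular_sketch() is a hypothetical name, and
plain POSIX fstat()/close() stand in for QEMU's qemu_fstat() and
close_preserve_errno().

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical standalone variant of open_regular(): open the path, then
 * reject anything that is neither a regular file nor a directory.
 * Returns a file descriptor, or -1 with errno set.
 */
static int open_regular_sketch(const char *pathname, int flags, mode_t mode)
{
    struct stat stbuf;
    int fd;

    assert(pathname != NULL);

    fd = open(pathname, flags, mode);
    if (fd < 0) {
        return -1;
    }

    if (fstat(fd, &stbuf) < 0) {
        /* Keep the errno of fstat() across close(). */
        int saved_errno = errno;
        close(fd);
        errno = saved_errno;
        return -1;
    }

    if (!S_ISREG(stbuf.st_mode) && !S_ISDIR(stbuf.st_mode)) {
        /* Special file (device, FIFO, socket, ...): refuse it. */
        close(fd);
        errno = ENXIO;
        return -1;
    }

    return fd;
}
```

With this sketch, opening "/dev/null" is expected to fail with ENXIO, while
opening "." read-only is expected to succeed.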

> +error_report_once(
> +"9p: broken or compromised client detected; attempt to open "
> +"special file (i.e. neither regular file, nor directory)"
> +);
> +}
> +close(fd);
> +errno = ENXIO;
> +return -1;
> +}
> +
> +return fd;
> +}
> +
>  /*
>   * send response in two parts
>   * 1) ProxyHeader
> @@ -682,7 +726,7 @@ static int do_create(struct iovec *iovec)
>  if (ret < 0) {
>  goto unmarshal_err_out;
>  }
> -ret = open(path.data, flags, mode);
> +ret = open_regular(path.data, flags, mode);
>  if (ret < 0) {
>  ret = -errno;
>  }
> @@ -707,7 +751,7 @@ static int do_open(struct iovec *iovec)
>  if (ret < 0) {
>  goto err_out;
>  }
> -ret = open(path.data, flags);
> +ret = open_regular(path.data, flags, 0);
>  if (ret < 0) {
>  ret = -errno;
>  }
> diff --git a/hw/9pfs/9p-util.h b/hw/9pfs/9p-util.h
> index c314cf381d..9da1a0538d 100644
> --- a/hw/9pfs/9p-util.h
> +++ b/hw/9pfs/9p-util.h
> @@ -13,6 +13,8 @@
>  #ifndef QEMU_9P_UTIL_H
>  #define 

Re: [PATCH v3] target: ppc: Use MSR_HVB bit to get the target endianness for memory dump

2023-05-23 Thread Greg Kurz
On Tue, 23 May 2023 12:20:17 +0530
Narayana Murty N  wrote:

> 
> On 5/22/23 23:50, Greg Kurz wrote:
> > On Mon, 22 May 2023 12:02:42 -0400
> > Narayana Murty N  wrote:
> >
> >> Currently on PPC64 qemu always dumps the guest memory in
> >> Big Endian (BE) format even though the guest is running in Little Endian
> >> (LE) mode. So the crash tool fails to load the dump as illustrated below:
> >>
> >> Log :
> >> $ virsh dump DOMAIN --memory-only dump.file
> >>
> >> Domain 'DOMAIN' dumped to dump.file
> >>
> >> $ crash vmlinux dump.file
> >>
> >> 
> >> crash 8.0.2-1.el9
> >>
> >> WARNING: endian mismatch:
> >>crash utility: little-endian
> >>dump.file: big-endian
> >>
> >> WARNING: machine type mismatch:
> >>crash utility: PPC64
> >>dump.file: (unknown)
> >>
> >> crash: dump.file: not a supported file format
> >> 
> >>
> >> This happens because cpu_get_dump_info() passes cpu->env.has_hv_mode
> >> to ppc_interrupts_little_endian(), and cpu->env.has_hv_mode is
> >> always set for powernv even when the guest is not running in hv mode.
> >> The hv mode should instead be taken from the MSR_HVB bit of msr_mask
> >> (cpu->env.msr_mask & MSR_HVB). This patch fixes the issue by passing
> >> the MSR_HVB value to ppc_interrupts_little_endian() in order to determine
> >> the guest endianness.
> >>
> >> The crash tool also expects the guest kernel endianness to match the
> >> endianness of the dump.
> >>
> >> The patch was tested on POWER9 box booted with Linux as host in
> >> following cases:
> >>
> >> Host-Endianness  Qemu-Target-Machine  Qemu-Guest-Endianness  Qemu-Generated-Guest-Memory-Dump-Format
> >> BE               powernv              LE KVM guest           LE
> >> BE               powernv              BE KVM guest           BE
> >> LE               powernv              LE KVM guest           LE
> >> LE               powernv              BE KVM guest           BE
> > I don't quite understand why KVM is mentioned with the powernv machine.
> 
> guest running mode was mentioned.
> 

QEMU cannot use KVM on the host to run a powernv machine. The
guest is thus necessarily running in TCG mode.

Please describe your setup and what exactly you are testing.

> >
> > Also have you tried to dump at various moments, e.g. during skiboot
> > and when guest is booted, as in [1] which introduced the code this
> > patch is changing ?
> >
> > [1]https://github.com/qemu/qemu/commit/5609400a422809c89ea788e4d0e13124a617582e.
> >
> >> LE               pseries KVM          LE KVM guest           LE
> >> LE               pseries TCG          LE guest               LE
> >>
> > Fixes: 5609400a4228 ("target/ppc: Set the correct endianness for powernv memory dumps")
> 
> I agree, commit 5609400a4228 fixes endianness detection only for the initial
> stage (skiboot), until the endianness switch happens.
> However, has_hv_mode is just a capability flag which is always set based on a
> command-line parameter and doesn't really represent the current hv state.
> With this patch, the code relies on the current hv state, taken from the
> MSR_HVB bit of msr_mask.
> 

Yes, I see what your patch is doing. The 'Fixes: 5609400a4228 ...' line is
intended for the changelog, because your patch is supposedly a fix for that
commit.

> >
> >> Signed-off-by: Narayana Murty N
> >> ---
> >> Changes since V2:
> >> commit message modified as per feedback from Nicholas Piggin.
> >> Changes since V1:
> >> https://lore.kernel.org/qemu-devel/20230420145055.10196-1-nnmli...@linux.ibm.com/
> >> The approach to solve the issue was changed based on feedback from
> >> Fabiano Rosas on patch V1.
> >> ---
> >>   target/ppc/arch_dump.c | 2 +-
> >>   1 file changed, 1 insertion(+), 1 deletion(-)
> >>
> >> diff --git a/target/ppc/arch_dump.c b/target/ppc/arch_dump.c
> >> index f58e6359d5..a8315659d9 100644
> >> --- a/target/ppc/arch_dump.c
> >> +++ b/target/ppc/arch_dump.c
> >> @@ -237,7 +237,7 @@ int cpu_get_dump_info(ArchDumpInfo *info,
> >>   info->d_machine = PPC_ELF_MACHINE;
> >>   info->d_class = ELFCLASS;
> >>   
> >> -if (ppc_interrupts_little_endian(cpu, cpu->env.has_hv_mode)) {
> >> +if (ppc_interrupts_little_endian(cpu, !!(cpu->env.msr_mask & MSR_HVB))) {
> >>   info->d_endian = ELFDATA2LSB;
> >>   } else {
> >>   info->d_endian = ELFDATA2MSB;



Re: [PATCH v3] target: ppc: Use MSR_HVB bit to get the target endianness for memory dump

2023-05-22 Thread Greg Kurz
On Mon, 22 May 2023 12:02:42 -0400
Narayana Murty N  wrote:

> Currently on PPC64 qemu always dumps the guest memory in
> Big Endian (BE) format even though the guest is running in Little Endian
> (LE) mode. So the crash tool fails to load the dump as illustrated below:
> 
> Log :
> $ virsh dump DOMAIN --memory-only dump.file
> 
> Domain 'DOMAIN' dumped to dump.file
> 
> $ crash vmlinux dump.file
> 
> 
> crash 8.0.2-1.el9
> 
> WARNING: endian mismatch:
>   crash utility: little-endian
>   dump.file: big-endian
> 
> WARNING: machine type mismatch:
>   crash utility: PPC64
>   dump.file: (unknown)
> 
> crash: dump.file: not a supported file format
> 
> 
> This happens because cpu_get_dump_info() passes cpu->env.has_hv_mode
> to ppc_interrupts_little_endian(), and cpu->env.has_hv_mode is
> always set for powernv even when the guest is not running in hv mode.
> The hv mode should instead be taken from the MSR_HVB bit of msr_mask
> (cpu->env.msr_mask & MSR_HVB). This patch fixes the issue by passing
> the MSR_HVB value to ppc_interrupts_little_endian() in order to determine
> the guest endianness.
> 
> The crash tool also expects the guest kernel endianness to match the
> endianness of the dump.
> 
> The patch was tested on POWER9 box booted with Linux as host in
> following cases:
> 
> Host-Endianness  Qemu-Target-Machine  Qemu-Guest-Endianness  Qemu-Generated-Guest-Memory-Dump-Format
> BE               powernv              LE KVM guest           LE
> BE               powernv              BE KVM guest           BE
> LE               powernv              LE KVM guest           LE
> LE               powernv              BE KVM guest           BE

I don't quite understand why KVM is mentioned with the powernv machine.

Also have you tried to dump at various moments, e.g. during skiboot
and when guest is booted, as in [1] which introduced the code this
patch is changing ?

[1] 
https://github.com/qemu/qemu/commit/5609400a422809c89ea788e4d0e13124a617582e.

> LE               pseries KVM          LE KVM guest           LE
> LE               pseries TCG          LE guest               LE
> 

Fixes: 5609400a4228 ("target/ppc: Set the correct endianness for powernv memory dumps")

> Signed-off-by: Narayana Murty N 
> ---
> Changes since V2:
> commit message modified as per feedback from Nicholas Piggin.
> Changes since V1:
> https://lore.kernel.org/qemu-devel/20230420145055.10196-1-nnmli...@linux.ibm.com/
> The approach to solve the issue was changed based on feedback from
> Fabiano Rosas on patch V1.
> ---
>  target/ppc/arch_dump.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/target/ppc/arch_dump.c b/target/ppc/arch_dump.c
> index f58e6359d5..a8315659d9 100644
> --- a/target/ppc/arch_dump.c
> +++ b/target/ppc/arch_dump.c
> @@ -237,7 +237,7 @@ int cpu_get_dump_info(ArchDumpInfo *info,
>  info->d_machine = PPC_ELF_MACHINE;
>  info->d_class = ELFCLASS;
>  
> -if (ppc_interrupts_little_endian(cpu, cpu->env.has_hv_mode)) {
> +if (ppc_interrupts_little_endian(cpu, !!(cpu->env.msr_mask & MSR_HVB))) {
>  info->d_endian = ELFDATA2LSB;
>  } else {
>  info->d_endian = ELFDATA2MSB;




Re: [PATCH] tests/9p: fix potential leak in v9fs_rreaddir()

2023-05-01 Thread Greg Kurz
On Sat, 29 Apr 2023 15:20:12 +0200
Christian Schoenebeck  wrote:

> On Saturday, April 29, 2023 2:04:30 PM CEST Greg Kurz wrote:
> > Hi Christian !
> 
> Hi there, it's been a while! :)
> 
> > On Sat, 29 Apr 2023 11:25:33 +0200
> > Christian Schoenebeck  wrote:
> > 
> > > Free allocated directory entries in v9fs_rreaddir() if argument
> > > `entries` was passed as NULL, to avoid a memory leak. It is
> > > explicitly allowed by design for `entries` to be NULL. [1]
> > > 
> > > [1] https://lore.kernel.org/all/1690923.g4PEXVpXuU@silver
> > > 
> > > Reported-by: Coverity (CID 1487558)
> > > Signed-off-by: Christian Schoenebeck 
> > > ---
> > 
> > Good catch Coverity ! :-)
> 
> Yeah, this Coverity report is actually from March and I ignored it so far,
> because the reported leak could never happen with current test code. But Paolo
> brought it up this week, so ...
> 
> > Reviewed-by: Greg Kurz 
> > 
> > I still have a suggestion. See below.
> > 
> > >  tests/qtest/libqos/virtio-9p-client.c | 5 +
> > >  1 file changed, 5 insertions(+)
> > > 
> > > diff --git a/tests/qtest/libqos/virtio-9p-client.c 
> > > b/tests/qtest/libqos/virtio-9p-client.c
> > > index e4a368e036..b8adc8d4b9 100644
> > > --- a/tests/qtest/libqos/virtio-9p-client.c
> > > +++ b/tests/qtest/libqos/virtio-9p-client.c
> > > @@ -594,6 +594,8 @@ void v9fs_rreaddir(P9Req *req, uint32_t *count, 
> > > uint32_t *nentries,
> > >  {
> > >  uint32_t local_count;
> > >  struct V9fsDirent *e = NULL;
> > > +/* only used to avoid a leak if entries was NULL */
> > > +struct V9fsDirent *unused_entries = NULL;
> > >  uint16_t slen;
> > >  uint32_t n = 0;
> > >  
> > > @@ -612,6 +614,8 @@ void v9fs_rreaddir(P9Req *req, uint32_t *count, 
> > > uint32_t *nentries,
> > >  e = g_new(struct V9fsDirent, 1);
> > >  if (entries) {
> > >  *entries = e;
> > > +} else {
> > > +unused_entries = e;
> > >  }
> > >  } else {
> > >  e = e->next = g_new(struct V9fsDirent, 1);
> > 
> > This is always allocating and chaining a new entry even
> > though it isn't needed in the entries == NULL case.
> > 
> > > @@ -628,6 +632,7 @@ void v9fs_rreaddir(P9Req *req, uint32_t *count, 
> > > uint32_t *nentries,
> > >  *nentries = n;
> > >  }
> > >  
> > > +v9fs_free_dirents(unused_entries);
> > 
> > This is going to loop again on all entries to free them.
> > 
> > >  v9fs_req_free(req);
> > >  }
> > >  
> > 
> > If this function is to be called one day with an enormous
> > number of entries and entries == NULL case, this might
> > not scale well.
> > 
> > What about only allocating a single entry in this case ?
> > 
> > E.g.
> > 
> > @@ -593,7 +593,7 @@ void v9fs_rreaddir(P9Req *req, uint32_t *count, 
> > uint32_t *nentries,
> > struct V9fsDirent **entries)
> >  {
> >  uint32_t local_count;
> > -struct V9fsDirent *e = NULL;
> > +g_autofree struct V9fsDirent *e = NULL;
> >  uint16_t slen;
> >  uint32_t n = 0;
> >  
> > @@ -611,10 +611,12 @@ void v9fs_rreaddir(P9Req *req, uint32_t *count, 
> > uint32_t *nentries,
> >  if (!e) {
> >  e = g_new(struct V9fsDirent, 1);
> >  if (entries) {
> > -*entries = e;
> > +*entries = g_steal_pointer(e);
> 
> g_steal_pointer(e) just sets `e` to NULL and returns its old value, so ...
> 
> >  }
> >  } else {
> > -e = e->next = g_new(struct V9fsDirent, 1);
> > +if (entries) {
> > +e = e->next = g_new(struct V9fsDirent, 1);
> > +}
> 
> ... this `else` block would never be reached and no list assembled.
> 
> >  }
> >  e->next = NULL;
> >  /* qid[13] offset[8] type[1] name[s] */
> 
> And even if above's issue was fixed, then it would cause a use-after-free for
> the last element in the list if entries != NULL and caller trying to access
> the last element afterwards. So you would still need a separate g_autofree
> pointer instead of tagging `e` directly, or something like this after loop
> end:
> 
>   if (entries)
> g_steal_pointer(e);
> 
> Which would somehow defeat the purpose of using g_autofree though.
> 
> I mean, yes this could be addressed, but is it worth it? I don't know. Even
> this reported leak is a purely theoretical one, but I understand if people
> want to silence a warning.
> 

Yeah you're right.

Cheers,

--
Greg

> Best regards,
> Christian Schoenebeck
> 
> 
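
For readers following the thread, the pattern that the posted patch settles on
(always build the list, then hand it to the caller or free it) can be sketched
in isolation. This is only an illustration with hypothetical names
(dirent_sketch, build_dirents, free_dirents) and plain calloc()/free() instead
of GLib; it is not the libqos code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct V9fsDirent: a singly linked list node. */
struct dirent_sketch {
    unsigned id;
    struct dirent_sketch *next;
};

/* Free a whole list, in the spirit of v9fs_free_dirents(). */
static void free_dirents(struct dirent_sketch *e)
{
    while (e) {
        struct dirent_sketch *next = e->next;
        free(e);
        e = next;
    }
}

/*
 * Build a list of n entries. If the caller passes entries != NULL, ownership
 * of the list is handed over; otherwise the list is freed before returning,
 * so nothing leaks. Returns the number of entries that were built.
 */
static unsigned build_dirents(unsigned n, struct dirent_sketch **entries)
{
    struct dirent_sketch *head = NULL;
    struct dirent_sketch *tail = NULL;
    unsigned count = 0;

    for (unsigned i = 0; i < n; i++) {
        struct dirent_sketch *node = calloc(1, sizeof(*node));
        assert(node != NULL); /* g_new() likewise aborts on allocation failure */
        node->id = i;
        if (!head) {
            head = node;
        } else {
            tail->next = node;
        }
        tail = node;
        count++;
    }

    if (entries) {
        *entries = head;      /* caller now owns the list */
    } else {
        free_dirents(head);   /* nobody wants the list: free it here */
    }
    return count;
}
```

The second pass over the list in the entries == NULL case is the scaling cost
raised earlier in the thread; the sketch accepts it in exchange for a single
ownership rule.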




Re: [PATCH] tests/9p: fix potential leak in v9fs_rreaddir()

2023-04-29 Thread Greg Kurz
Hi Christian !

On Sat, 29 Apr 2023 11:25:33 +0200
Christian Schoenebeck  wrote:

> Free allocated directory entries in v9fs_rreaddir() if argument
> `entries` was passed as NULL, to avoid a memory leak. It is
> explicitly allowed by design for `entries` to be NULL. [1]
> 
> [1] https://lore.kernel.org/all/1690923.g4PEXVpXuU@silver
> 
> Reported-by: Coverity (CID 1487558)
> Signed-off-by: Christian Schoenebeck 
> ---

Good catch Coverity ! :-)

Reviewed-by: Greg Kurz 

I still have a suggestion. See below.

>  tests/qtest/libqos/virtio-9p-client.c | 5 +
>  1 file changed, 5 insertions(+)
> 
> diff --git a/tests/qtest/libqos/virtio-9p-client.c b/tests/qtest/libqos/virtio-9p-client.c
> index e4a368e036..b8adc8d4b9 100644
> --- a/tests/qtest/libqos/virtio-9p-client.c
> +++ b/tests/qtest/libqos/virtio-9p-client.c
> @@ -594,6 +594,8 @@ void v9fs_rreaddir(P9Req *req, uint32_t *count, uint32_t *nentries,
>  {
>  uint32_t local_count;
>  struct V9fsDirent *e = NULL;
> +/* only used to avoid a leak if entries was NULL */
> +struct V9fsDirent *unused_entries = NULL;
>  uint16_t slen;
>  uint32_t n = 0;
>  
> @@ -612,6 +614,8 @@ void v9fs_rreaddir(P9Req *req, uint32_t *count, uint32_t *nentries,
>  e = g_new(struct V9fsDirent, 1);
>  if (entries) {
>  *entries = e;
> +} else {
> +unused_entries = e;
>  }
>  } else {
>  e = e->next = g_new(struct V9fsDirent, 1);

This is always allocating and chaining a new entry even
though it isn't needed in the entries == NULL case.

> @@ -628,6 +632,7 @@ void v9fs_rreaddir(P9Req *req, uint32_t *count, uint32_t *nentries,
>  *nentries = n;
>  }
>  
> +v9fs_free_dirents(unused_entries);

This is going to loop again on all entries to free them.

>  v9fs_req_free(req);
>  }
>  

If this function is to be called one day with an enormous
number of entries and entries == NULL case, this might
not scale well.

What about only allocating a single entry in this case ?

E.g.

@@ -593,7 +593,7 @@ void v9fs_rreaddir(P9Req *req, uint32_t *count, uint32_t *nentries,
struct V9fsDirent **entries)
 {
 uint32_t local_count;
-struct V9fsDirent *e = NULL;
+g_autofree struct V9fsDirent *e = NULL;
 uint16_t slen;
 uint32_t n = 0;
 
@@ -611,10 +611,12 @@ void v9fs_rreaddir(P9Req *req, uint32_t *count, uint32_t *nentries,
 if (!e) {
 e = g_new(struct V9fsDirent, 1);
 if (entries) {
-*entries = e;
+*entries = g_steal_pointer(e);
 }
 } else {
-e = e->next = g_new(struct V9fsDirent, 1);
+if (entries) {
+e = e->next = g_new(struct V9fsDirent, 1);
+}
 }
 e->next = NULL;
 /* qid[13] offset[8] type[1] name[s] */




Re: [PATCH 0/2] vhost-user: Remove the nested event loop to unbreak the DPDK use case

2023-01-23 Thread Greg Kurz
On Thu, 19 Jan 2023 18:24:22 +0100
Greg Kurz  wrote:

> The nested event loop was introduced in QEMU 6.0 to allow servicing
> of requests coming from the slave channel while waiting for an ack
> from the back-end on the master socket. It turns out this is fragile
> and breaks if the servicing of the slave channel causes a new message
> to be sent on the master socket. This is exactly what happens when
> using DPDK as reported in [0].
> 
> The only identified user for the nested loop is DAX enablement that
> isn't upstream yet. Just drop the code for now. Some more clever
> solution should be designed when the need to service concurrent
> requests from both channels arises again.
> 
> Greg Kurz (2):
>   Revert "vhost-user: Monitor slave channel in vhost_user_read()"
>   Revert "vhost-user: Introduce nested event loop in vhost_user_read()"
> 
>  hw/virtio/vhost-user.c | 100 -
>  1 file changed, 8 insertions(+), 92 deletions(-)
> 

Hi Michael,

Can you please merge this series as you kindly proposed in [0] ?
This will help to fix [1] which is currently blocking downstream
testing.

Cheers,

--
Greg

[0] 
https://lore.kernel.org/qemu-devel/20230118060102-mutt-send-email-...@kernel.org/
[1] https://bugzilla.redhat.com/show_bug.cgi?id=2155173



[PATCH 0/2] vhost-user: Remove the nested event loop to unbreak the DPDK use case

2023-01-20 Thread Greg Kurz
The nested event loop was introduced in QEMU 6.0 to allow servicing
of requests coming from the slave channel while waiting for an ack
from the back-end on the master socket. It turns out this is fragile
and breaks if the servicing of the slave channel causes a new message
to be sent on the master socket. This is exactly what happens when
using DPDK as reported in [0].

The only identified user for the nested loop is DAX enablement that
isn't upstream yet. Just drop the code for now. Some more clever
solution should be designed when the need to service concurrent
requests from both channels arises again.

Greg Kurz (2):
  Revert "vhost-user: Monitor slave channel in vhost_user_read()"
  Revert "vhost-user: Introduce nested event loop in vhost_user_read()"

 hw/virtio/vhost-user.c | 100 -
 1 file changed, 8 insertions(+), 92 deletions(-)

-- 
2.39.0





Re: [PATCH 1/2] Revert "vhost-user: Monitor slave channel in vhost_user_read()"

2023-01-19 Thread Greg Kurz
For some reason, the cover letter didn't make it, even though
git publish reported it was sent. I'll repost tomorrow if it
is still missing (or possibly craft a message manually with
the appropriate id if I find time).

Sorry for the inconvenience.

Cheers,

--
Greg


On Thu, 19 Jan 2023 18:24:23 +0100
Greg Kurz  wrote:

> This reverts commit db8a3772e300c1a656331a92da0785d81667dc81.
> 
> Motivation : this is breaking vhost-user with DPDK as reported in [0].
> 
> Received unexpected msg type. Expected 22 received 40
> Fail to update device iotlb
> Received unexpected msg type. Expected 40 received 22
> Received unexpected msg type. Expected 22 received 11
> Fail to update device iotlb
> Received unexpected msg type. Expected 11 received 22
> vhost VQ 1 ring restore failed: -71: Protocol error (71)
> Received unexpected msg type. Expected 22 received 11
> Fail to update device iotlb
> Received unexpected msg type. Expected 11 received 22
> vhost VQ 0 ring restore failed: -71: Protocol error (71)
> unable to start vhost net: 71: falling back on userspace virtio
> 
> The failing sequence that leads to the first error is :
> - QEMU sends a VHOST_USER_GET_STATUS (40) request to DPDK on the master
>   socket
> - QEMU starts a nested event loop in order to wait for the
>   VHOST_USER_GET_STATUS response and to be able to process messages from
>   the slave channel
> - DPDK sends a couple of legitimate IOTLB miss messages on the slave
>   channel
> - QEMU processes each IOTLB request and sends VHOST_USER_IOTLB_MSG (22)
>   updates on the master socket
> - QEMU expects to receive a response for the latest VHOST_USER_IOTLB_MSG
>   but it gets the response for the VHOST_USER_GET_STATUS instead
> 
> The subsequent errors have the same root cause : the nested event loop
> breaks the order by design. It lures QEMU to expect responses to the
> latest message sent on the master socket to arrive first.
> 
> Since this was only needed for DAX enablement which is still not merged
> upstream, just drop the code for now. A working solution will have to
> be merged later on. Likely protect the master socket with a mutex
> and service the slave channel with a separate thread, as discussed with
> Maxime in the mail thread below.
> 
> [0] 
> https://lore.kernel.org/qemu-devel/43145ede-89dc-280e-b953-6a2b436de...@redhat.com/
> 
> Reported-by: Yanghang Liu 
> Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2155173
> Signed-off-by: Greg Kurz 
> ---
>  hw/virtio/vhost-user.c | 35 +++
>  1 file changed, 3 insertions(+), 32 deletions(-)
> 
> diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
> index d9ce0501b2c7..7fb78af22c56 100644
> --- a/hw/virtio/vhost-user.c
> +++ b/hw/virtio/vhost-user.c
> @@ -356,35 +356,6 @@ end:
>  return G_SOURCE_REMOVE;
>  }
>  
> -static gboolean slave_read(QIOChannel *ioc, GIOCondition condition,
> -   gpointer opaque);
> -
> -/*
> - * This updates the read handler to use a new event loop context.
> - * Event sources are removed from the previous context : this ensures
> - * that events detected in the previous context are purged. They will
> - * be re-detected and processed in the new context.
> - */
> -static void slave_update_read_handler(struct vhost_dev *dev,
> -  GMainContext *ctxt)
> -{
> -struct vhost_user *u = dev->opaque;
> -
> -if (!u->slave_ioc) {
> -return;
> -}
> -
> -if (u->slave_src) {
> -g_source_destroy(u->slave_src);
> -g_source_unref(u->slave_src);
> -}
> -
> -u->slave_src = qio_channel_add_watch_source(u->slave_ioc,
> -G_IO_IN | G_IO_HUP,
> -slave_read, dev, NULL,
> -ctxt);
> -}
> -
>  static int vhost_user_read(struct vhost_dev *dev, VhostUserMsg *msg)
>  {
>  struct vhost_user *u = dev->opaque;
> @@ -406,7 +377,6 @@ static int vhost_user_read(struct vhost_dev *dev, 
> VhostUserMsg *msg)
>   * be prepared for re-entrancy. So we create a new one and switch chr
>   * to use it.
>   */
> -slave_update_read_handler(dev, ctxt);
>  qemu_chr_be_update_read_handlers(chr->chr, ctxt);
>  qemu_chr_fe_add_watch(chr, G_IO_IN | G_IO_HUP, vhost_user_read_cb, &data);
>  
> @@ -418,7 +388,6 @@ static int vhost_user_read(struct vhost_dev *dev, 
> VhostUserMsg *msg)
>   * context that have been processed by the nested loop are purged.
>   */
>  qemu_chr_be_update_read_handlers(chr->chr, prev_ctxt);
> -slave_update_r

[PATCH 1/2] Revert "vhost-user: Monitor slave channel in vhost_user_read()"

2023-01-19 Thread Greg Kurz
This reverts commit db8a3772e300c1a656331a92da0785d81667dc81.

Motivation : this is breaking vhost-user with DPDK as reported in [0].

Received unexpected msg type. Expected 22 received 40
Fail to update device iotlb
Received unexpected msg type. Expected 40 received 22
Received unexpected msg type. Expected 22 received 11
Fail to update device iotlb
Received unexpected msg type. Expected 11 received 22
vhost VQ 1 ring restore failed: -71: Protocol error (71)
Received unexpected msg type. Expected 22 received 11
Fail to update device iotlb
Received unexpected msg type. Expected 11 received 22
vhost VQ 0 ring restore failed: -71: Protocol error (71)
unable to start vhost net: 71: falling back on userspace virtio

The failing sequence that leads to the first error is :
- QEMU sends a VHOST_USER_GET_STATUS (40) request to DPDK on the master
  socket
- QEMU starts a nested event loop in order to wait for the
  VHOST_USER_GET_STATUS response and to be able to process messages from
  the slave channel
- DPDK sends a couple of legitimate IOTLB miss messages on the slave
  channel
- QEMU processes each IOTLB request and sends VHOST_USER_IOTLB_MSG (22)
  updates on the master socket
- QEMU expects to receive a response for the latest VHOST_USER_IOTLB_MSG
  but it gets the response for the VHOST_USER_GET_STATUS instead

The subsequent errors have the same root cause : the nested event loop
breaks the order by design. It lures QEMU to expect responses to the
latest message sent on the master socket to arrive first.
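
The ordering problem can be reproduced with a toy model. The sketch below is
not QEMU code: reply_fifo, send_request(), read_reply() and simulate() are
hypothetical names, and the model simply assumes that the back-end answers
requests on the master socket in the order they were sent. Only the request
numbers 22 and 40 come from the log above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Request codes as they appear in the error log. */
enum { SKETCH_IOTLB_MSG = 22, SKETCH_GET_STATUS = 40 };

/* Toy master socket: replies queue up in the order requests were sent. */
struct reply_fifo {
    int items[8];
    size_t head, tail;
};

static void send_request(struct reply_fifo *f, int type)
{
    assert(f->tail < sizeof(f->items) / sizeof(f->items[0]));
    f->items[f->tail++] = type; /* assumed: the back-end answers in this order */
}

/* Read the next reply; return 1 if it is not the one the caller expects. */
static int read_reply(struct reply_fifo *f, int expected)
{
    int got;

    assert(f->head < f->tail);
    got = f->items[f->head++];
    if (got != expected) {
        printf("Received unexpected msg type. Expected %d received %d\n",
               expected, got);
        return 1;
    }
    return 0;
}

/* Return the number of mismatched replies for one GET_STATUS round trip. */
static int simulate(bool nested)
{
    struct reply_fifo master = { .head = 0, .tail = 0 };
    int mismatches = 0;

    send_request(&master, SKETCH_GET_STATUS);
    if (nested) {
        /* The nested loop services an IOTLB miss before the first reply. */
        send_request(&master, SKETCH_IOTLB_MSG);
        mismatches += read_reply(&master, SKETCH_IOTLB_MSG);
        mismatches += read_reply(&master, SKETCH_GET_STATUS);
    } else {
        /* Strictly sequential: each reply is read before the next request. */
        mismatches += read_reply(&master, SKETCH_GET_STATUS);
        send_request(&master, SKETCH_IOTLB_MSG);
        mismatches += read_reply(&master, SKETCH_IOTLB_MSG);
    }
    return mismatches;
}
```

In this model the nested case reports two mismatches (expected 22 received 40,
then expected 40 received 22), while the sequential case reports none.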

Since this was only needed for DAX enablement which is still not merged
upstream, just drop the code for now. A working solution will have to
be merged later on. Likely protect the master socket with a mutex
and service the slave channel with a separate thread, as discussed with
Maxime in the mail thread below.

[0] 
https://lore.kernel.org/qemu-devel/43145ede-89dc-280e-b953-6a2b436de...@redhat.com/

Reported-by: Yanghang Liu 
Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2155173
Signed-off-by: Greg Kurz 
---
 hw/virtio/vhost-user.c | 35 +++
 1 file changed, 3 insertions(+), 32 deletions(-)

diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index d9ce0501b2c7..7fb78af22c56 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -356,35 +356,6 @@ end:
 return G_SOURCE_REMOVE;
 }
 
-static gboolean slave_read(QIOChannel *ioc, GIOCondition condition,
-   gpointer opaque);
-
-/*
- * This updates the read handler to use a new event loop context.
- * Event sources are removed from the previous context : this ensures
- * that events detected in the previous context are purged. They will
- * be re-detected and processed in the new context.
- */
-static void slave_update_read_handler(struct vhost_dev *dev,
-  GMainContext *ctxt)
-{
-struct vhost_user *u = dev->opaque;
-
-if (!u->slave_ioc) {
-return;
-}
-
-if (u->slave_src) {
-g_source_destroy(u->slave_src);
-g_source_unref(u->slave_src);
-}
-
-u->slave_src = qio_channel_add_watch_source(u->slave_ioc,
-G_IO_IN | G_IO_HUP,
-slave_read, dev, NULL,
-ctxt);
-}
-
 static int vhost_user_read(struct vhost_dev *dev, VhostUserMsg *msg)
 {
 struct vhost_user *u = dev->opaque;
@@ -406,7 +377,6 @@ static int vhost_user_read(struct vhost_dev *dev, VhostUserMsg *msg)
  * be prepared for re-entrancy. So we create a new one and switch chr
  * to use it.
  */
-slave_update_read_handler(dev, ctxt);
 qemu_chr_be_update_read_handlers(chr->chr, ctxt);
 qemu_chr_fe_add_watch(chr, G_IO_IN | G_IO_HUP, vhost_user_read_cb, &data);
 
@@ -418,7 +388,6 @@ static int vhost_user_read(struct vhost_dev *dev, VhostUserMsg *msg)
  * context that have been processed by the nested loop are purged.
  */
 qemu_chr_be_update_read_handlers(chr->chr, prev_ctxt);
-slave_update_read_handler(dev, NULL);
 
 g_main_loop_unref(loop);
 g_main_context_unref(ctxt);
@@ -1807,7 +1776,9 @@ static int vhost_setup_slave_channel(struct vhost_dev *dev)
 return -ECONNREFUSED;
 }
 u->slave_ioc = ioc;
-slave_update_read_handler(dev, NULL);
+u->slave_src = qio_channel_add_watch_source(u->slave_ioc,
+G_IO_IN | G_IO_HUP,
+slave_read, dev, NULL, NULL);
 
 if (reply_supported) {
 msg.hdr.flags |= VHOST_USER_NEED_REPLY_MASK;
-- 
2.39.0




[PATCH 2/2] Revert "vhost-user: Introduce nested event loop in vhost_user_read()"

2023-01-19 Thread Greg Kurz
This reverts commit a7f523c7d114d445c5d83aecdba3efc038e5a692.

The nested event loop is broken by design. Its only user was removed.
Drop the code as well so that nobody ever tries to use it again.

I had to fix a couple of trivial conflicts around return values because
of 025faa872bcf ("vhost-user: stick to -errno error return convention").

Signed-off-by: Greg Kurz 
---
 hw/virtio/vhost-user.c | 65 --
 1 file changed, 5 insertions(+), 60 deletions(-)

diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index 7fb78af22c56..e14895c919ef 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -305,19 +305,8 @@ static int vhost_user_read_header(struct vhost_dev *dev, VhostUserMsg *msg)
 return 0;
 }
 
-struct vhost_user_read_cb_data {
-struct vhost_dev *dev;
-VhostUserMsg *msg;
-GMainLoop *loop;
-int ret;
-};
-
-static gboolean vhost_user_read_cb(void *do_not_use, GIOCondition condition,
-   gpointer opaque)
+static int vhost_user_read(struct vhost_dev *dev, VhostUserMsg *msg)
 {
-struct vhost_user_read_cb_data *data = opaque;
-struct vhost_dev *dev = data->dev;
-VhostUserMsg *msg = data->msg;
 struct vhost_user *u = dev->opaque;
 CharBackend *chr = u->user->chr;
 uint8_t *p = (uint8_t *) msg;
@@ -325,8 +314,7 @@ static gboolean vhost_user_read_cb(void *do_not_use, GIOCondition condition,
 
 r = vhost_user_read_header(dev, msg);
 if (r < 0) {
-data->ret = r;
-goto end;
+return r;
 }
 
 /* validate message size is sane */
@@ -334,8 +322,7 @@ static gboolean vhost_user_read_cb(void *do_not_use, GIOCondition condition,
 error_report("Failed to read msg header."
 " Size %d exceeds the maximum %zu.", msg->hdr.size,
 VHOST_USER_PAYLOAD_SIZE);
-data->ret = -EPROTO;
-goto end;
+return -EPROTO;
 }
 
 if (msg->hdr.size) {
@@ -346,53 +333,11 @@ static gboolean vhost_user_read_cb(void *do_not_use, GIOCondition condition,
 int saved_errno = errno;
 error_report("Failed to read msg payload."
  " Read %d instead of %d.", r, msg->hdr.size);
-data->ret = r < 0 ? -saved_errno : -EIO;
-goto end;
+return r < 0 ? -saved_errno : -EIO;
 }
 }
 
-end:
-g_main_loop_quit(data->loop);
-return G_SOURCE_REMOVE;
-}
-
-static int vhost_user_read(struct vhost_dev *dev, VhostUserMsg *msg)
-{
-struct vhost_user *u = dev->opaque;
-CharBackend *chr = u->user->chr;
-GMainContext *prev_ctxt = chr->chr->gcontext;
-GMainContext *ctxt = g_main_context_new();
-GMainLoop *loop = g_main_loop_new(ctxt, FALSE);
-struct vhost_user_read_cb_data data = {
-.dev = dev,
-.loop = loop,
-.msg = msg,
-.ret = 0
-};
-
-/*
- * We want to be able to monitor the slave channel fd while waiting
- * for chr I/O. This requires an event loop, but we can't nest the
- * one to which chr is currently attached : its fd handlers might not
- * be prepared for re-entrancy. So we create a new one and switch chr
- * to use it.
- */
-qemu_chr_be_update_read_handlers(chr->chr, ctxt);
-qemu_chr_fe_add_watch(chr, G_IO_IN | G_IO_HUP, vhost_user_read_cb, &data);
-
-g_main_loop_run(loop);
-
-/*
- * Restore the previous event loop context. This also destroys/recreates
- * event sources : this guarantees that all pending events in the original
- * context that have been processed by the nested loop are purged.
- */
-qemu_chr_be_update_read_handlers(chr->chr, prev_ctxt);
-
-g_main_loop_unref(loop);
-g_main_context_unref(ctxt);
-
-return data.ret;
+return 0;
 }
 
 static int process_message_reply(struct vhost_dev *dev,
-- 
2.39.0




Re: [PATCH v4 19/19] Drop duplicate #include

2023-01-19 Thread Greg Kurz
On Thu, 19 Jan 2023 07:59:59 +0100
Markus Armbruster  wrote:

> Tracked down with the help of scripts/clean-includes.
> 
> Signed-off-by: Markus Armbruster 
> ---

For ppc changes.

Reviewed-by: Greg Kurz 

>  include/hw/arm/fsl-imx6ul.h   | 1 -
>  include/hw/arm/fsl-imx7.h | 1 -
>  backends/tpm/tpm_emulator.c   | 1 -
>  hw/acpi/piix4.c   | 1 -
>  hw/alpha/dp264.c  | 1 -
>  hw/arm/virt.c | 1 -
>  hw/arm/xlnx-versal.c  | 1 -
>  hw/block/pflash_cfi01.c   | 1 -
>  hw/core/machine.c | 1 -
>  hw/hppa/machine.c | 1 -
>  hw/i386/acpi-build.c  | 1 -
>  hw/loongarch/acpi-build.c | 1 -
>  hw/misc/macio/cuda.c  | 1 -
>  hw/misc/macio/pmu.c   | 1 -
>  hw/net/xilinx_axienet.c   | 1 -
>  hw/ppc/ppc405_uc.c| 2 --
>  hw/ppc/ppc440_bamboo.c| 1 -
>  hw/ppc/spapr_drc.c| 1 -
>  hw/rdma/vmw/pvrdma_dev_ring.c | 1 -
>  hw/remote/machine.c   | 1 -
>  hw/remote/remote-obj.c| 1 -
>  hw/rtc/mc146818rtc.c  | 1 -
>  hw/s390x/virtio-ccw-serial.c  | 1 -
>  migration/postcopy-ram.c  | 2 --
>  softmmu/dirtylimit.c  | 1 -
>  softmmu/runstate.c| 1 -
>  softmmu/vl.c  | 1 -
>  target/loongarch/translate.c  | 1 -
>  target/mips/tcg/translate.c   | 1 -
>  target/nios2/translate.c  | 2 --
>  tests/unit/test-cutils.c  | 1 -
>  ui/gtk.c  | 1 -
>  util/oslib-posix.c| 4 
>  33 files changed, 39 deletions(-)
> 
> diff --git a/include/hw/arm/fsl-imx6ul.h b/include/hw/arm/fsl-imx6ul.h
> index 7812e516a5..1952cb984d 100644
> --- a/include/hw/arm/fsl-imx6ul.h
> +++ b/include/hw/arm/fsl-imx6ul.h
> @@ -30,7 +30,6 @@
>  #include "hw/timer/imx_gpt.h"
>  #include "hw/timer/imx_epit.h"
>  #include "hw/i2c/imx_i2c.h"
> -#include "hw/gpio/imx_gpio.h"
>  #include "hw/sd/sdhci.h"
>  #include "hw/ssi/imx_spi.h"
>  #include "hw/net/imx_fec.h"
> diff --git a/include/hw/arm/fsl-imx7.h b/include/hw/arm/fsl-imx7.h
> index 4e5e071864..355bd8ea83 100644
> --- a/include/hw/arm/fsl-imx7.h
> +++ b/include/hw/arm/fsl-imx7.h
> @@ -32,7 +32,6 @@
>  #include "hw/timer/imx_gpt.h"
>  #include "hw/timer/imx_epit.h"
>  #include "hw/i2c/imx_i2c.h"
> -#include "hw/gpio/imx_gpio.h"
>  #include "hw/sd/sdhci.h"
>  #include "hw/ssi/imx_spi.h"
>  #include "hw/net/imx_fec.h"
> diff --git a/backends/tpm/tpm_emulator.c b/backends/tpm/tpm_emulator.c
> index 49cc3d749d..2b440d2c9a 100644
> --- a/backends/tpm/tpm_emulator.c
> +++ b/backends/tpm/tpm_emulator.c
> @@ -35,7 +35,6 @@
>  #include "sysemu/runstate.h"
>  #include "sysemu/tpm_backend.h"
>  #include "sysemu/tpm_util.h"
> -#include "sysemu/runstate.h"
>  #include "tpm_int.h"
>  #include "tpm_ioctl.h"
>  #include "migration/blocker.h"
> diff --git a/hw/acpi/piix4.c b/hw/acpi/piix4.c
> index 0a81f1ad93..df39f91294 100644
> --- a/hw/acpi/piix4.c
> +++ b/hw/acpi/piix4.c
> @@ -35,7 +35,6 @@
>  #include "sysemu/xen.h"
>  #include "qapi/error.h"
>  #include "qemu/range.h"
> -#include "hw/acpi/pcihp.h"
>  #include "hw/acpi/cpu_hotplug.h"
>  #include "hw/acpi/cpu.h"
>  #include "hw/hotplug.h"
> diff --git a/hw/alpha/dp264.c b/hw/alpha/dp264.c
> index c502c8c62a..4161f559a7 100644
> --- a/hw/alpha/dp264.c
> +++ b/hw/alpha/dp264.c
> @@ -18,7 +18,6 @@
>  #include "net/net.h"
>  #include "qemu/cutils.h"
>  #include "qemu/datadir.h"
> -#include "net/net.h"
>  
>  static uint64_t cpu_alpha_superpage_to_phys(void *opaque, uint64_t addr)
>  {
> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> index ea2413a0ba..d3849d7233 100644
> --- a/hw/arm/virt.c
> +++ b/hw/arm/virt.c
> @@ -33,7 +33,6 @@
>  #include "qemu/units.h"
>  #include "qemu/option.h"
>  #include "monitor/qdev.h"
> -#include "qapi/error.h"
>  #include "hw/sysbus.h"
>  #include "hw/arm/boot.h"
>  #include "hw/arm/primecell.h"
> diff --git a/hw/arm/xlnx-versal.c b/hw/arm/xlnx-versal.c
> index 57276e1506..69b1b99e93 100644
> --- a/hw/arm/xlnx-versal.c
> +++ b/hw/arm/xlnx-versal.c
> @@ -22,7 +22,6 @@
>  #include "hw/misc/unimp.h"
>  #include "hw/arm/xlnx-versal.h"
>  #include "qemu/log.h"
> -#include "hw/sysbus.h"
>  
>  #define XLNX_VERSAL_ACPU_TYPE ARM_CPU_TYPE_

Re: [PULL v4 76/83] vhost-user: Support vhost_dev_start

2023-01-17 Thread Greg Kurz
On Tue, 17 Jan 2023 18:55:24 +0100
Greg Kurz  wrote:

> On Tue, 17 Jan 2023 16:07:00 +0100
> Maxime Coquelin  wrote:
> 
> > 
> > 
> > On 1/17/23 13:36, Greg Kurz wrote:
> > > On Tue, 17 Jan 2023 13:12:57 +0100
> > > Greg Kurz  wrote:
> > > 
> > >> Hi Maxime,
> > >>
> > >> On Tue, 17 Jan 2023 10:49:37 +0100
> > >> Maxime Coquelin  wrote:
> > >>
> > >>> Hi Yajun,
> > >>>
> > >>> On 1/16/23 08:14, Yajun Wu wrote:
> > >>>> Not quite sure about the whole picture.
> > >>>>
> > >>>> Seems while qemu waiting response of vhost_user_get_status, dpdk send 
> > >>>> out VHOST_USER_SLAVE_IOTLB_MSG and trigger qemu function 
> > >>>> vhost_backend_update_device_iotlb.
> > >>>> Qemu wait on reply of VHOST_USER_IOTLB_MSG but get 
> > >>>> VHOST_USER_GET_STATUS reply.
> > >>>
> > >>> Thanks for the backtrace, that helps a lot.
> > >>>
> > >>> The issue happens because:
> > >>>1. The nested event loop introduced in vhost_user_read() [0] enables
> > >>> handling slave channel requests while waiting for a reply on the
> > >>> master channel.
> > >>>2. Handling of the VHOST_USER_SLAVE_IOTLB_MSG slave request ends up
> > >>> sending a VHOST_USER_IOTLB_MSG on the master channel.
> > >>>
> > >>> So while waiting for VHOST_USER_IOTLB_MSG reply, it receives the reply
> > >>> for the first request sent on the master channel, here the
> > >>> VHOST_USER_GET_STATUS reply.
> > >>>
> > >>> I don't see an easy way to fix it.
> > >>>
> > >>> One option would be to have the slave channel being handled by another
> > >>> thread, and protect master channel with a lock to enforce the
> > >>> synchronization. But this may induce other issues, so that's not a light
> > >>> change.
> > >>>
> > >>
> > >> This is going to be tough because the back-end might have set the
> > >> VHOST_USER_NEED_REPLY_MASK flag on the VHOST_USER_SLAVE_IOTLB_MSG
> > >> request and thus might be waiting for a response on the slave
> > >> channel. In order to emit such a response, the front-end must
> > >> send VHOST_USER_SLAVE_IOTLB_MSG updates on the master channel *first*
> > >> according to the protocol specification. This means that we really
> > >> cannot handle VHOST_USER_SLAVE_IOTLB_MSG requests while there's
> > >> an on-going transaction on the master channel.
> > 
> > Since the slave channel would be handled on another thread, it means the
> > on-going transaction on the master channel can continue. Once done, it
> > will release the mutex, and the thread handling the slave channel can
> > take it and send the IOTLB update on the master channel.
> > 
> > That would work with DPDK, which does not request reply-ack on IOTLB
> > misses.
> > 
> 
> I'm not sure I understand what would happen if DPDK requested a reply-ack
> in this scenario.
> 
> > For the DAX enablement case, my understanding is that the handling of
> > the slave requests by QEMU does not induce sending requests on the
> > master channel, so if I'm not mistaken, it should work too. The thread
> > handling the slave requests can send the reply on the slave channel,
> > while the thread handling the master channel is blocked waiting for the
> > GET_VRING_BASE reply. Is that correct?
> > 
> 
> Yes this is correct AFAICT. Dropping the nested loop (commit db8a3772e3
> reverts like a charm) and having the slave channel serviced by its own
> thread seems to be the way to go then.
> 

Commit a7f523c7d114d needs to be reverted as well. There are some conflicts
because of commit 025faa872bcf9 but they seem trivial to fix.

> > >>
> > >>> (Adding Greg and Stefan, who worked on the nested event loop series.)
> > >>>
> > >>> Simply reverting nested event loop support may not be an option, since
> > >>> it would break virtiofsd, which, if my understanding is correct, waits for
> > >>> some slave channel request to complete in order to complete a request
> > >>> made by QEMU on the master channel.
> > >>>
> > >>> Any thoughts?
> > >>>
> > >>
> > >> Well... the nested event loop was added as pre

Re: [PULL v4 76/83] vhost-user: Support vhost_dev_start

2023-01-17 Thread Greg Kurz
On Tue, 17 Jan 2023 16:07:00 +0100
Maxime Coquelin  wrote:

> 
> 
> On 1/17/23 13:36, Greg Kurz wrote:
> > On Tue, 17 Jan 2023 13:12:57 +0100
> > Greg Kurz  wrote:
> > 
> >> Hi Maxime,
> >>
> >> On Tue, 17 Jan 2023 10:49:37 +0100
> >> Maxime Coquelin  wrote:
> >>
> >>> Hi Yajun,
> >>>
> >>> On 1/16/23 08:14, Yajun Wu wrote:
> >>>> Not quite sure about the whole picture.
> >>>>
> >>>> Seems while qemu waiting response of vhost_user_get_status, dpdk send 
> >>>> out VHOST_USER_SLAVE_IOTLB_MSG and trigger qemu function 
> >>>> vhost_backend_update_device_iotlb.
> >>>> Qemu wait on reply of VHOST_USER_IOTLB_MSG but get VHOST_USER_GET_STATUS 
> >>>> reply.
> >>>
> >>> Thanks for the backtrace, that helps a lot.
> >>>
> >>> The issue happens because:
> >>>1. The nested event loop introduced in vhost_user_read() [0] enables
> >>> handling slave channel requests while waiting for a reply on the
> >>> master channel.
> >>>2. Handling of the VHOST_USER_SLAVE_IOTLB_MSG slave request ends up
> >>> sending a VHOST_USER_IOTLB_MSG on the master channel.
> >>>
> >>> So while waiting for VHOST_USER_IOTLB_MSG reply, it receives the reply
> >>> for the first request sent on the master channel, here the
> >>> VHOST_USER_GET_STATUS reply.
> >>>
> >>> I don't see an easy way to fix it.
> >>>
> >>> One option would be to have the slave channel being handled by another
> >>> thread, and protect master channel with a lock to enforce the
> >>> synchronization. But this may induce other issues, so that's not a light
> >>> change.
> >>>
> >>
> >> This is going to be tough because the back-end might have set the
> >> VHOST_USER_NEED_REPLY_MASK flag on the VHOST_USER_SLAVE_IOTLB_MSG
> >> request and thus might be waiting for a response on the slave
> >> channel. In order to emit such a response, the front-end must
> >> send VHOST_USER_SLAVE_IOTLB_MSG updates on the master channel *first*
> >> according to the protocol specification. This means that we really
> >> cannot handle VHOST_USER_SLAVE_IOTLB_MSG requests while there's
> >> an on-going transaction on the master channel.
> 
> Since the slave channel would be handled on another thread, it means the
> on-going transaction on the master channel can continue. Once done, it
> will release the mutex, and the thread handling the slave channel can
> take it and send the IOTLB update on the master channel.
> 
> That would work with DPDK, which does not request reply-ack on IOTLB
> misses.
> 

I'm not sure I understand what would happen if DPDK requested a reply-ack
in this scenario.

> For the DAX enablement case, my understanding is that the handling of
> the slave requests by QEMU does not induce sending requests on the
> master channel, so if I'm not mistaken, it should work too. The thread
> handling the slave requests can send the reply on the slave channel,
> while the thread handling the master channel is blocked waiting for the
> GET_VRING_BASE reply. Is that correct?
> 

Yes this is correct AFAICT. Dropping the nested loop (commit db8a3772e3
reverts like a charm) and having the slave channel serviced by its own
thread seems to be the way to go then.
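
To make the proposed ordering concrete, here is a minimal sketch of a master
channel protected by a mutex, with the slave channel serviced by its own
thread. It is an illustration only, not QEMU code: LockedChannel, record(),
slave_channel_thread() and run_scenario() are hypothetical names, and the
"channel" is just a log of which transaction completed in which order. The
sketch does not cover the reply-ack case raised above.

```c
#include <assert.h>
#include <pthread.h>

/* Request codes taken from the error message quoted in this thread
 * ("Expected 22 received 40"). */
enum { MSG_IOTLB = 22, MSG_GET_STATUS = 40 };

/* Hypothetical stand-in for the master channel: it only records the
 * order in which transactions complete. */
typedef struct {
    pthread_mutex_t lock;   /* serializes transactions on the channel */
    int order[4];
    int n;
} LockedChannel;

/* Must be called with ch->lock held. */
static void record(LockedChannel *ch, int code)
{
    assert(ch->n < 4);
    ch->order[ch->n++] = code;
}

/* Slave channel thread: an IOTLB miss needs the master channel, so it
 * has to wait until the on-going transaction releases the lock. */
static void *slave_channel_thread(void *opaque)
{
    LockedChannel *ch = opaque;

    pthread_mutex_lock(&ch->lock);
    record(ch, MSG_IOTLB);
    pthread_mutex_unlock(&ch->lock);
    return NULL;
}

/* Start a GET_STATUS transaction, let a slave request arrive while it
 * is in flight, then complete it. Returns 0 on success, -1 on error. */
static int run_scenario(LockedChannel *ch)
{
    pthread_t tid;

    ch->n = 0;
    pthread_mutex_init(&ch->lock, NULL);

    pthread_mutex_lock(&ch->lock);          /* transaction begins */
    if (pthread_create(&tid, NULL, slave_channel_thread, ch) != 0) {
        pthread_mutex_unlock(&ch->lock);
        pthread_mutex_destroy(&ch->lock);
        return -1;
    }
    record(ch, MSG_GET_STATUS);             /* reply read, transaction done */
    pthread_mutex_unlock(&ch->lock);

    pthread_join(tid, NULL);
    pthread_mutex_destroy(&ch->lock);
    return 0;
}
```

In this model the GET_STATUS transaction always completes before the IOTLB
update is sent, because the slave thread cannot take the lock earlier.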

> >>
> >>> (Adding Greg and Stefan, who worked on the nested event loop series.)
> >>>
> >>> Simply reverting nested event loop support may not be an option, since
> >>> it would break virtiofsd, which, if my understanding is correct, waits for
> >>> some slave channel request to complete in order to complete a request
> >>> made by QEMU on the master channel.
> >>>
> >>> Any thoughts?
> >>>
> >>
> >> Well... the nested event loop was added as preparatory work for "the
> >> upcoming enablement of DAX with virtio-fs". This requires changes on
> >> the QEMU side that haven't been merged yet. Technically, it seems that
> >> reverting the nested event loop won't break anything in upstream QEMU
> >> at this point (but this will bite again as soon as DAX enablement gets
> >> merged).
> >>
> > 
> > Cc'ing Dave to know about the DAX enablement status.
> > 
> >> AFAIK the event loop is only needed for the VHOST_USER_GET_VRING_BASE
> >> message. Another possibility might be to create the nested event loop
> >> 

Re: [PULL v4 76/83] vhost-user: Support vhost_dev_start

2023-01-17 Thread Greg Kurz
Hi Maxime,

On Tue, 17 Jan 2023 10:49:37 +0100
Maxime Coquelin  wrote:

> Hi Yajun,
> 
> On 1/16/23 08:14, Yajun Wu wrote:
> > Not quite sure about the whole picture.
> > 
> > Seems while qemu waiting response of vhost_user_get_status, dpdk send out 
> > VHOST_USER_SLAVE_IOTLB_MSG and trigger qemu function 
> > vhost_backend_update_device_iotlb.
> > Qemu wait on reply of VHOST_USER_IOTLB_MSG but get VHOST_USER_GET_STATUS 
> > reply.
> 
> Thanks for the backtrace, that helps a lot.
> 
> The issue happens because:
>   1. The nested event loop introduced in vhost_user_read() [0] enables
> handling slave channel requests while waiting for a reply on the
> master channel.
>   2. Handling of the VHOST_USER_SLAVE_IOTLB_MSG slave request ends up
> sending a VHOST_USER_IOTLB_MSG on the master channel.
> 
> So while waiting for VHOST_USER_IOTLB_MSG reply, it receives the reply
> for the first request sent on the master channel, here the
> VHOST_USER_GET_STATUS reply.
> 
> I don't see an easy way to fix it.
> 
> One option would be to have the slave channel being handled by another
> thread, and protect master channel with a lock to enforce the
> synchronization. But this may induce other issues, so that's not a light
> change.
> 

This is going to be tough because the back-end might have set the
VHOST_USER_NEED_REPLY_MASK flag on the VHOST_USER_SLAVE_IOTLB_MSG
request and thus might be waiting for a response on the slave
channel. In order to emit such a response, the front-end must
send VHOST_USER_SLAVE_IOTLB_MSG updates on the master channel *first*
according to the protocol specification. This means that we really
cannot handle VHOST_USER_SLAVE_IOTLB_MSG requests while there's
an on-going transaction on the master channel.

> (Adding Greg and Stefan, who worked on the nested event loop series.)
> 
> Simply reverting nested event loop support may not be an option, since
> it would break virtiofsd, which, if my understanding is correct, waits for
> some slave channel request to complete in order to complete a request
> made by QEMU on the master channel.
> 
> Any thoughts?
> 

Well... the nested event loop was added as preparatory work for "the
upcoming enablement of DAX with virtio-fs". This requires changes on
the QEMU side that haven't been merged yet. Technically, it seems that
reverting the nested event loop won't break anything in upstream QEMU
at this point (but this will bite again as soon as DAX enablement gets
merged).

AFAIK the event loop is only needed for the VHOST_USER_GET_VRING_BASE
message. Another possibility might be to create the nested event loop
in this case only : this would allow VHOST_USER_GET_STATUS to complete
before QEMU starts processing the VHOST_USER_SLAVE_IOTLB_MSG requests.
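
The ordering problem, and the effect of only nesting for
VHOST_USER_GET_VRING_BASE, can be illustrated with a toy model. This is a
sketch under strong assumptions, not QEMU code: the back-end is assumed to
answer every request on the master channel in order, all function and type
names are made up for the example, and the value 11 for GET_VRING_BASE is
taken from the vhost-user specification rather than from this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* 22 and 40 are the values from the error message quoted in this thread;
 * 11 is assumed from the vhost-user specification. */
enum {
    MSG_GET_VRING_BASE = 11,
    MSG_IOTLB = 22,
    MSG_GET_STATUS = 40,
};

#define QUEUE_LEN 8

/* Toy master channel: the back-end answers every request, in order. */
typedef struct {
    int replies[QUEUE_LEN];
    size_t head, tail;
} MasterChannel;

static void send_request(MasterChannel *ch, int code)
{
    assert(ch->tail < QUEUE_LEN);
    ch->replies[ch->tail++] = code;   /* reply queued in request order */
}

static int read_reply(MasterChannel *ch)
{
    assert(ch->head < ch->tail);
    return ch->replies[ch->head++];
}

/* IOTLB miss received on the slave channel: send the update on the
 * master channel and wait for its reply. 0 on success, -1 on mismatch. */
static int handle_iotlb_miss(MasterChannel *ch)
{
    send_request(ch, MSG_IOTLB);
    return read_reply(ch) == MSG_IOTLB ? 0 : -1;
}

/* Send a request while an IOTLB miss is pending on the slave channel. */
static int transact_with_pending_miss(MasterChannel *ch, int request,
                                      bool nest_only_for_vring_base)
{
    bool nested = !nest_only_for_vring_base ||
                  request == MSG_GET_VRING_BASE;
    int ret;

    send_request(ch, request);
    if (nested) {
        /* The miss is served before our own reply has been read. */
        ret = handle_iotlb_miss(ch);
        if (read_reply(ch) != request) {
            ret = -1;
        }
        return ret;
    }
    /* No nested loop: finish our transaction, then serve the miss. */
    if (read_reply(ch) != request) {
        return -1;
    }
    return handle_iotlb_miss(ch);
}
```

With nesting always enabled, the GET_STATUS case fails in this model because
the IOTLB handler consumes the GET_STATUS reply; with nesting restricted to
GET_VRING_BASE, the same GET_STATUS case completes.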

Cheers,

--
Greg

> Maxime
> 
> [0]: 
> https://patchwork.ozlabs.org/project/qemu-devel/patch/20210312092212.782255-6-gr...@kaod.org/
> 
> 
> > Break on first error message("Received unexpected msg type. Expected 22 
> > received 40")
> > 
> > #0  0x55b72ed4 in process_message_reply (dev=0x584dd600, 
> > msg=0x7fffa330) at ../hw/virtio/vhost-user.c:445
> > #1  0x55b77c26 in vhost_user_send_device_iotlb_msg 
> > (dev=0x584dd600, imsg=0x7fffa600) at ../hw/virtio/vhost-user.c:2341
> > #2  0x55b7179e in vhost_backend_update_device_iotlb 
> > (dev=0x584dd600, iova=10442706944, uaddr=140736119902208, len=4096, 
> > perm=IOMMU_RW) at ../hw/virtio/vhost-backend.c:361
> > #3  0x55b6e34c in vhost_device_iotlb_miss (dev=0x584dd600, 
> > iova=10442706944, write=1) at ../hw/virtio/vhost.c:1113
> > #4  0x55b718d9 in vhost_backend_handle_iotlb_msg 
> > (dev=0x584dd600, imsg=0x7fffa7b0) at 
> > ../hw/virtio/vhost-backend.c:393
> > #5  0x55b76144 in slave_read (ioc=0x57a38680, 
> > condition=G_IO_IN, opaque=0x584dd600) at ../hw/virtio/vhost-user.c:1726
> > #6  0x55c797a5 in qio_channel_fd_source_dispatch 
> > (source=0x56a06fb0, callback=0x55b75f86 <slave_read>, 
> > user_data=0x584dd600) at ../io/channel-watch.c:84
> > #7  0x7554895d in g_main_context_dispatch () at 
> > /lib64/libglib-2.0.so.0
> > #8  0x75548d18 in g_main_context_iterate.isra () at 
> > /lib64/libglib-2.0.so.0
> > #9  0x75549042 in g_main_loop_run () at /lib64/libglib-2.0.so.0
> > #10 0x55b72de7 in vhost_user_read (dev=0x584dd600, 
> > msg=0x7fffac50) at ../hw/virtio/vhost-user.c:413
> > #11 0x55b72e9b in process_message_reply (dev=0x584dd600, 
> > msg=0x7fffaf10) at ../hw/virtio/vhost-user.c:439
> > #12 0x55b77c26 in vhost_user_send_device_iotlb_msg 
> > (dev=0x584dd600, imsg=0x7fffb1e0) at ../hw/virtio/vhost-user.c:2341
> > #13 0x55b7179e in vhost_backend_update_device_iotlb 
> > (dev=0x584dd600, iova=10468392960, uaddr=140736145588224, len=4096, 
> > perm=IOMMU_RW) at ../hw/virtio/vhost-backend.c:361
> > #14 0x55b6e34c in vhost_device_iotlb_miss (dev=0x584dd600, 

Re: [PULL v4 76/83] vhost-user: Support vhost_dev_start

2023-01-17 Thread Greg Kurz
On Tue, 17 Jan 2023 13:12:57 +0100
Greg Kurz  wrote:

> Hi Maxime,
> 
> On Tue, 17 Jan 2023 10:49:37 +0100
> Maxime Coquelin  wrote:
> 
> > Hi Yajun,
> > 
> > On 1/16/23 08:14, Yajun Wu wrote:
> > > Not quite sure about the whole picture.
> > > 
> > > Seems while qemu waiting response of vhost_user_get_status, dpdk send out 
> > > VHOST_USER_SLAVE_IOTLB_MSG and trigger qemu function 
> > > vhost_backend_update_device_iotlb.
> > > Qemu wait on reply of VHOST_USER_IOTLB_MSG but get VHOST_USER_GET_STATUS 
> > > reply.
> > 
> > Thanks for the backtrace, that helps a lot.
> > 
> > The issue happens because:
> >   1. The nested event loop introduced in vhost_user_read() [0] enables
> > handling slave channel requests while waiting for a reply on the
> > master channel.
> >   2. Handling of the VHOST_USER_SLAVE_IOTLB_MSG slave request ends up
> > sending a VHOST_USER_IOTLB_MSG on the master channel.
> > 
> > So while waiting for VHOST_USER_IOTLB_MSG reply, it receives the reply
> > for the first request sent on the master channel, here the
> > VHOST_USER_GET_STATUS reply.
> > 
> > I don't see an easy way to fix it.
> > 
> > One option would be to have the slave channel being handled by another
> > thread, and protect master channel with a lock to enforce the
> > synchronization. But this may induce other issues, so that's not a light
> > change.
> > 
> 
> This is going to be tough because the back-end might have set the
> VHOST_USER_NEED_REPLY_MASK flag on the VHOST_USER_SLAVE_IOTLB_MSG
> request and thus might be waiting for a response on the slave
> channel. In order to emit such a response, the front-end must
> send VHOST_USER_SLAVE_IOTLB_MSG updates on the master channel *first*
> according to the protocol specification. This means that we really
> cannot handle VHOST_USER_SLAVE_IOTLB_MSG requests while there's
> an on-going transaction on the master channel.
> 
> > (Adding Greg and Stefan, who worked on the nested event loop series.)
> > 
> > Simply reverting nested event loop support may not be an option, since
> > it would break virtiofsd, which, if my understanding is correct, waits for
> > some slave channel request to complete in order to complete a request
> > made by QEMU on the master channel.
> > 
> > Any thoughts?
> > 
> 
> Well... the nested event loop was added as preparatory work for "the
> upcoming enablement of DAX with virtio-fs". This requires changes on
> the QEMU side that haven't been merged yet. Technically, it seems that
> reverting the nested event loop won't break anything in upstream QEMU
> at this point (but this will bite again as soon as DAX enablement gets
> merged).
> 

Cc'ing Dave to know about the DAX enablement status.

> AFAIK the event loop is only needed for the VHOST_USER_GET_VRING_BASE
> message. Another possibility might be to create the nested event loop
> in this case only : this would allow VHOST_USER_GET_STATUS to complete
> before QEMU starts processing the VHOST_USER_SLAVE_IOTLB_MSG requests.
> 
> Cheers,
> 
> --
> Greg
> 
> > Maxime
> > 
> > [0]: 
> > https://patchwork.ozlabs.org/project/qemu-devel/patch/20210312092212.782255-6-gr...@kaod.org/
> > 
> > 
> > > Break on first error message("Received unexpected msg type. Expected 22 
> > > received 40")
> > > 
> > > #0  0x55b72ed4 in process_message_reply (dev=0x584dd600, 
> > > msg=0x7fffa330) at ../hw/virtio/vhost-user.c:445
> > > #1  0x55b77c26 in vhost_user_send_device_iotlb_msg 
> > > (dev=0x584dd600, imsg=0x7fffa600) at 
> > > ../hw/virtio/vhost-user.c:2341
> > > #2  0x55b7179e in vhost_backend_update_device_iotlb 
> > > (dev=0x584dd600, iova=10442706944, uaddr=140736119902208, len=4096, 
> > > perm=IOMMU_RW) at ../hw/virtio/vhost-backend.c:361
> > > #3  0x55b6e34c in vhost_device_iotlb_miss (dev=0x584dd600, 
> > > iova=10442706944, write=1) at ../hw/virtio/vhost.c:1113
> > > #4  0x55b718d9 in vhost_backend_handle_iotlb_msg 
> > > (dev=0x584dd600, imsg=0x7fffa7b0) at 
> > > ../hw/virtio/vhost-backend.c:393
> > > #5  0x55b76144 in slave_read (ioc=0x57a38680, 
> > > condition=G_IO_IN, opaque=0x584dd600) at 
> > > ../hw/virtio/vhost-user.c:1726
> > > #6  0x55c797a5 in qio_channel_fd_source_dispatch 
> > > (source=0x56a06fb0, callback=0x55b75f86 <slave_read>, 
> > > user_data=0x584dd600) at ../io/channel-watch

Re: [RFC PATCH-for-8.0 07/10] hw/virtio: Directly access cached VirtIODevice::access_is_big_endian

2022-12-13 Thread Greg Kurz
On Tue, 13 Dec 2022 00:05:14 +0100
Philippe Mathieu-Daudé  wrote:

> Since the device endianness doesn't change at runtime,
> use the cached value instead of evaluating it on each call.
> 
> Signed-off-by: Philippe Mathieu-Daudé 
> ---
>  include/hw/virtio/virtio-access.h | 44 +++
>  1 file changed, 22 insertions(+), 22 deletions(-)
> 
> diff --git a/include/hw/virtio/virtio-access.h 
> b/include/hw/virtio/virtio-access.h
> index 07aae69042..985f39fe16 100644
> --- a/include/hw/virtio/virtio-access.h
> +++ b/include/hw/virtio/virtio-access.h
> @@ -43,7 +43,7 @@ static inline uint16_t virtio_lduw_phys(VirtIODevice *vdev, 
> hwaddr pa)
>  {
>  AddressSpace *dma_as = vdev->dma_as;
>  
> -if (virtio_access_is_big_endian(vdev)) {
> +if (vdev->access_is_big_endian) {

For x86, virtio_access_is_big_endian() expands to :

static inline bool virtio_access_is_big_endian(VirtIODevice *vdev)
{
return false;
}

When I added these memory accessors, there was a strong requirement from MST
that x86 shouldn't have to pay for an out-of-line check when it is known that
everything is always little endian. Not sure exactly what you're trying to
achieve with VirtIODevice::access_is_big_endian but this shouldn't mess with
this fast path.
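
To illustrate the concern, here is a minimal, self-contained sketch of the
difference between a check that is a compile-time constant and one that
reads a cached field. It is not the actual QEMU header: SketchDevice,
sketch_access_is_big_endian(), sketch_lduw() and the
SKETCH_TARGET_CAN_BE_BIG_ENDIAN switch are hypothetical names standing in
for the per-target configuration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical build-time switch: 0 models a target that is always
 * little endian (e.g. x86), 1 models a target that may be big endian. */
#ifndef SKETCH_TARGET_CAN_BE_BIG_ENDIAN
#define SKETCH_TARGET_CAN_BE_BIG_ENDIAN 0
#endif

typedef struct SketchDevice {
    bool access_is_big_endian;   /* cached value, as in the proposed patch */
} SketchDevice;

static inline bool sketch_access_is_big_endian(const SketchDevice *dev)
{
#if SKETCH_TARGET_CAN_BE_BIG_ENDIAN
    return dev->access_is_big_endian;   /* runtime load and test */
#else
    (void)dev;
    return false;   /* constant: the compiler can drop the BE branch */
#endif
}

/* Load a 16-bit value from a byte buffer with the device's endianness. */
static inline uint16_t sketch_lduw(const SketchDevice *dev, const uint8_t *p)
{
    if (sketch_access_is_big_endian(dev)) {
        return (uint16_t)((p[0] << 8) | p[1]);
    }
    return (uint16_t)(p[0] | (p[1] << 8));
}
```

With the switch left at 0, the accessor ignores the cached field entirely
and only the little-endian path remains, which is the property the fast
path relies on.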

>  return lduw_be_phys(dma_as, pa);
>  }
>  return lduw_le_phys(dma_as, pa);
> @@ -53,7 +53,7 @@ static inline uint32_t virtio_ldl_phys(VirtIODevice *vdev, 
> hwaddr pa)
>  {
>  AddressSpace *dma_as = vdev->dma_as;
>  
> -if (virtio_access_is_big_endian(vdev)) {
> +if (vdev->access_is_big_endian) {
>  return ldl_be_phys(dma_as, pa);
>  }
>  return ldl_le_phys(dma_as, pa);
> @@ -63,7 +63,7 @@ static inline uint64_t virtio_ldq_phys(VirtIODevice *vdev, 
> hwaddr pa)
>  {
>  AddressSpace *dma_as = vdev->dma_as;
>  
> -if (virtio_access_is_big_endian(vdev)) {
> +if (vdev->access_is_big_endian) {
>  return ldq_be_phys(dma_as, pa);
>  }
>  return ldq_le_phys(dma_as, pa);
> @@ -74,7 +74,7 @@ static inline void virtio_stw_phys(VirtIODevice *vdev, 
> hwaddr pa,
>  {
>  AddressSpace *dma_as = vdev->dma_as;
>  
> -if (virtio_access_is_big_endian(vdev)) {
> +if (vdev->access_is_big_endian) {
>  stw_be_phys(dma_as, pa, value);
>  } else {
>  stw_le_phys(dma_as, pa, value);
> @@ -86,7 +86,7 @@ static inline void virtio_stl_phys(VirtIODevice *vdev, 
> hwaddr pa,
>  {
>  AddressSpace *dma_as = vdev->dma_as;
>  
> -if (virtio_access_is_big_endian(vdev)) {
> +if (vdev->access_is_big_endian) {
>  stl_be_phys(dma_as, pa, value);
>  } else {
>  stl_le_phys(dma_as, pa, value);
> @@ -95,7 +95,7 @@ static inline void virtio_stl_phys(VirtIODevice *vdev, 
> hwaddr pa,
>  
>  static inline void virtio_stw_p(VirtIODevice *vdev, void *ptr, uint16_t v)
>  {
> -if (virtio_access_is_big_endian(vdev)) {
> +if (vdev->access_is_big_endian) {
>  stw_be_p(ptr, v);
>  } else {
>  stw_le_p(ptr, v);
> @@ -104,7 +104,7 @@ static inline void virtio_stw_p(VirtIODevice *vdev, void 
> *ptr, uint16_t v)
>  
>  static inline void virtio_stl_p(VirtIODevice *vdev, void *ptr, uint32_t v)
>  {
> -if (virtio_access_is_big_endian(vdev)) {
> +if (vdev->access_is_big_endian) {
>  stl_be_p(ptr, v);
>  } else {
>  stl_le_p(ptr, v);
> @@ -113,7 +113,7 @@ static inline void virtio_stl_p(VirtIODevice *vdev, void 
> *ptr, uint32_t v)
>  
>  static inline void virtio_stq_p(VirtIODevice *vdev, void *ptr, uint64_t v)
>  {
> -if (virtio_access_is_big_endian(vdev)) {
> +if (vdev->access_is_big_endian) {
>  stq_be_p(ptr, v);
>  } else {
>  stq_le_p(ptr, v);
> @@ -122,7 +122,7 @@ static inline void virtio_stq_p(VirtIODevice *vdev, void 
> *ptr, uint64_t v)
>  
>  static inline int virtio_lduw_p(VirtIODevice *vdev, const void *ptr)
>  {
> -if (virtio_access_is_big_endian(vdev)) {
> +if (vdev->access_is_big_endian) {
>  return lduw_be_p(ptr);
>  } else {
>  return lduw_le_p(ptr);
> @@ -131,7 +131,7 @@ static inline int virtio_lduw_p(VirtIODevice *vdev, const 
> void *ptr)
>  
>  static inline int virtio_ldl_p(VirtIODevice *vdev, const void *ptr)
>  {
> -if (virtio_access_is_big_endian(vdev)) {
> +if (vdev->access_is_big_endian) {
>  return ldl_be_p(ptr);
>  } else {
>  return ldl_le_p(ptr);
> @@ -140,7 +140,7 @@ static inline int virtio_ldl_p(VirtIODevice *vdev, const 
> void *ptr)
>  
>  static inline uint64_t virtio_ldq_p(VirtIODevice *vdev, const void *ptr)
>  {
> -if (virtio_access_is_big_endian(vdev)) {
> +if (vdev->access_is_big_endian) {
>  return ldq_be_p(ptr);
>  } else {
>  return ldq_le_p(ptr);
> @@ -150,9 +150,9 @@ static inline uint64_t virtio_ldq_p(VirtIODevice *vdev, 
> const void *ptr)
>  static inline uint16_t virtio_tswap16(VirtIODevice *vdev, uint16_t s)
>  {
>  #if HOST_BIG_ENDIAN
> -return 

Re: [PATCH v2 for-8.0 2/2] pci: drop redundant PCIDeviceClass::is_bridge field

2022-11-30 Thread Greg Kurz
On Tue, 29 Nov 2022 11:13:41 +0100
Igor Mammedov  wrote:

> and use cast to TYPE_PCI_BRIDGE instead.
> 
> Signed-off-by: Igor Mammedov 
> Reviewed-by: Philippe Mathieu-Daudé 
> ---
> v2:
>(Philippe Mathieu-Daudé )
>   - replace leftover IS_PCI_BRIDGE cast with is_bridge variable
> ---
>  include/hw/pci/pci.h   | 11 +--
>  include/hw/pci/pci_bridge.h|  1 +
>  hw/acpi/pcihp.c|  3 +--
>  hw/i386/acpi-build.c   |  5 ++---
>  hw/pci-bridge/cxl_downstream.c |  1 -
>  hw/pci-bridge/cxl_upstream.c   |  1 -
>  hw/pci-bridge/i82801b11.c  |  1 -
>  hw/pci-bridge/pci_bridge_dev.c |  1 -
>  hw/pci-bridge/pcie_pci_bridge.c|  1 -
>  hw/pci-bridge/pcie_root_port.c |  1 -
>  hw/pci-bridge/simba.c  |  1 -
>  hw/pci-bridge/xio3130_downstream.c |  1 -
>  hw/pci-bridge/xio3130_upstream.c   |  1 -
>  hw/pci-host/designware.c   |  1 -
>  hw/pci-host/xilinx-pcie.c  |  1 -
>  hw/pci/pci.c   | 20 +---
>  hw/ppc/spapr_pci.c     | 15 +--

For ppc changes :

Reviewed-by: Greg Kurz 

>  17 files changed, 19 insertions(+), 47 deletions(-)
> 
> diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
> index 6ccaaf5154..8b3a8571bf 100644
> --- a/include/hw/pci/pci.h
> +++ b/include/hw/pci/pci.h
> @@ -250,16 +250,7 @@ struct PCIDeviceClass {
>  uint16_t class_id;
>  uint16_t subsystem_vendor_id;   /* only for header type = 0 */
>  uint16_t subsystem_id;  /* only for header type = 0 */
> -
> -/*
> - * pci-to-pci bridge or normal device.
> - * This doesn't mean pci host switch.
> - * When card bus bridge is supported, this would be enhanced.
> - */
> -bool is_bridge;
> -
> -/* rom bar */
> -const char *romfile;
> +const char *romfile;/* rom bar */
>  };
>  
>  typedef void (*PCIINTxRoutingNotifier)(PCIDevice *dev);
> diff --git a/include/hw/pci/pci_bridge.h b/include/hw/pci/pci_bridge.h
> index ba4bafac7c..ca6caf487e 100644
> --- a/include/hw/pci/pci_bridge.h
> +++ b/include/hw/pci/pci_bridge.h
> @@ -53,6 +53,7 @@ struct PCIBridgeWindows {
>  
>  #define TYPE_PCI_BRIDGE "base-pci-bridge"
>  OBJECT_DECLARE_SIMPLE_TYPE(PCIBridge, PCI_BRIDGE)
> +#define IS_PCI_BRIDGE(dev) object_dynamic_cast(OBJECT(dev), TYPE_PCI_BRIDGE)
>  
>  struct PCIBridge {
>  /*< private >*/
> diff --git a/hw/acpi/pcihp.c b/hw/acpi/pcihp.c
> index 84d75e6b84..99a898d9ae 100644
> --- a/hw/acpi/pcihp.c
> +++ b/hw/acpi/pcihp.c
> @@ -186,7 +186,6 @@ static PCIBus *acpi_pcihp_find_hotplug_bus(AcpiPciHpState 
> *s, int bsel)
>  
>  static bool acpi_pcihp_pc_no_hotplug(AcpiPciHpState *s, PCIDevice *dev)
>  {
> -PCIDeviceClass *pc = PCI_DEVICE_GET_CLASS(dev);
>  DeviceClass *dc = DEVICE_GET_CLASS(dev);
>  /*
>   * ACPI doesn't allow hotplug of bridge devices.  Don't allow
> @@ -196,7 +195,7 @@ static bool acpi_pcihp_pc_no_hotplug(AcpiPciHpState *s, 
> PCIDevice *dev)
>   * Don't allow hot-unplug of SR-IOV Virtual Functions, as they
>   * will be removed implicitly, when Physical Function is unplugged.
>   */
> -return (pc->is_bridge && !dev->qdev.hotplugged) || !dc->hotpluggable ||
> +return (IS_PCI_BRIDGE(dev) && !dev->qdev.hotplugged) || 
> !dc->hotpluggable ||
> pci_is_vf(dev);
>  }
>  
> diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
> index d9eaa5fc4d..aa15b11cde 100644
> --- a/hw/i386/acpi-build.c
> +++ b/hw/i386/acpi-build.c
> @@ -403,7 +403,6 @@ static void build_append_pci_bus_devices(Aml 
> *parent_scope, PCIBus *bus,
>  
>  for (devfn = 0; devfn < ARRAY_SIZE(bus->devices); devfn++) {
>  DeviceClass *dc;
> -PCIDeviceClass *pc;
>  PCIDevice *pdev = bus->devices[devfn];
>  int slot = PCI_SLOT(devfn);
>  int func = PCI_FUNC(devfn);
> @@ -414,14 +413,14 @@ static void build_append_pci_bus_devices(Aml 
> *parent_scope, PCIBus *bus,
>  bool cold_plugged_bridge = false;
>  
>  if (pdev) {
> -pc = PCI_DEVICE_GET_CLASS(pdev);
>  dc = DEVICE_GET_CLASS(pdev);
>  
>  /*
>   * Cold plugged bridges aren't themselves hot-pluggable.
>   * Hotplugged bridges *are* hot-pluggable.
>   */
> -cold_plugged_bridge = pc->is_bridge && !DEVICE(pdev)->hotplugged;
> +cold_plugged_bridge = IS_PCI_BRIDGE(pdev) &&
> +  !DEVICE(pdev)->hotplugged;
>  b

Re: [PATCH] MAINTAINERS: Add 9p test client to section "virtio-9p"

2022-11-28 Thread Greg Kurz
On Mon, 28 Nov 2022 18:12:04 +0100
Christian Schoenebeck  wrote:

> The 9p test cases use a dedicated, lightweight 9p client implementation
> (using virtio transport) under tests/qtest/libqos/ to communicate with
> QEMU's 9p server.
> 
> It has been there for a long time. Let's officially assign it to the 9p
> maintainers.
> 
> Signed-off-by: Christian Schoenebeck 
> ---

Reviewed-by: Greg Kurz 

>  MAINTAINERS | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index cf24910249..4f156a99f1 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -2036,6 +2036,7 @@ X: hw/9pfs/xen-9p*
>  F: fsdev/
>  F: docs/tools/virtfs-proxy-helper.rst
>  F: tests/qtest/virtio-9p-test.c
> +F: tests/qtest/libqos/virtio-9p*
>  T: git https://gitlab.com/gkurz/qemu.git 9p-next
>  T: git https://github.com/cschoenebeck/qemu.git 9p.next
>  




Re: [PATCH] 9pfs: Fix some return statements in the synth backend

2022-11-28 Thread Greg Kurz
On Mon, 28 Nov 2022 08:35:22 +0100
Markus Armbruster  wrote:

> Greg Kurz  writes:
> 
> > The qemu_v9fs_synth_mkdir() and qemu_v9fs_synth_add_file() functions
> > currently return a positive errno value on failure. This causes
> > checkpatch.pl to spit several errors like the one below:
> >
> > ERROR: return of an errno should typically be -ve (return -EAGAIN)
> > #79: FILE: hw/9pfs/9p-synth.c:79:
> > +return EAGAIN;
> >
> > Simply change the sign. This has no consequence since callers
> > assert() the returned value to be equal to 0.
> 
> Out of curiosity: why is assert() appropriate?
> 

Most of this code comes from the original synth backend, which
was designed to expose QEMU internals to the guest using 9p. The
hope of the virtio-9p authors was that each QEMU subsystem would
create its own tree using these two functions (note that they
are declared extern). Of course this never happened and the synth
backend remained nearly dead code for years, until it finally got
re-used to implement the 9p qtests. In this context, failure to
create a synthetic directory or file means the related test has a
bug (e.g. messing with the paths used by some other test), hence
the assert(). This code likely needs improvements but we never got
to it.
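
To make the convention concrete, here is a minimal sketch of a
creation helper that returns 0 or a negative errno, together with a
test-setup style caller that asserts on the result. All names
(sketch_add_node(), sketch_setup(), SKETCH_NAME_MAX, ...) are made
up for the example; this is not the actual hw/9pfs/9p-synth.c code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical limits, chosen only for this sketch. */
#define SKETCH_NAME_MAX  255
#define SKETCH_MAX_NODES 8

/* Hypothetical stand-in for a synth node; only the name is kept. */
struct sketch_node {
    char name[SKETCH_NAME_MAX];
};

static struct sketch_node sketch_nodes[SKETCH_MAX_NODES];
static size_t sketch_nr_nodes;

/* Returns 0 on success or a negative errno value on failure. */
static int sketch_add_node(const char *name)
{
    size_t i;

    if (!name || strlen(name) >= SKETCH_NAME_MAX) {
        return -EINVAL;
    }
    for (i = 0; i < sketch_nr_nodes; i++) {
        if (!strcmp(sketch_nodes[i].name, name)) {
            return -EEXIST;
        }
    }
    if (sketch_nr_nodes == SKETCH_MAX_NODES) {
        return -ENOSPC;
    }
    /* Safe: the length was checked against SKETCH_NAME_MAX above. */
    strcpy(sketch_nodes[sketch_nr_nodes++].name, name);
    return 0;
}

/* Test-setup style caller: a failure here is a bug in the test itself. */
static void sketch_setup(void)
{
    int ret = sketch_add_node("dir-a");

    assert(!ret);
}
```

Since the caller only checks for non-zero, flipping the sign of the
error values does not change its behavior.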

> > While here also get rid of the unneeded ret variables as suggested
> > by return_directly.cocci.
> >
> > Reported-by: Markus Armbruster 
> > Signed-off-by: Greg Kurz 
> 
> Signed-off-by: Markus Armbruster 
> 



Re: [PATCH for-8.0 6/7] hw/intc/xics: Convert TYPE_ICS to 3-phase reset

2022-11-25 Thread Greg Kurz
On Fri, 25 Nov 2022 11:52:39 +
Peter Maydell  wrote:

> Convert the TYPE_ICS class to 3-phase reset; this will allow us
> to convert the TYPE_PHB3_MSI class which inherits from it.
> 
> Signed-off-by: Peter Maydell 
> ---

Reviewed-by: Greg Kurz 

>  hw/intc/xics.c | 9 +
>  1 file changed, 5 insertions(+), 4 deletions(-)
> 
> diff --git a/hw/intc/xics.c b/hw/intc/xics.c
> index dd130467ccc..c7f8abd71e4 100644
> --- a/hw/intc/xics.c
> +++ b/hw/intc/xics.c
> @@ -564,9 +564,9 @@ static void ics_reset_irq(ICSIRQState *irq)
>  irq->saved_priority = 0xff;
>  }
>  
> -static void ics_reset(DeviceState *dev)
> +static void ics_reset_hold(Object *obj)
>  {
> -ICSState *ics = ICS(dev);
> +ICSState *ics = ICS(obj);
>  g_autofree uint8_t *flags = g_malloc(ics->nr_irqs);
>  int i;
>  
> @@ -584,7 +584,7 @@ static void ics_reset(DeviceState *dev)
>  if (kvm_irqchip_in_kernel()) {
>  Error *local_err = NULL;
>  
> -ics_set_kvm_state(ICS(dev), &local_err);
> +ics_set_kvm_state(ics, &local_err);


>  if (local_err) {
>  error_report_err(local_err);
>  }
> @@ -688,16 +688,17 @@ static Property ics_properties[] = {
>  static void ics_class_init(ObjectClass *klass, void *data)
>  {
>  DeviceClass *dc = DEVICE_CLASS(klass);
> +ResettableClass *rc = RESETTABLE_CLASS(klass);
>  
>  dc->realize = ics_realize;
>  device_class_set_props(dc, ics_properties);
> -dc->reset = ics_reset;
>  dc->vmsd = &vmstate_ics;
>  /*
>   * Reason: part of XICS interrupt controller, needs to be wired up,
>   * e.g. by spapr_irq_init().
>   */
>  dc->user_creatable = false;
> +rc->phases.hold = ics_reset_hold;
>  }
>  
>  static const TypeInfo ics_info = {




Re: [PATCH for-8.0 5/7] hw/intc/xics: Reset TYPE_ICS objects with device_cold_reset()

2022-11-25 Thread Greg Kurz
On Fri, 25 Nov 2022 13:24:00 +0100
Cédric Le Goater  wrote:

> On 11/25/22 12:52, Peter Maydell wrote:
> > The realize method for the TYPE_ICS class uses qemu_register_reset()
> > to register a reset handler, as a workaround for the fact that
> > currently objects which directly inherit from TYPE_DEVICE don't get
> > automatically reset.  However, the reset function directly calls
> > ics_reset(), which is the function that implements the legacy reset
> > method.  This means that only the parent class's data gets reset, and
> > a subclass which also needs to handle reset, like TYPE_PHB3_MSI, has
> > to register its own reset function.
> > 
> > Make the TYPE_ICS reset function call device_cold_reset() instead:
> > this will handle reset for both the parent class and the subclass,
> > and will work whether the classes are using legacy reset or 3-phase
> > reset. This allows us to remove the reset function that the subclass
> > currently has to set up.
> 
> Nice !
> 

Seconded.

Reviewed-by: Greg Kurz 

> > 
> > Signed-off-by: Peter Maydell 
> 
> Reviewed-by: Cédric Le Goater 
> 
> Thanks,
> 
> C.
> 
> > ---
> >   hw/intc/xics.c | 2 +-
> >   hw/pci-host/pnv_phb3_msi.c | 7 ---
> >   2 files changed, 1 insertion(+), 8 deletions(-)
> > 
> > diff --git a/hw/intc/xics.c b/hw/intc/xics.c
> > index dcd021af668..dd130467ccc 100644
> > --- a/hw/intc/xics.c
> > +++ b/hw/intc/xics.c
> > @@ -593,7 +593,7 @@ static void ics_reset(DeviceState *dev)
> >   
> >   static void ics_reset_handler(void *dev)
> >   {
> > -ics_reset(dev);
> > +device_cold_reset(dev);
> >   }
> >   
> >   static void ics_realize(DeviceState *dev, Error **errp)
> > diff --git a/hw/pci-host/pnv_phb3_msi.c b/hw/pci-host/pnv_phb3_msi.c
> > index 2f4112907b8..ae908fd9e41 100644
> > --- a/hw/pci-host/pnv_phb3_msi.c
> > +++ b/hw/pci-host/pnv_phb3_msi.c
> > @@ -239,11 +239,6 @@ static void phb3_msi_reset(DeviceState *dev)
> >   msi->rba_sum = 0;
> >   }
> >   
> > -static void phb3_msi_reset_handler(void *dev)
> > -{
> > -phb3_msi_reset(dev);
> > -}
> > -
> >   void pnv_phb3_msi_update_config(Phb3MsiState *msi, uint32_t base,
> >   uint32_t count)
> >   {
> > @@ -272,8 +267,6 @@ static void phb3_msi_realize(DeviceState *dev, Error 
> > **errp)
> >   }
> >   
> >   msi->qirqs = qemu_allocate_irqs(phb3_msi_set_irq, msi, ics->nr_irqs);
> > -
> > -qemu_register_reset(phb3_msi_reset_handler, dev);
> >   }
> >   
> >   static void phb3_msi_instance_init(Object *obj)
> 




[PATCH] 9pfs: Fix some return statements in the synth backend

2022-11-24 Thread Greg Kurz
The qemu_v9fs_synth_mkdir() and qemu_v9fs_synth_add_file() functions
currently return a positive errno value on failure. This causes
checkpatch.pl to spit several errors like the one below:

ERROR: return of an errno should typically be -ve (return -EAGAIN)
#79: FILE: hw/9pfs/9p-synth.c:79:
+return EAGAIN;

Simply change the sign. This has no consequence since callers
assert() the returned value to be equal to 0.

While here also get rid of the unneeded ret variables as suggested
by return_directly.cocci.

Reported-by: Markus Armbruster 
Signed-off-by: Greg Kurz 
---
 hw/9pfs/9p-synth.c |   22 --
 1 file changed, 8 insertions(+), 14 deletions(-)

diff --git a/hw/9pfs/9p-synth.c b/hw/9pfs/9p-synth.c
index 1c5813e4ddc6..f62c40b639fc 100644
--- a/hw/9pfs/9p-synth.c
+++ b/hw/9pfs/9p-synth.c
@@ -72,14 +72,13 @@ static V9fsSynthNode *v9fs_add_dir_node(V9fsSynthNode 
*parent, int mode,
 int qemu_v9fs_synth_mkdir(V9fsSynthNode *parent, int mode,
   const char *name, V9fsSynthNode **result)
 {
-int ret;
 V9fsSynthNode *node, *tmp;
 
 if (!synth_fs) {
-return EAGAIN;
+return -EAGAIN;
 }
 if (!name || (strlen(name) >= NAME_MAX)) {
-return EINVAL;
+return -EINVAL;
 }
 if (!parent) {
 parent = &synth_root;
@@ -87,8 +86,7 @@ int qemu_v9fs_synth_mkdir(V9fsSynthNode *parent, int mode,
 QEMU_LOCK_GUARD(&synth_mutex);
 QLIST_FOREACH(tmp, &parent->child, sibling) {
 if (!strcmp(tmp->name, name)) {
-ret = EEXIST;
-return ret;
+return -EEXIST;
 }
 }
 /* Add the name */
@@ -98,22 +96,20 @@ int qemu_v9fs_synth_mkdir(V9fsSynthNode *parent, int mode,
 v9fs_add_dir_node(node, node->attr->mode, ".",
   node->attr, node->attr->inode);
 *result = node;
-ret = 0;
-return ret;
+return 0;
 }
 
 int qemu_v9fs_synth_add_file(V9fsSynthNode *parent, int mode,
  const char *name, v9fs_synth_read read,
  v9fs_synth_write write, void *arg)
 {
-int ret;
 V9fsSynthNode *node, *tmp;
 
 if (!synth_fs) {
-return EAGAIN;
+return -EAGAIN;
 }
 if (!name || (strlen(name) >= NAME_MAX)) {
-return EINVAL;
+return -EINVAL;
 }
 if (!parent) {
 parent = &synth_root;
@@ -122,8 +118,7 @@ int qemu_v9fs_synth_add_file(V9fsSynthNode *parent, int mode,
 QEMU_LOCK_GUARD(&synth_mutex);
 QLIST_FOREACH(tmp, &parent->child, sibling) {
 if (!strcmp(tmp->name, name)) {
-ret = EEXIST;
-return ret;
+return -EEXIST;
 }
 }
 /* Add file type and remove write bits */
@@ -138,8 +133,7 @@ int qemu_v9fs_synth_add_file(V9fsSynthNode *parent, int 
mode,
 node->private  = arg;
 pstrcpy(node->name, sizeof(node->name), name);
 QLIST_INSERT_HEAD_RCU(&parent->child, node, sibling);
-ret = 0;
-return ret;
+return 0;
 }
 
 static void synth_fill_statbuf(V9fsSynthNode *node, struct stat *stbuf)





Re: [PATCH v2 1/2] cleanup: Tweak and re-run return_directly.cocci

2022-11-24 Thread Greg Kurz
On Thu, 24 Nov 2022 16:15:11 +0100
Greg Kurz  wrote:

> On Tue, 22 Nov 2022 14:49:16 +0100
> Markus Armbruster  wrote:
> 
> > Tweak the semantic patch to drop redundant parenthesis around the
> > return expression.
> > 
> > Coccinelle drops a comment in hw/rdma/vmw/pvrdma_cmd.c; restored
> > manually.
> > 
> > Coccinelle messes up vmdk_co_create(), not sure why.  Change dropped,
> > will be done manually in the next commit.
> > 
> > Line breaks in target/avr/cpu.h and hw/rdma/vmw/pvrdma_cmd.c tidied up
> > manually.
> > 
> > Whitespace in tools/virtiofsd/fuse_lowlevel.c tidied up manually.
> > 
> > checkpatch.pl complains "return of an errno should typically be -ve"
> > two times for hw/9pfs/9p-synth.c.  Preexisting, the patch merely makes
> > it visible to checkpatch.pl.
> > 
> 
> Hi Markus,
> 
> Yeah these positive errno values have been sitting there since the
> beginning. It was dead code until I hijacked the synth backend to
> implement qtest for 9p. I didn't care much about the return value
> of the two culprits at the time since both are passed to assert(!ret)
> right away. For this reason, changing the sign should be easy :-)
> 
> I see that checkpatch.pl considers this as an error. I'll post
> a fix. I guess you'll need to rebase on this fix for your patches
> to pass CI.
> 

Or maybe I can fix the issues detected by coccinelle as well and
you can just drop the 9p bits from this patch ?

> Anyway, for 9p:
> 
> Reviewed-by: Greg Kurz 
> 
> 
> > Signed-off-by: Markus Armbruster 
> > ---
> >  scripts/coccinelle/return_directly.cocci |  5 +--
> >  include/hw/pci/pci.h |  7 +--
> >  target/avr/cpu.h |  4 +-
> >  hw/9pfs/9p-synth.c   | 14 ++
> >  hw/char/sifive_uart.c|  4 +-
> >  hw/ppc/ppc4xx_sdram.c|  5 +--
> >  hw/rdma/vmw/pvrdma_cmd.c | 57 +---
> >  hw/virtio/vhost-user.c   |  6 +--
> >  migration/dirtyrate.c| 10 +
> >  migration/tls.c  |  6 +--
> >  replay/replay-time.c |  5 +--
> >  semihosting/console.c|  4 +-
> >  softmmu/memory.c | 11 +
> >  softmmu/physmem.c|  9 +---
> >  target/loongarch/cpu.c   |  4 +-
> >  target/mips/tcg/dsp_helper.c | 15 ++-
> >  target/riscv/debug.c |  6 +--
> >  target/riscv/vector_helper.c | 28 +++-
> >  tests/bench/benchmark-crypto-akcipher.c  |  6 +--
> >  tests/qtest/erst-test.c  |  5 +--
> >  tests/qtest/hexloader-test.c |  6 +--
> >  tests/qtest/pvpanic-pci-test.c   |  6 +--
> >  tests/qtest/pvpanic-test.c   |  6 +--
> >  tests/qtest/test-filter-mirror.c |  6 +--
> >  tests/qtest/virtio-ccw-test.c|  6 +--
> >  tests/tcg/multiarch/sha512.c |  9 +---
> >  tools/virtiofsd/fuse_lowlevel.c  | 24 +++---
> >  27 files changed, 70 insertions(+), 204 deletions(-)
> > 
> > diff --git a/scripts/coccinelle/return_directly.cocci 
> > b/scripts/coccinelle/return_directly.cocci
> > index 4cf50e75ea..6cb1b3c99a 100644
> > --- a/scripts/coccinelle/return_directly.cocci
> > +++ b/scripts/coccinelle/return_directly.cocci
> > @@ -11,9 +11,8 @@ identifier F;
> >  -T VAR;
> >   ... when != VAR
> >  
> > --VAR =
> > -+return
> > - E;
> > +-VAR = (E);
> >  -return VAR;
> > ++return E;
> >   ... when != VAR
> >   }
> > diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
> > index 6ccaaf5154..06e2d5f889 100644
> > --- a/include/hw/pci/pci.h
> > +++ b/include/hw/pci/pci.h
> > @@ -921,11 +921,8 @@ PCI_DMA_DEFINE_LDST(q_be, q_be, 64);
> >  static inline void *pci_dma_map(PCIDevice *dev, dma_addr_t addr,
> >  dma_addr_t *plen, DMADirection dir)
> >  {
> > -void *buf;
> > -
> > -buf = dma_memory_map(pci_get_address_space(dev), addr, plen, dir,
> > - MEMTXATTRS_UNSPECIFIED);
> > -return buf;
> > +return dma_memory_map(pci_get_address_space(dev), addr, plen, dir,
> > +  MEMTXATTRS_UNSPECIFIED);
> >  }
> >  
> >  static inline void pci_dma_unmap(PCIDevice *dev, void *buffer, dma_addr_t 
> > len,
> > diff --git a/tar

Re: [PATCH for-8.0 13/19] target/ppc: Convert to 3-phase reset

2022-11-24 Thread Greg Kurz
On Thu, 24 Nov 2022 11:50:16 +
Peter Maydell  wrote:

> Convert the ppc CPU class to use 3-phase reset, so it doesn't
> need to use device_class_set_parent_reset() any more.
> 
> Signed-off-by: Peter Maydell 
> ---

Reviewed-by: Greg Kurz 

>  target/ppc/cpu-qom.h  |  4 ++--
>  target/ppc/cpu_init.c | 12 
>  2 files changed, 10 insertions(+), 6 deletions(-)
> 
> diff --git a/target/ppc/cpu-qom.h b/target/ppc/cpu-qom.h
> index 89ff88f28c9..0fbd8b72468 100644
> --- a/target/ppc/cpu-qom.h
> +++ b/target/ppc/cpu-qom.h
> @@ -143,7 +143,7 @@ typedef struct PPCHash64Options PPCHash64Options;
>  /**
>   * PowerPCCPUClass:
>   * @parent_realize: The parent class' realize handler.
> - * @parent_reset: The parent class' reset handler.
> + * @parent_phases: The parent class' reset phase handlers.
>   *
>   * A PowerPC CPU model.
>   */
> @@ -154,7 +154,7 @@ struct PowerPCCPUClass {
>  
>  DeviceRealize parent_realize;
>  DeviceUnrealize parent_unrealize;
> -DeviceReset parent_reset;
> +ResettablePhases parent_phases;
>  void (*parent_parse_features)(const char *type, char *str, Error **errp);
>  
>  uint32_t pvr;
> diff --git a/target/ppc/cpu_init.c b/target/ppc/cpu_init.c
> index cbf00813743..95d25856a0e 100644
> --- a/target/ppc/cpu_init.c
> +++ b/target/ppc/cpu_init.c
> @@ -7031,16 +7031,18 @@ static bool ppc_cpu_has_work(CPUState *cs)
>  return cs->interrupt_request & CPU_INTERRUPT_HARD;
>  }
>  
> -static void ppc_cpu_reset(DeviceState *dev)
> +static void ppc_cpu_reset_hold(Object *obj)
>  {
> -CPUState *s = CPU(dev);
> +CPUState *s = CPU(obj);
>  PowerPCCPU *cpu = POWERPC_CPU(s);
>  PowerPCCPUClass *pcc = POWERPC_CPU_GET_CLASS(cpu);
>  CPUPPCState *env = &cpu->env;
>  target_ulong msr;
>  int i;
>  
> -pcc->parent_reset(dev);
> +if (pcc->parent_phases.hold) {
> +pcc->parent_phases.hold(obj);
> +}
>  
>  msr = (target_ulong)0;
>  msr |= (target_ulong)MSR_HVB;
> @@ -7267,6 +7269,7 @@ static void ppc_cpu_class_init(ObjectClass *oc, void 
> *data)
>  PowerPCCPUClass *pcc = POWERPC_CPU_CLASS(oc);
>  CPUClass *cc = CPU_CLASS(oc);
>  DeviceClass *dc = DEVICE_CLASS(oc);
> +ResettableClass *rc = RESETTABLE_CLASS(oc);
>  
>  device_class_set_parent_realize(dc, ppc_cpu_realize,
>  &pcc->parent_realize);
> @@ -7275,7 +7278,8 @@ static void ppc_cpu_class_init(ObjectClass *oc, void 
> *data)
>  pcc->pvr_match = ppc_pvr_match_default;
>  device_class_set_props(dc, ppc_cpu_properties);
>  
> -device_class_set_parent_reset(dc, ppc_cpu_reset, &pcc->parent_reset);
> +resettable_class_set_parent_phases(rc, NULL, ppc_cpu_reset_hold, NULL,
> +   &pcc->parent_phases);
>  
>  cc->class_by_name = ppc_cpu_class_by_name;
>  cc->has_work = ppc_cpu_has_work;




Re: [PATCH v2 1/2] cleanup: Tweak and re-run return_directly.cocci

2022-11-24 Thread Greg Kurz
On Tue, 22 Nov 2022 14:49:16 +0100
Markus Armbruster  wrote:

> Tweak the semantic patch to drop redundant parenthesis around the
> return expression.
> 
> Coccinelle drops a comment in hw/rdma/vmw/pvrdma_cmd.c; restored
> manually.
> 
> Coccinelle messes up vmdk_co_create(), not sure why.  Change dropped,
> will be done manually in the next commit.
> 
> Line breaks in target/avr/cpu.h and hw/rdma/vmw/pvrdma_cmd.c tidied up
> manually.
> 
> Whitespace in tools/virtiofsd/fuse_lowlevel.c tidied up manually.
> 
> checkpatch.pl complains "return of an errno should typically be -ve"
> two times for hw/9pfs/9p-synth.c.  Preexisting, the patch merely makes
> it visible to checkpatch.pl.
> 

Hi Markus,

Yeah these positive errno values have been sitting there since the
beginning. It was dead code until I hijacked the synth backend to
implement qtest for 9p. I didn't care much about the return value
of the two culprits at the time since both are passed to assert(!ret)
right away. For this reason, changing the sign should be easy :-)

I see that checkpatch.pl considers this as an error. I'll post
a fix. I guess you'll need to rebase on this fix for your patches
to pass CI.

Anyway, for 9p:

Reviewed-by: Greg Kurz 


> Signed-off-by: Markus Armbruster 
> ---
>  scripts/coccinelle/return_directly.cocci |  5 +--
>  include/hw/pci/pci.h |  7 +--
>  target/avr/cpu.h |  4 +-
>  hw/9pfs/9p-synth.c   | 14 ++
>  hw/char/sifive_uart.c|  4 +-
>  hw/ppc/ppc4xx_sdram.c|  5 +--
>  hw/rdma/vmw/pvrdma_cmd.c | 57 +---
>  hw/virtio/vhost-user.c   |  6 +--
>  migration/dirtyrate.c| 10 +
>  migration/tls.c  |  6 +--
>  replay/replay-time.c |  5 +--
>  semihosting/console.c|  4 +-
>  softmmu/memory.c | 11 +
>  softmmu/physmem.c|  9 +---
>  target/loongarch/cpu.c   |  4 +-
>  target/mips/tcg/dsp_helper.c | 15 ++-
>  target/riscv/debug.c |  6 +--
>  target/riscv/vector_helper.c | 28 +++-
>  tests/bench/benchmark-crypto-akcipher.c  |  6 +--
>  tests/qtest/erst-test.c  |  5 +--
>  tests/qtest/hexloader-test.c |  6 +--
>  tests/qtest/pvpanic-pci-test.c   |  6 +--
>  tests/qtest/pvpanic-test.c   |  6 +--
>  tests/qtest/test-filter-mirror.c |  6 +--
>  tests/qtest/virtio-ccw-test.c|  6 +--
>  tests/tcg/multiarch/sha512.c |  9 +---
>  tools/virtiofsd/fuse_lowlevel.c  | 24 +++---
>  27 files changed, 70 insertions(+), 204 deletions(-)
> 
> diff --git a/scripts/coccinelle/return_directly.cocci 
> b/scripts/coccinelle/return_directly.cocci
> index 4cf50e75ea..6cb1b3c99a 100644
> --- a/scripts/coccinelle/return_directly.cocci
> +++ b/scripts/coccinelle/return_directly.cocci
> @@ -11,9 +11,8 @@ identifier F;
>  -T VAR;
>   ... when != VAR
>  
> --VAR =
> -+return
> - E;
> +-VAR = (E);
>  -return VAR;
> ++return E;
>   ... when != VAR
>   }
> diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
> index 6ccaaf5154..06e2d5f889 100644
> --- a/include/hw/pci/pci.h
> +++ b/include/hw/pci/pci.h
> @@ -921,11 +921,8 @@ PCI_DMA_DEFINE_LDST(q_be, q_be, 64);
>  static inline void *pci_dma_map(PCIDevice *dev, dma_addr_t addr,
>  dma_addr_t *plen, DMADirection dir)
>  {
> -void *buf;
> -
> -buf = dma_memory_map(pci_get_address_space(dev), addr, plen, dir,
> - MEMTXATTRS_UNSPECIFIED);
> -return buf;
> +return dma_memory_map(pci_get_address_space(dev), addr, plen, dir,
> +  MEMTXATTRS_UNSPECIFIED);
>  }
>  
>  static inline void pci_dma_unmap(PCIDevice *dev, void *buffer, dma_addr_t 
> len,
> diff --git a/target/avr/cpu.h b/target/avr/cpu.h
> index 96419c0c2b..f19dd72926 100644
> --- a/target/avr/cpu.h
> +++ b/target/avr/cpu.h
> @@ -215,8 +215,7 @@ static inline int cpu_interrupts_enabled(CPUAVRState *env)
>  
>  static inline uint8_t cpu_get_sreg(CPUAVRState *env)
>  {
> -uint8_t sreg;
> -sreg = (env->sregC) << 0
> +return (env->sregC) << 0
>   | (env->sregZ) << 1
>   | (env->sregN) << 2
>   | (env->sregV) << 3
> @@ -224,7 +223,6 @@ static inline uint8_t cpu_get_sreg(CPUAVRState *env)
>   | (env->sregH) << 5
>   | (env->sregT) << 6

Re: [PATCH 2/3] i386: kvm: disable KVM_CAP_PMU_CAPABILITY if "pmu" is disabled

2022-11-21 Thread Greg Kurz
On Sat, 19 Nov 2022 04:29:00 -0800
Dongli Zhang  wrote:

> The "perf stat" at the VM side still works even if we set "-cpu host,-pmu" in
> the QEMU command line. That is, neither "-cpu host,-pmu" nor "-cpu EPYC"
> could disable the pmu virtualization in an AMD environment.
> 
> We still see below at VM kernel side ...
> 
> [0.510611] Performance Events: Fam17h+ core perfctr, AMD PMU driver.
> 
> ... although we expect something like below.
> 
> [0.596381] Performance Events: PMU not available due to virtualization, 
> using software events only.
> [0.600972] NMI watchdog: Perf NMI watchdog permanently disabled
> 
> This is because the AMD pmu (v1) does not rely on cpuid to decide if the
> pmu virtualization is supported.
> 
> We disable KVM_CAP_PMU_CAPABILITY if the 'pmu' is disabled in the vcpu
> properties.
> 
> Cc: Joe Jin 
> Signed-off-by: Dongli Zhang 
> ---
>  target/i386/kvm/kvm.c | 17 +
>  1 file changed, 17 insertions(+)
> 
> diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
> index 8fec0bc5b5..0b1226ff7f 100644
> --- a/target/i386/kvm/kvm.c
> +++ b/target/i386/kvm/kvm.c
> @@ -137,6 +137,8 @@ static int has_triple_fault_event;
>  
>  static bool has_msr_mcg_ext_ctl;
>  
> +static int has_pmu_cap;
> +
>  static struct kvm_cpuid2 *cpuid_cache;
>  static struct kvm_cpuid2 *hv_cpuid_cache;
>  static struct kvm_msr_list *kvm_feature_msrs;
> @@ -1725,6 +1727,19 @@ static void kvm_init_nested_state(CPUX86State *env)
>  
>  void kvm_arch_pre_create_vcpu(CPUState *cs)
>  {
> +X86CPU *cpu = X86_CPU(cs);
> +int ret;
> +
> +if (has_pmu_cap && !cpu->enable_pmu) {
> +ret = kvm_vm_enable_cap(kvm_state, KVM_CAP_PMU_CAPABILITY, 0,
> +KVM_PMU_CAP_DISABLE);

It doesn't seem conceptually correct to configure a VM-level capability
from a vCPU property, which could theoretically differ between vCPUs,
even if this isn't the case with the current code base.

Maybe consider controlling the PMU with a machine property instead ? This
could then be done in kvm_arch_init() like other VM-level setup.
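
Roughly what I have in mind, as a self-contained sketch. Every name
below (struct sketch_machine, sketch_arch_init(), ...) is made up for
illustration and the capability call is a stub; none of this is
existing QEMU or KVM API.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical machine-wide configuration; not an existing QEMU struct. */
struct sketch_machine {
    bool pmu_enabled;
};

/* Counts how often the VM-wide capability was toggled. */
static int sketch_vm_cap_calls;

/* Stub standing in for the real capability call; always succeeds here. */
static int sketch_vm_disable_pmu(void)
{
    sketch_vm_cap_calls++;
    return 0;
}

/* VM-level init: runs once, before any vCPU exists. */
static int sketch_arch_init(const struct sketch_machine *ms, bool has_pmu_cap)
{
    if (has_pmu_cap && !ms->pmu_enabled) {
        return sketch_vm_disable_pmu();
    }
    return 0;
}

/* Per-vCPU init: no VM-wide state is touched from here. */
static void sketch_init_vcpu(int index)
{
    (void)index;
}
```

With the decision taken from a single machine-wide setting, there is
no need for a "do it only for the first vCPU" flag in the vCPU path.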

> +if (ret < 0) {
> +error_report("kvm: Failed to disable pmu cap: %s",
> + strerror(-ret));
> +}
> +
> +has_pmu_cap = 0;
> +}
>  }
>  
>  int kvm_arch_init_vcpu(CPUState *cs)
> @@ -2517,6 +2532,8 @@ int kvm_arch_init(MachineState *ms, KVMState *s)
>  }
>  }
>  
> +has_pmu_cap = kvm_check_extension(s, KVM_CAP_PMU_CAPABILITY);
> +
>  ret = kvm_get_supported_msrs(s);
>  if (ret < 0) {
>  return ret;




Re: [PATCH v2 05/19] hw/9pfs: Update 9pfs to use the new QemuFd_t type

2022-11-19 Thread Greg Kurz
On Fri, 18 Nov 2022 14:38:00 +0100
Christian Schoenebeck  wrote:

> On Friday, November 18, 2022 10:29:51 AM CET Greg Kurz wrote:
> > On Fri, 11 Nov 2022 12:22:11 +0800
> > Bin Meng  wrote:
> > 
> > > With this new QemuFd_t type, it significantly reduces the number of
> > 
> > I cannot find the definition of this type, nor the definition of
> > qemu_fd_invalid(). Missing patch ?
> 
> It's in patch 4. Looks like we were not CCed on that patch. :(
> 

Oh I didn't check the numbering. I guess we were not CCed automatically...

> https://lore.kernel.org/qemu-devel/2022042225.1115931-5-bin.m...@windriver.com/
> 

... because this only touches include/qemu/osdep.h .

Bin,

Please ensure that the maintainers are in the Cc list for all
patches in such a series, e.g. with explicit --cc arguments to
git-send-email.

> > Anyway, IIUC this type is an int for linux and a HANDLE for windows,
> > right ?
> > 
> > According to win32 documentation at [1] :
> > 
> > HANDLE  
> > A handle to an object.
> > 
> > This type is declared in WinNT.h as follows:
> > 
> > typedef PVOID HANDLE;
> > 
> > and
> > 
> > PVOID   
> > A pointer to any type.
> > 
> > This type is declared in WinNT.h as follows:
> > 
> > typedef void *PVOID;
> > 
> > HANDLE is void *.
> > 
> > From docs/devel/style.rst:
> > 
> > Naming
> > ==
> > 
> > Variables are lower_case_with_underscores; easy to type and read.  
> > Structured
> > type names are in CamelCase; harder to type but standing out.  Enum type
> > names and function type names should also be in CamelCase.  Scalar type
> > names are lower_case_with_underscores_ending_with_a_t, like the POSIX
> > uint64_t and family.  Note that this last convention contradicts POSIX
> > and is therefore likely to be changed.
> > 
> > Both int and void * are scalar types, so I'd rather name it qemu_fd_t,
> > not using CamelCase at all so that it cannot be confused with a struct.
> > 
> > [1] 
> > https://learn.microsoft.com/en-us/windows/win32/winprog/windows-data-types
> 
> Not that I had a strong opinion about this issue (as in general with coding
> style topics). It was one of my suggested type names. To make this type long-
> term proof I suggested to handle it as if it was a truly opaque type in QEMU:
> 

A truly opaque type in C is implemented with an incomplete structured
type that users only manipulate through pointers.
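
For the record, a minimal sketch of that pattern. SketchFd and the
sketch_fd_*() helpers are hypothetical names used only here; in real
code the struct definition would live in a .c file and only the
typedef and prototypes would be in the header.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* What a public header would expose: an incomplete type and accessors. */
typedef struct SketchFd SketchFd;

SketchFd *sketch_fd_new(int raw);
bool sketch_fd_is_valid(const SketchFd *fd);
void sketch_fd_free(SketchFd *fd);

/* What only the implementation file would see. */
struct SketchFd {
    int raw;
};

SketchFd *sketch_fd_new(int raw)
{
    SketchFd *fd = malloc(sizeof(*fd));

    if (fd) {
        fd->raw = raw;
    }
    return fd;
}

bool sketch_fd_is_valid(const SketchFd *fd)
{
    /* A NULL wrapper or a negative raw descriptor is treated as invalid. */
    return fd && fd->raw >= 0;
}

void sketch_fd_free(SketchFd *fd)
{
    free(fd);
}
```

User code cannot compare such an object against -1 or dereference it,
which is what makes it opaque, at the cost of an allocation per
descriptor.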

> https://lore.kernel.org/qemu-devel/4620086.XpUeK0iDWE@silver/
> 
> That is to explicitly not try to do things like:
> 
> if (fd == -1)
> 
> at least not hard wired in user code. According to QEMU code style you should
> probably then drop the trailing "_t" though.
> 

Yes, either one is fine I guess. Most important part is to provide
a documented API to manipulate that type since, no matter the name,
it is still a scalar type that can be manipulated as such.

> > > deviated code paths when adding Windows support.
> > > 
> > > Signed-off-by: Bin Meng 
> > > 
> > > ---
> > > 
> > > Changes in v2:
> > > - Use the new QemuFd_t type
> > > 
> > >  hw/9pfs/9p-local.h   |   6 +-
> > >  hw/9pfs/9p-util.h|  26 +++---
> > >  hw/9pfs/9p-local.c   | 174 ---
> > >  hw/9pfs/9p-util-darwin.c |  14 ++--
> > >  hw/9pfs/9p-util-linux.c  |  14 ++--
> > >  hw/9pfs/9p-xattr.c   |  16 ++--
> > >  6 files changed, 129 insertions(+), 121 deletions(-)
> > > 
> > > diff --git a/hw/9pfs/9p-local.h b/hw/9pfs/9p-local.h
> > > index 32c72749d9..66a21316a0 100644
> > > --- a/hw/9pfs/9p-local.h
> > > +++ b/hw/9pfs/9p-local.h
> > > @@ -13,8 +13,8 @@
> > >  #ifndef QEMU_9P_LOCAL_H
> > >  #define QEMU_9P_LOCAL_H
> > >  
> > > -int local_open_nofollow(FsContext *fs_ctx, const char *path, int flags,
> > > -mode_t mode);
> > > -int local_opendir_nofollow(FsContext *fs_ctx, const char *path);
> > > +QemuFd_t local_open_nofollow(FsContext *fs_ctx, const char *path, int 
> > > flags,
> > > + mode_t mode);
> > > +QemuFd_t local_opendir_nofollow(FsContext *fs_ctx, const char *path);
> > >  
> > >  #endif
> > > diff --git a/hw/9pfs/9p-util.h b/hw/9pfs/9p-util.h
> > > index c314cf381d..3d6bd1a51e 100644
> > > --- a/hw/9pfs/9p-util.h
> > > +++ b/hw/9pfs/9p-util.h
> > > @@ -101,30 +101,31 @@ static inline int

Re: [PATCH v2 05/19] hw/9pfs: Update 9pfs to use the new QemuFd_t type

2022-11-18 Thread Greg Kurz
On Fri, 11 Nov 2022 12:22:11 +0800
Bin Meng  wrote:

> With this new QemuFd_t type, it significantly reduces the number of

I cannot find the definition of this type, nor the definition of
qemu_fd_invalid(). Missing patch ?

Anyway, IIUC this type is an int for linux and a HANDLE for windows,
right ?

According to win32 documentation at [1] :

HANDLE  
A handle to an object.

This type is declared in WinNT.h as follows:

typedef PVOID HANDLE;

and

PVOID   
A pointer to any type.

This type is declared in WinNT.h as follows:

typedef void *PVOID;

HANDLE is void *.

From docs/devel/style.rst:

Naming
==

Variables are lower_case_with_underscores; easy to type and read.  Structured
type names are in CamelCase; harder to type but standing out.  Enum type
names and function type names should also be in CamelCase.  Scalar type
names are lower_case_with_underscores_ending_with_a_t, like the POSIX
uint64_t and family.  Note that this last convention contradicts POSIX
and is therefore likely to be changed.

Both int and void * are scalar types, so I'd rather name it qemu_fd_t,
not using CamelCase at all so that it cannot be confused with a struct.

[1] https://learn.microsoft.com/en-us/windows/win32/winprog/windows-data-types
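
Something along these lines is what I would expect, as a sketch only.
The names sketch_fd_t, SKETCH_FD_INVALID and sketch_fd_invalid() are
placeholders since I haven't seen the actual definitions, and the
Windows branch mirrors the shape of HANDLE without including any
Windows header.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#ifdef _WIN32
/* Same shape as HANDLE, i.e. void *; all-ones marks an invalid value. */
typedef void *sketch_fd_t;
#define SKETCH_FD_INVALID ((sketch_fd_t)(intptr_t)-1)
#else
/* Plain POSIX file descriptor; -1 marks an invalid value. */
typedef int sketch_fd_t;
#define SKETCH_FD_INVALID (-1)
#endif

/* The only place that knows how an invalid descriptor is encoded. */
static inline bool sketch_fd_invalid(sketch_fd_t fd)
{
    return fd == SKETCH_FD_INVALID;
}
```

The lower-case _t name makes it obvious that this is a scalar, and
callers go through the helper instead of hard-coding a comparison
with -1.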

> deviated code paths when adding Windows support.
> 
> Signed-off-by: Bin Meng 
> 
> ---
> 
> Changes in v2:
> - Use the new QemuFd_t type
> 
>  hw/9pfs/9p-local.h   |   6 +-
>  hw/9pfs/9p-util.h|  26 +++---
>  hw/9pfs/9p-local.c   | 174 ---
>  hw/9pfs/9p-util-darwin.c |  14 ++--
>  hw/9pfs/9p-util-linux.c  |  14 ++--
>  hw/9pfs/9p-xattr.c   |  16 ++--
>  6 files changed, 129 insertions(+), 121 deletions(-)
> 
> diff --git a/hw/9pfs/9p-local.h b/hw/9pfs/9p-local.h
> index 32c72749d9..66a21316a0 100644
> --- a/hw/9pfs/9p-local.h
> +++ b/hw/9pfs/9p-local.h
> @@ -13,8 +13,8 @@
>  #ifndef QEMU_9P_LOCAL_H
>  #define QEMU_9P_LOCAL_H
>  
> -int local_open_nofollow(FsContext *fs_ctx, const char *path, int flags,
> -mode_t mode);
> -int local_opendir_nofollow(FsContext *fs_ctx, const char *path);
> +QemuFd_t local_open_nofollow(FsContext *fs_ctx, const char *path, int flags,
> + mode_t mode);
> +QemuFd_t local_opendir_nofollow(FsContext *fs_ctx, const char *path);
>  
>  #endif
> diff --git a/hw/9pfs/9p-util.h b/hw/9pfs/9p-util.h
> index c314cf381d..3d6bd1a51e 100644
> --- a/hw/9pfs/9p-util.h
> +++ b/hw/9pfs/9p-util.h
> @@ -101,30 +101,31 @@ static inline int errno_to_dotl(int err) {
>  #define qemu_utimensat  utimensat
>  #define qemu_unlinkat   unlinkat
>  
> -static inline void close_preserve_errno(int fd)
> +static inline void close_preserve_errno(QemuFd_t fd)
>  {
>  int serrno = errno;
>  close(fd);
>  errno = serrno;
>  }
>  
> -static inline int openat_dir(int dirfd, const char *name)
> +static inline QemuFd_t openat_dir(QemuFd_t dirfd, const char *name)
>  {
>  return qemu_openat(dirfd, name,
> O_DIRECTORY | O_RDONLY | O_NOFOLLOW | O_PATH_9P_UTIL);
>  }
>  
> -static inline int openat_file(int dirfd, const char *name, int flags,
> -  mode_t mode)
> +static inline QemuFd_t openat_file(QemuFd_t dirfd, const char *name,
> +   int flags, mode_t mode)
>  {
> -int fd, serrno, ret;
> +int serrno, ret;
> +QemuFd_t fd;
>  
>  #ifndef CONFIG_DARWIN
>  again:
>  #endif
>  fd = qemu_openat(dirfd, name, flags | O_NOFOLLOW | O_NOCTTY | O_NONBLOCK,
>   mode);
> -if (fd == -1) {
> +if (qemu_fd_invalid(fd)) {
>  #ifndef CONFIG_DARWIN
>  if (errno == EPERM && (flags & O_NOATIME)) {
>  /*
> @@ -155,13 +156,13 @@ again:
>  return fd;
>  }
>  
> -ssize_t fgetxattrat_nofollow(int dirfd, const char *path, const char *name,
> - void *value, size_t size);
> -int fsetxattrat_nofollow(int dirfd, const char *path, const char *name,
> +ssize_t fgetxattrat_nofollow(QemuFd_t dirfd, const char *path,
> + const char *name, void *value, size_t size);
> +int fsetxattrat_nofollow(QemuFd_t dirfd, const char *path, const char *name,
>   void *value, size_t size, int flags);
> -ssize_t flistxattrat_nofollow(int dirfd, const char *filename,
> +ssize_t flistxattrat_nofollow(QemuFd_t dirfd, const char *filename,
>char *list, size_t size);
> -ssize_t fremovexattrat_nofollow(int dirfd, const char *filename,
> +ssize_t fremovexattrat_nofollow(QemuFd_t dirfd, const char *filename,
>  const char *name);
>  
>  /*
> @@ -219,6 +220,7 @@ static inline struct dirent *qemu_dirent_dup(struct 
> dirent *dent)
>  #if defined CONFIG_DARWIN && defined CONFIG_PTHREAD_FCHDIR_NP
>  int pthread_fchdir_np(int fd) __attribute__((weak_import));
>  #endif
> -int qemu_mknodat(int dirfd, const char *filename, mode_t mode, dev_t dev);
> 

Re: [PATCH v2 02/19] hw/9pfs: Drop unnecessary *xattr wrapper API declarations

2022-11-18 Thread Greg Kurz
On Fri, 11 Nov 2022 12:22:08 +0800
Bin Meng  wrote:

> These are not used anywhere in the source tree. Drop them.
> 
> Signed-off-by: Bin Meng 
> ---
> 

This one could even go through the trivial tree right
away IMHO.

Reviewed-by: Greg Kurz 

> (no changes since v1)
> 
>  hw/9pfs/9p-util.h | 11 ---
>  1 file changed, 11 deletions(-)
> 
> diff --git a/hw/9pfs/9p-util.h b/hw/9pfs/9p-util.h
> index c3526144c9..ccfc8b1cb3 100644
> --- a/hw/9pfs/9p-util.h
> +++ b/hw/9pfs/9p-util.h
> @@ -90,19 +90,8 @@ static inline int errno_to_dotl(int err) {
>  
>  #ifdef CONFIG_DARWIN
>  #define qemu_fgetxattr(...) fgetxattr(__VA_ARGS__, 0, 0)
> -#define qemu_lgetxattr(...) getxattr(__VA_ARGS__, 0, XATTR_NOFOLLOW)
> -#define qemu_llistxattr(...) listxattr(__VA_ARGS__, XATTR_NOFOLLOW)
> -#define qemu_lremovexattr(...) removexattr(__VA_ARGS__, XATTR_NOFOLLOW)
> -static inline int qemu_lsetxattr(const char *path, const char *name,
> - const void *value, size_t size, int flags) {
> -return setxattr(path, name, value, size, 0, flags | XATTR_NOFOLLOW);
> -}
>  #else
>  #define qemu_fgetxattr fgetxattr
> -#define qemu_lgetxattr lgetxattr
> -#define qemu_llistxattr llistxattr
> -#define qemu_lremovexattr lremovexattr
> -#define qemu_lsetxattr lsetxattr
>  #endif
>  
>  static inline void close_preserve_errno(int fd)




Re: [PATCH for-8.0] MAINTAINERS: downgrade PPC KVM/TCG CPUs and pSeries to 'Odd Fixes'

2022-11-17 Thread Greg Kurz
On Thu, 17 Nov 2022 12:32:18 -0300
Daniel Henrique Barboza  wrote:

> The maintainer is no longer being paid to maintain these components. All
> maintainership work is being done in his personal time since the middle
> of the 7.2 development cycle.
> 

Great thanks Daniel for all your contributions over
the years, and for being the one steering the vessel
to the dry dock. This is it.

> Change the status of PPC KVM CPUs, PPC TCG CPUs and the pSeries machine
> to 'Odd Fixes', reflecting that the maintainer no longer has exclusive
> time to dedicate to them. It'll also (hopefully) keep expectations under
> check when/if these components are used in a customer product.
> 
> Cc: Cédric Le Goater 
> Cc: David Gibson 
> Cc: Greg Kurz 
> Signed-off-by: Daniel Henrique Barboza 
> ---

Reviewed-by: Greg Kurz 

>  MAINTAINERS | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index be151f0024..1d43153e5f 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -264,7 +264,7 @@ R: Cédric Le Goater 
>  R: David Gibson 
>  R: Greg Kurz 
>  L: qemu-...@nongnu.org
> -S: Maintained
> +S: Odd Fixes
>  F: target/ppc/
>  F: hw/ppc/ppc.c
>  F: hw/ppc/ppc_booke.c
> @@ -389,7 +389,7 @@ M: Daniel Henrique Barboza 
>  R: Cédric Le Goater 
>  R: David Gibson 
>  R: Greg Kurz 
> -S: Maintained
> +S: Odd Fixes
>  F: target/ppc/kvm.c
>  
>  S390 KVM CPUs
> @@ -1367,7 +1367,7 @@ R: Cédric Le Goater 
>  R: David Gibson 
>  R: Greg Kurz 
>  L: qemu-...@nongnu.org
> -S: Maintained
> +S: Odd Fixes
>  F: hw/*/spapr*
>  F: include/hw/*/spapr*
>  F: hw/*/xics*




Re: [PATCH] target/ppc: Fix build warnings when building with 'disable-tcg'

2022-11-17 Thread Greg Kurz
On Thu, 17 Nov 2022 07:11:51 -0300
Daniel Henrique Barboza  wrote:

> Queued in gitlab.com/danielhb/qemu/tree/ppc-next with the following tags:
> 

You are planning a PR before 7.2-rc2, right ?

> 
>  Reported-by: Kowshik Jois B S 
>  Fixes: 61bd1d2942 ("target/ppc: Convert to tcg_ops restore_state_to_opc")
>  Fixes: 670f1da374 ("target/ppc: Implement hashst and hashchk")

The guard macro also covers the following two, introduced by yet another commit.

  HELPER_HASH(HASHSTP, env->spr[SPR_HASHPKEYR], true)
  HELPER_HASH(HASHCHKP, env->spr[SPR_HASHPKEYR], false)

Fixes: 53ae2aeb9407 ("target/ppc: Implement hashstp and hashchkp")

>  Resolves: https://gitlab.com/qemu-project/qemu/-/issues/377

Err... I don't see any relation with this issue.

Cedric ?

But this resolves the issue created by Vaibhav for 7.2 :

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1319

>  Signed-off-by: Vaibhav Jain 
>  Reviewed-by: Greg Kurz 
>  Reviewed-by: Philippe Mathieu-Daudé 
>  Tested-by: Kowshik Jois B S 
> 
> 
> Thanks,
> 
> 
> Daniel
> 
> On 11/16/22 10:17, Vaibhav Jain wrote:
> > Kowshik reported that building qemu with GCC 12.2.1 for 'ppc64-softmmu'
> > target is failing due to following build warnings:
> > 
> > 
> >   ../target/ppc/cpu_init.c:7018:13: error: 'ppc_restore_state_to_opc' 
> > defined but not used [-Werror=unused-function]
> >   7018 | static void ppc_restore_state_to_opc(CPUState *cs,
> > 
> > 
> > Fix this by wrapping these function definitions in 'ifdef CONFIG_TCG' so 
> > that
> > they are only defined if qemu is compiled with '--enable-tcg'
> > 
> > Reported-by: Kowshik Jois B S 
> > Signed-off-by: Vaibhav Jain 
> > ---
> >   target/ppc/cpu_init.c| 2 ++
> >   target/ppc/excp_helper.c | 2 ++
> >   2 files changed, 4 insertions(+)
> > 
> > diff --git a/target/ppc/cpu_init.c b/target/ppc/cpu_init.c
> > index 32e94153d1..cbf0081374 100644
> > --- a/target/ppc/cpu_init.c
> > +++ b/target/ppc/cpu_init.c
> > @@ -7015,6 +7015,7 @@ static vaddr ppc_cpu_get_pc(CPUState *cs)
> >   return cpu->env.nip;
> >   }
> >   
> > +#ifdef CONFIG_TCG
> >   static void ppc_restore_state_to_opc(CPUState *cs,
> >const TranslationBlock *tb,
> >const uint64_t *data)
> > @@ -7023,6 +7024,7 @@ static void ppc_restore_state_to_opc(CPUState *cs,
> >   
> >   cpu->env.nip = data[0];
> >   }
> > +#endif /* CONFIG_TCG */
> >   
> >   static bool ppc_cpu_has_work(CPUState *cs)
> >   {
> > diff --git a/target/ppc/excp_helper.c b/target/ppc/excp_helper.c
> > index a05a2ed595..94adcb766b 100644
> > --- a/target/ppc/excp_helper.c
> > +++ b/target/ppc/excp_helper.c
> > @@ -2842,6 +2842,7 @@ void helper_td(CPUPPCState *env, target_ulong arg1, 
> > target_ulong arg2,
> >   #endif
> >   #endif
> >   
> > +#ifdef CONFIG_TCG
> >   static uint32_t helper_SIMON_LIKE_32_64(uint32_t x, uint64_t key, 
> > uint32_t lane)
> >   {
> >   const uint16_t c = 0xfffc;
> > @@ -2924,6 +2925,7 @@ HELPER_HASH(HASHST, env->spr[SPR_HASHKEYR], true)
> >   HELPER_HASH(HASHCHK, env->spr[SPR_HASHKEYR], false)
> >   HELPER_HASH(HASHSTP, env->spr[SPR_HASHPKEYR], true)
> >   HELPER_HASH(HASHCHKP, env->spr[SPR_HASHPKEYR], false)
> > +#endif /* CONFIG_TCG */
> >   
> >   #if !defined(CONFIG_USER_ONLY)
> >   




Re: [PATCH] target/ppc: Fix build warnings when building with 'disable-tcg'

2022-11-16 Thread Greg Kurz
Hi Vaibhav,

Nice to see some people are still building QEMU at IBM ;-)

On Wed, 16 Nov 2022 18:47:43 +0530
Vaibhav Jain  wrote:

> Kowshik reported that building qemu with GCC 12.2.1 for 'ppc64-softmmu'
> target is failing due to following build warnings:
> 
> 
>  ../target/ppc/cpu_init.c:7018:13: error: 'ppc_restore_state_to_opc' defined 
> but not used [-Werror=unused-function]
>  7018 | static void ppc_restore_state_to_opc(CPUState *cs,
> 
> 
> Fix this by wrapping these function definitions in 'ifdef CONFIG_TCG' so that
> they are only defined if qemu is compiled with '--enable-tcg'
> 
> Reported-by: Kowshik Jois B S 
> Signed-off-by: Vaibhav Jain 
> ---

Reviewed-by: Greg Kurz 

This was introduced by a recent commit.

Fixes: 61bd1d29421a ("target/ppc: Convert to tcg_ops restore_state_to_opc")
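
For anyone hitting the same class of failure, here is a minimal sketch of
the pattern the patch applies: a static helper whose only caller is compiled
conditionally has to sit under the same guard, otherwise
-Werror=unused-function trips when the feature is configured out.
SKETCH_CONFIG_TCG, restore_pc() and get_pc() are made-up names for
illustration, not QEMU symbols.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the build-time switch; QEMU's build system sets the real one. */
#define SKETCH_CONFIG_TCG 1

#ifdef SKETCH_CONFIG_TCG
/* Only referenced from the guarded code below, so it is guarded as well. */
static int restore_pc(const unsigned long *data)
{
    return (int)data[0];
}
#endif /* SKETCH_CONFIG_TCG */

int get_pc(const unsigned long *data)
{
    assert(data != NULL);
#ifdef SKETCH_CONFIG_TCG
    return restore_pc(data);
#else
    return -1; /* no translator built in: nothing to restore from */
#endif
}
```

Dropping the #define (and the guard around restore_pc()) is the quickest way
to reproduce the "defined but not used" diagnostic with -Wall -Werror.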


Vaibhav,

This is serious enough that it should get fixed in 7.2. Please file an
issue as explained in [1].

Cheers,

--
Greg

[1] https://lists.nongnu.org/archive/html/qemu-devel/2022-11/msg00137.html

>  target/ppc/cpu_init.c| 2 ++
>  target/ppc/excp_helper.c | 2 ++
>  2 files changed, 4 insertions(+)
> 
> diff --git a/target/ppc/cpu_init.c b/target/ppc/cpu_init.c
> index 32e94153d1..cbf0081374 100644
> --- a/target/ppc/cpu_init.c
> +++ b/target/ppc/cpu_init.c
> @@ -7015,6 +7015,7 @@ static vaddr ppc_cpu_get_pc(CPUState *cs)
>  return cpu->env.nip;
>  }
>  
> +#ifdef CONFIG_TCG
>  static void ppc_restore_state_to_opc(CPUState *cs,
>   const TranslationBlock *tb,
>   const uint64_t *data)
> @@ -7023,6 +7024,7 @@ static void ppc_restore_state_to_opc(CPUState *cs,
>  
>  cpu->env.nip = data[0];
>  }
> +#endif /* CONFIG_TCG */
>  
>  static bool ppc_cpu_has_work(CPUState *cs)
>  {
> diff --git a/target/ppc/excp_helper.c b/target/ppc/excp_helper.c
> index a05a2ed595..94adcb766b 100644
> --- a/target/ppc/excp_helper.c
> +++ b/target/ppc/excp_helper.c
> @@ -2842,6 +2842,7 @@ void helper_td(CPUPPCState *env, target_ulong arg1, 
> target_ulong arg2,
>  #endif
>  #endif
>  
> +#ifdef CONFIG_TCG
>  static uint32_t helper_SIMON_LIKE_32_64(uint32_t x, uint64_t key, uint32_t 
> lane)
>  {
>  const uint16_t c = 0xfffc;
> @@ -2924,6 +2925,7 @@ HELPER_HASH(HASHST, env->spr[SPR_HASHKEYR], true)
>  HELPER_HASH(HASHCHK, env->spr[SPR_HASHKEYR], false)
>  HELPER_HASH(HASHSTP, env->spr[SPR_HASHPKEYR], true)
>  HELPER_HASH(HASHCHKP, env->spr[SPR_HASHPKEYR], false)
> +#endif /* CONFIG_TCG */
>  
>  #if !defined(CONFIG_USER_ONLY)
>  




Re: [PATCH for-8.0] hw: Add compat machines for 8.0

2022-11-15 Thread Greg Kurz
On Fri, 11 Nov 2022 13:45:34 +0100
Cornelia Huck  wrote:

> Add 8.0 machine types for arm/i440fx/m68k/q35/s390x/spapr.
> 
> Signed-off-by: Cornelia Huck 
> ---
>  hw/arm/virt.c  |  9 -
>  hw/core/machine.c  |  3 +++
>  hw/i386/pc.c   |  3 +++
>  hw/i386/pc_piix.c  | 14 +-
>  hw/i386/pc_q35.c   | 13 -
>  hw/m68k/virt.c |  9 -
>  hw/ppc/spapr.c | 15 +--
>  hw/s390x/s390-virtio-ccw.c | 14 +-
>  include/hw/boards.h|  3 +++
>  include/hw/i386/pc.h   |  3 +++
>  10 files changed, 79 insertions(+), 7 deletions(-)
> 
> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> index b87135085610..2a46660980e7 100644
> --- a/hw/arm/virt.c
> +++ b/hw/arm/virt.c
> @@ -3096,10 +3096,17 @@ static void machvirt_machine_init(void)
>  }
>  type_init(machvirt_machine_init);
>  
> +static void virt_machine_8_0_options(MachineClass *mc)
> +{
> +}
> +DEFINE_VIRT_MACHINE_AS_LATEST(8, 0)
> +
>  static void virt_machine_7_2_options(MachineClass *mc)
>  {
> +virt_machine_8_0_options(mc);
> +compat_props_add(mc->compat_props, hw_compat_7_2, hw_compat_7_2_len);
>  }
> -DEFINE_VIRT_MACHINE_AS_LATEST(7, 2)
> +DEFINE_VIRT_MACHINE(7, 2)
>  
>  static void virt_machine_7_1_options(MachineClass *mc)
>  {
> diff --git a/hw/core/machine.c b/hw/core/machine.c
> index 8d34caa31dc8..f264fb53b46c 100644
> --- a/hw/core/machine.c
> +++ b/hw/core/machine.c
> @@ -40,6 +40,9 @@
>  #include "hw/virtio/virtio-pci.h"
>  #include "qom/object_interfaces.h"
>  
> +GlobalProperty hw_compat_7_2[] ={};

Missing space between '=' and '{}'.

Anyway, for ppc parts:

Reviewed-by: Greg Kurz 

> +const size_t hw_compat_7_2_len = G_N_ELEMENTS(hw_compat_7_2);
> +
>  GlobalProperty hw_compat_7_1[] = {
>  { "virtio-device", "queue_reset", "false" },
>  };
> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> index 546b703cb42c..9aeff77e9dca 100644
> --- a/hw/i386/pc.c
> +++ b/hw/i386/pc.c
> @@ -107,6 +107,9 @@
>  { "qemu64-" TYPE_X86_CPU, "model-id", "QEMU Virtual CPU version " v, },\
>  { "athlon-" TYPE_X86_CPU, "model-id", "QEMU Virtual CPU version " v, },
>  
> +GlobalProperty pc_compat_7_2[] = {};
> +const size_t pc_compat_7_2_len = G_N_ELEMENTS(pc_compat_7_2);
> +
>  GlobalProperty pc_compat_7_1[] = {};
>  const size_t pc_compat_7_1_len = G_N_ELEMENTS(pc_compat_7_1);
>  
> diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
> index 0ad0ed160387..1c0a7b83b545 100644
> --- a/hw/i386/pc_piix.c
> +++ b/hw/i386/pc_piix.c
> @@ -435,7 +435,7 @@ static void pc_i440fx_machine_options(MachineClass *m)
>  machine_class_allow_dynamic_sysbus_dev(m, TYPE_VMBUS_BRIDGE);
>  }
>  
> -static void pc_i440fx_7_2_machine_options(MachineClass *m)
> +static void pc_i440fx_8_0_machine_options(MachineClass *m)
>  {
>  PCMachineClass *pcmc = PC_MACHINE_CLASS(m);
>  pc_i440fx_machine_options(m);
> @@ -444,6 +444,18 @@ static void pc_i440fx_7_2_machine_options(MachineClass 
> *m)
>  pcmc->default_cpu_version = 1;
>  }
>  
> +DEFINE_I440FX_MACHINE(v8_0, "pc-i440fx-8.0", NULL,
> +  pc_i440fx_8_0_machine_options);
> +
> +static void pc_i440fx_7_2_machine_options(MachineClass *m)
> +{
> +pc_i440fx_8_0_machine_options(m);
> +m->alias = NULL;
> +m->is_default = false;
> +compat_props_add(m->compat_props, hw_compat_7_2, hw_compat_7_2_len);
> +compat_props_add(m->compat_props, pc_compat_7_2, pc_compat_7_2_len);
> +}
> +
>  DEFINE_I440FX_MACHINE(v7_2, "pc-i440fx-7.2", NULL,
>pc_i440fx_7_2_machine_options);
>  
> diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
> index a496bd6e74f5..10bb49f679b0 100644
> --- a/hw/i386/pc_q35.c
> +++ b/hw/i386/pc_q35.c
> @@ -370,7 +370,7 @@ static void pc_q35_machine_options(MachineClass *m)
>  m->max_cpus = 288;
>  }
>  
> -static void pc_q35_7_2_machine_options(MachineClass *m)
> +static void pc_q35_8_0_machine_options(MachineClass *m)
>  {
>  PCMachineClass *pcmc = PC_MACHINE_CLASS(m);
>  pc_q35_machine_options(m);
> @@ -378,6 +378,17 @@ static void pc_q35_7_2_machine_options(MachineClass *m)
>  pcmc->default_cpu_version = 1;
>  }
>  
> +DEFINE_Q35_MACHINE(v8_0, "pc-q35-8.0", NULL,
> +   pc_q35_8_0_machine_options);
> +
> +static void pc_q35_7_2_machine_options(MachineClass *m)
> +{
> +pc_q35_8_0_machine_options(m);
> +m->alias = NULL;
> +compat_props_add(m->compa

[PATCH v3 2/2] util/log: Always send errors to logfile when daemonized

2022-11-08 Thread Greg Kurz
When QEMU is started with `-daemonize`, all stdio descriptors get
redirected to `/dev/null`. This basically means that anything
printed with error_report() and friends is lost.

The current logging code can redirect to a file with `-D`, but this
only works if some logging item is enabled with `-d` as well.

Relax the check on the log flags when QEMU is daemonized, so that
other users of stderr can benefit from the redirection, without the
need to enable unwanted debug logs. Previous behaviour is retained
for the non-daemonized case. The logic is unrolled as an `if` for
better readability. The qemu_log_level and log_per_thread globals
reflect the state we want to transition to at this point : use
them instead of the intermediary locals for correctness.

qemu_set_log_internal() is adapted to open a per-thread log file
when '-d tid' is passed. This is done by hijacking qemu_log_trylock(),
which seems simpler than refactoring the code.

Signed-off-by: Greg Kurz 
---
 util/log.c | 72 --
 1 file changed, 53 insertions(+), 19 deletions(-)

diff --git a/util/log.c b/util/log.c
index fb843453dd49..7837ff991769 100644
--- a/util/log.c
+++ b/util/log.c
@@ -79,13 +79,15 @@ static int log_thread_id(void)
 
 static void qemu_log_thread_cleanup(Notifier *n, void *unused)
 {
-fclose(thread_file);
-thread_file = NULL;
+if (thread_file != stderr) {
+fclose(thread_file);
+thread_file = NULL;
+}
 }
 
 /* Lock/unlock output. */
 
-FILE *qemu_log_trylock(void)
+static FILE *qemu_log_trylock_with_err(Error **errp)
 {
 FILE *logfile;
 
@@ -96,6 +98,9 @@ FILE *qemu_log_trylock(void)
 = g_strdup_printf(global_filename, log_thread_id());
 logfile = fopen(filename, "w");
 if (!logfile) {
+error_setg_errno(errp, errno,
+ "Error opening logfile %s for thread %d",
+ filename, log_thread_id());
 return NULL;
 }
 thread_file = logfile;
@@ -122,6 +127,11 @@ FILE *qemu_log_trylock(void)
 return logfile;
 }
 
+FILE *qemu_log_trylock(void)
+{
+return qemu_log_trylock_with_err(NULL);
+}
+
 void qemu_log_unlock(FILE *logfile)
 {
 if (logfile) {
@@ -265,16 +275,21 @@ static bool qemu_set_log_internal(const char *filename, 
bool changed_name,
 #endif
 qemu_loglevel = log_flags;
 
-/*
- * In all cases we only log if qemu_loglevel is set.
- * Also:
- *   If per-thread, open the file for each thread in qemu_log_lock.
- *   If not daemonized we will always log either to stderr
- * or to a file (if there is a filename).
- *   If we are daemonized, we will only log if there is a filename.
- */
 daemonized = is_daemonized();
-need_to_open_file = log_flags && !per_thread && (!daemonized || filename);
+need_to_open_file = false;
+if (!daemonized) {
+/*
+ * If not daemonized we only log if qemu_loglevel is set, either to
+ * stderr or to a file (if there is a filename).
+ * If per-thread, open the file for each thread in qemu_log_trylock().
+ */
+need_to_open_file = qemu_loglevel && !log_per_thread;
+} else {
+/*
+ * If we are daemonized, we will only log if there is a filename.
+ */
+need_to_open_file = filename != NULL;
+}
 
 if (logfile) {
 fflush(logfile);
@@ -287,19 +302,34 @@ static bool qemu_set_log_internal(const char *filename, 
bool changed_name,
 }
 }
 
+if (log_per_thread && daemonized) {
+logfile = thread_file;
+}
+
 if (!logfile && need_to_open_file) {
 if (filename) {
-logfile = fopen(filename, "w");
-if (!logfile) {
-error_setg_errno(errp, errno, "Error opening logfile %s",
- filename);
-return false;
+if (log_per_thread) {
+logfile = qemu_log_trylock_with_err(errp);
+if (!logfile) {
+return false;
+}
+qemu_log_unlock(logfile);
+} else {
+logfile = fopen(filename, "w");
+if (!logfile) {
+error_setg_errno(errp, errno, "Error opening logfile %s",
+ filename);
+return false;
+}
 }
 /* In case we are a daemon redirect stderr to logfile */
 if (daemonized) {
 dup2(fileno(logfile), STDERR_FILENO);
 fclose(logfile);
-/* This will skip closing logfile in rcu_close_file. */
+/*
+ * This will skip closing logfile in rcu_close_file()
+ * or qemu_log_thre

[PATCH v3 1/2] util/log: do not close and reopen log files when flags are turned off

2022-11-08 Thread Greg Kurz
From: Paolo Bonzini 

log_append makes sure that if you turn off the logging (which clears
log_flags and makes need_to_open_file false) the old log is not
overwritten.  The usecase is that if you remove or move the file
QEMU will not keep writing to the old file.  However, this is
not always the desired behavior, in particular having log_append==1
after changing the file name makes little sense.

When qemu_set_log_internal is called from the logfile monitor
command, filename must be non-NULL and therefore changed_name must
be true.  Therefore, the only case where the file is closed and
need_to_open_file == false is indeed when log_flags becomes
zero.  In this case, just flush the file and do not bother
closing it, thus faking the same append behavior as previously.

The behavioral change is that changing the logfile twice, for
example log1 -> log2 -> log1, will cause log1 to be overwritten.
This can simply be documented, since it is not a particularly
surprising behavior.

Suggested-by: Alex Bennée 
Signed-off-by: Paolo Bonzini 
Reviewed-by: Richard Henderson 
Reviewed-by: Greg Kurz 
Message-Id: <20221025092119.236224-1-pbonz...@redhat.com>
[groug: nullify global_file before actually closing the file]
Signed-off-by: Greg Kurz 
---
 util/log.c | 14 ++
 1 file changed, 6 insertions(+), 8 deletions(-)

diff --git a/util/log.c b/util/log.c
index c2198badf240..fb843453dd49 100644
--- a/util/log.c
+++ b/util/log.c
@@ -45,7 +45,6 @@ static __thread FILE *thread_file;
 static __thread Notifier qemu_log_thread_cleanup_notifier;
 
 int qemu_loglevel;
-static bool log_append;
 static bool log_per_thread;
 static GArray *debug_regions;
 
@@ -277,19 +276,20 @@ static bool qemu_set_log_internal(const char *filename, 
bool changed_name,
 daemonized = is_daemonized();
 need_to_open_file = log_flags && !per_thread && (!daemonized || filename);
 
-if (logfile && (!need_to_open_file || changed_name)) {
-qatomic_rcu_set(&global_file, NULL);
-if (logfile != stderr) {
+if (logfile) {
+fflush(logfile);
+if (changed_name && logfile != stderr) {
 RCUCloseFILE *r = g_new0(RCUCloseFILE, 1);
 r->fd = logfile;
+qatomic_rcu_set(&global_file, NULL);
 call_rcu(r, rcu_close_file, rcu);
+logfile = NULL;
 }
-logfile = NULL;
 }
 
 if (!logfile && need_to_open_file) {
 if (filename) {
-logfile = fopen(filename, log_append ? "a" : "w");
+logfile = fopen(filename, "w");
 if (!logfile) {
 error_setg_errno(errp, errno, "Error opening logfile %s",
  filename);
@@ -308,8 +308,6 @@ static bool qemu_set_log_internal(const char *filename, 
bool changed_name,
 logfile = stderr;
 }
 
-log_append = 1;
-
 qatomic_rcu_set(&global_file, logfile);
 }
 return true;
-- 
2.38.1
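
As a side note on the log1 -> log2 -> log1 case from the commit message,
the overwrite follows directly from always opening with "w". Below is a
small standalone sketch of that fopen() behaviour; write_line() and
read_first_line() are hypothetical helpers for the demonstration and have
nothing to do with util/log.c.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Write one line to path, opening the file with the given fopen() mode. */
int write_line(const char *path, const char *mode, const char *line)
{
    FILE *f;

    assert(path && mode && line);
    f = fopen(path, mode);
    if (!f) {
        return -1;
    }
    fputs(line, f);
    return fclose(f);
}

/* Copy the first line of path into buf; returns 0 on success. */
int read_first_line(const char *path, char *buf, size_t len)
{
    FILE *f = fopen(path, "r");

    if (!f) {
        return -1;
    }
    if (!fgets(buf, (int)len, f)) {
        buf[0] = '\0'; /* empty file */
    }
    fclose(f);
    return 0;
}
```

Writing "first" and then "second" to the same path with mode "w" leaves only
"second" in the file, whereas mode "a" on the second call would have kept
"first" as the first line.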




[PATCH v3 0/2] util/log: Always send errors to logfile when daemonized

2022-11-08 Thread Greg Kurz
When QEMU is started with `--daemonize -D ${logfile} -d ${some_log_item}`,
error logs from error_report() and friends go to ${logfile}, but if QEMU
is started with `-daemonize -D ${logfile}` and no `-d`, the file isn't
even created and all logs go to /dev/null.

This inconsistency is quite confusing for users and gives the impression
that QEMU doesn't log errors at all. It seems much saner to always create
the log file when `-D` was passed and to be able to report errors.

It was spotted by the kata-containers project, which happens to do just
that `--daemonize -D` without `-d` trick.
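
To make the change in behaviour concrete, here is a rough sketch of the
decision before and after the series, written as two pure functions. It
mirrors the need_to_open_file expression in qemu_set_log_internal(), but the
function names and the flattened parameters are only for illustration and
are not part of util/log.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Before: nothing is opened unless some `-d` item is enabled. */
bool need_to_open_file_old(bool daemonized, int log_flags,
                           bool per_thread, const char *filename)
{
    return log_flags && !per_thread && (!daemonized || filename != NULL);
}

/* After: when daemonized, a filename alone is enough. */
bool need_to_open_file_new(bool daemonized, int log_flags,
                           bool per_thread, const char *filename)
{
    if (!daemonized) {
        /* Unchanged: log only if some item is enabled. */
        return log_flags && !per_thread;
    }
    return filename != NULL;
}

/* The `-daemonize -D file` without `-d` case described above. */
void check_daemonized_without_d(void)
{
    assert(!need_to_open_file_old(true, 0, false, "qemu.log"));
    assert(need_to_open_file_new(true, 0, false, "qemu.log"));
}
```

The non-daemonized rows of the truth table are identical in both versions,
which is the "previous behaviour is retained" part of the series.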

v3:
- drop log_append (Paolo's patch)
- new approach : call qemu_log_trylock() from qemu_set_log_internal() in
  the per-thread case, instead of trying to special case the main thread

v2:
- new log_thread_id() implementation for hosts without gettid() syscall
- avoid conflict between global log file and per-thread logfile
- style improvements

Greg Kurz (1):
  util/log: Always send errors to logfile when daemonized

Paolo Bonzini (1):
  util/log: do not close and reopen log files when flags are turned off

 util/log.c | 84 +-
 1 file changed, 58 insertions(+), 26 deletions(-)

-- 
2.38.1





Re: [PATCH 0/2] util/log: Make the per-thread flag immutable

2022-11-07 Thread Greg Kurz
On Sat, 5 Nov 2022 09:37:26 +1100
Richard Henderson  wrote:

> On 11/4/22 23:00, Greg Kurz wrote:
> > While working on the "util/log: Always send errors to logfile when 
> > daemonized"
> > series [1], I've encountered some issues with the per-thread flag. They stem
> > from the code not being designed to allow the per-thread flag to be enabled
> > or disabled more than once, but nothing is done to prevent that from
> > happening. This results in unexpected results like the creation of a log
> > file with a `%d` in its name or confusing errors when using the `log`
> > command in the monitor.
> > 
> > I'm posting fixes separately now in case it makes sense to merge them during
> > soft freeze. If so, I'll open an issue as explained in this recent mail [2].
> > 
> > [1] https://patchew.org/QEMU/20221019151651.334334-1-gr...@kaod.org/
> > [2] https://lists.nongnu.org/archive/html/qemu-devel/2022-11/msg00137.html
> > 
> > Date: Wed, 19 Oct 2022 17:16:49 +0200
> > Message-ID: <20221019151651.334334-1-gr...@kaod.org>
> > 
> > Greg Kurz (2):
> >util/log: Make the per-thread flag immutable
> >util/log: Ignore per-thread flag if global file already there
> > 
> >   util/log.c | 9 +
> >   1 file changed, 9 insertions(+)
> > 
> 
> Series:
> Reviewed-by: Richard Henderson 
> 

Thanks for the quick review Richard !

I've created https://gitlab.com/qemu-project/qemu/-/issues/1302 with
a 7.2 milestone.

Paolo,

Can you queue this ?

Cheers,

--
Greg

> 
> r~




[PATCH 0/2] util/log: Make the per-thread flag immutable

2022-11-04 Thread Greg Kurz
While working on the "util/log: Always send errors to logfile when daemonized"
series [1], I've encountered some issues with the per-thread flag. They stem
from the code not being designed to allow the per-thread flag to be enabled
or disabled more than once, but nothing is done to prevent that from
happening. This results in unexpected results like the creation of a log
file with a `%d` in its name or confusing errors when using the `log`
command in the monitor.

I'm posting fixes separately now in case it makes sense to merge them during
soft freeze. If so, I'll open an issue as explained in this recent mail [2].

[1] https://patchew.org/QEMU/20221019151651.334334-1-gr...@kaod.org/
[2] https://lists.nongnu.org/archive/html/qemu-devel/2022-11/msg00137.html

Date: Wed, 19 Oct 2022 17:16:49 +0200
Message-ID: <20221019151651.334334-1-gr...@kaod.org>

Greg Kurz (2):
  util/log: Make the per-thread flag immutable
  util/log: Ignore per-thread flag if global file already there

 util/log.c | 9 +
 1 file changed, 9 insertions(+)

-- 
2.38.1





[PATCH 2/2] util/log: Ignore per-thread flag if global file already there

2022-11-04 Thread Greg Kurz
If QEMU is started with `-D qemu.log.%d` without any `-d` option,
doing `log all` in the monitor fails with:

Filename template with '%d' required for 'tid'

It is confusing since '%d' was actually passed.

This happens because QEMU caches the log file name with %d converted
to getpid() since `tid` wasn't required. This name isn't suitable
for a subsequent enablement of per-thread logs. There's little cause
to change the behavior as `-d tid` is mostly used at user-only startup.

Drop the per-thread from the requested flags in this case : `log all`
will thus enable everything except `tid` instead of failing. This is
preferable over forcing the user to enable each log item individually.

With this patch, `tid` is now truly immutable : it can only be set
or unset from the command line and never changed afterwards.

Fixes: 4e51069d6793 ("util/log: Support per-thread log files")
Cc: richard.hender...@linaro.org
Signed-off-by: Greg Kurz 
---
 util/log.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/util/log.c b/util/log.c
index b7d2b6e09cfe..c2198badf240 100644
--- a/util/log.c
+++ b/util/log.c
@@ -209,6 +209,10 @@ static bool qemu_set_log_internal(const char *filename, 
bool changed_name,
 /* The per-thread flag is immutable. */
 if (log_per_thread) {
 log_flags |= LOG_PER_THREAD;
+} else {
+if (global_filename) {
+log_flags &= ~LOG_PER_THREAD;
+}
 }
 
 per_thread = log_flags & LOG_PER_THREAD;
-- 
2.38.1
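
Taken together with patch 1/2, the flag handling at the top of
qemu_set_log_internal() boils down to the sketch below. The bit values and
the fixup_per_thread_flag() helper are placeholders chosen for the example;
the real LOG_PER_THREAD definition lives in QEMU's log header and the real
code works on the log_per_thread and global_filename globals directly.

```c
#include <assert.h>
#include <stdbool.h>

/* Placeholder bits for the sketch, not QEMU's actual values. */
#define SKETCH_LOG_PER_THREAD (1 << 0)
#define SKETCH_LOG_OTHER      (1 << 1)

/*
 * Force the per-thread bit to follow the state chosen at startup:
 * keep it once per-thread logging is active, and drop it when a
 * global log file name has already been cached.
 */
int fixup_per_thread_flag(int log_flags, bool log_per_thread,
                          bool have_global_filename)
{
    if (log_per_thread) {
        log_flags |= SKETCH_LOG_PER_THREAD;
    } else if (have_global_filename) {
        log_flags &= ~SKETCH_LOG_PER_THREAD;
    }
    return log_flags;
}

/* `log all` from the monitor after starting with `-D qemu.log.%d` only. */
void check_log_all_from_monitor(void)
{
    int all = SKETCH_LOG_OTHER | SKETCH_LOG_PER_THREAD;

    assert(fixup_per_thread_flag(all, false, true) == SKETCH_LOG_OTHER);
}
```

The third combination (no per-thread logging yet and no cached file name)
leaves the flags untouched, which is what lets `-d tid` take effect on the
command line.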




[PATCH 1/2] util/log: Make the per-thread flag immutable

2022-11-04 Thread Greg Kurz
Per-thread logging was implemented under the assumption that once
enabled, it is not possible to switch back to single file logging.
This isn't enforced though and it is possible to go through the
global file opening sequence in per-thread mode. The code isn't
ready for this and produces unexpected results as detailed below.

Start QEMU in system emulation mode with `-D ./qemu.log.%d -d tid`
and then change the log level from the monitor to something that
doesn't have tid, e.g. `log cpu_reset`. The value of log_flags
is zero and per_thread is set to false : the rest of the code
then assumes it is running in the global log case and opens a
file named `qemu.log.%d`, which is obviously not an expected
behavior.

Enforce the immutability of the flag early in qemu_set_log_internal()
so that its value is correct for all subsequent users.

Fixes: 4e51069d6793 ("util/log: Support per-thread log files")
Cc: richard.hender...@linaro.org
Signed-off-by: Greg Kurz 
---
 util/log.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/util/log.c b/util/log.c
index 39866bdaf2fa..b7d2b6e09cfe 100644
--- a/util/log.c
+++ b/util/log.c
@@ -206,6 +206,11 @@ static bool qemu_set_log_internal(const char *filename, 
bool changed_name,
 QEMU_LOCK_GUARD(&global_mutex);
 logfile = global_file;
 
+/* The per-thread flag is immutable. */
+if (log_per_thread) {
+log_flags |= LOG_PER_THREAD;
+}
+
 per_thread = log_flags & LOG_PER_THREAD;
 
 if (changed_name) {
-- 
2.38.1




Re: [PATCH] util/log: Close per-thread log file on thread termination

2022-10-27 Thread Greg Kurz
Cc'ing stable

On Fri, 21 Oct 2022 12:57:34 +0200
Greg Kurz  wrote:

> When `-D ${logfile} -d tid` is passed, qemu_log_trylock() creates
> a dedicated log file for the current thread and opens it. The
> corresponding file descriptor is cached in a __thread variable.
> Nothing is done to close the corresponding file descriptor when the
> thread terminates though and the file descriptor is leaked.
> 
> The issue was found during code inspection and reproduced manually.
> 
> Fix that with an atexit notifier.
> 
> Fixes: 4e51069d6793 ("util/log: Support per-thread log files")
> Cc: richard.hender...@linaro.org
> Signed-off-by: Greg Kurz 
> ---
>  util/log.c | 9 +
>  1 file changed, 9 insertions(+)
> 
> diff --git a/util/log.c b/util/log.c
> index d6eb0378c3a3..39866bdaf2fa 100644
> --- a/util/log.c
> +++ b/util/log.c
> @@ -42,6 +42,7 @@ static QemuMutex global_mutex;
>  static char *global_filename;
>  static FILE *global_file;
>  static __thread FILE *thread_file;
> +static __thread Notifier qemu_log_thread_cleanup_notifier;
>  
>  int qemu_loglevel;
>  static bool log_append;
> @@ -77,6 +78,12 @@ static int log_thread_id(void)
>  #endif
>  }
>  
> +static void qemu_log_thread_cleanup(Notifier *n, void *unused)
> +{
> +fclose(thread_file);
> +thread_file = NULL;
> +}
> +
>  /* Lock/unlock output. */
>  
>  FILE *qemu_log_trylock(void)
> @@ -93,6 +100,8 @@ FILE *qemu_log_trylock(void)
>  return NULL;
>  }
>  thread_file = logfile;
> +qemu_log_thread_cleanup_notifier.notify = 
> qemu_log_thread_cleanup;
> +qemu_thread_atexit_add(&qemu_log_thread_cleanup_notifier);
>  } else {
>  rcu_read_lock();
>  /*
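
For stable reviewers who want to see the idea outside of QEMU: the sketch
below closes a per-thread stream at thread exit. It deliberately swaps in a
plain POSIX thread-specific-data destructor instead of QEMU's Notifier and
qemu_thread_atexit_add(), and every name in it (thread_log(), run_worker(),
closed_files) is invented for the example.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>

static pthread_key_t log_key;
static pthread_once_t log_key_once = PTHREAD_ONCE_INIT;
static int closed_files; /* demonstration counter, read after join only */

/* Runs at thread exit for every thread that stored a non-NULL stream. */
static void close_thread_log(void *opaque)
{
    FILE *f = opaque;

    if (f && f != stderr) {
        fclose(f);
        closed_files++;
    }
}

static void make_log_key(void)
{
    pthread_key_create(&log_key, close_thread_log);
}

/* Return the calling thread's log stream, opening it on first use. */
FILE *thread_log(const char *path)
{
    FILE *f;

    assert(path != NULL);
    pthread_once(&log_key_once, make_log_key);
    f = pthread_getspecific(log_key);
    if (!f) {
        f = fopen(path, "w");
        if (f) {
            pthread_setspecific(log_key, f);
        }
    }
    return f;
}

static void *worker(void *opaque)
{
    FILE *f = thread_log(opaque);

    if (f) {
        fputs("hello from worker\n", f);
    }
    return NULL;
}

/* Run one worker to completion; returns how many streams were closed. */
int run_worker(const char *path)
{
    pthread_t t;

    if (pthread_create(&t, NULL, worker, (void *)path) != 0) {
        return -1;
    }
    pthread_join(t, NULL);
    return closed_files;
}
```

Without the destructor, each worker that logged would leave one descriptor
open after pthread_join(), which is the leak described in the commit message.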




Re: [PATCH] util/log: do not close and reopen log files when flags are turned off

2022-10-26 Thread Greg Kurz
On Tue, 25 Oct 2022 22:27:36 +0200
Paolo Bonzini  wrote:

> Il mar 25 ott 2022, 16:39 Greg Kurz  ha scritto:
> 
> > > > -if (logfile && (!need_to_open_file || changed_name)) {
> > > > -qatomic_rcu_set(&global_file, NULL);
> >
> > Hmm... wait, shouldn't this NULLifying be performed...
> >
> > > > -if (logfile != stderr) {
> > > > +if (logfile) {
> > > > +fflush(logfile);
> > > > +if (changed_name && logfile != stderr) {
> > > >  RCUCloseFILE *r = g_new0(RCUCloseFILE, 1);
> > > >  r->fd = logfile;
> >
> >
> > ... here since the following closes the global_file ?
> >
> > > >  call_rcu(r, rcu_close_file, rcu);
> >
> 
> Yes it should.
> 

I'll fix this when I repost the full series.
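
For the record, the ordering I have in mind looks like the sketch below:
unpublish the shared pointer first, and only then hand the old stream over
for a deferred close. This is a single-threaded illustration using C11
atomics and a one-slot "pending" queue as a stand-in for
qatomic_rcu_set()/call_rcu(); it does not implement RCU, and all names are
made up for the example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdio.h>

static _Atomic(FILE *) shared_log; /* what readers dereference */
static FILE *pending_close;        /* stand-in for the call_rcu() queue */

void publish_log(FILE *f)
{
    atomic_store(&shared_log, f);
}

FILE *current_log(void)
{
    return atomic_load(&shared_log);
}

/* Retire the current stream: unpublish it, then schedule the close. */
void retire_log(void)
{
    FILE *old = atomic_load(&shared_log);

    if (old && old != stderr) {
        assert(pending_close == NULL);
        atomic_store(&shared_log, (FILE *)NULL); /* 1. no new readers */
        pending_close = old;                     /* 2. close comes later */
    }
}

/* Stand-in for the deferred callback running after the grace period. */
void run_deferred_close(void)
{
    if (pending_close) {
        fclose(pending_close);
        pending_close = NULL;
    }
}
```

With the two steps in the other order, a reader could still pick up the
pointer after the close had been queued.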

Cheers,

--
Greg


> Paolo
> 
> > > +logfile = NULL;
> > > >  }
> > > > -logfile = NULL;
> > > >  }
> > > >
> > > >  if (!logfile && need_to_open_file) {
> > > >  if (filename) {
> > > > -logfile = fopen(filename, log_append ? "a" : "w");
> > > > +logfile = fopen(filename, "w");
> > > >  if (!logfile) {
> > > >  error_setg_errno(errp, errno, "Error opening logfile
> > %s",
> > > >   filename);
> > > > @@ -290,8 +289,6 @@ static bool qemu_set_log_internal(const char
> > *filename, bool changed_name,
> > > >  logfile = stderr;
> > > >  }
> > > >
> > > > -log_append = 1;
> > > > -
> > > >  qatomic_rcu_set(&global_file, logfile);
> > > >  }
> > > >  return true;
> > >
> > >
> >
> >




Re: [PATCH] util/log: do not close and reopen log files when flags are turned off

2022-10-25 Thread Greg Kurz
On Tue, 25 Oct 2022 14:33:15 +0200
Greg Kurz  wrote:

> On Tue, 25 Oct 2022 11:21:19 +0200
> Paolo Bonzini  wrote:
> 
> > log_append makes sure that if you turn off the logging (which clears
> > log_flags and makes need_to_open_file false) the old log is not
> > overwritten.  The usecase is that if you remove or move the file
> > QEMU will not keep writing to the old file.  However, this is
> > not always the desired behavior, in particular having log_append==1
> > after changing the file name makes little sense.
> > 
> > When qemu_set_log_internal is called from the logfile monitor
> > command, filename must be non-NULL and therefore changed_name must
> > be true.  Therefore, the only case where the file is closed and
> > need_to_open_file == false is indeed when log_flags becomes
> > zero.  In this case, just flush the file and do not bother
> > closing it, thus faking the same append behavior as previously.
> > 
> > The behavioral change is that changing the logfile twice, for
> > example log1 -> log2 -> log1, will cause log1 to be overwritten.
> > This can simply be documented, since it is not a particularly
> > surprising behavior.
> > 
> > Suggested-by: Alex Bennée 
> > Signed-off-by: Paolo Bonzini 
> > ---
> 
> Heh I currently have a very similar patch in my tree :-)
> 
> Reviewed-by: Greg Kurz 
> 
> I'll include this and other bug fixes as prerequisites for my
> on-going work on logging when daemonized.
> 
> >  util/log.c | 13 +
> >  1 file changed, 5 insertions(+), 8 deletions(-)
> > 
> > diff --git a/util/log.c b/util/log.c
> > index d6eb0378c3a3..06d0173788dc 100644
> > --- a/util/log.c
> > +++ b/util/log.c
> > @@ -44,7 +44,6 @@ static FILE *global_file;
> >  static __thread FILE *thread_file;
> >  
> >  int qemu_loglevel;
> > -static bool log_append;
> >  static bool log_per_thread;
> >  static GArray *debug_regions;
> >  
> > @@ -259,19 +258,19 @@ static bool qemu_set_log_internal(const char 
> > *filename, bool changed_name,
> >  daemonized = is_daemonized();
> >  need_to_open_file = log_flags && !per_thread && (!daemonized || 
> > filename);
> >  
> > -if (logfile && (!need_to_open_file || changed_name)) {
> > -qatomic_rcu_set(&global_file, NULL);

Hmm... wait, shouldn't this NULLifying be performed...

> > -if (logfile != stderr) {
> > +if (logfile) {
> > +fflush(logfile);
> > +if (changed_name && logfile != stderr) {
> >  RCUCloseFILE *r = g_new0(RCUCloseFILE, 1);
> >  r->fd = logfile;


... here since the following closes the global_file ?

> >  call_rcu(r, rcu_close_file, rcu);
> > +logfile = NULL;
> >  }
> > -logfile = NULL;
> >  }
> >  
> >  if (!logfile && need_to_open_file) {
> >  if (filename) {
> > -logfile = fopen(filename, log_append ? "a" : "w");
> > +logfile = fopen(filename, "w");
> >  if (!logfile) {
> >  error_setg_errno(errp, errno, "Error opening logfile %s",
> >   filename);
> > @@ -290,8 +289,6 @@ static bool qemu_set_log_internal(const char *filename, bool changed_name,
> >  logfile = stderr;
> >  }
> >  
> > -log_append = 1;
> > -
> >  qatomic_rcu_set(&global_file, logfile);
> >  }
> >  return true;
> 
> 




Re: [PATCH] util/log: do not close and reopen log files when flags are turned off

2022-10-25 Thread Greg Kurz
On Tue, 25 Oct 2022 11:21:19 +0200
Paolo Bonzini  wrote:

> log_append makes sure that if you turn off the logging (which clears
> log_flags and makes need_to_open_file false) the old log is not
> overwritten.  The usecase is that if you remove or move the file
> QEMU will not keep writing to the old file.  However, this is
> not always the desired behavior, in particular having log_append==1
> after changing the file name makes little sense.
> 
> When qemu_set_log_internal is called from the logfile monitor
> command, filename must be non-NULL and therefore changed_name must
> be true.  Therefore, the only case where the file is closed and
> need_to_open_file == false is indeed when log_flags becomes
> zero.  In this case, just flush the file and do not bother
> closing it, thus faking the same append behavior as previously.
> 
> The behavioral change is that changing the logfile twice, for
> example log1 -> log2 -> log1, will cause log1 to be overwritten.
> This can simply be documented, since it is not a particularly
> surprising behavior.
> 
> Suggested-by: Alex Bennée 
> Signed-off-by: Paolo Bonzini 
> ---

Heh I currently have a very similar patch in my tree :-)

Reviewed-by: Greg Kurz 

I'll include this and other bug fixes as prerequisites for my
on-going work on logging when daemonized.

>  util/log.c | 13 +
>  1 file changed, 5 insertions(+), 8 deletions(-)
> 
> diff --git a/util/log.c b/util/log.c
> index d6eb0378c3a3..06d0173788dc 100644
> --- a/util/log.c
> +++ b/util/log.c
> @@ -44,7 +44,6 @@ static FILE *global_file;
>  static __thread FILE *thread_file;
>  
>  int qemu_loglevel;
> -static bool log_append;
>  static bool log_per_thread;
>  static GArray *debug_regions;
>  
> @@ -259,19 +258,19 @@ static bool qemu_set_log_internal(const char *filename, bool changed_name,
>  daemonized = is_daemonized();
>  need_to_open_file = log_flags && !per_thread && (!daemonized || filename);
>  
> -if (logfile && (!need_to_open_file || changed_name)) {
> -qatomic_rcu_set(&global_file, NULL);
> -if (logfile != stderr) {
> +if (logfile) {
> +fflush(logfile);
> +if (changed_name && logfile != stderr) {
>  RCUCloseFILE *r = g_new0(RCUCloseFILE, 1);
>  r->fd = logfile;
>  call_rcu(r, rcu_close_file, rcu);
> +logfile = NULL;
>  }
> -logfile = NULL;
>  }
>  
>  if (!logfile && need_to_open_file) {
>  if (filename) {
> -logfile = fopen(filename, log_append ? "a" : "w");
> +logfile = fopen(filename, "w");
>  if (!logfile) {
>  error_setg_errno(errp, errno, "Error opening logfile %s",
>   filename);
> @@ -290,8 +289,6 @@ static bool qemu_set_log_internal(const char *filename, bool changed_name,
>  logfile = stderr;
>  }
>  
> -log_append = 1;
> -
>  qatomic_rcu_set(&global_file, logfile);
>  }
>  return true;




Re: [PATCH v2 2/2] util/log: Always send errors to logfile when daemonized

2022-10-25 Thread Greg Kurz
On Mon, 24 Oct 2022 10:44:11 +0100
Alex Bennée  wrote:

> 
> Paolo Bonzini  writes:
> 
> 
> >> If we want to connect stdout/err to something when daemonized
> >> then let's either have a dedicated option for that, or simply
> >> tell apps not to use -daemonize and to take care of daemonizing
> >> themselves, thus having full control over stdout/err. The latter
> >> is what libvirt uses, because we actually want stderr/out on a
> >> pipe, not a file, in order to enforce rollover.
> >
> > I would gladly get rid of -daemonize, unfortunately it has many users.
> > Adding further complication to it is not beautiful, but overall I
> > think Greg's patch does make sense.  In particular I would continue
> > the refactoring by moving
> >
> >
> > /*
> >  * If per-thread, filename contains a single %d that should be
> >  * converted.
> >  */
> > if (per_thread) {
> > fname = g_strdup_printf(filename, getpid());
> > } else {
> > fname = g_strdup(filename);
> > }
> >
> > return fopen(fname, log_append ? "a" : "w");
> >
> > to a new function that can be used in both qemu_log_trylock() and
> > qemu_set_log_internal().  (In fact this refactoring is a bugfix
> > because per-thread log files do not currently obey log_append).
> 
> What is the use case for log_append? AFAICT it only ever applied if you
> did a dynamic set_log. Was it ever really used or should it be dropped
> as an excessive complication?
> 

The use case seems to be the ability to temporarily disable logging,
which closes the log file, without losing what was already logged
when logging is re-enabled. QEMU not overwriting previous logs
from the same run is certainly a legitimate expectation from the
user.

Complexity mostly stems from the fact that the log file gets closed
when doing `log none` from the monitor. The logic is also a bit
inconsistent : initial open ensures that we go with a pristine log
file, but renaming the file from the monitor will gladly append
messages to a pre-existing unrelated file...

> From my point of view appending to an existing per-thread log is just
> going to cause confusion.
> 

... and cause confusion all the same.

I'd rather leave the log file always open, except on renames,
and always open in truncating mode.
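
The policy proposed above can be sketched as a small decision function. This
is only an illustration under stated assumptions: `LogReopenPlan` and
`plan_log_reopen()` are hypothetical names that do not exist in util/log.c,
and the sketch ignores the special case where the log goes to stderr.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of what a log reconfiguration should do. */
typedef struct {
    bool close_old;     /* close the currently open file */
    const char *mode;   /* fopen() mode for a new file, NULL if none is opened */
} LogReopenPlan;

static LogReopenPlan plan_log_reopen(bool have_file, bool changed_name,
                                     bool need_to_open_file)
{
    LogReopenPlan plan = { .close_old = false, .mode = NULL };

    /* Only a rename closes the file; disabling logging keeps it open. */
    if (have_file && changed_name) {
        plan.close_old = true;
        have_file = false;
    }
    /* A newly opened file is always truncated; there is no append mode. */
    if (!have_file && need_to_open_file) {
        plan.mode = "w";
    }
    return plan;
}
```

With this shape, `log none` from the monitor (file open, no rename, nothing to
open) neither closes nor reopens anything, so re-enabling logging later simply
keeps writing to the same file.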

> >
> > Paolo
> 
> 




Re: [PATCH v2 1/2] util/log: Derive thread id from getpid() on hosts w/o gettid() syscall

2022-10-21 Thread Greg Kurz
On Thu, 20 Oct 2022 12:39:41 +0200
Paolo Bonzini  wrote:

> On 10/19/22 17:57, Daniel P. Berrangé wrote:
> >> +if (my_id == -1) {
> >> +my_id = getpid() + qatomic_fetch_inc(&counter);
> >> +}
> >> +return my_id;
> > This doesn't look safe for linux-user when we fork, but don't exec.
> 
> Linux-user won't ever get here, however bsd-user might.  We should have 
> get_thread_id() somewhere in util/, for example
> 
> https://github.com/wine-mirror/wine/blob/master/dlls/ntdll/unix/server.c
> 

We have qemu_get_thread_id() already :

https://git.qemu.org/?p=qemu.git;a=blob;f=util/oslib-posix.c;h=827a7aadba444cdb128284f5b4ba43934c78c3db;hb=HEAD#l96

> > The getpid() will change after the fork, but counter won't be
> > reset, so a thread in the parent could clash with a thread
> > in the forked child.
> 
> It might clash even if the counter is reset for that matter.
> 

Yes.

> Paolo
> 




[PATCH] util/log: Close per-thread log file on thread termination

2022-10-21 Thread Greg Kurz
When `-D ${logfile} -d tid` is passed, qemu_log_trylock() creates
a dedicated log file for the current thread and opens it. The
corresponding file descriptor is cached in a __thread variable.
Nothing is done to close the corresponding file descriptor when the
thread terminates though and the file descriptor is leaked.

The issue was found during code inspection and reproduced manually.

Fix that with an atexit notifier.

Fixes: 4e51069d6793 ("util/log: Support per-thread log files")
Cc: richard.hender...@linaro.org
Signed-off-by: Greg Kurz 
---
 util/log.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/util/log.c b/util/log.c
index d6eb0378c3a3..39866bdaf2fa 100644
--- a/util/log.c
+++ b/util/log.c
@@ -42,6 +42,7 @@ static QemuMutex global_mutex;
 static char *global_filename;
 static FILE *global_file;
 static __thread FILE *thread_file;
+static __thread Notifier qemu_log_thread_cleanup_notifier;
 
 int qemu_loglevel;
 static bool log_append;
@@ -77,6 +78,12 @@ static int log_thread_id(void)
 #endif
 }
 
+static void qemu_log_thread_cleanup(Notifier *n, void *unused)
+{
+fclose(thread_file);
+thread_file = NULL;
+}
+
 /* Lock/unlock output. */
 
 FILE *qemu_log_trylock(void)
@@ -93,6 +100,8 @@ FILE *qemu_log_trylock(void)
 return NULL;
 }
 thread_file = logfile;
+qemu_log_thread_cleanup_notifier.notify = qemu_log_thread_cleanup;
+qemu_thread_atexit_add(&qemu_log_thread_cleanup_notifier);
 } else {
 rcu_read_lock();
 /*
-- 
2.37.3




Re: [PATCH v2 2/2] util/log: Always send errors to logfile when daemonized

2022-10-20 Thread Greg Kurz
On Thu, 20 Oct 2022 12:52:21 +0200
Paolo Bonzini  wrote:

> On 10/20/22 11:58, Daniel P. Berrangé wrote:
> > 
> > The '-d' option enables logging in QEMU, primarily for things
> > related to TCG. By default that logging goes to stderr, but
> > it can be sent to 1 or many files, using -D. IOW, -D is NOT
> > about controlling where stderr/out is connected. It is
> > about where TCG logging goes.
> 

I agree about the semantics of -D but the implementation forces
the logging to go through stderr connected to a file in the
daemonize case... so in the end -D controls where stderr
is connected to, and this affects any other non-log user of
stderr.

> (Aside: it's not just TCG logging.  The default tracing backend is also 
> printing to -D).
> 
> > Separately, IIUC, you found that when using -daemonize any
> > error_report() messages end up in the void, because stderr
> > is connected to /dev/null.
> > 
> > This patch is thus attempting to repurpose -D as a way to
> > say where error_report() messages end up when daemonized,
> > and this creates the complexity because -D was never
> > intended to be a mechanism to control stderr or error_report
> > output.
> 
> True, but it already does that if "-d" is specified, because "-d" 
> *intentionally* reopens stderr when -daemonize is specified.  So I think 
> overall the idea of "make -D always open the destination when 
> daemonizing" is sound,

This is exactly why I decided to try this approach.

> the only weird thing is the interaction with "-d 
> tid" which is fixed if we just replace the fallback case from 
> log_thread_id() as in Wine's get_unix_tid() code.  "-d tid" can just be 
> forbidden if the platform is not supported by get_unix_tid().
> 

static int get_unix_tid(void)
{
int ret = -1;
#ifdef HAVE_PTHREAD_GETTHREADID_NP
ret = pthread_getthreadid_np();
#elif defined(linux)
ret = syscall( __NR_gettid );
#elif defined(__sun)
ret = pthread_self();
#elif defined(__APPLE__)
ret = mach_thread_self();
mach_port_deallocate(mach_task_self(), ret);
#elif defined(__NetBSD__)
ret = _lwp_self();
#elif defined(__FreeBSD__)
long lwpid;
thr_self( &lwpid );
ret = lwpid;
#elif defined(__DragonFly__)
ret = lwp_gettid();
#endif
return ret;
}

We could import all these cases except the defined(linux) case maybe since
it should be covered by what we already have in log_thread_id() :

#ifdef CONFIG_GETTID
return gettid();
#elif defined(SYS_gettid)
return syscall(SYS_gettid);

> > If we want to connect stdout/err to something when daemonized
> > then let's either have a dedicated option for that, or simply
> > tell apps not to use -daemonize and to take care of daemonizing
> > themselves, thus having full control over stdout/err. The latter
> > is what libvirt uses, because we actually want stderr/out on a
> > pipe, not a file, in order to enforce rollover.
> 
> I would gladly get rid of -daemonize, unfortunately it has many users. 
> Adding further complication to it is not beautiful, but overall I think 
> Greg's patch does make sense.  In particular I would continue the 
> refactoring by moving
> 
> 
>  /*
>   * If per-thread, filename contains a single %d that should be
>   * converted.
>   */
>  if (per_thread) {
>  fname = g_strdup_printf(filename, getpid());
>  } else {
>  fname = g_strdup(filename);
>  }
> 
>  return fopen(fname, log_append ? "a" : "w");
> 

+1

> to a new function that can be used in both qemu_log_trylock() and 
> qemu_set_log_internal().  (In fact this refactoring is a bugfix because 
> per-thread log files do not currently obey log_append).
> 

I had missed that but yes indeed... I'll fix that in a preparatory
patch.

> Paolo
> 

Thanks Paolo !

--
Greg



Re: [PATCH v2 2/2] util/log: Always send errors to logfile when daemonized

2022-10-20 Thread Greg Kurz
On Thu, 20 Oct 2022 12:21:27 +1000
Richard Henderson  wrote:

> On 10/20/22 01:16, Greg Kurz wrote:
> > When QEMU is started with `-daemonize`, all stdio descriptors get
> > redirected to `/dev/null`. This basically means that anything
> > printed with error_report() and friends is lost.
> > 
> > One could hope that passing `-D ${logfile}` would cause the messages
> > to go to `${logfile}`, as the documentation tends to suggest:
> > 
> >-D logfile
> >Output log in logfile instead of to stderr
> > 
> > Unfortunately, `-D` belongs to the logging framework and it only
> > does this redirection if some log item is also enabled with `-d`
> > or if QEMU was configured with `--enable-trace-backend=log`. A
> > typical production setup doesn't do tracing or fine-grain
> > debugging but it certainly needs to collect errors.
> > 
> > Ignore the check on enabled log items when QEMU is daemonized. Previous
> > behaviour is retained for the non-daemonized case. The logic is unrolled
> > as an `if` for better readability. Since qemu_set_log_internal() caches
> > the final log level and the per-thread property in global variables, it
> > seems more correct to check these instead of intermediary local variables.
> > 
> > Special care is needed for the `-D ${logfile} -d tid` case : `${logfile}`
> > is expected to be a template that contains exactly one `%d` that should be
> > expanded to a PID or TID. The logic in qemu_log_trylock() already takes
> > care of that for per-thread logs. Do it as well for the QEMU main thread
> > when opening the file.
> 
> I don't understand why daemonize changes -d tid at all.
> If there's a bug there, please separate it out.
> 
> I don't understand the is_main_log_thread checks.
> Why is the main thread special?
> 

The current code base either opens a per-thread file in
qemu_log_trylock() when -d tid is enabled, or only a
single global file in qemu_set_log_internal() in the
opposite case.

The goal of this patch is to go through the `In case we
are a daemon redirect stderr to logfile` logic, so that
other users of stderr, aka. error_report(), can benefit
from it as well. Since this is only done for the global
file, the logic was changed to : _main_ thread to always
use the global file and other threads to use the per-thread
file.

I now realize how terrible a choice this is. It violates
the current logic too much and brings new problems like
"how to identify the main thread"...

> > -/*
> > - * In all cases we only log if qemu_loglevel is set.
> > - * Also:
> > - *   If per-thread, open the file for each thread in qemu_log_lock.
> > - *   If not daemonized we will always log either to stderr
> > - * or to a file (if there is a filename).
> > - *   If we are daemonized, we will only log if there is a filename.
> > - */
> >   daemonized = is_daemonized();
> > -need_to_open_file = log_flags && !per_thread && (!daemonized || filename);
> > +need_to_open_file = false;
> > +if (!daemonized) {
> > +/*
> > + * If not daemonized we only log if qemu_loglevel is set, either to
> > + * stderr or to a file (if there is a filename).
> > + * If per-thread, open the file for each thread in qemu_log_trylock().
> > + */
> > +need_to_open_file = qemu_loglevel && !log_per_thread;
> > +} else {
> > +/*
> > + * If we are daemonized, we will only log if there is a filename.
> > + */
> > +need_to_open_file = filename != NULL;
> > +}
> 
> I would have thought that this was the only change required -- ignoring 
> qemu_loglevel when 
> daemonized.
> 

I was thinking the same at first, but this ended up in the
global file being open with a filename containing a '%d'...
I chose the direction of doing the g_strdup_printf() trick
for the global file as well but then I had to make sure
that qemu_log_trylock() wouldn't try later to open the same
file, hence the _main_ thread check...

The question is actually : where should stderr point to in
the `-daemonize -D foo%d.log -d tid` case ?

> 

> r~




Re: [PATCH v2 1/2] util/log: Derive thread id from getpid() on hosts w/o gettid() syscall

2022-10-20 Thread Greg Kurz
On Wed, 19 Oct 2022 16:57:54 +0100
Daniel P. Berrangé  wrote:

> On Wed, Oct 19, 2022 at 05:16:50PM +0200, Greg Kurz wrote:
> > A subsequent patch needs to be able to differentiate the main QEMU
> > thread from other threads. An obvious way to do so is to compare
> > log_thread_id() and getpid(), based on the fact that they are equal
> > for the main thread on systems that have the gettid() syscall (e.g.
> > linux).
> > 
> > Adapt the fallback code for systems without gettid() to provide the
> > same assumption.
> > 
> > Suggested-by: Paolo Bonzini 
> > Signed-off-by: Greg Kurz 
> > ---
> >  util/log.c | 7 ++-
> >  1 file changed, 6 insertions(+), 1 deletion(-)
> > 
> > diff --git a/util/log.c b/util/log.c
> > index d6eb0378c3a3..e1c2535cfcd2 100644
> > --- a/util/log.c
> > +++ b/util/log.c
> > @@ -72,8 +72,13 @@ static int log_thread_id(void)
> >  #elif defined(SYS_gettid)
> >  return syscall(SYS_gettid);
> >  #else
> > +static __thread int my_id = -1;
> >  static int counter;
> > -return qatomic_fetch_inc(&counter);
> > +
> > +if (my_id == -1) {
> > +my_id = getpid() + qatomic_fetch_inc(&counter);
> > +}
> > +return my_id;
> 
> This doesn't look safe for linux-user when we fork, but don't exec.
> 

... which is a "dangerous" situation if the parent is already
multi-threaded at fork() time. The child thread must only call
async-signal-safe functions and...


> The getpid() will change after the fork, but counter won't be
> reset, so a thread in the parent could clash with a thread
> in the forked child.
> 

... pthread_create() isn't one AFAIK. This case has undefined
behavior.

Anyway, no matter what we do, even with a regular fork+exec pattern,
log_thread_id() no longer guarantees unique values for all threads
that could be running concurrently (unlike gettid() or counter++),
e.g. 
- parent process with pid A and one extra thread
  => parent uses thread ids A and A+1
- fork child process with pid B == A+1
- child execs
  => child uses thread id A+1
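
The clash above is plain arithmetic and can be checked in isolation. A minimal
sketch, assuming the fallback derives the id as the PID plus the value of the
per-process counter at first use, as in the patch; `fallback_thread_id()`,
`ids_clash()` and the PID values used below are hypothetical and only serve
the illustration.

```c
#include <stdbool.h>

/*
 * Hypothetical model of the fallback id from the patch: the id is the
 * process PID plus the value of a per-process counter at first use.
 */
static int fallback_thread_id(int pid, int counter_value)
{
    return pid + counter_value;
}

/* True when a thread of one process gets the same id as a thread of another. */
static bool ids_clash(int pid_a, int counter_a, int pid_b, int counter_b)
{
    return fallback_thread_id(pid_a, counter_a) ==
           fallback_thread_id(pid_b, counter_b);
}
```

With a parent PID of 1000 and a child PID of 1001, `ids_clash(1000, 1, 1001, 0)`
holds: the parent's extra thread and the child's first thread both end up
with id 1001.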

> I feel like if we want to check for the main thread, we should
> be using pthread_self(), and compare result against the value
> cached from main. Or cache in a __constructor__ function in
> log.c to keep it isolated from main().
> 

Hmm... pthread_self() is only guaranteed to be unique within
a process. It doesn't look safe either to compare results
of pthread_self() from different process contexts.

> 
> With regards,
> Daniel

Thanks for bringing this corner case up ! It highlights that
I should definitely go for another approach that doesn't
require checking for the main thread at all.

Cheers,

--
Greg



[PATCH v2 0/2] util/log: Always send errors to logfile when daemonized

2022-10-19 Thread Greg Kurz
When QEMU is started with `--daemonize -D ${logfile} -d ${some_log_item}`,
error logs from error_report() and friends go to ${logfile}, but if QEMU
is started with `-daemonize -D ${logfile}` and no `-d`, the file isn't
even created and all logs go to /dev/null.

This inconsistency is quite confusing for users and gives the impression
that QEMU doesn't log errors at all. It seems much saner to always create
the log file when `-D` was passed and to be able to report errors.
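
The inconsistency can be seen by comparing the two conditions side by side.
This is a hedged sketch: the first predicate mirrors the existing expression
in qemu_set_log_internal(), the second mirrors the unrolled logic of this
series, and the function names and the `have_filename` parameter are
hypothetical simplifications (the real code tests globals and a `filename`
pointer).

```c
#include <stdbool.h>

/* Condition currently used by qemu_set_log_internal(). */
static bool need_to_open_file_current(int log_flags, bool per_thread,
                                      bool daemonized, bool have_filename)
{
    return log_flags && !per_thread && (!daemonized || have_filename);
}

/* Condition proposed by this series: ignore log items when daemonized. */
static bool need_to_open_file_proposed(int log_flags, bool per_thread,
                                       bool daemonized, bool have_filename)
{
    if (!daemonized) {
        /* Unchanged: only log if some log item is enabled. */
        return log_flags && !per_thread;
    }
    /* Daemonized: always open the -D file if there is one. */
    return have_filename;
}
```

For `-daemonize -D ${logfile}` without `-d` (no log flags, daemonized, a
filename present) the current condition is false and the proposed one is true,
which is the case reported here.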

It was spotted by the kata-containers project, which happens to use just
that `--daemonize -D` without `-d` trick. It is possible that they will
stop doing so and catch errors through QEMU's stderr at some point, but
I'm posting the patches anyway.

v2:
- new log_thread_id() implementation for hosts without gettid() syscall
- avoid conflict between global log file and per-thread logfile
- style improvements

Greg Kurz (2):
  util/log: Derive thread id from getpid() on hosts w/o gettid() syscall
  util/log: Always send errors to logfile when daemonized

 util/log.c | 56 --
 1 file changed, 42 insertions(+), 14 deletions(-)

-- 
2.37.3





[PATCH v2 2/2] util/log: Always send errors to logfile when daemonized

2022-10-19 Thread Greg Kurz
When QEMU is started with `-daemonize`, all stdio descriptors get
redirected to `/dev/null`. This basically means that anything
printed with error_report() and friends is lost.

One could hope that passing `-D ${logfile}` would cause the messages
to go to `${logfile}`, as the documentation tends to suggest:

  -D logfile
  Output log in logfile instead of to stderr

Unfortunately, `-D` belongs to the logging framework and it only
does this redirection if some log item is also enabled with `-d`
or if QEMU was configured with `--enable-trace-backend=log`. A
typical production setup doesn't do tracing or fine-grain
debugging but it certainly needs to collect errors.

Ignore the check on enabled log items when QEMU is daemonized. Previous
behaviour is retained for the non-daemonized case. The logic is unrolled
as an `if` for better readability. Since qemu_set_log_internal() caches
the final log level and the per-thread property in global variables, it
seems more correct to check these instead of intermediary local variables.

Special care is needed for the `-D ${logfile} -d tid` case : `${logfile}`
is expected to be a template that contains exactly one `%d` that should be
expanded to a PID or TID. The logic in qemu_log_trylock() already takes
care of that for per-thread logs. Do it as well for the QEMU main thread
when opening the file.

Note that qemu_log_trylock() now must ensure that the main QEMU thread
only uses the global log file ; qemu_log_unlock() must be adapted as well
by checking thread_file which is always equal to NULL for the main thread.

Signed-off-by: Greg Kurz 
---
 util/log.c | 49 -
 1 file changed, 36 insertions(+), 13 deletions(-)

diff --git a/util/log.c b/util/log.c
index e1c2535cfcd2..0fa23729c78c 100644
--- a/util/log.c
+++ b/util/log.c
@@ -82,6 +82,11 @@ static int log_thread_id(void)
 #endif
 }
 
+static bool is_main_log_thread(void)
+{
+return log_thread_id() == getpid();
+}
+
 /* Lock/unlock output. */
 
 FILE *qemu_log_trylock(void)
@@ -90,7 +95,8 @@ FILE *qemu_log_trylock(void)
 
 logfile = thread_file;
 if (!logfile) {
-if (log_per_thread) {
+/* Main thread to use the global file only */
+if (log_per_thread && !is_main_log_thread()) {
 g_autofree char *filename
 = g_strdup_printf(global_filename, log_thread_id());
 logfile = fopen(filename, "w");
@@ -124,7 +130,7 @@ void qemu_log_unlock(FILE *logfile)
 if (logfile) {
 fflush(logfile);
 qemu_funlockfile(logfile);
-if (!log_per_thread) {
+if (!thread_file) {
 rcu_read_unlock();
 }
 }
@@ -253,16 +259,21 @@ static bool qemu_set_log_internal(const char *filename, bool changed_name,
 #endif
 qemu_loglevel = log_flags;
 
-/*
- * In all cases we only log if qemu_loglevel is set.
- * Also:
- *   If per-thread, open the file for each thread in qemu_log_lock.
- *   If not daemonized we will always log either to stderr
- * or to a file (if there is a filename).
- *   If we are daemonized, we will only log if there is a filename.
- */
 daemonized = is_daemonized();
-need_to_open_file = log_flags && !per_thread && (!daemonized || filename);
+need_to_open_file = false;
+if (!daemonized) {
+/*
+ * If not daemonized we only log if qemu_loglevel is set, either to
+ * stderr or to a file (if there is a filename).
+ * If per-thread, open the file for each thread in qemu_log_trylock().
+ */
+need_to_open_file = qemu_loglevel && !log_per_thread;
+} else {
+/*
+ * If we are daemonized, we will only log if there is a filename.
+ */
+need_to_open_file = filename != NULL;
+}
 
 if (logfile && (!need_to_open_file || changed_name)) {
 qatomic_rcu_set(&global_file, NULL);
@@ -276,10 +287,22 @@ static bool qemu_set_log_internal(const char *filename, bool changed_name,
 
 if (!logfile && need_to_open_file) {
 if (filename) {
-logfile = fopen(filename, log_append ? "a" : "w");
+g_autofree char *fname = NULL;
+
+/*
+ * If per-thread, filename contains a single %d that should be
+ * converted.
+ */
+if (per_thread) {
+fname = g_strdup_printf(filename, getpid());
+} else {
+fname = g_strdup(filename);
+}
+
+logfile = fopen(fname, log_append ? "a" : "w");
 if (!logfile) {
 error_setg_errno(errp, errno, "Error opening logfile %s",
- filename);
+ fname);
 return false;
 }
 /* In case we are a daemon redirect stderr to logfile */
-- 
2.37.3




[PATCH v2 1/2] util/log: Derive thread id from getpid() on hosts w/o gettid() syscall

2022-10-19 Thread Greg Kurz
A subsequent patch needs to be able to differentiate the main QEMU
thread from other threads. An obvious way to do so is to compare
log_thread_id() and getpid(), based on the fact that they are equal
for the main thread on systems that have the gettid() syscall (e.g.
linux).

Adapt the fallback code for systems without gettid() to provide the
same assumption.

Suggested-by: Paolo Bonzini 
Signed-off-by: Greg Kurz 
---
 util/log.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/util/log.c b/util/log.c
index d6eb0378c3a3..e1c2535cfcd2 100644
--- a/util/log.c
+++ b/util/log.c
@@ -72,8 +72,13 @@ static int log_thread_id(void)
 #elif defined(SYS_gettid)
 return syscall(SYS_gettid);
 #else
+static __thread int my_id = -1;
 static int counter;
-return qatomic_fetch_inc(&counter);
+
+if (my_id == -1) {
+my_id = getpid() + qatomic_fetch_inc(&counter);
+}
+return my_id;
 #endif
 }
 
-- 
2.37.3




Re: [PATCH] util/log: Always send errors to logfile when daemonized

2022-10-14 Thread Greg Kurz
On Fri, 14 Oct 2022 10:51:36 +0200
Paolo Bonzini  wrote:

> On 10/14/22 08:08, Greg Kurz wrote:
> > 
> > +need_to_open_file = log_flags && !per_thread;
> 
> Pre-existing, but I think this should check log_per_thread instead of 
> per_thread.
> 

Yes I agree, and also check qemu_loglevel instead of log_flags for
the same reason (and to match the comment just above).

> > +} else if (filename) {
> > +/*
> > + * If we are daemonized, we will only log if there is a filename.
> > + */
> > +need_to_open_file = true;
> 
> Slightly nicer:
> 
>  } else {
> /*
>  * If daemonized, always log to the -D file if present.
>  */
>  need_to_open_file = filename != NULL;
>  }
> 

Sure.

> > @@ -271,10 +276,22 @@ static bool qemu_set_log_internal(const char *filename, bool changed_name,
> >   
> >   if (!logfile && need_to_open_file) {
> >   if (filename) {
> > -logfile = fopen(filename, log_append ? "a" : "w");
> > +g_autofree char *fname = NULL;
> > +
> > +/*
> > + * If per-thread, filename contains a single %d that should be
> > + * converted.
> > + */
> > +if (per_thread) {
> > +fname = g_strdup_printf(filename, getpid());
> > +} else {
> > +fname = g_strdup(filename);
> > +}
> > +
> > +logfile = fopen(fname, log_append ? "a" : "w");
> >   if (!logfile) {
> >   error_setg_errno(errp, errno, "Error opening logfile %s",
> > - filename);
> > + fname);
> >   return false;
> >   }
> >   /* In case we are a daemon redirect stderr to logfile */
> 
> This could conflict with the file opened by qemu_log_trylock() when 
> per-thread logging is enabled *and* QEMU is daemonized.  Perhaps 
> something like:
> 

Yeah... if the main thread happens to call qemu_log(), it then opens
a file with the same name indeed. Thanks for catching that !

> 1) change qemu_log_trylock() to
> 
> -if (log_per_thread) {
> +if (log_per_thread && log_thread_id() != getpid()) {
> 
> i.e. use the global_file for the main thread
> 
> 2) change qemu_log_unlock() to
> 
> -if (!log_per_thread) {
> +if (!thread_file) {
> 
> to match (1)
> 
> 3) change log_thread_id() to something like
> 
> ...
> #else
>  static __thread int my_id = -1;
>  static int counter;
>  if (my_id == -1) {
>  my_id = getpid() + qatomic_fetch_inc(&counter);
>  }
>  return my_id;
> #endif
> 
> and perhaps do a dummy trylock/unlock late in qemu_set_log_internal(), 
> to ensure that the main thread is the one with log_thread_id() == getpid()?
> 
> I think this can be a separate patch before this one.
> 

2) and 3) can certainly be preparatory work but I think 1)
should be squashed in my patch. Because of the !per_thread
check in need_to_open_file, the existing code in
qemu_set_log_internal() doesn't even open the global file
and qemu_log_trylock() would always return NULL for the
main thread.

Thanks for the quick answer and suggestions !

> Paolo
> 




[PATCH] util/log: Always send errors to logfile when daemonized

2022-10-14 Thread Greg Kurz
When QEMU is started with `-daemonize`, all stdio descriptors get
redirected to `/dev/null`. This basically means that anything
printed with `error_report()` and friends is lost.

One could hope that passing `-D ${logfile}` would cause the messages
to go to `${logfile}`, as the documentation tends to suggest:

  -D logfile
  Output log in logfile instead of to stderr

Unfortunately, `-D` belongs to the logging framework and it only
does this redirection if some log item is also enabled with `-d`
or if QEMU was configured with `--enable-trace-backend=log`. A
typical production setup doesn't do tracing or fine-grain
debugging but it certainly needs to collect errors.

Ignore the check on enabled log items when QEMU is daemonized. Previous
behaviour is retained for the non-daemonized case. The logic is unrolled
as a series of `if` for better readability.

Special care is needed for the `-D ${logfile} -d tid` case : `${logfile}`
is expected to be a template that contains exactly one `%d` that should be
expanded to a PID or TID. The logic in `qemu_log_trylock()` already takes
care of that for per-thread logs. Do it as well for the QEMU main thread
when opening the file.

Signed-off-by: Greg Kurz 
---
 util/log.c | 39 ---
 1 file changed, 28 insertions(+), 11 deletions(-)

diff --git a/util/log.c b/util/log.c
index d6eb0378c3a3..a4592fa9bb70 100644
--- a/util/log.c
+++ b/util/log.c
@@ -248,16 +248,21 @@ static bool qemu_set_log_internal(const char *filename, bool changed_name,
 #endif
 qemu_loglevel = log_flags;
 
-/*
- * In all cases we only log if qemu_loglevel is set.
- * Also:
- *   If per-thread, open the file for each thread in qemu_log_lock.
- *   If not daemonized we will always log either to stderr
- * or to a file (if there is a filename).
- *   If we are daemonized, we will only log if there is a filename.
- */
 daemonized = is_daemonized();
-need_to_open_file = log_flags && !per_thread && (!daemonized || filename);
+need_to_open_file = false;
+if (!daemonized) {
+/*
+ * If not daemonized we only log if qemu_loglevel is set, either to
+ * stderr or to a file (if there is a filename).
+ * If per-thread, open the file for each thread in qemu_log_trylock().
+ */
+need_to_open_file = log_flags && !per_thread;
+} else if (filename) {
+/*
+ * If we are daemonized, we will only log if there is a filename.
+ */
+need_to_open_file = true;
+}
 
 if (logfile && (!need_to_open_file || changed_name)) {
 qatomic_rcu_set(&global_file, NULL);
@@ -271,10 +276,22 @@ static bool qemu_set_log_internal(const char *filename, bool changed_name,
 
 if (!logfile && need_to_open_file) {
 if (filename) {
-logfile = fopen(filename, log_append ? "a" : "w");
+g_autofree char *fname = NULL;
+
+/*
+ * If per-thread, filename contains a single %d that should be
+ * converted.
+ */
+if (per_thread) {
+fname = g_strdup_printf(filename, getpid());
+} else {
+fname = g_strdup(filename);
+}
+
+logfile = fopen(fname, log_append ? "a" : "w");
 if (!logfile) {
 error_setg_errno(errp, errno, "Error opening logfile %s",
- filename);
+ fname);
 return false;
 }
 /* In case we are a daemon redirect stderr to logfile */
-- 
2.37.3




Re: [PATCH 00/20] tests/9p: introduce declarative function calls

2022-10-12 Thread Greg Kurz
On Wed, 12 Oct 2022 12:00:40 +0200
Christian Schoenebeck  wrote:

> On Dienstag, 4. Oktober 2022 22:56:44 CEST Christian Schoenebeck wrote:
> > This series converts relevant 9p (test) client functions to use named
> > function arguments. For instance
> > 
> > do_walk_expect_error(v9p, "non-existent", ENOENT);
> > 
> > becomes
> > 
> > twalk({
> > .client = v9p, .path = "non-existent", .expectErr = ENOENT
> > });
> > 
> > The intention is to make the actual 9p test code more readable, and easier
> > to maintain in the long term.
> > 
> > Not only does this make clear what a literal passed to a function is
> > supposed to do, it also makes the order and selection of arguments very
> > flexible, and allows merging multiple similar functions into one.
> > 
> > This is basically just refactoring, it does not change behaviour.
> 
> Too massive for review?
> 

Yeah, sorry :-(

But since the approach you're taking here may be valuable elsewhere,
and this is qtest, it seems fair to ask Thomas and Laurent to have
a look :-)

> If so, then I'll probably just go ahead and prepare a PR early next week with
> this queued as well. It's just test code refactoring, so I am not too worried
> about these changes.
> 
> Best regards,
> Christian Schoenebeck
> 
> > 
> > PREREQUISITES
> > =============
> > 
> > This series requires the following additional patch to work correctly:
> > 
> > https://lore.kernel.org/all/e1odrya-0004fv...@lizzy.crudebyte.com/
> > https://github.com/cschoenebeck/qemu/commit/23d01367fc7a4f27be323ed6d195c527bec9ede1
> > 
> > Christian Schoenebeck (20):
> >   tests/9p: merge *walk*() functions
> >   tests/9p: simplify callers of twalk()
> >   tests/9p: merge v9fs_tversion() and do_version()
> >   tests/9p: merge v9fs_tattach(), do_attach(), do_attach_rqid()
> >   tests/9p: simplify callers of tattach()
> >   tests/9p: convert v9fs_tgetattr() to declarative arguments
> >   tests/9p: simplify callers of tgetattr()
> >   tests/9p: convert v9fs_treaddir() to declarative arguments
> >   tests/9p: simplify callers of treaddir()
> >   tests/9p: convert v9fs_tlopen() to declarative arguments
> >   tests/9p: simplify callers of tlopen()
> >   tests/9p: convert v9fs_twrite() to declarative arguments
> >   tests/9p: simplify callers of twrite()
> >   tests/9p: convert v9fs_tflush() to declarative arguments
> >   tests/9p: merge v9fs_tmkdir() and do_mkdir()
> >   tests/9p: merge v9fs_tlcreate() and do_lcreate()
> >   tests/9p: merge v9fs_tsymlink() and do_symlink()
> >   tests/9p: merge v9fs_tlink() and do_hardlink()
> >   tests/9p: merge v9fs_tunlinkat() and do_unlinkat()
> >   tests/9p: remove unnecessary g_strdup() calls
> > 
> >  tests/qtest/libqos/virtio-9p-client.c | 569 +-
> >  tests/qtest/libqos/virtio-9p-client.h | 408 --
> >  tests/qtest/virtio-9p-test.c  | 529 
> >  3 files changed, 1031 insertions(+), 475 deletions(-)
> 
> 
> 
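
For readers unfamiliar with the "named function arguments" idiom discussed in this thread, here is a minimal sketch of one way such a call can be made to compile in C: an option struct, designated initializers, and a variadic macro that turns the braced argument into a compound literal. The struct layout, `twalk_impl()` and its toy behaviour are assumptions for illustration; only the call shape and the field names `.client`, `.path` and `.expectErr` come from the cover letter.

```c
#include <errno.h>
#include <string.h>

/* Hypothetical option struct; field names follow the cover letter. */
typedef struct {
    void *client;        /* stand-in for the 9p client handle */
    const char *path;    /* path to walk */
    int expectErr;       /* expected errno, 0 means "expect success" */
} TWalkOpt;

/*
 * Toy implementation: only the path "existing" can be walked.
 * Returns 0 if the outcome matches the expectation, -1 otherwise.
 */
static int twalk_impl(TWalkOpt opt)
{
    int err = (opt.path && strcmp(opt.path, "existing") == 0) ? 0 : ENOENT;

    return err == opt.expectErr ? 0 : -1;
}

/*
 * The cast turns the braced argument list into a compound literal.
 * Fields that the caller omits are zero-initialized.
 */
#define twalk(...) twalk_impl((TWalkOpt) __VA_ARGS__)
```

A caller can then write `twalk({ .path = "non-existent", .expectErr = ENOENT })`, list the fields in any order, and leave out the ones it does not care about.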




Re: [PATCH] MAINTAINERS: step back from PPC

2022-09-29 Thread Greg Kurz
On Thu, 29 Sep 2022 20:09:46 +0200
Cédric Le Goater  wrote:

> I am no longer active in PPC maintainership, so downgrade myself to a
> standard Reviewer. Also downgrade the PowerNV and XIVE status since I am not
> funded for this work.
> 
> Signed-off-by: Cédric Le Goater 
> ---

End of an era. Thank you for all the dedication and accomplishments !

Reviewed-by: Greg Kurz 

>  MAINTAINERS | 10 +-
>  1 file changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 1729c0901cea..40f4984b439b 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -267,8 +267,8 @@ F: hw/openrisc/
>  F: tests/tcg/openrisc/
>  
>  PowerPC TCG CPUs
> -M: Cédric Le Goater 
>  M: Daniel Henrique Barboza 
> +R: Cédric Le Goater 
>  R: David Gibson 
>  R: Greg Kurz 
>  L: qemu-...@nongnu.org
> @@ -392,8 +392,8 @@ F: target/mips/kvm*
>  F: target/mips/sysemu/
>  
>  PPC KVM CPUs
> -M: Cédric Le Goater 
>  M: Daniel Henrique Barboza 
> +R: Cédric Le Goater 
>  R: David Gibson 
>  R: Greg Kurz 
>  S: Maintained
> @@ -1365,8 +1365,8 @@ F: include/hw/rtc/m48t59.h
>  F: tests/avocado/ppc_prep_40p.py
>  
>  sPAPR (pseries)
> -M: Cédric Le Goater 
>  M: Daniel Henrique Barboza 
> +R: Cédric Le Goater 
>  R: David Gibson 
>  R: Greg Kurz 
>  L: qemu-...@nongnu.org
> @@ -1387,7 +1387,7 @@ F: tests/avocado/ppc_pseries.py
>  PowerNV (Non-Virtualized)
>  M: Cédric Le Goater 
>  L: qemu-...@nongnu.org
> -S: Maintained
> +S: Odd Fixes
>  F: docs/system/ppc/powernv.rst
>  F: hw/ppc/pnv*
>  F: hw/intc/pnv*
> @@ -2321,7 +2321,7 @@ T: git https://github.com/philmd/qemu.git fw_cfg-next
>  XIVE
>  M: Cédric Le Goater 
>  L: qemu-...@nongnu.org
> -S: Supported
> +S: Odd Fixes
>  F: hw/*/*xive*
>  F: include/hw/*/*xive*
>  F: docs/*/*xive*




Re: [PATCH 1/1] 9pfs: avoid iterator invalidation in v9fs_mark_fids_unreclaim

2022-09-27 Thread Greg Kurz
On Tue, 27 Sep 2022 19:14:33 +0200
Christian Schoenebeck  wrote:

> On Dienstag, 27. September 2022 15:05:13 CEST Linus Heckemann wrote:
> > Christian Schoenebeck  writes:
> > > Ah, you sent this fix as a separate patch on top. I actually just meant
> > > that you would take my already queued patch as the latest version (just
> > > because I had made some minor changes on my end) and adjust that patch
> > > further as v4.
> > > 
> > > Anyway, there are still some things to do here, so maybe you can send your
> > > patch squashed in the next round ...
> > 
> > I see, will do!
> > 
> > >> @Christian: I still haven't been able to reproduce the issue that this
> > >> commit is supposed to fix (I tried building KDE too, no problems), so
> > >> it's a bit of a shot in the dark. It certainly still runs and I think it
> > >> should fix the issue, but it would be great if you could test it.
> > > 
> > > No worries about reproduction, I will definitely test this thoroughly. ;-)
> > > 
> > >>  hw/9pfs/9p.c | 46 ++
> > >>  1 file changed, 30 insertions(+), 16 deletions(-)
> > >> 
> > >> diff --git a/hw/9pfs/9p.c b/hw/9pfs/9p.c
> > >> index f4c1e37202..825c39e122 100644
> > >> --- a/hw/9pfs/9p.c
> > >> +++ b/hw/9pfs/9p.c
> > >> @@ -522,33 +522,47 @@ static int coroutine_fn v9fs_mark_fids_unreclaim(V9fsPDU *pdu, V9fsPath *path)
> > >>  V9fsFidState *fidp;
> > >> 
> > >>  gpointer fid;
> > >>  GHashTableIter iter;
> > >> 
> > >> +/*
> > >> + * The most common case is probably that we have exactly one
> > >> + * fid for the given path, so preallocate exactly one.
> > >> + */
> > >> +GArray *to_reopen = g_array_sized_new(FALSE, FALSE, sizeof(V9fsFidState*), 1);
> > >> +gint i;
> > > 
> > > Please use `g_autoptr(GArray)` instead of `GArray *`, that avoids the need
> > > for explicit calls to g_array_free() below.
> > 
> > Good call. I'm not familiar with glib, so I didn't know about this :)
> > 
> > >> -fidp->flags |= FID_NON_RECLAIMABLE;
> > > 
> > > Why did you remove that? It should still be marked as FID_NON_RECLAIMABLE,
> > > no?
> > Indeed, that was an accident.
> > 
> > >> +/*
> > >> + * Ensure the fid survives a potential clunk request during
> > >> + * v9fs_reopen_fid or put_fid.
> > >> + */
> > >> +fidp->ref++;
> > > 
> > > Hmm, bumping the refcount here makes sense, as the 2nd loop may be
> > > interrupted and the fid otherwise disappear in between, but ...
> > > 
> > >> +g_array_append_val(to_reopen, fidp);
> > >> 
> > >>  }
> > >> 
> > >> +}
> > >> 
> > >> -/* We're done with this fid */
> > >> +for (i=0; i < to_reopen->len; i++) {
> > >> +fidp = g_array_index(to_reopen, V9fsFidState*, i);
> > >> +/* reopen the file/dir if already closed */
> > >> +err = v9fs_reopen_fid(pdu, fidp);
> > >> +if (err < 0) {
> > >> +put_fid(pdu, fidp);
> > >> +g_array_free(to_reopen, TRUE);
> > >> +return err;
> > > 
> > > ... this return would then leak all remaining fids that you have already
> > > bumped the refcount for above.
> > 
> > You're right. I think the best way around it, though it feels ugly, is
> > to add a third loop in an "out:".
> 
> Either that, or continuing the loop to the end. Not that this would become 
> much prettier. I must admit I also don't really have a good idea for a clean 
> solution in this case.
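
The "continue the loop to the end" variant mentioned above can be sketched without glib. This is a simplified illustration under stated assumptions: a fixed-size array stands in for the GHashTable and the GArray, and `Fid`, `reopen()`, `put()` and `mark_unreclaim()` are hypothetical stand-ins for V9fsFidState, v9fs_reopen_fid(), put_fid() and v9fs_mark_fids_unreclaim().

```c
#include <stddef.h>

#define MAX_FIDS 8

typedef struct {
    int ref;          /* reference count */
    int fail_reopen;  /* test knob: make reopen() fail for this entry */
} Fid;

/* Stand-in for v9fs_reopen_fid(), which may yield in the real code. */
static int reopen(Fid *f)
{
    return f->fail_reopen ? -1 : 0;
}

/* Stand-in for put_fid(): drop one reference. */
static void put(Fid *f)
{
    f->ref--;
}

static int mark_unreclaim(Fid *table, size_t n)
{
    Fid *to_reopen[MAX_FIDS];
    size_t count = 0, i;
    int err = 0;

    /*
     * Phase 1: walk the table without calling anything that could
     * modify it, and pin every entry we intend to touch later.
     */
    for (i = 0; i < n && count < MAX_FIDS; i++) {
        table[i].ref++;
        to_reopen[count++] = &table[i];
    }

    /*
     * Phase 2: work on the private snapshot. After the first error we
     * stop reopening, but still run to the end of the loop so that
     * every reference taken in phase 1 is dropped again.
     */
    for (i = 0; i < count; i++) {
        if (err == 0) {
            err = reopen(to_reopen[i]);
        }
        put(to_reopen[i]);
    }
    return err;
}
```

The first error is the one reported, and no entry keeps the extra reference, regardless of where the failure occurs.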
> 
> > > Also: I noticed that your changes in virtfs_reset() would need the same
> > > 2-loop hack to avoid hash iterator invalidation, as it would also call
> > > put_fid() inside the loop and be prone to hash iterator invalidation
> > > otherwise.
> > Good point. Will do.
> > 
> > One more thing has occurred to me. I think the reclaiming/reopening
> > logic will misbehave in the following sequence of events:
> > 
> > 1. QEMU reclaims an open fid, losing the file handle
> > 2. The file referred to by the fid is replaced with a different file
> >(e.g. via rename or symlink) outside QEMU
> > 3. The file is accessed again by the guest, causing QEMU to reopen a
> >_different file_ from before without the guest having performed any
> >operations that should cause this to happen.
> > 
> > This is neither introduced nor resolved by my changes. Am I overlooking
> > something that avoids this (be it documentation that directories exposed
> > via 9p should not be touched by the host), or is this a real issue? I'm
> > thinking one could at least detect it by saving inode numbers in
> > V9fsFidState and comparing them when reopening, but recovering from such
> > a situation seems difficult.
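
The detection idea floated above (remember the inode when the fid is opened, compare it when reopening) can be sketched with plain POSIX calls. This is only an illustration of the comparison: `FileIdentity`, `identity_record()` and `identity_matches()` are hypothetical names, and V9fsFidState has no such field in the code under discussion.

```c
#include <stdbool.h>
#include <sys/types.h>
#include <sys/stat.h>

/* Hypothetical identity record: device plus inode number. */
typedef struct {
    dev_t dev;
    ino_t ino;
} FileIdentity;

/* Remember which file a path currently refers to. Returns 0 on success. */
static int identity_record(const char *path, FileIdentity *id)
{
    struct stat st;

    if (stat(path, &st) < 0) {
        return -1;
    }
    id->dev = st.st_dev;
    id->ino = st.st_ino;
    return 0;
}

/* True if the path still refers to the file recorded earlier. */
static bool identity_matches(const char *path, const FileIdentity *id)
{
    struct stat st;

    if (stat(path, &st) < 0) {
        return false;
    }
    return st.st_dev == id->dev && st.st_ino == id->ino;
}
```

A real implementation would more likely call fstat() on the freshly reopened descriptor rather than stat() on the path, to avoid a window between the check and the open. As noted above, detecting the mismatch is the easy part; recovering from it is not addressed here.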
> 
> Well, in that specific scenario when rename/move happens outside of QEMU then
> yes, this might happen unfortunately. The point of this "reclaim fid" stuff is
> to deal with the fact that there is an upper limit on systems for the max.
> amount of 

Re: [PATCH] tests/9p: split virtio-9p-test.c into tests and 9p client part

2022-09-26 Thread Greg Kurz
On Sat, 10 Sep 2022 19:46:55 +0200
Christian Schoenebeck  wrote:

> This patch is pure refactoring, it does not change behaviour.
> 
> virtio-9p-test.c grew to 1657 lines. Let's split this file up between
> actual 9p test cases vs. 9p test client, to make it easier to
> concentrate on the actual 9p tests.
> 
> Move the 9p test client code to a new unit virtio-9p-client.c; this is
> basically all functions and types already prefixed with v9fs_*.
> 
> Note that some client wrapper functions (do_*) are preserved in
> virtio-9p-test.c, simply because these wrapper functions are going to
> be wiped with subsequent patches anyway.
> 
> As the global QGuestAllocator variable is moved to virtio-9p-client.c,
> add a new function v9fs_set_allocator() to be used by virtio-9p-test.c
> instead of fiddling with a global variable across units and libraries.
> 
> Signed-off-by: Christian Schoenebeck 
> ---
> 
> As I am working on extending the previously sent RFC [1] (which will be
> using function calls with named function arguments), I realized that it
> makes sense to first split the client code out to a new file, and then
> make the upcoming patches based on this patch here. Because that way
> I don't have to touch the order of the client functions and the upcoming
> patches will therefore be more readable.
> 

Hi Christian,

The change looks quite reasonable but you'll have to fix the includes...

> [1] https://lore.kernel.org/all/e1odqqv-0003d4...@lizzy.crudebyte.com/
> 
>  tests/qtest/libqos/meson.build|   1 +
>  tests/qtest/libqos/virtio-9p-client.c | 683 +++
>  tests/qtest/libqos/virtio-9p-client.h | 139 +
>  tests/qtest/virtio-9p-test.c  | 770 +-
>  4 files changed, 849 insertions(+), 744 deletions(-)
>  create mode 100644 tests/qtest/libqos/virtio-9p-client.c
>  create mode 100644 tests/qtest/libqos/virtio-9p-client.h
> 

[..snip..]

> diff --git a/tests/qtest/libqos/virtio-9p-client.h b/tests/qtest/libqos/virtio-9p-client.h
> new file mode 100644
> index 00..8bea032a85
> --- /dev/null
> +++ b/tests/qtest/libqos/virtio-9p-client.h
> @@ -0,0 +1,139 @@
> +/*
> + * 9P network client for VirtIO 9P test cases (based on QTest)
> + *
> + * Copyright (c) 2014 SUSE LINUX Products GmbH
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or later.
> + * See the COPYING file in the top-level directory.
> + */
> +
> +/*
> + * Not so fast! You might want to read the 9p developer docs first:
> + * https://wiki.qemu.org/Documentation/9p
> + */
> +
> +#ifndef TESTS_LIBQOS_VIRTIO_9P_CLIENT_H
> +#define TESTS_LIBQOS_VIRTIO_9P_CLIENT_H
> +
> +#include "qemu/osdep.h"

... here.

As explained in `docs/devel/style.rst`, the "qemu/osdep.h" header must
only be included in .c files. Please move this #include directive to
`tests/qtest/libqos/virtio-9p-client.c`:

#include "qemu/osdep.h"
#include "virtio-9p-client.h"

With that fixed, you can add my R-b tag.

Cheers,

--
Greg

> +#include "hw/9pfs/9p.h"
> +#include "hw/9pfs/9p-synth.h"
> +#include "virtio-9p.h"
> +#include "qgraph.h"
> +#include "tests/qtest/libqtest-single.h"
> +
> +#define P9_MAX_SIZE 4096 /* Max size of a T-message or R-message */
> +
> +typedef struct {
> +QTestState *qts;
> +QVirtio9P *v9p;
> +uint16_t tag;
> +uint64_t t_msg;
> +uint32_t t_size;
> +uint64_t r_msg;
> +/* No r_size, it is hardcoded to P9_MAX_SIZE */
> +size_t t_off;
> +size_t r_off;
> +uint32_t free_head;
> +} P9Req;
> +
> +/* type[1] version[4] path[8] */
> +typedef char v9fs_qid[13];
> +
> +typedef struct v9fs_attr {
> +uint64_t valid;
> +v9fs_qid qid;
> +uint32_t mode;
> +uint32_t uid;
> +uint32_t gid;
> +uint64_t nlink;
> +uint64_t rdev;
> +uint64_t size;
> +uint64_t blksize;
> +uint64_t blocks;
> +uint64_t atime_sec;
> +uint64_t atime_nsec;
> +uint64_t mtime_sec;
> +uint64_t mtime_nsec;
> +uint64_t ctime_sec;
> +uint64_t ctime_nsec;
> +uint64_t btime_sec;
> +uint64_t btime_nsec;
> +uint64_t gen;
> +uint64_t data_version;
> +} v9fs_attr;
> +
> +#define P9_GETATTR_BASIC    0x07ffULL /* Mask for fields up to BLOCKS */
> +
> +struct V9fsDirent {
> +v9fs_qid qid;
> +uint64_t offset;
> +uint8_t type;
> +char *name;
> +struct V9fsDirent *next;
> +};
> +
> +void v9fs_set_allocator(QGuestAllocator *t_alloc);
> +void v9fs_memwrite(P9Req *req, const void *addr, size_t len);
> +void v9fs_memskip(P9Req *req, size_t len);
> +void v9fs_memread(P9Req *req, void *addr, size_t len);
> +void v9fs_uint8_read(P9Req *req, uint8_t *val);
> +void v9fs_uint16_write(P9Req *req, uint16_t val);
> +void v9fs_uint16_read(P9Req *req, uint16_t *val);
> +void v9fs_uint32_write(P9Req *req, uint32_t val);
> +void v9fs_uint64_write(P9Req *req, uint64_t val);
> +void v9fs_uint32_read(P9Req *req, uint32_t *val);
> +void v9fs_uint64_read(P9Req *req, uint64_t *val);
> +uint16_t 

Re: [PATCH v3] 9pfs: use GHashTable for fid table

2022-09-08 Thread Greg Kurz
On Thu, 08 Sep 2022 18:10:28 +0200
Linus Heckemann  wrote:

> (sorry for the dup @Greg, forgot to reply-all)
> 
> Greg Kurz  writes:
> >> > g_hash_table_steal_extended() [1] actually allows to do just that.
> >> 
> >> g_hash_table_steal_extended unfortunately isn't available since it was
> >> introduced in glib 2.58 and we're maintaining compatibility to 2.56.
> >> 
> >
> > Ha... this could be addressed through conditional compilation, e.g.:
> 
> It still won't compile, because we set GLIB_VERSION_MAX_ALLOWED in
> glib-compat.h and it would require a compat wrapper as described

ah drat, you're right !

> there. I think that's a bit much for this far more marginal performance
> change. I'm happy to resubmit with the TODO comment though if you like?

Either that or Christian may add it when merging.

Cheers,

--
Greg


