[PULL 0/5] qtest timeouts and ROP mitigation

2024-01-15 Thread Thomas Huth
 Hi Peter!

The following changes since commit 977542ded7e6b28d2bc077bcda24568c716e393c:

  Merge tag 'pull-testing-updates-120124-2' of https://gitlab.com/stsquad/qemu 
into staging (2024-01-12 14:02:53 +)

are available in the Git repository at:

  https://gitlab.com/thuth/qemu.git tags/pull-request-2024-01-16

for you to fetch changes up to 7ff9ff039380008952c6fd32011dd2a4d5666906:

  meson: mitigate against use of uninitialize stack for exploits (2024-01-16 
07:25:27 +0100)


* Improve the timeouts for some problematic qtests
* Enable some ROP mitigation compiler switches


Daniel P. Berrangé (2):
  meson: mitigate against ROP exploits with -fzero-call-used-regs
  meson: mitigate against use of uninitialize stack for exploits

Thomas Huth (3):
  tests/qtest/meson.build: Bump the boot-serial-test timeout to 4 minutes
  tests/qtest/npcm7xx_watchdog_timer: Only test the corner cases by default
  qtest: Bump npcm7xx_watchdog_timer-test timeout to 2 minutes

 meson.build   | 16 
 tests/qtest/npcm7xx_watchdog_timer-test.c |  5 +++--
 tests/qtest/meson.build   |  3 ++-
 3 files changed, 21 insertions(+), 3 deletions(-)




[PULL 1/5] tests/qtest/meson.build: Bump the boot-serial-test timeout to 4 minutes

2024-01-15 Thread Thomas Huth
When running with TCI, the boot-serial-test can take longer than 3 minutes:

 https://gitlab.com/qemu-project/qemu/-/jobs/5890481086#L4774

Bump the timeout to 4 minutes to avoid CI failures here.

Message-ID: <20240115071146.31213-1-th...@redhat.com>
Reviewed-by: "Daniel P. Berrangé" 
Signed-off-by: Thomas Huth 
---
 tests/qtest/meson.build | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tests/qtest/meson.build b/tests/qtest/meson.build
index fd40136fa9..78c298ce97 100644
--- a/tests/qtest/meson.build
+++ b/tests/qtest/meson.build
@@ -8,7 +8,7 @@ slow_qtests = {
   'test-hmp' : 240,
   'pxe-test': 600,
   'prom-env-test': 360,
-  'boot-serial-test': 180,
+  'boot-serial-test': 240,
   'qos-test': 120,
 }
 
-- 
2.43.0




[PULL 3/5] qtest: Bump npcm7xx_watchdog_timer-test timeout to 2 minutes

2024-01-15 Thread Thomas Huth
The npcm7xx_watchdog_timer-test can take more than 60 seconds in
SPEED=slow mode on a loaded host system.

Bumping to 2 minutes will give more headroom.
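
For reference, the slow mode referred to here is typically selected via
the make wrapper around the meson test harness, along these lines (the
exact target name is an assumption; adjust it to the targets actually
built):

  make check-qtest-arm SPEED=slow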

Message-ID: <20240112164717.1063954-1-th...@redhat.com>
Reviewed-by: "Daniel P. Berrangé" 
Signed-off-by: Thomas Huth 
---
 tests/qtest/meson.build | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tests/qtest/meson.build b/tests/qtest/meson.build
index 78c298ce97..4293f3b133 100644
--- a/tests/qtest/meson.build
+++ b/tests/qtest/meson.build
@@ -4,6 +4,7 @@ slow_qtests = {
   'device-introspect-test' : 720,
   'migration-test' : 480,
   'npcm7xx_pwm-test': 300,
+  'npcm7xx_watchdog_timer-test': 120,
   'qom-test' : 900,
   'test-hmp' : 240,
   'pxe-test': 600,
-- 
2.43.0




[PULL 5/5] meson: mitigate against use of uninitialize stack for exploits

2024-01-15 Thread Thomas Huth
From: Daniel P. Berrangé 

When variables are used without being initialized, there is potential
for an attacker to take advantage of data left on the stack by an
earlier call in order to drive an exploit.

It is good practice to always initialize variables, and the compiler
can warn about flaws when -Wuninitialized is present. This warning,
however, is by no means foolproof, with its output varying depending
on the compiler version and which optimizations are enabled.

The -ftrivial-auto-var-init option can be used to tell the compiler
to always initialize all automatic (stack) variables. This increases
the security and predictability of the program, closing off certain
attack vectors and reducing the risk of unsafe memory disclosure.

While the option takes several possible values, using 'zero' is
considered to be the option that is most likely to lead to semantically
correct or safe behaviour[1]. E.g. sizes/indexes are not likely to
lead to out-of-bounds accesses when initialized to zero, and pointers
are less likely to point to something useful if initialized to zero.
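
As an illustration (a minimal, hypothetical example, not taken from the
QEMU tree) of the kind of bug this option defends against:

  #include <string.h>

  /*
   * 'len' is only assigned on the success path.  Without
   * -ftrivial-auto-var-init=zero its value on the other path is
   * whatever was left on the stack by an earlier call; with the
   * option it is 0, so the memcpy below degenerates into a harmless
   * zero-length copy instead of a potential out-of-bounds write.
   */
  static void copy_reply(char *dst, const char *src, int have_reply)
  {
      size_t len;                    /* intentionally uninitialized */

      if (have_reply) {
          len = strlen(src);
      }                              /* forgets to set len otherwise */

      memcpy(dst, src, len);         /* stale stack data could drive this */
  }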

Even with -ftrivial-auto-var-init=zero set, GCC will still issue
warnings with -Wuninitialized if it discovers a problem, so we are
not losing diagnostics for developers, just hardening runtime
behaviour and making QEMU behave more predictably if it hits bad
codepaths.

[1] https://lists.llvm.org/pipermail/cfe-dev/2020-April/065221.html

Signed-off-by: "Daniel P. Berrangé" 
Message-ID: <20240103123414.2401208-3-berra...@redhat.com>
Signed-off-by: Thomas Huth 
---
 meson.build | 5 +
 1 file changed, 5 insertions(+)

diff --git a/meson.build b/meson.build
index 1bda391de6..d0329966f1 100644
--- a/meson.build
+++ b/meson.build
@@ -559,6 +559,11 @@ hardening_flags = [
 # upon its return. This makes it harder to assemble
 # ROP gadgets into something usable
 '-fzero-call-used-regs=used-gpr',
+
+# Initialize all stack variables to zero. This makes
+# it harder to take advantage of uninitialized stack
+# data to drive exploits
+'-ftrivial-auto-var-init=zero',
 ]
 
 qemu_common_flags += cc.get_supported_arguments(hardening_flags)
-- 
2.43.0




[PULL 4/5] meson: mitigate against ROP exploits with -fzero-call-used-regs

2024-01-15 Thread Thomas Huth
From: Daniel P. Berrangé 

To quote wikipedia:

  "Return-oriented programming (ROP) is a computer security exploit
   technique that allows an attacker to execute code in the presence
   of security defenses such as executable space protection and code
   signing.

   In this technique, an attacker gains control of the call stack to
   hijack program control flow and then executes carefully chosen
   machine instruction sequences that are already present in the
   machine's memory, called "gadgets". Each gadget typically ends in
   a return instruction and is located in a subroutine within the
   existing program and/or shared library code. Chained together,
   these gadgets allow an attacker to perform arbitrary operations
   on a machine employing defenses that thwart simpler attacks."

QEMU is by no means perfect, with an ever-growing set of CVEs from
flawed hardware device emulation, which could potentially be
exploited using ROP techniques.

Since GCC 11 there has been a compiler option that can mitigate
against this exploit technique:

-fzero-call-used-regs

To understand it refer to these two resources:

   https://www.jerkeby.se/newsletter/posts/rop-reduction-zero-call-user-regs/
   https://gcc.gnu.org/pipermail/gcc-patches/2020-August/552262.html

I used two programs to scan qemu-system-x86_64 for ROP gadgets:

  https://github.com/0vercl0k/rp
  https://github.com/JonathanSalwan/ROPgadget
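
For reference, the scans were run roughly along these lines (the exact
flags are an assumption based on the tools' usual command-line
interfaces and may differ between versions):

  ROPgadget --binary qemu-system-x86_64 --ropchain
  rp -f qemu-system-x86_64 -r 8 --unique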

When asked to find 8 byte gadgets, the 'rp' tool reports:

  A total of 440278 gadgets found.
  You decided to keep only the unique ones, 156143 unique gadgets found.

While the ROPgadget tool reports:

  Unique gadgets found: 353122

With the --ropchain argument, the latter attempts to use the found
gadgets to produce a chain that can execute arbitrary syscalls. With
current QEMU it succeeds in this task, which is an undesirable
situation.

With QEMU modified to use -fzero-call-used-regs=used-gpr, the 'rp' tool
reports:

  A total of 528991 gadgets found.
  You decided to keep only the unique ones, 121128 unique gadgets found.

This is 22% fewer unique gadgets.

While the ROPgadget tool reports:

  Unique gadgets found: 328605

This is 7% fewer unique gadgets. Crucially though, despite this more
modest reduction, the ROPgadget tool is no longer able to identify a
chain of gadgets for executing arbitrary syscalls. It fails at the
very first step, unable to find gadgets for populating registers for
a future syscall. Having said that, more advanced tools do still
manage to put together a viable ROP chain.

Also, this only takes QEMU's own code into account. QEMU links to many
3rd party shared libraries and ideally all of them would be compiled
with this same hardening. That becomes a distro policy question though.
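
As a sketch of that distro-side option (nothing this series changes): a
packager could export the same flag globally before building a
dependency, assuming its build system honours CFLAGS, e.g.:

  export CFLAGS="$CFLAGS -fzero-call-used-regs=used-gpr"
  ./configure && make    # third-party library build, shown schematically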

In terms of performance impact, TCG was used as an evaluation test
case. We're not interested in protecting TCG since it isn't designed
to provide a security barrier, but it is performance-sensitive code,
so it is useful as a guide to how other areas of QEMU might be
impacted. With the -fzero-call-used-regs=used-gpr argument present,
using the real-world test of booting a Linux kernel and having init
immediately poweroff, there is a ~1% slowdown in performance under
TCG. The QEMU binary size also grows by approximately 1%.

By comparison, using the more aggressive -fzero-call-used-regs=all
results in a slowdown of over 25% under TCG, which is clearly not an
acceptable impact, and a binary size increase of 5%.

Considering that 'used-gpr' successfully stopped ROPgadget from
assembling a chain, this more targeted protection is a justifiable
hardening / performance tradeoff.

Reviewed-by: Thomas Huth 
Signed-off-by: "Daniel P. Berrangé" 
Message-ID: <20240103123414.2401208-2-berra...@redhat.com>
Signed-off-by: Thomas Huth 
---
 meson.build | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/meson.build b/meson.build
index 38deb9363c..1bda391de6 100644
--- a/meson.build
+++ b/meson.build
@@ -552,6 +552,17 @@ if get_option('cfi')
   add_global_link_arguments(cfi_flags, native: false, language: all_languages)
 endif
 
+# Check further flags that make QEMU more robust against malicious parties
+
+hardening_flags = [
+# Zero out registers used during a function call
+# upon its return. This makes it harder to assemble
+# ROP gadgets into something usable
+'-fzero-call-used-regs=used-gpr',
+]
+
+qemu_common_flags += cc.get_supported_arguments(hardening_flags)
+
 add_global_arguments(qemu_common_flags, native: false, language: all_languages)
 add_global_link_arguments(qemu_ldflags, native: false, language: all_languages)
 
-- 
2.43.0




[PULL 2/5] tests/qtest/npcm7xx_watchdog_timer: Only test the corner cases by default

2024-01-15 Thread Thomas Huth
The test_prescaler() part in the npcm7xx_watchdog_timer test is quite
repetitive, testing all possible combinations of the WTCLK and WTIS
bitfields. Since each test spins up a new instance of QEMU, this is
a rather expensive test, especially on loaded host systems.
For the normal quick test mode, it should be sufficient to test the
corner settings of these fields (i.e. 0 and 3), so we can speed up
this test in the default mode quite a bit.

Message-ID: <20240115070223.30178-1-th...@redhat.com>
Reviewed-by: Alex Bennée 
Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Thomas Huth 
---
 tests/qtest/npcm7xx_watchdog_timer-test.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/tests/qtest/npcm7xx_watchdog_timer-test.c 
b/tests/qtest/npcm7xx_watchdog_timer-test.c
index 4773a673b2..981b853c99 100644
--- a/tests/qtest/npcm7xx_watchdog_timer-test.c
+++ b/tests/qtest/npcm7xx_watchdog_timer-test.c
@@ -172,9 +172,10 @@ static void test_reset_action(gconstpointer watchdog)
 static void test_prescaler(gconstpointer watchdog)
 {
 const Watchdog *wd = watchdog;
+int inc = g_test_quick() ? 3 : 1;
 
-for (int wtclk = 0; wtclk < 4; ++wtclk) {
-for (int wtis = 0; wtis < 4; ++wtis) {
+for (int wtclk = 0; wtclk < 4; wtclk += inc) {
+for (int wtis = 0; wtis < 4; wtis += inc) {
 QTestState *qts = qtest_init("-machine quanta-gsj");
 
 qtest_irq_intercept_in(qts, "/machine/soc/a9mpcore/gic");
-- 
2.43.0




Re: [PATCH 03/12] tests/plugin: add test plugin for inline operations

2024-01-15 Thread Pierrick Bouvier
On 1/15/24 13:04, Alex Bennée wrote:
Pierrick Bouvier writes:

On 1/13/24 21:16, Alex Bennée wrote:

Pierrick Bouvier  writes:


On 1/12/24 21:20, Alex Bennée wrote:

Pierrick Bouvier  writes:


On 1/11/24 19:57, Philippe Mathieu-Daudé wrote:

Hi Pierrick,
On 11/1/24 15:23, Pierrick Bouvier wrote:

For now, it simply performs instruction, bb and mem count, and ensures
that inline vs callback versions have the same result. Later, we'll
extend it when new inline operations are added.

Using existing plugins to test everything works is a bit cumbersome, as
different events are treated in different plugins. Thus, this new one.




+#define MAX_CPUS 8

Where does this value come from?



The plugin tests/plugin/insn.c had this constant, so I picked it up
from here.


Should the plugin API provide a helper to ask TCG how many
vCPUs are created?


In user mode, we can't know in advance how many simultaneous threads
(and thus vcpus) will be triggered. I'm not sure if additional cpus
can be added in system mode.

One problem, though, is that when you register an inline op with a
dynamic array, you can't change that registration afterwards when you
resize the array (on detecting a new vcpu). So you need statically
sized storage somewhere.

Your question is good, and maybe we should define a MAX constant that
plugins should rely on, instead of a random amount.

For user-mode it can be infinite. The existing plugins do this by
ensuring vcpu_index % max_vcpu. Perhaps we just ensure that for the
scoreboard as well? Of course that does introduce a trap for those using
user-mode...



The problem with vcpu_index % max_vcpu is that it reintroduces a race
condition, though it's probably less frequent than on a single
variable. IMHO, yes, it solves the memory error, but it does not solve
the initial problem itself.

The simplest solution would be to have a size "big enough" for most
cases, and abort when it's reached.

Well that is simple enough for system emulation as max_vcpus is a
bounded
number.


Another solution, much more complicated, but correct, would be to move
memory management of plugin scoreboard to plugin runtime, and add a
level of indirection to access it.

That certainly gives us the most control and safety. We can then
ensure we'll never be writing past the bounds of the buffer. The
plugin would have to use an access function to get the pointer to read
at the time it cares about, and of course inline checks should be
pretty simple.


Every time a new vcpu is added, we
can grow dynamically. This way, the array can grow, and ultimately the
plugin can poke its content/size. I'm not sure this complexity is what
we want though.

It doesn't seem too bad. We have a start/end_exclusive in *-user
do_fork where we could update pointers. If we are smart about growing
the size of the arrays we could avoid too much re-translation.



I was concerned about a potential race when the scoreboard updates
this pointer, and other cpus are executing tb (using it). But this
concern is not valid, since start_exclusive ensures all other cpus are
stopped.

vcpu_init_hook function in plugins/core.c seems a good location to add
this logic. We would check if an update is needed, then
start_exclusive(), update the scoreboard and exit exclusive section.


It might already be in the exclusive section. We should try and re-use
an existing exclusive region rather than adding a new one. It's
expensive so best avoided if we can.


Do you think it's worth trying to inline the scoreboard pointer (and
flush all tbs when it is updated), instead of simply adding an
indirection to it? With this, we could avoid any need to re-translate
anything.


Depends on the cost/complexity of the indirection I guess.
Re-translation isn't super expensive if we say double the size of the
score board each time we need to.


Do we want a limit of one scoreboard per thread? Can we store structures
in there?



From the current plugins' use cases, it seems that several scoreboards
are needed.
Allowing structure storage seems a bit more tricky to me, because
memory may be reallocated, so users won't be allowed to point
directly to it. I would be in favor of avoiding this (comments are
welcome).


The init function can take a sizeof(entry) field and the inline op can
have an offset field (which for the simple 0 case can be folded away by
TCG).
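
To make the shape of that suggestion concrete, here is a purely
illustrative sketch; every name and signature below is hypothetical
and is not the actual qemu-plugin API:

  #include <stddef.h>
  #include <stdint.h>

  /* Hypothetical handle owned by the plugin runtime, so the backing
   * storage can be grown when new vCPUs appear. */
  typedef struct ScoreboardHandle ScoreboardHandle;

  /* One entry of 'entry_size' bytes per vCPU. */
  ScoreboardHandle *plugin_scoreboard_new(size_t entry_size);

  /* Inline ops take the scoreboard plus a byte offset into each
   * per-vCPU entry; an offset of 0 can be folded away by TCG. */
  void plugin_register_inline_add_u64(ScoreboardHandle *sb,
                                      size_t offset, uint64_t imm);

  /* Reads go through an accessor rather than a raw pointer, since
   * the runtime may reallocate the array when it grows. */
  void *plugin_scoreboard_entry(ScoreboardHandle *sb,
                                unsigned int vcpu_index);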



Thanks for all your comments and guidance on this.

I implemented a new version, working with a scoreboard that gets
resized automatically, which allows usage of structs as well. The
result is pretty satisfying, as there is almost no more need to
manually keep track of how many cpus have been used, while offering
thread-safety by default.


I'll re-spin the series once I have cleaned up the commits and ported
existing plugins to this.


[PATCH v2 5/9] hw/core: Cleanup unused included header in machine-qmp-cmds.c

2024-01-15 Thread Zhao Liu
From: Zhao Liu 

Remove unused header (qemu/main-loop.h) in machine-qmp-cmds.c.

Tested by "./configure" and then "make".

Signed-off-by: Zhao Liu 
---
 hw/core/machine-qmp-cmds.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/core/machine-qmp-cmds.c b/hw/core/machine-qmp-cmds.c
index 3860a50c3b7b..ba629379dd92 100644
--- a/hw/core/machine-qmp-cmds.c
+++ b/hw/core/machine-qmp-cmds.c
@@ -8,6 +8,7 @@
  */
 
 #include "qemu/osdep.h"
+
 #include "hw/acpi/vmgenid.h"
 #include "hw/boards.h"
 #include "hw/intc/intc.h"
@@ -19,7 +20,6 @@
 #include "qapi/qmp/qobject.h"
 #include "qapi/qobject-input-visitor.h"
 #include "qapi/type-helpers.h"
-#include "qemu/main-loop.h"
 #include "qemu/uuid.h"
 #include "qom/qom-qobject.h"
 #include "sysemu/hostmem.h"
-- 
2.34.1




[PATCH v2 3/9] hw/core: Reorder included headers in cpu-common.c

2024-01-15 Thread Zhao Liu
From: Zhao Liu 

Reorder the header files (except qemu/osdep.h) in alphabetical order.

Tested by "./configure" and then "make".

Signed-off-by: Zhao Liu 
---
 hw/core/cpu-common.c | 13 +++--
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/hw/core/cpu-common.c b/hw/core/cpu-common.c
index a27f0e4cf216..23fc58462ffd 100644
--- a/hw/core/cpu-common.c
+++ b/hw/core/cpu-common.c
@@ -19,17 +19,18 @@
  */
 
 #include "qemu/osdep.h"
-#include "qapi/error.h"
+
+#include "exec/log.h"
+#include "hw/boards.h"
 #include "hw/core/cpu.h"
-#include "sysemu/hw_accel.h"
+#include "hw/qdev-properties.h"
+#include "qapi/error.h"
 #include "qemu/log.h"
 #include "qemu/main-loop.h"
-#include "exec/log.h"
+#include "qemu/plugin.h"
+#include "sysemu/hw_accel.h"
 #include "sysemu/tcg.h"
-#include "hw/boards.h"
-#include "hw/qdev-properties.h"
 #include "trace.h"
-#include "qemu/plugin.h"
 
 CPUState *cpu_by_arch_id(int64_t id)
 {
-- 
2.34.1




[PATCH v2 7/9] hw/core: Reorder included headers in null-machine.c

2024-01-15 Thread Zhao Liu
From: Zhao Liu 

Reorder the header files (except qemu/osdep.h) in alphabetical order.

Tested by "./configure" and then "make".

Signed-off-by: Zhao Liu 
---
 hw/core/null-machine.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/hw/core/null-machine.c b/hw/core/null-machine.c
index f586a4bef543..ef7e95a09ad0 100644
--- a/hw/core/null-machine.c
+++ b/hw/core/null-machine.c
@@ -12,10 +12,11 @@
  */
 
 #include "qemu/osdep.h"
-#include "qemu/error-report.h"
-#include "hw/boards.h"
+
 #include "exec/address-spaces.h"
+#include "hw/boards.h"
 #include "hw/core/cpu.h"
+#include "qemu/error-report.h"
 
 static void machine_none_init(MachineState *mch)
 {
-- 
2.34.1




[PATCH v2 0/9] hw/core: Cleanup and reorder headers

2024-01-15 Thread Zhao Liu
From: Zhao Liu 

Identify unused headers by full compilation (tested by "./configure" and
then "make") and manually checking if it is a direct inclusion.

Then reorder headers in alphabetical order.

In addition, update a file entry in MAINTAINERS file.

---
Changelog:

v1: Per Peter and Philippe's comments in v1, keep directly included
headers to avoid implicit header inclusions [1].

[1]: 
https://lore.kernel.org/qemu-devel/06d4179f-76b8-42f0-b147-f4bc2d1f0...@linaro.org/#t

---
Zhao Liu (9):
  MAINTAINERS: Update hw/core/cpu.c entry
  hw/core: Cleanup unused included headers in cpu-common.c
  hw/core: Reorder included headers in cpu-common.c
  hw/core: Reorder included headers in cpu-sysemu.c
  hw/core: Cleanup unused included header in machine-qmp-cmds.c
  hw/core: Reorder included header in machine.c
  hw/core: Reorder included headers in null-machine.c
  hw/core: Cleanup unused included headers in numa.c
  hw/core: Reorder included headers in numa.c

 MAINTAINERS|  3 ++-
 hw/core/cpu-common.c   | 17 +++--
 hw/core/cpu-sysemu.c   |  3 ++-
 hw/core/machine-qmp-cmds.c |  2 +-
 hw/core/machine.c  | 21 +++--
 hw/core/null-machine.c |  5 +++--
 hw/core/numa.c | 22 ++
 7 files changed, 36 insertions(+), 37 deletions(-)

-- 
2.34.1




[PATCH v2 9/9] hw/core: Reorder included headers in numa.c

2024-01-15 Thread Zhao Liu
From: Zhao Liu 

Reorder the header files (except qemu/osdep.h) in alphabetical order.

Tested by "./configure" and then "make".

Signed-off-by: Zhao Liu 
---
 hw/core/numa.c | 19 ++-
 1 file changed, 10 insertions(+), 9 deletions(-)

diff --git a/hw/core/numa.c b/hw/core/numa.c
index 05c5e1b8514c..778d2a16d2f1 100644
--- a/hw/core/numa.c
+++ b/hw/core/numa.c
@@ -23,22 +23,23 @@
  */
 
 #include "qemu/osdep.h"
-#include "qemu/units.h"
-#include "sysemu/hostmem.h"
-#include "sysemu/numa.h"
+
 #include "exec/cpu-common.h"
 #include "exec/ramlist.h"
-#include "qemu/error-report.h"
+#include "hw/boards.h"
+#include "hw/mem/memory-device.h"
+#include "hw/mem/pc-dimm.h"
+#include "sysemu/hostmem.h"
+#include "sysemu/numa.h"
+#include "sysemu/qtest.h"
 #include "qapi/error.h"
 #include "qapi/opts-visitor.h"
 #include "qapi/qapi-visit-machine.h"
-#include "sysemu/qtest.h"
-#include "hw/mem/pc-dimm.h"
-#include "hw/boards.h"
-#include "hw/mem/memory-device.h"
-#include "qemu/option.h"
 #include "qemu/config-file.h"
 #include "qemu/cutils.h"
+#include "qemu/error-report.h"
+#include "qemu/option.h"
+#include "qemu/units.h"
 
 QemuOptsList qemu_numa_opts = {
 .name = "numa",
-- 
2.34.1




[PATCH v2 1/9] MAINTAINERS: Update hw/core/cpu.c entry

2024-01-15 Thread Zhao Liu
From: Zhao Liu 

The hw/core/cpu.c was split as hw/core/cpu-common.c and
hw/core/cpu-sysemu.c in the commit df4fd7d5c8a3 ("cpu: Split as
cpu-common / cpu-sysemu").

Update the related entry.

Signed-off-by: Zhao Liu 
Reviewed-by: Philippe Mathieu-Daudé 
---
 MAINTAINERS | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index b406fb20c059..529313eba27e 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1867,7 +1867,8 @@ M: Marcel Apfelbaum 
 R: Philippe Mathieu-Daudé 
 R: Yanan Wang 
 S: Supported
-F: hw/core/cpu.c
+F: hw/core/cpu-common.c
+F: hw/core/cpu-sysemu.c
 F: hw/core/machine-qmp-cmds.c
 F: hw/core/machine.c
 F: hw/core/machine-smp.c
-- 
2.34.1




[PATCH v2 2/9] hw/core: Cleanup unused included headers in cpu-common.c

2024-01-15 Thread Zhao Liu
From: Zhao Liu 

Remove unused headers in cpu-common.c:
* qemu/notify.h
* exec/cpu-common.h
* qemu/error-report.h
* qemu/qemu-print.h

Tested by "./configure" and then "make".

Signed-off-by: Zhao Liu 
---
 hw/core/cpu-common.c | 4 
 1 file changed, 4 deletions(-)

diff --git a/hw/core/cpu-common.c b/hw/core/cpu-common.c
index 3ccfe882e2c3..a27f0e4cf216 100644
--- a/hw/core/cpu-common.c
+++ b/hw/core/cpu-common.c
@@ -22,13 +22,9 @@
 #include "qapi/error.h"
 #include "hw/core/cpu.h"
 #include "sysemu/hw_accel.h"
-#include "qemu/notify.h"
 #include "qemu/log.h"
 #include "qemu/main-loop.h"
 #include "exec/log.h"
-#include "exec/cpu-common.h"
-#include "qemu/error-report.h"
-#include "qemu/qemu-print.h"
 #include "sysemu/tcg.h"
 #include "hw/boards.h"
 #include "hw/qdev-properties.h"
-- 
2.34.1




[PATCH v2 4/9] hw/core: Reorder included headers in cpu-sysemu.c

2024-01-15 Thread Zhao Liu
From: Zhao Liu 

Reorder the header files (except qemu/osdep.h) in alphabetical order.

Tested by "./configure" and then "make".

Signed-off-by: Zhao Liu 
---
 hw/core/cpu-sysemu.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/hw/core/cpu-sysemu.c b/hw/core/cpu-sysemu.c
index d0d6a910f97c..c2cb05830076 100644
--- a/hw/core/cpu-sysemu.c
+++ b/hw/core/cpu-sysemu.c
@@ -19,9 +19,10 @@
  */
 
 #include "qemu/osdep.h"
-#include "qapi/error.h"
+
 #include "hw/core/cpu.h"
 #include "hw/core/sysemu-cpu-ops.h"
+#include "qapi/error.h"
 
 bool cpu_paging_enabled(const CPUState *cpu)
 {
-- 
2.34.1




[PATCH v2 6/9] hw/core: Reorder included header in machine.c

2024-01-15 Thread Zhao Liu
From: Zhao Liu 

Reorder the header files (except qemu/osdep.h) in alphabetical order.

Tested by "./configure" and then "make".

Signed-off-by: Zhao Liu 
---
 hw/core/machine.c | 21 +++--
 1 file changed, 11 insertions(+), 10 deletions(-)

diff --git a/hw/core/machine.c b/hw/core/machine.c
index fb5afdcae4cc..00f2f24ee79e 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -11,26 +11,27 @@
  */
 
 #include "qemu/osdep.h"
-#include "qemu/accel.h"
-#include "sysemu/replay.h"
+
+#include "audio/audio.h"
+#include "exec/confidential-guest-support.h"
 #include "hw/boards.h"
 #include "hw/loader.h"
+#include "hw/mem/nvdimm.h"
+#include "hw/pci/pci_bridge.h"
+#include "hw/virtio/virtio-net.h"
+#include "hw/virtio/virtio-pci.h"
+#include "migration/global_state.h"
 #include "qapi/error.h"
 #include "qapi/qapi-visit-machine.h"
+#include "qemu/accel.h"
 #include "qom/object_interfaces.h"
 #include "sysemu/cpus.h"
+#include "sysemu/qtest.h"
 #include "sysemu/sysemu.h"
+#include "sysemu/replay.h"
 #include "sysemu/reset.h"
 #include "sysemu/runstate.h"
 #include "sysemu/xen.h"
-#include "sysemu/qtest.h"
-#include "hw/pci/pci_bridge.h"
-#include "hw/mem/nvdimm.h"
-#include "migration/global_state.h"
-#include "exec/confidential-guest-support.h"
-#include "hw/virtio/virtio-pci.h"
-#include "hw/virtio/virtio-net.h"
-#include "audio/audio.h"
 
 GlobalProperty hw_compat_8_2[] = {};
 const size_t hw_compat_8_2_len = G_N_ELEMENTS(hw_compat_8_2);
-- 
2.34.1




[PATCH v2 8/9] hw/core: Cleanup unused included headers in numa.c

2024-01-15 Thread Zhao Liu
From: Zhao Liu 

Remove unused header in numa.c:
* qemu/bitmap.h
* hw/core/cpu.h
* migration/vmstate.h

Note: Though parse_numa_hmat_lb() has a variable named "bitmap_copy",
it doesn't use the normal bitmap ops, so it is safe to exclude the
qemu/bitmap.h header.

Tested by "./configure" and then "make".

Signed-off-by: Zhao Liu 
---
 hw/core/numa.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/hw/core/numa.c b/hw/core/numa.c
index f08956ddb0ff..05c5e1b8514c 100644
--- a/hw/core/numa.c
+++ b/hw/core/numa.c
@@ -28,15 +28,12 @@
 #include "sysemu/numa.h"
 #include "exec/cpu-common.h"
 #include "exec/ramlist.h"
-#include "qemu/bitmap.h"
 #include "qemu/error-report.h"
 #include "qapi/error.h"
 #include "qapi/opts-visitor.h"
 #include "qapi/qapi-visit-machine.h"
 #include "sysemu/qtest.h"
-#include "hw/core/cpu.h"
 #include "hw/mem/pc-dimm.h"
-#include "migration/vmstate.h"
 #include "hw/boards.h"
 #include "hw/mem/memory-device.h"
 #include "qemu/option.h"
-- 
2.34.1




Re: [PATCH] Fixed '-serial none' usage breaks following '-serial ...' usage

2024-01-15 Thread Markus Armbruster
Peter Maydell  writes:

> (I've cc'd a few people who might have opinions on possible
> command-line compatibility breakage.)
>
> On Wed, 10 Jan 2024 at 14:38, Bohdan Kostiv  wrote:
>>
>> Hello,
>>
>> I have faced an issue in using serial ports when I need to skip a couple of 
>> ports in the CLI.
>>
>> For example the ARM machine netduinoplus2 supports up to 7 UARTS.
>> Following case works (the first UART is used to send data in the firmware):
>> qemu-system-arm -machine netduinoplus2 -nographic -serial mon:stdio -kernel 
>> path-to-fw/firmware.elf
>> But this one doesn't  (the third UART is used to send data in the firmware):
>> qemu-system-arm -machine netduinoplus2 -nographic -serial none -serial none 
>> -serial mon:stdio -kernel path-to-fw/firmware.elf
>
> Putting the patch inline for more convenient discussion:
>
>> Subject: [PATCH] Fixed '-serial none' usage breaks following '-serial ...' 
>> usage
>>
>> Signed-off-by: Bohdan Kostiv 
>> ---
>>  system/vl.c | 5 -
>>  1 file changed, 4 insertions(+), 1 deletion(-)
>>
>> diff --git a/system/vl.c b/system/vl.c
>> index 2bcd9efb9a..b8744475cd 100644
>> --- a/system/vl.c
>> +++ b/system/vl.c
>> @@ -1442,8 +1442,11 @@ static int serial_parse(const char *devname)
>>  int index = num_serial_hds;
>>  char label[32];
>>
>> -if (strcmp(devname, "none") == 0)
>> +if (strcmp(devname, "none") == 0) {
>> +num_serial_hds++;
>>  return 0;
>> +}
>> +
>>  snprintf(label, sizeof(label), "serial%d", index);
>>  serial_hds = g_renew(Chardev *, serial_hds, index + 1);
>>
>> --
>> 2.39.3 (Apple Git-145)
>
> I agree that it's the right thing to do -- '-serial none
> -serial foo' ought to set serial_hds(0) as 'none' and
> serial_hds(1) as 'foo'.

I was about to ask whether this is a regression, but then ...

> My only concern here is that this is a very very
> longstanding bug -- as far as I can see it was
> introduced in commit 998bbd74b9d81 in 2009.

... saw you already showed it is.

> So I am
> a little worried that maybe some existing command lines
> accidentally rely on the current behaviour.
>
> I think the current behaviour is:
>
>  * "-serial none -serial something" is the same as
>"-serial something"
>  * "-serial none" on its own disables the default serial
>device (the docs say it will "disable all serial ports"
>but I don't think that is correct...)

Goes back all the way to the commit that introduced "none": c03b0f0fd86
(allow disabling of serial or parallel devices (Stefan Weil)), v0.9.0.

> which amounts to "the only effectively useful use of
> '-serial none' is to disable the default serial device"

Yes.

> and if we apply this patch:
>  * "-serial none -serial something" has the sensible behaviour
>of "first serial port not connected/present, second serial
>port exists" (which of those you get depends on the machine
>model)

Is this the behavior before commit 998bbd74b9d?

>  * "-serial none" on its own has no behaviour change
>
> So I think the only affected users would be anybody who
> accidentally had an extra "-serial none" in their command
> line that was previously being overridden by a later
> "-serial" option. That doesn't seem very likely to me,
> so I think I'd be in favour of making this change and
> having something in the release notes about it.
>
> Does anybody on the CC list have a different opinion /
> think I've mis-analysed what the current code is doing ?

I'm leaning towards agreeing with you.  A bit more heavily if the change
restores original behavior.

Your analysis should be worked into the commit message, though.




Re: [RFC PATCH v3 13/30] migration/multifd: Add outgoing QIOChannelFile support

2024-01-15 Thread Peter Xu
On Tue, Jan 16, 2024 at 12:05:57PM +0800, Peter Xu wrote:
> On Mon, Nov 27, 2023 at 05:25:55PM -0300, Fabiano Rosas wrote:
> > Allow multifd to open file-backed channels. This will be used when
> > enabling the fixed-ram migration stream format which expects a
> > seekable transport.
> > 
> > The QIOChannel read and write methods will use the preadv/pwritev
> > versions which don't update the file offset at each call so we can
> > reuse the fd without re-opening for every channel.
> > 
> > Note that this is just setup code and multifd cannot yet make use of
> > the file channels.
> > 
> > Signed-off-by: Fabiano Rosas 
> > ---
> > - open multifd channels with O_WRONLY and no mode
> > - stop cancelling migration and propagate error via qio_task
> > ---
> >  migration/file.c  | 47 +--
> >  migration/file.h  |  5 +
> >  migration/multifd.c   | 14 +++--
> >  migration/options.c   |  7 +++
> >  migration/options.h   |  1 +
> >  migration/qemu-file.h |  1 -
> >  6 files changed, 70 insertions(+), 5 deletions(-)
> > 
> > diff --git a/migration/file.c b/migration/file.c
> > index 5d4975f43e..67d6f42da7 100644
> > --- a/migration/file.c
> > +++ b/migration/file.c
> > @@ -17,6 +17,10 @@
> >  
> >  #define OFFSET_OPTION ",offset="
> >  
> > +static struct FileOutgoingArgs {
> > +char *fname;
> > +} outgoing_args;
> > +
> >  /* Remove the offset option from @filespec and return it in @offsetp. */
> >  
> >  int file_parse_offset(char *filespec, uint64_t *offsetp, Error **errp)
> > @@ -36,6 +40,42 @@ int file_parse_offset(char *filespec, uint64_t *offsetp, 
> > Error **errp)
> >  return 0;
> >  }
> >  
> > +static void qio_channel_file_connect_worker(QIOTask *task, gpointer opaque)
> > +{
> > +/* noop */
> > +}
> > +
> > +int file_send_channel_destroy(QIOChannel *ioc)
> > +{
> > +if (ioc) {
> > +qio_channel_close(ioc, NULL);
> > +object_unref(OBJECT(ioc));
> > +}
> > +g_free(outgoing_args.fname);
> > +outgoing_args.fname = NULL;
> > +
> > +return 0;
> > +}
> > +
> > +void file_send_channel_create(QIOTaskFunc f, void *data)
> > +{
> > +QIOChannelFile *ioc;
> > +QIOTask *task;
> > +Error *err = NULL;
> > +int flags = O_WRONLY;
> > +
> > +ioc = qio_channel_file_new_path(outgoing_args.fname, flags, 0, &err);
> > +
> > +task = qio_task_new(OBJECT(ioc), f, (gpointer)data, NULL);
> > +if (!ioc) {
> > +qio_task_set_error(task, err);
> > +return;
> > +}
> > +
> > +qio_task_run_in_thread(task, qio_channel_file_connect_worker,
> > +   (gpointer)data, NULL, NULL);
> 
> This is pretty weird.  This invokes a thread, but it'll run a noop.  It
> seems meaningless to me.
> 
> I assume you wanted to keep using the same async model as the socket typed
> multifd, but I don't think that works anyway, because file open blocks at
> qio_channel_file_new_path() so it's sync anyway.
> 
> AFAICT we still share the code, as long as the file path properly invokes
> multifd_channel_connect() after the iochannel is setup.
> 
> > +}
> > +
> >  void file_start_outgoing_migration(MigrationState *s,
> > FileMigrationArgs *file_args, Error 
> > **errp)
> >  {
> > @@ -43,15 +83,18 @@ void file_start_outgoing_migration(MigrationState *s,
> >  g_autofree char *filename = g_strdup(file_args->filename);
> >  uint64_t offset = file_args->offset;
> >  QIOChannel *ioc;
> > +int flags = O_CREAT | O_TRUNC | O_WRONLY;
> > +mode_t mode = 0660;
> >  
> >  trace_migration_file_outgoing(filename);
> >  
> > -fioc = qio_channel_file_new_path(filename, O_CREAT | O_WRONLY | 
> > O_TRUNC,
> > - 0600, errp);
> > +fioc = qio_channel_file_new_path(filename, flags, mode, errp);
> >  if (!fioc) {
> >  return;
> >  }
> >  
> > +outgoing_args.fname = g_strdup(filename);
> > +
> >  ioc = QIO_CHANNEL(fioc);
> >  if (offset && qio_channel_io_seek(ioc, offset, SEEK_SET, errp) < 0) {
> >  return;
> > diff --git a/migration/file.h b/migration/file.h
> > index 37d6a08bfc..511019b319 100644
> > --- a/migration/file.h
> > +++ b/migration/file.h
> > @@ -9,10 +9,15 @@
> >  #define QEMU_MIGRATION_FILE_H
> >  
> >  #include "qapi/qapi-types-migration.h"
> > +#include "io/task.h"
> > +#include "channel.h"
> >  
> >  void file_start_incoming_migration(FileMigrationArgs *file_args, Error 
> > **errp);
> >  
> >  void file_start_outgoing_migration(MigrationState *s,
> > FileMigrationArgs *file_args, Error 
> > **errp);
> >  int file_parse_offset(char *filespec, uint64_t *offsetp, Error **errp);
> > +
> > +void file_send_channel_create(QIOTaskFunc f, void *data);
> > +int file_send_channel_destroy(QIOChannel *ioc);
> >  #endif
> > diff --git a/migration/multifd.c b/migration/multifd.c
> > index 123ff0dec0..427740aab6 100644
> > --- 

Re: [PATCH v2 03/19] qapi: create QAPISchemaDefinition

2024-01-15 Thread Markus Armbruster
John Snow  writes:

> On Mon, Jan 15, 2024 at 8:16 AM Markus Armbruster  wrote:
>>
>> John Snow  writes:
>>
>> > Include entities don't have names, but we generally expect "entities" to
>> > have names. Reclassify all entities with names as *definitions*, leaving
>> > the nameless include entities as QAPISchemaEntity instances.
>> >
>> > This is primarily to help simplify typing around expectations of what
>> > callers expect for properties of an "entity".
>> >
>> > Suggested-by: Markus Armbruster 
>> > Signed-off-by: John Snow 
>> > ---
>> >  scripts/qapi/schema.py | 117 -
>> >  1 file changed, 70 insertions(+), 47 deletions(-)
>> >
>> > diff --git a/scripts/qapi/schema.py b/scripts/qapi/schema.py
>> > index b7830672e57..e39ed972a80 100644
>> > --- a/scripts/qapi/schema.py
>> > +++ b/scripts/qapi/schema.py
>> > @@ -55,14 +55,14 @@ def is_present(self):
>> >
>> >
>> >  class QAPISchemaEntity:
>> > -meta: Optional[str] = None
>> > +"""
>> > +QAPISchemaEntity represents all schema entities.
>> >
>> > -def __init__(self, name: str, info, doc, ifcond=None, features=None):
>> > -assert name is None or isinstance(name, str)
>> > -for f in features or []:
>> > -assert isinstance(f, QAPISchemaFeature)
>> > -f.set_defined_in(name)
>> > -self.name = name
>> > +Notably, this includes both named and un-named entities, such as
>> > +include directives. Entities with names are represented by the
>> > +more specific sub-class `QAPISchemaDefinition` instead.
>> > +"""
>>
>> Hmm.  What about:
>>
>>"""
>>A schema entity.
>>
>>This is either a directive, such as include, or a definition.
>>The latter use sub-class `QAPISchemaDefinition`.

Or is it "uses"?  Not a native speaker...

>>"""
>>
>
> Sure. Key point was just the cookie crumb to the sub-class.
>
>> > +def __init__(self, info):
>> >  self._module = None
>> >  # For explicitly defined entities, info points to the (explicit)
>> >  # definition.  For builtins (and their arrays), info is None.
>> > @@ -70,14 +70,50 @@ def __init__(self, name: str, info, doc, ifcond=None, 
>> > features=None):
>> >  # triggered the implicit definition (there may be more than one
>> >  # such place).
>> >  self.info = info
>> > +self._checked = False
>> > +
>> > +def __repr__(self):
>> > +return "<%s at 0x%x>" % (type(self).__name__, id(self))
>> > +
>> > +def check(self, schema):
>> > +# pylint: disable=unused-argument
>> > +self._checked = True
>> > +
>> > +def connect_doc(self, doc=None):
>> > +pass
>> > +
>> > +def check_doc(self):
>> > +pass
>> > +
>> > +def _set_module(self, schema, info):
>> > +assert self._checked
>> > +fname = info.fname if info else 
>> > QAPISchemaModule.BUILTIN_MODULE_NAME
>> > +self._module = schema.module_by_fname(fname)
>> > +self._module.add_entity(self)
>> > +
>> > +def set_module(self, schema):
>> > +self._set_module(schema, self.info)
>> > +
>> > +def visit(self, visitor):
>> > +# pylint: disable=unused-argument
>> > +assert self._checked
>> > +
>> > +
>> > +class QAPISchemaDefinition(QAPISchemaEntity):
>> > +meta: Optional[str] = None
>> > +
>> > +def __init__(self, name: str, info, doc, ifcond=None, features=None):
>> > +assert isinstance(name, str)
>> > +super().__init__(info)
>> > +for f in features or []:
>> > +assert isinstance(f, QAPISchemaFeature)
>> > +f.set_defined_in(name)
>> > +self.name = name
>> >  self.doc = doc
>> >  self._ifcond = ifcond or QAPISchemaIfCond()
>> >  self.features = features or []
>> > -self._checked = False
>> >
>> >  def __repr__(self):
>> > -if self.name is None:
>> > -return "<%s at 0x%x>" % (type(self).__name__, id(self))
>> >  return "<%s:%s at 0x%x>" % (type(self).__name__, self.name,
>> >  id(self))
>> >
>> > @@ -102,15 +138,6 @@ def check_doc(self):
>>def check(self, schema):
>># pylint: disable=unused-argument
>>assert not self._checked
>>seen = {}
>>for f in self.features:
>>f.check_clash(self.info, seen)
>>self._checked = True
>>
>>def connect_doc(self, doc=None):
>>doc = doc or self.doc
>>if doc:
>>for f in self.features:
>>doc.connect_feature(f)
>>
>>def check_doc(self):
>> >  if self.doc:
>> >  self.doc.check()
>>
>> No super().FOO()?
>
> Ah, just an oversight. It worked out because the super method doesn't
> do anything anyway. check() and connect_doc() should also use the
> super call, probably.

Yes, please; it's a good habit.


Re: [RFC PATCH v3 17/30] migration/multifd: Decouple recv method from pages

2024-01-15 Thread Peter Xu
On Mon, Nov 27, 2023 at 05:25:59PM -0300, Fabiano Rosas wrote:
> Next patch will abstract the type of data being received by the
> channels, so do some cleanup now to remove references to pages and
> dependency on 'normal_num'.
> 
> Signed-off-by: Fabiano Rosas 

Reviewed-by: Peter Xu 

-- 
Peter Xu




Re: [RFC PATCH v3 16/30] multifd: Rename MultiFDSendParams::data to compress_data

2024-01-15 Thread Peter Xu
On Mon, Nov 27, 2023 at 05:25:58PM -0300, Fabiano Rosas wrote:
> Use a more specific name for the compression data so we can use the
> generic for the multifd core code.

You also modified the recv side.  It would be nice to touch up the
subject to mention both.

> 
> Signed-off-by: Fabiano Rosas 

Reviewed-by: Peter Xu 


-- 
Peter Xu




Re: [RFC PATCH v3 15/30] io: Add a pwritev/preadv version that takes a discontiguous iovec

2024-01-15 Thread Peter Xu
On Mon, Nov 27, 2023 at 05:25:57PM -0300, Fabiano Rosas wrote:
> For the upcoming support to fixed-ram migration with multifd, we need
> to be able to accept an iovec array with non-contiguous data.
> 
> Add a pwritev and preadv version that splits the array into contiguous
> segments before writing. With that we can have the ram code continue
> to add pages in any order and the multifd code continue to send large
> arrays for reading and writing.
> 
> Signed-off-by: Fabiano Rosas 
> ---
> - split the API that was merged into a single function
> - use uintptr_t for compatibility with 32-bit
> ---
>  include/io/channel.h | 26 
>  io/channel.c | 70 
>  2 files changed, 96 insertions(+)
> 
> diff --git a/include/io/channel.h b/include/io/channel.h
> index 7986c49c71..25383db5aa 100644
> --- a/include/io/channel.h
> +++ b/include/io/channel.h
> @@ -559,6 +559,19 @@ int qio_channel_close(QIOChannel *ioc,
>  ssize_t qio_channel_pwritev(QIOChannel *ioc, const struct iovec *iov,
>  size_t niov, off_t offset, Error **errp);
>  
> +/**
> + * qio_channel_pwritev_all:
> + * @ioc: the channel object
> + * @iov: the array of memory regions to write data from
> + * @niov: the length of the @iov array
> + * @offset: the iovec offset in the file where to write the data
> + * @errp: pointer to a NULL-initialized error object
> + *
> + * Returns: 0 if all bytes were written, or -1 on error
> + */
> +int qio_channel_pwritev_all(QIOChannel *ioc, const struct iovec *iov,
> +size_t niov, off_t offset, Error **errp);
> +
>  /**
>   * qio_channel_pwrite
>   * @ioc: the channel object
> @@ -595,6 +608,19 @@ ssize_t qio_channel_pwrite(QIOChannel *ioc, char *buf, 
> size_t buflen,
>  ssize_t qio_channel_preadv(QIOChannel *ioc, const struct iovec *iov,
> size_t niov, off_t offset, Error **errp);
>  
> +/**
> + * qio_channel_preadv_all:
> + * @ioc: the channel object
> + * @iov: the array of memory regions to read data to
> + * @niov: the length of the @iov array
> + * @offset: the iovec offset in the file from where to read the data
> + * @errp: pointer to a NULL-initialized error object
> + *
> + * Returns: 0 if all bytes were read, or -1 on error
> + */
> +int qio_channel_preadv_all(QIOChannel *ioc, const struct iovec *iov,
> +   size_t niov, off_t offset, Error **errp);
> +
>  /**
>   * qio_channel_pread
>   * @ioc: the channel object
> diff --git a/io/channel.c b/io/channel.c
> index a1f12f8e90..2f1745d052 100644
> --- a/io/channel.c
> +++ b/io/channel.c
> @@ -472,6 +472,69 @@ ssize_t qio_channel_pwritev(QIOChannel *ioc, const 
> struct iovec *iov,
>  return klass->io_pwritev(ioc, iov, niov, offset, errp);
>  }
>  
> +static int qio_channel_preadv_pwritev_contiguous(QIOChannel *ioc,
> + const struct iovec *iov,
> + size_t niov, off_t offset,
> + bool is_write, Error **errp)
> +{
> +ssize_t ret = -1;
> +int i, slice_idx, slice_num;
> +uintptr_t base, next, file_offset;
> +size_t len;
> +
> +slice_idx = 0;
> +slice_num = 1;
> +
> +/*
> + * If the iov array doesn't have contiguous elements, we need to
> + * split it in slices because we only have one (file) 'offset' for
> + * the whole iov. Do this here so callers don't need to break the
> + * iov array themselves.
> + */
> +for (i = 0; i < niov; i++, slice_num++) {
> +base = (uintptr_t) iov[i].iov_base;
> +
> +if (i != niov - 1) {
> +len = iov[i].iov_len;
> +next = (uintptr_t) iov[i + 1].iov_base;
> +
> +if (base + len == next) {
> +continue;
> +}
> +}
> +
> +/*
> + * Use the offset of the first element of the segment that
> + * we're sending.
> + */
> +file_offset = offset + (uintptr_t) iov[slice_idx].iov_base;
> +
> +if (is_write) {
> > +ret = qio_channel_pwritev(ioc, &iov[slice_idx], slice_num,
> +  file_offset, errp);
> +} else {
> > +ret = qio_channel_preadv(ioc, &iov[slice_idx], slice_num,
> + file_offset, errp);
> +}
> +
> +if (ret < 0) {
> +break;
> +}
> +
> +slice_idx += slice_num;
> +slice_num = 0;
> +}
> +
> +return (ret < 0) ? -1 : 0;
> +}
> +
> +int qio_channel_pwritev_all(QIOChannel *ioc, const struct iovec *iov,
> +size_t niov, off_t offset, Error **errp)
> +{
> +return qio_channel_preadv_pwritev_contiguous(ioc, iov, niov,
> + offset, true, errp);
> +}

I'm not sure how Dan thinks about this, but I don't think this is pretty..

With this 

Re: [PATCH v2 01/19] qapi: sort pylint suppressions

2024-01-15 Thread Markus Armbruster
John Snow  writes:

> On Mon, Jan 15, 2024 at 7:18 AM Markus Armbruster  wrote:
>>
>> John Snow  writes:
>>
>> > Suggested-by: Markus Armbruster 
>> > Signed-off-by: John Snow 
>> > ---
>> >  scripts/qapi/pylintrc | 6 +++---
>> >  1 file changed, 3 insertions(+), 3 deletions(-)
>> >
>> > diff --git a/scripts/qapi/pylintrc b/scripts/qapi/pylintrc
>> > index 90546df5345..78b63af4df6 100644
>> > --- a/scripts/qapi/pylintrc
>> > +++ b/scripts/qapi/pylintrc
>> > @@ -16,14 +16,14 @@ ignore-patterns=schema.py,
>> >  # --enable=similarities". If you want to run only the classes checker, 
>> > but have
>> >  # no Warning level messages displayed, use "--disable=all --enable=classes
>> >  # --disable=W".
>> > -disable=fixme,
>> > +disable=consider-using-f-string,
>> >  missing-docstring,
>> >  too-many-arguments,
>> >  too-many-branches,
>> > -too-many-statements,
>> >  too-many-instance-attributes,
>> > -consider-using-f-string,
>> > +too-many-statements,
>> >  useless-option-value,
>> > +fixme,
>> >
>> >  [REPORTS]
>>
>> Any particular reason for putting fixme last?
>>
>
> Uhh. You know, I have no idea? I swore I just used emacs to sort the
> lines, but ... lol? You'd think I'd notice that this wasn't in
> alphabetical order, but...
>
> *cough* sorry

No worries, happens to the best of us :)




RE: FW: [PATCH] vhost-user: add VIRTIO_F_IN_ORDER and VIRTIO_F_NOTIFICATION_DATA feature

2024-01-15 Thread Wentao Jia
Hi, Jason 

The two features were defined before version 8.2; we can find them in QEMU 8.2:

VIRTIO_F_NOTIFICATION_DATA
https://github.com/qemu/qemu/commit/d0bf492f3877d4187d2f7d0c0abb3a2bf3104392
VIRTIO_F_IN_ORDER
https://github.com/qemu/qemu/commit/e4082063e47e9731dbeb1c26174c17f6038f577f

thank you

Wentao

-Original Message-
From: Jason Wang  
Sent: Tuesday, January 16, 2024 10:20 AM
To: Wentao Jia 
Cc: qemu-devel@nongnu.org; m...@redhat.com; Rick Zhong 

Subject: Re: FW: [PATCH] vhost-user: add VIRTIO_F_IN_ORDER and 
VIRTIO_F_NOTIFICATION_DATA feature

On Tue, Jan 16, 2024 at 9:57 AM Wentao Jia  wrote:
>
> Hi, Jason
>
> I just added two features to the vhost-user feature bits. The patch was
> tested in my environment. I do not know what the compatibility means.

For example, if you don't do that, migrating from 9.0 to 8.2 will
break, as 8.2 doesn't have those new features at all.

Please refer hw_compat_8_2.

Thanks

>
> Wentao
>
> -Original Message-
> From: Jason Wang 
> Sent: Monday, January 15, 2024 8:18 AM
> To: Wentao Jia 
> Cc: qemu-devel@nongnu.org; m...@redhat.com; Rick Zhong 
> 
> Subject: Re: FW: [PATCH] vhost-user: add VIRTIO_F_IN_ORDER and 
> VIRTIO_F_NOTIFICATION_DATA feature
>
> On Fri, Jan 12, 2024 at 4:18 PM Wentao Jia  wrote:
> >
> > Hi, Michael and Jason
> >
> > Do you have any other comments?
> > Is there a schedule for merge the patch into the community?
> > Thank you
>
> I think as discussed, we need to add compatibility support for those features.
>
> Thanks
>
> >
> > Wentao
> >
> > -Original Message-
> > From: Wentao Jia
> > Sent: Tuesday, January 2, 2024 1:57 PM
> > To: qemu-devel@nongnu.org
> > Cc: 'm...@redhat.com' ; Rick Zhong 
> > ; 'Jason Wang' 
> > Subject: RE: FW: [PATCH] vhost-user: add VIRTIO_F_IN_ORDER and 
> > VIRTIO_F_NOTIFICATION_DATA feature
> >
> >
> > ---
> >  hw/net/vhost_net.c | 2 ++
> >  1 file changed, 2 insertions(+)
> >
> > diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c index
> > e8e1661646..211ca859a6 100644
> > --- a/hw/net/vhost_net.c
> > +++ b/hw/net/vhost_net.c
> > @@ -76,6 +76,8 @@ static const int user_feature_bits[] = {
> >  VIRTIO_F_IOMMU_PLATFORM,
> >  VIRTIO_F_RING_PACKED,
> >  VIRTIO_F_RING_RESET,
> > +VIRTIO_F_IN_ORDER,
> > +VIRTIO_F_NOTIFICATION_DATA,
> >  VIRTIO_NET_F_RSS,
> >  VIRTIO_NET_F_HASH_REPORT,
> >  VIRTIO_NET_F_GUEST_USO4,
> > --
> >
> > -Original Message-
> > From: Wentao Jia
> > Sent: Tuesday, January 2, 2024 1:38 PM
> > To: Jason Wang 
> > Cc: m...@redhat.com; Rick Zhong 
> > Subject: RE: FW: [PATCH] vhost-user: add VIRTIO_F_IN_ORDER and 
> > VIRTIO_F_NOTIFICATION_DATA feature
> >
> > Hi, Jason
> >
> > It is good just change feature bits, I will commit a new patch, 
> > thanks
> >
> > Wentao Jia
> >
> > -Original Message-
> > From: Jason Wang 
> > Sent: Tuesday, January 2, 2024 11:24 AM
> > To: Wentao Jia 
> > Cc: m...@redhat.com; Rick Zhong 
> > Subject: Re: FW: [PATCH] vhost-user: add VIRTIO_F_IN_ORDER and 
> > VIRTIO_F_NOTIFICATION_DATA feature
> >
> > On Tue, Jan 2, 2024 at 10:26 AM Wentao Jia  wrote:
> > >
> > > Hi, Michael  and Jason
> > >
> > >
> > >
> > > please review the patch at your convenience, thank you
> > >
> > > vhost-user: add VIRTIO_F_IN_ORDER and VIRTIO_F_NOTIFICATION_DATA 
> > > feature - Patchwork (kernel.org)
> > >
> > >
> > >
> > > Wentao Jia
> > >
> > >
> > >
> > > From: Wentao Jia
> > > Sent: Friday, December 1, 2023 6:11 PM
> > > To: qemu-devel@nongnu.org
> > > Subject: [PATCH] vhost-user: add VIRTIO_F_IN_ORDER and 
> > > VIRTIO_F_NOTIFICATION_DATA feature
> > >
> > >
> > >
> > > VIRTIO_F_IN_ORDER and VIRTIO_F_NOTIFICATION_DATA feature are 
> > > important feature
> > >
> > > for dpdk vdpa packets transmitting performance, add the 2 features 
> > > at vhost-user
> > >
> > > front-end to negotiation with backend.
> > >
> > >
> > >
> > > Signed-off-by: Kyle Xu zhenbing...@corigine.com
> > >
> > > Signed-off-by: Wentao Jia wentao@corigine.com
> > >
> > > Reviewed-by:   Xinying Yu xinying...@corigine.com
> > >
> > > Reviewed-by:   Shujing Dong shujing.d...@corigine.com
> > >
> > > Reviewed-by:   Rick Zhong zhaoyong.zh...@corigine.com
> > >
> > > ---
> > >
> > > hw/net/vhost_net.c | 2 ++
> > >
> > > include/hw/virtio/virtio.h | 4 
> > >
> > > 2 files changed, 6 insertions(+)
> > >
> > >
> > >
> > > diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
> > >
> > > index e8e1661646..211ca859a6 100644
> > >
> > > --- a/hw/net/vhost_net.c
> > >
> > > +++ b/hw/net/vhost_net.c
> > >
> > > @@ -76,6 +76,8 @@ static const int user_feature_bits[] = {
> > >
> > >  VIRTIO_F_IOMMU_PLATFORM,
> > >
> > >  VIRTIO_F_RING_PACKED,
> > >
> > >  VIRTIO_F_RING_RESET,
> > >
> > > +VIRTIO_F_IN_ORDER,
> > >
> > > +VIRTIO_F_NOTIFICATION_DATA,
> > >
> > >  VIRTIO_NET_F_RSS,
> > >
> > >  VIRTIO_NET_F_HASH_REPORT,
> > >
> > >  VIRTIO_NET_F_GUEST_USO4,
> > >
> > > diff --git a/include/hw/virtio/virtio.h 
> > > 

Re: [RFC PATCH v3 14/30] migration/multifd: Add incoming QIOChannelFile support

2024-01-15 Thread Peter Xu
On Mon, Nov 27, 2023 at 05:25:56PM -0300, Fabiano Rosas wrote:
> On the receiving side we don't need to differentiate between main
> channel and threads, so whichever channel is defined first gets to be
> the main one. And since there are no packets, use the atomic channel
> count to index into the params array.
> 
> Signed-off-by: Fabiano Rosas 
> ---
> - stop setting offset in secondary channels
> - check for packets when peeking
> ---
>  migration/file.c  | 36 
>  migration/migration.c |  3 ++-
>  migration/multifd.c   |  3 +--
>  migration/multifd.h   |  1 +
>  4 files changed, 32 insertions(+), 11 deletions(-)
> 
> diff --git a/migration/file.c b/migration/file.c
> index 67d6f42da7..62ba994109 100644
> --- a/migration/file.c
> +++ b/migration/file.c
> @@ -7,12 +7,14 @@
>  
>  #include "qemu/osdep.h"
>  #include "qemu/cutils.h"
> +#include "qemu/error-report.h"
>  #include "qapi/error.h"
>  #include "channel.h"
>  #include "file.h"
>  #include "migration.h"
>  #include "io/channel-file.h"
>  #include "io/channel-util.h"
> +#include "options.h"
>  #include "trace.h"
>  
>  #define OFFSET_OPTION ",offset="
> @@ -117,22 +119,40 @@ void file_start_incoming_migration(FileMigrationArgs 
> *file_args, Error **errp)
>  g_autofree char *filename = g_strdup(file_args->filename);
>  QIOChannelFile *fioc = NULL;
>  uint64_t offset = file_args->offset;
> -QIOChannel *ioc;
> +int channels = 1;
> +int i = 0, fd;
>  
>  trace_migration_file_incoming(filename);
>  
>  fioc = qio_channel_file_new_path(filename, O_RDONLY, 0, errp);
>  if (!fioc) {
> +goto out;

Shouldn't here be a "return"?  Won't "goto out" try to error_setg() again
and crash?

It looks like that label can be dropped.

> +}
> +
> +if (offset &&
> +qio_channel_io_seek(QIO_CHANNEL(fioc), offset, SEEK_SET, errp) < 0) {
>  return;
>  }
>  
> -ioc = QIO_CHANNEL(fioc);
> -if (offset && qio_channel_io_seek(ioc, offset, SEEK_SET, errp) < 0) {
> +if (migrate_multifd()) {
> +channels += migrate_multifd_channels();
> +}
> +
> +fd = fioc->fd;
> +
> +do {
> +QIOChannel *ioc = QIO_CHANNEL(fioc);
> +
> +qio_channel_set_name(ioc, "migration-file-incoming");
> +qio_channel_add_watch_full(ioc, G_IO_IN,
> +   file_accept_incoming_migration,
> +   NULL, NULL,
> +   g_main_context_get_thread_default());
> +} while (++i < channels && (fioc = qio_channel_file_new_fd(fd)));
> +
> +out:
> +if (!fioc) {
> +error_setg(errp, "Error creating migration incoming channel");
>  return;
>  }
> -qio_channel_set_name(QIO_CHANNEL(ioc), "migration-file-incoming");
> -qio_channel_add_watch_full(ioc, G_IO_IN,
> -   file_accept_incoming_migration,
> -   NULL, NULL,
> -   g_main_context_get_thread_default());
>  }
> diff --git a/migration/migration.c b/migration/migration.c
> index 897ed1db67..16689171ab 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -838,7 +838,8 @@ void migration_ioc_process_incoming(QIOChannel *ioc, 
> Error **errp)
>  uint32_t channel_magic = 0;
>  int ret = 0;
>  
> -if (migrate_multifd() && !migrate_postcopy_ram() &&
> +if (migrate_multifd() && migrate_multifd_packets() &&
> +!migrate_postcopy_ram() &&
>  qio_channel_has_feature(ioc, QIO_CHANNEL_FEATURE_READ_MSG_PEEK)) {
>  /*
>   * With multiple channels, it is possible that we receive channels
> diff --git a/migration/multifd.c b/migration/multifd.c
> index 427740aab6..3476fac49f 100644
> --- a/migration/multifd.c
> +++ b/migration/multifd.c
> @@ -1283,8 +1283,7 @@ void multifd_recv_new_channel(QIOChannel *ioc, Error 
> **errp)
>  /* initial packet */
>  num_packets = 1;
>  } else {
> -/* next patch gives this a meaningful value */
> -id = 0;
> +id = qatomic_read(&multifd_recv_state->count);
>  }
>  
>  p = &multifd_recv_state->params[id];
> diff --git a/migration/multifd.h b/migration/multifd.h
> index a835643b48..a112ec7ac6 100644
> --- a/migration/multifd.h
> +++ b/migration/multifd.h
> @@ -18,6 +18,7 @@ void multifd_save_cleanup(void);
>  int multifd_load_setup(Error **errp);
>  void multifd_load_cleanup(void);
>  void multifd_load_shutdown(void);
> +bool multifd_recv_first_channel(void);

This can be dropped?

>  bool multifd_recv_all_channels_created(void);
>  void multifd_recv_new_channel(QIOChannel *ioc, Error **errp);
>  void multifd_recv_sync_main(void);
> -- 
> 2.35.3
> 

-- 
Peter Xu




Re: [PATCH 5/5] qemu-options: Remove the deprecated -singlestep option

2024-01-15 Thread Markus Armbruster
Daniel P. Berrangé  writes:

> On Mon, Jan 15, 2024 at 05:39:19PM +, Peter Maydell wrote:
>> On Mon, 15 Jan 2024 at 13:54, Thomas Huth  wrote:
>> >
>> > On 12/01/2024 16.39, Philippe Mathieu-Daudé wrote:
>> > > Hi Thomas
>> > >
>> > > +Laurent & Peter
>> > >
>> > > On 12/1/24 11:00, Thomas Huth wrote:
>> > >> It's been marked as deprecated since QEMU 8.1, so it should be fine
>> > >> to remove this now.
>> > >>
>> > >> Signed-off-by: Thomas Huth 
>> 
>> > > StatusInfo::singlestep was deprecated at the same time,
>> > > can we remove it?
>> > >
>> > > IOW could we complete your patch with this?
>> 
>> > > diff --git a/qapi/run-state.json b/qapi/run-state.json
>> > > index ca05502e0a..08bc99cb85 100644
>> > > --- a/qapi/run-state.json
>> > > +++ b/qapi/run-state.json
>> > > @@ -106,25 +106,15 @@
>> > >  #
>> > >  # @running: true if all VCPUs are runnable, false if not runnable
>> > >  #
>> > > -# @singlestep: true if using TCG with one guest instruction per
>> > > -# translation block
>> > > -#
>> > >  # @status: the virtual machine @RunState
>> > >  #
>> > >  # Features:
>> > >  #
>> > > -# @deprecated: Member 'singlestep' is deprecated (with no
>> > > -# replacement).
>> > > -#
>> > >  # Since: 0.14
>> > >  #
>> > > -# Notes: @singlestep is enabled on the command line with '-accel
>> > > -# tcg,one-insn-per-tb=on', or with the HMP 'one-insn-per-tb'
>> > > -# command.
>> > >  ##
>> > >  { 'struct': 'StatusInfo',
>> > >'data': {'running': 'bool',
>> > > -   'singlestep': { 'type': 'bool', 'features': [ 'deprecated' 
>> > > ]},
>> > > 'status': 'RunState'} }
>> >
>> > Uh, oh, that's a bigger change already ... can we safely remove the field
>> > here without upsetting 3rd party apps that rely on this interface?
>> 
>> That was the whole point of marking it 'deprecated' in the JSON,
>> I thought? We don't think anybody's using it, we've given fair
>> warning, isn't the next step "remove it"? Markus, you're the
>> expert on QAPI deprecations...
>
> Yes, it is fine to delete it without thinking further about possible usage,
> unless someone steps forward quickly with new information that wasn't known
> when the deprecation was added

Concur.

Supporting data:

commit 34c18203d472c5bf969ebd87dc06c7c3a957efc4
Author: Peter Maydell 
Date:   Mon Apr 17 17:40:41 2023 +0100

qmp: Deprecate 'singlestep' member of StatusInfo

The 'singlestep' member of StatusInfo has never done what the QMP
documentation claims it does.  What it actually reports is whether
TCG is working in "one guest instruction per translation block" mode.

We no longer need this field for the HMP 'info status' command, as
we've moved that information to 'info jit'.  It seems unlikely that
anybody is monitoring the state of this obscure TCG setting via QMP,
especially since QMP provides no means for changing the setting.  So
simply deprecate the field, without providing any replacement.

Until we do eventually delete the member, correct the misstatements
in the QAPI documentation about it.

If we do find that there are users for this, then the most likely way
we would provide replacement access to the information would be to
put the accelerator QOM object at a well-known path such as
/machine/accel, which could then be used with the existing qom-set
and qom-get commands.

Signed-off-by: Peter Maydell 
Reviewed-by: Richard Henderson 
Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Markus Armbruster 
Message-id: 20230417164041.684562-11-peter.mayd...@linaro.org

diff --git a/docs/about/deprecated.rst b/docs/about/deprecated.rst
index 6f5e689aa4..d5eda0f566 100644
--- a/docs/about/deprecated.rst
+++ b/docs/about/deprecated.rst
@@ -199,6 +199,20 @@ accepted incorrect commands will return an error. Users 
should make sure that
 all arguments passed to ``device_add`` are consistent with the documented
 property types.
 
+``StatusInfo`` member ``singlestep`` (since 8.1)
+
+
+The ``singlestep`` member of the ``StatusInfo`` returned from the
+``query-status`` command is deprecated. This member has a confusing
+name and it never did what the documentation claimed or what its name
+suggests. We do not believe that anybody is actually using the
+information provided in this member.
+
+The information it reports is whether the TCG JIT is in "one
+instruction per translated block" mode (which can be set on the
+command line or via the HMP, but not via QMP). The information remains
+available via the HMP 'info jit' command.
+
[...]




RE: [PATCH v2 0/9] Hexagon (target/hexagon) Make generators object oriented

2024-01-15 Thread Brian Cain


> -Original Message-
> From: Taylor Simpson 
> Sent: Sunday, December 10, 2023 4:07 PM
> To: qemu-devel@nongnu.org
> Cc: Brian Cain ; Matheus Bernardino (QUIC)
> ; Sid Manning ; Marco
> Liebel (QUIC) ; richard.hender...@linaro.org;
> phi...@linaro.org; a...@rev.ng; a...@rev.ng; ltaylorsimp...@gmail.com
> Subject: [PATCH v2 0/9] Hexagon (target/hexagon) Make generators object
> oriented
> 
> See commit message in second patch
> 
>  Changes in v2 
> Address feedback from Brian Cain 
> - Consolidate logic to create helper arg lists
> 
> 
> Taylor Simpson (9):
>   Hexagon (target/hexagon) Clean up handling of modifier registers
>   Hexagon (target/hexagon) Make generators object oriented -
> gen_tcg_funcs
>   Hexagon (target/hexagon) Make generators object oriented -
> gen_helper_protos
>   Hexagon (target/hexagon) Make generators object oriented -
> gen_helper_funcs
>   Hexagon (target/hexagon) Make generators object oriented -
> gen_idef_parser_funcs
>   Hexagon (target/hexagon) Make generators object oriented - gen_op_regs
>   Hexagon (target/hexagon) Make generators object oriented -
> gen_analyze_funcs
>   Hexagon (target/hexagon) Remove unused WRITES_PRED_REG attribute
>   Hexagon (target/hexagon) Remove dead functions from hex_common.py
> 
>  target/hexagon/gen_tcg.h|   9 +-
>  target/hexagon/macros.h |   3 +-
>  target/hexagon/attribs_def.h.inc|   1 -
>  target/hexagon/idef-parser/parser-helpers.c |   8 +-
>  target/hexagon/gen_analyze_funcs.py | 163 +---
>  target/hexagon/gen_helper_funcs.py  | 368 ++--
>  target/hexagon/gen_helper_protos.py | 149 +---
>  target/hexagon/gen_idef_parser_funcs.py |  20 +-
>  target/hexagon/gen_op_regs.py   |   6 +-
>  target/hexagon/gen_tcg_funcs.py | 566 +---
>  target/hexagon/hex_common.py| 921 ++--
>  11 files changed, 964 insertions(+), 1250 deletions(-)
> 
> --
> 2.34.1


Queued - https://github.com/quic/qemu/tree/hex.next


RE: [PATCH v2 0/3] Hexagon (target/hexagon) Use QEMU decodetree

2024-01-15 Thread Brian Cain


> -Original Message-
> From: Taylor Simpson 
> Sent: Monday, January 15, 2024 4:15 PM
> To: qemu-devel@nongnu.org
> Cc: Brian Cain ; Matheus Bernardino (QUIC)
> ; Sid Manning ; Marco
> Liebel (QUIC) ; richard.hender...@linaro.org;
> phi...@linaro.org; a...@rev.ng; a...@rev.ng; ltaylorsimp...@gmail.com
> Subject: [PATCH v2 0/3] Hexagon (target/hexagon) Use QEMU decodetree
> 
> Replace the old Hexagon dectree.py with QEMU decodetree
> 
>  Changes in v2 
> Suggested Python improvements from Brian Cain 
> 
> 
> Taylor Simpson (3):
>   Hexagon (target/hexagon) Use QEMU decodetree (32-bit instructions)
>   Hexagon (target/hexagon) Use QEMU decodetree (16-bit instructions)
>   Hexagon (target/hexagon) Remove old dectree.py
> 
>  target/hexagon/decode.h |   5 +-
>  target/hexagon/opcodes.h|   2 -
>  target/hexagon/decode.c | 435 +++-
>  target/hexagon/gen_dectree_import.c |  49 
>  target/hexagon/opcodes.c|  29 --
>  target/hexagon/translate.c  |   4 +-
>  target/hexagon/README   |  14 +-
>  target/hexagon/dectree.py   | 403 --
>  target/hexagon/gen_decodetree.py| 198 +
>  target/hexagon/gen_trans_funcs.py   | 124 
>  target/hexagon/meson.build  | 147 +-
>  11 files changed, 586 insertions(+), 824 deletions(-)
>  delete mode 100755 target/hexagon/dectree.py
>  create mode 100755 target/hexagon/gen_decodetree.py
>  create mode 100755 target/hexagon/gen_trans_funcs.py
> 
> --
> 2.34.1


Queued - https://github.com/quic/qemu/tree/hex.next




Re: [PATCH v2 06/12] target/riscv/insn_trans/trans_rvvk.c.inc: use 'vlenb'

2024-01-15 Thread Richard Henderson

On 1/16/24 09:25, Daniel Henrique Barboza wrote:

Use s->cfg_ptr->vlenb instead of s->cfg_ptr->vlen / 8.

Signed-off-by: Daniel Henrique Barboza
---
  target/riscv/insn_trans/trans_rvvk.c.inc | 16 
  1 file changed, 8 insertions(+), 8 deletions(-)


Reviewed-by: Richard Henderson 

r~



Re: [RFC PATCH v3 13/30] migration/multifd: Add outgoing QIOChannelFile support

2024-01-15 Thread Peter Xu
On Mon, Nov 27, 2023 at 05:25:55PM -0300, Fabiano Rosas wrote:
> Allow multifd to open file-backed channels. This will be used when
> enabling the fixed-ram migration stream format which expects a
> seekable transport.
> 
> The QIOChannel read and write methods will use the preadv/pwritev
> versions which don't update the file offset at each call so we can
> reuse the fd without re-opening for every channel.
> 
> Note that this is just setup code and multifd cannot yet make use of
> the file channels.
> 
> Signed-off-by: Fabiano Rosas 
> ---
> - open multifd channels with O_WRONLY and no mode
> - stop cancelling migration and propagate error via qio_task
> ---
>  migration/file.c  | 47 +--
>  migration/file.h  |  5 +
>  migration/multifd.c   | 14 +++--
>  migration/options.c   |  7 +++
>  migration/options.h   |  1 +
>  migration/qemu-file.h |  1 -
>  6 files changed, 70 insertions(+), 5 deletions(-)
> 
> diff --git a/migration/file.c b/migration/file.c
> index 5d4975f43e..67d6f42da7 100644
> --- a/migration/file.c
> +++ b/migration/file.c
> @@ -17,6 +17,10 @@
>  
>  #define OFFSET_OPTION ",offset="
>  
> +static struct FileOutgoingArgs {
> +char *fname;
> +} outgoing_args;
> +
>  /* Remove the offset option from @filespec and return it in @offsetp. */
>  
>  int file_parse_offset(char *filespec, uint64_t *offsetp, Error **errp)
> @@ -36,6 +40,42 @@ int file_parse_offset(char *filespec, uint64_t *offsetp, 
> Error **errp)
>  return 0;
>  }
>  
> +static void qio_channel_file_connect_worker(QIOTask *task, gpointer opaque)
> +{
> +/* noop */
> +}
> +
> +int file_send_channel_destroy(QIOChannel *ioc)
> +{
> +if (ioc) {
> +qio_channel_close(ioc, NULL);
> +object_unref(OBJECT(ioc));
> +}
> +g_free(outgoing_args.fname);
> +outgoing_args.fname = NULL;
> +
> +return 0;
> +}
> +
> +void file_send_channel_create(QIOTaskFunc f, void *data)
> +{
> +QIOChannelFile *ioc;
> +QIOTask *task;
> +Error *err = NULL;
> +int flags = O_WRONLY;
> +
> +ioc = qio_channel_file_new_path(outgoing_args.fname, flags, 0, &err);
> +
> +task = qio_task_new(OBJECT(ioc), f, (gpointer)data, NULL);
> +if (!ioc) {
> +qio_task_set_error(task, err);
> +return;
> +}
> +
> +qio_task_run_in_thread(task, qio_channel_file_connect_worker,
> +   (gpointer)data, NULL, NULL);

This is pretty weird.  This invokes a thread, but it'll run a noop.  It
seems meaningless to me.

I assume you wanted to keep using the same async model as the socket-typed
multifd, but I don't think that works anyway, because the file open blocks in
qio_channel_file_new_path(), so it's synchronous either way.

AFAICT we can still share the code, as long as the file path properly invokes
multifd_channel_connect() after the iochannel is set up.
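
A minimal sketch of that synchronous variant (editorial, not from the series):
it reuses the names from the quoted patch, lives conceptually in
migration/file.c, and simplifies error reporting.

void file_send_channel_create(QIOTaskFunc f, void *data)
{
    Error *err = NULL;
    QIOChannelFile *ioc;
    QIOTask *task;

    /* qio_channel_file_new_path() opens the file synchronously, so by the
     * time it returns we already know whether the channel is usable. */
    ioc = qio_channel_file_new_path(outgoing_args.fname, O_WRONLY, 0, &err);
    if (!ioc) {
        /* report however the series prefers; no worker thread is needed */
        error_report_err(err);
        return;
    }

    /* Complete the task inline: 'f' (multifd's connect callback) runs
     * right here instead of in a separate thread. */
    task = qio_task_new(OBJECT(ioc), f, data, NULL);
    qio_task_complete(task);
}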

> +}
> +
>  void file_start_outgoing_migration(MigrationState *s,
> FileMigrationArgs *file_args, Error 
> **errp)
>  {
> @@ -43,15 +83,18 @@ void file_start_outgoing_migration(MigrationState *s,
>  g_autofree char *filename = g_strdup(file_args->filename);
>  uint64_t offset = file_args->offset;
>  QIOChannel *ioc;
> +int flags = O_CREAT | O_TRUNC | O_WRONLY;
> +mode_t mode = 0660;
>  
>  trace_migration_file_outgoing(filename);
>  
> -fioc = qio_channel_file_new_path(filename, O_CREAT | O_WRONLY | O_TRUNC,
> - 0600, errp);
> +fioc = qio_channel_file_new_path(filename, flags, mode, errp);
>  if (!fioc) {
>  return;
>  }
>  
> +outgoing_args.fname = g_strdup(filename);
> +
>  ioc = QIO_CHANNEL(fioc);
>  if (offset && qio_channel_io_seek(ioc, offset, SEEK_SET, errp) < 0) {
>  return;
> diff --git a/migration/file.h b/migration/file.h
> index 37d6a08bfc..511019b319 100644
> --- a/migration/file.h
> +++ b/migration/file.h
> @@ -9,10 +9,15 @@
>  #define QEMU_MIGRATION_FILE_H
>  
>  #include "qapi/qapi-types-migration.h"
> +#include "io/task.h"
> +#include "channel.h"
>  
>  void file_start_incoming_migration(FileMigrationArgs *file_args, Error 
> **errp);
>  
>  void file_start_outgoing_migration(MigrationState *s,
> FileMigrationArgs *file_args, Error 
> **errp);
>  int file_parse_offset(char *filespec, uint64_t *offsetp, Error **errp);
> +
> +void file_send_channel_create(QIOTaskFunc f, void *data);
> +int file_send_channel_destroy(QIOChannel *ioc);
>  #endif
> diff --git a/migration/multifd.c b/migration/multifd.c
> index 123ff0dec0..427740aab6 100644
> --- a/migration/multifd.c
> +++ b/migration/multifd.c
> @@ -17,6 +17,7 @@
>  #include "exec/ramblock.h"
>  #include "qemu/error-report.h"
>  #include "qapi/error.h"
> +#include "file.h"
>  #include "ram.h"
>  #include "migration.h"
>  #include "migration-stats.h"
> @@ -28,6 +29,7 @@
>  #include "threadinfo.h"
>  #include "options.h"
>  #include "qemu/yank.h"
> 

[PULL 06/20] tests/qtest/migration: Print migration incoming errors

2024-01-15 Thread peterx
From: Fabiano Rosas 

We're currently just asserting when incoming migration fails. Let's
print the error message from QMP as well.

Signed-off-by: Fabiano Rosas 
Reviewed-by: Peter Xu 
Link: https://lore.kernel.org/r/20240104142144.9680-6-faro...@suse.de
Signed-off-by: Peter Xu 
---
 tests/qtest/migration-helpers.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/tests/qtest/migration-helpers.c b/tests/qtest/migration-helpers.c
index 37e8e812c5..19384e3fa6 100644
--- a/tests/qtest/migration-helpers.c
+++ b/tests/qtest/migration-helpers.c
@@ -111,6 +111,12 @@ void migrate_incoming_qmp(QTestState *to, const char *uri, 
const char *fmt, ...)
 
 rsp = qtest_qmp(to, "{ 'execute': 'migrate-incoming', 'arguments': %p}",
 args);
+
+if (!qdict_haskey(rsp, "return")) {
+g_autoptr(GString) s = qobject_to_json_pretty(QOBJECT(rsp), true);
+g_test_message("%s", s->str);
+}
+
 g_assert(qdict_haskey(rsp, "return"));
 qobject_unref(rsp);
 
-- 
2.43.0




Re: [PATCH 01/29] include: move include/qapi/qmp/ to include/qobject/

2024-01-15 Thread Zhao Liu
On Mon, Jan 08, 2024 at 06:46:38PM +, Daniel P. Berrangé wrote:
> Date: Mon, 8 Jan 2024 18:46:38 +
> From: "Daniel P. Berrangé" 
> Subject: Re: [PATCH 01/29] include: move include/qapi/qmp/ to
>  include/qobject/
> 
> On Mon, Jan 08, 2024 at 06:23:37PM +, Daniel P. Berrangé wrote:
> > The general expectation is that header files should follow the same
> > file/path naming scheme as the corresponding source file. There are
> > various historical exceptions to this practice in QEMU, with one of
> > the most notable being the include/qapi/qmp/ directory. Most of the
> > headers there correspond to source files in qobject/.
> > 
> > This patch corrects that inconsistency by creating include/qobject/.
> > The only outlier is include/qapi/qmp/dispatch.h which gets renamed
> > to include/qapi/qmp-registry.h.
> > 
> > To allow the code to continue to build, symlinks are temporarily
> > added in $QEMU/qapi/qmp/ to point to the new location. They will
> > be removed in a later commit.
> > 
> > Signed-off-by: Daniel P. Berrangé 
> > ---
> >  MAINTAINERS | 5 +
> >  include/qapi/{qmp/dispatch.h => qmp-registry.h} | 0
> >  include/{qapi/qmp => qobject}/json-parser.h | 0
> >  include/{qapi/qmp => qobject}/json-writer.h | 0
> >  include/{qapi/qmp => qobject}/qbool.h   | 0
> >  include/{qapi/qmp => qobject}/qdict.h   | 0
> >  include/{qapi/qmp => qobject}/qerror.h  | 0
> 
> Of course just after sending this I decided that moving qerror.h
> to qobject/ is probably not optimal. It only contains a set of
> (deprecated) error message strings. Perhaps it could just move
> from qapi/qmp/qerror.h to just qapi/qerror.h ? Other suggestions ?

From the naming style ("q" + module name) and the content comments
(described as a module), qerror.h (as an error module starting with
q) seems to fit more neatly together with the other qmodules such as
qbool.h, qdict.h, qlist.h, etc.

There is already an error.h under include/qapi, which is supposed to
be the developer's first choice, and it seems a bit confusing to have
qerror.h in the same directory as error.h (even though it states that
qerror.h will be deprecated)?

Regards,
Zhao

> 
> >  include/{qapi/qmp => qobject}/qjson.h   | 0
> >  include/{qapi/qmp => qobject}/qlist.h   | 0
> >  include/{qapi/qmp => qobject}/qlit.h| 0
> >  include/{qapi/qmp => qobject}/qnull.h   | 0
> >  include/{qapi/qmp => qobject}/qnum.h| 0
> >  include/{qapi/qmp => qobject}/qobject.h | 0
> >  include/{qapi/qmp => qobject}/qstring.h | 0
> 
> With regards,
> Daniel
> -- 
> |: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org -o-https://fstop138.berrange.com :|
> |: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|
> 
> 



[PULL 13/20] docs/migration: Split "Backwards compatibility" separately

2024-01-15 Thread peterx
From: Peter Xu 

Split the section from main.rst into a separate file.  Reference it in the
index.rst.

Reviewed-by: Cédric Le Goater 
Link: https://lore.kernel.org/r/20240109064628.595453-5-pet...@redhat.com
Signed-off-by: Peter Xu 
---
 docs/devel/migration/compatibility.rst | 517 
 docs/devel/migration/index.rst |   1 +
 docs/devel/migration/main.rst  | 519 -
 3 files changed, 518 insertions(+), 519 deletions(-)
 create mode 100644 docs/devel/migration/compatibility.rst

diff --git a/docs/devel/migration/compatibility.rst 
b/docs/devel/migration/compatibility.rst
new file mode 100644
index 00..5a5417ef06
--- /dev/null
+++ b/docs/devel/migration/compatibility.rst
@@ -0,0 +1,517 @@
+Backwards compatibility
+===
+
+How backwards compatibility works
+-
+
+When we do migration, we have two QEMU processes: the source and the
+target.  There are two cases, they are the same version or they are
+different versions.  The easy case is when they are the same version.
+The difficult one is when they are different versions.
+
+There are two things that are different, but they have very similar
+names and sometimes get confused:
+
+- QEMU version
+- machine type version
+
+Let's start with a practical example, we start with:
+
+- qemu-system-x86_64 (v5.2), from now on qemu-5.2.
+- qemu-system-x86_64 (v5.1), from now on qemu-5.1.
+
+Related to this are the "latest" machine types defined on each of
+them:
+
+- pc-q35-5.2 (newer one in qemu-5.2) from now on pc-5.2
+- pc-q35-5.1 (newer one in qemu-5.1) from now on pc-5.1
+
+First of all, migration is only supposed to work if you use the same
+machine type in both source and destination. The QEMU hardware
+configuration needs to be the same also on source and destination.
+Most aspects of the backend configuration can be changed at will,
+except for a few cases where the backend features influence frontend
+device feature exposure.  But that is not relevant for this section.
+
+I am going to list the number of combinations that we can have.  Let's
+start with the trivial ones, QEMU is the same on source and
+destination:
+
+1 - qemu-5.2 -M pc-5.2  -> migrates to -> qemu-5.2 -M pc-5.2
+
+  This is the latest QEMU with the latest machine type.
+  This have to work, and if it doesn't work it is a bug.
+
+2 - qemu-5.1 -M pc-5.1  -> migrates to -> qemu-5.1 -M pc-5.1
+
+  Exactly the same case than the previous one, but for 5.1.
+  Nothing to see here either.
+
+This are the easiest ones, we will not talk more about them in this
+section.
+
+Now we start with the more interesting cases.  Consider the case where
+we have the same QEMU version in both sides (qemu-5.2) but we are using
+the latest machine type for that version (pc-5.2) but one of an older
+QEMU version, in this case pc-5.1.
+
+3 - qemu-5.2 -M pc-5.1  -> migrates to -> qemu-5.2 -M pc-5.1
+
+  It needs to use the definition of pc-5.1 and the devices as they
+  were configured on 5.1, but this should be easy in the sense that
+  both sides are the same QEMU and both sides have exactly the same
+  idea of what the pc-5.1 machine is.
+
+4 - qemu-5.1 -M pc-5.2  -> migrates to -> qemu-5.1 -M pc-5.2
+
+  This combination is not possible as the qemu-5.1 doesn't understand
+  pc-5.2 machine type.  So nothing to worry here.
+
+Now it comes the interesting ones, when both QEMU processes are
+different.  Notice also that the machine type needs to be pc-5.1,
+because we have the limitation than qemu-5.1 doesn't know pc-5.2.  So
+the possible cases are:
+
+5 - qemu-5.2 -M pc-5.1  -> migrates to -> qemu-5.1 -M pc-5.1
+
+  This migration is known as newer to older.  We need to make sure
+  when we are developing 5.2 we need to take care about not to break
+  migration to qemu-5.1.  Notice that we can't make updates to
+  qemu-5.1 to understand whatever qemu-5.2 decides to change, so it is
+  in qemu-5.2 side to make the relevant changes.
+
+6 - qemu-5.1 -M pc-5.1  -> migrates to -> qemu-5.2 -M pc-5.1
+
+  This migration is known as older to newer.  We need to make sure
+  than we are able to receive migrations from qemu-5.1. The problem is
+  similar to the previous one.
+
+If qemu-5.1 and qemu-5.2 were the same, there will not be any
+compatibility problems.  But the reason that we create qemu-5.2 is to
+get new features, devices, defaults, etc.
+
+If we get a device that has a new feature, or change a default value,
+we have a problem when we try to migrate between different QEMU
+versions.
+
+So we need a way to tell qemu-5.2 that when we are using machine type
+pc-5.1, it needs to **not** use the feature, to be able to migrate to
+real qemu-5.1.
+
+And the equivalent part when migrating from qemu-5.1 to qemu-5.2.
+qemu-5.2 has to expect that it is not going to get data for the new
+feature, because qemu-5.1 doesn't know about it.
+
+How do we tell QEMU about these device feature changes?  In
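
A minimal editorial sketch (not part of the patch) of the mechanism this
section is leading up to: a compat property attached to the versioned machine
type. The device name, property name and state struct below are hypothetical;
GlobalProperty, DEFINE_PROP_BOOL and the hw_compat_* arrays are the real QEMU
building blocks.

/* In the device: the new behaviour defaults to on for new machine types. */
static Property my_dev_properties[] = {
    DEFINE_PROP_BOOL("x-new-feature", MyDevState, new_feature, true),
    DEFINE_PROP_END_OF_LIST(),
};

/* For pc-5.1 (even when run by qemu-5.2) the feature is forced off, so the
 * guest-visible device matches what real qemu-5.1 provides. */
GlobalProperty hw_compat_5_1[] = {
    { "my-dev", "x-new-feature", "off" },
};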

[PULL 14/20] docs/migration: Split "Debugging" and "Firmware"

2024-01-15 Thread peterx
From: Peter Xu 

Move the two sections into a separate file called "best-practices.rst"
and add the entry to the index.

Reviewed-by: Cédric Le Goater 
Link: https://lore.kernel.org/r/20240109064628.595453-6-pet...@redhat.com
Signed-off-by: Peter Xu 
---
 docs/devel/migration/best-practices.rst | 48 +
 docs/devel/migration/index.rst  |  1 +
 docs/devel/migration/main.rst   | 44 ---
 3 files changed, 49 insertions(+), 44 deletions(-)
 create mode 100644 docs/devel/migration/best-practices.rst

diff --git a/docs/devel/migration/best-practices.rst 
b/docs/devel/migration/best-practices.rst
new file mode 100644
index 00..d7c34a3014
--- /dev/null
+++ b/docs/devel/migration/best-practices.rst
@@ -0,0 +1,48 @@
+==
+Best practices
+==
+
+Debugging
+=
+
+The migration stream can be analyzed thanks to 
``scripts/analyze-migration.py``.
+
+Example usage:
+
+.. code-block:: shell
+
+  $ qemu-system-x86_64 -display none -monitor stdio
+  (qemu) migrate "exec:cat > mig"
+  (qemu) q
+  $ ./scripts/analyze-migration.py -f mig
+  {
+"ram (3)": {
+"section sizes": {
+"pc.ram": "0x0800",
+  ...
+
+See also ``analyze-migration.py -h`` help for more options.
+
+Firmware
+
+
+Migration migrates the copies of RAM and ROM, and thus when running
+on the destination it includes the firmware from the source. Even after
+resetting a VM, the old firmware is used.  Only once QEMU has been restarted
+is the new firmware in use.
+
+- Changes in firmware size can cause changes in the required RAMBlock size
+  to hold the firmware and thus migration can fail.  In practice it's best
+  to pad firmware images to convenient powers of 2 with plenty of space
+  for growth.
+
+- Care should be taken with device emulation code so that newer
+  emulation code can work with older firmware to allow forward migration.
+
+- Care should be taken with newer firmware so that backward migration
+  to older systems with older device emulation code will work.
+
+In some cases it may be best to tie specific firmware versions to specific
+versioned machine types to cut down on the combinations that will need
+support.  This is also useful when newer versions of firmware outgrow
+the padding.
diff --git a/docs/devel/migration/index.rst b/docs/devel/migration/index.rst
index 7fc02b9520..9a8fd1ead7 100644
--- a/docs/devel/migration/index.rst
+++ b/docs/devel/migration/index.rst
@@ -11,3 +11,4 @@ QEMU live migration works.
compatibility
vfio
virtio
+   best-practices
diff --git a/docs/devel/migration/main.rst b/docs/devel/migration/main.rst
index b3e31bb52f..97811ce371 100644
--- a/docs/devel/migration/main.rst
+++ b/docs/devel/migration/main.rst
@@ -52,27 +52,6 @@ All these migration protocols use the same infrastructure to
 save/restore state devices.  This infrastructure is shared with the
 savevm/loadvm functionality.
 
-Debugging
-=
-
-The migration stream can be analyzed thanks to 
``scripts/analyze-migration.py``.
-
-Example usage:
-
-.. code-block:: shell
-
-  $ qemu-system-x86_64 -display none -monitor stdio
-  (qemu) migrate "exec:cat > mig"
-  (qemu) q
-  $ ./scripts/analyze-migration.py -f mig
-  {
-"ram (3)": {
-"section sizes": {
-"pc.ram": "0x0800",
-  ...
-
-See also ``analyze-migration.py -h`` help for more options.
-
 Common infrastructure
 =
 
@@ -970,26 +949,3 @@ the background migration channel.  Anyone who cares about 
latencies of page
 faults during a postcopy migration should enable this feature.  By default,
 it's not enabled.
 
-Firmware
-
-
-Migration migrates the copies of RAM and ROM, and thus when running
-on the destination it includes the firmware from the source. Even after
-resetting a VM, the old firmware is used.  Only once QEMU has been restarted
-is the new firmware in use.
-
-- Changes in firmware size can cause changes in the required RAMBlock size
-  to hold the firmware and thus migration can fail.  In practice it's best
-  to pad firmware images to convenient powers of 2 with plenty of space
-  for growth.
-
-- Care should be taken with device emulation code so that newer
-  emulation code can work with older firmware to allow forward migration.
-
-- Care should be taken with newer firmware so that backward migration
-  to older systems with older device emulation code will work.
-
-In some cases it may be best to tie specific firmware versions to specific
-versioned machine types to cut down on the combinations that will need
-support.  This is also useful when newer versions of firmware outgrow
-the padding.
-- 
2.43.0




[PULL 05/20] migration: Report error in incoming migration

2024-01-15 Thread peterx
From: Fabiano Rosas 

We're not currently reporting the errors set with migrate_set_error()
when incoming migration fails.

Signed-off-by: Fabiano Rosas 
Reviewed-by: Peter Xu 
Link: https://lore.kernel.org/r/20240104142144.9680-5-faro...@suse.de
Signed-off-by: Peter Xu 
---
 migration/migration.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/migration/migration.c b/migration/migration.c
index 2365a3a13c..219447dea1 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -697,6 +697,13 @@ process_incoming_migration_co(void *opaque)
 }
 
 if (ret < 0) {
+MigrationState *s = migrate_get_current();
+
+if (migrate_has_error(s)) {
+WITH_QEMU_LOCK_GUARD(&s->error_mutex) {
+error_report_err(s->error);
+}
+}
 error_report("load of migration failed: %s", strerror(-ret));
 goto fail;
 }
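
An editorial sketch of the WITH_QEMU_LOCK_GUARD idiom used in the hunk above:
the mutex is held for the braced scope and released automatically on every
exit path, so no explicit unlock or cleanup label is needed. The mutex and
counter here are illustrative only.

static QemuMutex lock;   /* initialised elsewhere with qemu_mutex_init() */
static int counter;

static void bump(void)
{
    WITH_QEMU_LOCK_GUARD(&lock) {
        counter++;
        if (counter > 10) {
            return;   /* the guard still unlocks here */
        }
    }                 /* ... and here on the normal path */
}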
-- 
2.43.0




[PULL 20/20] migration/rdma: define htonll/ntohll only if not predefined

2024-01-15 Thread peterx
From: Nick Briggs 

Solaris has #defines for htonll and ntohll which cause syntax errors
when compiling code that attempts to (re)define these functions.

Signed-off-by: Nick Briggs 
Link: https://lore.kernel.org/r/65a04a7d.497ab3.3e7be...@gateway.sonic.net
Signed-off-by: Peter Xu 
---
 migration/rdma.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/migration/rdma.c b/migration/rdma.c
index 94c0f871f0..a355dcea89 100644
--- a/migration/rdma.c
+++ b/migration/rdma.c
@@ -238,6 +238,7 @@ static const char *control_desc(unsigned int rdma_control)
 return strs[rdma_control];
 }
 
+#if !defined(htonll)
 static uint64_t htonll(uint64_t v)
 {
 union { uint32_t lv[2]; uint64_t llv; } u;
@@ -245,13 +246,16 @@ static uint64_t htonll(uint64_t v)
 u.lv[1] = htonl(v & 0xULL);
 return u.llv;
 }
+#endif
 
+#if !defined(ntohll)
 static uint64_t ntohll(uint64_t v)
 {
 union { uint32_t lv[2]; uint64_t llv; } u;
 u.llv = v;
 return ((uint64_t)ntohl(u.lv[0]) << 32) | (uint64_t) ntohl(u.lv[1]);
 }
+#endif
 
 static void dest_block_to_network(RDMADestBlock *db)
 {
-- 
2.43.0




[PULL 12/20] docs/migration: Convert virtio.txt into rST

2024-01-15 Thread peterx
From: Peter Xu 

Convert the plain old .txt into .rst and add it to migration/index.rst.

Reviewed-by: Cédric Le Goater 
Link: https://lore.kernel.org/r/20240109064628.595453-4-pet...@redhat.com
Signed-off-by: Peter Xu 
---
 docs/devel/migration/index.rst  |   1 +
 docs/devel/migration/virtio.rst | 115 
 docs/devel/migration/virtio.txt | 108 --
 3 files changed, 116 insertions(+), 108 deletions(-)
 create mode 100644 docs/devel/migration/virtio.rst
 delete mode 100644 docs/devel/migration/virtio.txt

diff --git a/docs/devel/migration/index.rst b/docs/devel/migration/index.rst
index 02cfdcc969..2cb701c77c 100644
--- a/docs/devel/migration/index.rst
+++ b/docs/devel/migration/index.rst
@@ -9,3 +9,4 @@ QEMU live migration works.
 
main
vfio
+   virtio
diff --git a/docs/devel/migration/virtio.rst b/docs/devel/migration/virtio.rst
new file mode 100644
index 00..611a18b821
--- /dev/null
+++ b/docs/devel/migration/virtio.rst
@@ -0,0 +1,115 @@
+===
+Virtio device migration
+===
+
+Copyright 2015 IBM Corp.
+
+This work is licensed under the terms of the GNU GPL, version 2 or later.  See
+the COPYING file in the top-level directory.
+
+Saving and restoring the state of virtio devices is a bit of a twisty maze,
+for several reasons:
+
+- state is distributed between several parts:
+
+  - virtio core, for common fields like features, number of queues, ...
+
+  - virtio transport (pci, ccw, ...), for the different proxy devices and
+transport specific state (msix vectors, indicators, ...)
+
+  - virtio device (net, blk, ...), for the different device types and their
+state (mac address, request queue, ...)
+
+- most fields are saved via the stream interface; subsequently, subsections
+  have been added to make cross-version migration possible
+
+This file attempts to document the current procedure and point out some
+caveats.
+
+Save state procedure
+
+
+::
+
+  virtio core   virtio transport  virtio device
+  ---     -
+
+  save() function 
registered
+  via VMState wrapper on
+  device class
+  virtio_save()   <--
+   -->  save_config()
+- save proxy device
+- save transport-specific
+  device fields
+  - save common device
+fields
+  - save common virtqueue
+fields
+   -->  save_queue()
+- save transport-specific
+  virtqueue fields
+   -->   save_device()
+ - save device-specific
+   fields
+  - save subsections
+- device endianness,
+  if changed from
+  default endianness
+- 64 bit features, if
+  any high feature bit
+  is set
+- virtio-1 virtqueue
+  fields, if VERSION_1
+  is set
+
+Load state procedure
+
+
+::
+
+  virtio core   virtio transport  virtio device
+  ---     -
+
+  load() function 
registered
+  via VMState wrapper on
+  device class
+  virtio_load()   <--
+   -->  load_config()
+- load proxy device
+- load transport-specific
+  device fields
+  - load common device
+fields
+  - load common virtqueue
+fields
+   -->  load_queue()
+- load transport-specific
+  virtqueue fields
+  - notify guest
+   -->   load_device()
+ - load device-specific
+   fields
+  - load subsections
+- device endianness
+- 64 bit features
+- virtio-1 virtqueue
+  fields
+  - sanitize endianness
+  - sanitize features
+  - virtqueue index sanity
+check
+ - feature-dependent setup
+
+Implications of this setup
+==
+
+Devices need to be careful in their state processing during load: The
+load_device() procedure is invoked by the core before subsections have
+been loaded. Any code that depends on information transmitted in subsections
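
An editorial sketch (not part of the patch) of what the "save_device()" /
"load_device()" steps above typically look like for a device. The device
type, cast macro and field are hypothetical, and the exact class-hook wiring
varies between QEMU versions.

static void my_virtio_save_device(VirtIODevice *vdev, QEMUFile *f)
{
    MyVirtioDevice *d = MY_VIRTIO_DEVICE(vdev);   /* hypothetical cast */

    /* device-specific fields, written after the common virtqueue state */
    qemu_put_be32(f, d->request_count);
}

static int my_virtio_load_device(VirtIODevice *vdev, QEMUFile *f,
                                 int version_id)
{
    MyVirtioDevice *d = MY_VIRTIO_DEVICE(vdev);

    /* Careful: subsections (endianness, 64-bit features, ...) have not
     * been loaded yet when this runs; see the caveat above. */
    d->request_count = qemu_get_be32(f);
    return 0;
}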

[PULL 18/20] docs/migration: Further move vfio to be feature of migration

2024-01-15 Thread peterx
From: Peter Xu 

Move it one layer down, treating VFIO migration as a feature of
migration.

Cc: Alex Williamson 
Cc: Cédric Le Goater 
Reviewed-by: Cédric Le Goater 
Link: https://lore.kernel.org/r/20240109064628.595453-10-pet...@redhat.com
Signed-off-by: Peter Xu 
---
 docs/devel/migration/features.rst | 1 +
 docs/devel/migration/index.rst| 1 -
 2 files changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/devel/migration/features.rst 
b/docs/devel/migration/features.rst
index e257d0d100..dea016f707 100644
--- a/docs/devel/migration/features.rst
+++ b/docs/devel/migration/features.rst
@@ -8,3 +8,4 @@ Migration has plenty of features to support different use cases.
 
postcopy
dirty-limit
+   vfio
diff --git a/docs/devel/migration/index.rst b/docs/devel/migration/index.rst
index 21ad58b189..b1357309e1 100644
--- a/docs/devel/migration/index.rst
+++ b/docs/devel/migration/index.rst
@@ -10,6 +10,5 @@ QEMU live migration works.
main
features
compatibility
-   vfio
virtio
best-practices
-- 
2.43.0




[PULL 16/20] docs/migration: Split "dirty limit"

2024-01-15 Thread peterx
From: Peter Xu 

Split it into a separate file and put it under "features".

Cc: Yong Huang 
Reviewed-by: Cédric Le Goater 
Link: https://lore.kernel.org/r/20240109064628.595453-8-pet...@redhat.com
Signed-off-by: Peter Xu 
---
 docs/devel/migration/dirty-limit.rst | 71 
 docs/devel/migration/features.rst|  1 +
 docs/devel/migration/main.rst| 71 
 3 files changed, 72 insertions(+), 71 deletions(-)
 create mode 100644 docs/devel/migration/dirty-limit.rst

diff --git a/docs/devel/migration/dirty-limit.rst 
b/docs/devel/migration/dirty-limit.rst
new file mode 100644
index 00..8f32329d5f
--- /dev/null
+++ b/docs/devel/migration/dirty-limit.rst
@@ -0,0 +1,71 @@
+Dirty limit
+===
+
+The dirty limit, short for dirty page rate upper limit, is a new capability
+introduced in the 8.1 QEMU release that uses a new algorithm based on the KVM
+dirty ring to throttle down the guest during live migration.
+
+The algorithm framework is as follows:
+
+::
+
+  
--
+  main   --> throttle thread > PREPARE(1) <
+  thread  \|  |
+   \   |  |
+\  V  |
+ -\CALCULATE(2)   |
+   \   |  |
+\  |  |
+ \ V  |
+  \SET PENALTY(3) -
+   -\  |
+ \ |
+  \V
+   -> virtual CPU thread ---> ACCEPT PENALTY(4)
+  
--
+
+When the qmp command qmp_set_vcpu_dirty_limit is called for the first time,
+the QEMU main thread starts the throttle thread. The throttle thread, once
+launched, executes the loop, which consists of three steps:
+
+  - PREPARE (1)
+
+ The entire work of PREPARE (1) is preparation for the second stage,
+ CALCULATE(2), as the name implies. It involves preparing the dirty
+ page rate value and the corresponding upper limit of the VM:
+ The dirty page rate is calculated via the KVM dirty ring mechanism,
+ which tells QEMU how many dirty pages a virtual CPU has had since the
+ last KVM_EXIT_DIRTY_RING_FULL exception; The dirty page rate upper
+ limit is specified by caller, therefore fetch it directly.
+
+  - CALCULATE (2)
+
+ Calculate a suitable sleep period for each virtual CPU, which will be
+ used to determine the penalty for the target virtual CPU. The
+ computation must be done carefully in order to reduce the dirty page
+ rate progressively down to the upper limit without oscillation. To
+ achieve this, two strategies are provided: the first is to add or
+ subtract sleep time based on the ratio of the current dirty page rate
+ to the limit, which is used when the current dirty page rate is far
+ from the limit; the second is to add or subtract a fixed time when
+ the current dirty page rate is close to the limit.
+
+  - SET PENALTY (3)
+
+ Set the sleep time for each virtual CPU that should be penalized based
+ on the results of the calculation supplied by step CALCULATE (2).
+
+After completing the three above stages, the throttle thread loops back
+to step PREPARE (1) until the dirty limit is reached.
+
+On the other hand, each virtual CPU thread reads the sleep duration and
+sleeps in the path of the KVM_EXIT_DIRTY_RING_FULL exception handler, that
+is ACCEPT PENALTY (4). Virtual CPUs tied with writing processes will
+obviously exit to the path and get penalized, whereas virtual CPUs involved
+with read processes will not.
+
+In summary, thanks to the KVM dirty ring technology, the dirty limit
+algorithm will restrict virtual CPUs as needed to keep their dirty page
+rate inside the limit. This leads to more steady reading performance during
+live migration and can aid in improving large guest responsiveness.
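
An editorial sketch (not part of the patch) of the throttle-thread loop
described above; every helper name here is made up for illustration and does
not match migration/dirty-limit.c exactly.

static void *throttle_thread_fn(void *opaque)
{
    while (dirty_limit_in_service()) {
        /* PREPARE (1): dirty page rate from the KVM dirty ring, limit
         * from the caller of qmp_set_vcpu_dirty_limit. */
        uint64_t rate  = current_dirty_page_rate();
        uint64_t limit = configured_dirty_rate_limit();

        /* CALCULATE (2): derive a per-vCPU sleep period; coarse steps
         * far from the limit, fine steps close to it. */
        int64_t sleep_us = calculate_penalty(rate, limit);

        /* SET PENALTY (3): vCPUs pick this up and sleep in the
         * KVM_EXIT_DIRTY_RING_FULL exit path, i.e. ACCEPT PENALTY (4). */
        set_vcpu_sleep_time(sleep_us);
    }
    return NULL;
}
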
diff --git a/docs/devel/migration/features.rst 
b/docs/devel/migration/features.rst
index 0054e0c900..e257d0d100 100644
--- a/docs/devel/migration/features.rst
+++ b/docs/devel/migration/features.rst
@@ -7,3 +7,4 @@ Migration has plenty of features to support different use cases.
:maxdepth: 2
 
postcopy
+   dirty-limit
diff --git a/docs/devel/migration/main.rst b/docs/devel/migration/main.rst
index 051ea43f0e..00b9c3d32f 100644
--- a/docs/devel/migration/main.rst
+++ b/docs/devel/migration/main.rst
@@ 

[PULL 15/20] docs/migration: Split "Postcopy"

2024-01-15 Thread peterx
From: Peter Xu 

Split postcopy into a separate file.  Introduce a landing page,
"features.rst", to collect all the features built on top of the migration
framework.

Reviewed-by: Cédric Le Goater 
Link: https://lore.kernel.org/r/20240109064628.595453-7-pet...@redhat.com
Signed-off-by: Peter Xu 
---
 docs/devel/migration/features.rst |   9 +
 docs/devel/migration/index.rst|   1 +
 docs/devel/migration/main.rst | 305 --
 docs/devel/migration/postcopy.rst | 304 +
 4 files changed, 314 insertions(+), 305 deletions(-)
 create mode 100644 docs/devel/migration/features.rst
 create mode 100644 docs/devel/migration/postcopy.rst

diff --git a/docs/devel/migration/features.rst 
b/docs/devel/migration/features.rst
new file mode 100644
index 00..0054e0c900
--- /dev/null
+++ b/docs/devel/migration/features.rst
@@ -0,0 +1,9 @@
+Migration features
+==
+
+Migration has plenty of features to support different use cases.
+
+.. toctree::
+   :maxdepth: 2
+
+   postcopy
diff --git a/docs/devel/migration/index.rst b/docs/devel/migration/index.rst
index 9a8fd1ead7..21ad58b189 100644
--- a/docs/devel/migration/index.rst
+++ b/docs/devel/migration/index.rst
@@ -8,6 +8,7 @@ QEMU live migration works.
:maxdepth: 2
 
main
+   features
compatibility
vfio
virtio
diff --git a/docs/devel/migration/main.rst b/docs/devel/migration/main.rst
index 97811ce371..051ea43f0e 100644
--- a/docs/devel/migration/main.rst
+++ b/docs/devel/migration/main.rst
@@ -644,308 +644,3 @@ algorithm will restrict virtual CPUs as needed to keep 
their dirty page
 rate inside the limit. This leads to more steady reading performance during
 live migration and can aid in improving large guest responsiveness.
 
-Postcopy
-
-
-'Postcopy' migration is a way to deal with migrations that refuse to converge
-(or take too long to converge) its plus side is that there is an upper bound on
-the amount of migration traffic and time it takes, the down side is that during
-the postcopy phase, a failure of *either* side causes the guest to be lost.
-
-In postcopy the destination CPUs are started before all the memory has been
-transferred, and accesses to pages that are yet to be transferred cause
-a fault that's translated by QEMU into a request to the source QEMU.
-
-Postcopy can be combined with precopy (i.e. normal migration) so that if 
precopy
-doesn't finish in a given time the switch is made to postcopy.
-
-Enabling postcopy
--
-
-To enable postcopy, issue this command on the monitor (both source and
-destination) prior to the start of migration:
-
-``migrate_set_capability postcopy-ram on``
-
-The normal commands are then used to start a migration, which is still
-started in precopy mode.  Issuing:
-
-``migrate_start_postcopy``
-
-will now cause the transition from precopy to postcopy.
-It can be issued immediately after migration is started or any
-time later on.  Issuing it after the end of a migration is harmless.
-
-Blocktime is a postcopy live migration metric, intended to show how
-long the vCPU was in state of interruptible sleep due to pagefault.
-That metric is calculated both for all vCPUs as overlapped value, and
-separately for each vCPU. These values are calculated on destination
-side.  To enable postcopy blocktime calculation, enter following
-command on destination monitor:
-
-``migrate_set_capability postcopy-blocktime on``
-
-Postcopy blocktime can be retrieved by query-migrate qmp command.
-postcopy-blocktime value of qmp command will show overlapped blocking
-time for all vCPU, postcopy-vcpu-blocktime will show list of blocking
-time per vCPU.
-
-.. note::
-  During the postcopy phase, the bandwidth limits set using
-  ``migrate_set_parameter`` is ignored (to avoid delaying requested pages that
-  the destination is waiting for).
-
-Postcopy device transfer
-
-
-Loading of device data may cause the device emulation to access guest RAM
-that may trigger faults that have to be resolved by the source, as such
-the migration stream has to be able to respond with page data *during* the
-device load, and hence the device data has to be read from the stream 
completely
-before the device load begins to free the stream up.  This is achieved by
-'packaging' the device data into a blob that's read in one go.
-
-Source behaviour
-
-
-Until postcopy is entered the migration stream is identical to normal
-precopy, except for the addition of a 'postcopy advise' command at
-the beginning, to tell the destination that postcopy might happen.
-When postcopy starts the source sends the page discard data and then
-forms the 'package' containing:
-
-   - Command: 'postcopy listen'
-   - The device state
-
- A series of sections, identical to the precopy streams device state stream
- containing everything except postcopiable devices (i.e. RAM)
-   - Command: 'postcopy run'
-
-The 'package' is sent as the data 

[PULL 02/20] migration/multifd: Remove MultiFDPages_t::packet_num

2024-01-15 Thread peterx
From: Fabiano Rosas 

This was introduced by commit 34c55a94b1 ("migration: Create multipage
support") and never used.

Signed-off-by: Fabiano Rosas 
Reviewed-by: Peter Xu 
Link: https://lore.kernel.org/r/20240104142144.9680-2-faro...@suse.de
Signed-off-by: Peter Xu 
---
 migration/multifd.h | 2 --
 migration/multifd.c | 1 -
 2 files changed, 3 deletions(-)

diff --git a/migration/multifd.h b/migration/multifd.h
index a835643b48..b0ff610c37 100644
--- a/migration/multifd.h
+++ b/migration/multifd.h
@@ -58,8 +58,6 @@ typedef struct {
 uint32_t num;
 /* number of allocated pages */
 uint32_t allocated;
-/* global number of generated multifd packets */
-uint64_t packet_num;
 /* offset of each page */
 ram_addr_t *offset;
 RAMBlock *block;
diff --git a/migration/multifd.c b/migration/multifd.c
index 9f353aecfa..3e650f5da0 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -250,7 +250,6 @@ static void multifd_pages_clear(MultiFDPages_t *pages)
 {
 pages->num = 0;
 pages->allocated = 0;
-pages->packet_num = 0;
 pages->block = NULL;
 g_free(pages->offset);
 pages->offset = NULL;
-- 
2.43.0




[PULL 01/20] migration: Simplify initial conditionals in migration for better readability

2024-01-15 Thread peterx
From: Het Gala 

The initial conditional statements in the QMP migration functions are
harder to understand than necessary. It is better to get all errors out
of the way at the beginning, for better readability and error handling.
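
The rewritten check below relies on '!' normalizing each pointer to a
boolean; a small sketch of the idiom (the helper name is made up):

#include <stdbool.h>

/* True when exactly one of the two pointers is non-NULL. */
static bool exactly_one_of(const void *uri, const void *channels)
{
    return !uri != !channels;
}

/* ... so "if (!uri == !channels)" rejects both "none given" and
 * "both given" in a single condition. */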

Signed-off-by: Het Gala 
Suggested-by: Markus Armbruster 
Reviewed-by: Fabiano Rosas 
Link: https://lore.kernel.org/r/20231205080039.197615-1-het.g...@nutanix.com
Signed-off-by: Peter Xu 
---
 migration/migration.c | 36 
 1 file changed, 16 insertions(+), 20 deletions(-)

diff --git a/migration/migration.c b/migration/migration.c
index 98c5c3e140..2365a3a13c 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -523,28 +523,26 @@ static void qemu_start_incoming_migration(const char 
*uri, bool has_channels,
 /*
  * Having preliminary checks for uri and channel
  */
-if (uri && has_channels) {
-error_setg(errp, "'uri' and 'channels' arguments are mutually "
-   "exclusive; exactly one of the two should be present in "
-   "'migrate-incoming' qmp command ");
+if (!uri == !channels) {
+error_setg(errp, "need either 'uri' or 'channels' argument");
 return;
-} else if (channels) {
+}
+
+if (channels) {
 /* To verify that Migrate channel list has only item */
 if (channels->next) {
 error_setg(errp, "Channel list has more than one entries");
 return;
 }
 addr = channels->value->addr;
-} else if (uri) {
+}
+
+if (uri) {
 /* caller uses the old URI syntax */
 if (!migrate_uri_parse(uri, &channel, errp)) {
 return;
 }
 addr = channel->addr;
-} else {
-error_setg(errp, "neither 'uri' or 'channels' argument are "
-   "specified in 'migrate-incoming' qmp command ");
-return;
 }
 
 /* transport mechanism not suitable for migration? */
@@ -1924,28 +1922,26 @@ void qmp_migrate(const char *uri, bool has_channels,
 /*
  * Having preliminary checks for uri and channel
  */
-if (uri && has_channels) {
-error_setg(errp, "'uri' and 'channels' arguments are mutually "
-   "exclusive; exactly one of the two should be present in "
-   "'migrate' qmp command ");
+if (!uri == !channels) {
+error_setg(errp, "need either 'uri' or 'channels' argument");
 return;
-} else if (channels) {
+}
+
+if (channels) {
 /* To verify that Migrate channel list has only item */
 if (channels->next) {
 error_setg(errp, "Channel list has more than one entries");
 return;
 }
 addr = channels->value->addr;
-} else if (uri) {
+}
+
+if (uri) {
 /* caller uses the old URI syntax */
 if (!migrate_uri_parse(uri, &channel, errp)) {
 return;
 }
 addr = channel->addr;
-} else {
-error_setg(errp, "neither 'uri' or 'channels' argument are "
-   "specified in 'migrate' qmp command ");
-return;
 }
 
 /* transport mechanism not suitable for migration? */
-- 
2.43.0




[PULL 19/20] docs/migration: Further move virtio to be feature of migration

2024-01-15 Thread peterx
From: Peter Xu 

Move it one layer down, treating virtio migration as a feature of
migration.

Cc: "Michael S. Tsirkin" 
Cc: Jason Wang 
Reviewed-by: Cédric Le Goater 
Link: https://lore.kernel.org/r/20240109064628.595453-11-pet...@redhat.com
Signed-off-by: Peter Xu 
---
 docs/devel/migration/features.rst | 1 +
 docs/devel/migration/index.rst| 1 -
 2 files changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/devel/migration/features.rst 
b/docs/devel/migration/features.rst
index dea016f707..a9acaf618e 100644
--- a/docs/devel/migration/features.rst
+++ b/docs/devel/migration/features.rst
@@ -9,3 +9,4 @@ Migration has plenty of features to support different use cases.
postcopy
dirty-limit
vfio
+   virtio
diff --git a/docs/devel/migration/index.rst b/docs/devel/migration/index.rst
index b1357309e1..2aa294d631 100644
--- a/docs/devel/migration/index.rst
+++ b/docs/devel/migration/index.rst
@@ -10,5 +10,4 @@ QEMU live migration works.
main
features
compatibility
-   virtio
best-practices
-- 
2.43.0




[PULL 10/20] docs/migration: Create migration/ directory

2024-01-15 Thread peterx
From: Peter Xu 

The migration documentation has grown too large for a single file.  Create
a sub-directory so it can be split up.

We also already have separate vfio/virtio documentation; move it all into
the new directory.

Note that the virtio one is not yet converted to rST.  That is a job for
later.

Cc: "Michael S. Tsirkin" 
Cc: Jason Wang 
Cc: Alex Williamson 
Cc: Cédric Le Goater 
Reviewed-by: Cédric Le Goater 
Link: https://lore.kernel.org/r/20240109064628.595453-2-pet...@redhat.com
Signed-off-by: Peter Xu 
---
 docs/devel/index-internals.rst| 2 +-
 docs/devel/{migration.rst => migration/main.rst}  | 0
 docs/devel/{vfio-migration.rst => migration/vfio.rst} | 0
 docs/devel/{virtio-migration.txt => migration/virtio.txt} | 0
 4 files changed, 1 insertion(+), 1 deletion(-)
 rename docs/devel/{migration.rst => migration/main.rst} (100%)
 rename docs/devel/{vfio-migration.rst => migration/vfio.rst} (100%)
 rename docs/devel/{virtio-migration.txt => migration/virtio.txt} (100%)

diff --git a/docs/devel/index-internals.rst b/docs/devel/index-internals.rst
index 3def4a138b..a41d62c1eb 100644
--- a/docs/devel/index-internals.rst
+++ b/docs/devel/index-internals.rst
@@ -11,7 +11,7 @@ Details about QEMU's various subsystems including how to add 
features to them.
block-coroutine-wrapper
clocks
ebpf_rss
-   migration
+   migration/main
multi-process
reset
s390-cpu-topology
diff --git a/docs/devel/migration.rst b/docs/devel/migration/main.rst
similarity index 100%
rename from docs/devel/migration.rst
rename to docs/devel/migration/main.rst
diff --git a/docs/devel/vfio-migration.rst b/docs/devel/migration/vfio.rst
similarity index 100%
rename from docs/devel/vfio-migration.rst
rename to docs/devel/migration/vfio.rst
diff --git a/docs/devel/virtio-migration.txt b/docs/devel/migration/virtio.txt
similarity index 100%
rename from docs/devel/virtio-migration.txt
rename to docs/devel/migration/virtio.txt
-- 
2.43.0




[PULL 08/20] tests/qtest/migration: Use the new migration_test_add

2024-01-15 Thread peterx
From: Fabiano Rosas 

Replace the test registrations with the new function that prints test
names.
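
A sketch of how such a helper can be written (the version added earlier in
this series lives in the migration test helpers and may differ in detail):

typedef struct {
    const char *path;
    void (*func)(void);
} MigrationTest;

static void migration_test_wrapper(const void *data)
{
    const MigrationTest *t = data;

    g_test_message("Running %s", t->path);
    t->func();
}

static void migration_test_add(const char *path, void (*fn)(void))
{
    MigrationTest *t = g_new0(MigrationTest, 1);

    t->path = path;
    t->func = fn;
    /* qtest_add_data_func_full() keeps 't' alive for the test run and
     * frees it with g_free() when the test tree is torn down. */
    qtest_add_data_func_full(path, t, migration_test_wrapper, g_free);
}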

Signed-off-by: Fabiano Rosas 
Reviewed-by: Peter Xu 
Link: https://lore.kernel.org/r/20240104142144.9680-8-faro...@suse.de
Signed-off-by: Peter Xu 
---
 tests/qtest/migration-test.c | 215 ++-
 1 file changed, 112 insertions(+), 103 deletions(-)

diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
index 136e5df06c..21da140aea 100644
--- a/tests/qtest/migration-test.c
+++ b/tests/qtest/migration-test.c
@@ -3404,70 +3404,75 @@ int main(int argc, char **argv)
 module_call_init(MODULE_INIT_QOM);
 
 if (is_x86) {
-qtest_add_func("/migration/precopy/unix/suspend/live",
-   test_precopy_unix_suspend_live);
-qtest_add_func("/migration/precopy/unix/suspend/notlive",
-   test_precopy_unix_suspend_notlive);
+migration_test_add("/migration/precopy/unix/suspend/live",
+   test_precopy_unix_suspend_live);
+migration_test_add("/migration/precopy/unix/suspend/notlive",
+   test_precopy_unix_suspend_notlive);
 }
 
 if (has_uffd) {
-qtest_add_func("/migration/postcopy/plain", test_postcopy);
-qtest_add_func("/migration/postcopy/recovery/plain",
-   test_postcopy_recovery);
-qtest_add_func("/migration/postcopy/preempt/plain", 
test_postcopy_preempt);
-qtest_add_func("/migration/postcopy/preempt/recovery/plain",
-   test_postcopy_preempt_recovery);
+migration_test_add("/migration/postcopy/plain", test_postcopy);
+migration_test_add("/migration/postcopy/recovery/plain",
+   test_postcopy_recovery);
+migration_test_add("/migration/postcopy/preempt/plain",
+   test_postcopy_preempt);
+migration_test_add("/migration/postcopy/preempt/recovery/plain",
+   test_postcopy_preempt_recovery);
 if (getenv("QEMU_TEST_FLAKY_TESTS")) {
-qtest_add_func("/migration/postcopy/compress/plain",
-   test_postcopy_compress);
-qtest_add_func("/migration/postcopy/recovery/compress/plain",
-   test_postcopy_recovery_compress);
+migration_test_add("/migration/postcopy/compress/plain",
+   test_postcopy_compress);
+migration_test_add("/migration/postcopy/recovery/compress/plain",
+   test_postcopy_recovery_compress);
 }
 #ifndef _WIN32
-qtest_add_func("/migration/postcopy/recovery/double-failures",
-   test_postcopy_recovery_double_fail);
+migration_test_add("/migration/postcopy/recovery/double-failures",
+   test_postcopy_recovery_double_fail);
 #endif /* _WIN32 */
 if (is_x86) {
-qtest_add_func("/migration/postcopy/suspend",
-   test_postcopy_suspend);
+migration_test_add("/migration/postcopy/suspend",
+   test_postcopy_suspend);
 }
 }
 
-qtest_add_func("/migration/bad_dest", test_baddest);
+migration_test_add("/migration/bad_dest", test_baddest);
 #ifndef _WIN32
-qtest_add_func("/migration/analyze-script", test_analyze_script);
+if (!g_str_equal(arch, "s390x")) {
+migration_test_add("/migration/analyze-script", test_analyze_script);
+}
 #endif
-qtest_add_func("/migration/precopy/unix/plain", test_precopy_unix_plain);
-qtest_add_func("/migration/precopy/unix/xbzrle", test_precopy_unix_xbzrle);
+migration_test_add("/migration/precopy/unix/plain",
+   test_precopy_unix_plain);
+migration_test_add("/migration/precopy/unix/xbzrle",
+   test_precopy_unix_xbzrle);
 /*
  * Compression fails from time to time.
  * Put test here but don't enable it until everything is fixed.
  */
 if (getenv("QEMU_TEST_FLAKY_TESTS")) {
-qtest_add_func("/migration/precopy/unix/compress/wait",
-   test_precopy_unix_compress);
-qtest_add_func("/migration/precopy/unix/compress/nowait",
-   test_precopy_unix_compress_nowait);
+migration_test_add("/migration/precopy/unix/compress/wait",
+   test_precopy_unix_compress);
+migration_test_add("/migration/precopy/unix/compress/nowait",
+   test_precopy_unix_compress_nowait);
 }
 
-qtest_add_func("/migration/precopy/file",
-   test_precopy_file);
-qtest_add_func("/migration/precopy/file/offset",
-   test_precopy_file_offset);
-qtest_add_func("/migration/precopy/file/offset/bad",
-   test_precopy_file_offset_bad);
+migration_test_add("/migration/precopy/file",
+  

[PULL 11/20] docs/migration: Create index page

2024-01-15 Thread peterx
From: Peter Xu 

Create an index page for the migration module.  Move VFIO migration there
too, with a trivial touch-up of its title to use lower case.

Since "migration" now serves as the top-level title, rename the main doc
file to "migration framework".

Cc: Alex Williamson 
Cc: Cédric Le Goater 
Reviewed-by: Cédric Le Goater 
Link: https://lore.kernel.org/r/20240109064628.595453-3-pet...@redhat.com
Signed-off-by: Peter Xu 
---
 docs/devel/index-internals.rst |  3 +--
 docs/devel/migration/index.rst | 11 +++
 docs/devel/migration/main.rst  |  6 +++---
 docs/devel/migration/vfio.rst  |  2 +-
 4 files changed, 16 insertions(+), 6 deletions(-)
 create mode 100644 docs/devel/migration/index.rst

diff --git a/docs/devel/index-internals.rst b/docs/devel/index-internals.rst
index a41d62c1eb..5636e9cf1d 100644
--- a/docs/devel/index-internals.rst
+++ b/docs/devel/index-internals.rst
@@ -11,13 +11,12 @@ Details about QEMU's various subsystems including how to 
add features to them.
block-coroutine-wrapper
clocks
ebpf_rss
-   migration/main
+   migration/index
multi-process
reset
s390-cpu-topology
s390-dasd-ipl
tracing
-   vfio-migration
vfio-iommufd
writing-monitor-commands
virtio-backends
diff --git a/docs/devel/migration/index.rst b/docs/devel/migration/index.rst
new file mode 100644
index 00..02cfdcc969
--- /dev/null
+++ b/docs/devel/migration/index.rst
@@ -0,0 +1,11 @@
+Migration
+=
+
+This is the main entry for QEMU migration documentations.  It explains how
+QEMU live migration works.
+
+.. toctree::
+   :maxdepth: 2
+
+   main
+   vfio
diff --git a/docs/devel/migration/main.rst b/docs/devel/migration/main.rst
index 95351ba51f..62bf027fb4 100644
--- a/docs/devel/migration/main.rst
+++ b/docs/devel/migration/main.rst
@@ -1,6 +1,6 @@
-=
-Migration
-=
+===
+Migration framework
+===
 
 QEMU has code to load/save the state of the guest that it is running.
 These are two complementary operations.  Saving the state just does
diff --git a/docs/devel/migration/vfio.rst b/docs/devel/migration/vfio.rst
index 605fe60e96..c49482eab6 100644
--- a/docs/devel/migration/vfio.rst
+++ b/docs/devel/migration/vfio.rst
@@ -1,5 +1,5 @@
 =
-VFIO device Migration
+VFIO device migration
 =
 
 Migration of virtual machine involves saving the state for each device that
-- 
2.43.0




[PULL 17/20] docs/migration: Organize "Postcopy" page

2024-01-15 Thread peterx
From: Peter Xu 

Reorganize the page, moving things around, and add a few
headlines ("Postcopy internals", "Postcopy features") to cover sub-areas.

Reviewed-by: Cédric Le Goater 
Link: https://lore.kernel.org/r/20240109064628.595453-9-pet...@redhat.com
Signed-off-by: Peter Xu 
---
 docs/devel/migration/postcopy.rst | 159 --
 1 file changed, 84 insertions(+), 75 deletions(-)

diff --git a/docs/devel/migration/postcopy.rst 
b/docs/devel/migration/postcopy.rst
index d60eec06ab..6c51e96d79 100644
--- a/docs/devel/migration/postcopy.rst
+++ b/docs/devel/migration/postcopy.rst
@@ -1,6 +1,9 @@
+
 Postcopy
 
 
+.. contents::
+
 'Postcopy' migration is a way to deal with migrations that refuse to converge
 (or take too long to converge) its plus side is that there is an upper bound on
 the amount of migration traffic and time it takes, the down side is that during
@@ -14,7 +17,7 @@ Postcopy can be combined with precopy (i.e. normal migration) 
so that if precopy
 doesn't finish in a given time the switch is made to postcopy.
 
 Enabling postcopy
--
+=
 
 To enable postcopy, issue this command on the monitor (both source and
 destination) prior to the start of migration:
@@ -49,8 +52,71 @@ time per vCPU.
   ``migrate_set_parameter`` is ignored (to avoid delaying requested pages that
   the destination is waiting for).
 
-Postcopy device transfer
-
+Postcopy internals
+==
+
+State machine
+-
+
+Postcopy moves through a series of states (see postcopy_state) from
+ADVISE->DISCARD->LISTEN->RUNNING->END
+
+ - Advise
+
+Set at the start of migration if postcopy is enabled, even
+if it hasn't had the start command; here the destination
+checks that its OS has the support needed for postcopy, and performs
+setup to ensure the RAM mappings are suitable for later postcopy.
+The destination will fail early in migration at this point if the
+required OS support is not present.
+(Triggered by reception of POSTCOPY_ADVISE command)
+
+ - Discard
+
+Entered on receipt of the first 'discard' command; prior to
+the first Discard being performed, hugepages are switched off
+(using madvise) to ensure that no new huge pages are created
+during the postcopy phase, and to cause any huge pages that
+have discards on them to be broken.
+
+ - Listen
+
+The first command in the package, POSTCOPY_LISTEN, switches
+the destination state to Listen, and starts a new thread
+(the 'listen thread') which takes over the job of receiving
+pages off the migration stream, while the main thread carries
+on processing the blob.  With this thread able to process page
+reception, the destination now 'sensitises' the RAM to detect
+any access to missing pages (on Linux using the 'userfault'
+system).
+
+ - Running
+
+POSTCOPY_RUN causes the destination to synchronise all
+state and start the CPUs and IO devices running.  The main
+thread now finishes processing the migration package and
+now carries on as it would for normal precopy migration
+(although it can't do the cleanup it would do as it
+finishes a normal migration).
+
+ - Paused
+
+Postcopy can run into a paused state (normally on both sides when
+happens), where all threads will be temporarily halted mostly due to
+network errors.  When reaching paused state, migration will make sure
+the qemu binary on both sides maintain the data without corrupting
+the VM.  To continue the migration, the admin needs to fix the
+migration channel using the QMP command 'migrate-recover' on the
+destination node, then resume the migration using QMP command 'migrate'
+again on source node, with resume=true flag set.
+
+ - End
+
+The listen thread can now quit, and perform the cleanup of migration
+state, the migration is now complete.
+
+Device transfer
+---
 
 Loading of device data may cause the device emulation to access guest RAM
 that may trigger faults that have to be resolved by the source, as such
@@ -130,7 +196,20 @@ processing.
is no longer used by migration, while the listen thread carries on servicing
page data until the end of migration.
 
-Postcopy Recovery
+Source side page bitmap
+---
+
+The 'migration bitmap' in postcopy is basically the same as in the precopy,
+where each of the bit to indicate that page is 'dirty' - i.e. needs
+sending.  During the precopy phase this is updated as the CPU dirties
+pages, however during postcopy the CPUs are stopped and nothing should
+dirty anything any more. Instead, dirty bits are cleared when the relevant
+pages are sent during postcopy.
+
+Postcopy features
+=
+
+Postcopy recovery
 -
 
 Comparing to precopy, postcopy is special on error handlings.  When any
@@ -166,76 +245,6 @@ configurations of the guest.  For example, 

[PULL 09/20] tests/qtest: Re-enable multifd cancel test

2024-01-15 Thread peterx
From: Fabiano Rosas 

We've found the source of flakiness in this test, so re-enable it.

Reviewed-by: Juan Quintela 
Signed-off-by: Fabiano Rosas 
Link: https://lore.kernel.org/r/20230606144551.24367-4-faro...@suse.de
[peterx: rebase to 2a61a6964c, to use migration_test_add()]
Signed-off-by: Peter Xu 
---
 tests/qtest/migration-test.c | 10 ++
 1 file changed, 2 insertions(+), 8 deletions(-)

diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
index 21da140aea..d3066e119f 100644
--- a/tests/qtest/migration-test.c
+++ b/tests/qtest/migration-test.c
@@ -3550,14 +3550,8 @@ int main(int argc, char **argv)
 }
 migration_test_add("/migration/multifd/tcp/plain/none",
test_multifd_tcp_none);
-/*
- * This test is flaky and sometimes fails in CI and otherwise:
- * don't run unless user opts in via environment variable.
- */
-if (getenv("QEMU_TEST_FLAKY_TESTS")) {
-migration_test_add("/migration/multifd/tcp/plain/cancel",
-   test_multifd_tcp_cancel);
-}
+migration_test_add("/migration/multifd/tcp/plain/cancel",
+   test_multifd_tcp_cancel);
 migration_test_add("/migration/multifd/tcp/plain/zlib",
test_multifd_tcp_zlib);
 #ifdef CONFIG_ZSTD
-- 
2.43.0




[PULL 03/20] migration/multifd: Remove QEMUFile from where it is not needed

2024-01-15 Thread peterx
From: Fabiano Rosas 

Signed-off-by: Fabiano Rosas 
Reviewed-by: Peter Xu 
Link: https://lore.kernel.org/r/20240104142144.9680-3-faro...@suse.de
Signed-off-by: Peter Xu 
---
 migration/multifd.h |  4 ++--
 migration/multifd.c | 12 ++--
 migration/ram.c | 15 +++
 3 files changed, 15 insertions(+), 16 deletions(-)

diff --git a/migration/multifd.h b/migration/multifd.h
index b0ff610c37..35d11f103c 100644
--- a/migration/multifd.h
+++ b/migration/multifd.h
@@ -21,8 +21,8 @@ void multifd_load_shutdown(void);
 bool multifd_recv_all_channels_created(void);
 void multifd_recv_new_channel(QIOChannel *ioc, Error **errp);
 void multifd_recv_sync_main(void);
-int multifd_send_sync_main(QEMUFile *f);
-int multifd_queue_page(QEMUFile *f, RAMBlock *block, ram_addr_t offset);
+int multifd_send_sync_main(void);
+int multifd_queue_page(RAMBlock *block, ram_addr_t offset);
 
 /* Multifd Compression flags */
 #define MULTIFD_FLAG_SYNC (1 << 0)
diff --git a/migration/multifd.c b/migration/multifd.c
index 3e650f5da0..2dbc3ba836 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -390,7 +390,7 @@ struct {
  * false.
  */
 
-static int multifd_send_pages(QEMUFile *f)
+static int multifd_send_pages(void)
 {
 int i;
 static int next_channel;
@@ -436,7 +436,7 @@ static int multifd_send_pages(QEMUFile *f)
 return 1;
 }
 
-int multifd_queue_page(QEMUFile *f, RAMBlock *block, ram_addr_t offset)
+int multifd_queue_page(RAMBlock *block, ram_addr_t offset)
 {
 MultiFDPages_t *pages = multifd_send_state->pages;
 bool changed = false;
@@ -456,12 +456,12 @@ int multifd_queue_page(QEMUFile *f, RAMBlock *block, 
ram_addr_t offset)
 changed = true;
 }
 
-if (multifd_send_pages(f) < 0) {
+if (multifd_send_pages() < 0) {
 return -1;
 }
 
 if (changed) {
-return multifd_queue_page(f, block, offset);
+return multifd_queue_page(block, offset);
 }
 
 return 1;
@@ -583,7 +583,7 @@ static int multifd_zero_copy_flush(QIOChannel *c)
 return ret;
 }
 
-int multifd_send_sync_main(QEMUFile *f)
+int multifd_send_sync_main(void)
 {
 int i;
 bool flush_zero_copy;
@@ -592,7 +592,7 @@ int multifd_send_sync_main(QEMUFile *f)
 return 0;
 }
 if (multifd_send_state->pages->num) {
-if (multifd_send_pages(f) < 0) {
+if (multifd_send_pages() < 0) {
 error_report("%s: multifd_send_pages fail", __func__);
 return -1;
 }
diff --git a/migration/ram.c b/migration/ram.c
index 890f31cf66..c0cdcccb75 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -1250,10 +1250,9 @@ static int ram_save_page(RAMState *rs, PageSearchStatus 
*pss)
 return pages;
 }
 
-static int ram_save_multifd_page(QEMUFile *file, RAMBlock *block,
- ram_addr_t offset)
+static int ram_save_multifd_page(RAMBlock *block, ram_addr_t offset)
 {
-if (multifd_queue_page(file, block, offset) < 0) {
+if (multifd_queue_page(block, offset) < 0) {
 return -1;
 }
 stat64_add(_stats.normal_pages, 1);
@@ -1336,7 +1335,7 @@ static int find_dirty_block(RAMState *rs, 
PageSearchStatus *pss)
 if (migrate_multifd() &&
 !migrate_multifd_flush_after_each_section()) {
 QEMUFile *f = rs->pss[RAM_CHANNEL_PRECOPY].pss_channel;
-int ret = multifd_send_sync_main(f);
+int ret = multifd_send_sync_main();
 if (ret < 0) {
 return ret;
 }
@@ -2067,7 +2066,7 @@ static int ram_save_target_page_legacy(RAMState *rs, 
PageSearchStatus *pss)
  * still see partially copied pages which is data corruption.
  */
 if (migrate_multifd() && !migration_in_postcopy()) {
-return ram_save_multifd_page(pss->pss_channel, block, offset);
+return ram_save_multifd_page(block, offset);
 }
 
 return ram_save_page(rs, pss);
@@ -2985,7 +2984,7 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
 migration_ops->ram_save_target_page = ram_save_target_page_legacy;
 
 bql_unlock();
-ret = multifd_send_sync_main(f);
+ret = multifd_send_sync_main();
 bql_lock();
 if (ret < 0) {
 return ret;
@@ -3109,7 +3108,7 @@ out:
 if (ret >= 0
 && migration_is_setup_or_active(migrate_get_current()->state)) {
 if (migrate_multifd() && migrate_multifd_flush_after_each_section()) {
-ret = 
multifd_send_sync_main(rs->pss[RAM_CHANNEL_PRECOPY].pss_channel);
+ret = multifd_send_sync_main();
 if (ret < 0) {
 return ret;
 }
@@ -3183,7 +3182,7 @@ static int ram_save_complete(QEMUFile *f, void *opaque)
 }
 }
 
-ret = multifd_send_sync_main(rs->pss[RAM_CHANNEL_PRECOPY].pss_channel);
+ret = multifd_send_sync_main();
 if (ret < 0) {
 return ret;
 }
-- 
2.43.0




[PULL 07/20] tests/qtest/migration: Add a wrapper to print test names

2024-01-15 Thread peterx
From: Fabiano Rosas 

Our usage of gtest results in us losing the very basic functionality
of "knowing which test failed". The issue is that gtest only prints
test names ("paths" in gtest parlance) once the test has finished, but
we use asserts in the tests and crash gtest itself before it can print
anything. We also use a final abort when the result of g_test_run is
not 0.

Depending on how the test failed/broke, we can see the function that
triggered the abort, which may be representative of the test, but it
could also just be some generic function.

We have been relying on the primitive method of looking at the name of
the previous successful test and then looking at the code to figure
out which test should have come next.

Add a wrapper to the test registration that does the job of printing
the test name before running.
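
A trimmed usage sketch (hedged: the real main() in tests/qtest/migration-test.c
does considerably more setup, and test_multifd_tcp_cancel() is one of the
existing test bodies there) of how a test is registered through the wrapper;
the path/function pair is one of the call sites converted later in this series:

#include "qemu/osdep.h"
#include "migration-helpers.h"

int main(int argc, char **argv)
{
    g_test_init(&argc, &argv, NULL);

    /* Register via the wrapper so the name is printed before the test runs. */
    migration_test_add("/migration/multifd/tcp/plain/cancel",
                       test_multifd_tcp_cancel);

    return g_test_run();
}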

Signed-off-by: Fabiano Rosas 
Reviewed-by: Peter Xu 
Link: https://lore.kernel.org/r/20240104142144.9680-7-faro...@suse.de
Signed-off-by: Peter Xu 
---
 tests/qtest/migration-helpers.h |  1 +
 tests/qtest/migration-helpers.c | 32 
 2 files changed, 33 insertions(+)

diff --git a/tests/qtest/migration-helpers.h b/tests/qtest/migration-helpers.h
index b478549096..3bf7ded1b9 100644
--- a/tests/qtest/migration-helpers.h
+++ b/tests/qtest/migration-helpers.h
@@ -52,4 +52,5 @@ char *find_common_machine_version(const char *mtype, const 
char *var1,
   const char *var2);
 char *resolve_machine_version(const char *alias, const char *var1,
   const char *var2);
+void migration_test_add(const char *path, void (*fn)(void));
 #endif /* MIGRATION_HELPERS_H */
diff --git a/tests/qtest/migration-helpers.c b/tests/qtest/migration-helpers.c
index 19384e3fa6..e451dbdbed 100644
--- a/tests/qtest/migration-helpers.c
+++ b/tests/qtest/migration-helpers.c
@@ -291,3 +291,35 @@ char *resolve_machine_version(const char *alias, const 
char *var1,
 
 return find_common_machine_version(machine_name, var1, var2);
 }
+
+typedef struct {
+char *name;
+void (*func)(void);
+} MigrationTest;
+
+static void migration_test_destroy(gpointer data)
+{
+MigrationTest *test = (MigrationTest *)data;
+
+g_free(test->name);
+g_free(test);
+}
+
+static void migration_test_wrapper(const void *data)
+{
+MigrationTest *test = (MigrationTest *)data;
+
+g_test_message("Running /%s%s", qtest_get_arch(), test->name);
+test->func();
+}
+
+void migration_test_add(const char *path, void (*fn)(void))
+{
+MigrationTest *test = g_new0(MigrationTest, 1);
+
+test->func = fn;
+test->name = g_strdup(path);
+
+qtest_add_data_func_full(path, test, migration_test_wrapper,
+ migration_test_destroy);
+}
-- 
2.43.0




[PULL 04/20] migration/multifd: Change multifd_pages_init argument

2024-01-15 Thread peterx
From: Fabiano Rosas 

The 'size' argument is actually the number of pages that fit in a
multifd packet. Change it to uint32_t and rename.

Signed-off-by: Fabiano Rosas 
Reviewed-by: Peter Xu 
Link: https://lore.kernel.org/r/20240104142144.9680-4-faro...@suse.de
Signed-off-by: Peter Xu 
---
 migration/multifd.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/migration/multifd.c b/migration/multifd.c
index 2dbc3ba836..25cbc6dc6b 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -236,12 +236,12 @@ static int multifd_recv_initial_packet(QIOChannel *c, 
Error **errp)
 return msg.id;
 }
 
-static MultiFDPages_t *multifd_pages_init(size_t size)
+static MultiFDPages_t *multifd_pages_init(uint32_t n)
 {
 MultiFDPages_t *pages = g_new0(MultiFDPages_t, 1);
 
-pages->allocated = size;
-pages->offset = g_new0(ram_addr_t, size);
+pages->allocated = n;
+pages->offset = g_new0(ram_addr_t, n);
 
 return pages;
 }
-- 
2.43.0




[PULL 00/20] Migration 20240116 patches

2024-01-15 Thread peterx
From: Peter Xu 

The following changes since commit 977542ded7e6b28d2bc077bcda24568c716e393c:

  Merge tag 'pull-testing-updates-120124-2' of https://gitlab.com/stsquad/qemu 
into staging (2024-01-12 14:02:53 +)

are available in the Git repository at:

  https://gitlab.com/peterx/qemu.git tags/migration-20240116-pull-request

for you to fetch changes up to 44ce1b5d2fc77343f6a318cb3de613336a240048:

  migration/rdma: define htonll/ntohll only if not predefined (2024-01-16 
11:16:10 +0800)


Migration pull request 2nd batch for 9.0

- Het's cleanup on migration qmp command paths
- Fabiano's migration cleanups and test improvements
- Fabiano's patch to re-enable multifd-cancel test
- Peter's migration doc reorganizations
- Nick Briggs's fix for Solaris build on rdma



Fabiano Rosas (8):
  migration/multifd: Remove MultiFDPages_t::packet_num
  migration/multifd: Remove QEMUFile from where it is not needed
  migration/multifd: Change multifd_pages_init argument
  migration: Report error in incoming migration
  tests/qtest/migration: Print migration incoming errors
  tests/qtest/migration: Add a wrapper to print test names
  tests/qtest/migration: Use the new migration_test_add
  tests/qtest: Re-enable multifd cancel test

Het Gala (1):
  migration: Simplify initial conditionals in migration for better
readability

Nick Briggs (1):
  migration/rdma: define htonll/ntohll only if not predefined

Peter Xu (10):
  docs/migration: Create migration/ directory
  docs/migration: Create index page
  docs/migration: Convert virtio.txt into rST
  docs/migration: Split "Backwards compatibility" separately
  docs/migration: Split "Debugging" and "Firmware"
  docs/migration: Split "Postcopy"
  docs/migration: Split "dirty limit"
  docs/migration: Organize "Postcopy" page
  docs/migration: Further move vfio to be feature of migration
  docs/migration: Further move virtio to be feature of migration

 docs/devel/index-internals.rst|3 +-
 docs/devel/migration.rst  | 1514 -
 docs/devel/migration/best-practices.rst   |   48 +
 docs/devel/migration/compatibility.rst|  517 ++
 docs/devel/migration/dirty-limit.rst  |   71 +
 docs/devel/migration/features.rst |   12 +
 docs/devel/migration/index.rst|   13 +
 docs/devel/migration/main.rst |  575 +++
 docs/devel/migration/postcopy.rst |  313 
 .../vfio.rst} |2 +-
 docs/devel/migration/virtio.rst   |  115 ++
 docs/devel/virtio-migration.txt   |  108 --
 migration/multifd.h   |6 +-
 tests/qtest/migration-helpers.h   |1 +
 migration/migration.c |   43 +-
 migration/multifd.c   |   19 +-
 migration/ram.c   |   15 +-
 migration/rdma.c  |4 +
 tests/qtest/migration-helpers.c   |   38 +
 tests/qtest/migration-test.c  |  219 +--
 20 files changed, 1861 insertions(+), 1775 deletions(-)
 delete mode 100644 docs/devel/migration.rst
 create mode 100644 docs/devel/migration/best-practices.rst
 create mode 100644 docs/devel/migration/compatibility.rst
 create mode 100644 docs/devel/migration/dirty-limit.rst
 create mode 100644 docs/devel/migration/features.rst
 create mode 100644 docs/devel/migration/index.rst
 create mode 100644 docs/devel/migration/main.rst
 create mode 100644 docs/devel/migration/postcopy.rst
 rename docs/devel/{vfio-migration.rst => migration/vfio.rst} (99%)
 create mode 100644 docs/devel/migration/virtio.rst
 delete mode 100644 docs/devel/virtio-migration.txt

-- 
2.43.0




[RFC PATCH 1/1] hw/intc/loongarch_extioi: Add virt extension support

2024-01-15 Thread Song Gao
With hardware extioi, irqs can only be routed to four vcpus. This patch
adds virt extension support, so that irqs can be routed to 256 vcpus.

Signed-off-by: Song Gao 
---
 include/hw/intc/loongarch_extioi.h |  24 +-
 include/hw/loongarch/virt.h|   6 ++
 target/loongarch/cpu.h |   1 +
 hw/intc/loongarch_extioi.c | 100 +++-
 hw/loongarch/virt.c| 119 ++---
 5 files changed, 233 insertions(+), 17 deletions(-)

diff --git a/include/hw/intc/loongarch_extioi.h 
b/include/hw/intc/loongarch_extioi.h
index a0a46b888c..1044e8911a 100644
--- a/include/hw/intc/loongarch_extioi.h
+++ b/include/hw/intc/loongarch_extioi.h
@@ -36,10 +36,29 @@
 #define EXTIOI_ISR_START (0x700 - APIC_OFFSET)
 #define EXTIOI_ISR_END   (0x720 - APIC_OFFSET)
 #define EXTIOI_COREISR_START (0x800 - APIC_OFFSET)
-#define EXTIOI_COREISR_END   (0xB20 - APIC_OFFSET)
+#define EXTIOI_COREISR_END   (0x820 - APIC_OFFSET)
 #define EXTIOI_COREMAP_START (0xC00 - APIC_OFFSET)
 #define EXTIOI_COREMAP_END   (0xD00 - APIC_OFFSET)
 
+#define EXTIOI_VIRT_BASE (0x4000)
+#define EXTIOI_VIRT_SIZE (0x1000)
+#define EXTIOI_VIRT_FEATURES (0x0)
+#define  EXTIOI_HAS_VIRT_EXTENSION (0)
+#define  EXTIOI_HAS_ENABLE_OPTION  (1)
+#define  EXTIOI_HAS_INT_ENCODE (2)
+#define  EXTIOI_HAS_IRQ_ENCODE (3)
+#define  EXTIOI_HAS_CPUID_ENCODE   (4)
+#define  EXTIOI_VIRT_HAS_FEATURES  (BIT(EXTIOI_HAS_VIRT_EXTENSION) \
+| BIT(EXTIOI_HAS_INT_ENCODE)   \
+| BIT(EXTIOI_HAS_IRQ_ENCODE))
+#define EXTIOI_VIRT_CONFIG   (0x4)
+#define  EXTIOI_ENABLE (1)
+#define  EXTIOI_ENABLE_INT_ENCODE  (2)
+#define  EXTIOI_ENABLE_IRQ_ENCODE  (3)
+#define  EXTIOI_ENABLE_CPUID_ENCODE(4)
+#define EXTIOI_VIRT_COREMAP_START(0x40)
+#define EXTIOI_VIRT_COREMAP_END  (0x240)
+
 typedef struct ExtIOICore {
 uint32_t coreisr[EXTIOI_IRQS_GROUP_COUNT];
 DECLARE_BITMAP(sw_isr[LS3A_INTC_IP], EXTIOI_IRQS);
@@ -51,6 +70,8 @@ OBJECT_DECLARE_SIMPLE_TYPE(LoongArchExtIOI, LOONGARCH_EXTIOI)
 struct LoongArchExtIOI {
 SysBusDevice parent_obj;
 uint32_t num_cpu;
+uint32_t features;
+uint32_t status;
 /* hardware state */
 uint32_t nodetype[EXTIOI_IRQS_NODETYPE_COUNT / 2];
 uint32_t bounce[EXTIOI_IRQS_GROUP_COUNT];
@@ -64,5 +85,6 @@ struct LoongArchExtIOI {
 qemu_irq irq[EXTIOI_IRQS];
 ExtIOICore *cpu;
 MemoryRegion extioi_system_mem;
+MemoryRegion virt_extend;
 };
 #endif /* LOONGARCH_EXTIOI_H */
diff --git a/include/hw/loongarch/virt.h b/include/hw/loongarch/virt.h
index 6ef9a92394..afd63cfd5e 100644
--- a/include/hw/loongarch/virt.h
+++ b/include/hw/loongarch/virt.h
@@ -30,6 +30,9 @@
 #define VIRT_GED_MEM_ADDR   (VIRT_GED_EVT_ADDR + ACPI_GED_EVT_SEL_LEN)
 #define VIRT_GED_REG_ADDR   (VIRT_GED_MEM_ADDR + MEMORY_HOTPLUG_IO_LEN)
 
+/* board device features */
+#define VIRT_HAS_V_EIOINTCBIT(0)
+
 struct LoongArchMachineState {
 /*< private >*/
 MachineState parent_obj;
@@ -38,6 +41,7 @@ struct LoongArchMachineState {
 MemoryRegion highmem;
 MemoryRegion bios;
 bool bios_loaded;
+bool v_eiointc;
 /* State for other subsystems/APIs: */
 FWCfgState  *fw_cfg;
 Notifier machine_done;
@@ -48,11 +52,13 @@ struct LoongArchMachineState {
 DeviceState  *acpi_ged;
 int  fdt_size;
 DeviceState *platform_bus_dev;
+DeviceState *extioi;
 PCIBus   *pci_bus;
 PFlashCFI01  *flash;
 MemoryRegion system_iocsr;
 MemoryRegion iocsr_mem;
 AddressSpace as_iocsr;
+int  features;
 };
 
 #define TYPE_LOONGARCH_MACHINE  MACHINE_TYPE_NAME("virt")
diff --git a/target/loongarch/cpu.h b/target/loongarch/cpu.h
index 0fa5e0ca93..339a4832f0 100644
--- a/target/loongarch/cpu.h
+++ b/target/loongarch/cpu.h
@@ -36,6 +36,7 @@
 #define CPUNAME_REG 0x20
 #define MISC_FUNC_REG   0x420
 #define IOCSRM_EXTIOI_EN48
+#define IOCSRM_EXTIOI_INT_ENCODE  49
 
 #define IOCSR_MEM_SIZE  0x428
 
diff --git a/hw/intc/loongarch_extioi.c b/hw/intc/loongarch_extioi.c
index bdfa3b481e..a22a33b849 100644
--- a/hw/intc/loongarch_extioi.c
+++ b/hw/intc/loongarch_extioi.c
@@ -55,6 +55,11 @@ static void extioi_update_irq(LoongArchExtIOI *s, int irq, 
int level)
 static void extioi_setirq(void *opaque, int irq, int level)
 {
 LoongArchExtIOI *s = LOONGARCH_EXTIOI(opaque);
+
+if (s->status & BIT(EXTIOI_ENABLE)) {
+return;
+}
+
 trace_loongarch_extioi_setirq(irq, level);
 if (level) {
 /*
@@ -143,10 +148,13 @@ static inline void 
extioi_update_sw_coremap(LoongArchExtIOI *s, int irq,
 
 for (i = 0; i < 4; i++) {
 cpu = val & 0xff;
-cpu = ctz32(cpu);
-cpu = (cpu >= 

Re: FW: [PATCH] vhost-user: add VIRTIO_F_IN_ORDER and VIRTIO_F_NOTIFICATION_DATA feature

2024-01-15 Thread Jason Wang
On Tue, Jan 16, 2024 at 9:57 AM Wentao Jia  wrote:
>
> Hi, Jason
>
> I just add two features in vhost user feature bits, The patch was tested in 
> my environment
> I do not know what the compatibility mean

For example, if you don't do that,

Migrating from 9.0 to 8.2 will break as 8.2 doesn't have those new
features at all.

Please refer to hw_compat_8_2.
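
As a concrete illustration, a hedged sketch (not the actual upstream change) of
the kind of compat entries being asked for, which would keep the two new bits
disabled for 8.2-and-older machine types; the driver name "virtio-net-pci" is
an assumption for illustration, while the property names come from the
DEFINE_PROP_BIT64() lines in the patch:

#include "qemu/osdep.h"
#include "hw/qdev-core.h"   /* GlobalProperty */

/* Sketch only: entries of this shape would be appended to hw_compat_8_2[]
 * in hw/core/machine.c.  "virtio-net-pci" is an assumed driver name. */
GlobalProperty vhost_user_feature_compat_sketch[] = {
    { "virtio-net-pci", "in_order", "false" },
    { "virtio-net-pci", "notification_data", "false" },
};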

Thanks

>
> Wentao
>
> -Original Message-
> From: Jason Wang 
> Sent: Monday, January 15, 2024 8:18 AM
> To: Wentao Jia 
> Cc: qemu-devel@nongnu.org; m...@redhat.com; Rick Zhong 
> 
> Subject: Re: FW: [PATCH] vhost-user: add VIRTIO_F_IN_ORDER and 
> VIRTIO_F_NOTIFICATION_DATA feature
>
> On Fri, Jan 12, 2024 at 4:18 PM Wentao Jia  wrote:
> >
> > Hi, Michael and Jason
> >
> > Do you have any other comments?
> > Is there a schedule for merge the patch into the community?
> > Thank you
>
> I think as discussed, we need to add compatibility support for those features.
>
> Thanks
>
> >
> > Wentao
> >
> > -Original Message-
> > From: Wentao Jia
> > Sent: Tuesday, January 2, 2024 1:57 PM
> > To: qemu-devel@nongnu.org
> > Cc: 'm...@redhat.com' ; Rick Zhong
> > ; 'Jason Wang' 
> > Subject: RE: FW: [PATCH] vhost-user: add VIRTIO_F_IN_ORDER and
> > VIRTIO_F_NOTIFICATION_DATA feature
> >
> >
> > ---
> >  hw/net/vhost_net.c | 2 ++
> >  1 file changed, 2 insertions(+)
> >
> > diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c index
> > e8e1661646..211ca859a6 100644
> > --- a/hw/net/vhost_net.c
> > +++ b/hw/net/vhost_net.c
> > @@ -76,6 +76,8 @@ static const int user_feature_bits[] = {
> >  VIRTIO_F_IOMMU_PLATFORM,
> >  VIRTIO_F_RING_PACKED,
> >  VIRTIO_F_RING_RESET,
> > +VIRTIO_F_IN_ORDER,
> > +VIRTIO_F_NOTIFICATION_DATA,
> >  VIRTIO_NET_F_RSS,
> >  VIRTIO_NET_F_HASH_REPORT,
> >  VIRTIO_NET_F_GUEST_USO4,
> > --
> >
> > -Original Message-
> > From: Wentao Jia
> > Sent: Tuesday, January 2, 2024 1:38 PM
> > To: Jason Wang 
> > Cc: m...@redhat.com; Rick Zhong 
> > Subject: RE: FW: [PATCH] vhost-user: add VIRTIO_F_IN_ORDER and
> > VIRTIO_F_NOTIFICATION_DATA feature
> >
> > Hi, Jason
> >
> > It is good just change feature bits, I will commit a new patch, thanks
> >
> > Wentao Jia
> >
> > -Original Message-
> > From: Jason Wang 
> > Sent: Tuesday, January 2, 2024 11:24 AM
> > To: Wentao Jia 
> > Cc: m...@redhat.com; Rick Zhong 
> > Subject: Re: FW: [PATCH] vhost-user: add VIRTIO_F_IN_ORDER and
> > VIRTIO_F_NOTIFICATION_DATA feature
> >
> > On Tue, Jan 2, 2024 at 10:26 AM Wentao Jia  wrote:
> > >
> > > Hi, Michael  and Jason
> > >
> > >
> > >
> > > please review the patch at your convenience, thank you
> > >
> > > vhost-user: add VIRTIO_F_IN_ORDER and VIRTIO_F_NOTIFICATION_DATA
> > > feature - Patchwork (kernel.org)
> > >
> > >
> > >
> > > Wentao Jia
> > >
> > >
> > >
> > > From: Wentao Jia
> > > Sent: Friday, December 1, 2023 6:11 PM
> > > To: qemu-devel@nongnu.org
> > > Subject: [PATCH] vhost-user: add VIRTIO_F_IN_ORDER and
> > > VIRTIO_F_NOTIFICATION_DATA feature
> > >
> > >
> > >
> > > VIRTIO_F_IN_ORDER and VIRTIO_F_NOTIFICATION_DATA feature are
> > > important feature
> > >
> > > for dpdk vdpa packets transmitting performance, add the 2 features
> > > at vhost-user
> > >
> > > front-end to negotiation with backend.
> > >
> > >
> > >
> > > Signed-off-by: Kyle Xu zhenbing...@corigine.com
> > >
> > > Signed-off-by: Wentao Jia wentao@corigine.com
> > >
> > > Reviewed-by:   Xinying Yu xinying...@corigine.com
> > >
> > > Reviewed-by:   Shujing Dong shujing.d...@corigine.com
> > >
> > > Reviewed-by:   Rick Zhong zhaoyong.zh...@corigine.com
> > >
> > > ---
> > >
> > > hw/net/vhost_net.c | 2 ++
> > >
> > > include/hw/virtio/virtio.h | 4 
> > >
> > > 2 files changed, 6 insertions(+)
> > >
> > >
> > >
> > > diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
> > >
> > > index e8e1661646..211ca859a6 100644
> > >
> > > --- a/hw/net/vhost_net.c
> > >
> > > +++ b/hw/net/vhost_net.c
> > >
> > > @@ -76,6 +76,8 @@ static const int user_feature_bits[] = {
> > >
> > >  VIRTIO_F_IOMMU_PLATFORM,
> > >
> > >  VIRTIO_F_RING_PACKED,
> > >
> > >  VIRTIO_F_RING_RESET,
> > >
> > > +VIRTIO_F_IN_ORDER,
> > >
> > > +VIRTIO_F_NOTIFICATION_DATA,
> > >
> > >  VIRTIO_NET_F_RSS,
> > >
> > >  VIRTIO_NET_F_HASH_REPORT,
> > >
> > >  VIRTIO_NET_F_GUEST_USO4,
> > >
> > > diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
> > >
> > > index c8f72850bc..3880b6764c 100644
> > >
> > > --- a/include/hw/virtio/virtio.h
> > >
> > > +++ b/include/hw/virtio/virtio.h
> > >
> > > @@ -369,6 +369,10 @@ typedef struct VirtIORNGConf VirtIORNGConf;
> > >
> > >VIRTIO_F_RING_PACKED, false), \
> > >
> > >  DEFINE_PROP_BIT64("queue_reset", _state, _field, \
> > >
> > >VIRTIO_F_RING_RESET, true)
> > >
> > > +DEFINE_PROP_BIT64("notification_data", _state, _field, \
> > >
> > > +  VIRTIO_F_NOTIFICATION_DATA, true), \
> > >
> > > + 

RE: FW: [PATCH] vhost-user: add VIRTIO_F_IN_ORDER and VIRTIO_F_NOTIFICATION_DATA feature

2024-01-15 Thread Wentao Jia
Hi, Jason

I just added the two features to the vhost-user feature bits, and the patch
was tested in my environment.
I do not know what the compatibility support means.

Wentao

-Original Message-
From: Jason Wang  
Sent: Monday, January 15, 2024 8:18 AM
To: Wentao Jia 
Cc: qemu-devel@nongnu.org; m...@redhat.com; Rick Zhong 

Subject: Re: FW: [PATCH] vhost-user: add VIRTIO_F_IN_ORDER and 
VIRTIO_F_NOTIFICATION_DATA feature

On Fri, Jan 12, 2024 at 4:18 PM Wentao Jia  wrote:
>
> Hi, Michael and Jason
>
> Do you have any other comments?
> Is there a schedule for merge the patch into the community?
> Thank you

I think as discussed, we need to add compatibility support for those features.

Thanks

>
> Wentao
>
> -Original Message-
> From: Wentao Jia
> Sent: Tuesday, January 2, 2024 1:57 PM
> To: qemu-devel@nongnu.org
> Cc: 'm...@redhat.com' ; Rick Zhong 
> ; 'Jason Wang' 
> Subject: RE: FW: [PATCH] vhost-user: add VIRTIO_F_IN_ORDER and 
> VIRTIO_F_NOTIFICATION_DATA feature
>
>
> ---
>  hw/net/vhost_net.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c index 
> e8e1661646..211ca859a6 100644
> --- a/hw/net/vhost_net.c
> +++ b/hw/net/vhost_net.c
> @@ -76,6 +76,8 @@ static const int user_feature_bits[] = {
>  VIRTIO_F_IOMMU_PLATFORM,
>  VIRTIO_F_RING_PACKED,
>  VIRTIO_F_RING_RESET,
> +VIRTIO_F_IN_ORDER,
> +VIRTIO_F_NOTIFICATION_DATA,
>  VIRTIO_NET_F_RSS,
>  VIRTIO_NET_F_HASH_REPORT,
>  VIRTIO_NET_F_GUEST_USO4,
> --
>
> -Original Message-
> From: Wentao Jia
> Sent: Tuesday, January 2, 2024 1:38 PM
> To: Jason Wang 
> Cc: m...@redhat.com; Rick Zhong 
> Subject: RE: FW: [PATCH] vhost-user: add VIRTIO_F_IN_ORDER and 
> VIRTIO_F_NOTIFICATION_DATA feature
>
> Hi, Jason
>
> It is good just change feature bits, I will commit a new patch, thanks
>
> Wentao Jia
>
> -Original Message-
> From: Jason Wang 
> Sent: Tuesday, January 2, 2024 11:24 AM
> To: Wentao Jia 
> Cc: m...@redhat.com; Rick Zhong 
> Subject: Re: FW: [PATCH] vhost-user: add VIRTIO_F_IN_ORDER and 
> VIRTIO_F_NOTIFICATION_DATA feature
>
> On Tue, Jan 2, 2024 at 10:26 AM Wentao Jia  wrote:
> >
> > Hi, Michael  and Jason
> >
> >
> >
> > please review the patch at your convenience, thank you
> >
> > vhost-user: add VIRTIO_F_IN_ORDER and VIRTIO_F_NOTIFICATION_DATA 
> > feature - Patchwork (kernel.org)
> >
> >
> >
> > Wentao Jia
> >
> >
> >
> > From: Wentao Jia
> > Sent: Friday, December 1, 2023 6:11 PM
> > To: qemu-devel@nongnu.org
> > Subject: [PATCH] vhost-user: add VIRTIO_F_IN_ORDER and 
> > VIRTIO_F_NOTIFICATION_DATA feature
> >
> >
> >
> > VIRTIO_F_IN_ORDER and VIRTIO_F_NOTIFICATION_DATA feature are 
> > important feature
> >
> > for dpdk vdpa packets transmitting performance, add the 2 features 
> > at vhost-user
> >
> > front-end to negotiation with backend.
> >
> >
> >
> > Signed-off-by: Kyle Xu zhenbing...@corigine.com
> >
> > Signed-off-by: Wentao Jia wentao@corigine.com
> >
> > Reviewed-by:   Xinying Yu xinying...@corigine.com
> >
> > Reviewed-by:   Shujing Dong shujing.d...@corigine.com
> >
> > Reviewed-by:   Rick Zhong zhaoyong.zh...@corigine.com
> >
> > ---
> >
> > hw/net/vhost_net.c | 2 ++
> >
> > include/hw/virtio/virtio.h | 4 
> >
> > 2 files changed, 6 insertions(+)
> >
> >
> >
> > diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
> >
> > index e8e1661646..211ca859a6 100644
> >
> > --- a/hw/net/vhost_net.c
> >
> > +++ b/hw/net/vhost_net.c
> >
> > @@ -76,6 +76,8 @@ static const int user_feature_bits[] = {
> >
> >  VIRTIO_F_IOMMU_PLATFORM,
> >
> >  VIRTIO_F_RING_PACKED,
> >
> >  VIRTIO_F_RING_RESET,
> >
> > +VIRTIO_F_IN_ORDER,
> >
> > +VIRTIO_F_NOTIFICATION_DATA,
> >
> >  VIRTIO_NET_F_RSS,
> >
> >  VIRTIO_NET_F_HASH_REPORT,
> >
> >  VIRTIO_NET_F_GUEST_USO4,
> >
> > diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
> >
> > index c8f72850bc..3880b6764c 100644
> >
> > --- a/include/hw/virtio/virtio.h
> >
> > +++ b/include/hw/virtio/virtio.h
> >
> > @@ -369,6 +369,10 @@ typedef struct VirtIORNGConf VirtIORNGConf;
> >
> >VIRTIO_F_RING_PACKED, false), \
> >
> >  DEFINE_PROP_BIT64("queue_reset", _state, _field, \
> >
> >VIRTIO_F_RING_RESET, true)
> >
> > +DEFINE_PROP_BIT64("notification_data", _state, _field, \
> >
> > +  VIRTIO_F_NOTIFICATION_DATA, true), \
> >
> > +DEFINE_PROP_BIT64("in_order", _state, _field, \
> >
> > +  VIRTIO_F_IN_ORDER, true)
>
> Do we want compatibility support for those?
>
> Thanks
>
> >
> >
> >
> > hwaddr virtio_queue_get_desc_addr(VirtIODevice *vdev, int n);
> >
> > bool virtio_queue_enabled_legacy(VirtIODevice *vdev, int n);
> >
> > --
>



[PATCH v2] configure: Add linux header compile support for LoongArch

2024-01-15 Thread Bibo Mao
When compiling qemu with system KVM mode for LoongArch, the header files
in the directory linux-headers/asm-loongarch should be used first.
Otherwise the build fails to find kvm.h on systems with an old glibc,
since the latest kernel header files are not installed.

This patch adds a linux_arch definition for LoongArch systems so that
the header files in linux-headers/asm-loongarch can be included.

Fixes: 714b03c125 ("target/loongarch: Add loongarch kvm into meson build")
Signed-off-by: Bibo Mao 
Reviewed-by: Philippe Mathieu-Daudé 
---
Changes in v2:
 1. Add Fixes label for commit 714b03c125

---
 configure | 1 +
 1 file changed, 1 insertion(+)

diff --git a/configure b/configure
index 21ab9a64e9..3d8e24ae01 100755
--- a/configure
+++ b/configure
@@ -445,6 +445,7 @@ case "$cpu" in
   loongarch*)
 cpu=loongarch64
 host_arch=loongarch64
+linux_arch=loongarch
 ;;
 
   mips64*)

base-commit: 977542ded7e6b28d2bc077bcda24568c716e393c
-- 
2.39.3




[PATCH v2 2/2] hw/riscv/virt-acpi-build.c: Generate SPCR table

2024-01-15 Thread Sia Jee Heng
Generate the Serial Port Console Redirection Table (SPCR) for the RISC-V
virtual machine.

Signed-off-by: Sia Jee Heng 
Reviewed-by: Daniel Henrique Barboza 
---
 hw/riscv/virt-acpi-build.c | 39 ++
 1 file changed, 39 insertions(+)

diff --git a/hw/riscv/virt-acpi-build.c b/hw/riscv/virt-acpi-build.c
index 26c7e4482d..7fc5071c84 100644
--- a/hw/riscv/virt-acpi-build.c
+++ b/hw/riscv/virt-acpi-build.c
@@ -174,6 +174,42 @@ acpi_dsdt_add_uart(Aml *scope, const MemMapEntry 
*uart_memmap,
 aml_append(scope, dev);
 }
 
+/*
+ * Serial Port Console Redirection Table (SPCR)
+ * Rev: 1.07
+ */
+
+static void
+spcr_setup(GArray *table_data, BIOSLinker *linker, RISCVVirtState *s)
+{
+AcpiSpcrData serial = {
+.interface_type = 0,   /* 16550 compatible */
+.base_addr.id = AML_AS_SYSTEM_MEMORY,
+.base_addr.width = 32,
+.base_addr.offset = 0,
+.base_addr.size = 1,
+.base_addr.addr = s->memmap[VIRT_UART0].base,
+.interrupt_type = (1 << 4),/* Bit[4] RISC-V PLIC/APLIC */
+.pc_interrupt = 0,
+.interrupt = UART0_IRQ,
+.baud_rate = 7,/* 15200 */
+.parity = 0,
+.stop_bits = 1,
+.flow_control = 0,
+.terminal_type = 3,/* ANSI */
+.language = 0, /* Language */
+.pci_device_id = 0x,   /* not a PCI device*/
+.pci_vendor_id = 0x,   /* not a PCI device*/
+.pci_bus = 0,
+.pci_device = 0,
+.pci_function = 0,
+.pci_flags = 0,
+.pci_segment = 0,
+};
+
+build_spcr(table_data, linker, , 2, s->oem_id, s->oem_table_id);
+}
+
 /* RHCT Node[N] starts at offset 56 */
 #define RHCT_NODE_ARRAY_OFFSET 56
 
@@ -555,6 +591,9 @@ static void virt_acpi_build(RISCVVirtState *s, 
AcpiBuildTables *tables)
 acpi_add_table(table_offsets, tables_blob);
 build_rhct(tables_blob, tables->linker, s);
 
+acpi_add_table(table_offsets, tables_blob);
+spcr_setup(tables_blob, tables->linker, s);
+
 acpi_add_table(table_offsets, tables_blob);
 {
 AcpiMcfgInfo mcfg = {
-- 
2.34.1




[PATCH v2 1/2] hw/arm/virt-acpi-build.c: Migrate SPCR creation to common location

2024-01-15 Thread Sia Jee Heng
RISC-V should also generate the SPCR in a manner similar to ARM.
Therefore, instead of replicating the code, relocate this function
to the common AML build.

Signed-off-by: Sia Jee Heng 
---
 hw/acpi/aml-build.c | 51 
 hw/arm/virt-acpi-build.c| 68 +++--
 include/hw/acpi/acpi-defs.h | 33 ++
 include/hw/acpi/aml-build.h |  4 +++
 4 files changed, 115 insertions(+), 41 deletions(-)

diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
index af66bde0f5..f3904650e4 100644
--- a/hw/acpi/aml-build.c
+++ b/hw/acpi/aml-build.c
@@ -1994,6 +1994,57 @@ static void build_processor_hierarchy_node(GArray *tbl, 
uint32_t flags,
 }
 }
 
+void build_spcr(GArray *table_data, BIOSLinker *linker,
+const AcpiSpcrData *f, const uint8_t rev,
+const char *oem_id, const char *oem_table_id)
+{
+AcpiTable table = { .sig = "SPCR", .rev = rev, .oem_id = oem_id,
+.oem_table_id = oem_table_id };
+
+acpi_table_begin(, table_data);
+/* Interface type */
+build_append_int_noprefix(table_data, f->interface_type, 1);
+/* Reserved */
+build_append_int_noprefix(table_data, 0, 3);
+/* Base Address */
+build_append_gas(table_data, f->base_addr.id, f->base_addr.width,
+ f->base_addr.offset, f->base_addr.size,
+ f->base_addr.addr);
+/* Interrupt type */
+build_append_int_noprefix(table_data, f->interrupt_type, 1);
+/* IRQ */
+build_append_int_noprefix(table_data, f->pc_interrupt, 1);
+/* Global System Interrupt */
+build_append_int_noprefix(table_data, f->interrupt, 4);
+/* Baud Rate */
+build_append_int_noprefix(table_data, f->baud_rate, 1);
+/* Parity */
+build_append_int_noprefix(table_data, f->parity, 1);
+/* Stop Bits */
+build_append_int_noprefix(table_data, f->stop_bits, 1);
+/* Flow Control */
+build_append_int_noprefix(table_data, f->flow_control, 1);
+/* Terminal Type */
+build_append_int_noprefix(table_data, f->terminal_type, 1);
+/* PCI Device ID  */
+build_append_int_noprefix(table_data, f->pci_device_id, 2);
+/* PCI Vendor ID */
+build_append_int_noprefix(table_data, f->pci_vendor_id, 2);
+/* PCI Bus Number */
+build_append_int_noprefix(table_data, f->pci_bus, 1);
+/* PCI Device Number */
+build_append_int_noprefix(table_data, f->pci_device, 1);
+/* PCI Function Number */
+build_append_int_noprefix(table_data, f->pci_function, 1);
+/* PCI Flags */
+build_append_int_noprefix(table_data, f->pci_flags, 4);
+/* PCI Segment */
+build_append_int_noprefix(table_data, f->pci_segment, 1);
+/* Reserved */
+build_append_int_noprefix(table_data, 0, 4);
+
+acpi_table_end(linker, );
+}
 /*
  * ACPI spec, Revision 6.3
  * 5.2.29 Processor Properties Topology Table (PPTT)
diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index a22a2f43a5..195767c0f0 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -431,48 +431,34 @@ build_iort(GArray *table_data, BIOSLinker *linker, 
VirtMachineState *vms)
  * Rev: 1.07
  */
 static void
-build_spcr(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
+spcr_setup(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
 {
-AcpiTable table = { .sig = "SPCR", .rev = 2, .oem_id = vms->oem_id,
-.oem_table_id = vms->oem_table_id };
-
-acpi_table_begin(, table_data);
-
-/* Interface Type */
-build_append_int_noprefix(table_data, 3, 1); /* ARM PL011 UART */
-build_append_int_noprefix(table_data, 0, 3); /* Reserved */
-/* Base Address */
-build_append_gas(table_data, AML_AS_SYSTEM_MEMORY, 32, 0, 3,
- vms->memmap[VIRT_UART].base);
-/* Interrupt Type */
-build_append_int_noprefix(table_data,
-(1 << 3) /* Bit[3] ARMH GIC interrupt */, 1);
-build_append_int_noprefix(table_data, 0, 1); /* IRQ */
-/* Global System Interrupt */
-build_append_int_noprefix(table_data,
-  vms->irqmap[VIRT_UART] + ARM_SPI_BASE, 4);
-build_append_int_noprefix(table_data, 3 /* 9600 */, 1); /* Baud Rate */
-build_append_int_noprefix(table_data, 0 /* No Parity */, 1); /* Parity */
-/* Stop Bits */
-build_append_int_noprefix(table_data, 1 /* 1 Stop bit */, 1);
-/* Flow Control */
-build_append_int_noprefix(table_data,
-(1 << 1) /* RTS/CTS hardware flow control */, 1);
-/* Terminal Type */
-build_append_int_noprefix(table_data, 0 /* VT100 */, 1);
-build_append_int_noprefix(table_data, 0, 1); /* Language */
-/* PCI Device ID  */
-build_append_int_noprefix(table_data, 0x /* not a PCI device*/, 2);
-/* PCI Vendor ID */
-build_append_int_noprefix(table_data, 0x /* not a PCI device*/, 2);
-build_append_int_noprefix(table_data, 0, 1); /* PCI Bus Number */
-

[PATCH v2 0/2] RISC-V: ACPI: Enable SPCR

2024-01-15 Thread Sia Jee Heng
This series focuses on enabling the Serial Port Console Redirection (SPCR)
table for the RISC-V virt platform. Considering that ARM utilizes the same
function, the initial patch involves migrating the build_spcr function to
common code. This consolidation ensures that RISC-V avoids duplicating the
function.

The patch set is built upon Alistair's riscv-to-apply.next branch

Changes in v2:
- Renamed the build_spcr_rev2() function to spcr_setup().
- SPCR table version is passed from spcr_setup() to the common
  build_spcr() function.
- Added "Reviewed-by" from Daniel for patch 2.
- The term 'RFC' has been removed from this series, as the dependency code
  from [1] has been merged into Alistair's riscv-to-apply.next branch. The
  first version of this series can be found at [2].

[1] 
https://lore.kernel.org/qemu-devel/20231218150247.466427-1-suni...@ventanamicro.com/
[2] 
https://lore.kernel.org/qemu-devel/20240105090608.5745-1-jeeheng@starfivetech.com/

Sia Jee Heng (2):
  hw/arm/virt-acpi-build.c: Migrate SPCR creation to common location
  hw/riscv/virt-acpi-build.c: Generate SPCR table

 hw/acpi/aml-build.c | 51 
 hw/arm/virt-acpi-build.c| 68 +++--
 hw/riscv/virt-acpi-build.c  | 39 +
 include/hw/acpi/acpi-defs.h | 33 ++
 include/hw/acpi/aml-build.h |  4 +++
 5 files changed, 154 insertions(+), 41 deletions(-)

-- 
2.34.1




Re: [PATCH 0/6] net: fix non-deterministic failures of the 'netdev-socket' qtest

2024-01-15 Thread Jason Wang
On Mon, Jan 15, 2024 at 6:19 PM Peter Maydell  wrote:
>
> On Mon, 15 Jan 2024 at 02:37, Jason Wang  wrote:
> >
> > On Fri, Jan 5, 2024 at 12:45 AM Stefan Hajnoczi  wrote:
> > >
> > > On Thu, 4 Jan 2024 at 11:30, Daniel P. Berrangé  
> > > wrote:
> > > >
> > > > We've previously bumped up the timeouts in the netdev-socket qtest
> > > > to supposedly fix non-deterministic failures, however, the failures
> > > > are still hitting CI.
> > > >
> > > > A simple 'listen()' and 'connect()' pairing across 2 QEMU processes
> > > > should be very quick to execute, even under high system load, so it
> > > > was never likely that the test was failing due to timeouts being
> > > > reached.
> > > >
> > > > The actual root cause was a race condition in the test design. It
> > > > was spawning a QEMU with a 'server' netdev, and then spawning one
> > > > with the 'client' netdev. There was insufficient synchronization,
> > > > however, so it was possible for the 2nd QEMU process to attempt
> > > > to 'connect()' before the 'listen()' call was made by the 1st QEMU.
> > > >
> > > > In the test scenarios that did not use the 'reconnect' flag, this
> > > > would result in the client QEMU never getting into the expected
> > > > state. The test code would thus loop on 'info network' until
> > > > hitting the maximum wait time.
> > > >
> > > > This series reverts the increased timeouts, and fixes synchronization
> > > > in the test scenarios. It also improves reporting of errors in the
> > > > socket netdev backend so that 'info network' reports what actually
> > > > went wrong rather than a useless generic 'connection error' string.
> > > > This will help us diagnose any future CI problems, should they occurr.
> > > >
> > > > Daniel P. Berrangé (6):
> > > >   Revert "netdev: set timeout depending on loadavg"
> > > >   Revert "osdep: add getloadavg"
> > > >   Revert "tests/qtest/netdev-socket: Raise connection timeout to 120
> > > > seconds"
> > > >   net: add explicit info about connecting/listening state
> > > >   net: handle QIOTask completion to report useful error message
> > > >   qtest: ensure netdev-socket tests have non-overlapping names
> > > >
> > > >  include/qemu/osdep.h| 10 -
> > > >  meson.build |  1 -
> > > >  net/stream.c| 18 +++-
> > > >  tests/qtest/netdev-socket.c | 42 +++--
> > > >  4 files changed, 21 insertions(+), 50 deletions(-)
> > >
> > > Reviewed-by: Stefan Hajnoczi 
> > >
> >
> > Queued.
>
> These are already upstream, via thuth's testing queue.
>
> thanks
> -- PMM
>

Great, so I dropped this.

Thanks
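
For readers following the quoted cover letter, a minimal sketch of the
synchronization pattern it describes. This is not the upstream patch: the
wait_for_listen() helper below is hypothetical, the exact "info network"
wording is only assumed to contain "listen", and the fixed port is for
illustration only.

#include "qemu/osdep.h"
#include "libqtest.h"

/* Hypothetical helper: poll "info network" on the server QEMU until the
 * stream backend reports a listening socket, so that the client's connect()
 * can no longer race ahead of the server's listen(). */
static void wait_for_listen(QTestState *qts)
{
    for (;;) {
        char *out = qtest_hmp(qts, "info network");
        bool ready = strstr(out, "listen") != NULL;
        g_free(out);
        if (ready) {
            return;
        }
        g_usleep(10 * 1000);
    }
}

static void test_stream_server_then_client(void)
{
    QTestState *server = qtest_init(
        "-nodefaults -M none "
        "-netdev stream,id=st0,server=on,addr.type=inet,"
        "addr.host=127.0.0.1,addr.port=12345");

    wait_for_listen(server);   /* listen() has definitely happened by now */

    QTestState *client = qtest_init(
        "-nodefaults -M none "
        "-netdev stream,id=st0,server=off,addr.type=inet,"
        "addr.host=127.0.0.1,addr.port=12345");

    qtest_quit(client);
    qtest_quit(server);
}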




[PATCH 2/2] tests/tcg: Add the syscall catchpoint gdbstub test

2024-01-15 Thread Ilya Leoshkevich
Check that adding/removing syscall catchpoints works.

Signed-off-by: Ilya Leoshkevich 
---
 tests/tcg/multiarch/Makefile.target   | 10 +++-
 tests/tcg/multiarch/catch-syscalls.c  | 51 ++
 tests/tcg/multiarch/gdbstub/catch-syscalls.py | 52 +++
 3 files changed, 112 insertions(+), 1 deletion(-)
 create mode 100644 tests/tcg/multiarch/catch-syscalls.c
 create mode 100644 tests/tcg/multiarch/gdbstub/catch-syscalls.py

diff --git a/tests/tcg/multiarch/Makefile.target 
b/tests/tcg/multiarch/Makefile.target
index 315a2e13588..e10951a8016 100644
--- a/tests/tcg/multiarch/Makefile.target
+++ b/tests/tcg/multiarch/Makefile.target
@@ -108,13 +108,21 @@ run-gdbstub-prot-none: prot-none
--bin $< --test $(MULTIARCH_SRC)/gdbstub/prot-none.py, \
accessing PROT_NONE memory)
 
+run-gdbstub-catch-syscalls: catch-syscalls
+   $(call run-test, $@, $(GDB_SCRIPT) \
+   --gdb $(GDB) \
+   --qemu $(QEMU) --qargs "$(QEMU_OPTS)" \
+   --bin $< --test $(MULTIARCH_SRC)/gdbstub/catch-syscalls.py, \
+   hitting a syscall catchpoint)
+
 else
 run-gdbstub-%:
$(call skip-test, "gdbstub test $*", "need working gdb with $(patsubst 
-%,,$(TARGET_NAME)) support")
 endif
 EXTRA_RUNS += run-gdbstub-sha1 run-gdbstub-qxfer-auxv-read \
  run-gdbstub-proc-mappings run-gdbstub-thread-breakpoint \
- run-gdbstub-registers run-gdbstub-prot-none
+ run-gdbstub-registers run-gdbstub-prot-none \
+ run-gdbstub-catch-syscalls
 
 # ARM Compatible Semi Hosting Tests
 #
diff --git a/tests/tcg/multiarch/catch-syscalls.c 
b/tests/tcg/multiarch/catch-syscalls.c
new file mode 100644
index 000..d1ff1936a7a
--- /dev/null
+++ b/tests/tcg/multiarch/catch-syscalls.c
@@ -0,0 +1,51 @@
+/*
+ * Test GDB syscall catchpoints.
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+#define _GNU_SOURCE
+#include 
+#include 
+
+const char *catch_syscalls_state = "start";
+
+void end_of_main(void)
+{
+}
+
+int main(void)
+{
+int ret = EXIT_FAILURE;
+char c0 = 'A', c1;
+int fd[2];
+
+catch_syscalls_state = "pipe2";
+if (pipe2(fd, 0)) {
+goto out;
+}
+
+catch_syscalls_state = "write";
+if (write(fd[1], , sizeof(c0)) != sizeof(c0)) {
+goto out_close;
+}
+
+catch_syscalls_state = "read";
+if (read(fd[0], , sizeof(c1)) != sizeof(c1)) {
+goto out_close;
+}
+
+catch_syscalls_state = "check";
+if (c0 == c1) {
+ret = EXIT_SUCCESS;
+}
+
+out_close:
+catch_syscalls_state = "close";
+close(fd[0]);
+close(fd[1]);
+
+out:
+catch_syscalls_state = "end";
+end_of_main();
+return ret;
+}
diff --git a/tests/tcg/multiarch/gdbstub/catch-syscalls.py 
b/tests/tcg/multiarch/gdbstub/catch-syscalls.py
new file mode 100644
index 000..8bab12537fc
--- /dev/null
+++ b/tests/tcg/multiarch/gdbstub/catch-syscalls.py
@@ -0,0 +1,52 @@
+"""Test GDB syscall catchpoints.
+
+SPDX-License-Identifier: GPL-2.0-or-later
+"""
+from test_gdbstub import main, report
+
+
+def check_state(expected):
+"""Check the catch_syscalls_state value"""
+actual = gdb.parse_and_eval("catch_syscalls_state").string()
+report(actual == expected, "{} == {}".format(actual, expected))
+
+
+def run_test():
+"""Run through the tests one by one"""
+gdb.Breakpoint("main")
+gdb.execute("continue")
+
+# Check that GDB stops for pipe2/read calls/returns, but not for write.
+gdb.execute("delete")
+try:
+gdb.execute("catch syscall pipe2 read")
+except gdb.error as exc:
+exc_str = str(exc)
+if "not supported on this architecture" in exc_str:
+print("SKIP: {}".format(exc_str))
+return
+for _ in range(2):
+gdb.execute("continue")
+check_state("pipe2")
+for _ in range(2):
+gdb.execute("continue")
+check_state("read")
+
+# Check that deletion works.
+gdb.execute("delete")
+gdb.Breakpoint("end_of_main")
+gdb.execute("continue")
+check_state("end")
+
+# Check that catch-all works (libc should at least call exit).
+gdb.execute("delete")
+gdb.execute("catch syscall")
+gdb.execute("continue")
+gdb.execute("delete")
+gdb.execute("continue")
+
+exitcode = int(gdb.parse_and_eval("$_exitcode"))
+report(exitcode == 0, "{} == 0".format(exitcode))
+
+
+main(run_test)
-- 
2.43.0




[PATCH 0/2] gdbstub: Implement catching syscalls

2024-01-15 Thread Ilya Leoshkevich
Based-on: <20240116003551.75168-1-...@linux.ibm.com>
([PATCH v3 0/3] linux-user: Allow gdbstub to ignore page protection)

Hi,

I noticed that GDB's "catch syscall" does not work with qemu-user.
This series adds the missing bits in [1/2] and a test in [2/2].
I'm basing this on my other series, since it contains useful gdbstub
test refactorings.

Best regards,
Ilya

Ilya Leoshkevich (2):
  gdbstub: Implement catching syscalls
  tests/tcg: Add the syscall catchpoint gdbstub test

 gdbstub/gdbstub.c | 11 +++-
 gdbstub/internals.h   | 16 ++
 gdbstub/system.c  |  1 +
 gdbstub/user-target.c | 39 ++
 gdbstub/user.c| 51 +-
 include/gdbstub/user.h| 29 ++-
 include/user/syscall-trace.h  |  7 ++-
 tests/tcg/multiarch/Makefile.target   | 10 +++-
 tests/tcg/multiarch/catch-syscalls.c  | 51 ++
 tests/tcg/multiarch/gdbstub/catch-syscalls.py | 52 +++
 10 files changed, 260 insertions(+), 7 deletions(-)
 create mode 100644 tests/tcg/multiarch/catch-syscalls.c
 create mode 100644 tests/tcg/multiarch/gdbstub/catch-syscalls.py

-- 
2.43.0




[PATCH 1/2] gdbstub: Implement catching syscalls

2024-01-15 Thread Ilya Leoshkevich
GDB supports stopping on syscall entry and exit using the "catch
syscall" command. It relies on 3 packets, which are currently not
supported by QEMU:

* qSupported:QCatchSyscalls+ [1]
* QCatchSyscalls: [2]
* T05syscall_entry: and T05syscall_return: [3]

Implement generation and handling of these packets.

[1] 
https://sourceware.org/gdb/current/onlinedocs/gdb.html/General-Query-Packets.html#qSupported
[2] 
https://sourceware.org/gdb/current/onlinedocs/gdb.html/General-Query-Packets.html#QCatchSyscalls
[3] 
https://sourceware.org/gdb/current/onlinedocs/gdb.html/Stop-Reply-Packets.html

Signed-off-by: Ilya Leoshkevich 
---
 gdbstub/gdbstub.c| 11 +++-
 gdbstub/internals.h  | 16 +++
 gdbstub/system.c |  1 +
 gdbstub/user-target.c| 39 +++
 gdbstub/user.c   | 51 +++-
 include/gdbstub/user.h   | 29 ++--
 include/user/syscall-trace.h |  7 +++--
 7 files changed, 148 insertions(+), 6 deletions(-)

diff --git a/gdbstub/gdbstub.c b/gdbstub/gdbstub.c
index 46d752bbc2c..7faf19508d1 100644
--- a/gdbstub/gdbstub.c
+++ b/gdbstub/gdbstub.c
@@ -1618,7 +1618,8 @@ static void handle_query_supported(GArray *params, void 
*user_ctx)
 g_string_append(gdbserver_state.str_buf, ";qXfer:auxv:read+");
 }
 #endif
-g_string_append(gdbserver_state.str_buf, ";qXfer:exec-file:read+");
+g_string_append(gdbserver_state.str_buf, ";qXfer:exec-file:read+"
+ ";QCatchSyscalls+");
 #endif
 
 if (params->len &&
@@ -1810,6 +1811,14 @@ static const GdbCmdParseEntry gdb_gen_set_table[] = {
 .schema = "l0"
 },
 #endif
+#if defined(CONFIG_USER_ONLY)
+{
+.handler = gdb_handle_set_catch_syscalls,
+.cmd = "CatchSyscalls:",
+.cmd_startswith = 1,
+.schema = "s0",
+},
+#endif
 };
 
 static void handle_gen_query(GArray *params, void *user_ctx)
diff --git a/gdbstub/internals.h b/gdbstub/internals.h
index 5c0c725e54c..6e0905ca328 100644
--- a/gdbstub/internals.h
+++ b/gdbstub/internals.h
@@ -10,6 +10,7 @@
 #define GDBSTUB_INTERNALS_H
 
 #include "exec/cpu-common.h"
+#include "qemu/bitops.h"
 
 #define MAX_PACKET_LENGTH 4096
 
@@ -46,6 +47,14 @@ enum RSState {
 RS_CHKSUM2,
 };
 
+enum GDBCatchSyscallsState {
+GDB_CATCH_SYSCALLS_NONE,
+GDB_CATCH_SYSCALLS_ALL,
+GDB_CATCH_SYSCALLS_SELECTED,
+};
+#define GDB_NR_SYSCALLS 1024
+typedef unsigned long GDBSyscallsMask[BITS_TO_LONGS(GDB_NR_SYSCALLS)];
+
 typedef struct GDBState {
 bool init;   /* have we been initialised? */
 CPUState *c_cpu; /* current CPU for step/continue ops */
@@ -70,6 +79,12 @@ typedef struct GDBState {
  * Must be set off after sending the stop reply itself.
  */
 bool allow_stop_reply;
+/*
+ * Store syscalls mask without memory allocation in order to avoid
+ * implementing synchronization.
+ */
+enum GDBCatchSyscallsState catch_syscalls_state;
+GDBSyscallsMask catch_syscalls_mask;
 } GDBState;
 
 /* lives in main gdbstub.c */
@@ -194,6 +209,7 @@ void gdb_handle_v_file_close(GArray *params, void 
*user_ctx); /* user */
 void gdb_handle_v_file_pread(GArray *params, void *user_ctx); /* user */
 void gdb_handle_v_file_readlink(GArray *params, void *user_ctx); /* user */
 void gdb_handle_query_xfer_exec_file(GArray *params, void *user_ctx); /* user 
*/
+void gdb_handle_set_catch_syscalls(GArray *params, void *user_ctx); /* user */
 
 void gdb_handle_query_attached(GArray *params, void *user_ctx); /* both */
 
diff --git a/gdbstub/system.c b/gdbstub/system.c
index 83fd452800b..4c4bafd3bcc 100644
--- a/gdbstub/system.c
+++ b/gdbstub/system.c
@@ -44,6 +44,7 @@ static void reset_gdbserver_state(void)
 gdbserver_state.processes = NULL;
 gdbserver_state.process_num = 0;
 gdbserver_state.allow_stop_reply = false;
+gdbserver_state.catch_syscalls_state = GDB_CATCH_SYSCALLS_NONE;
 }
 
 /*
diff --git a/gdbstub/user-target.c b/gdbstub/user-target.c
index c4bba4c72c7..442d15e9473 100644
--- a/gdbstub/user-target.c
+++ b/gdbstub/user-target.c
@@ -9,6 +9,7 @@
 
 #include "qemu/osdep.h"
 #include "exec/gdbstub.h"
+#include "gdbstub/user.h"
 #include "qemu.h"
 #include "internals.h"
 #ifdef CONFIG_LINUX
@@ -418,3 +419,41 @@ void gdb_handle_query_xfer_exec_file(GArray *params, void 
*user_ctx)
 ts->bprm->filename + offset);
 gdb_put_strbuf();
 }
+
+static bool should_catch_syscall(int num)
+{
+switch (gdbserver_state.catch_syscalls_state) {
+case GDB_CATCH_SYSCALLS_NONE:
+return false;
+case GDB_CATCH_SYSCALLS_ALL:
+return true;
+case GDB_CATCH_SYSCALLS_SELECTED:
+if (num < 0 || num >= GDB_NR_SYSCALLS) {
+return false;
+} else {
+return test_bit(num, gdbserver_state.catch_syscalls_mask);
+}
+default:
+g_assert_not_reached();
+}
+}
+
+void gdb_syscall_entry(CPUState 

[PATCH v3 2/3] tests/tcg: Factor out gdbstub test functions

2024-01-15 Thread Ilya Leoshkevich
Both the report() function as well as the initial gdbstub test sequence
are copy-pasted into ~10 files with slight modifications. This
indicates that they are indeed generic, so factor them out. While
at it, add a few newlines to make the formatting closer to PEP-8.

Signed-off-by: Ilya Leoshkevich 
---
 tests/guest-debug/run-test.py |  7 ++-
 tests/guest-debug/test_gdbstub.py | 58 +++
 tests/tcg/aarch64/gdbstub/test-sve-ioctl.py   | 34 +--
 tests/tcg/aarch64/gdbstub/test-sve.py | 33 +--
 tests/tcg/multiarch/gdbstub/interrupt.py  | 47 ++-
 tests/tcg/multiarch/gdbstub/memory.py | 41 +
 tests/tcg/multiarch/gdbstub/registers.py  | 41 ++---
 tests/tcg/multiarch/gdbstub/sha1.py   | 40 ++---
 .../multiarch/gdbstub/test-proc-mappings.py   | 39 +
 .../multiarch/gdbstub/test-qxfer-auxv-read.py | 37 +---
 .../gdbstub/test-thread-breakpoint.py | 37 +---
 tests/tcg/s390x/gdbstub/test-signals-s390x.py | 42 +-
 tests/tcg/s390x/gdbstub/test-svc.py   | 39 +
 13 files changed, 98 insertions(+), 397 deletions(-)
 create mode 100644 tests/guest-debug/test_gdbstub.py

diff --git a/tests/guest-debug/run-test.py b/tests/guest-debug/run-test.py
index b13b27d4b19..368ff8a8903 100755
--- a/tests/guest-debug/run-test.py
+++ b/tests/guest-debug/run-test.py
@@ -97,7 +97,12 @@ def log(output, msg):
 sleep(1)
 log(output, "GDB CMD: %s" % (gdb_cmd))
 
-result = subprocess.call(gdb_cmd, shell=True, stdout=output, stderr=stderr)
+gdb_env = dict(os.environ)
+gdb_pythonpath = gdb_env.get("PYTHONPATH", "").split(os.pathsep)
+gdb_pythonpath.append(os.path.dirname(os.path.realpath(__file__)))
+gdb_env["PYTHONPATH"] = os.pathsep.join(gdb_pythonpath)
+result = subprocess.call(gdb_cmd, shell=True, stdout=output, stderr=stderr,
+ env=gdb_env)
 
 # A result of greater than 128 indicates a fatal signal (likely a
 # crash due to gdb internal failure). That's a problem for GDB and
diff --git a/tests/guest-debug/test_gdbstub.py 
b/tests/guest-debug/test_gdbstub.py
new file mode 100644
index 000..a71cdaa915a
--- /dev/null
+++ b/tests/guest-debug/test_gdbstub.py
@@ -0,0 +1,58 @@
+"""Helper functions for gdbstub testing
+
+"""
+from __future__ import print_function
+import gdb
+import sys
+import traceback
+
+fail_count = 0
+
+
+def report(cond, msg):
+"""Report success/fail of a test"""
+if cond:
+print("PASS: {}".format(msg))
+else:
+print("FAIL: {}".format(msg))
+global fail_count
+fail_count += 1
+
+
+def main(test, expected_arch=None):
+"""Run a test function
+
+This runs as the script it sourced (via -x, via run-test.py)."""
+try:
+inferior = gdb.selected_inferior()
+arch = inferior.architecture()
+print("ATTACHED: {}".format(arch.name()))
+if expected_arch is not None:
+report(arch.name() == expected_arch,
+   "connected to {}".format(expected_arch))
+except (gdb.error, AttributeError):
+print("SKIP: not connected")
+exit(0)
+
+if gdb.parse_and_eval("$pc") == 0:
+print("SKIP: PC not set")
+exit(0)
+
+try:
+test()
+except:
+print("GDB Exception:")
+traceback.print_exc(file=sys.stdout)
+global fail_count
+fail_count += 1
+import code
+code.InteractiveConsole(locals=globals()).interact()
+raise
+
+try:
+gdb.execute("kill")
+except gdb.error:
+pass
+
+print("All tests complete: {} failures".format(fail_count))
+exit(fail_count)
diff --git a/tests/tcg/aarch64/gdbstub/test-sve-ioctl.py 
b/tests/tcg/aarch64/gdbstub/test-sve-ioctl.py
index ee8d467e59d..a78a3a2514d 100644
--- a/tests/tcg/aarch64/gdbstub/test-sve-ioctl.py
+++ b/tests/tcg/aarch64/gdbstub/test-sve-ioctl.py
@@ -8,19 +8,10 @@
 #
 
 import gdb
-import sys
+from test_gdbstub import main, report
 
 initial_vlen = 0
-failcount = 0
 
-def report(cond, msg):
-"Report success/fail of test"
-if cond:
-print ("PASS: %s" % (msg))
-else:
-print ("FAIL: %s" % (msg))
-global failcount
-failcount += 1
 
 class TestBreakpoint(gdb.Breakpoint):
 def __init__(self, sym_name="__sve_ld_done"):
@@ -64,26 +55,5 @@ def run_test():
 
 gdb.execute("c")
 
-#
-# This runs as the script it sourced (via -x, via run-test.py)
-#
-try:
-inferior = gdb.selected_inferior()
-arch = inferior.architecture()
-report(arch.name() == "aarch64", "connected to aarch64")
-except (gdb.error, AttributeError):
-print("SKIPPING (not connected)", file=sys.stderr)
-exit(0)
-
-try:
-# Run the actual tests
-run_test()
-except:
-print ("GDB Exception: %s" % (sys.exc_info()[0]))
-failcount += 1
-import code
-

[PATCH v3 1/3] linux-user: Allow gdbstub to ignore page protection

2024-01-15 Thread Ilya Leoshkevich
gdbserver ignores page protection by virtue of using /proc/$pid/mem.
Teach qemu gdbstub to do this too. This will not work if /proc is not
mounted; accept this limitation.

One alternative is to temporarily grant the missing PROT_* bit, but
this is inherently racy. Another alternative is self-debugging with
ptrace(POKE), which will break if QEMU itself is being debugged - a
much more severe limitation.

Reviewed-by: Richard Henderson 
Signed-off-by: Ilya Leoshkevich 
---
 cpu-target.c | 76 +---
 1 file changed, 61 insertions(+), 15 deletions(-)

diff --git a/cpu-target.c b/cpu-target.c
index 5eecd7ea2d7..723f6af5fba 100644
--- a/cpu-target.c
+++ b/cpu-target.c
@@ -406,6 +406,9 @@ int cpu_memory_rw_debug(CPUState *cpu, vaddr addr,
 vaddr l, page;
 void * p;
 uint8_t *buf = ptr;
+ssize_t written;
+int ret = -1;
+int fd = -1;
 
 while (len > 0) {
 page = addr & TARGET_PAGE_MASK;
@@ -413,30 +416,73 @@ int cpu_memory_rw_debug(CPUState *cpu, vaddr addr,
 if (l > len)
 l = len;
 flags = page_get_flags(page);
-if (!(flags & PAGE_VALID))
-return -1;
+if (!(flags & PAGE_VALID)) {
+goto out_close;
+}
 if (is_write) {
-if (!(flags & PAGE_WRITE))
-return -1;
-/* XXX: this code should not depend on lock_user */
-if (!(p = lock_user(VERIFY_WRITE, addr, l, 0)))
-return -1;
-memcpy(p, buf, l);
-unlock_user(p, addr, l);
-} else {
-if (!(flags & PAGE_READ))
-return -1;
+if (flags & PAGE_WRITE) {
+/* XXX: this code should not depend on lock_user */
+p = lock_user(VERIFY_WRITE, addr, l, 0);
+if (!p) {
+goto out_close;
+}
+memcpy(p, buf, l);
+unlock_user(p, addr, l);
+} else {
+/* Bypass the host page protection using /proc/self/mem. */
+if (fd == -1) {
+fd = open("/proc/self/mem", O_WRONLY);
+if (fd == -1) {
+goto out;
+}
+}
+/*
+ * If there is a TranslationBlock and we weren't bypassing the
+ * host page protection, the memcpy() above would SEGV,
+ * ultimately leading to page_unprotect(). So invalidate the
+ * translations manually. Both invalidation and pwrite() must
+ * be under mmap_lock() in order to prevent the creation of
+ * another TranslationBlock in between.
+ */
+mmap_lock();
+tb_invalidate_phys_range(addr, addr + l - 1);
+written = pwrite(fd, buf, l, (off_t)g2h_untagged(addr));
+mmap_unlock();
+if (written != l) {
+goto out_close;
+}
+}
+} else if (flags & PAGE_READ) {
 /* XXX: this code should not depend on lock_user */
-if (!(p = lock_user(VERIFY_READ, addr, l, 1)))
-return -1;
+p = lock_user(VERIFY_READ, addr, l, 1);
+if (!p) {
+goto out_close;
+}
 memcpy(buf, p, l);
 unlock_user(p, addr, 0);
+} else {
+/* Bypass the host page protection using /proc/self/mem. */
+if (fd == -1) {
+fd = open("/proc/self/mem", O_RDONLY);
+if (fd == -1) {
+goto out;
+}
+}
+if (pread(fd, buf, l, (off_t)g2h_untagged(addr)) != l) {
+goto out_close;
+}
 }
 len -= l;
 buf += l;
 addr += l;
 }
-return 0;
+ret = 0;
+out_close:
+if (fd != -1) {
+close(fd);
+}
+out:
+return ret;
 }
 #endif
 
-- 
2.43.0




[PATCH v3 0/3] linux-user: Allow gdbstub to ignore page protection

2024-01-15 Thread Ilya Leoshkevich
v2: https://lists.gnu.org/archive/html/qemu-devel/2024-01/msg01592.html
v2 -> v3: Add Richard's R-b on [1/3].
  Fix printing the architecture name and the number of failures
  in test_gdbstub.py.
  Patches that need review: [2/3] and [3/3].

v1: https://lists.gnu.org/archive/html/qemu-devel/2024-01/msg01314.html
v1 -> v2: Use /proc/self/mem as a fallback. Handle TB invalidation
  (Richard).
  Test cross-page accesses.

RFC: https://lists.gnu.org/archive/html/qemu-devel/2023-12/msg02044.html
RFC -> v1: Use /proc/self/mem and accept that this will not work
   without /proc.
   Factor out a couple of functions for gdbstub testing.
   Add a test.

Hi,

I've noticed that gdbstub behaves differently from gdbserver in that it
doesn't allow reading non-readable pages. This series improves the
situation by using the same mechanism as gdbserver: /proc/self/mem.

Best regards,
Ilya

Ilya Leoshkevich (3):
  linux-user: Allow gdbstub to ignore page protection
  tests/tcg: Factor out gdbstub test functions
  tests/tcg: Add the PROT_NONE gdbstub test

 cpu-target.c  | 76 +++
 tests/guest-debug/run-test.py |  7 +-
 tests/guest-debug/test_gdbstub.py | 58 ++
 tests/tcg/aarch64/gdbstub/test-sve-ioctl.py   | 34 +
 tests/tcg/aarch64/gdbstub/test-sve.py | 33 +---
 tests/tcg/multiarch/Makefile.target   |  9 ++-
 tests/tcg/multiarch/gdbstub/interrupt.py  | 47 ++--
 tests/tcg/multiarch/gdbstub/memory.py | 41 +-
 tests/tcg/multiarch/gdbstub/prot-none.py  | 22 ++
 tests/tcg/multiarch/gdbstub/registers.py  | 41 ++
 tests/tcg/multiarch/gdbstub/sha1.py   | 40 ++
 .../multiarch/gdbstub/test-proc-mappings.py   | 39 +-
 .../multiarch/gdbstub/test-qxfer-auxv-read.py | 37 +
 .../gdbstub/test-thread-breakpoint.py | 37 +
 tests/tcg/multiarch/prot-none.c   | 40 ++
 tests/tcg/s390x/gdbstub/test-signals-s390x.py | 42 +-
 tests/tcg/s390x/gdbstub/test-svc.py   | 39 +-
 17 files changed, 229 insertions(+), 413 deletions(-)
 create mode 100644 tests/guest-debug/test_gdbstub.py
 create mode 100644 tests/tcg/multiarch/gdbstub/prot-none.py
 create mode 100644 tests/tcg/multiarch/prot-none.c

-- 
2.43.0




[PATCH v3 3/3] tests/tcg: Add the PROT_NONE gdbstub test

2024-01-15 Thread Ilya Leoshkevich
Make sure that qemu gdbstub, like gdbserver, allows reading from and
writing to PROT_NONE pages.

Signed-off-by: Ilya Leoshkevich 
---
 tests/tcg/multiarch/Makefile.target  |  9 +-
 tests/tcg/multiarch/gdbstub/prot-none.py | 22 +
 tests/tcg/multiarch/prot-none.c  | 40 
 3 files changed, 70 insertions(+), 1 deletion(-)
 create mode 100644 tests/tcg/multiarch/gdbstub/prot-none.py
 create mode 100644 tests/tcg/multiarch/prot-none.c

diff --git a/tests/tcg/multiarch/Makefile.target 
b/tests/tcg/multiarch/Makefile.target
index d31ba8d6ae4..315a2e13588 100644
--- a/tests/tcg/multiarch/Makefile.target
+++ b/tests/tcg/multiarch/Makefile.target
@@ -101,13 +101,20 @@ run-gdbstub-registers: sha512
--bin $< --test $(MULTIARCH_SRC)/gdbstub/registers.py, \
checking register enumeration)
 
+run-gdbstub-prot-none: prot-none
+   $(call run-test, $@, env PROT_NONE_PY=1 $(GDB_SCRIPT) \
+   --gdb $(GDB) \
+   --qemu $(QEMU) --qargs "$(QEMU_OPTS)" \
+   --bin $< --test $(MULTIARCH_SRC)/gdbstub/prot-none.py, \
+   accessing PROT_NONE memory)
+
 else
 run-gdbstub-%:
$(call skip-test, "gdbstub test $*", "need working gdb with $(patsubst 
-%,,$(TARGET_NAME)) support")
 endif
 EXTRA_RUNS += run-gdbstub-sha1 run-gdbstub-qxfer-auxv-read \
  run-gdbstub-proc-mappings run-gdbstub-thread-breakpoint \
- run-gdbstub-registers
+ run-gdbstub-registers run-gdbstub-prot-none
 
 # ARM Compatible Semi Hosting Tests
 #
diff --git a/tests/tcg/multiarch/gdbstub/prot-none.py 
b/tests/tcg/multiarch/gdbstub/prot-none.py
new file mode 100644
index 000..f1f1dd82cbe
--- /dev/null
+++ b/tests/tcg/multiarch/gdbstub/prot-none.py
@@ -0,0 +1,22 @@
+"""Test that GDB can access PROT_NONE pages.
+
+This runs as a sourced script (via -x, via run-test.py).
+
+SPDX-License-Identifier: GPL-2.0-or-later
+"""
+from test_gdbstub import main, report
+
+
+def run_test():
+    """Run through the tests one by one"""
+    gdb.Breakpoint("break_here")
+    gdb.execute("continue")
+    val = gdb.parse_and_eval("*(char[2] *)q").string()
+    report(val == "42", "{} == 42".format(val))
+    gdb.execute("set *(char[3] *)q = \"24\"")
+    gdb.execute("continue")
+    exitcode = int(gdb.parse_and_eval("$_exitcode"))
+    report(exitcode == 0, "{} == 0".format(exitcode))
+
+
+main(run_test)
diff --git a/tests/tcg/multiarch/prot-none.c b/tests/tcg/multiarch/prot-none.c
new file mode 100644
index 000..dc56aadb3c5
--- /dev/null
+++ b/tests/tcg/multiarch/prot-none.c
@@ -0,0 +1,40 @@
+/*
+ * Test that GDB can access PROT_NONE pages.
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+#include <assert.h>
+#include <stdlib.h>
+#include <string.h>
+#include <sys/mman.h>
+#include <unistd.h>
+
+void break_here(void *q)
+{
+}
+
+int main(void)
+{
+    long pagesize = sysconf(_SC_PAGESIZE);
+    void *p, *q;
+    int err;
+
+    p = mmap(NULL, pagesize * 2, PROT_READ | PROT_WRITE,
+             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
+    assert(p != MAP_FAILED);
+    q = p + pagesize - 1;
+    strcpy(q, "42");
+
+    err = mprotect(p, pagesize * 2, PROT_NONE);
+    assert(err == 0);
+
+    break_here(q);
+
+    err = mprotect(p, pagesize * 2, PROT_READ);
+    assert(err == 0);
+    if (getenv("PROT_NONE_PY")) {
+        assert(strcmp(q, "24") == 0);
+    }
+
+    return EXIT_SUCCESS;
+}
-- 
2.43.0




Re: [PATCH] target/i386/kvm: call kvm_put_vcpu_events() before kvm_put_nested_state()

2024-01-15 Thread Eiichi Tsukata
Ping.

> On Nov 8, 2023, at 10:12, Eiichi Tsukata  wrote:
> 
> Hi all, appreciate any comments or feedbacks on the patch.
> 
> Thanks,
> Eiichi
> 
>> On Nov 1, 2023, at 23:04, Vitaly Kuznetsov  wrote:
>> 
>> Eiichi Tsukata  writes:
>> 
>>> FYI: The EINVAL in vmx_set_nested_state() is caused by the following 
>>> condition:
>>> * vcpu->arch.hflags == 0
>>> * kvm_state->hdr.vmx.smm.flags == KVM_STATE_NESTED_SMM_VMXON
>> 
>> This is a weird state indeed,
>> 
>> 'vcpu->arch.hflags == 0' means we're not in SMM and not in guest mode
>> but kvm_state->hdr.vmx.smm.flags == KVM_STATE_NESTED_SMM_VMXON is a
>> reflection of vmx->nested.smm.vmxon (see
>> vmx_get_nested_state()). vmx->nested.smm.vmxon gets set (conditioally)
>> in vmx_enter_smm() and gets cleared in vmx_leave_smm() which means the
>> vCPU must be in SMM to have it set.
>> 
>> In case the vCPU is in SMM upon migration, HF_SMM_MASK must be set from
>> kvm_vcpu_ioctl_x86_set_vcpu_events() -> kvm_smm_changed(), but QEMU
>> calls kvm_put_nested_state() _before_
>> kvm_put_vcpu_events(). This can explain "vcpu->arch.hflags == 0".
>> 
>> Paolo, Max, any idea how this is supposed to work?
>> 
>> -- 
>> Vitaly
>> 
> 




Re: [PATCH v3 4/4] [NOT FOR MERGE] tests/qtest/migration: Adapt tests to use older QEMUs

2024-01-15 Thread Peter Xu
On Mon, Jan 15, 2024 at 10:45:33AM -0300, Fabiano Rosas wrote:
> > IMHO the n-1 tests are not for this.  The new FOO cap can only be enabled
> > in n+ versions anyway, so something like above should be covered by the
> > normal migration test that anyone would like to propose the new FOO cap.
> 
> You're being too generous in thinking new code will always restrict
> itself to implementing new functionality and never have a bug that
> affects a completly different part of the code. There could be an
> innocent refactoring along with cap FOO that breaks the migration only
> when FOO is enabled.

The question is: even if we run the cross-binary migration-test with the
current version ("n"), we can't detect such an issue, right?  Because AFAIU,
with that we need to let migration-test always understand QEMU versions, and
it should skip the new test that enables FOO for the cross-binary case, since
it would detect that the old binary doesn't support it.

> 
> But fine. We can't predict every scenario. Let's get this series out the
> door.
> 
> Thanks for the comments so far. I'll spin another version.

Yes, if you think that is a good starting point, we can start simple.
That's so far the only solution I can think of that has almost zero
maintenance burden for the tests while hopefully starting to cover some
spots for us.  That said, the discussion can keep going no matter what.

-- 
Peter Xu




Re: [RFC PATCH v3 00/30] migration: File based migration with multifd and fixed-ram

2024-01-15 Thread Peter Xu
On Mon, Jan 15, 2024 at 04:45:15PM -0300, Fabiano Rosas wrote:
> Peter Xu  writes:
> 
> > On Thu, Jan 11, 2024 at 03:38:31PM -0300, Fabiano Rosas wrote:
> >> Peter Xu  writes:
> >> 
> >> > On Mon, Nov 27, 2023 at 05:25:42PM -0300, Fabiano Rosas wrote:
> >> >> Hi,
> >> >> 
> >> >> In this v3:
> >> >> 
> >> >> Added support for the "file:/dev/fdset/" syntax to receive multiple
> >> >> file descriptors. This allows the management layer to open the
> >> >> migration file beforehand and pass the file descriptors to QEMU. We
> >> >> need more than one fd to be able to use O_DIRECT concurrently with
> >> >> unaligned writes.
> >> >> 
> >> >> Dropped the auto-pause capability. That discussion was kind of
> >> >> stuck. We can revisit optimizations for non-live scenarios once the
> >> >> series is more mature/merged.
> >> >> 
> >> >> Changed the multifd incoming side to use a more generic data structure
> >> >> instead of MultiFDPages_t. This allows multifd to restore the ram
> >> >> using larger chunks.
> >> >> 
> >> >> The rest are minor changes, I have noted them in the patches
> >> >> themselves.
> >> >
> >> > Fabiano,
> >> >
> >> > Could you always keep a section around in the cover letter (and also in
> >> > the upcoming doc file fixed-ram.rst) on the benefits of this feature?
> >> >
> >> > Please bear with me - I may start to ask silly questions.
> >> >
> >> 
> >> That's fine. Ask away!
> >> 
> >> > I thought it was about "keeping the snapshot file small".  But then when
> >> > I was thinking about the use case, IIUC fixed-ram migration should always
> >> > suggest that the user stop the VM before the migration starts, and if the
> >> > VM is stopped the resulting image shouldn't be large either.
> >> >
> >> > Or is it about performance only?  Where did I miss?
> >> 
> >> Performance is the main benefit because fixed-ram enables the use of
> >> multifd for file migration which would otherwise not be
> >> parallelizable. To use multifd has been the direction for a while as you
> >> know, so it makes sense.
> >> 
> >> A fast file migration is desirable because it could be used for
> >> snapshots with a stopped vm and also to replace the "exec:cat" hack
> >> (this last one I found out about recently, Juan mentioned it in this
> >> thread: https://lore.kernel.org/r/87cyx5ty26.fsf@secure.mitica).
> >
> > I digged again the history, and started to remember the "live" migration
> > case for fixed-ram. IIUC that is what Dan mentioned in below email
> > regarding to the "virDomainSnapshotXXX" use case:
> >
> > https://lore.kernel.org/all/zd7mrgq+4qsdb...@redhat.com/
> >
> > So IIUC "stopped VM" is not always the use case?
> >
> > If you agree with this, we need to document these two use cases clearly in
> > the doc update:
> >
> >   - "Migrate a VM to file, then destroy the VM"
> >
> > The documents should suggest stopping the VM before triggering such a
> > migration in this use case.
> >
> >   - "Take a live snapshot of the VM"
> >
> > It would be ideal if there were a portable interface to synchronously
> > track dirtying of guest pages, but we don't have one...
> >
> > So fixed-ram seems to be the portable solution for taking a live
> > snapshot across platforms, as long as async dirty tracking
> > is still supported on that OS (aka KVM_GET_DIRTY_LOG).  If async
> > tracking is not supported, the snapshot cannot be taken live on that
> > OS, and one needs to use "snapshot-save".
> >
> > For this one, IMHO it would be good to mention (from the QEMU perspective)
> > the existence of background-snapshot, even though libvirt didn't support
> > it for some reason.  Currently background-snapshot lacks multi-threading
> > (and O_DIRECT), though, so it may be less performant than
> > fixed-ram.  However, with all those features in place, I believe it
> > would be even more performant.  Please consider mentioning this in some
> > detail.
> >
> 
> I'll include these in some form in the docs update.

Thanks.

Fixed-ram will also need a separate file once the doc series is applied.
I'll try to prepare a pull request this week, so both fixed-ram and cpr
should hopefully have a place for their own files under docs/devel/migration/.

PS: just in case it doesn't land that soon, feel free to fetch migration-next
from my github.com/peterx/qemu repo; I only put things there if they at least
pass one round of CI, so the content should be relatively stable even
though not fully guaranteed.

> 
> >> 
> >> The size aspect is just an interesting property, not necessarily a
> >> reason.
> >
> > See above on the 2nd "live" use case of fixed-ram. I think in that case
> > size still matters, because that one cannot stop the VM vcpus.
> >
> >> It's about having the file bounded by the RAM size. So a running
> >> guest would not produce a continuously growing file. This is in contrast
> >> with previous experiments (libvirt code) in using a proxy to put
> >> 

Re: [PATCH 1/2] linux-user/riscv: vdso: fix call frame info in __vdso_rt_sigreturn

2024-01-15 Thread Richard Henderson

On 1/16/24 10:15, Vineet Gupta wrote:

When testing gcc testsuite against QEMU v8.2 we found some additional
failures vs. v8.1.2.

| FAIL: gcc.dg/cleanup-10.c execution test
| FAIL: gcc.dg/cleanup-11.c execution test
| FAIL: gcc.dg/cleanup-8.c execution test
| FAIL: gcc.dg/cleanup-9.c execution test

All of these tests involve unwinding off the signal stack, and v8.2 did
introduce a vdso with a sigreturn trampoline and associated unwinding
info. It seems that info is not correct, and making it similar to
the one in the Linux kernel fixes the above failures.


So.. you didn't actually determine what might be off in the unwind info?


+   .cfi_startproc
+   .cfi_signal_frame
+   raw_syscall __NR_rt_sigreturn
+   .cfi_endproc


No, this is wrong.  It indicates that the unwind info is present and trivial.


r~




[PATCH 2/2] linux-user/riscv: rebuild vdso binaries after prev fix

2024-01-15 Thread Vineet Gupta
Signed-off-by: Vineet Gupta 
---
Splitting this from the previous patch in case maintainers want to regenerate
the vdso at their end. Or, if they choose to, this can be squashed with the
previous change too.
---

Signed-off-by: Vineet Gupta 
---
 linux-user/riscv/vdso-32.so | Bin 2900 -> 2836 bytes
 linux-user/riscv/vdso-64.so | Bin 3856 -> 3792 bytes
 2 files changed, 0 insertions(+), 0 deletions(-)

diff --git a/linux-user/riscv/vdso-32.so b/linux-user/riscv/vdso-32.so
index 
1ad1e5c8b1fe36b0fe4bcb6c06fab8219ecd..ee158b8374d14973492a4d05f705bf12c6cf42b2
 100755
GIT binary patch
delta 563
zcmca2HbrcL0%OiZMRVRQEDUhqGci@34J5FIWnx#Ah{>aRivX+7%f)pfCby{;th~y{
zaAdLs;~pUuppponE+GaU1}z}n!7}+7qnyzipfCqiSQJRt0Qo#nz7LRIfFy4R}vipe{f!vIX7G3eSw)$}$z=?Tj+0$kO*!MMa>_3=0e<
zH!_M`_+QDw3tn?8^%WRtG<1Q`Hy1_&@ReEc5-qz?d{4+0>54v?O~4CVp}E{1v_
z>A(cxgXE{eq=2*_kiP&(BYS@fkdGXSCxHAZh~TnkxCRtJ4$oIWeg{GU!#^M
zla0z;OWoY{@)=YrCVR3PB7C^ni@l3+^CgaDjEotR1G$7J_i*vBf^1irJeNzI6%^q<
llO4H*C*R=`U}Tv5mP?)O02>2?Fymw?ZY8G6?33-d#Q-6rVafmi

delta 697
zcmbOtc13K00^^#Aisrl)tPF7AJTX9{TMW>6xcV8$Z
z{cy4b;~t?EEDQ`0KwUx%JPcYunum4rGe$Wh1E5Y0sIVx=P#~WN%J%`%DoFBnK)wX)
zW;3QdMo|Gqpp8I&29OQ_(o=x+pUFF!`uL_VDcG+Q+e<4eZ!C
zK*vo0VxGyeEHikVfUKg-;^Z<0MvKV@SxhH4GKx(WVco#EV)9W|{d!QaLI5b#5P*f@
z<9~i2R|4p35MTuIC4jU8Fg!s3#5V%cGnm0#Ai>4p3M3tvAbgN~G)xLe3j+CRKpHtd
zsu?(52!!_DFQ)0L5|=JKt2zW{1G4@Il>u|lLLw?eG;?v@)=YrHgm9dF>ap1
zv4oNF&169?;mJN+Jggw=R!olNQfCFFIp@iY+`^ORa0xKhOkN8VvtegoxCs_}!zI8Z
K$}#yLmlyyeGL7y4

diff --git a/linux-user/riscv/vdso-64.so b/linux-user/riscv/vdso-64.so
index 
83992bebe6d0182f24edfffc531015fd2f4e1cfb..f2e250fc6ca1bfb79bd7f350ae914fa7b366a1f5
 100755
GIT binary patch
delta 694
zcmbOrcR_Z72IGW@n$m7Mjr8r8x*)curh$b2iA$Vj)^$8=3hz_
z5MXOLa6sbt;}gq7+qw5{-olu}D76Nvjt8QgK?_I=G4L=PVV!KntmpLtD$N0v7KKXh
zK-1;}6~BX~-VQ2$fpzmH<|;<99w6Tah-UzC01%%5;tGbzPOM6f9zZ6{6#_uE4iGy)
zX&}V_5(5F|s-R#7#u`ROXN?vW(iLg;Sllk1nOgC`1pSZBandvf`Dw0I4nG2
zQaNA-7lQ)S!!Yss$qPBe>!$*_fT*L@WRcxCD
zxEC-nrcB<*tIq{<;TqP-uXyb^VZq`v*^tki(P45XpFP(D4zSe>!iqm&6$9S=k~gBFk$V$VVi8ltmjn#mF9p-i$bLx(6sqL
z#Z%DK+d;)6*fwusu3{7uVPs=>Q<@1Ed)kCp)n!)n5RzV6G4VvUPy?1e69+
z3^PCk5HME-1v4<#I50}{FdblHb!cI1Y+!7vX|8A~X)S2WY0v0L=}hQ~>5k|L=?&=f
z>Gzo6GSOj@&18!yCQ}Wj=}gy{p)ylpmdtF4IU;if=JCwuSm3hIX_151V!I_aORbh!
zEH_(WveIak!D_uVI%~DoX{=YnI%m$LfsI00EjnZ?Ov42%|&|FW6_qep3T3>yz4lL_18EOs|RSjYp_
zLBQmR?BbsMKsF=8$NxJ(0fPi2fNYR>12lkPQbu3~7lQ^PFCa)=Zpq&1cAC3
zCNuI1*Ms5}J%Xx$DnQ~e*T7<+4=Nsk7SVH{;^+xu1614xP5lw5I55FT1DS{@W=Kws
z2rzdwa@NadP^s9g!MTWW^98O2OpI?P7xL

[PATCH 1/2] linux-user/riscv: vdso: fix call frame info in __vdso_rt_sigreturn

2024-01-15 Thread Vineet Gupta
When testing gcc testsuite against QEMU v8.2 we found some additional
failures vs. v8.1.2.

| FAIL: gcc.dg/cleanup-10.c execution test
| FAIL: gcc.dg/cleanup-11.c execution test
| FAIL: gcc.dg/cleanup-8.c execution test
| FAIL: gcc.dg/cleanup-9.c execution test

All of these tests involve unwinding off the signal stack, and v8.2 did
introduce a vdso with a sigreturn trampoline and associated unwinding
info. It seems that info is not correct, and making it similar to
the one in the Linux kernel fixes the above failures.

Fixes: 468c1bb5cac9 ("linux-user/riscv: Add vdso")
Reported-by: Edwin Lu 
Signed-off-by: Vineet Gupta 
---
 linux-user/riscv/vdso.S | 87 ++---
 1 file changed, 4 insertions(+), 83 deletions(-)

diff --git a/linux-user/riscv/vdso.S b/linux-user/riscv/vdso.S
index a86d8fc488e0..20119010c11b 100644
--- a/linux-user/riscv/vdso.S
+++ b/linux-user/riscv/vdso.S
@@ -97,91 +97,12 @@ endf __vdso_flush_icache
  * trampoline, because the unwinder will assume we are returning
  * after a call site.
  */
-
-   .cfi_startproc simple
-   .cfi_signal_frame
-
-#define sizeof_reg (__riscv_xlen / 4)
-#define sizeof_freg8
-#define B_GR   (offsetof_uc_mcontext - sizeof_rt_sigframe)
-#define B_FR   (offsetof_uc_mcontext - sizeof_rt_sigframe + offsetof_freg0)
-
-   .cfi_def_cfa2, sizeof_rt_sigframe
-
-   /* Return address */
-   .cfi_return_column 64
-   .cfi_offset 64, B_GR + 0/* pc */
-
-   /* Integer registers */
-   .cfi_offset 1, B_GR + 1 * sizeof_reg/* r1 (ra) */
-   .cfi_offset 2, B_GR + 2 * sizeof_reg/* r2 (sp) */
-   .cfi_offset 3, B_GR + 3 * sizeof_reg
-   .cfi_offset 4, B_GR + 4 * sizeof_reg
-   .cfi_offset 5, B_GR + 5 * sizeof_reg
-   .cfi_offset 6, B_GR + 6 * sizeof_reg
-   .cfi_offset 7, B_GR + 7 * sizeof_reg
-   .cfi_offset 8, B_GR + 8 * sizeof_reg
-   .cfi_offset 9, B_GR + 9 * sizeof_reg
-   .cfi_offset 10, B_GR + 10 * sizeof_reg
-   .cfi_offset 11, B_GR + 11 * sizeof_reg
-   .cfi_offset 12, B_GR + 12 * sizeof_reg
-   .cfi_offset 13, B_GR + 13 * sizeof_reg
-   .cfi_offset 14, B_GR + 14 * sizeof_reg
-   .cfi_offset 15, B_GR + 15 * sizeof_reg
-   .cfi_offset 16, B_GR + 16 * sizeof_reg
-   .cfi_offset 17, B_GR + 17 * sizeof_reg
-   .cfi_offset 18, B_GR + 18 * sizeof_reg
-   .cfi_offset 19, B_GR + 19 * sizeof_reg
-   .cfi_offset 20, B_GR + 20 * sizeof_reg
-   .cfi_offset 21, B_GR + 21 * sizeof_reg
-   .cfi_offset 22, B_GR + 22 * sizeof_reg
-   .cfi_offset 23, B_GR + 23 * sizeof_reg
-   .cfi_offset 24, B_GR + 24 * sizeof_reg
-   .cfi_offset 25, B_GR + 25 * sizeof_reg
-   .cfi_offset 26, B_GR + 26 * sizeof_reg
-   .cfi_offset 27, B_GR + 27 * sizeof_reg
-   .cfi_offset 28, B_GR + 28 * sizeof_reg
-   .cfi_offset 29, B_GR + 29 * sizeof_reg
-   .cfi_offset 30, B_GR + 30 * sizeof_reg
-   .cfi_offset 31, B_GR + 31 * sizeof_reg  /* r31 */
-
-   .cfi_offset 32, B_FR + 0/* f0 */
-   .cfi_offset 33, B_FR + 1 * sizeof_freg  /* f1 */
-   .cfi_offset 34, B_FR + 2 * sizeof_freg
-   .cfi_offset 35, B_FR + 3 * sizeof_freg
-   .cfi_offset 36, B_FR + 4 * sizeof_freg
-   .cfi_offset 37, B_FR + 5 * sizeof_freg
-   .cfi_offset 38, B_FR + 6 * sizeof_freg
-   .cfi_offset 39, B_FR + 7 * sizeof_freg
-   .cfi_offset 40, B_FR + 8 * sizeof_freg
-   .cfi_offset 41, B_FR + 9 * sizeof_freg
-   .cfi_offset 42, B_FR + 10 * sizeof_freg
-   .cfi_offset 43, B_FR + 11 * sizeof_freg
-   .cfi_offset 44, B_FR + 12 * sizeof_freg
-   .cfi_offset 45, B_FR + 13 * sizeof_freg
-   .cfi_offset 46, B_FR + 14 * sizeof_freg
-   .cfi_offset 47, B_FR + 15 * sizeof_freg
-   .cfi_offset 48, B_FR + 16 * sizeof_freg
-   .cfi_offset 49, B_FR + 17 * sizeof_freg
-   .cfi_offset 50, B_FR + 18 * sizeof_freg
-   .cfi_offset 51, B_FR + 19 * sizeof_freg
-   .cfi_offset 52, B_FR + 20 * sizeof_freg
-   .cfi_offset 53, B_FR + 21 * sizeof_freg
-   .cfi_offset 54, B_FR + 22 * sizeof_freg
-   .cfi_offset 55, B_FR + 23 * sizeof_freg
-   .cfi_offset 56, B_FR + 24 * sizeof_freg
-   .cfi_offset 57, B_FR + 25 * sizeof_freg
-   .cfi_offset 58, B_FR + 26 * sizeof_freg
-   .cfi_offset 59, B_FR + 27 * sizeof_freg
-   .cfi_offset 60, B_FR + 28 * sizeof_freg
-   .cfi_offset 61, B_FR + 29 * sizeof_freg
-   .cfi_offset 62, B_FR + 30 * sizeof_freg
-   .cfi_offset 63, B_FR + 31 * sizeof_freg /* f31 */
-
nop
 
 __vdso_rt_sigreturn:
-   raw_syscall __NR_rt_sigreturn
+   .cfi_startproc
+   .cfi_signal_frame
+   raw_syscall __NR_rt_sigreturn

Re: [PATCH v2 12/12] target/riscv/cpu.c: remove cpu->cfg.vlen

2024-01-15 Thread Richard Henderson

On 1/16/24 09:25, Daniel Henrique Barboza wrote:

There is no need to keep both 'vlen' and 'vlenb'. All existing code
that requires 'vlen' is retrieving it via 'vlenb << 3'.

Signed-off-by: Daniel Henrique Barboza
---
  target/riscv/cpu.c | 8 +++-
  target/riscv/cpu_cfg.h | 1 -
  target/riscv/tcg/tcg-cpu.c | 4 +++-
  3 files changed, 6 insertions(+), 7 deletions(-)


Reviewed-by: Richard Henderson 

r~



Re: [PATCH v2 09/12] target/riscv/cpu.h: use 'vlenb' in vext_get_vlmax()

2024-01-15 Thread Richard Henderson

On 1/16/24 09:25, Daniel Henrique Barboza wrote:

Signed-off-by: Daniel Henrique Barboza 
---
  target/riscv/cpu.h | 6 +-
  1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
index 11df226a00..7304e478c2 100644
--- a/target/riscv/cpu.h
+++ b/target/riscv/cpu.h
@@ -692,7 +692,11 @@ static inline uint32_t vext_get_vlmax(RISCVCPU *cpu, 
target_ulong vtype)
  {
  uint8_t sew = FIELD_EX64(vtype, VTYPE, VSEW);
  int8_t lmul = sextract32(FIELD_EX64(vtype, VTYPE, VLMUL), 0, 3);
-return cpu->cfg.vlen >> (sew + 3 - lmul);
+/*
+ * vlmax = vlen >> (sew + 3 - lmul). With vlenb,
+ * 3 less shifts: vlenb >> (sew + 3 - 3 - lmul)
+ */
+return cpu->cfg.vlenb >> (sew - lmul);
  }


I take it back -- this doesn't work without the + 3:

  sew = 0
  lmul = 3

   vlenb >> (0 - 3)
 = vlenb >> -3

Need

  vlen = vlenb << 3
  vlmax = vlen >> (0 + 3 - 3)
= vlen >> 0
= vlen


r~





Re: [PATCH v2 10/12] target/riscv/insn_trans/trans_rvv.c.inc: use 'vlenb' in MAXSZ()

2024-01-15 Thread Richard Henderson

On 1/16/24 09:25, Daniel Henrique Barboza wrote:

Calculate the maximum vector size possible, 'max_sz', which is the size
in bytes 'vlenb' multiplied by the max value of LMUL (LMUL = 8, when
s->lmul = 3).

'max_sz' is then shifted right by 'scale', expressed as '3 - s->lmul',
which is clearer than doing 'scale = lmul - 3' and then using '-scale'
in the shift right.

Signed-off-by: Daniel Henrique Barboza
---
  target/riscv/insn_trans/trans_rvv.c.inc | 6 +++---
  1 file changed, 3 insertions(+), 3 deletions(-)


Reviewed-by: Richard Henderson 

r~



Re: [RFC PATCH v3 11/30] migration/multifd: Allow multifd without packets

2024-01-15 Thread Peter Xu
On Mon, Jan 15, 2024 at 03:39:29PM -0300, Fabiano Rosas wrote:
> > Maybe multifd_use_packets()?  Dropping the migrate_ prefix as this is not a
> > global API but multifd-only.  Then if multifd_packets() reads too weird and
> > unclear, "add" makes it clear.
> 
> We removed all the instances of migrate_use_* from the migration code
> recently. Not sure we should introduce them back, it seems like a step
> back.
> 
> We're setting 'use_packets = migrate_multifd_packets()' in most places,
> so I guess 'use_packets = multifd_packets()' wouldn't be too bad.

I actually prefer to keep using "_use_" everywhere because it's
clearer to me. :) And I don't see much benefit in saving three chars.  Try
"git grep _use_ | wc -l" in both QEMU / Linux; it reports
275 / 4680 respectively.

But yeah that's trivial, multifd_packets() is still okay.

-- 
Peter Xu




Re: [PATCH v2 09/12] target/riscv/cpu.h: use 'vlenb' in vext_get_vlmax()

2024-01-15 Thread Richard Henderson

On 1/16/24 09:25, Daniel Henrique Barboza wrote:

Signed-off-by: Daniel Henrique Barboza
---
  target/riscv/cpu.h | 6 +-
  1 file changed, 5 insertions(+), 1 deletion(-)


Reviewed-by: Richard Henderson 

r~



Re: [PATCH v2 08/12] target/riscv/vector_helper.c: use vlenb in HELPER(vsetvl)

2024-01-15 Thread Richard Henderson

On 1/16/24 09:25, Daniel Henrique Barboza wrote:

Use the new 'vlenb' CPU config to validate fractional LMUL. The original
comparison is done with 'vlen' and 'sew', both in bits. Adjust the shift
to use vlenb.

Signed-off-by: Daniel Henrique Barboza 
---
  target/riscv/vector_helper.c | 8 ++--
  1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c
index cb944229b0..9e3ae4b5d3 100644
--- a/target/riscv/vector_helper.c
+++ b/target/riscv/vector_helper.c
@@ -45,9 +45,13 @@ target_ulong HELPER(vsetvl)(CPURISCVState *env, target_ulong 
s1,
  xlen - 1 - 
R_VTYPE_RESERVED_SHIFT);
  
  if (lmul & 4) {

-/* Fractional LMUL - check LMUL * VLEN >= SEW */
+/*
+ * Fractional LMUL: check VLEN * LMUL >= SEW,
+ * or VLEN * (8 - lmul) >= SEW. Using VLENB we
+ * need 3 less shifts rights.


The last sentence is structured oddly.  Perhaps

  Using VLENB, we decrease the right shift by 3

or perhaps just show the expansion:

/*
 * Fractional LMUL, check
 *
 *VLEN * LMUL >= SEW
 *VLEN >> (8 - lmul) >= sew
 *(vlenb << 3) >> (8 - lmul) >= sew
 *vlenb >> (8 - 3 - lmul) >= sew
 */

Anyway,
Reviewed-by: Richard Henderson 

r~



+ */
  if (lmul == 4 ||
-cpu->cfg.vlen >> (8 - lmul) < sew) {
+cpu->cfg.vlenb >> (8 - 3 - lmul) < sew) {
  vill = true;
  }
  }





Re: [PATCH v2 07/12] target/riscv/vector_helper.c: use 'vlenb'

2024-01-15 Thread Richard Henderson

On 1/16/24 09:25, Daniel Henrique Barboza wrote:

Use 'cpu->cfg.vlenb' instead of 'cpu->cfg.vlen >> 3'.

Signed-off-by: Daniel Henrique Barboza
---
  target/riscv/vector_helper.c | 18 +-
  1 file changed, 9 insertions(+), 9 deletions(-)


Reviewed-by: Richard Henderson 

r~



Re: [PATCH v2 05/12] target/riscv/insn_trans/trans_rvv.c.inc: use 'vlenb'

2024-01-15 Thread Richard Henderson

On 1/16/24 09:25, Daniel Henrique Barboza wrote:

Use s->cfg_ptr->vlenb instead of "s->cfg_ptr->vlen / 8"  and
"s->cfg_ptr->vlen >> 3".

Signed-off-by: Daniel Henrique Barboza
---
  target/riscv/insn_trans/trans_rvv.c.inc | 140 
  1 file changed, 70 insertions(+), 70 deletions(-)


Reviewed-by: Richard Henderson 

r~



Re: [PATCH v2 04/12] target/riscv/insn_trans/trans_rvbf16.c.inc: use cpu->cfg.vlenb

2024-01-15 Thread Richard Henderson

On 1/16/24 09:25, Daniel Henrique Barboza wrote:

Use ctx->cfg_ptr->vlenb instead of ctx->cfg_ptr->vlen / 8.

Signed-off-by: Daniel Henrique Barboza
---
  target/riscv/insn_trans/trans_rvbf16.c.inc | 12 ++--
  1 file changed, 6 insertions(+), 6 deletions(-)


Reviewed-by: Richard Henderson 

r~



Re: [PATCH v2 03/12] target/riscv/gdbstub.c: use 'vlenb' instead of shifting 'vlen'

2024-01-15 Thread Richard Henderson

On 1/16/24 09:25, Daniel Henrique Barboza wrote:

Signed-off-by: Daniel Henrique Barboza
---
  target/riscv/gdbstub.c | 6 +++---
  1 file changed, 3 insertions(+), 3 deletions(-)


Reviewed-by: Richard Henderson 

r~



Re: [PATCH v2 02/12] target/riscv/csr.c: use 'vlenb' instead of 'vlen'

2024-01-15 Thread Richard Henderson

On 1/16/24 09:25, Daniel Henrique Barboza wrote:

As a bonus, we're being more idiomatic using cpu->cfg.vlenb when
reading CSR_VLENB.

Signed-off-by: Daniel Henrique Barboza 
---
  target/riscv/csr.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)


Reviewed-by: Richard Henderson 


r~



Re: [PATCH v2 01/12] target/riscv: add 'vlenb' field in cpu->cfg

2024-01-15 Thread Richard Henderson

On 1/16/24 09:25, Daniel Henrique Barboza wrote:

Our usage of 'vlenb' is overwhelmingly superior to the use of 'vlen'.
We're using 'vlenb' most of the time, having to do 'vlen >> 3' or
'vlen / 8' in every instance.

In hindsight we would have been better off if the 'vlenb' property had been
introduced instead of 'vlen'. That's not what happened, and now we can't
easily get rid of it due to user scripts all around. What we can do,
however, is to change our internal representation to use 'vlenb'.

Add a 'vlenb' field in cpu->cfg. It'll be set via the existing 'vlen'
property, i.e. setting 'vlen' will also set 'vlenb'.

We'll replace all 'vlen >> 3' code to use 'vlenb' directly. Start with
the single instance we have in target/riscv/cpu.c.

Signed-off-by: Daniel Henrique Barboza
---
  target/riscv/cpu.c | 4 +++-
  target/riscv/cpu_cfg.h | 1 +
  2 files changed, 4 insertions(+), 1 deletion(-)


Reviewed-by: Richard Henderson 


r~



[PATCH v2 10/12] target/riscv/insn_trans/trans_rvv.c.inc: use 'vlenb' in MAXSZ()

2024-01-15 Thread Daniel Henrique Barboza
Calculate the maximum vector size possible, 'max_sz', which is the size
in bytes 'vlenb' multiplied by the max value of LMUL (LMUL = 8, when
s->lmul = 3).

'max_sz' is then shifted right by 'scale', expressed as '3 - s->lmul',
which is clearer than doing 'scale = lmul - 3' and then using '-scale'
in the shift right.
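
As a quick sanity check of the rewritten arithmetic (an illustration with an
assumed vlenb = 16, i.e. VLEN = 128; not part of the patch):

#include <stdio.h>

int main(void)
{
    int vlenb = 16;               /* assumed VLEN = 128 bits */
    int max_sz = vlenb * 8;       /* bytes covered at the largest LMUL (8) */

    /* LMUL = 8, 4, 2, 1, 1/2, 1/4, 1/8 -> 128, 64, 32, 16, 8, 4, 2 bytes */
    for (int lmul = 3; lmul >= -3; lmul--) {
        printf("lmul = %2d -> MAXSZ = %3d bytes\n",
               lmul, max_sz >> (3 - lmul));
    }
    return 0;
}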

Signed-off-by: Daniel Henrique Barboza 
---
 target/riscv/insn_trans/trans_rvv.c.inc | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/target/riscv/insn_trans/trans_rvv.c.inc 
b/target/riscv/insn_trans/trans_rvv.c.inc
index d743675262..b4663b6e1f 100644
--- a/target/riscv/insn_trans/trans_rvv.c.inc
+++ b/target/riscv/insn_trans/trans_rvv.c.inc
@@ -1160,12 +1160,12 @@ GEN_LDST_WHOLE_TRANS(vs8r_v, 8, 1, true)
 /*
  * MAXSZ returns the maximum vector size can be operated in bytes,
  * which is used in GVEC IR when vl_eq_vlmax flag is set to true
- * to accerlate vector operation.
+ * to accelerate vector operation.
  */
 static inline uint32_t MAXSZ(DisasContext *s)
 {
-int scale = s->lmul - 3;
-return s->cfg_ptr->vlen >> -scale;
+int max_sz = s->cfg_ptr->vlenb * 8;
+return max_sz >> (3 - s->lmul);
 }
 
 static bool opivv_check(DisasContext *s, arg_rmrr *a)
-- 
2.43.0




[PATCH v2 09/12] target/riscv/cpu.h: use 'vlenb' in vext_get_vlmax()

2024-01-15 Thread Daniel Henrique Barboza
Signed-off-by: Daniel Henrique Barboza 
---
 target/riscv/cpu.h | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
index 11df226a00..7304e478c2 100644
--- a/target/riscv/cpu.h
+++ b/target/riscv/cpu.h
@@ -692,7 +692,11 @@ static inline uint32_t vext_get_vlmax(RISCVCPU *cpu, 
target_ulong vtype)
 {
 uint8_t sew = FIELD_EX64(vtype, VTYPE, VSEW);
 int8_t lmul = sextract32(FIELD_EX64(vtype, VTYPE, VLMUL), 0, 3);
-return cpu->cfg.vlen >> (sew + 3 - lmul);
+/*
+ * vlmax = vlen >> (sew + 3 - lmul). With vlenb,
+ * 3 less shifts: vlenb >> (sew + 3 - 3 - lmul)
+ */
+return cpu->cfg.vlenb >> (sew - lmul);
 }
 
 void cpu_get_tb_cpu_state(CPURISCVState *env, vaddr *pc,
-- 
2.43.0




[PATCH v2 02/12] target/riscv/csr.c: use 'vlenb' instead of 'vlen'

2024-01-15 Thread Daniel Henrique Barboza
As a bonus, we're being more idiomatic using cpu->cfg.vlenb when
reading CSR_VLENB.

Signed-off-by: Daniel Henrique Barboza 
---
 target/riscv/csr.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/target/riscv/csr.c b/target/riscv/csr.c
index 674ea075a4..5c8d22452b 100644
--- a/target/riscv/csr.c
+++ b/target/riscv/csr.c
@@ -683,7 +683,7 @@ static RISCVException read_vl(CPURISCVState *env, int csrno,
 
 static int read_vlenb(CPURISCVState *env, int csrno, target_ulong *val)
 {
-*val = riscv_cpu_cfg(env)->vlen >> 3;
+*val = riscv_cpu_cfg(env)->vlenb;
 return RISCV_EXCP_NONE;
 }
 
@@ -738,7 +738,7 @@ static RISCVException write_vstart(CPURISCVState *env, int 
csrno,
  * The vstart CSR is defined to have only enough writable bits
  * to hold the largest element index, i.e. lg2(VLEN) bits.
  */
-env->vstart = val & ~(~0ULL << ctzl(riscv_cpu_cfg(env)->vlen));
+env->vstart = val & ~(~0ULL << ctzl(riscv_cpu_cfg(env)->vlenb << 3));
 return RISCV_EXCP_NONE;
 }
 
-- 
2.43.0




[PATCH v2 01/12] target/riscv: add 'vlenb' field in cpu->cfg

2024-01-15 Thread Daniel Henrique Barboza
Our usage of 'vlenb' is overwhelmingly superior to the use of 'vlen'.
We're using 'vlenb' most of the time, having to do 'vlen >> 3' or
'vlen / 8' in every instance.

In hindsight we would have been better off if the 'vlenb' property had been
introduced instead of 'vlen'. That's not what happened, and now we can't
easily get rid of it due to user scripts all around. What we can do,
however, is to change our internal representation to use 'vlenb'.

Add a 'vlenb' field in cpu->cfg. It'll be set via the existing 'vlen'
property, i.e. setting 'vlen' will also set 'vlenb'.

We'll replace all 'vlen >> 3' code to use 'vlenb' directly. Start with
the single instance we have in target/riscv/cpu.c.

Signed-off-by: Daniel Henrique Barboza 
---
 target/riscv/cpu.c | 4 +++-
 target/riscv/cpu_cfg.h | 1 +
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index 8d3ec74a1c..f4261d2ffc 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -847,7 +847,7 @@ static void riscv_cpu_dump_state(CPUState *cs, FILE *f, int 
flags)
  csr_ops[csrno].name, val);
 }
 }
-uint16_t vlenb = cpu->cfg.vlen >> 3;
+uint16_t vlenb = cpu->cfg.vlenb;
 
 for (i = 0; i < 32; i++) {
 qemu_fprintf(f, " %-8s ", riscv_rvv_regnames[i]);
@@ -1314,6 +1314,7 @@ static void riscv_cpu_init(Object *obj)
 /* Default values for non-bool cpu properties */
 cpu->cfg.pmu_mask = MAKE_64BIT_MASK(3, 16);
 cpu->cfg.vlen = 128;
+cpu->cfg.vlenb = 128 >> 3;
 cpu->cfg.elen = 64;
 cpu->env.vext_ver = VEXT_VERSION_1_00_0;
 }
@@ -1810,6 +1811,7 @@ static void prop_vlen_set(Object *obj, Visitor *v, const 
char *name,
 
 cpu_option_add_user_setting(name, value);
 cpu->cfg.vlen = value;
+cpu->cfg.vlenb = value >> 3;
 }
 
 static void prop_vlen_get(Object *obj, Visitor *v, const char *name,
diff --git a/target/riscv/cpu_cfg.h b/target/riscv/cpu_cfg.h
index fea14c275f..50479dd72f 100644
--- a/target/riscv/cpu_cfg.h
+++ b/target/riscv/cpu_cfg.h
@@ -140,6 +140,7 @@ struct RISCVCPUConfig {
 
 uint32_t pmu_mask;
 uint16_t vlen;
+uint16_t vlenb;
 uint16_t elen;
 uint16_t cbom_blocksize;
 uint16_t cbop_blocksize;
-- 
2.43.0




[PATCH v2 07/12] target/riscv/vector_helper.c: use 'vlenb'

2024-01-15 Thread Daniel Henrique Barboza
Use 'cpu->cfg.vlenb' instead of 'cpu->cfg.vlen >> 3'.

Signed-off-by: Daniel Henrique Barboza 
---
 target/riscv/vector_helper.c | 18 +-
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c
index c1c3a4d1ea..cb944229b0 100644
--- a/target/riscv/vector_helper.c
+++ b/target/riscv/vector_helper.c
@@ -558,7 +558,7 @@ vext_ldst_whole(void *vd, target_ulong base, CPURISCVState 
*env, uint32_t desc,
 {
 uint32_t i, k, off, pos;
 uint32_t nf = vext_nf(desc);
-uint32_t vlenb = riscv_cpu_cfg(env)->vlen >> 3;
+uint32_t vlenb = riscv_cpu_cfg(env)->vlenb;
 uint32_t max_elems = vlenb >> log2_esz;
 
 k = env->vstart / max_elems;
@@ -929,7 +929,7 @@ void HELPER(NAME)(void *vd, void *v0, void *vs1, void *vs2, 
  \
 { \
 uint32_t vl = env->vl;\
 uint32_t vm = vext_vm(desc);  \
-uint32_t total_elems = riscv_cpu_cfg(env)->vlen;  \
+uint32_t total_elems = riscv_cpu_cfg(env)->vlenb << 3;\
 uint32_t vta_all_1s = vext_vta_all_1s(desc);  \
 uint32_t i;   \
   \
@@ -967,7 +967,7 @@ void HELPER(NAME)(void *vd, void *v0, target_ulong s1,  
\
 {   \
 uint32_t vl = env->vl;  \
 uint32_t vm = vext_vm(desc);\
-uint32_t total_elems = riscv_cpu_cfg(env)->vlen;\
+uint32_t total_elems = riscv_cpu_cfg(env)->vlenb << 3;  \
 uint32_t vta_all_1s = vext_vta_all_1s(desc);\
 uint32_t i; \
 \
@@ -1171,7 +1171,7 @@ void HELPER(NAME)(void *vd, void *v0, void *vs1, void 
*vs2,   \
 { \
 uint32_t vm = vext_vm(desc);  \
 uint32_t vl = env->vl;\
-uint32_t total_elems = riscv_cpu_cfg(env)->vlen;  \
+uint32_t total_elems = riscv_cpu_cfg(env)->vlenb << 3;\
 uint32_t vta_all_1s = vext_vta_all_1s(desc);  \
 uint32_t vma = vext_vma(desc);\
 uint32_t i;   \
@@ -1236,7 +1236,7 @@ void HELPER(NAME)(void *vd, void *v0, target_ulong s1, 
void *vs2,   \
 {   \
 uint32_t vm = vext_vm(desc);\
 uint32_t vl = env->vl;  \
-uint32_t total_elems = riscv_cpu_cfg(env)->vlen;\
+uint32_t total_elems = riscv_cpu_cfg(env)->vlenb << 3;  \
 uint32_t vta_all_1s = vext_vta_all_1s(desc);\
 uint32_t vma = vext_vma(desc);  \
 uint32_t i; \
@@ -3971,7 +3971,7 @@ void HELPER(NAME)(void *vd, void *v0, void *vs1, void 
*vs2,   \
 { \
 uint32_t vm = vext_vm(desc);  \
 uint32_t vl = env->vl;\
-uint32_t total_elems = riscv_cpu_cfg(env)->vlen;  \
+uint32_t total_elems = riscv_cpu_cfg(env)->vlenb << 3;\
 uint32_t vta_all_1s = vext_vta_all_1s(desc);  \
 uint32_t vma = vext_vma(desc);\
 uint32_t i;   \
@@ -4011,7 +4011,7 @@ void HELPER(NAME)(void *vd, void *v0, uint64_t s1, void 
*vs2,   \
 {   \
 uint32_t vm = vext_vm(desc);\
 uint32_t vl = env->vl;  \
-uint32_t total_elems = riscv_cpu_cfg(env)->vlen;\
+uint32_t total_elems = riscv_cpu_cfg(env)->vlenb << 3;  \
 uint32_t vta_all_1s = vext_vta_all_1s(desc);\
 uint32_t vma = vext_vma(desc);  \
 uint32_t i; \
@@ -4528,7 +4528,7 @@ void HELPER(NAME)(void *vd, void *v0, void *vs1,  
\
   uint32_t desc)  \
 { \
 uint32_t vl = env->vl;\
-uint32_t total_elems = riscv_cpu_cfg(env)->vlen;  \
+uint32_t total_elems = riscv_cpu_cfg(env)->vlenb << 3;\
 uint32_t vta_all_1s = vext_vta_all_1s(desc);  \
 uint32_t i; 

[PATCH v2 04/12] target/riscv/insn_trans/trans_rvbf16.c.inc: use cpu->cfg.vlenb

2024-01-15 Thread Daniel Henrique Barboza
Use ctx->cfg_ptr->vlenb instead of ctx->cfg_ptr->vlen / 8.

Signed-off-by: Daniel Henrique Barboza 
---
 target/riscv/insn_trans/trans_rvbf16.c.inc | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/target/riscv/insn_trans/trans_rvbf16.c.inc 
b/target/riscv/insn_trans/trans_rvbf16.c.inc
index 4e39c00884..8ee99df3f3 100644
--- a/target/riscv/insn_trans/trans_rvbf16.c.inc
+++ b/target/riscv/insn_trans/trans_rvbf16.c.inc
@@ -83,8 +83,8 @@ static bool trans_vfncvtbf16_f_f_w(DisasContext *ctx, 
arg_vfncvtbf16_f_f_w *a)
 data = FIELD_DP32(data, VDATA, VMA, ctx->vma);
 tcg_gen_gvec_3_ptr(vreg_ofs(ctx, a->rd), vreg_ofs(ctx, 0),
vreg_ofs(ctx, a->rs2), tcg_env,
-   ctx->cfg_ptr->vlen / 8,
-   ctx->cfg_ptr->vlen / 8, data,
+   ctx->cfg_ptr->vlenb,
+   ctx->cfg_ptr->vlenb, data,
gen_helper_vfncvtbf16_f_f_w);
 mark_vs_dirty(ctx);
 gen_set_label(over);
@@ -112,8 +112,8 @@ static bool trans_vfwcvtbf16_f_f_v(DisasContext *ctx, 
arg_vfwcvtbf16_f_f_v *a)
 data = FIELD_DP32(data, VDATA, VMA, ctx->vma);
 tcg_gen_gvec_3_ptr(vreg_ofs(ctx, a->rd), vreg_ofs(ctx, 0),
vreg_ofs(ctx, a->rs2), tcg_env,
-   ctx->cfg_ptr->vlen / 8,
-   ctx->cfg_ptr->vlen / 8, data,
+   ctx->cfg_ptr->vlenb,
+   ctx->cfg_ptr->vlenb, data,
gen_helper_vfwcvtbf16_f_f_v);
 mark_vs_dirty(ctx);
 gen_set_label(over);
@@ -143,8 +143,8 @@ static bool trans_vfwmaccbf16_vv(DisasContext *ctx, 
arg_vfwmaccbf16_vv *a)
 tcg_gen_gvec_4_ptr(vreg_ofs(ctx, a->rd), vreg_ofs(ctx, 0),
vreg_ofs(ctx, a->rs1),
vreg_ofs(ctx, a->rs2), tcg_env,
-   ctx->cfg_ptr->vlen / 8,
-   ctx->cfg_ptr->vlen / 8, data,
+   ctx->cfg_ptr->vlenb,
+   ctx->cfg_ptr->vlenb, data,
gen_helper_vfwmaccbf16_vv);
 mark_vs_dirty(ctx);
 gen_set_label(over);
-- 
2.43.0




[PATCH v2 12/12] target/riscv/cpu.c: remove cpu->cfg.vlen

2024-01-15 Thread Daniel Henrique Barboza
There is no need to keep both 'vlen' and 'vlenb'. All existing code
that requires 'vlen' is retrieving it via 'vlenb << 3'.

Signed-off-by: Daniel Henrique Barboza 
---
 target/riscv/cpu.c | 8 +++-
 target/riscv/cpu_cfg.h | 1 -
 target/riscv/tcg/tcg-cpu.c | 4 +++-
 3 files changed, 6 insertions(+), 7 deletions(-)

diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index f4261d2ffc..7b3f69d3fb 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -1313,7 +1313,6 @@ static void riscv_cpu_init(Object *obj)
 
 /* Default values for non-bool cpu properties */
 cpu->cfg.pmu_mask = MAKE_64BIT_MASK(3, 16);
-cpu->cfg.vlen = 128;
 cpu->cfg.vlenb = 128 >> 3;
 cpu->cfg.elen = 64;
 cpu->env.vext_ver = VEXT_VERSION_1_00_0;
@@ -1802,22 +1801,21 @@ static void prop_vlen_set(Object *obj, Visitor *v, 
const char *name,
 return;
 }
 
-if (value != cpu->cfg.vlen && riscv_cpu_is_vendor(obj)) {
+if (value != cpu->cfg.vlenb && riscv_cpu_is_vendor(obj)) {
 cpu_set_prop_err(cpu, name, errp);
 error_append_hint(errp, "Current '%s' val: %u\n",
-  name, cpu->cfg.vlen);
+  name, cpu->cfg.vlenb << 3);
 return;
 }
 
 cpu_option_add_user_setting(name, value);
-cpu->cfg.vlen = value;
 cpu->cfg.vlenb = value >> 3;
 }
 
 static void prop_vlen_get(Object *obj, Visitor *v, const char *name,
  void *opaque, Error **errp)
 {
-uint16_t value = RISCV_CPU(obj)->cfg.vlen;
+uint16_t value = RISCV_CPU(obj)->cfg.vlenb << 3;
 
 visit_type_uint16(v, name, &value, errp);
 }
diff --git a/target/riscv/cpu_cfg.h b/target/riscv/cpu_cfg.h
index 50479dd72f..e241922f89 100644
--- a/target/riscv/cpu_cfg.h
+++ b/target/riscv/cpu_cfg.h
@@ -139,7 +139,6 @@ struct RISCVCPUConfig {
 bool ext_XVentanaCondOps;
 
 uint32_t pmu_mask;
-uint16_t vlen;
 uint16_t vlenb;
 uint16_t elen;
 uint16_t cbom_blocksize;
diff --git a/target/riscv/tcg/tcg-cpu.c b/target/riscv/tcg/tcg-cpu.c
index daff0b8f60..667421b0b7 100644
--- a/target/riscv/tcg/tcg-cpu.c
+++ b/target/riscv/tcg/tcg-cpu.c
@@ -298,7 +298,9 @@ static void riscv_cpu_validate_misa_mxl(RISCVCPU *cpu, 
Error **errp)
 static void riscv_cpu_validate_v(CPURISCVState *env, RISCVCPUConfig *cfg,
  Error **errp)
 {
-if (cfg->vlen > RV_VLEN_MAX || cfg->vlen < 128) {
+uint32_t vlen = cfg->vlenb << 3;
+
+if (vlen > RV_VLEN_MAX || vlen < 128) {
 error_setg(errp,
"Vector extension implementation only supports VLEN "
"in the range [128, %d]", RV_VLEN_MAX);
-- 
2.43.0




[PATCH v2 03/12] target/riscv/gdbstub.c: use 'vlenb' instead of shifting 'vlen'

2024-01-15 Thread Daniel Henrique Barboza
Signed-off-by: Daniel Henrique Barboza 
---
 target/riscv/gdbstub.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/target/riscv/gdbstub.c b/target/riscv/gdbstub.c
index 58b3ace0fe..5ab0abda19 100644
--- a/target/riscv/gdbstub.c
+++ b/target/riscv/gdbstub.c
@@ -130,7 +130,7 @@ static int riscv_gdb_set_fpu(CPURISCVState *env, uint8_t 
*mem_buf, int n)
 
 static int riscv_gdb_get_vector(CPURISCVState *env, GByteArray *buf, int n)
 {
-uint16_t vlenb = riscv_cpu_cfg(env)->vlen >> 3;
+uint16_t vlenb = riscv_cpu_cfg(env)->vlenb;
 if (n < 32) {
 int i;
 int cnt = 0;
@@ -146,7 +146,7 @@ static int riscv_gdb_get_vector(CPURISCVState *env, 
GByteArray *buf, int n)
 
 static int riscv_gdb_set_vector(CPURISCVState *env, uint8_t *mem_buf, int n)
 {
-uint16_t vlenb = riscv_cpu_cfg(env)->vlen >> 3;
+uint16_t vlenb = riscv_cpu_cfg(env)->vlenb;
 if (n < 32) {
 int i;
 for (i = 0; i < vlenb; i += 8) {
@@ -266,7 +266,7 @@ static int ricsv_gen_dynamic_vector_xml(CPUState *cs, int 
base_reg)
 RISCVCPU *cpu = RISCV_CPU(cs);
 GString *s = g_string_new(NULL);
 g_autoptr(GString) ts = g_string_new("");
-int reg_width = cpu->cfg.vlen;
+int reg_width = cpu->cfg.vlenb << 3;
 int num_regs = 0;
 int i;
 
-- 
2.43.0




[PATCH v2 06/12] target/riscv/insn_trans/trans_rvvk.c.inc: use 'vlenb'

2024-01-15 Thread Daniel Henrique Barboza
Use s->cfg_ptr->vlenb instead of s->cfg_ptr->vlen / 8.

Signed-off-by: Daniel Henrique Barboza 
---
 target/riscv/insn_trans/trans_rvvk.c.inc | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/target/riscv/insn_trans/trans_rvvk.c.inc 
b/target/riscv/insn_trans/trans_rvvk.c.inc
index 3801c16829..a5cdd1b67f 100644
--- a/target/riscv/insn_trans/trans_rvvk.c.inc
+++ b/target/riscv/insn_trans/trans_rvvk.c.inc
@@ -174,7 +174,7 @@ GEN_OPIVX_GVEC_TRANS_CHECK(vandn_vx, andcs, zvkb_vx_check)
 data = FIELD_DP32(data, VDATA, VMA, s->vma);   \
 tcg_gen_gvec_3_ptr(vreg_ofs(s, a->rd), vreg_ofs(s, 0), \
vreg_ofs(s, a->rs2), tcg_env,   \
-   s->cfg_ptr->vlen / 8, s->cfg_ptr->vlen / 8, \
+   s->cfg_ptr->vlenb, s->cfg_ptr->vlenb,   \
data, fns[s->sew]); \
 mark_vs_dirty(s);  \
 gen_set_label(over);   \
@@ -267,7 +267,7 @@ GEN_OPIVI_WIDEN_TRANS(vwsll_vi, IMM_ZX, vwsll_vx, 
vwsll_vx_check)
 rd_v = tcg_temp_new_ptr();\
 rs2_v = tcg_temp_new_ptr();   \
 desc = tcg_constant_i32(  \
-simd_desc(s->cfg_ptr->vlen / 8, s->cfg_ptr->vlen / 8, data)); \
+simd_desc(s->cfg_ptr->vlenb, s->cfg_ptr->vlenb, data));   \
 tcg_gen_addi_ptr(rd_v, tcg_env, vreg_ofs(s, a->rd));  \
 tcg_gen_addi_ptr(rs2_v, tcg_env, vreg_ofs(s, a->rs2));\
 gen_helper_##NAME(rd_v, rs2_v, tcg_env, desc);\
@@ -345,7 +345,7 @@ GEN_V_UNMASKED_TRANS(vaesem_vs, vaes_check_vs, ZVKNED_EGS)
 rs2_v = tcg_temp_new_ptr();   \
 uimm_v = tcg_constant_i32(a->rs1);\
 desc = tcg_constant_i32(  \
-simd_desc(s->cfg_ptr->vlen / 8, s->cfg_ptr->vlen / 8, data)); \
+simd_desc(s->cfg_ptr->vlenb, s->cfg_ptr->vlenb, data));   \
 tcg_gen_addi_ptr(rd_v, tcg_env, vreg_ofs(s, a->rd));  \
 tcg_gen_addi_ptr(rs2_v, tcg_env, vreg_ofs(s, a->rs2));\
 gen_helper_##NAME(rd_v, rs2_v, uimm_v, tcg_env, desc);\
@@ -413,7 +413,7 @@ GEN_VI_UNMASKED_TRANS(vaeskf2_vi, vaeskf2_check, ZVKNED_EGS)
   \
 tcg_gen_gvec_3_ptr(vreg_ofs(s, a->rd), vreg_ofs(s, a->rs1),   \
vreg_ofs(s, a->rs2), tcg_env,  \
-   s->cfg_ptr->vlen / 8, s->cfg_ptr->vlen / 8,\
+   s->cfg_ptr->vlenb, s->cfg_ptr->vlenb,  \
data, gen_helper_##NAME);  \
   \
 mark_vs_dirty(s); \
@@ -466,8 +466,8 @@ static bool trans_vsha2cl_vv(DisasContext *s, arg_rmrr *a)
 data = FIELD_DP32(data, VDATA, VMA, s->vma);
 
 tcg_gen_gvec_3_ptr(vreg_ofs(s, a->rd), vreg_ofs(s, a->rs1),
-vreg_ofs(s, a->rs2), tcg_env, s->cfg_ptr->vlen / 8,
-s->cfg_ptr->vlen / 8, data,
+vreg_ofs(s, a->rs2), tcg_env, s->cfg_ptr->vlenb,
+s->cfg_ptr->vlenb, data,
 s->sew == MO_32 ?
 gen_helper_vsha2cl32_vv : gen_helper_vsha2cl64_vv);
 
@@ -500,8 +500,8 @@ static bool trans_vsha2ch_vv(DisasContext *s, arg_rmrr *a)
 data = FIELD_DP32(data, VDATA, VMA, s->vma);
 
 tcg_gen_gvec_3_ptr(vreg_ofs(s, a->rd), vreg_ofs(s, a->rs1),
-vreg_ofs(s, a->rs2), tcg_env, s->cfg_ptr->vlen / 8,
-s->cfg_ptr->vlen / 8, data,
+vreg_ofs(s, a->rs2), tcg_env, s->cfg_ptr->vlenb,
+s->cfg_ptr->vlenb, data,
 s->sew == MO_32 ?
 gen_helper_vsha2ch32_vv : gen_helper_vsha2ch64_vv);
 
-- 
2.43.0




[PATCH v2 11/12] trans_rvv.c.inc: use 'vlenb' to calc vlmax in trans_vrgather_v*()

2024-01-15 Thread Daniel Henrique Barboza
Use the same vext_get_vlmax() logic to retrieve 'vlmax' in
trans_vrgather_vi() and trans_vrgather_vx().

Signed-off-by: Daniel Henrique Barboza 
---
 target/riscv/insn_trans/trans_rvv.c.inc | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/target/riscv/insn_trans/trans_rvv.c.inc 
b/target/riscv/insn_trans/trans_rvv.c.inc
index b4663b6e1f..05c243186d 100644
--- a/target/riscv/insn_trans/trans_rvv.c.inc
+++ b/target/riscv/insn_trans/trans_rvv.c.inc
@@ -3535,8 +3535,7 @@ static bool trans_vrgather_vx(DisasContext *s, arg_rmrr 
*a)
 }
 
 if (a->vm && s->vl_eq_vlmax && !(s->vta && s->lmul < 0)) {
-int scale = s->lmul - (s->sew + 3);
-int vlmax = s->cfg_ptr->vlen >> -scale;
+int vlmax = s->cfg_ptr->vlenb >> (s->sew - s->lmul);
 TCGv_i64 dest = tcg_temp_new_i64();
 
 if (a->rs1 == 0) {
@@ -3566,8 +3565,7 @@ static bool trans_vrgather_vi(DisasContext *s, arg_rmrr 
*a)
 }
 
 if (a->vm && s->vl_eq_vlmax && !(s->vta && s->lmul < 0)) {
-int scale = s->lmul - (s->sew + 3);
-int vlmax = s->cfg_ptr->vlen >> -scale;
+int vlmax = s->cfg_ptr->vlenb >> (s->sew - s->lmul);
 if (a->rs1 >= vlmax) {
 tcg_gen_gvec_dup_imm(MO_64, vreg_ofs(s, a->rd),
  MAXSZ(s), MAXSZ(s), 0);
-- 
2.43.0




[PATCH v2 08/12] target/riscv/vector_helper.c: use vlenb in HELPER(vsetvl)

2024-01-15 Thread Daniel Henrique Barboza
Use the new 'vlenb' CPU config to validate fractional LMUL. The original
comparison is done with 'vlen' and 'sew', both in bits. Adjust the shift
to use vlenb.

Signed-off-by: Daniel Henrique Barboza 
---
 target/riscv/vector_helper.c | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c
index cb944229b0..9e3ae4b5d3 100644
--- a/target/riscv/vector_helper.c
+++ b/target/riscv/vector_helper.c
@@ -45,9 +45,13 @@ target_ulong HELPER(vsetvl)(CPURISCVState *env, target_ulong 
s1,
 xlen - 1 - R_VTYPE_RESERVED_SHIFT);
 
 if (lmul & 4) {
-/* Fractional LMUL - check LMUL * VLEN >= SEW */
+/*
+ * Fractional LMUL: check VLEN * LMUL >= SEW,
+ * or VLEN * (8 - lmul) >= SEW. Using VLENB we
+ * need 3 less shifts rights.
+ */
 if (lmul == 4 ||
-cpu->cfg.vlen >> (8 - lmul) < sew) {
+cpu->cfg.vlenb >> (8 - 3 - lmul) < sew) {
 vill = true;
 }
 }
-- 
2.43.0




[PATCH v2 05/12] target/riscv/insn_trans/trans_rvv.c.inc: use 'vlenb'

2024-01-15 Thread Daniel Henrique Barboza
Use s->cfg_ptr->vlenb instead of "s->cfg_ptr->vlen / 8"  and
"s->cfg_ptr->vlen >> 3".

Signed-off-by: Daniel Henrique Barboza 
---
 target/riscv/insn_trans/trans_rvv.c.inc | 140 
 1 file changed, 70 insertions(+), 70 deletions(-)

diff --git a/target/riscv/insn_trans/trans_rvv.c.inc 
b/target/riscv/insn_trans/trans_rvv.c.inc
index 3871f0ea73..d743675262 100644
--- a/target/riscv/insn_trans/trans_rvv.c.inc
+++ b/target/riscv/insn_trans/trans_rvv.c.inc
@@ -217,7 +217,7 @@ static bool trans_vsetivli(DisasContext *s, arg_vsetivli *a)
 /* vector register offset from env */
 static uint32_t vreg_ofs(DisasContext *s, int reg)
 {
-return offsetof(CPURISCVState, vreg) + reg * s->cfg_ptr->vlen / 8;
+return offsetof(CPURISCVState, vreg) + reg * s->cfg_ptr->vlenb;
 }
 
 /* check functions */
@@ -627,11 +627,11 @@ static bool ldst_us_trans(uint32_t vd, uint32_t rs1, 
uint32_t data,
  * As simd_desc supports at most 2048 bytes, and in this implementation,
  * the max vector group length is 4096 bytes. So split it into two parts.
  *
- * The first part is vlen in bytes, encoded in maxsz of simd_desc.
+ * The first part is vlen in bytes (vlenb), encoded in maxsz of simd_desc.
  * The second part is lmul, encoded in data of simd_desc.
  */
-desc = tcg_constant_i32(simd_desc(s->cfg_ptr->vlen / 8,
-  s->cfg_ptr->vlen / 8, data));
+desc = tcg_constant_i32(simd_desc(s->cfg_ptr->vlenb,
+  s->cfg_ptr->vlenb, data));
 
 tcg_gen_addi_ptr(dest, tcg_env, vreg_ofs(s, vd));
 tcg_gen_addi_ptr(mask, tcg_env, vreg_ofs(s, 0));
@@ -791,8 +791,8 @@ static bool ldst_stride_trans(uint32_t vd, uint32_t rs1, 
uint32_t rs2,
 mask = tcg_temp_new_ptr();
 base = get_gpr(s, rs1, EXT_NONE);
 stride = get_gpr(s, rs2, EXT_NONE);
-desc = tcg_constant_i32(simd_desc(s->cfg_ptr->vlen / 8,
-  s->cfg_ptr->vlen / 8, data));
+desc = tcg_constant_i32(simd_desc(s->cfg_ptr->vlenb,
+  s->cfg_ptr->vlenb, data));
 
 tcg_gen_addi_ptr(dest, tcg_env, vreg_ofs(s, vd));
 tcg_gen_addi_ptr(mask, tcg_env, vreg_ofs(s, 0));
@@ -897,8 +897,8 @@ static bool ldst_index_trans(uint32_t vd, uint32_t rs1, 
uint32_t vs2,
 mask = tcg_temp_new_ptr();
 index = tcg_temp_new_ptr();
 base = get_gpr(s, rs1, EXT_NONE);
-desc = tcg_constant_i32(simd_desc(s->cfg_ptr->vlen / 8,
-  s->cfg_ptr->vlen / 8, data));
+desc = tcg_constant_i32(simd_desc(s->cfg_ptr->vlenb,
+  s->cfg_ptr->vlenb, data));
 
 tcg_gen_addi_ptr(dest, tcg_env, vreg_ofs(s, vd));
 tcg_gen_addi_ptr(index, tcg_env, vreg_ofs(s, vs2));
@@ -1036,8 +1036,8 @@ static bool ldff_trans(uint32_t vd, uint32_t rs1, 
uint32_t data,
 dest = tcg_temp_new_ptr();
 mask = tcg_temp_new_ptr();
 base = get_gpr(s, rs1, EXT_NONE);
-desc = tcg_constant_i32(simd_desc(s->cfg_ptr->vlen / 8,
-  s->cfg_ptr->vlen / 8, data));
+desc = tcg_constant_i32(simd_desc(s->cfg_ptr->vlenb,
+  s->cfg_ptr->vlenb, data));
 
 tcg_gen_addi_ptr(dest, tcg_env, vreg_ofs(s, vd));
 tcg_gen_addi_ptr(mask, tcg_env, vreg_ofs(s, 0));
@@ -1086,7 +1086,7 @@ static bool ldst_whole_trans(uint32_t vd, uint32_t rs1, 
uint32_t nf,
  uint32_t width, gen_helper_ldst_whole *fn,
  DisasContext *s, bool is_store)
 {
-uint32_t evl = (s->cfg_ptr->vlen / 8) * nf / width;
+uint32_t evl = s->cfg_ptr->vlenb * nf / width;
 TCGLabel *over = gen_new_label();
 tcg_gen_brcondi_tl(TCG_COND_GEU, cpu_vstart, evl, over);
 
@@ -1096,8 +1096,8 @@ static bool ldst_whole_trans(uint32_t vd, uint32_t rs1, 
uint32_t nf,
 
 uint32_t data = FIELD_DP32(0, VDATA, NF, nf);
 dest = tcg_temp_new_ptr();
-desc = tcg_constant_i32(simd_desc(s->cfg_ptr->vlen / 8,
-  s->cfg_ptr->vlen / 8, data));
+desc = tcg_constant_i32(simd_desc(s->cfg_ptr->vlenb,
+  s->cfg_ptr->vlenb, data));
 
 base = get_gpr(s, rs1, EXT_NONE);
 tcg_gen_addi_ptr(dest, tcg_env, vreg_ofs(s, vd));
@@ -1199,8 +1199,8 @@ do_opivv_gvec(DisasContext *s, arg_rmrr *a, GVecGen3Fn 
*gvec_fn,
 data = FIELD_DP32(data, VDATA, VMA, s->vma);
 tcg_gen_gvec_4_ptr(vreg_ofs(s, a->rd), vreg_ofs(s, 0),
vreg_ofs(s, a->rs1), vreg_ofs(s, a->rs2),
-   tcg_env, s->cfg_ptr->vlen / 8,
-   s->cfg_ptr->vlen / 8, data, fn);
+   tcg_env, s->cfg_ptr->vlenb,
+   s->cfg_ptr->vlenb, data, fn);
 }
 mark_vs_dirty(s);
 gen_set_label(over);
@@ -1248,8 +1248,8 @@ static bool opivx_trans(uint32_t vd, uint32_t rs1, 
uint32_t vs2, uint32_t vm,
 

[PATCH v2 00/12] target/riscv: add 'cpu->cfg.vlenb', remove 'cpu->cfg.vlen'

2024-01-15 Thread Daniel Henrique Barboza
Hi,

This v2 contains changes proposed after the review from Richard. Most
notable change is in patch 11, where we simplified our lives a bit and
just calculated 'vlmax' using 'vlenb' instead of trying to use the TCG
global 'cpu_vl'.

Patch 10 also changed a bit to avoid the negative shift we were doing in
v1.


Changes from v1:
- patch 8:
  - instead of shifting sew, adjust the number of right shifts to use 'vlenb'
- patch 9:
  - adjust the right shifts to use 'vlenb' instead of shifting 'vlenb' left 
beforehand
- patch 10:
  - calc 'max_sz' before doing the right shift to retrieve MAXSZ()
- patches 11 and 12:
  - squashed together
  - calc 'vlmax' using the same logic from patch 9 in both functions
- v1 link: 
https://lore.kernel.org/qemu-riscv/20240112213812.173521-1-dbarb...@ventanamicro.com/


Daniel Henrique Barboza (12):
  target/riscv: add 'vlenb' field in cpu->cfg
  target/riscv/csr.c: use 'vlenb' instead of 'vlen'
  target/riscv/gdbstub.c: use 'vlenb' instead of shifting 'vlen'
  target/riscv/insn_trans/trans_rvbf16.c.inc: use cpu->cfg.vlenb
  target/riscv/insn_trans/trans_rvv.c.inc: use 'vlenb'
  target/riscv/insn_trans/trans_rvvk.c.inc: use 'vlenb'
  target/riscv/vector_helper.c: use 'vlenb'
  target/riscv/vector_helper.c: use vlenb in HELPER(vsetvl)
  target/riscv/cpu.h: use 'vlenb' in vext_get_vlmax()
  target/riscv/insn_trans/trans_rvv.c.inc: use 'vlenb' in MAXSZ()
  trans_rvv.c.inc: use 'vlenb' to calc vlmax in trans_vrgather_v*()
  target/riscv/cpu.c: remove cpu->cfg.vlen

 target/riscv/cpu.c |  12 +-
 target/riscv/cpu.h |   6 +-
 target/riscv/cpu_cfg.h |   2 +-
 target/riscv/csr.c |   4 +-
 target/riscv/gdbstub.c |   6 +-
 target/riscv/insn_trans/trans_rvbf16.c.inc |  12 +-
 target/riscv/insn_trans/trans_rvv.c.inc| 152 ++---
 target/riscv/insn_trans/trans_rvvk.c.inc   |  16 +--
 target/riscv/tcg/tcg-cpu.c |   4 +-
 target/riscv/vector_helper.c   |  26 ++--
 10 files changed, 124 insertions(+), 116 deletions(-)

-- 
2.43.0




[PATCH v2 2/3] Hexagon (target/hexagon) Use QEMU decodetree (16-bit instructions)

2024-01-15 Thread Taylor Simpson
Section 10.3 of the Hexagon V73 Programmer's Reference Manual

A duplex is encoded as a 32-bit instruction with bits [15:14] set to 00.
The sub-instructions that comprise a duplex are encoded as 13-bit fields
in the duplex.

Create a decoder for each subinstruction class (a, l1, l2, s1, s2).

Extend gen_trans_funcs.py to handle all instructions rather than
filter by instruction class.

There is a g_assert_not_reached() in decode_insns() in decode.c to
verify we never try to use the old decoder on 16-bit instructions.
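
For orientation, below is a rough sketch of the bit slicing described
above; the helper names are made up for illustration, and the actual
code added to decode.c uses extract32()/deposit32() rather than
open-coded masks:

    /* Sketch only: how a duplex encoding splits into iclass and sub-insns */
    static bool is_duplex(uint32_t encoding)
    {
        return ((encoding >> 14) & 0x3) == 0;     /* parse bits [15:14] == 00 */
    }

    static uint32_t duplex_iclass(uint32_t encoding)
    {
        /* 4-bit iclass: bits 31:29 form bits 3:1, bit 13 is bit 0 */
        return (((encoding >> 29) & 0x7) << 1) | ((encoding >> 13) & 0x1);
    }

    static uint16_t slot0_subinsn(uint32_t encoding)
    {
        return encoding & 0x1fff;                 /* bits 12:0 */
    }

    static uint16_t slot1_subinsn(uint32_t encoding)
    {
        return (encoding >> 16) & 0x1fff;         /* bits 28:16 */
    }

The iclass then selects the pair of sub-instruction decoders from a
16-entry table, and the slot1 sub-instruction is decoded before slot0
so that it lands in the packet first.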

Signed-off-by: Taylor Simpson 
Reviewed-by: Brian Cain 
---
 target/hexagon/decode.c   | 85 +
 target/hexagon/README |  1 +
 target/hexagon/gen_decodetree.py  | 12 -
 target/hexagon/gen_trans_funcs.py | 12 +
 target/hexagon/meson.build| 90 +++
 5 files changed, 188 insertions(+), 12 deletions(-)

diff --git a/target/hexagon/decode.c b/target/hexagon/decode.c
index bddad1f75e..160b23a895 100644
--- a/target/hexagon/decode.c
+++ b/target/hexagon/decode.c
@@ -60,6 +60,7 @@ static int decode_mapped_reg_##NAME(DisasContext *ctx, int x) \
 }
 DECODE_MAPPED(R_16)
 DECODE_MAPPED(R_8)
+DECODE_MAPPED(R__8)
 
 /* Helper function for decodetree_trans_funcs_generated.c.inc */
 static int shift_left(DisasContext *ctx, int x, int n, int immno)
@@ -77,6 +78,13 @@ static int shift_left(DisasContext *ctx, int x, int n, int immno)
 #include "decode_normal_generated.c.inc"
 #include "decode_hvx_generated.c.inc"
 
+/* Include the generated decoder for 16 bit insn */
+#include "decode_subinsn_a_generated.c.inc"
+#include "decode_subinsn_l1_generated.c.inc"
+#include "decode_subinsn_l2_generated.c.inc"
+#include "decode_subinsn_s1_generated.c.inc"
+#include "decode_subinsn_s2_generated.c.inc"
+
 /* Include the generated helpers for the decoder */
 #include "decodetree_trans_funcs_generated.c.inc"
 
@@ -790,6 +798,63 @@ decode_insns_tablewalk(Insn *insn, const DectreeTable *table,
 }
 }
 
+/*
+ * Section 10.3 of the Hexagon V73 Programmer's Reference Manual
+ *
+ * A duplex is encoded as a 32-bit instruction with bits [15:14] set to 00.
+ * The sub-instructions that comprise a duplex are encoded as 13-bit fields
+ * in the duplex.
+ *
+ * Per table 10-4, the 4-bit duplex iclass is encoded in bits 31:29, 13
+ */
+static uint32_t get_duplex_iclass(uint32_t encoding)
+{
+uint32_t iclass = extract32(encoding, 13, 1);
+iclass = deposit32(iclass, 1, 3, extract32(encoding, 29, 3));
+return iclass;
+}
+
+/*
+ * Per table 10-5, the duplex ICLASS field values that specify the group of
+ * each sub-instruction in a duplex
+ *
+ * This table points to the decode instruction for each entry in the table
+ */
+typedef bool (*subinsn_decode_func)(DisasContext *ctx, uint16_t insn);
+typedef struct {
+subinsn_decode_func decode_slot0_subinsn;
+subinsn_decode_func decode_slot1_subinsn;
+} subinsn_decode_groups;
+
+static const subinsn_decode_groups decode_groups[16] = {
+[0x0] = { decode_subinsn_l1, decode_subinsn_l1 },
+[0x1] = { decode_subinsn_l2, decode_subinsn_l1 },
+[0x2] = { decode_subinsn_l2, decode_subinsn_l2 },
+[0x3] = { decode_subinsn_a,  decode_subinsn_a },
+[0x4] = { decode_subinsn_l1, decode_subinsn_a },
+[0x5] = { decode_subinsn_l2, decode_subinsn_a },
+[0x6] = { decode_subinsn_s1, decode_subinsn_a },
+[0x7] = { decode_subinsn_s2, decode_subinsn_a },
+[0x8] = { decode_subinsn_s1, decode_subinsn_l1 },
+[0x9] = { decode_subinsn_s1, decode_subinsn_l2 },
+[0xa] = { decode_subinsn_s1, decode_subinsn_s1 },
+[0xb] = { decode_subinsn_s2, decode_subinsn_s1 },
+[0xc] = { decode_subinsn_s2, decode_subinsn_l1 },
+[0xd] = { decode_subinsn_s2, decode_subinsn_l2 },
+[0xe] = { decode_subinsn_s2, decode_subinsn_s2 },
+[0xf] = { NULL,  NULL },  /* Reserved */
+};
+
+static uint16_t get_slot0_subinsn(uint32_t encoding)
+{
+return extract32(encoding, 0, 13);
+}
+
+static uint16_t get_slot1_subinsn(uint32_t encoding)
+{
+return extract32(encoding, 16, 13);
+}
+
 static unsigned int
 decode_insns(DisasContext *ctx, Insn *insn, uint32_t encoding)
 {
@@ -805,8 +870,28 @@ decode_insns(DisasContext *ctx, Insn *insn, uint32_t encoding)
 table = _table_DECODE_ROOT_32;
 g_assert_not_reached();
 } else {
+uint32_t iclass = get_duplex_iclass(encoding);
+unsigned int slot0_subinsn = get_slot0_subinsn(encoding);
+unsigned int slot1_subinsn = get_slot1_subinsn(encoding);
+subinsn_decode_func decode_slot0_subinsn =
+decode_groups[iclass].decode_slot0_subinsn;
+subinsn_decode_func decode_slot1_subinsn =
+decode_groups[iclass].decode_slot1_subinsn;
+
+/* The slot1 subinsn needs to be in the packet first */
+if (decode_slot1_subinsn(ctx, slot1_subinsn)) {
+insn->generate = opcode_genptr[insn->opcode];
+insn->iclass = 
