Re: [PATCH v2 2/3] hw/smbios: Fix thread count in type4

2023-08-04 Thread Michael Tokarev

01.06.2023 12:29, Zhao Liu wrote:

From: Zhao Liu 

From the SMBIOS 3.0 specification, the Thread Count field means:

Thread Count is the total number of threads detected by the BIOS for
this processor socket. It is a processor-wide count, not a
thread-per-core count. [1]

So here we should use threads per socket rather than threads per core.

[1] SMBIOS 3.0.0, section 7.5.8, Processor Information - Thread Count
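
In code terms, the fix amounts to something like the following sketch
(ms->smp.threads and ms->smp.cores are QEMU's usual topology fields; the
0xFF saturation for the one-byte field is an assumption here, not a quote
of the patch):

    /* threads per socket = threads per core * cores per socket */
    unsigned threads_per_socket = ms->smp.threads * ms->smp.cores;
    t->thread_count = threads_per_socket > 255 ? 0xFF : threads_per_socket;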

Fixes: c97294ec1b9e ("SMBIOS: Build aggregate smbios tables and entry point")
Signed-off-by: Zhao Liu 


Hi!

This, and the other two patches in this area, smell like -stable material.
Are they not?

196ea60a73 hw/smbios: Fix core count in type4
7298fd7de5 hw/smbios: Fix thread count in type4
d79a284a44 hw/smbios: Fix smbios_smp_sockets caculation

Thanks,

/mjt



Re: [PATCH v3] hw/cxl: Fix CFMW config memory leak

2023-08-04 Thread Michael Tokarev

31.05.2023 14:08, Jonathan Cameron via wrote:

On Wed, 31 May 2023 09:51:43 +0200
Philippe Mathieu-Daudé  wrote:


On 31/5/23 08:07, Li Zhijian wrote:

Allocate the targets and targets[n] resources only after all sanity checks
have passed, to avoid memory leaks.

Suggested-by: Philippe Mathieu-Daudé 
Signed-off-by: Li Zhijian 
---
V3: allocate further resources only once we can no longer fail # Philippe


Thanks for the v3!

Reviewed-by: Philippe Mathieu-Daudé 


Thanks.  I've added this near the top of my queue so will send
it out along with other similar fixes as a series for Michael
to consider picking up.


Hi!

Has this been forgotten? Is it still needed?

/mjt



Re: [PATCH v9 07/24] linux-user: Do not call get_errno() in do_brk()

2023-08-04 Thread Richard Henderson

On 8/4/23 16:40, Nathan Egge wrote:
linux-user/syscall.c has many such places where this style check fails.
Should these be fixed in a separate patch?


Yes, eventually.


r~




Re: [PATCH 1/3] linux-user/elfload: Enable vxe2 on s390x

2023-08-04 Thread Richard Henderson

On 8/4/23 16:03, Ilya Leoshkevich wrote:

The vxe2 hwcap is not set for programs running in linux-user, but is
set by a Linux kernel running in softmmu. Add it to the former.

Signed-off-by: Ilya Leoshkevich
---
  linux-user/elfload.c | 1 +
  1 file changed, 1 insertion(+)


Reviewed-by: Richard Henderson 

r~



Re: Rutabaga backwards compatibility

2023-08-04 Thread Gurchetan Singh
On Tue, Aug 1, 2023 at 8:18 AM Alyssa Ross  wrote:

> Gurchetan Singh  writes:
>
> > On Mon, Jul 24, 2023 at 2:56 AM Alyssa Ross  wrote:
> >>
> >> Gurchetan Singh  writes:
> >>
> >> > In terms of API stability/versioning/packaging, once this series is
> >> > reviewed, the plan is to cut a "gfxstream upstream release branch".
> >> > We will have the same API guarantees as any other QEMU project then,
> >> > i.e. no breaking API changes for 5 years.
> >>
> >> What about Rutabaga?
> >
> > Yes, rutabaga + gfxstream will both be versioned and maintain API
> > backwards compatibility in line with QEMU guidelines.
>
> In that case, I should draw your attention to
> , which (I've just realised while testing v2
> of your series here) breaks the build of the rutabaga ffi, and which will
> require the addition of a "prot" field to struct rutabaga_handle (a
> breaking change).  I'll push a new version of that CL to fix the rutabaga
> ffi in the next few days.
>

Sorry, I didn't see this until now.  At first glance, do we need to modify
the rutabaga_handle?  Can't we do fcntl(..., F_GETFL) to get the access flags
when needed?
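
Something like this minimal sketch (a hypothetical helper, not code from
the series) is what that alternative would look like:

    #include <fcntl.h>

    /* Recover the access mode from the fd instead of storing it. */
    static int handle_access_flags(int fd)
    {
        int fl = fcntl(fd, F_GETFL);
        return fl == -1 ? -1 : (fl & O_ACCMODE); /* O_RDONLY/O_WRONLY/O_RDWR */
    }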

> Since this is already coming up, before the release has even been made,
> is it worth exploring how to limit the rutabaga API to avoid more
> breaking changes after the release?  Could there be more use of opaque
> structs, for example?
>
> (CCing the crosvm list)
>


[PATCH 0/3] target/s390x: Fix the "ignored match" case in VSTRS

2023-08-04 Thread Ilya Leoshkevich
Hi,

this series should hopefully fix the issue with __strstr_arch13(),
which Claudio reported. I have to admit I did not manage to fully
reproduce it, but at least with this change the traces of a simple test
from TCG and real hardware match.

I've also fuzzed the changed helper and strstr() itself; not sure
whether anything generic may come out of it, but here are the links
anyway [1] [2].

Patch 1 makes glibc pick __strstr_arch13() in qemu-user, patch 2 is the
fix, and patch 3 is the test (generated from Claudio's strings and
further fuzzer findings).

[1] https://gist.github.com/iii-i/5adad06d911c46079d4388001b22ab61
[2] https://gist.github.com/iii-i/c425800e75796eae65660491ac511356

Ilya Leoshkevich (3):
  linux-user/elfload: Enable vxe2 on s390x
  target/s390x: Fix the "ignored match" case in VSTRS
  tests/tcg/s390x: Test VSTRS

 linux-user/elfload.c |  1 +
 target/s390x/tcg/vec_string_helper.c | 54 ++---
 tests/tcg/s390x/Makefile.target  |  1 +
 tests/tcg/s390x/vxeh2_vstrs.c| 88 
 4 files changed, 107 insertions(+), 37 deletions(-)
 create mode 100644 tests/tcg/s390x/vxeh2_vstrs.c

-- 
2.41.0




[PATCH 1/2] target/s390x: Fix VSTL with a large length

2023-08-04 Thread Ilya Leoshkevich
The length is always truncated to 16 bytes. Do not probe more than
that.
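
For example (this is the case exercised by the test in the following
patch): with a length operand of -1 the instruction still stores exactly
16 bytes, so probing the whole requested range would fault spuriously:

    vstl(&v, buf, -1);                        /* stores only 16 bytes */
    assert(buf[2] == 0x5a5a5a5a5a5a5a5aULL);  /* guard word untouched */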

Cc: qemu-sta...@nongnu.org
Fixes: 0e0a5b49ad58 ("s390x/tcg: Implement VECTOR STORE WITH LENGTH")
Signed-off-by: Ilya Leoshkevich 
---
 target/s390x/tcg/vec_helper.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/s390x/tcg/vec_helper.c b/target/s390x/tcg/vec_helper.c
index 48d86722b2d..dafc4c3582c 100644
--- a/target/s390x/tcg/vec_helper.c
+++ b/target/s390x/tcg/vec_helper.c
@@ -193,7 +193,7 @@ void HELPER(vstl)(CPUS390XState *env, const void *v1, 
uint64_t addr,
   uint64_t bytes)
 {
 /* Probe write access before actually modifying memory */
-probe_write_access(env, addr, bytes, GETPC());
+probe_write_access(env, addr, MIN(bytes, 16), GETPC());
 
 if (likely(bytes >= 16)) {
 cpu_stq_data_ra(env, addr, s390_vec_read_element64(v1, 0), GETPC());
-- 
2.41.0




[PATCH 2/2] tests/tcg/s390x: Test VSTL

2023-08-04 Thread Ilya Leoshkevich
Add a small test to prevent regressions.

Signed-off-by: Ilya Leoshkevich 
---
 tests/tcg/s390x/Makefile.target |  1 +
 tests/tcg/s390x/vstl.c  | 37 +
 2 files changed, 38 insertions(+)
 create mode 100644 tests/tcg/s390x/vstl.c

diff --git a/tests/tcg/s390x/Makefile.target b/tests/tcg/s390x/Makefile.target
index 649c1e520e6..e3e22c6f7e3 100644
--- a/tests/tcg/s390x/Makefile.target
+++ b/tests/tcg/s390x/Makefile.target
@@ -60,6 +60,7 @@ Z13_TESTS=vistr
 Z13_TESTS+=lcbb
 Z13_TESTS+=locfhr
 Z13_TESTS+=vcksm
+Z13_TESTS+=vstl
 $(Z13_TESTS): CFLAGS+=-march=z13 -O2
 TESTS+=$(Z13_TESTS)
 
diff --git a/tests/tcg/s390x/vstl.c b/tests/tcg/s390x/vstl.c
new file mode 100644
index 000..bece952c7ee
--- /dev/null
+++ b/tests/tcg/s390x/vstl.c
@@ -0,0 +1,37 @@
+/*
+ * Test the VSTL instruction.
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+#include <assert.h>
+#include <stdlib.h>
+#include "vx.h"
+
+static inline void vstl(S390Vector *v1, void *db2, size_t r3)
+{
+asm("vstl %[v1],%[r3],%[db2]"
+: [db2] "=Q" (*(char *)db2)
+: [v1] "v" (v1->v), [r3] "r" (r3)
+: "memory");
+}
+
+int main(void)
+{
+uint64_t buf[3] = {0x1122334455667788ULL, 0x99aabbccddeeffULL,
+   0x5a5a5a5a5a5a5a5aULL};
+S390Vector v = {.d[0] = 0x1234567887654321ULL,
+.d[1] = 0x9abcdef00fedcba9ULL};
+
+vstl(&v, buf, 0);
+assert(buf[0] == 0x1222334455667788ULL);
+
+vstl(&v, buf, 1);
+assert(buf[0] == 0x1234334455667788ULL);
+
+vstl(&v, buf, -1);
+assert(buf[0] == 0x1234567887654321ULL);
+assert(buf[1] == 0x9abcdef00fedcba9ULL);
+assert(buf[2] == 0x5a5a5a5a5a5a5a5aULL);
+
+return EXIT_SUCCESS;
+}
-- 
2.41.0




[PATCH] target/s390x: Check reserved bits of VFMIN/VFMAX's M5

2023-08-04 Thread Ilya Leoshkevich
VFMIN and VFMAX should raise a specification exception when bits 1-3 of
M5 are set.  In the PoP's left-to-right bit numbering, bits 1-3 of the
4-bit M5 field are its three low-order bits, hence the (m5 & 7) check.

Cc: qemu-sta...@nongnu.org
Fixes: da4807527f3b ("s390x/tcg: Implement VECTOR FP (MAXIMUM|MINIMUM)")
Signed-off-by: Ilya Leoshkevich 
---
 target/s390x/tcg/translate_vx.c.inc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/s390x/tcg/translate_vx.c.inc 
b/target/s390x/tcg/translate_vx.c.inc
index f8df121d3d3..b5d07d5ec53 100644
--- a/target/s390x/tcg/translate_vx.c.inc
+++ b/target/s390x/tcg/translate_vx.c.inc
@@ -3047,7 +3047,7 @@ static DisasJumpType op_vfmax(DisasContext *s, DisasOps 
*o)
 const uint8_t m5 = get_field(s, m5);
 gen_helper_gvec_3_ptr *fn;
 
-if (m6 == 5 || m6 == 6 || m6 == 7 || m6 >= 13) {
+if (m6 == 5 || m6 == 6 || m6 == 7 || m6 >= 13 || (m5 & 7)) {
 gen_program_exception(s, PGM_SPECIFICATION);
 return DISAS_NORETURN;
 }
-- 
2.41.0




[PATCH] linux-user: Emulate the Anonymous: keyword in /proc/self/smaps

2023-08-04 Thread Ilya Leoshkevich
Core dumps produced by gdb's gcore when connected to qemu's gdbstub
lack the stack. The reason is that gdb includes only anonymous memory
in core dumps, and it identifies anonymous memory by a non-zero
Anonymous: value.

Consider mappings with PAGE_ANON fully anonymous, and mappings
without it fully non-anonymous.
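
For illustration, an anonymous mapping then reports its full size in the
Anonymous: field (abridged output, made-up sizes):

    Size:                132 kB
    ...
    Anonymous:           132 kB

while file-backed mappings keep reporting "Anonymous: 0 kB" and are
skipped by gcore as before.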

Signed-off-by: Ilya Leoshkevich 
---
 linux-user/syscall.c | 10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index 95727a816ad..150be661dba 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -8078,7 +8078,7 @@ static int open_self_cmdline(CPUArchState *cpu_env, int 
fd)
 return 0;
 }
 
-static void show_smaps(int fd, unsigned long size)
+static void show_smaps(int fd, unsigned long size, int flags)
 {
 unsigned long page_size_kb = TARGET_PAGE_SIZE >> 10;
 unsigned long size_kb = size >> 10;
@@ -8094,7 +8094,7 @@ static void show_smaps(int fd, unsigned long size)
 "Private_Clean: 0 kB\n"
 "Private_Dirty: 0 kB\n"
 "Referenced:0 kB\n"
-"Anonymous: 0 kB\n"
+"Anonymous: %lu kB\n"
 "LazyFree:  0 kB\n"
 "AnonHugePages: 0 kB\n"
 "ShmemPmdMapped:0 kB\n"
@@ -8104,7 +8104,9 @@ static void show_smaps(int fd, unsigned long size)
 "Swap:  0 kB\n"
 "SwapPss:   0 kB\n"
 "Locked:0 kB\n"
-"THPeligible:0\n", size_kb, page_size_kb, page_size_kb);
+"THPeligible:0\n",
+size_kb, page_size_kb, page_size_kb,
+(flags & PAGE_ANON) ? size_kb : 0);
 }
 
 static int open_self_maps_1(CPUArchState *cpu_env, int fd, bool smaps)
@@ -8155,7 +8157,7 @@ static int open_self_maps_1(CPUArchState *cpu_env, int 
fd, bool smaps)
 dprintf(fd, "\n");
 }
 if (smaps) {
-show_smaps(fd, max - min);
+show_smaps(fd, max - min, flags);
 dprintf(fd, "VmFlags:%s%s%s%s%s%s%s%s\n",
 (flags & PAGE_READ) ? " rd" : "",
 (flags & PAGE_WRITE_ORG) ? " wr" : "",
-- 
2.41.0




[PATCH v9 07/24] linux-user: Do not call get_errno() in do_brk()

2023-08-04 Thread Nathan Egge

On 2023-08-04 18:00, Richard Henderson wrote:

From: Akihiko Odaki 

Later the returned value is compared with -1, and a negated errno is not
expected: target_mmap() reports failure by returning -1, which get_errno()
would turn into a negated errno that the comparison never matches.

Fixes: 00faf08c95 ("linux-user: Don't use MAP_FIXED in do_brk()")
Reviewed-by: Helge Deller 
Signed-off-by: Akihiko Odaki 
Message-Id: <20230802071754.14876-4-akihiko.od...@daynix.com>
Signed-off-by: Richard Henderson 
---
 linux-user/syscall.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index 95727a816a..b9d2ec02f9 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -862,9 +862,9 @@ abi_long do_brk(abi_ulong brk_val)
  */
 if (new_host_brk_page > brk_page) {
 new_alloc_size = new_host_brk_page - brk_page;
-    mapped_addr = get_errno(target_mmap(brk_page, new_alloc_size,
-    PROT_READ|PROT_WRITE,
-    MAP_ANON|MAP_PRIVATE, 0, 0));
+    mapped_addr = target_mmap(brk_page, new_alloc_size,
+  PROT_READ|PROT_WRITE,
+  MAP_ANON|MAP_PRIVATE, 0, 0);
 } else {
 new_alloc_size = 0;
 mapped_addr = brk_page;
--
2.34.1


This patch is triggering a gitlab pipeline failure in Richard's tcg-next 
branch:


https://gitlab.com/rth7680/qemu/-/pipelines/956532662

It can be reproduced locally by adding a git remote for rth7680 and then 
running:


$ git checkout rth7680/tcg-next
$ .gitlab-ci.d/check-patch.py

Checking all commits since c26d005e62f4fd177dae0cd70c24cb96761edebc...

f7cca41b17c809958a7f04b7d7f64af40d64e645:31: ERROR: spaces required around that '|' (ctx:VxV)
f7cca41b17c809958a7f04b7d7f64af40d64e645:32: ERROR: spaces required around that '|' (ctx:VxV)

total: 2 errors, 0 warnings, 12 lines checked
    ❌ FAIL one or more commits failed scripts/checkpatch.pl

linux-user/syscall.c has many such places where this style check fails.
Should these be fixed in a separate patch?


Sincerely,

Nathan



[PATCH 2/3] target/s390x: Fix the "ignored match" case in VSTRS

2023-08-04 Thread Ilya Leoshkevich
Currently the emulation of VSTRS recognizes partial matches in the presence
of a \0 in the haystack, which, according to the PoP, is not correct:

If the ZS flag is one and a zero byte was detected
in the second operand, then there can not be a
partial match ...

Add a check for this. While at it, fold a number of explicitly handled
special cases into the generic logic.

Cc: qemu-sta...@nongnu.org
Reported-by: Claudio Fontana 
Closes: https://lists.gnu.org/archive/html/qemu-devel/2023-08/msg00633.html
Fixes: 1d706f314191 ("target/s390x: vxeh2: vector string search")
Signed-off-by: Ilya Leoshkevich 
---
 target/s390x/tcg/vec_string_helper.c | 54 +---
 1 file changed, 17 insertions(+), 37 deletions(-)

diff --git a/target/s390x/tcg/vec_string_helper.c 
b/target/s390x/tcg/vec_string_helper.c
index 9b85becdfbf..a19f429768f 100644
--- a/target/s390x/tcg/vec_string_helper.c
+++ b/target/s390x/tcg/vec_string_helper.c
@@ -474,9 +474,9 @@ DEF_VSTRC_CC_RT_HELPER(32)
 static int vstrs(S390Vector *v1, const S390Vector *v2, const S390Vector *v3,
  const S390Vector *v4, uint8_t es, bool zs)
 {
-int substr_elen, substr_0, str_elen, i, j, k, cc;
+int substr_elen, i, j, k, cc;
 int nelem = 16 >> es;
-bool eos = false;
+int str_leftmost_0;
 
 substr_elen = s390_vec_read_element8(v4, 7) >> es;
 
@@ -498,47 +498,20 @@ static int vstrs(S390Vector *v1, const S390Vector *v2, 
const S390Vector *v3,
 }
 
 /* If ZS, look for eos in the searched string. */
+str_leftmost_0 = nelem;
 if (zs) {
 for (k = 0; k < nelem; k++) {
 if (s390_vec_read_element(v2, k, es) == 0) {
-eos = true;
+str_leftmost_0 = k;
 break;
 }
 }
-str_elen = k;
-} else {
-str_elen = nelem;
 }
 
-substr_0 = s390_vec_read_element(v3, 0, es);
-
-for (k = 0; ; k++) {
-for (; k < str_elen; k++) {
-if (s390_vec_read_element(v2, k, es) == substr_0) {
-break;
-}
-}
-
-/* If we reached the end of the string, no match. */
-if (k == str_elen) {
-cc = eos; /* no match (with or without zero char) */
-goto done;
-}
-
-/* If the substring is only one char, match. */
-if (substr_elen == 1) {
-cc = 2; /* full match */
-goto done;
-}
-
-/* If the match begins at the last char, we have a partial match. */
-if (k == str_elen - 1) {
-cc = 3; /* partial match */
-goto done;
-}
-
+cc = str_leftmost_0 == nelem ? 0 : 1;  /* No match. */
+for (k = 0; k < nelem; k++) {
 i = MIN(nelem, k + substr_elen);
-for (j = k + 1; j < i; j++) {
+for (j = k; j < i; j++) {
 uint32_t e2 = s390_vec_read_element(v2, j, es);
 uint32_t e3 = s390_vec_read_element(v3, j - k, es);
 if (e2 != e3) {
@@ -546,9 +519,16 @@ static int vstrs(S390Vector *v1, const S390Vector *v2, 
const S390Vector *v3,
 }
 }
 if (j == i) {
-/* Matched up until "end". */
-cc = i - k == substr_elen ? 2 : 3; /* full or partial match */
-goto done;
+/* All elements matched. */
+if (k > str_leftmost_0) {
+cc = 1;  /* Ignored match. */
+k = nelem;
+} else if (i - k == substr_elen) {
+cc = 2;  /* Full match. */
+} else {
+cc = 3;  /* Partial match. */
+}
+break;
 }
 }
 
-- 
2.41.0




[PATCH 1/3] linux-user/elfload: Enable vxe2 on s390x

2023-08-04 Thread Ilya Leoshkevich
The vxe2 hwcap is not set for programs running in linux-user, but is
set by a Linux kernel running in softmmu. Add it to the former.

Signed-off-by: Ilya Leoshkevich 
---
 linux-user/elfload.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/linux-user/elfload.c b/linux-user/elfload.c
index 861ec07abcd..33b20548721 100644
--- a/linux-user/elfload.c
+++ b/linux-user/elfload.c
@@ -1614,6 +1614,7 @@ uint32_t get_elf_hwcap(void)
 }
 GET_FEATURE(S390_FEAT_VECTOR, HWCAP_S390_VXRS);
 GET_FEATURE(S390_FEAT_VECTOR_ENH, HWCAP_S390_VXRS_EXT);
+GET_FEATURE(S390_FEAT_VECTOR_ENH2, HWCAP_S390_VXRS_EXT2);
 
 return hwcap;
 }
-- 
2.41.0




[PATCH 3/3] tests/tcg/s390x: Test VSTRS

2023-08-04 Thread Ilya Leoshkevich
Add a small test to prevent regressions.

Signed-off-by: Ilya Leoshkevich 
---
 tests/tcg/s390x/Makefile.target |  1 +
 tests/tcg/s390x/vxeh2_vstrs.c   | 88 +
 2 files changed, 89 insertions(+)
 create mode 100644 tests/tcg/s390x/vxeh2_vstrs.c

diff --git a/tests/tcg/s390x/Makefile.target b/tests/tcg/s390x/Makefile.target
index 1fc98099070..8ba36e5985b 100644
--- a/tests/tcg/s390x/Makefile.target
+++ b/tests/tcg/s390x/Makefile.target
@@ -73,6 +73,7 @@ ifneq ($(CROSS_CC_HAS_Z15),)
 Z15_TESTS=vxeh2_vs
 Z15_TESTS+=vxeh2_vcvt
 Z15_TESTS+=vxeh2_vlstr
+Z15_TESTS+=vxeh2_vstrs
 $(Z15_TESTS): CFLAGS+=-march=z15 -O2
 TESTS+=$(Z15_TESTS)
 endif
diff --git a/tests/tcg/s390x/vxeh2_vstrs.c b/tests/tcg/s390x/vxeh2_vstrs.c
new file mode 100644
index 000..313ec1d728f
--- /dev/null
+++ b/tests/tcg/s390x/vxeh2_vstrs.c
@@ -0,0 +1,88 @@
+/*
+ * Test the VSTRS instruction.
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+#include <assert.h>
+#include <stdint.h>
+#include <stdlib.h>
+#include <string.h>
+#include "vx.h"
+
+static inline __attribute__((__always_inline__)) int
+vstrs(S390Vector *v1, const S390Vector *v2, const S390Vector *v3,
+  const S390Vector *v4, const uint8_t m5, const uint8_t m6)
+{
+int cc;
+
+asm("vstrs %[v1],%[v2],%[v3],%[v4],%[m5],%[m6]\n"
+"ipm %[cc]"
+: [v1] "=v" (v1->v)
+, [cc] "=r" (cc)
+: [v2] "v" (v2->v)
+, [v3] "v" (v3->v)
+, [v4] "v" (v4->v)
+, [m5] "i" (m5)
+, [m6]  "i" (m6)
+: "cc");
+
+return (cc >> 28) & 3;
+}
+
+static void test_ignored_match(void)
+{
+S390Vector v1;
+S390Vector v2 = {.d[0] = 0x222000205e41ULL, .d[1] = 0};
+S390Vector v3 = {.d[0] = 0x205e4100ULL, .d[1] = 0};
+S390Vector v4 = {.d[0] = 3, .d[1] = 0};
+
+assert(vstrs(&v1, &v2, &v3, &v4, 0, 2) == 1);
+assert(v1.d[0] == 16);
+assert(v1.d[1] == 0);
+}
+
+static void test_empty_needle(void)
+{
+S390Vector v1;
+S390Vector v2 = {.d[0] = 0x5300ULL, .d[1] = 0};
+S390Vector v3 = {.d[0] = 0, .d[1] = 0};
+S390Vector v4 = {.d[0] = 0, .d[1] = 0};
+
+assert(vstrs(&v1, &v2, &v3, &v4, 0, 0) == 2);
+assert(v1.d[0] == 0);
+assert(v1.d[1] == 0);
+}
+
+static void test_max_length(void)
+{
+S390Vector v1;
+S390Vector v2 = {.d[0] = 0x1122334455667700ULL, .d[1] = 0};
+S390Vector v3 = {.d[0] = 0, .d[1] = 0};
+S390Vector v4 = {.d[0] = 16, .d[1] = 0};
+
+assert(vstrs(&v1, &v2, &v3, &v4, 0, 0) == 3);
+assert(v1.d[0] == 7);
+assert(v1.d[1] == 0);
+}
+
+static void test_no_match(void)
+{
+S390Vector v1;
+S390Vector v2 = {.d[0] = 0xff000f00ULL, .d[1] = 0x82b};
+S390Vector v3 = {.d[0] = 0xfffffffffffffffeULL,
+ .d[1] = 0xffffffffffffffffULL};
+S390Vector v4 = {.d[0] = 11, .d[1] = 0};
+
+assert(vstrs(&v1, &v2, &v3, &v4, 0, 2) == 1);
+assert(v1.d[0] == 16);
+assert(v1.d[1] == 0);
+}
+
+int main(void)
+{
+test_ignored_match();
+test_empty_needle();
+test_max_length();
+test_no_match();
+return EXIT_SUCCESS;
+}
-- 
2.41.0




[PATCH v9 23/24] accel/tcg: Call save_iotlb_data from io_readx as well.

2023-08-04 Thread Richard Henderson
From: Mikhail Tyutin 

Apply save_iotlb_data() to io_readx() as well as to io_writex().
This fixes a SEGFAULT when plugins call qemu_plugin_hwaddr_phys_addr()
for addresses inside an MMIO region.

Signed-off-by: Dmitriy Solovev 
Signed-off-by: Mikhail Tyutin 
Reviewed-by: Richard Henderson 
Message-Id: <20230804110903.19968-1-m.tyu...@yadro.com>
Signed-off-by: Richard Henderson 
---
 accel/tcg/cputlb.c | 36 +---
 1 file changed, 21 insertions(+), 15 deletions(-)

diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index 4b1bfaa53d..d68fa6867c 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -1363,6 +1363,21 @@ static inline void cpu_transaction_failed(CPUState *cpu, 
hwaddr physaddr,
 }
 }
 
+/*
+ * Save a potentially trashed CPUTLBEntryFull for later lookup by plugin.
+ * This is read by tlb_plugin_lookup if the fulltlb entry doesn't match
+ * because of the side effect of io_writex changing memory layout.
+ */
+static void save_iotlb_data(CPUState *cs, MemoryRegionSection *section,
+hwaddr mr_offset)
+{
+#ifdef CONFIG_PLUGIN
+SavedIOTLB *saved = &cs->saved_iotlb;
+saved->section = section;
+saved->mr_offset = mr_offset;
+#endif
+}
+
 static uint64_t io_readx(CPUArchState *env, CPUTLBEntryFull *full,
  int mmu_idx, vaddr addr, uintptr_t retaddr,
  MMUAccessType access_type, MemOp op)
@@ -1382,6 +1397,12 @@ static uint64_t io_readx(CPUArchState *env, 
CPUTLBEntryFull *full,
 cpu_io_recompile(cpu, retaddr);
 }
 
+/*
+ * The memory_region_dispatch may trigger a flush/resize
+ * so for plugins we save the iotlb_data just in case.
+ */
+save_iotlb_data(cpu, section, mr_offset);
+
 {
 QEMU_IOTHREAD_LOCK_GUARD();
r = memory_region_dispatch_read(mr, mr_offset, &val, op, full->attrs);
@@ -1398,21 +1419,6 @@ static uint64_t io_readx(CPUArchState *env, 
CPUTLBEntryFull *full,
 return val;
 }
 
-/*
- * Save a potentially trashed CPUTLBEntryFull for later lookup by plugin.
- * This is read by tlb_plugin_lookup if the fulltlb entry doesn't match
- * because of the side effect of io_writex changing memory layout.
- */
-static void save_iotlb_data(CPUState *cs, MemoryRegionSection *section,
-hwaddr mr_offset)
-{
-#ifdef CONFIG_PLUGIN
-SavedIOTLB *saved = &cs->saved_iotlb;
-saved->section = section;
-saved->mr_offset = mr_offset;
-#endif
-}
-
 static void io_writex(CPUArchState *env, CPUTLBEntryFull *full,
   int mmu_idx, uint64_t val, vaddr addr,
   uintptr_t retaddr, MemOp op)
-- 
2.34.1




[PATCH v9 21/24] linux-user: Do not adjust zero_bss for host page size

2023-08-04 Thread Richard Henderson
Rely on target_mmap to handle guest vs host page size mismatch.

Tested-by: Helge Deller 
Reviewed-by: Helge Deller 
Reviewed-by: Akihiko Odaki 
Signed-off-by: Richard Henderson 
---
 linux-user/elfload.c | 54 +++-
 1 file changed, 23 insertions(+), 31 deletions(-)

diff --git a/linux-user/elfload.c b/linux-user/elfload.c
index e853a4ab33..66ab617bd1 100644
--- a/linux-user/elfload.c
+++ b/linux-user/elfload.c
@@ -2212,44 +2212,36 @@ static abi_ulong setup_arg_pages(struct linux_binprm 
*bprm,
 
 /* Map and zero the bss.  We need to explicitly zero any fractional pages
after the data section (i.e. bss).  */
-static void zero_bss(abi_ulong elf_bss, abi_ulong last_bss, int prot)
+static void zero_bss(abi_ulong start_bss, abi_ulong end_bss, int prot)
 {
-uintptr_t host_start, host_map_start, host_end;
+abi_ulong align_bss;
 
-last_bss = TARGET_PAGE_ALIGN(last_bss);
+align_bss = TARGET_PAGE_ALIGN(start_bss);
+end_bss = TARGET_PAGE_ALIGN(end_bss);
 
-/* ??? There is confusion between qemu_real_host_page_size and
-   qemu_host_page_size here and elsewhere in target_mmap, which
-   may lead to the end of the data section mapping from the file
-   not being mapped.  At least there was an explicit test and
-   comment for that here, suggesting that "the file size must
-   be known".  The comment probably pre-dates the introduction
-   of the fstat system call in target_mmap which does in fact
-   find out the size.  What isn't clear is if the workaround
-   here is still actually needed.  For now, continue with it,
-   but merge it with the "normal" mmap that would allocate the bss.  */
+if (start_bss < align_bss) {
+int flags = page_get_flags(start_bss);
 
-host_start = (uintptr_t) g2h_untagged(elf_bss);
-host_end = (uintptr_t) g2h_untagged(last_bss);
-host_map_start = REAL_HOST_PAGE_ALIGN(host_start);
-
-if (host_map_start < host_end) {
-void *p = mmap((void *)host_map_start, host_end - host_map_start,
-   prot, MAP_FIXED | MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
-if (p == MAP_FAILED) {
-perror("cannot mmap brk");
-exit(-1);
+if (!(flags & PAGE_VALID)) {
+/* Map the start of the bss. */
+align_bss -= TARGET_PAGE_SIZE;
+} else if (flags & PAGE_WRITE) {
+/* The page is already mapped writable. */
+memset(g2h_untagged(start_bss), 0, align_bss - start_bss);
+} else {
+/* Read-only zeros? */
+g_assert_not_reached();
 }
 }
 
-/* Ensure that the bss page(s) are valid */
-if ((page_get_flags(last_bss-1) & prot) != prot) {
-page_set_flags(elf_bss & TARGET_PAGE_MASK, last_bss - 1,
-   prot | PAGE_VALID);
-}
-
-if (host_start < host_map_start) {
-memset((void *)host_start, 0, host_map_start - host_start);
+if (align_bss < end_bss) {
+abi_long err = target_mmap(align_bss, end_bss - align_bss, prot,
+   MAP_FIXED | MAP_PRIVATE | MAP_ANONYMOUS,
+   -1, 0);
+if (err == -1) {
+perror("cannot mmap brk");
+exit(-1);
+}
 }
 }
 
-- 
2.34.1




[PATCH v9 07/24] linux-user: Do not call get_errno() in do_brk()

2023-08-04 Thread Richard Henderson
From: Akihiko Odaki 

Later the returned value is compared with -1, and a negated errno is not
expected: target_mmap() reports failure by returning -1, which get_errno()
would turn into a negated errno that the comparison never matches.
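
For reference, get_errno() behaves roughly like this sketch (a paraphrase,
not the literal helper):

    static abi_long get_errno(abi_long ret)
    {
        /* Translate the libc convention (-1 plus errno) to -target_errno. */
        return ret == -1 ? -host_to_target_errno(errno) : ret;
    }

target_mmap() already returns -1 directly on failure, so wrapping it here
would hand the caller a negated errno that its -1 check never matches.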

Fixes: 00faf08c95 ("linux-user: Don't use MAP_FIXED in do_brk()")
Reviewed-by: Helge Deller 
Signed-off-by: Akihiko Odaki 
Message-Id: <20230802071754.14876-4-akihiko.od...@daynix.com>
Signed-off-by: Richard Henderson 
---
 linux-user/syscall.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index 95727a816a..b9d2ec02f9 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -862,9 +862,9 @@ abi_long do_brk(abi_ulong brk_val)
  */
 if (new_host_brk_page > brk_page) {
 new_alloc_size = new_host_brk_page - brk_page;
-mapped_addr = get_errno(target_mmap(brk_page, new_alloc_size,
-PROT_READ|PROT_WRITE,
-MAP_ANON|MAP_PRIVATE, 0, 0));
+mapped_addr = target_mmap(brk_page, new_alloc_size,
+  PROT_READ|PROT_WRITE,
+  MAP_ANON|MAP_PRIVATE, 0, 0);
 } else {
 new_alloc_size = 0;
 mapped_addr = brk_page;
-- 
2.34.1




[PATCH v9 22/24] linux-user: Use zero_bss for PT_LOAD with no file contents too

2023-08-04 Thread Richard Henderson
If p_filesz == 0, then vaddr_ef == vaddr.  We can reuse the
code in zero_bss rather than incompletely duplicating it in
load_elf_image.

Tested-by: Helge Deller 
Reviewed-by: Helge Deller 
Reviewed-by: Akihiko Odaki 
Signed-off-by: Richard Henderson 
---
 linux-user/elfload.c | 27 +++
 1 file changed, 7 insertions(+), 20 deletions(-)

diff --git a/linux-user/elfload.c b/linux-user/elfload.c
index 66ab617bd1..51591a1d94 100644
--- a/linux-user/elfload.c
+++ b/linux-user/elfload.c
@@ -3209,7 +3209,7 @@ static void load_elf_image(const char *image_name, int 
image_fd,
 for (i = 0; i < ehdr->e_phnum; i++) {
 struct elf_phdr *eppnt = phdr + i;
 if (eppnt->p_type == PT_LOAD) {
-abi_ulong vaddr, vaddr_po, vaddr_ps, vaddr_ef, vaddr_em, vaddr_len;
+abi_ulong vaddr, vaddr_po, vaddr_ps, vaddr_ef, vaddr_em;
 int elf_prot = 0;
 
 if (eppnt->p_flags & PF_R) {
@@ -3234,30 +3234,17 @@ static void load_elf_image(const char *image_name, int 
image_fd,
  * but no backing file segment.
  */
 if (eppnt->p_filesz != 0) {
-vaddr_len = eppnt->p_filesz + vaddr_po;
-error = target_mmap(vaddr_ps, vaddr_len, elf_prot,
-MAP_PRIVATE | MAP_FIXED,
+error = target_mmap(vaddr_ps, eppnt->p_filesz + vaddr_po,
+elf_prot, MAP_PRIVATE | MAP_FIXED,
 image_fd, eppnt->p_offset - vaddr_po);
-
 if (error == -1) {
 goto exit_mmap;
 }
+}
 
-/*
- * If the load segment requests extra zeros (e.g. bss), map it.
- */
-if (eppnt->p_filesz < eppnt->p_memsz) {
-zero_bss(vaddr_ef, vaddr_em, elf_prot);
-}
-} else if (eppnt->p_memsz != 0) {
-vaddr_len = eppnt->p_memsz + vaddr_po;
-error = target_mmap(vaddr_ps, vaddr_len, elf_prot,
-MAP_PRIVATE | MAP_FIXED | MAP_ANONYMOUS,
--1, 0);
-
-if (error == -1) {
-goto exit_mmap;
-}
+/* If the load segment requests extra zeros (e.g. bss), map it. */
+if (vaddr_ef < vaddr_em) {
+zero_bss(vaddr_ef, vaddr_em, elf_prot);
 }
 
 /* Find the full program boundaries.  */
-- 
2.34.1




[PATCH v9 12/24] bsd-user: Remove last_brk

2023-08-04 Thread Richard Henderson
This variable is unused.

Signed-off-by: Richard Henderson 
---
 bsd-user/qemu.h | 1 -
 bsd-user/mmap.c | 2 --
 2 files changed, 3 deletions(-)

diff --git a/bsd-user/qemu.h b/bsd-user/qemu.h
index edf9602f9b..8f2d6a3c78 100644
--- a/bsd-user/qemu.h
+++ b/bsd-user/qemu.h
@@ -232,7 +232,6 @@ abi_long target_mremap(abi_ulong old_addr, abi_ulong 
old_size,
abi_ulong new_size, unsigned long flags,
abi_ulong new_addr);
 int target_msync(abi_ulong start, abi_ulong len, int flags);
-extern unsigned long last_brk;
 extern abi_ulong mmap_next_start;
 abi_ulong mmap_find_vma(abi_ulong start, abi_ulong size);
 void TSA_NO_TSA mmap_fork_start(void);
diff --git a/bsd-user/mmap.c b/bsd-user/mmap.c
index b62a69bd07..8e148a2ea3 100644
--- a/bsd-user/mmap.c
+++ b/bsd-user/mmap.c
@@ -214,8 +214,6 @@ static int mmap_frag(abi_ulong real_start,
 #endif
 abi_ulong mmap_next_start = TASK_UNMAPPED_BASE;
 
-unsigned long last_brk;
-
 /*
  * Subroutine of mmap_find_vma, used when we have pre-allocated a chunk of 
guest
  * address space.
-- 
2.34.1




[PATCH v9 14/24] linux-user: Define TASK_UNMAPPED_BASE in $guest/target_mman.h

2023-08-04 Thread Richard Henderson
Provide default values that are as close as possible to the
values used by the guest's kernel.

Tested-by: Helge Deller 
Reviewed-by: Helge Deller 
Reviewed-by: Akihiko Odaki 
Signed-off-by: Richard Henderson 
---
 linux-user/aarch64/target_mman.h | 10 ++
 linux-user/alpha/target_mman.h   |  8 
 linux-user/arm/target_mman.h |  8 
 linux-user/cris/target_mman.h|  9 +
 linux-user/hexagon/target_mman.h | 10 ++
 linux-user/hppa/target_mman.h|  3 +++
 linux-user/i386/target_mman.h| 13 +
 linux-user/loongarch64/target_mman.h |  8 
 linux-user/m68k/target_mman.h|  3 +++
 linux-user/microblaze/target_mman.h  |  8 
 linux-user/mips/target_mman.h|  7 +++
 linux-user/nios2/target_mman.h   |  7 +++
 linux-user/openrisc/target_mman.h|  7 +++
 linux-user/ppc/target_mman.h | 13 +
 linux-user/riscv/target_mman.h   |  7 +++
 linux-user/s390x/target_mman.h   | 10 ++
 linux-user/sh4/target_mman.h |  4 
 linux-user/sparc/target_mman.h   | 14 ++
 linux-user/user-mmap.h   | 14 --
 linux-user/x86_64/target_mman.h  | 12 
 linux-user/xtensa/target_mman.h  |  6 ++
 21 files changed, 167 insertions(+), 14 deletions(-)

diff --git a/linux-user/aarch64/target_mman.h b/linux-user/aarch64/target_mman.h
index f721295fe1..4d3eecfb26 100644
--- a/linux-user/aarch64/target_mman.h
+++ b/linux-user/aarch64/target_mman.h
@@ -4,6 +4,16 @@
 #define TARGET_PROT_BTI 0x10
 #define TARGET_PROT_MTE 0x20
 
+/*
+ * arch/arm64/include/asm/processor.h:
+ *
+ * TASK_UNMAPPED_BASE     DEFAULT_MAP_WINDOW / 4
+ * DEFAULT_MAP_WINDOW     DEFAULT_MAP_WINDOW_64
+ * DEFAULT_MAP_WINDOW_64  UL(1) << VA_BITS_MIN
+ * VA_BITS_MIN            48 (unless explicitly configured smaller)
+ */
+#define TASK_UNMAPPED_BASE  (1ull << (48 - 2))
+
 #include "../generic/target_mman.h"
 
 #endif
diff --git a/linux-user/alpha/target_mman.h b/linux-user/alpha/target_mman.h
index 6bb03e7336..c90b493711 100644
--- a/linux-user/alpha/target_mman.h
+++ b/linux-user/alpha/target_mman.h
@@ -20,6 +20,14 @@
 #define TARGET_MS_SYNC 2
 #define TARGET_MS_INVALIDATE 4
 
+/*
+ * arch/alpha/include/asm/processor.h:
+ *
+ * TASK_UNMAPPED_BASE   TASK_SIZE / 2
+ * TASK_SIZE            0x40000000000UL
+ */
+#define TASK_UNMAPPED_BASE  0x20000000000ull
+
 #include "../generic/target_mman.h"
 
 #endif
diff --git a/linux-user/arm/target_mman.h b/linux-user/arm/target_mman.h
index e7ba6070fe..76275b2c7e 100644
--- a/linux-user/arm/target_mman.h
+++ b/linux-user/arm/target_mman.h
@@ -1 +1,9 @@
+/*
+ * arch/arm/include/asm/memory.h
+ * TASK_UNMAPPED_BASE    ALIGN(TASK_SIZE / 3, SZ_16M)
+ * TASK_SIZE             CONFIG_PAGE_OFFSET
+ * CONFIG_PAGE_OFFSET    0xC0000000 (default in Kconfig)
+ */
+#define TASK_UNMAPPED_BASE   0x40000000
+
 #include "../generic/target_mman.h"
diff --git a/linux-user/cris/target_mman.h b/linux-user/cris/target_mman.h
index e7ba6070fe..9df7b1eda5 100644
--- a/linux-user/cris/target_mman.h
+++ b/linux-user/cris/target_mman.h
@@ -1 +1,10 @@
+/*
+ * arch/cris/include/asm/processor.h:
+ * TASK_UNMAPPED_BASE  (PAGE_ALIGN(TASK_SIZE / 3))
+ *
+ * arch/cris/include/arch-v32/arch/processor.h
+ * TASK_SIZE   0xb0000000
+ */
+#define TASK_UNMAPPED_BASE TARGET_PAGE_ALIGN(0xb0000000 / 3)
+
 #include "../generic/target_mman.h"
diff --git a/linux-user/hexagon/target_mman.h b/linux-user/hexagon/target_mman.h
index e7ba6070fe..c5ae336e07 100644
--- a/linux-user/hexagon/target_mman.h
+++ b/linux-user/hexagon/target_mman.h
@@ -1 +1,11 @@
+/*
+ * arch/hexgon/include/asm/processor.h
+ * TASK_UNMAPPED_BASEPAGE_ALIGN(TASK_SIZE / 3)
+ *
+ * arch/hexagon/include/asm/mem-layout.h
+ * TASK_SIZE PAGE_OFFSET
+ * PAGE_OFFSET   0xc0000000
+ */
+#define TASK_UNMAPPED_BASE   0x40000000
+
 #include "../generic/target_mman.h"
diff --git a/linux-user/hppa/target_mman.h b/linux-user/hppa/target_mman.h
index 97f87d042a..6459e7dbdd 100644
--- a/linux-user/hppa/target_mman.h
+++ b/linux-user/hppa/target_mman.h
@@ -24,6 +24,9 @@
 #define TARGET_MS_ASYNC 2
 #define TARGET_MS_INVALIDATE 4
 
+/* arch/parisc/include/asm/processor.h: DEFAULT_MAP_BASE32 */
+#define TASK_UNMAPPED_BASE  0x40000000
+
 #include "../generic/target_mman.h"
 
 #endif
diff --git a/linux-user/i386/target_mman.h b/linux-user/i386/target_mman.h
index e7ba6070fe..cc3382007f 100644
--- a/linux-user/i386/target_mman.h
+++ b/linux-user/i386/target_mman.h
@@ -1 +1,14 @@
+/*
+ * arch/x86/include/asm/processor.h:
+ * TASK_UNMAPPED_BASE __TASK_UNMAPPED_BASE(TASK_SIZE_LOW)
+ * __TASK_UNMAPPED_BASE(S)PAGE_ALIGN(S / 3)
+ *
+ * arch/x86/include/asm/page_32_types.h:
+ * TASK_SIZE_LOW  TASK_SIZE
+ * TASK_SIZE  __PAGE_OFFSET
+ * __PAGE_OFFSET  

[PATCH v9 15/24] linux-user: Define ELF_ET_DYN_BASE in $guest/target_mman.h

2023-08-04 Thread Richard Henderson
Copy each guest kernel's default value, then bound it
against reserved_va or the host address space.

Tested-by: Helge Deller 
Reviewed-by: Helge Deller 
Signed-off-by: Richard Henderson 
---
 linux-user/aarch64/target_mman.h |  3 +++
 linux-user/alpha/target_mman.h   |  3 +++
 linux-user/arm/target_mman.h |  3 +++
 linux-user/cris/target_mman.h|  3 +++
 linux-user/hexagon/target_mman.h |  3 +++
 linux-user/hppa/target_mman.h|  3 +++
 linux-user/i386/target_mman.h|  3 +++
 linux-user/loongarch64/target_mman.h |  3 +++
 linux-user/m68k/target_mman.h|  2 ++
 linux-user/microblaze/target_mman.h  |  3 +++
 linux-user/mips/target_mman.h|  3 +++
 linux-user/nios2/target_mman.h   |  3 +++
 linux-user/openrisc/target_mman.h|  3 +++
 linux-user/ppc/target_mman.h |  7 +++
 linux-user/riscv/target_mman.h   |  3 +++
 linux-user/s390x/target_mman.h   | 10 ++
 linux-user/sh4/target_mman.h |  3 +++
 linux-user/sparc/target_mman.h   | 11 +++
 linux-user/user-mmap.h   |  1 +
 linux-user/x86_64/target_mman.h  |  3 +++
 linux-user/xtensa/target_mman.h  |  4 
 linux-user/main.c| 15 +++
 linux-user/mmap.c|  1 +
 23 files changed, 96 insertions(+)

diff --git a/linux-user/aarch64/target_mman.h b/linux-user/aarch64/target_mman.h
index 4d3eecfb26..69ec5d5739 100644
--- a/linux-user/aarch64/target_mman.h
+++ b/linux-user/aarch64/target_mman.h
@@ -14,6 +14,9 @@
  */
 #define TASK_UNMAPPED_BASE  (1ull << (48 - 2))
 
+/* arch/arm64/include/asm/elf.h */
+#define ELF_ET_DYN_BASE TARGET_PAGE_ALIGN((1ull << 48) / 3 * 2)
+
 #include "../generic/target_mman.h"
 
 #endif
diff --git a/linux-user/alpha/target_mman.h b/linux-user/alpha/target_mman.h
index c90b493711..8edfe2b88c 100644
--- a/linux-user/alpha/target_mman.h
+++ b/linux-user/alpha/target_mman.h
@@ -28,6 +28,9 @@
  */
 #define TASK_UNMAPPED_BASE  0x20000000000ull
 
+/* arch/alpha/include/asm/elf.h */
+#define ELF_ET_DYN_BASE (TASK_UNMAPPED_BASE + 0x1000000)
+
 #include "../generic/target_mman.h"
 
 #endif
diff --git a/linux-user/arm/target_mman.h b/linux-user/arm/target_mman.h
index 76275b2c7e..51005da869 100644
--- a/linux-user/arm/target_mman.h
+++ b/linux-user/arm/target_mman.h
@@ -6,4 +6,7 @@
  */
 #define TASK_UNMAPPED_BASE   0x40000000
 
+/* arch/arm/include/asm/elf.h */
+#define ELF_ET_DYN_BASE  0x00400000
+
 #include "../generic/target_mman.h"
diff --git a/linux-user/cris/target_mman.h b/linux-user/cris/target_mman.h
index 9df7b1eda5..9ace8ac292 100644
--- a/linux-user/cris/target_mman.h
+++ b/linux-user/cris/target_mman.h
@@ -7,4 +7,7 @@
  */
 #define TASK_UNMAPPED_BASE TARGET_PAGE_ALIGN(0xb0000000 / 3)
 
+/* arch/cris/include/uapi/asm/elf.h */
+#define ELF_ET_DYN_BASE    (TASK_UNMAPPED_BASE * 2)
+
 #include "../generic/target_mman.h"
diff --git a/linux-user/hexagon/target_mman.h b/linux-user/hexagon/target_mman.h
index c5ae336e07..e6b5e2ca36 100644
--- a/linux-user/hexagon/target_mman.h
+++ b/linux-user/hexagon/target_mman.h
@@ -8,4 +8,7 @@
  */
 #define TASK_UNMAPPED_BASE   0x40000000
 
+/* arch/hexagon/include/asm/elf.h */
+#define ELF_ET_DYN_BASE  0x08000000
+
 #include "../generic/target_mman.h"
diff --git a/linux-user/hppa/target_mman.h b/linux-user/hppa/target_mman.h
index 6459e7dbdd..ccda46e842 100644
--- a/linux-user/hppa/target_mman.h
+++ b/linux-user/hppa/target_mman.h
@@ -27,6 +27,9 @@
 /* arch/parisc/include/asm/processor.h: DEFAULT_MAP_BASE32 */
 #define TASK_UNMAPPED_BASE  0x40000000
 
+/* arch/parisc/include/asm/elf.h */
+#define ELF_ET_DYN_BASE (TASK_UNMAPPED_BASE + 0x01000000)
+
 #include "../generic/target_mman.h"
 
 #endif
diff --git a/linux-user/i386/target_mman.h b/linux-user/i386/target_mman.h
index cc3382007f..e3b8e1eaa6 100644
--- a/linux-user/i386/target_mman.h
+++ b/linux-user/i386/target_mman.h
@@ -11,4 +11,7 @@
  */
 #define TASK_UNMAPPED_BASE    0x40000000
 
+/* arch/x86/include/asm/elf.h */
+#define ELF_ET_DYN_BASE   0x00400000
+
 #include "../generic/target_mman.h"
diff --git a/linux-user/loongarch64/target_mman.h 
b/linux-user/loongarch64/target_mman.h
index d70e44d44c..8c2a3d5596 100644
--- a/linux-user/loongarch64/target_mman.h
+++ b/linux-user/loongarch64/target_mman.h
@@ -6,4 +6,7 @@
 #define TASK_UNMAPPED_BASE \
 TARGET_PAGE_ALIGN((1ull << TARGET_VIRT_ADDR_SPACE_BITS) / 3)
 
+/* arch/loongarch/include/asm/elf.h */
+#define ELF_ET_DYN_BASE   (TASK_UNMAPPED_BASE * 2)
+
 #include "../generic/target_mman.h"
diff --git a/linux-user/m68k/target_mman.h b/linux-user/m68k/target_mman.h
index d3eceb663b..20cfe750c5 100644
--- a/linux-user/m68k/target_mman.h
+++ b/linux-user/m68k/target_mman.h
@@ -1,4 +1,6 @@
 /* arch/m68k/include/asm/processor.h */
 #define TASK_UNMAPPED_BASE  0xC0000000
+/* arch/m68k/include/asm/elf.h */
+#define ELF_ET_DYN_BASE 0xD0000000
 
 #include "../generic/target_mman.h"

[PATCH v9 20/24] linux-user: Do not adjust image mapping for host page size

2023-08-04 Thread Richard Henderson
Remove TARGET_ELF_EXEC_PAGESIZE, and 3 other TARGET_ELF_PAGE* macros
based off of that.  Rely on target_mmap to handle guest vs host page
size mismatch.

Tested-by: Helge Deller 
Reviewed-by: Helge Deller 
Reviewed-by: Akihiko Odaki 
Signed-off-by: Richard Henderson 
---
 linux-user/elfload.c | 17 -
 1 file changed, 4 insertions(+), 13 deletions(-)

diff --git a/linux-user/elfload.c b/linux-user/elfload.c
index fa0c9ace8e..e853a4ab33 100644
--- a/linux-user/elfload.c
+++ b/linux-user/elfload.c
@@ -1959,15 +1959,6 @@ struct exec
 #define ZMAGIC 0413
 #define QMAGIC 0314
 
-/* Necessary parameters */
-#define TARGET_ELF_EXEC_PAGESIZE \
-(((eppnt->p_align & ~qemu_host_page_mask) != 0) ? \
- TARGET_PAGE_SIZE : MAX(qemu_host_page_size, TARGET_PAGE_SIZE))
-#define TARGET_ELF_PAGELENGTH(_v) ROUND_UP((_v), TARGET_ELF_EXEC_PAGESIZE)
-#define TARGET_ELF_PAGESTART(_v) ((_v) & \
- ~(abi_ulong)(TARGET_ELF_EXEC_PAGESIZE-1))
-#define TARGET_ELF_PAGEOFFSET(_v) ((_v) & (TARGET_ELF_EXEC_PAGESIZE-1))
-
 #define DLINFO_ITEMS 16
 
 static inline void memcpy_fromfs(void * to, const void * from, unsigned long n)
@@ -3240,8 +3231,8 @@ static void load_elf_image(const char *image_name, int 
image_fd,
 }
 
 vaddr = load_bias + eppnt->p_vaddr;
-vaddr_po = TARGET_ELF_PAGEOFFSET(vaddr);
-vaddr_ps = TARGET_ELF_PAGESTART(vaddr);
+vaddr_po = vaddr & ~TARGET_PAGE_MASK;
+vaddr_ps = vaddr & TARGET_PAGE_MASK;
 
 vaddr_ef = vaddr + eppnt->p_filesz;
 vaddr_em = vaddr + eppnt->p_memsz;
@@ -3251,7 +3242,7 @@ static void load_elf_image(const char *image_name, int 
image_fd,
  * but no backing file segment.
  */
 if (eppnt->p_filesz != 0) {
-vaddr_len = TARGET_ELF_PAGELENGTH(eppnt->p_filesz + vaddr_po);
+vaddr_len = eppnt->p_filesz + vaddr_po;
 error = target_mmap(vaddr_ps, vaddr_len, elf_prot,
 MAP_PRIVATE | MAP_FIXED,
 image_fd, eppnt->p_offset - vaddr_po);
@@ -3267,7 +3258,7 @@ static void load_elf_image(const char *image_name, int 
image_fd,
 zero_bss(vaddr_ef, vaddr_em, elf_prot);
 }
 } else if (eppnt->p_memsz != 0) {
-vaddr_len = TARGET_ELF_PAGELENGTH(eppnt->p_memsz + vaddr_po);
+vaddr_len = eppnt->p_memsz + vaddr_po;
 error = target_mmap(vaddr_ps, vaddr_len, elf_prot,
 MAP_PRIVATE | MAP_FIXED | MAP_ANONYMOUS,
 -1, 0);
-- 
2.34.1




[PATCH v9 02/24] accel/tcg: Issue wider aligned i/o in do_{ld, st}_mmio_*

2023-08-04 Thread Richard Henderson
If the address and size are aligned, send larger chunks
to the memory subsystem.  This will be required to make
more use of these helpers.
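
As a worked example of the decomposition: a 7-byte big-endian load at an
address ending in ...1 computes ((size | addr) & 7) = 7 and issues a
1-byte access, then 6 for a 2-byte access, then 4 for a 4-byte access,
while a fully aligned 8-byte access computes 0 and goes out as a single
MO_BEUQ access.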

Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Richard Henderson 
---
 accel/tcg/cputlb.c | 76 +-
 1 file changed, 69 insertions(+), 7 deletions(-)

diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index 23386ecfde..a308cb7534 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -2081,10 +2081,40 @@ static uint64_t do_ld_mmio_beN(CPUArchState *env, 
CPUTLBEntryFull *full,
uint64_t ret_be, vaddr addr, int size,
int mmu_idx, MMUAccessType type, uintptr_t ra)
 {
-for (int i = 0; i < size; i++) {
-uint8_t x = io_readx(env, full, mmu_idx, addr + i, ra, type, MO_UB);
-ret_be = (ret_be << 8) | x;
-}
+uint64_t t;
+
+tcg_debug_assert(size > 0 && size <= 8);
+do {
+/* Read aligned pieces up to 8 bytes. */
+switch ((size | (int)addr) & 7) {
+case 1:
+case 3:
+case 5:
+case 7:
+t = io_readx(env, full, mmu_idx, addr, ra, type, MO_UB);
+ret_be = (ret_be << 8) | t;
+size -= 1;
+addr += 1;
+break;
+case 2:
+case 6:
+t = io_readx(env, full, mmu_idx, addr, ra, type, MO_BEUW);
+ret_be = (ret_be << 16) | t;
+size -= 2;
+addr += 2;
+break;
+case 4:
+t = io_readx(env, full, mmu_idx, addr, ra, type, MO_BEUL);
+ret_be = (ret_be << 32) | t;
+size -= 4;
+addr += 4;
+break;
+case 0:
+return io_readx(env, full, mmu_idx, addr, ra, type, MO_BEUQ);
+default:
+qemu_build_not_reached();
+}
+} while (size);
 return ret_be;
 }
 
@@ -2680,9 +2710,41 @@ static uint64_t do_st_mmio_leN(CPUArchState *env, 
CPUTLBEntryFull *full,
uint64_t val_le, vaddr addr, int size,
int mmu_idx, uintptr_t ra)
 {
-for (int i = 0; i < size; i++, val_le >>= 8) {
-io_writex(env, full, mmu_idx, val_le, addr + i, ra, MO_UB);
-}
+tcg_debug_assert(size > 0 && size <= 8);
+
+do {
+/* Store aligned pieces up to 8 bytes. */
+switch ((size | (int)addr) & 7) {
+case 1:
+case 3:
+case 5:
+case 7:
+io_writex(env, full, mmu_idx, val_le, addr, ra, MO_UB);
+val_le >>= 8;
+size -= 1;
+addr += 1;
+break;
+case 2:
+case 6:
+io_writex(env, full, mmu_idx, val_le, addr, ra, MO_LEUW);
+val_le >>= 16;
+size -= 2;
+addr += 2;
+break;
+case 4:
+io_writex(env, full, mmu_idx, val_le, addr, ra, MO_LEUL);
+val_le >>= 32;
+size -= 4;
+addr += 4;
+break;
+case 0:
+io_writex(env, full, mmu_idx, val_le, addr, ra, MO_LEUQ);
+return 0;
+default:
+qemu_build_not_reached();
+}
+} while (size);
+
 return val_le;
 }
 
-- 
2.34.1




[PATCH v9 17/24] linux-user: Use elf_et_dyn_base for ET_DYN with interpreter

2023-08-04 Thread Richard Henderson
Follow the lead of the linux kernel in fs/binfmt_elf.c,
in which an ET_DYN executable which uses an interpreter
(usually a PIE executable) is loaded away from where the
interpreter itself will be loaded.

Tested-by: Helge Deller 
Reviewed-by: Helge Deller 
Reviewed-by: Akihiko Odaki 
Signed-off-by: Richard Henderson 
---
 linux-user/elfload.c | 27 +--
 1 file changed, 25 insertions(+), 2 deletions(-)

diff --git a/linux-user/elfload.c b/linux-user/elfload.c
index 0c64aad8a5..a3aa08a13e 100644
--- a/linux-user/elfload.c
+++ b/linux-user/elfload.c
@@ -3106,6 +3106,8 @@ static void load_elf_image(const char *image_name, int 
image_fd,
 }
 }
 
+load_addr = loaddr;
+
 if (pinterp_name != NULL) {
 /*
  * This is the main executable.
@@ -3135,11 +3137,32 @@ static void load_elf_image(const char *image_name, int 
image_fd,
  */
 probe_guest_base(image_name, loaddr, hiaddr);
 } else {
+abi_ulong align;
+
 /*
  * The binary is dynamic, but we still need to
  * select guest_base.  In this case we pass a size.
  */
 probe_guest_base(image_name, 0, hiaddr - loaddr);
+
+/*
+ * Avoid collision with the loader by providing a different
+ * default load address.
+ */
+load_addr += elf_et_dyn_base;
+
+/*
+ * TODO: Better support for mmap alignment is desirable.
+ * Since we do not have complete control over the guest
+ * address space, we prefer the kernel to choose some address
+ * rather than force the use of LOAD_ADDR via MAP_FIXED.
+ * But without MAP_FIXED we cannot guarantee alignment,
+ * only suggest it.
+ */
+align = pow2ceil(info->alignment);
+if (align) {
+load_addr &= -align;
+}
 }
 }
 
@@ -3154,13 +3177,13 @@ static void load_elf_image(const char *image_name, int 
image_fd,
  *
  * Otherwise this is ET_DYN, and we are searching for a location
  * that can hold the memory space required.  If the image is
- * pre-linked, LOADDR will be non-zero, and the kernel should
+ * pre-linked, LOAD_ADDR will be non-zero, and the kernel should
  * honor that address if it happens to be free.
  *
  * In both cases, we will overwrite pages in this range with mappings
  * from the executable.
  */
-load_addr = target_mmap(loaddr, (size_t)hiaddr - loaddr + 1, PROT_NONE,
+load_addr = target_mmap(load_addr, (size_t)hiaddr - loaddr + 1, PROT_NONE,
 MAP_PRIVATE | MAP_ANON | MAP_NORESERVE |
 (ehdr->e_type == ET_EXEC ? MAP_FIXED_NOREPLACE : 0),
 -1, 0);
-- 
2.34.1




[PATCH v9 for-8.1 00/24] linux-user + tcg patch queue

2023-08-04 Thread Richard Henderson
Supersedes: 20230804014517.6361-1-richard.hender...@linaro.org
("[PATCH for-8.1 v8 00/17] linux-user: brk fixes")

Changes for linux-user brk v9:
  Recover some changes that should have been in v8, had I
  generated the patches from the correct tree:
- bsd-user: Remove last_brk
- Fix typos in patch 15 ("Define ELF_ET_DYN_BASE...")
- Disable -Werror=type-limits in patch 13
  ("linux-user: Adjust task_unmapped_base")


r~


Akihiko Odaki (6):
  linux-user: Unset MAP_FIXED_NOREPLACE for host
  linux-user: Fix MAP_FIXED_NOREPLACE on old kernels
  linux-user: Do not call get_errno() in do_brk()
  linux-user: Use MAP_FIXED_NOREPLACE for do_brk()
  linux-user: Do nothing if too small brk is specified
  linux-user: Do not align brk with host page size

Helge Deller (1):
  linux-user: Adjust initial brk when interpreter is close to executable

Matheus Tavares Bernardino (1):
  gdbstub: use 0 ("any process") on packets with no PID

Mikhail Tyutin (1):
  accel/tcg: Call save_iotlb_data from io_readx as well.

Nathan Egge (1):
  linux-user/elfload: Set V in ELF_HWCAP for RISC-V

Richard Henderson (14):
  accel/tcg: Adjust parameters and locking with do_{ld,st}_mmio_*
  accel/tcg: Issue wider aligned i/o in do_{ld,st}_mmio_*
  accel/tcg: Do not issue misaligned i/o
  linux-user: Remove last_brk
  bsd-user: Remove last_brk
  linux-user: Adjust task_unmapped_base for reserved_va
  linux-user: Define TASK_UNMAPPED_BASE in $guest/target_mman.h
  linux-user: Define ELF_ET_DYN_BASE in $guest/target_mman.h
  linux-user: Use MAP_FIXED_NOREPLACE for initial image mmap
  linux-user: Use elf_et_dyn_base for ET_DYN with interpreter
  linux-user: Properly set image_info.brk in flatload
  linux-user: Do not adjust image mapping for host page size
  linux-user: Do not adjust zero_bss for host page size
  linux-user: Use zero_bss for PT_LOAD with no file contents too

 bsd-user/qemu.h  |   1 -
 linux-user/aarch64/target_mman.h |  13 ++
 linux-user/alpha/target_mman.h   |  11 +
 linux-user/arm/target_mman.h |  11 +
 linux-user/cris/target_mman.h|  12 ++
 linux-user/hexagon/target_mman.h |  13 ++
 linux-user/hppa/target_mman.h|   6 +
 linux-user/i386/target_mman.h|  16 ++
 linux-user/loongarch64/target_mman.h |  11 +
 linux-user/m68k/target_mman.h|   5 +
 linux-user/microblaze/target_mman.h  |  11 +
 linux-user/mips/target_mman.h|  10 +
 linux-user/nios2/target_mman.h   |  10 +
 linux-user/openrisc/target_mman.h|  10 +
 linux-user/ppc/target_mman.h |  20 ++
 linux-user/qemu.h|   2 -
 linux-user/riscv/target_mman.h   |  10 +
 linux-user/s390x/target_mman.h   |  20 ++
 linux-user/sh4/target_mman.h |   7 +
 linux-user/sparc/target_mman.h   |  25 +++
 linux-user/user-mmap.h   |   6 +-
 linux-user/x86_64/target_mman.h  |  15 ++
 linux-user/xtensa/target_mman.h  |  10 +
 accel/tcg/cputlb.c   | 289 ++-
 bsd-user/mmap.c  |   2 -
 gdbstub/gdbstub.c|   2 +-
 linux-user/elfload.c | 184 -
 linux-user/flatload.c|   2 +-
 linux-user/main.c|  45 -
 linux-user/mmap.c|  68 ---
 linux-user/syscall.c |  69 ++-
 31 files changed, 622 insertions(+), 294 deletions(-)

-- 
2.34.1




[PATCH v9 18/24] linux-user: Adjust initial brk when interpreter is close to executable

2023-08-04 Thread Richard Henderson
From: Helge Deller 

While we attempt to load an ET_DYN executable far away from
TASK_UNMAPPED_BASE, we are not completely in control of the
address space layout.  If the interpreter lands close to
the executable, leaving insufficient heap space, move brk.

Tested-by: Helge Deller 
Signed-off-by: Helge Deller 
[rth: Re-order after ELF_ET_DYN_BASE patch so that we do not
 "temporarily break" tsan, and also to minimize the changes required.
 Remove image_info.reserve_brk as unused.]
Reviewed-by: Akihiko Odaki 
Signed-off-by: Richard Henderson 
---
 linux-user/qemu.h|  1 -
 linux-user/elfload.c | 51 +---
 2 files changed, 15 insertions(+), 37 deletions(-)

diff --git a/linux-user/qemu.h b/linux-user/qemu.h
index 802794db63..4b0c9da0dc 100644
--- a/linux-user/qemu.h
+++ b/linux-user/qemu.h
@@ -31,7 +31,6 @@ struct image_info {
 abi_ulong   end_data;
 abi_ulong   start_brk;
 abi_ulong   brk;
-abi_ulong   reserve_brk;
 abi_ulong   start_mmap;
 abi_ulong   start_stack;
 abi_ulong   stack_limit;
diff --git a/linux-user/elfload.c b/linux-user/elfload.c
index a3aa08a13e..fa0c9ace8e 100644
--- a/linux-user/elfload.c
+++ b/linux-user/elfload.c
@@ -3109,27 +3109,6 @@ static void load_elf_image(const char *image_name, int 
image_fd,
 load_addr = loaddr;
 
 if (pinterp_name != NULL) {
-/*
- * This is the main executable.
- *
- * Reserve extra space for brk.
- * We hold on to this space while placing the interpreter
- * and the stack, lest they be placed immediately after
- * the data segment and block allocation from the brk.
- *
- * 16MB is chosen as "large enough" without being so large as
- * to allow the result to not fit with a 32-bit guest on a
- * 32-bit host. However some 64 bit guests (e.g. s390x)
- * attempt to place their heap further ahead and currently
- * nothing stops them smashing into QEMUs address space.
- */
-#if TARGET_LONG_BITS == 64
-info->reserve_brk = 32 * MiB;
-#else
-info->reserve_brk = 16 * MiB;
-#endif
-hiaddr += info->reserve_brk;
-
 if (ehdr->e_type == ET_EXEC) {
 /*
  * Make sure that the low address does not conflict with
@@ -3220,7 +3199,8 @@ static void load_elf_image(const char *image_name, int 
image_fd,
 info->end_code = 0;
 info->start_data = -1;
 info->end_data = 0;
-info->brk = 0;
+/* Usual start for brk is after all sections of the main executable. */
+info->brk = TARGET_PAGE_ALIGN(hiaddr);
 info->elf_flags = ehdr->e_flags;
 
 prot_exec = PROT_EXEC;
@@ -3314,9 +3294,6 @@ static void load_elf_image(const char *image_name, int 
image_fd,
 info->end_data = vaddr_ef;
 }
 }
-if (vaddr_em > info->brk) {
-info->brk = vaddr_em;
-}
 #ifdef TARGET_MIPS
 } else if (eppnt->p_type == PT_MIPS_ABIFLAGS) {
 Mips_elf_abiflags_v0 abiflags;
@@ -3645,6 +3622,19 @@ int load_elf_binary(struct linux_binprm *bprm, struct 
image_info *info)
 if (elf_interpreter) {
+load_elf_interp(elf_interpreter, &interp_info, bprm->buf);
 
+/*
+ * While unusual because of ELF_ET_DYN_BASE, if we are unlucky
+ * with the mappings the interpreter can be loaded above but
+ * near the main executable, which can leave very little room
+ * for the heap.
+ * If the current brk has less than 16MB, use the end of the
+ * interpreter.
+ */
+if (interp_info.brk > info->brk &&
+interp_info.load_bias - info->brk < 16 * MiB)  {
+info->brk = interp_info.brk;
+}
+
 /* If the program interpreter is one of these two, then assume
an iBCS2 image.  Otherwise assume a native linux image.  */
 
@@ -3698,17 +3688,6 @@ int load_elf_binary(struct linux_binprm *bprm, struct 
image_info *info)
 bprm->core_dump = _core_dump;
 #endif
 
-/*
- * If we reserved extra space for brk, release it now.
- * The implementation of do_brk in syscalls.c expects to be able
- * to mmap pages in this space.
- */
-if (info->reserve_brk) {
-abi_ulong start_brk = TARGET_PAGE_ALIGN(info->brk);
-abi_ulong end_brk = TARGET_PAGE_ALIGN(info->brk + info->reserve_brk);
-target_munmap(start_brk, end_brk - start_brk);
-}
-
 return 0;
 }
 
-- 
2.34.1




[PATCH v9 01/24] accel/tcg: Adjust parameters and locking with do_{ld, st}_mmio_*

2023-08-04 Thread Richard Henderson
Replace MMULookupPageData* with CPUTLBEntryFull, addr, size.
Move QEMU_IOTHREAD_LOCK_GUARD to the caller.

This simplifies the usage from do_ld16_beN and do_st16_leN, where
we weren't locking the entire operation, and required hoop jumping
for passing addr and size.

Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Richard Henderson 
---
 accel/tcg/cputlb.c | 67 +++---
 1 file changed, 34 insertions(+), 33 deletions(-)

diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index ba44501a7c..23386ecfde 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -2066,24 +2066,22 @@ static void *atomic_mmu_lookup(CPUArchState *env, vaddr 
addr, MemOpIdx oi,
 /**
  * do_ld_mmio_beN:
  * @env: cpu context
- * @p: translation parameters
+ * @full: page parameters
  * @ret_be: accumulated data
+ * @addr: virtual address
+ * @size: number of bytes
  * @mmu_idx: virtual address context
  * @ra: return address into tcg generated code, or 0
+ * Context: iothread lock held
  *
- * Load @p->size bytes from @p->addr, which is memory-mapped i/o.
+ * Load @size bytes from @addr, which is memory-mapped i/o.
  * The bytes are concatenated in big-endian order with @ret_be.
  */
-static uint64_t do_ld_mmio_beN(CPUArchState *env, MMULookupPageData *p,
-   uint64_t ret_be, int mmu_idx,
-   MMUAccessType type, uintptr_t ra)
+static uint64_t do_ld_mmio_beN(CPUArchState *env, CPUTLBEntryFull *full,
+   uint64_t ret_be, vaddr addr, int size,
+   int mmu_idx, MMUAccessType type, uintptr_t ra)
 {
-CPUTLBEntryFull *full = p->full;
-vaddr addr = p->addr;
-int i, size = p->size;
-
-QEMU_IOTHREAD_LOCK_GUARD();
-for (i = 0; i < size; i++) {
+for (int i = 0; i < size; i++) {
 uint8_t x = io_readx(env, full, mmu_idx, addr + i, ra, type, MO_UB);
 ret_be = (ret_be << 8) | x;
 }
@@ -2232,7 +2230,9 @@ static uint64_t do_ld_beN(CPUArchState *env, 
MMULookupPageData *p,
 unsigned tmp, half_size;
 
 if (unlikely(p->flags & TLB_MMIO)) {
-return do_ld_mmio_beN(env, p, ret_be, mmu_idx, type, ra);
+QEMU_IOTHREAD_LOCK_GUARD();
+return do_ld_mmio_beN(env, p->full, ret_be, p->addr, p->size,
+  mmu_idx, type, ra);
 }
 
 /*
@@ -2281,11 +2281,11 @@ static Int128 do_ld16_beN(CPUArchState *env, 
MMULookupPageData *p,
 MemOp atom;
 
 if (unlikely(p->flags & TLB_MMIO)) {
-p->size = size - 8;
-a = do_ld_mmio_beN(env, p, a, mmu_idx, MMU_DATA_LOAD, ra);
-p->addr += p->size;
-p->size = 8;
-b = do_ld_mmio_beN(env, p, 0, mmu_idx, MMU_DATA_LOAD, ra);
+QEMU_IOTHREAD_LOCK_GUARD();
+a = do_ld_mmio_beN(env, p->full, a, p->addr, size - 8,
+   mmu_idx, MMU_DATA_LOAD, ra);
+b = do_ld_mmio_beN(env, p->full, 0, p->addr + 8, 8,
+   mmu_idx, MMU_DATA_LOAD, ra);
 return int128_make128(b, a);
 }
 
@@ -2664,24 +2664,23 @@ Int128 cpu_ld16_mmu(CPUArchState *env, abi_ptr addr,
 /**
  * do_st_mmio_leN:
  * @env: cpu context
- * @p: translation parameters
+ * @full: page parameters
  * @val_le: data to store
+ * @addr: virtual address
+ * @size: number of bytes
  * @mmu_idx: virtual address context
  * @ra: return address into tcg generated code, or 0
+ * Context: iothread lock held
  *
- * Store @p->size bytes at @p->addr, which is memory-mapped i/o.
+ * Store @size bytes at @addr, which is memory-mapped i/o.
  * The bytes to store are extracted in little-endian order from @val_le;
  * return the bytes of @val_le beyond @p->size that have not been stored.
  */
-static uint64_t do_st_mmio_leN(CPUArchState *env, MMULookupPageData *p,
-   uint64_t val_le, int mmu_idx, uintptr_t ra)
+static uint64_t do_st_mmio_leN(CPUArchState *env, CPUTLBEntryFull *full,
+   uint64_t val_le, vaddr addr, int size,
+   int mmu_idx, uintptr_t ra)
 {
-CPUTLBEntryFull *full = p->full;
-vaddr addr = p->addr;
-int i, size = p->size;
-
-QEMU_IOTHREAD_LOCK_GUARD();
-for (i = 0; i < size; i++, val_le >>= 8) {
+for (int i = 0; i < size; i++, val_le >>= 8) {
 io_writex(env, full, mmu_idx, val_le, addr + i, ra, MO_UB);
 }
 return val_le;
@@ -2698,7 +2697,9 @@ static uint64_t do_st_leN(CPUArchState *env, MMULookupPageData *p,
 unsigned tmp, half_size;
 
 if (unlikely(p->flags & TLB_MMIO)) {
-return do_st_mmio_leN(env, p, val_le, mmu_idx, ra);
+QEMU_IOTHREAD_LOCK_GUARD();
+return do_st_mmio_leN(env, p->full, val_le, p->addr,
+  p->size, mmu_idx, ra);
 } else if (unlikely(p->flags & TLB_DISCARD_WRITE)) {
 return val_le >> (p->size * 8);
 }
@@ -2751,11 +2752,11 @@ static uint64_t do_st16_leN(CPUArchState *env, MMULookupPageData *p,

[PATCH v9 24/24] linux-user/elfload: Set V in ELF_HWCAP for RISC-V

2023-08-04 Thread Richard Henderson
From: Nathan Egge 

Set V bit for hwcap if misa is set.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1793
Signed-off-by: Nathan Egge 
Reviewed-by: Daniel Henrique Barboza 
Tested-by: Daniel Henrique Barboza 
Message-Id: <20230803131424.40744-1-ne...@xiph.org>
Signed-off-by: Richard Henderson 
---
 linux-user/elfload.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/linux-user/elfload.c b/linux-user/elfload.c
index 51591a1d94..c9e176a9f6 100644
--- a/linux-user/elfload.c
+++ b/linux-user/elfload.c
@@ -1710,7 +1710,8 @@ static uint32_t get_elf_hwcap(void)
 #define MISA_BIT(EXT) (1 << (EXT - 'A'))
 RISCVCPU *cpu = RISCV_CPU(thread_cpu);
 uint32_t mask = MISA_BIT('I') | MISA_BIT('M') | MISA_BIT('A')
-| MISA_BIT('F') | MISA_BIT('D') | MISA_BIT('C');
+| MISA_BIT('F') | MISA_BIT('D') | MISA_BIT('C')
+| MISA_BIT('V');
 
 return cpu->env.misa_ext & mask;
 #undef MISA_BIT
-- 
2.34.1




[PATCH v9 09/24] linux-user: Do nothing if too small brk is specified

2023-08-04 Thread Richard Henderson
From: Akihiko Odaki 

Linux 6.4.7 does nothing when a value smaller than the initial brk is
specified.

Fixes: 86f04735ac ("linux-user: Fix brk() to release pages")
Reviewed-by: Helge Deller 
Signed-off-by: Akihiko Odaki 
Message-Id: <20230802071754.14876-6-akihiko.od...@daynix.com>
Signed-off-by: Richard Henderson 
---
 linux-user/syscall.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index f64024273f..e1436a3962 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -820,14 +820,14 @@ abi_long do_brk(abi_ulong brk_val)
 
 /* brk pointers are always untagged */
 
-/* return old brk value if brk_val unchanged or zero */
-if (!brk_val || brk_val == target_brk) {
+/* return old brk value if brk_val unchanged */
+if (brk_val == target_brk) {
 return target_brk;
 }
 
 /* do not allow to shrink below initial brk value */
 if (brk_val < initial_target_brk) {
-brk_val = initial_target_brk;
+return target_brk;
 }
 
 new_brk = TARGET_PAGE_ALIGN(brk_val);
-- 
2.34.1




[PATCH v9 19/24] linux-user: Properly set image_info.brk in flatload

2023-08-04 Thread Richard Henderson
The heap starts at "brk" not "start_brk".  With this fixed,
image_info.start_brk is unused and may be removed.

Tested-by: Helge Deller 
Reviewed-by: Helge Deller 
Reviewed-by: Akihiko Odaki 
Signed-off-by: Richard Henderson 
---
 linux-user/qemu.h | 1 -
 linux-user/flatload.c | 2 +-
 linux-user/main.c | 2 --
 3 files changed, 1 insertion(+), 4 deletions(-)

diff --git a/linux-user/qemu.h b/linux-user/qemu.h
index 4b0c9da0dc..4f8b55e2fb 100644
--- a/linux-user/qemu.h
+++ b/linux-user/qemu.h
@@ -29,7 +29,6 @@ struct image_info {
 abi_ulong   end_code;
 abi_ulong   start_data;
 abi_ulong   end_data;
-abi_ulong   start_brk;
 abi_ulong   brk;
 abi_ulong   start_mmap;
 abi_ulong   start_stack;
diff --git a/linux-user/flatload.c b/linux-user/flatload.c
index 5efec2630e..8f5e9f489b 100644
--- a/linux-user/flatload.c
+++ b/linux-user/flatload.c
@@ -811,7 +811,7 @@ int load_flt_binary(struct linux_binprm *bprm, struct image_info *info)
 info->end_code = libinfo[0].start_code + libinfo[0].text_len;
 info->start_data = libinfo[0].start_data;
 info->end_data = libinfo[0].end_data;
-info->start_brk = libinfo[0].start_brk;
+info->brk = libinfo[0].start_brk;
 info->start_stack = sp;
 info->stack_limit = libinfo[0].start_brk;
 info->entry = start_addr;
diff --git a/linux-user/main.c b/linux-user/main.c
index cb5e80612b..96be354897 100644
--- a/linux-user/main.c
+++ b/linux-user/main.c
@@ -963,8 +963,6 @@ int main(int argc, char **argv, char **envp)
 fprintf(f, "page layout changed following binary load\n");
 page_dump(f);
 
-fprintf(f, "start_brk   0x" TARGET_ABI_FMT_lx "\n",
-info->start_brk);
 fprintf(f, "end_code0x" TARGET_ABI_FMT_lx "\n",
 info->end_code);
 fprintf(f, "start_code  0x" TARGET_ABI_FMT_lx "\n",
-- 
2.34.1




[PATCH v9 13/24] linux-user: Adjust task_unmapped_base for reserved_va

2023-08-04 Thread Richard Henderson
Ensure that the chosen values for mmap_next_start and
task_unmapped_base are within the guest address space.

Tested-by: Helge Deller 
Reviewed-by: Akihiko Odaki 
Signed-off-by: Richard Henderson 
---
 linux-user/user-mmap.h | 18 +-
 linux-user/main.c  | 28 
 linux-user/mmap.c  | 18 +++---
 3 files changed, 48 insertions(+), 16 deletions(-)

diff --git a/linux-user/user-mmap.h b/linux-user/user-mmap.h
index 7265c2c116..fd456e024e 100644
--- a/linux-user/user-mmap.h
+++ b/linux-user/user-mmap.h
@@ -18,6 +18,23 @@
 #ifndef LINUX_USER_USER_MMAP_H
 #define LINUX_USER_USER_MMAP_H
 
+#if HOST_LONG_BITS == 64 && TARGET_ABI_BITS == 64
+#ifdef TARGET_AARCH64
+# define TASK_UNMAPPED_BASE  0x55
+#else
+# define TASK_UNMAPPED_BASE  (1ul << 38)
+#endif
+#else
+#ifdef TARGET_HPPA
+# define TASK_UNMAPPED_BASE  0xfa00
+#else
+# define TASK_UNMAPPED_BASE  0x4000
+#endif
+#endif
+
+extern abi_ulong task_unmapped_base;
+extern abi_ulong mmap_next_start;
+
 int target_mprotect(abi_ulong start, abi_ulong len, int prot);
 abi_long target_mmap(abi_ulong start, abi_ulong len, int prot,
  int flags, int fd, off_t offset);
@@ -26,7 +43,6 @@ abi_long target_mremap(abi_ulong old_addr, abi_ulong old_size,
abi_ulong new_size, unsigned long flags,
abi_ulong new_addr);
 abi_long target_madvise(abi_ulong start, abi_ulong len_in, int advice);
-extern abi_ulong mmap_next_start;
 abi_ulong mmap_find_vma(abi_ulong, abi_ulong, abi_ulong);
 void mmap_fork_start(void);
 void mmap_fork_end(int child);
diff --git a/linux-user/main.c b/linux-user/main.c
index dba67ffa36..7ba7039988 100644
--- a/linux-user/main.c
+++ b/linux-user/main.c
@@ -821,6 +821,34 @@ int main(int argc, char **argv, char **envp)
 reserved_va = max_reserved_va;
 }
 
+/*
+ * Temporarily disable
+ *   "comparison is always false due to limited range of data type"
+ * due to comparison between (possible) uint64_t and uintptr_t.
+ */
+#pragma GCC diagnostic push
+#pragma GCC diagnostic ignored "-Wtype-limits"
+
+/*
+ * Select an initial value for task_unmapped_base that is in range.
+ */
+if (reserved_va) {
+if (TASK_UNMAPPED_BASE < reserved_va) {
+task_unmapped_base = TASK_UNMAPPED_BASE;
+} else {
+/* The most common default formula is TASK_SIZE / 3. */
+task_unmapped_base = TARGET_PAGE_ALIGN(reserved_va / 3);
+}
+} else if (TASK_UNMAPPED_BASE < UINTPTR_MAX) {
+task_unmapped_base = TASK_UNMAPPED_BASE;
+} else {
+/* 32-bit host: pick something medium size. */
+task_unmapped_base = 0x1000;
+}
+mmap_next_start = task_unmapped_base;
+
+#pragma GCC diagnostic pop
+
 {
 Error *err = NULL;
 if (seed_optarg != NULL) {
diff --git a/linux-user/mmap.c b/linux-user/mmap.c
index eb04fab8ab..84436d45c8 100644
--- a/linux-user/mmap.c
+++ b/linux-user/mmap.c
@@ -299,20 +299,8 @@ static bool mmap_frag(abi_ulong real_start, abi_ulong start, abi_ulong last,
 return true;
 }
 
-#if HOST_LONG_BITS == 64 && TARGET_ABI_BITS == 64
-#ifdef TARGET_AARCH64
-# define TASK_UNMAPPED_BASE  0x55
-#else
-# define TASK_UNMAPPED_BASE  (1ul << 38)
-#endif
-#else
-#ifdef TARGET_HPPA
-# define TASK_UNMAPPED_BASE  0xfa00
-#else
-# define TASK_UNMAPPED_BASE  0x4000
-#endif
-#endif
-abi_ulong mmap_next_start = TASK_UNMAPPED_BASE;
+abi_ulong task_unmapped_base;
+abi_ulong mmap_next_start;
 
 /*
  * Subroutine of mmap_find_vma, used when we have pre-allocated
@@ -391,7 +379,7 @@ abi_ulong mmap_find_vma(abi_ulong start, abi_ulong size, abi_ulong align)
 
 if ((addr & (align - 1)) == 0) {
 /* Success.  */
-if (start == mmap_next_start && addr >= TASK_UNMAPPED_BASE) {
+if (start == mmap_next_start && addr >= task_unmapped_base) {
 mmap_next_start = addr + size;
 }
 return addr;
-- 
2.34.1




[PATCH v9 05/24] linux-user: Unset MAP_FIXED_NOREPLACE for host

2023-08-04 Thread Richard Henderson
From: Akihiko Odaki 

Passing MAP_FIXED_NOREPLACE to the host will fail for reserved_va
because the address space is already reserved with mmap.  Replace it
with MAP_FIXED in that case.

Signed-off-by: Akihiko Odaki 
Message-Id: <20230802071754.14876-2-akihiko.od...@daynix.com>
[rth: Expand inline commentary.]
Reviewed-by: Richard Henderson 
Signed-off-by: Richard Henderson 
---
 linux-user/mmap.c | 25 -
 1 file changed, 20 insertions(+), 5 deletions(-)

diff --git a/linux-user/mmap.c b/linux-user/mmap.c
index a5dfb56545..a11c630a7b 100644
--- a/linux-user/mmap.c
+++ b/linux-user/mmap.c
@@ -603,11 +603,26 @@ abi_long target_mmap(abi_ulong start, abi_ulong len, int target_prot,
 goto fail;
 }
 
-/* Validate that the chosen range is empty. */
-if ((flags & MAP_FIXED_NOREPLACE)
-&& !page_check_range_empty(start, last)) {
-errno = EEXIST;
-goto fail;
+if (flags & MAP_FIXED_NOREPLACE) {
+/* Validate that the chosen range is empty. */
+if (!page_check_range_empty(start, last)) {
+errno = EEXIST;
+goto fail;
+}
+
+/*
+ * With reserved_va, the entire address space is mmaped in the
+ * host to ensure it isn't accidentally used for something else.
+ * We have just checked that the guest address is not mapped
+ * within the guest, but need to replace the host reservation.
+ *
+ * Without reserved_va, despite the guest address check above,
+ * keep MAP_FIXED_NOREPLACE so that the guest does not overwrite
+ * any host address mappings.
+ */
+if (reserved_va) {
+flags = (flags & ~MAP_FIXED_NOREPLACE) | MAP_FIXED;
+}
 }
 
 /*
-- 
2.34.1




[PATCH v9 10/24] linux-user: Do not align brk with host page size

2023-08-04 Thread Richard Henderson
From: Akihiko Odaki 

do_brk() minimizes calls into target_mmap() by aligning the address
with the host page size, which is potentially larger than the target page
size. However, the current implementation of this optimization has two
bugs:

- The start of brk is rounded up with the host page size while brk
  advertises an address aligned with the target page size as the
  beginning of brk. This makes the beginning of brk unmapped.
- Content clearing after mapping is flawed. The size to clear is
  specified as HOST_PAGE_ALIGN(brk_page) - brk_page, but brk_page is
  aligned with the host page size so it is always zero.

This optimization actually has no practical benefit. It would only make
a difference when brk() is called multiple times with values within one
host page. However, sophisticated memory allocators avoid making such
frequent brk() calls. For example, glibc 2.37 calls brk() to shrink the
heap only when there is more than 128 KiB of room to release, and host
page sizes larger than 128 KiB are rare in any case.

Let's remove the optimization to fix the bugs and make the code simpler.
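
As a worked example of the first bug (all numbers illustrative): with a
4 KiB target page and a 64 KiB host page, target_set_brk(0x401000) sets
target_brk = 0x401000 but brk_page = HOST_PAGE_ALIGN(0x401000) =
0x410000.  Growth is then mapped starting at brk_page, so the range
[0x401000, 0x410000) that the guest believes to be part of its heap is
never actually mapped.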

Fixes: 86f04735ac ("linux-user: Fix brk() to release pages")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1616
Signed-off-by: Akihiko Odaki 
Message-Id: <20230802071754.14876-7-akihiko.od...@daynix.com>
Signed-off-by: Richard Henderson 
---
 linux-user/elfload.c |  4 ++--
 linux-user/syscall.c | 54 ++--
 2 files changed, 14 insertions(+), 44 deletions(-)

diff --git a/linux-user/elfload.c b/linux-user/elfload.c
index 861ec07abc..2aee2298ec 100644
--- a/linux-user/elfload.c
+++ b/linux-user/elfload.c
@@ -3678,8 +3678,8 @@ int load_elf_binary(struct linux_binprm *bprm, struct image_info *info)
  * to mmap pages in this space.
  */
 if (info->reserve_brk) {
-abi_ulong start_brk = HOST_PAGE_ALIGN(info->brk);
-abi_ulong end_brk = HOST_PAGE_ALIGN(info->brk + info->reserve_brk);
+abi_ulong start_brk = TARGET_PAGE_ALIGN(info->brk);
+abi_ulong end_brk = TARGET_PAGE_ALIGN(info->brk + info->reserve_brk);
 target_munmap(start_brk, end_brk - start_brk);
 }
 
diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index e1436a3962..7c2c2f6e2f 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -802,81 +802,51 @@ static inline int host_to_target_sock_type(int host_type)
 }
 
 static abi_ulong target_brk, initial_target_brk;
-static abi_ulong brk_page;
 
 void target_set_brk(abi_ulong new_brk)
 {
 target_brk = TARGET_PAGE_ALIGN(new_brk);
 initial_target_brk = target_brk;
-brk_page = HOST_PAGE_ALIGN(target_brk);
 }
 
 /* do_brk() must return target values and target errnos. */
 abi_long do_brk(abi_ulong brk_val)
 {
 abi_long mapped_addr;
-abi_ulong new_alloc_size;
-abi_ulong new_brk, new_host_brk_page;
+abi_ulong new_brk;
+abi_ulong old_brk;
 
 /* brk pointers are always untagged */
 
-/* return old brk value if brk_val unchanged */
-if (brk_val == target_brk) {
-return target_brk;
-}
-
 /* do not allow to shrink below initial brk value */
 if (brk_val < initial_target_brk) {
 return target_brk;
 }
 
 new_brk = TARGET_PAGE_ALIGN(brk_val);
-new_host_brk_page = HOST_PAGE_ALIGN(brk_val);
+old_brk = TARGET_PAGE_ALIGN(target_brk);
 
-/* brk_val and old target_brk might be on the same page */
-if (new_brk == TARGET_PAGE_ALIGN(target_brk)) {
-/* empty remaining bytes in (possibly larger) host page */
-memset(g2h_untagged(new_brk), 0, new_host_brk_page - new_brk);
+/* new and old target_brk might be on the same page */
+if (new_brk == old_brk) {
 target_brk = brk_val;
 return target_brk;
 }
 
 /* Release heap if necesary */
-if (new_brk < target_brk) {
-/* empty remaining bytes in (possibly larger) host page */
-memset(g2h_untagged(new_brk), 0, new_host_brk_page - new_brk);
-
-/* free unused host pages and set new brk_page */
-target_munmap(new_host_brk_page, brk_page - new_host_brk_page);
-brk_page = new_host_brk_page;
+if (new_brk < old_brk) {
+target_munmap(new_brk, old_brk - new_brk);
 
 target_brk = brk_val;
 return target_brk;
 }
 
-if (new_host_brk_page > brk_page) {
-new_alloc_size = new_host_brk_page - brk_page;
-mapped_addr = target_mmap(brk_page, new_alloc_size,
-  PROT_READ | PROT_WRITE,
-  MAP_FIXED_NOREPLACE | MAP_ANON | MAP_PRIVATE,
-  -1, 0);
-} else {
-new_alloc_size = 0;
-mapped_addr = brk_page;
-}
-
-if (mapped_addr == brk_page) {
-/* Heap contents are initialized to zero, as for anonymous
- * mapped pages.  Technically the new pages are already
- * initialized to zero since they *are* anonymous mapped
- * pages, 

[PATCH v9 08/24] linux-user: Use MAP_FIXED_NOREPLACE for do_brk()

2023-08-04 Thread Richard Henderson
From: Akihiko Odaki 

MAP_FIXED_NOREPLACE ensures the mapped address is exactly the one
requested, without the risk that the new mapping overwrites something
else.

Signed-off-by: Akihiko Odaki 
Message-Id: <20230802071754.14876-5-akihiko.od...@daynix.com>
[rth: Pass -1 as fd for MAP_ANON]
Reviewed-by: Richard Henderson 
Signed-off-by: Richard Henderson 
---
 linux-user/syscall.c | 17 +++--
 1 file changed, 3 insertions(+), 14 deletions(-)

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index b9d2ec02f9..f64024273f 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -854,17 +854,12 @@ abi_long do_brk(abi_ulong brk_val)
 return target_brk;
 }
 
-/* We need to allocate more memory after the brk... Note that
- * we don't use MAP_FIXED because that will map over the top of
- * any existing mapping (like the one with the host libc or qemu
- * itself); instead we treat "mapped but at wrong address" as
- * a failure and unmap again.
- */
 if (new_host_brk_page > brk_page) {
 new_alloc_size = new_host_brk_page - brk_page;
 mapped_addr = target_mmap(brk_page, new_alloc_size,
-  PROT_READ|PROT_WRITE,
-  MAP_ANON|MAP_PRIVATE, 0, 0);
+  PROT_READ | PROT_WRITE,
+  MAP_FIXED_NOREPLACE | MAP_ANON | MAP_PRIVATE,
+  -1, 0);
 } else {
 new_alloc_size = 0;
 mapped_addr = brk_page;
@@ -883,12 +878,6 @@ abi_long do_brk(abi_ulong brk_val)
 target_brk = brk_val;
 brk_page = new_host_brk_page;
 return target_brk;
-} else if (mapped_addr != -1) {
-/* Mapped but at wrong address, meaning there wasn't actually
- * enough space for this brk.
- */
-target_munmap(mapped_addr, new_alloc_size);
-mapped_addr = -1;
 }
 
 #if defined(TARGET_ALPHA)
-- 
2.34.1




[PATCH v9 06/24] linux-user: Fix MAP_FIXED_NOREPLACE on old kernels

2023-08-04 Thread Richard Henderson
From: Akihiko Odaki 

The man page states:
> Note that older kernels which do not recognize the MAP_FIXED_NOREPLACE
> flag will typically (upon detecting a collision with a preexisting
> mapping) fall back to a “non-MAP_FIXED” type of behavior: they will
> return an address that is different from the requested address.
> Therefore, backward-compatible software should check the returned
> address against the requested address.
https://man7.org/linux/man-pages/man2/mmap.2.html

Signed-off-by: Akihiko Odaki 
Message-Id: <20230802071754.14876-3-akihiko.od...@daynix.com>
Reviewed-by: Richard Henderson 
Signed-off-by: Richard Henderson 
---
 linux-user/mmap.c | 22 +-
 1 file changed, 17 insertions(+), 5 deletions(-)

diff --git a/linux-user/mmap.c b/linux-user/mmap.c
index a11c630a7b..90b3ef2140 100644
--- a/linux-user/mmap.c
+++ b/linux-user/mmap.c
@@ -263,7 +263,11 @@ static bool mmap_frag(abi_ulong real_start, abi_ulong start, abi_ulong last,
 void *p = mmap(host_start, qemu_host_page_size,
target_to_host_prot(prot),
flags | MAP_ANONYMOUS, -1, 0);
-if (p == MAP_FAILED) {
+if (p != host_start) {
+if (p != MAP_FAILED) {
+munmap(p, qemu_host_page_size);
+errno = EEXIST;
+}
 return false;
 }
 prot_old = prot;
@@ -687,17 +691,25 @@ abi_long target_mmap(abi_ulong start, abi_ulong len, int target_prot,
 
 /* map the middle (easier) */
 if (real_start < real_last) {
-void *p;
+void *p, *want_p;
 off_t offset1;
+size_t len1;
 
 if (flags & MAP_ANONYMOUS) {
 offset1 = 0;
 } else {
 offset1 = offset + real_start - start;
 }
-p = mmap(g2h_untagged(real_start), real_last - real_start + 1,
- target_to_host_prot(target_prot), flags, fd, offset1);
-if (p == MAP_FAILED) {
+len1 = real_last - real_start + 1;
+want_p = g2h_untagged(real_start);
+
+p = mmap(want_p, len1, target_to_host_prot(target_prot),
+ flags, fd, offset1);
+if (p != want_p) {
+if (p != MAP_FAILED) {
+munmap(p, len1);
+errno = EEXIST;
+}
 goto fail;
 }
 passthrough_start = real_start;
-- 
2.34.1




[PATCH v9 16/24] linux-user: Use MAP_FIXED_NOREPLACE for initial image mmap

2023-08-04 Thread Richard Henderson
Use this as extra protection for the guest mapping over
any qemu host mappings.

Tested-by: Helge Deller 
Reviewed-by: Helge Deller 
Reviewed-by: Akihiko Odaki 
Signed-off-by: Richard Henderson 
---
 linux-user/elfload.c | 9 ++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/linux-user/elfload.c b/linux-user/elfload.c
index 2aee2298ec..0c64aad8a5 100644
--- a/linux-user/elfload.c
+++ b/linux-user/elfload.c
@@ -3146,8 +3146,11 @@ static void load_elf_image(const char *image_name, int image_fd,
 /*
  * Reserve address space for all of this.
  *
- * In the case of ET_EXEC, we supply MAP_FIXED so that we get
- * exactly the address range that is required.
+ * In the case of ET_EXEC, we supply MAP_FIXED_NOREPLACE so that we get
+ * exactly the address range that is required.  Without reserved_va,
+ * the guest address space is not isolated.  We have attempted to avoid
+ * conflict with the host program itself via probe_guest_base, but using
+ * MAP_FIXED_NOREPLACE instead of MAP_FIXED provides an extra check.
  *
  * Otherwise this is ET_DYN, and we are searching for a location
  * that can hold the memory space required.  If the image is
@@ -3159,7 +3162,7 @@ static void load_elf_image(const char *image_name, int image_fd,
  */
 load_addr = target_mmap(loaddr, (size_t)hiaddr - loaddr + 1, PROT_NONE,
 MAP_PRIVATE | MAP_ANON | MAP_NORESERVE |
-(ehdr->e_type == ET_EXEC ? MAP_FIXED : 0),
+(ehdr->e_type == ET_EXEC ? MAP_FIXED_NOREPLACE : 0),
 -1, 0);
 if (load_addr == -1) {
 goto exit_mmap;
-- 
2.34.1




[PATCH v9 03/24] accel/tcg: Do not issue misaligned i/o

2023-08-04 Thread Richard Henderson
In the single-page case we were issuing misaligned i/o to
the memory subsystem, which does not handle it properly.
Split such accesses via do_{ld,st}_mmio_*.
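
For example, a four-byte MMIO load that is not four-aligned is now
performed as four single-byte io_readx() calls concatenated in
big-endian order by do_ld_mmio_beN(), with a final byte swap when the
memory op is little-endian.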

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1800
Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Richard Henderson 
---
 accel/tcg/cputlb.c | 118 +++--
 1 file changed, 72 insertions(+), 46 deletions(-)

diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index a308cb7534..4b1bfaa53d 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -2370,16 +2370,20 @@ static uint8_t do_ld_1(CPUArchState *env, MMULookupPageData *p, int mmu_idx,
 static uint16_t do_ld_2(CPUArchState *env, MMULookupPageData *p, int mmu_idx,
 MMUAccessType type, MemOp memop, uintptr_t ra)
 {
-uint64_t ret;
+uint16_t ret;
 
 if (unlikely(p->flags & TLB_MMIO)) {
-return io_readx(env, p->full, mmu_idx, p->addr, ra, type, memop);
-}
-
-/* Perform the load host endian, then swap if necessary. */
-ret = load_atom_2(env, ra, p->haddr, memop);
-if (memop & MO_BSWAP) {
-ret = bswap16(ret);
+QEMU_IOTHREAD_LOCK_GUARD();
+ret = do_ld_mmio_beN(env, p->full, 0, p->addr, 2, mmu_idx, type, ra);
+if ((memop & MO_BSWAP) == MO_LE) {
+ret = bswap16(ret);
+}
+} else {
+/* Perform the load host endian, then swap if necessary. */
+ret = load_atom_2(env, ra, p->haddr, memop);
+if (memop & MO_BSWAP) {
+ret = bswap16(ret);
+}
 }
 return ret;
 }
@@ -2390,13 +2394,17 @@ static uint32_t do_ld_4(CPUArchState *env, MMULookupPageData *p, int mmu_idx,
 uint32_t ret;
 
 if (unlikely(p->flags & TLB_MMIO)) {
-return io_readx(env, p->full, mmu_idx, p->addr, ra, type, memop);
-}
-
-/* Perform the load host endian. */
-ret = load_atom_4(env, ra, p->haddr, memop);
-if (memop & MO_BSWAP) {
-ret = bswap32(ret);
+QEMU_IOTHREAD_LOCK_GUARD();
+ret = do_ld_mmio_beN(env, p->full, 0, p->addr, 4, mmu_idx, type, ra);
+if ((memop & MO_BSWAP) == MO_LE) {
+ret = bswap32(ret);
+}
+} else {
+/* Perform the load host endian. */
+ret = load_atom_4(env, ra, p->haddr, memop);
+if (memop & MO_BSWAP) {
+ret = bswap32(ret);
+}
 }
 return ret;
 }
@@ -2407,13 +2415,17 @@ static uint64_t do_ld_8(CPUArchState *env, MMULookupPageData *p, int mmu_idx,
 uint64_t ret;
 
 if (unlikely(p->flags & TLB_MMIO)) {
-return io_readx(env, p->full, mmu_idx, p->addr, ra, type, memop);
-}
-
-/* Perform the load host endian. */
-ret = load_atom_8(env, ra, p->haddr, memop);
-if (memop & MO_BSWAP) {
-ret = bswap64(ret);
+QEMU_IOTHREAD_LOCK_GUARD();
+ret = do_ld_mmio_beN(env, p->full, 0, p->addr, 8, mmu_idx, type, ra);
+if ((memop & MO_BSWAP) == MO_LE) {
+ret = bswap64(ret);
+}
+} else {
+/* Perform the load host endian. */
+ret = load_atom_8(env, ra, p->haddr, memop);
+if (memop & MO_BSWAP) {
+ret = bswap64(ret);
+}
 }
 return ret;
 }
@@ -2561,20 +2573,22 @@ static Int128 do_ld16_mmu(CPUArchState *env, vaddr addr,
 cpu_req_mo(TCG_MO_LD_LD | TCG_MO_ST_LD);
 crosspage = mmu_lookup(env, addr, oi, ra, MMU_DATA_LOAD, );
 if (likely(!crosspage)) {
-/* Perform the load host endian. */
 if (unlikely(l.page[0].flags & TLB_MMIO)) {
 QEMU_IOTHREAD_LOCK_GUARD();
-a = io_readx(env, l.page[0].full, l.mmu_idx, addr,
- ra, MMU_DATA_LOAD, MO_64);
-b = io_readx(env, l.page[0].full, l.mmu_idx, addr + 8,
- ra, MMU_DATA_LOAD, MO_64);
-ret = int128_make128(HOST_BIG_ENDIAN ? b : a,
- HOST_BIG_ENDIAN ? a : b);
+a = do_ld_mmio_beN(env, l.page[0].full, 0, addr, 8,
+   l.mmu_idx, MMU_DATA_LOAD, ra);
+b = do_ld_mmio_beN(env, l.page[0].full, 0, addr + 8, 8,
+   l.mmu_idx, MMU_DATA_LOAD, ra);
+ret = int128_make128(b, a);
+if ((l.memop & MO_BSWAP) == MO_LE) {
+ret = bswap128(ret);
+}
 } else {
+/* Perform the load host endian. */
 ret = load_atom_16(env, ra, l.page[0].haddr, l.memop);
-}
-if (l.memop & MO_BSWAP) {
-ret = bswap128(ret);
+if (l.memop & MO_BSWAP) {
+ret = bswap128(ret);
+}
 }
 return ret;
 }
@@ -2874,7 +2888,11 @@ static void do_st_2(CPUArchState *env, MMULookupPageData *p, uint16_t val,
 int mmu_idx, MemOp memop, uintptr_t ra)
 {
 if (unlikely(p->flags & TLB_MMIO)) {
-io_writex(env, p->full, mmu_idx, val, p->addr, ra, memop);
+if 

[PATCH v9 11/24] linux-user: Remove last_brk

2023-08-04 Thread Richard Henderson
This variable is unused.

Reviewed-by: Helge Deller 
Reviewed-by: Akihiko Odaki 
Signed-off-by: Richard Henderson 
---
 linux-user/user-mmap.h | 1 -
 linux-user/mmap.c  | 2 --
 2 files changed, 3 deletions(-)

diff --git a/linux-user/user-mmap.h b/linux-user/user-mmap.h
index 3fc986f92f..7265c2c116 100644
--- a/linux-user/user-mmap.h
+++ b/linux-user/user-mmap.h
@@ -26,7 +26,6 @@ abi_long target_mremap(abi_ulong old_addr, abi_ulong old_size,
abi_ulong new_size, unsigned long flags,
abi_ulong new_addr);
 abi_long target_madvise(abi_ulong start, abi_ulong len_in, int advice);
-extern unsigned long last_brk;
 extern abi_ulong mmap_next_start;
 abi_ulong mmap_find_vma(abi_ulong, abi_ulong, abi_ulong);
 void mmap_fork_start(void);
diff --git a/linux-user/mmap.c b/linux-user/mmap.c
index 90b3ef2140..eb04fab8ab 100644
--- a/linux-user/mmap.c
+++ b/linux-user/mmap.c
@@ -314,8 +314,6 @@ static bool mmap_frag(abi_ulong real_start, abi_ulong start, abi_ulong last,
 #endif
 abi_ulong mmap_next_start = TASK_UNMAPPED_BASE;
 
-unsigned long last_brk;
-
 /*
  * Subroutine of mmap_find_vma, used when we have pre-allocated
  * a chunk of guest address space.
-- 
2.34.1




[PATCH v9 04/24] gdbstub: use 0 ("any process") on packets with no PID

2023-08-04 Thread Richard Henderson
From: Matheus Tavares Bernardino 

Previously, qemu-user would always report PID 1 to GDB. This was changed
at dc14a7a6e9 (gdbstub: Report the actual qemu-user pid, 2023-06-30),
but read_thread_id() still considers GDB packets with "no PID" as "PID
1", which is not the qemu-user PID. Fix that by parsing "no PID" as "0",
which the GDB Remote Protocol defines as "any process".

Note that this should have no effect for system emulation as, in this
case, gdb_create_default_process() will assign PID 1 for the first
process and that is what the gdbstub uses for GDB requests with no PID,
or PID 0.

This issue was found with hexagon-lldb, which sends a "Hg" packet with
only the thread-id, but no process-id, leading to the invalid usage of
"PID 1" by qemu-hexagon and a subsequent "E22" reply.

Signed-off-by: Matheus Tavares Bernardino 
Acked-by: Ilya Leoshkevich 
Message-Id: <78a3b06f6ab90a7ff8e73ae14a996eb27ec76c85.1690904195.git.quic_mathb...@quicinc.com>
Reviewed-by: Richard Henderson 
Signed-off-by: Richard Henderson 
---
 gdbstub/gdbstub.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gdbstub/gdbstub.c b/gdbstub/gdbstub.c
index ce8b42eb15..e74ecc78cc 100644
--- a/gdbstub/gdbstub.c
+++ b/gdbstub/gdbstub.c
@@ -537,7 +537,7 @@ static GDBThreadIdKind read_thread_id(const char *buf, const char **end_buf,
 /* Skip '.' */
 buf++;
 } else {
-p = 1;
+p = 0;
 }
 
 ret = qemu_strtoul(buf, , 16, );
-- 
2.34.1




[PATCH 3/7] tcg/ppc: Use prefixed instructions in tcg_out_mem_long

2023-08-04 Thread Richard Henderson
When the offset is out of range of the non-prefixed insn, but
fits the 34-bit immediate of the prefixed insn, use that.

Signed-off-by: Richard Henderson 
---
 tcg/ppc/tcg-target.c.inc | 66 
 1 file changed, 66 insertions(+)

diff --git a/tcg/ppc/tcg-target.c.inc b/tcg/ppc/tcg-target.c.inc
index 7fa2a2500b..d41c499b7d 100644
--- a/tcg/ppc/tcg-target.c.inc
+++ b/tcg/ppc/tcg-target.c.inc
@@ -323,6 +323,15 @@ static bool tcg_target_const_match(int64_t val, TCGType type, int ct)
 #define STDX   XO31(149)
 #define STQXO62(  2)
 
+#define PLWA   OPCD( 41)
+#define PLDOPCD( 57)
+#define PLXSD  OPCD( 42)
+#define PLXV   OPCD(25 * 2 + 1)  /* force tx=1 */
+
+#define PSTD   OPCD( 61)
+#define PSTXSD OPCD( 46)
+#define PSTXV  OPCD(27 * 2 + 1)  /* force tx=1 */
+
 #define ADDIC  OPCD( 12)
 #define ADDI   OPCD( 14)
 #define ADDIS  OPCD( 15)
@@ -720,6 +729,20 @@ static void tcg_out_prefix_align(TCGContext *s)
 }
 }
 
+/* Output Type 00 Prefix - 8-Byte Load/Store Form (8LS:D) */
+static void tcg_out_8ls_d(TCGContext *s, tcg_insn_unit opc, unsigned rt,
+  unsigned ra, tcg_target_long imm, bool r)
+{
+tcg_insn_unit p, i;
+
+p = OPCD(1) | (r << 20) | ((imm >> 16) & 0x3);
+i = opc | TAI(rt, ra, imm);
+
+tcg_out_prefix_align(s);
+tcg_out32(s, p);
+tcg_out32(s, i);
+}
+
 /* Output Type 10 Prefix - Modified Load/Store Form (MLS:D) */
 static void tcg_out_mls_d(TCGContext *s, tcg_insn_unit opc, unsigned rt,
   unsigned ra, tcg_target_long imm, bool r)
@@ -1364,6 +1387,49 @@ static void tcg_out_mem_long(TCGContext *s, int opi, int opx, TCGReg rt,
 break;
 }
 
+/* For unaligned or large offsets, use the prefixed form. */
+if (have_isa_3_10
+&& (offset != (int16_t)offset || (offset & align))
+&& offset == sextract64(offset, 0, 34)) {
+/*
+ * Note that the MLS:D insns retain their un-prefixed opcode,
+ * while the 8LS:D insns use a different opcode space.
+ */
+switch (opi) {
+case LBZ:
+case LHZ:
+case LHA:
+case LWZ:
+case STB:
+case STH:
+case STW:
+case ADDI:
+tcg_out_mls_d(s, opi, rt, base, offset, 0);
+return;
+case LWA:
+tcg_out_8ls_d(s, PLWA, rt, base, offset, 0);
+return;
+case LD:
+tcg_out_8ls_d(s, PLD, rt, base, offset, 0);
+return;
+case STD:
+tcg_out_8ls_d(s, PSTD, rt, base, offset, 0);
+return;
+case LXSD:
+tcg_out_8ls_d(s, PLXSD, rt & 31, base, offset, 0);
+return;
+case STXSD:
+tcg_out_8ls_d(s, PSTXSD, rt & 31, base, offset, 0);
+return;
+case LXV:
+tcg_out_8ls_d(s, PLXV, rt & 31, base, offset, 0);
+return;
+case STXV:
+tcg_out_8ls_d(s, PSTXV, rt & 31, base, offset, 0);
+return;
+}
+}
+
 /* For unaligned, or very large offsets, use the indexed form.  */
 if (offset & align || offset != (int32_t)offset || opi == 0) {
 if (rs == base) {
-- 
2.34.1




[PATCH 2/7] tcg/ppc: Use PADDI in tcg_out_movi

2023-08-04 Thread Richard Henderson
PADDI can load 34-bit immediates and 34-bit pc-relative addresses.

Signed-off-by: Richard Henderson 
---
 tcg/ppc/tcg-target.c.inc | 47 
 1 file changed, 47 insertions(+)

diff --git a/tcg/ppc/tcg-target.c.inc b/tcg/ppc/tcg-target.c.inc
index 642d0fd128..7fa2a2500b 100644
--- a/tcg/ppc/tcg-target.c.inc
+++ b/tcg/ppc/tcg-target.c.inc
@@ -707,6 +707,33 @@ static bool patch_reloc(tcg_insn_unit *code_ptr, int type,
 return true;
 }
 
+/* Ensure that the prefixed instruction does not cross a 64-byte boundary. */
+static bool tcg_out_need_prefix_align(TCGContext *s)
+{
+return ((uintptr_t)s->code_ptr & 0x3f) == 0x3c;
+}
+
+static void tcg_out_prefix_align(TCGContext *s)
+{
+if (tcg_out_need_prefix_align(s)) {
+tcg_out32(s, NOP);
+}
+}
+
+/* Output Type 10 Prefix - Modified Load/Store Form (MLS:D) */
+static void tcg_out_mls_d(TCGContext *s, tcg_insn_unit opc, unsigned rt,
+  unsigned ra, tcg_target_long imm, bool r)
+{
+tcg_insn_unit p, i;
+
+p = OPCD(1) | (2 << 24) | (r << 20) | ((imm >> 16) & 0x3);
+i = opc | TAI(rt, ra, imm);
+
+tcg_out_prefix_align(s);
+tcg_out32(s, p);
+tcg_out32(s, i);
+}
+
 static void tcg_out_mem_long(TCGContext *s, int opi, int opx, TCGReg rt,
  TCGReg base, tcg_target_long offset);
 
@@ -992,6 +1019,26 @@ static void tcg_out_movi_int(TCGContext *s, TCGType type, TCGReg ret,
 return;
 }
 
+/*
+ * Load values up to 34 bits, and pc-relative addresses,
+ * with one prefixed insn.
+ */
+if (have_isa_3_10) {
+if (arg == sextract64(arg, 0, 34)) {
+/* pli ret,value = paddi ret,0,value,0 */
+tcg_out_mls_d(s, ADDI, ret, 0, arg, 0);
+return;
+}
+
+tmp = tcg_out_need_prefix_align(s) * 4;
+tmp = tcg_pcrel_diff(s, (void *)arg) - tmp;
+if (tmp == sextract64(tmp, 0, 34)) {
+/* pla ret,value = paddi ret,0,value,1 */
+tcg_out_mls_d(s, ADDI, ret, 0, tmp, 1);
+return;
+}
+}
+
 /* Load 32-bit immediates with two insns.  Note that we've already
eliminated bare ADDIS, so we know both insns are required.  */
 if (TCG_TARGET_REG_BITS == 32 || arg == (int32_t)arg) {
-- 
2.34.1




[PATCH 1/7] tcg/ppc: Untabify tcg-target.c.inc

2023-08-04 Thread Richard Henderson
Signed-off-by: Richard Henderson 
---
 tcg/ppc/tcg-target.c.inc | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/tcg/ppc/tcg-target.c.inc b/tcg/ppc/tcg-target.c.inc
index 511e14b180..642d0fd128 100644
--- a/tcg/ppc/tcg-target.c.inc
+++ b/tcg/ppc/tcg-target.c.inc
@@ -221,7 +221,7 @@ static inline bool in_range_b(tcg_target_long target)
 }
 
 static uint32_t reloc_pc24_val(const tcg_insn_unit *pc,
-  const tcg_insn_unit *target)
+   const tcg_insn_unit *target)
 {
 ptrdiff_t disp = tcg_ptr_byte_diff(target, pc);
 tcg_debug_assert(in_range_b(disp));
@@ -241,7 +241,7 @@ static bool reloc_pc24(tcg_insn_unit *src_rw, const tcg_insn_unit *target)
 }
 
 static uint16_t reloc_pc14_val(const tcg_insn_unit *pc,
-  const tcg_insn_unit *target)
+   const tcg_insn_unit *target)
 {
 ptrdiff_t disp = tcg_ptr_byte_diff(target, pc);
 tcg_debug_assert(disp == (int16_t) disp);
@@ -3587,7 +3587,7 @@ static void expand_vec_mul(TCGType type, unsigned vece, TCGv_vec v0,
   tcgv_vec_arg(t1), tcgv_vec_arg(t2));
 vec_gen_3(INDEX_op_ppc_pkum_vec, type, vece, tcgv_vec_arg(v0),
   tcgv_vec_arg(v0), tcgv_vec_arg(t1));
-   break;
+break;
 
 case MO_32:
 tcg_debug_assert(!have_isa_2_07);
-- 
2.34.1




[PATCH for-8.2 0/7] tcg/ppc: Support power10 prefixed instructions

2023-08-04 Thread Richard Henderson
Emit one 64-bit instruction for large constants and pc-relatives.
With pc-relative addressing, we don't need REG_TB, which means we
can re-enable direct branching for goto_tb.


r~


Richard Henderson (7):
  tcg/ppc: Untabify tcg-target.c.inc
  tcg/ppc: Use PADDI in tcg_out_movi
  tcg/ppc: Use prefixed instructions in tcg_out_mem_long
  tcg/ppc: Use PLD in tcg_out_movi for constant pool
  tcg/ppc: Use prefixed instructions in tcg_out_dupi_vec
  tcg/ppc: Disable USE_REG_TB for Power v3.1
  tcg/ppc: Use prefixed instructions for tcg_out_goto_tb

 tcg/ppc/tcg-target.c.inc | 233 +++
 1 file changed, 211 insertions(+), 22 deletions(-)

-- 
2.34.1




[PATCH 6/7] tcg/ppc: Disable USE_REG_TB for Power v3.1

2023-08-04 Thread Richard Henderson
With Power v3.1, we have pc-relative addressing and so
do not require a register holding the current TB.

Signed-off-by: Richard Henderson 
---
 tcg/ppc/tcg-target.c.inc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tcg/ppc/tcg-target.c.inc b/tcg/ppc/tcg-target.c.inc
index e8eced7cf3..5b243b2353 100644
--- a/tcg/ppc/tcg-target.c.inc
+++ b/tcg/ppc/tcg-target.c.inc
@@ -83,7 +83,7 @@
 #define TCG_VEC_TMP2TCG_REG_V1
 
 #define TCG_REG_TB TCG_REG_R31
-#define USE_REG_TB (TCG_TARGET_REG_BITS == 64)
+#define USE_REG_TB (TCG_TARGET_REG_BITS == 64 && !have_isa_3_10)
 
 /* Shorthand for size of a pointer.  Avoid promotion to unsigned.  */
 #define SZP  ((int)sizeof(void *))
-- 
2.34.1




[PATCH 4/7] tcg/ppc: Use PLD in tcg_out_movi for constant pool

2023-08-04 Thread Richard Henderson
The prefixed instruction has a pc-relative form to use here.

Signed-off-by: Richard Henderson 
---
 tcg/ppc/tcg-target.c.inc | 24 
 1 file changed, 24 insertions(+)

diff --git a/tcg/ppc/tcg-target.c.inc b/tcg/ppc/tcg-target.c.inc
index d41c499b7d..a9e48a51c8 100644
--- a/tcg/ppc/tcg-target.c.inc
+++ b/tcg/ppc/tcg-target.c.inc
@@ -101,6 +101,10 @@
 #define ALL_GENERAL_REGS  0xu
 #define ALL_VECTOR_REGS   0xull
 
+#ifndef R_PPC64_PCREL34
+#define R_PPC64_PCREL34  132
+#endif
+
 #define have_isel  (cpuinfo & CPUINFO_ISEL)
 
 #ifndef CONFIG_SOFTMMU
@@ -260,6 +264,19 @@ static bool reloc_pc14(tcg_insn_unit *src_rw, const tcg_insn_unit *target)
 return false;
 }
 
+static bool reloc_pc34(tcg_insn_unit *src_rw, const tcg_insn_unit *target)
+{
+const tcg_insn_unit *src_rx = tcg_splitwx_to_rx(src_rw);
+ptrdiff_t disp = tcg_ptr_byte_diff(target, src_rx);
+
+if (disp == sextract64(disp, 0, 34)) {
+src_rw[0] = (src_rw[0] & ~0x3) | ((disp >> 16) & 0x3);
+src_rw[1] = (src_rw[1] & ~0x) | (disp & 0x);
+return true;
+}
+return false;
+}
+
 /* test if a constant matches the constraint */
 static bool tcg_target_const_match(int64_t val, TCGType type, int ct)
 {
@@ -684,6 +701,8 @@ static bool patch_reloc(tcg_insn_unit *code_ptr, int type,
 return reloc_pc14(code_ptr, target);
 case R_PPC_REL24:
 return reloc_pc24(code_ptr, target);
+case R_PPC64_PCREL34:
+return reloc_pc34(code_ptr, target);
 case R_PPC_ADDR16:
 /*
  * We are (slightly) abusing this relocation type.  In particular,
@@ -1107,6 +1126,11 @@ static void tcg_out_movi_int(TCGContext *s, TCGType type, TCGReg ret,
 }
 
 /* Use the constant pool, if possible.  */
+if (have_isa_3_10) {
+tcg_out_8ls_d(s, PLD, ret, 0, 0, 1);
+new_pool_label(s, arg, R_PPC64_PCREL34, s->code_ptr - 2, 0);
+return;
+}
 if (!in_prologue && USE_REG_TB) {
 new_pool_label(s, arg, R_PPC_ADDR16, s->code_ptr,
tcg_tbrel_diff(s, NULL));
-- 
2.34.1




[PATCH 5/7] tcg/ppc: Use prefixed instructions in tcg_out_dupi_vec

2023-08-04 Thread Richard Henderson
The prefixed instructions have a pc-relative form to use here.

Signed-off-by: Richard Henderson 
---
 tcg/ppc/tcg-target.c.inc | 12 
 1 file changed, 12 insertions(+)

diff --git a/tcg/ppc/tcg-target.c.inc b/tcg/ppc/tcg-target.c.inc
index a9e48a51c8..e8eced7cf3 100644
--- a/tcg/ppc/tcg-target.c.inc
+++ b/tcg/ppc/tcg-target.c.inc
@@ -1191,6 +1191,18 @@ static void tcg_out_dupi_vec(TCGContext *s, TCGType type, unsigned vece,
 /*
  * Otherwise we must load the value from the constant pool.
  */
+
+if (have_isa_3_10) {
+if (type == TCG_TYPE_V64) {
+tcg_out_8ls_d(s, PLXSD, ret & 31, 0, 0, 1);
+new_pool_label(s, val, R_PPC64_PCREL34, s->code_ptr - 2, 0);
+} else {
+tcg_out_8ls_d(s, PLXV, ret & 31, 0, 0, 1);
+new_pool_l2(s, R_PPC64_PCREL34, s->code_ptr - 2, 0, val, val);
+}
+return;
+}
+
 if (USE_REG_TB) {
 rel = R_PPC_ADDR16;
 add = tcg_tbrel_diff(s, NULL);
-- 
2.34.1




[PATCH 7/7] tcg/ppc: Use prefixed instructions for tcg_out_goto_tb

2023-08-04 Thread Richard Henderson
When a direct branch is out of range, we can load the destination for
the indirect branch using PLA (for 16GB worth of buffer) and PLD from
the TranslationBlock for everything larger.

This means the patch affects exactly one instruction: B (plus filler),
PLA or PLD, which means we can update and execute the patch atomically.
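
In other words, because the patched site is kept 8-byte aligned,
tb_target_set_jmp_target() can rewrite both 32-bit halves with a single
64-bit qatomic_set(), so a concurrently executing vCPU observes either
the old or the new instruction pair, never a torn mix.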

Signed-off-by: Richard Henderson 
---
 tcg/ppc/tcg-target.c.inc | 76 ++--
 1 file changed, 58 insertions(+), 18 deletions(-)

diff --git a/tcg/ppc/tcg-target.c.inc b/tcg/ppc/tcg-target.c.inc
index 5b243b2353..47c71bb5f2 100644
--- a/tcg/ppc/tcg-target.c.inc
+++ b/tcg/ppc/tcg-target.c.inc
@@ -2642,31 +2642,41 @@ static void tcg_out_goto_tb(TCGContext *s, int which)
 uintptr_t ptr = get_jmp_target_addr(s, which);
 
 if (USE_REG_TB) {
+/*
+ * With REG_TB, we must always use indirect branching,
+ * so that the branch destination and TCG_REG_TB match.
+ */
 ptrdiff_t offset = tcg_tbrel_diff(s, (void *)ptr);
 tcg_out_mem_long(s, LD, LDX, TCG_REG_TB, TCG_REG_TB, offset);
-
-/* TODO: Use direct branches when possible. */
-set_jmp_insn_offset(s, which);
 tcg_out32(s, MTSPR | RS(TCG_REG_TB) | CTR);
-
 tcg_out32(s, BCCTR | BO_ALWAYS);
 
 /* For the unlinked case, need to reset TCG_REG_TB.  */
 set_jmp_reset_offset(s, which);
 tcg_out_mem_long(s, ADDI, ADD, TCG_REG_TB, TCG_REG_TB,
  -tcg_current_code_size(s));
+return;
+}
+
+if (have_isa_3_10) {
+/* Align, so that we can patch 8 bytes atomically. */
+if ((uintptr_t)s->code_ptr & 7) {
+tcg_out32(s, NOP);
+}
+set_jmp_insn_offset(s, which);
+/* Direct branch will be patched by tb_target_set_jmp_target. */
+tcg_out_mls_d(s, ADDI, TCG_REG_TMP1, 0, 0, 1);
 } else {
 /* Direct branch will be patched by tb_target_set_jmp_target. */
-set_jmp_insn_offset(s, which);
-tcg_out32(s, NOP);
-
+tcg_out32(s, B);
 /* When branch is out of range, fall through to indirect. */
 tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_TMP1, ptr - (int16_t)ptr);
 tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_TMP1, TCG_REG_TMP1, (int16_t)ptr);
-tcg_out32(s, MTSPR | RS(TCG_REG_TMP1) | CTR);
-tcg_out32(s, BCCTR | BO_ALWAYS);
-set_jmp_reset_offset(s, which);
 }
+
+tcg_out32(s, MTSPR | RS(TCG_REG_TMP1) | CTR);
+tcg_out32(s, BCCTR | BO_ALWAYS);
+set_jmp_reset_offset(s, which);
 }
 
 void tb_target_set_jmp_target(const TranslationBlock *tb, int n,
@@ -2674,20 +2684,50 @@ void tb_target_set_jmp_target(const TranslationBlock *tb, int n,
 {
 uintptr_t addr = tb->jmp_target_addr[n];
 intptr_t diff = addr - jmp_rx;
-tcg_insn_unit insn;
 
 if (USE_REG_TB) {
 return;
 }
 
-if (in_range_b(diff)) {
-insn = B | (diff & 0x3fc);
-} else {
-insn = NOP;
-}
+if (have_isa_3_10) {
+tcg_insn_unit insn1, insn2;
+uint64_t pair;
 
-qatomic_set((uint32_t *)jmp_rw, insn);
-flush_idcache_range(jmp_rx, jmp_rw, 4);
+if (in_range_b(diff)) {
+insn1 = B | (diff & 0x3fc);
+insn2 = NOP;
+} else if (diff == sextract64(diff, 0, 34)) {
+/* PLA tmp1, diff */
+insn1 = OPCD(1) | (2 << 24) | (1 << 20) | ((diff >> 16) & 0x3);
+insn2 = ADDI | TAI(TCG_REG_TMP1, 0, diff);
+} else {
+addr = (uintptr_t)>jmp_target_addr[n];
+diff = addr - jmp_rx;
+tcg_debug_assert(diff == sextract64(diff, 0, 34));
+/* PLD tmp1, diff */
+insn1 = OPCD(1) | (1 << 20) | ((diff >> 16) & 0x3);
+insn2 = PLD | TAI(TCG_REG_TMP1, 0, diff);
+}
+
+if (HOST_BIG_ENDIAN) {
+pair = ((uint64_t)insn1) << 32 | insn2;
+} else {
+pair = ((uint64_t)insn2) << 32 | insn1;
+}
+
+qatomic_set((uint64_t *)jmp_rw, pair);
+flush_idcache_range(jmp_rx, jmp_rw, 8);
+} else {
+tcg_insn_unit insn;
+
+if (in_range_b(diff)) {
+insn = B | (diff & 0x3fc);
+} else {
+insn = NOP;
+}
+qatomic_set((uint32_t *)jmp_rw, insn);
+flush_idcache_range(jmp_rx, jmp_rw, 4);
+}
 }
 
 static void tcg_out_op(TCGContext *s, TCGOpcode opc,
-- 
2.34.1




[PATCH] print memory in MB units in initrd-too-large errmsg

2023-08-04 Thread Jim Cromie
Change 2 error messages to display sizes in MB, not bytes.

qemu: initrd is too large, cannot support this. (max: 2047 MB, need 5833 MB)

Also, distinguish 2 sites by adding "it" and "this" respectively.
This tells a careful reader that the error above is from the 2nd size
check.

With MB displayed, I have to ask: is it a coincidence that max == 2048-1?

Signed-off-by: Jim Cromie 
---
 hw/i386/x86.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/hw/i386/x86.c b/hw/i386/x86.c
index a88a126123..0677fe2fd1 100644
--- a/hw/i386/x86.c
+++ b/hw/i386/x86.c
@@ -878,9 +878,9 @@ void x86_load_linux(X86MachineState *x86ms,
 initrd_size = g_mapped_file_get_length(mapped_file);
 initrd_max = x86ms->below_4g_mem_size - acpi_data_size - 1;
 if (initrd_size >= initrd_max) {
-fprintf(stderr, "qemu: initrd is too large, cannot support."
-"(max: %"PRIu32", need %"PRId64")\n",
-initrd_max, (uint64_t)initrd_size);
+fprintf(stderr, "qemu: initrd is too large, cannot support it. "
+"(max: %"PRIu32" MB, need %"PRId64" MB)\n",
+initrd_max>>20, (uint64_t)initrd_size>>20);
 exit(1);
 }
 
@@ -1023,9 +1023,9 @@ void x86_load_linux(X86MachineState *x86ms,
 initrd_data = g_mapped_file_get_contents(mapped_file);
 initrd_size = g_mapped_file_get_length(mapped_file);
 if (initrd_size >= initrd_max) {
-fprintf(stderr, "qemu: initrd is too large, cannot support."
-"(max: %"PRIu32", need %"PRId64")\n",
-initrd_max, (uint64_t)initrd_size);
+fprintf(stderr, "qemu: initrd is too large, cannot support this. "
+"(max: %"PRIu32" MB, need %"PRId64" MB)\n",
+initrd_max>>20, (uint64_t)initrd_size>>20);
 exit(1);
 }
 
-- 
2.41.0




Re: [PATCH for-8.2 v2 1/2] qapi/migration: Deduplicate migration parameter field comments

2023-08-04 Thread Peter Xu
On Fri, Aug 04, 2023 at 05:48:49PM +0100, Daniel P. Berrangé wrote:
> On Fri, Aug 04, 2023 at 12:46:18PM -0400, Peter Xu wrote:
> > On Fri, Aug 04, 2023 at 05:29:19PM +0100, Daniel P. Berrangé wrote:
> > > On Fri, Aug 04, 2023 at 12:01:54PM -0400, Peter Xu wrote:
> > > > On Fri, Aug 04, 2023 at 02:59:07PM +0100, Daniel P. Berrangé wrote:
> > > > > On Fri, Aug 04, 2023 at 02:28:05PM +0200, Markus Armbruster wrote:
> > > > > > Peter Xu  writes:
> > > > > > 
> > > > > > > We used to have three objects that have always the same list of 
> > > > > > > parameters
> > > > > > 
> > > > > > We have!
> > > > > > 
> > > > > > > and comments are always duplicated:
> > > > > > >
> > > > > > >   - @MigrationParameter
> > > > > > >   - @MigrationParameters
> > > > > > >   - @MigrateSetParameters
> > > > > > >
> > > > > > > Before we can deduplicate the code, it's fairly straightforward to
> > > > > > > deduplicate the comments first, so for each time we add a new 
> > > > > > > migration
> > > > > > > parameter we don't need to copy the same paragraphs three times.
> > > > > > 
> > > > > > De-duplicating the code would be nice, but we haven't done so in 
> > > > > > years,
> > > > > > which suggests it's hard enough not to be worth the trouble.
> > > > > 
> > > > > The "MigrationParameter" enumeration isn't actually used in
> > > > > QMP at all.
> > > > > 
> > > > > It is only used in HMP for hmp_migrate_set_parameter and
> > > > > hmp_info_migrate_parameters. So it is questionable documenting
> > > > > that enum in the QMP reference docs at all.
> > > > > 
> > > > > 1c1
> > > > > < { 'struct': 'MigrationParameters',
> > > > > ---
> > > > > > { 'struct': 'MigrateSetParameters',
> > > > > 14,16c14,16
> > > > > < '*tls-creds': 'str',
> > > > > < '*tls-hostname': 'str',
> > > > > < '*tls-authz': 'str',
> > > > > ---
> > > > > > '*tls-creds': 'StrOrNull',
> > > > > > '*tls-hostname': 'StrOrNull',
> > > > > > '*tls-authz': 'StrOrNull',
> > > > > 
> > > > > Is it not valid to use StrOrNull in both cases and thus
> > > > > delete the duplication here ?
> > > > 
> > > > I tested removing MigrateSetParameters by replacing it with
> > > > MigrationParameters and it looks all fine here... I manually tested 
> > > > qmp/hmp
> > > > on set/query parameters, and qtests are all happy.
> > > 
> > > I meant the other way around, such we would be using 'StrOrNull'
> > > in all scenarios.
> > 
> > Yes, that should also work and even without worrying on nulls.  I just took
> > a random one replacing the other.
> > 
> > > 
> > > > 
> > > > The only thing I see that may affect it is we used to logically allow
> > > > taking things like '"tls-authz": null' in the json input, but now we 
> > > > won't
> > > > allow that because we'll be asking for a string type only.
> > > > 
> > > > Since we have query-qmp-schema I suppose we're all fine, because 
> > > > logically
> > > > the mgmt app (libvirt?) will still query that to understand the 
> > > > protocol,
> > > > so now we'll have (response of query-qmp-schema):
> > > > 
> > > > {
> > > > "arg-type": "144",
> > > > "meta-type": "command",
> > > > "name": "migrate-set-parameters",
> > > > "ret-type": "0"
> > > > },
> > > > 
> > > > Where 144 can start to point to MigrationParameters, rather than
> > > > MigrateSetParameters.
> > > > 
> > > > Ok, then what if the mgmt app doesn't care and just used "null" in tls-*
> > > > fields when setting?  Funnily I tried it and actually anything that does
> > > > migrate-set-parameters with a "null" passed over to tls-* fields will
> > > > already crash qemu...
> > > > 
> > > > ./migration/options.c:1333: migrate_params_apply: Assertion 
> > > > `params->tls_authz->type == QTYPE_QSTRING' failed.
> > > > 
> > > > #0  0x7f72f4b2a844 in __pthread_kill_implementation () at 
> > > > /lib64/libc.so.6
> > > > #1  0x7f72f4ad9abe in raise () at /lib64/libc.so.6
> > > > #2  0x7f72f4ac287f in abort () at /lib64/libc.so.6
> > > > #3  0x7f72f4ac279b in _nl_load_domain.cold () at /lib64/libc.so.6
> > > > #4  0x7f72f4ad2147 in  () at /lib64/libc.so.6
> > > > #5  0x5573308740e6 in migrate_params_apply (params=0x7ffc74fd09d0, 
> > > > errp=0x7ffc74fd0998) at ../migration/options.c:1333
> > > > #6  0x557330874591 in qmp_migrate_set_parameters 
> > > > (params=0x7ffc74fd09d0, errp=0x7ffc74fd0998) at 
> > > > ../migration/options.c:1433
> > > > #7  0x557330cb9132 in qmp_marshal_migrate_set_parameters 
> > > > (args=0x7f72e00036d0, ret=0x7f72f133cd98, errp=0x7f72f133cd90) at 
> > > > qapi/qapi-commands-migration.c:214
> > > > #8  0x557330d07fab in do_qmp_dispatch_bh (opaque=0x7f72f133ce30) at 
> > > > ../qapi/qmp-dispatch.c:128
> > > > #9  0x557330d33bbb in aio_bh_call (bh=0x5573337d7920) at 
> > > > ../util/async.c:169
> > > > #10 0x557330d33cd8 in aio_bh_poll (ctx=0x55733356e7d0) at 
> > > > 

[PULL 2/2] ci: install meson in CirrusCI KVM build environment

2023-08-04 Thread Paolo Bonzini
scripts/archive-source.sh needs meson in order to download the subprojects,
therefore meson needs to be part of the host environment in which VM-based
build jobs run.

Fixes: 2019cabfee0 ("meson: subprojects: replace submodules with wrap files", 2023-06-06)
Reported-by: Daniel P. Berrangé 
Signed-off-by: Paolo Bonzini 
---
 .gitlab-ci.d/cirrus/kvm-build.yml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/.gitlab-ci.d/cirrus/kvm-build.yml b/.gitlab-ci.d/cirrus/kvm-build.yml
index 4334fabf39e..a93881aa8b5 100644
--- a/.gitlab-ci.d/cirrus/kvm-build.yml
+++ b/.gitlab-ci.d/cirrus/kvm-build.yml
@@ -15,7 +15,7 @@ env:
 folder: $HOME/.cache/qemu-vm
   install_script:
 - dnf update -y
-- dnf install -y git make openssh-clients qemu-img qemu-system-x86 wget
+- dnf install -y git make openssh-clients qemu-img qemu-system-x86 wget meson
   clone_script:
 - git clone --depth 100 "$CI_REPOSITORY_URL" .
 - git fetch origin "$CI_COMMIT_REF_NAME"
-- 
2.41.0




[PULL 0/2] Fixes for x86 TCG and CirrusCI

2023-08-04 Thread Paolo Bonzini
The following changes since commit c26d005e62f4fd177dae0cd70c24cb96761edebc:

  Merge tag 'hppa-linux-user-speedup-pull-request' of https://github.com/hdeller/qemu-hppa into staging (2023-08-03 18:49:45 -0700)

are available in the Git repository at:

  https://gitlab.com/bonzini/qemu.git tags/for-upstream

for you to fetch changes up to d9ab1f1f4d79683b2db00b0995fa65530c535972:

  ci: install meson in CirrusCI KVM build environment (2023-08-04 13:56:17 +0200)


* fix VM build jobs on CirrusCI
* fix MMX instructions clobbering x87 state before raising #NM


Matt Borgerson (1):
  target/i386: Check CR0.TS before enter_mmx

Paolo Bonzini (1):
  ci: install meson in CirrusCI KVM build environment

 .gitlab-ci.d/cirrus/kvm-build.yml |  2 +-
 target/i386/tcg/decode-new.c.inc  | 10 ++
 2 files changed, 7 insertions(+), 5 deletions(-)
-- 
2.41.0




[PULL 1/2] target/i386: Check CR0.TS before enter_mmx

2023-08-04 Thread Paolo Bonzini
From: Matt Borgerson 

When CR0.TS=1, execution of x87 FPU, MMX, and some SSE instructions will
cause a Device Not Available (DNA) exception (#NM). System software uses
this exception event to lazily context switch FPU state.

Before this patch, enter_mmx helpers may be generated just before #NM
generation, prematurely resetting FPU state before the guest has a
chance to save it.
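
A sketch of that lazy-switching pattern (illustrative pseudo-C, not
QEMU or any real kernel's code; set_ts/clts/fxsave/fxrstor stand in
for the privileged operations):

    static struct task *fpu_owner;   /* last task that touched the FPU */
    static struct task *current;

    static void context_switch(struct task *next)
    {
        current = next;
        set_ts();        /* CR0.TS=1: next FPU/MMX/SSE insn raises #NM */
    }

    static void nm_handler(void)
    {
        clts();                          /* clear CR0.TS */
        if (fpu_owner != current) {
            fxsave(&fpu_owner->fpu);     /* save the old owner's state */
            fxrstor(&current->fpu);      /* load the faulting task's */
            fpu_owner = current;
        }
    }

If QEMU resets the MMX state (enter_mmx) before the guest's #NM handler
runs, the state that fxsave is meant to preserve has already been
clobbered, which is exactly what this patch avoids.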

Signed-off-by: Matt Borgerson 
Message-ID: 
Cc: qemu-sta...@nongnu.org
Signed-off-by: Paolo Bonzini 
---
 target/i386/tcg/decode-new.c.inc | 10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/target/i386/tcg/decode-new.c.inc b/target/i386/tcg/decode-new.c.inc
index 46afd9960bb..8f93a239ddb 100644
--- a/target/i386/tcg/decode-new.c.inc
+++ b/target/i386/tcg/decode-new.c.inc
@@ -1803,16 +1803,18 @@ static void disas_insn_new(DisasContext *s, CPUState *cpu, int b)
 }
 break;
 
-case X86_SPECIAL_MMX:
-if (!(s->prefix & (PREFIX_REPZ | PREFIX_REPNZ | PREFIX_DATA))) {
-gen_helper_enter_mmx(cpu_env);
-}
+default:
 break;
 }
 
 if (!validate_vex(s, )) {
 return;
 }
+if (decode.e.special == X86_SPECIAL_MMX &&
+!(s->prefix & (PREFIX_REPZ | PREFIX_REPNZ | PREFIX_DATA))) {
+gen_helper_enter_mmx(cpu_env);
+}
+
 if (decode.op[0].has_ea || decode.op[1].has_ea || decode.op[2].has_ea) {
 gen_load_ea(s, , decode.e.vex_class == 12);
 }
-- 
2.41.0




Re: [PATCH v4 16/24] nbd/server: Support 64-bit block status

2023-08-04 Thread Eric Blake
On Tue, Jun 27, 2023 at 04:23:49PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> On 08.06.23 16:56, Eric Blake wrote:
> > The NBD spec states that if the client negotiates extended headers,
> > the server must avoid NBD_REPLY_TYPE_BLOCK_STATUS and instead use
> > NBD_REPLY_TYPE_BLOCK_STATUS_EXT which supports 64-bit lengths, even if
> > the reply does not need more than 32 bits.  As of this patch,
> > client->mode is still never NBD_MODE_EXTENDED, so the code added here
> > does not take effect until the next patch enables negotiation.
> > 
> > For now, all metacontexts that we know how to export never populate
> > more than 32 bits of information, so we don't have to worry about
> > NBD_REP_ERR_EXT_HEADER_REQD or filtering during handshake, and we
> > always send all zeroes for the upper 32 bits of status during
> > NBD_CMD_BLOCK_STATUS.
> > 
> > Note that we previously had some interesting size-juggling on call
> > chains, such as:
> > 
> > nbd_co_send_block_status(uint32_t length)
> > -> blockstatus_to_extents(uint32_t bytes)
> >-> bdrv_block_status_above(bytes, _t num)
> >-> nbd_extent_array_add(uint64_t num)
> >  -> store num in 32-bit length
> > 
> > But we were lucky that it never overflowed: bdrv_block_status_above
> > never sets num larger than bytes, and we had previously been capping
> > 'bytes' at 32 bits (since the protocol does not allow sending a larger
> > request without extended headers).  This patch adds some assertions
> > that ensure we continue to avoid overflowing 32 bits for a narrow
> 
> 
> [..]
> 
> > @@ -2162,19 +2187,23 @@ static void nbd_extent_array_convert_to_be(NBDExtentArray *ea)
> >* would result in an incorrect range reported to the client)
> >*/
> >   static int nbd_extent_array_add(NBDExtentArray *ea,
> > -uint32_t length, uint32_t flags)
> > +uint64_t length, uint32_t flags)
> >   {
> >   assert(ea->can_add);
> > 
> >   if (!length) {
> >   return 0;
> >   }
> > +if (!ea->extended) {
> > +assert(length <= UINT32_MAX);
> > +}
> > 
> >   /* Extend previous extent if flags are the same */
> >   if (ea->count > 0 && flags == ea->extents[ea->count - 1].flags) {
> > -uint64_t sum = (uint64_t)length + ea->extents[ea->count - 1].length;
> > +uint64_t sum = length + ea->extents[ea->count - 1].length;
> > 
> > -if (sum <= UINT32_MAX) {
> > +assert(sum >= length);
> > +if (sum <= UINT32_MAX || ea->extended) {
> 
> that "if" and uint64_t sum was to avoid overflow. I think, we can't just 
> assert, instead include the check into if:
> 
> if (sum >= length && (sum <= UINT32_MAX || ea->extended) {

Why?  The assertion is stating that there was no overflow, because we
are in control of ea->extents[ea->count - 1].length (it came from
local code performing block status, and our block layer guarantees
that no block status returns more than 2^63 bytes because we don't
support images larger than off_t).  At best, all I need is a comment
why the assertion is valid.

> 
>  ...
> 
> }
> 
> with this:
> Reviewed-by: Vladimir Sementsov-Ogievskiy 
> 
> -- 
> Best regards,
> Vladimir
> 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization:  qemu.org | libguestfs.org




Re: [PULL 0/7] ppc queue

2023-08-04 Thread Richard Henderson

On 8/4/23 08:29, Daniel Henrique Barboza wrote:

The following changes since commit c26d005e62f4fd177dae0cd70c24cb96761edebc:

   Merge tag 'hppa-linux-user-speedup-pull-request' of https://github.com/hdeller/qemu-hppa into staging (2023-08-03 18:49:45 -0700)

are available in the Git repository at:

   https://gitlab.com/danielhb/qemu.git  tags/pull-ppc-20230804

for you to fetch changes up to 0e2a3ec36885f6d79a96230f582d4455878c6373:

   target/ppc: Fix VRMA page size for ISA v3.0 (2023-08-04 12:22:03 -0300)


ppc patch queue for 2023-08-04:

This queue contains target/ppc register and VRMA fixes for 8.1. pegasos2
fixes are also included.


Applied, thanks.  Please update https://wiki.qemu.org/ChangeLog/8.1 as 
appropriate.


r~




Re: [PATCH v4 15/24] nbd/server: Prepare to send extended header replies

2023-08-04 Thread Eric Blake
On Fri, Jun 16, 2023 at 09:48:18PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> On 08.06.23 16:56, Eric Blake wrote:
> > Although extended mode is not yet enabled, once we do turn it on, we
> > need to reply with extended headers to all messages.  Update the low
> > level entry points necessary so that all other callers automatically
> > get the right header based on the current mode.
> > 
> > Signed-off-by: Eric Blake 
> > ---
> > 
> > v4: new patch, split out from v3 9/14
> > ---
> >   nbd/server.c | 30 ++
> >   1 file changed, 22 insertions(+), 8 deletions(-)
> > 
> > diff --git a/nbd/server.c b/nbd/server.c
> > index 119ac765f09..84c848a31d3 100644
> > --- a/nbd/server.c
> > +++ b/nbd/server.c
> > @@ -1947,8 +1947,6 @@ static inline void set_be_chunk(NBDClient *client, struct iovec *iov,
> >   size_t niov, uint16_t flags, uint16_t type,
> >   NBDRequest *request)
> >   {
> > -/* TODO - handle structured vs. extended replies */
> > -NBDStructuredReplyChunk *chunk = iov->iov_base;
> >   size_t i, length = 0;
> > 
> >   for (i = 1; i < niov; i++) {
> > @@ -1956,12 +1954,26 @@ static inline void set_be_chunk(NBDClient *client, struct iovec *iov,
> >   }
> >   assert(length <= NBD_MAX_BUFFER_SIZE + sizeof(NBDStructuredReadData));
> > 
> > -iov[0].iov_len = sizeof(*chunk);
> > -stl_be_p(&chunk->magic, NBD_STRUCTURED_REPLY_MAGIC);
> > -stw_be_p(&chunk->flags, flags);
> > -stw_be_p(&chunk->type, type);
> > -stq_be_p(&chunk->cookie, request->cookie);
> > -stl_be_p(&chunk->length, length);
> > +if (client->mode >= NBD_MODE_EXTENDED) {
> > +NBDExtendedReplyChunk *chunk = iov->iov_base;
> > +
> > +iov->iov_len = sizeof(*chunk);
> 
> I'd prefer to keep iov[0].iov_len notation, to stress that iov is an array
> 
> anyway:
> Reviewed-by: Vladimir Sementsov-Ogievskiy 

I can make that change, and keep your R-b.

> 
> > +stl_be_p(&chunk->magic, NBD_EXTENDED_REPLY_MAGIC);
> > +stw_be_p(&chunk->flags, flags);
> > +stw_be_p(&chunk->type, type);
> > +stq_be_p(&chunk->cookie, request->cookie);
> 
> Hm. Not about this patch:
> 
> we now moved to simple cookies. And it seems that actually, 64bit is too much 
> for number of request.

You're right that it's more than qemu cared about.  But there may be
other implementations that really do store a 64-bit pointer as their
opaque cookie, for ease of reverse-lookup on which command the
server's reply corresponds to, so I don't see it changing any time
soon in the NBD protocol.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization:  qemu.org | libguestfs.org




[Stable-8.0.4 54/63] virtio-crypto: verify src buffer length for sym request

2023-08-04 Thread Michael Tokarev
From: zhenwei pi 

For symmetric algorithms, the length of the ciphertext must be the same
as that of the plaintext.
The missing verification of src_len and dst_len in
virtio_crypto_sym_op_helper() may lead to a buffer overflow or
information disclosure.

This patch is originally written by Yiming Tao for QEMU-SECURITY,
resend it(a few changes of error message) in qemu-devel.

Fixes: CVE-2023-3180
Fixes: 04b9b37edda("virtio-crypto: add data queue processing handler")
Cc: Gonglei 
Cc: Mauro Matteo Cascella 
Cc: Yiming Tao 
Signed-off-by: zhenwei pi 
Message-Id: <20230803024314.29962-2-pizhen...@bytedance.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
(cherry picked from commit 9d38a8434721a6479fe03fb5afb150ca793d3980)
Signed-off-by: Michael Tokarev 

diff --git a/hw/virtio/virtio-crypto.c b/hw/virtio/virtio-crypto.c
index a1d122b9aa..ccaa704530 100644
--- a/hw/virtio/virtio-crypto.c
+++ b/hw/virtio/virtio-crypto.c
@@ -635,6 +635,11 @@ virtio_crypto_sym_op_helper(VirtIODevice *vdev,
 return NULL;
 }
 
+if (unlikely(src_len != dst_len)) {
+virtio_error(vdev, "sym request src len is different from dst len");
+return NULL;
+}
+
 max_len = (uint64_t)iv_len + aad_len + src_len + dst_len + hash_result_len;
 if (unlikely(max_len > vcrypto->conf.max_size)) {
 virtio_error(vdev, "virtio-crypto too big length");
-- 
2.39.2




[Stable-8.0.4 60/63] hw/i386/intel_iommu: Fix struct VTDInvDescIEC on big endian hosts

2023-08-04 Thread Michael Tokarev
From: Thomas Huth 

On big endian hosts, we need to reverse the bitfield order in the
struct VTDInvDescIEC, just like it is already done for the other
bitfields in the various structs of the intel-iommu device.

Signed-off-by: Thomas Huth 
Message-Id: <20230802135723.178083-4-th...@redhat.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
Reviewed-by: Peter Xu 
(cherry picked from commit 4572b22cf9ba432fa3955686853c706a1821bbc7)
Signed-off-by: Michael Tokarev 

diff --git a/hw/i386/intel_iommu_internal.h b/hw/i386/intel_iommu_internal.h
index f090e61e11..e4d43ce48c 100644
--- a/hw/i386/intel_iommu_internal.h
+++ b/hw/i386/intel_iommu_internal.h
@@ -321,12 +321,21 @@ typedef enum VTDFaultReason {
 
 /* Interrupt Entry Cache Invalidation Descriptor: VT-d 6.5.2.7. */
 struct VTDInvDescIEC {
+#if HOST_BIG_ENDIAN
+uint64_t reserved_2:16;
+uint64_t index:16;  /* Start index to invalidate */
+uint64_t index_mask:5;  /* 2^N for continuous int invalidation */
+uint64_t resved_1:22;
+uint64_t granularity:1; /* If set, it's global IR invalidation */
+uint64_t type:4;/* Should always be 0x4 */
+#else
 uint32_t type:4;/* Should always be 0x4 */
 uint32_t granularity:1; /* If set, it's global IR invalidation */
 uint32_t resved_1:22;
 uint32_t index_mask:5;  /* 2^N for continuous int invalidation */
 uint32_t index:16;  /* Start index to invalidate */
 uint32_t reserved_2:16;
+#endif
 };
 typedef struct VTDInvDescIEC VTDInvDescIEC;
 
-- 
2.39.2




[Stable-8.0.4 61/63] hw/i386/intel_iommu: Fix index calculation in vtd_interrupt_remap_msi()

2023-08-04 Thread Michael Tokarev
From: Thomas Huth 

The values in "addr" are populated locally in this function in host
endian byte order, so we must not swap the index_l field here.

Signed-off-by: Thomas Huth 
Message-Id: <20230802135723.178083-5-th...@redhat.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Peter Xu 
(cherry picked from commit fcd8027423300b201b37842b88393dc5c6c8ee9e)
Signed-off-by: Michael Tokarev 

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 03becd6384..9e6ce71454 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -3458,7 +3458,7 @@ static int vtd_interrupt_remap_msi(IntelIOMMUState *iommu,
 goto out;
 }
 
-index = addr.addr.index_h << 15 | le16_to_cpu(addr.addr.index_l);
+index = addr.addr.index_h << 15 | addr.addr.index_l;
 
 #define  VTD_IR_MSI_DATA_SUBHANDLE   (0x)
 #define  VTD_IR_MSI_DATA_RESERVED(0x)
-- 
2.39.2




[Stable-8.0.4 35/63] virtio-pci: add handling of PCI ATS and Device-TLB enable/disable

2023-08-04 Thread Michael Tokarev
From: Viktor Prutyanov 

According to PCIe Address Translation Services specification 5.1.3.,
the ATS Control Register has an Enable bit to enable/disable ATS. The guest
may enable/disable PCI ATS and, accordingly, the Device-TLB for the VirtIO
PCI device. So, raise/lower a flag and call a trigger function to pass this
event to the device implementation.

Signed-off-by: Viktor Prutyanov 
Message-Id: <20230512135122.70403-2-vik...@daynix.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
(cherry picked from commit 206e91d143301414df2deb48a411e402414ba6db)
Signed-off-by: Michael Tokarev 

diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
index 247325c193..798eba9d6e 100644
--- a/hw/virtio/virtio-pci.c
+++ b/hw/virtio/virtio-pci.c
@@ -716,6 +716,38 @@ virtio_address_space_read(VirtIOPCIProxy *proxy, hwaddr 
addr,
 }
 }
 
+static void virtio_pci_ats_ctrl_trigger(PCIDevice *pci_dev, bool enable)
+{
+VirtIOPCIProxy *proxy = VIRTIO_PCI(pci_dev);
+VirtIODevice *vdev = virtio_bus_get_device(&proxy->bus);
+VirtioDeviceClass *k = VIRTIO_DEVICE_GET_CLASS(vdev);
+
+vdev->device_iotlb_enabled = enable;
+
+if (k->toggle_device_iotlb) {
+k->toggle_device_iotlb(vdev);
+}
+}
+
+static void pcie_ats_config_write(PCIDevice *dev, uint32_t address,
+  uint32_t val, int len)
+{
+uint32_t off;
+uint16_t ats_cap = dev->exp.ats_cap;
+
+if (!ats_cap || address < ats_cap) {
+return;
+}
+off = address - ats_cap;
+if (off >= PCI_EXT_CAP_ATS_SIZEOF) {
+return;
+}
+
+if (range_covers_byte(off, len, PCI_ATS_CTRL + 1)) {
+virtio_pci_ats_ctrl_trigger(dev, !!(val & PCI_ATS_CTRL_ENABLE));
+}
+}
+
 static void virtio_write_config(PCIDevice *pci_dev, uint32_t address,
 uint32_t val, int len)
 {
@@ -729,6 +761,10 @@ static void virtio_write_config(PCIDevice *pci_dev, 
uint32_t address,
 pcie_cap_flr_write_config(pci_dev, address, val, len);
 }
 
+if (proxy->flags & VIRTIO_PCI_FLAG_ATS) {
+pcie_ats_config_write(pci_dev, address, val, len);
+}
+
 if (range_covers_byte(address, len, PCI_COMMAND)) {
 if (!(pci_dev->config[PCI_COMMAND] & PCI_COMMAND_MASTER)) {
 virtio_set_disabled(vdev, true);
diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
index f236e94ca6..bd3092a1ab 100644
--- a/include/hw/virtio/virtio.h
+++ b/include/hw/virtio/virtio.h
@@ -155,6 +155,7 @@ struct VirtIODevice
 QLIST_HEAD(, VirtQueue) *vector_queues;
 QTAILQ_ENTRY(VirtIODevice) next;
 EventNotifier config_notifier;
+bool device_iotlb_enabled;
 };
 
 struct VirtioDeviceClass {
@@ -212,6 +213,7 @@ struct VirtioDeviceClass {
 const VMStateDescription *vmsd;
 bool (*primary_unplug_pending)(void *opaque);
 struct vhost_dev *(*get_vhost)(VirtIODevice *vdev);
+void (*toggle_device_iotlb)(VirtIODevice *vdev);
 };
 
 void virtio_instance_init_common(Object *proxy_obj, void *data,
-- 
2.39.2




[Stable-8.0.4 04/63] linux-user: Fix fcntl() and fcntl64() to return O_LARGEFILE for 32-bit targets

2023-08-04 Thread Michael Tokarev
From: Helge Deller 

When running a 32-bit guest on a 64-bit host, fcntl[64](F_GETFL) should
return with the TARGET_O_LARGEFILE flag set, because all 64-bit hosts
support large files unconditionally.

But on 64-bit hosts, O_LARGEFILE has the value 0, so the flag
translation can't be done with the fcntl_flags_tbl[]. Instead add the
TARGET_O_LARGEFILE flag afterwards.

Note that for 64-bit guests the compiler will optimize away this code,
since TARGET_O_LARGEFILE is zero.
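
As a standalone sketch of why a zero-valued host flag can never come out
of a bit-mapping table (a simplified stand-in for fcntl_flags_tbl[]; the
0x8000 target bit is made up):

#include <stdio.h>

struct flag_map { int host_bit; int target_bit; };

static int host_to_target(int host_flags, const struct flag_map *tbl, int n)
{
    int target = 0;
    for (int i = 0; i < n; i++) {
        /* host_bit == 0 (O_LARGEFILE on 64-bit hosts) never matches... */
        if (host_flags & tbl[i].host_bit) {
            target |= tbl[i].target_bit;
        }
    }
    return target;
}

int main(void)
{
    const struct flag_map tbl[] = { { 0, 0x8000 } };  /* made-up target bit */

    /* ...so even with every host bit set, the target bit is lost: */
    printf("%#x\n", host_to_target(~0, tbl, 1));      /* prints 0 */
    return 0;
}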

Signed-off-by: Helge Deller 
Reviewed-by: Richard Henderson 
(cherry picked from commit e0ddf8eac9f83c0bc5a3d39605d873ee0fe53421)
Signed-off-by: Michael Tokarev 

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index 333e6b7026..011cadb281 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -7132,6 +7132,10 @@ static abi_long do_fcntl(int fd, int cmd, abi_ulong arg)
 ret = get_errno(safe_fcntl(fd, host_cmd, arg));
 if (ret >= 0) {
 ret = host_to_target_bitmask(ret, fcntl_flags_tbl);
+/* tell 32-bit guests it uses largefile on 64-bit hosts: */
+if (O_LARGEFILE == 0 && HOST_LONG_BITS == 64) {
+ret |= TARGET_O_LARGEFILE;
+}
 }
 break;
 
-- 
2.39.2




[Stable-8.0.4 51/63] target/m68k: Fix semihost lseek offset computation

2023-08-04 Thread Michael Tokarev
From: Peter Maydell 

The arguments for deposit64 are (value, start, length, fieldval); this
appears to have thought they were (value, fieldval, start,
length). Reorder the parameters to match the actual function.
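
A quick standalone check of that argument order (deposit64 re-implemented
here so the snippet compiles on its own, with the semantics described
above):

#include <assert.h>
#include <stdint.h>

/* deposit64(value, start, length, fieldval): replace bits
 * [start, start+length) of 'value' with 'fieldval'. */
static uint64_t deposit64(uint64_t value, int start, int length,
                          uint64_t fieldval)
{
    uint64_t mask = (~0ULL >> (64 - length)) << start;
    return (value & ~mask) | ((fieldval << start) & mask);
}

int main(void)
{
    uint64_t lo = 0x89abcdef;   /* arg2: low half of the offset */
    uint64_t hi = 0x01234567;   /* arg1: high half of the offset */

    /* Correct order: 'hi' goes into bits [32, 64) of 'lo'. */
    assert(deposit64(lo, 32, 32, hi) == 0x0123456789abcdefULL);

    /* The buggy call deposit64(lo, hi, 32, 32) instead used 'hi' as the
     * start bit and 32 as the field value -- a garbage offset. */
    return 0;
}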

Cc: qemu-sta...@nongnu.org
Fixes: 950272506d ("target/m68k: Use semihosting/syscalls.h")
Reported-by: Philippe Mathieu-Daudé 
Signed-off-by: Peter Maydell 
Reviewed-by: Philippe Mathieu-Daudé 
Message-Id: <20230801154519.3505531-1-peter.mayd...@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé 
(cherry picked from commit 8caaae7319a5f7ca449900c0e6bfcaed78fa3ae2)
Signed-off-by: Michael Tokarev 

diff --git a/target/m68k/m68k-semi.c b/target/m68k/m68k-semi.c
index 88ad9ba814..239f6e44e9 100644
--- a/target/m68k/m68k-semi.c
+++ b/target/m68k/m68k-semi.c
@@ -166,7 +166,7 @@ void do_m68k_semihosting(CPUM68KState *env, int nr)
 GET_ARG64(2);
 GET_ARG64(3);
 semihost_sys_lseek(cs, m68k_semi_u64_cb, arg0,
-   deposit64(arg2, arg1, 32, 32), arg3);
+   deposit64(arg2, 32, 32, arg1), arg3);
 break;
 
 case HOSTED_RENAME:
-- 
2.39.2




[Stable-8.0.4 36/63] vhost: register and change IOMMU flag depending on Device-TLB state

2023-08-04 Thread Michael Tokarev
From: Viktor Prutyanov 

The guest can disable or never enable Device-TLB. In these cases,
it can't be used even if enabled in QEMU. So, check Device-TLB state
before registering IOMMU notifier and select unmap flag depending on
that. Also, implement a way to change IOMMU notifier flag if Device-TLB
state is changed.

Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2001312
Signed-off-by: Viktor Prutyanov 
Acked-by: Jason Wang 
Message-Id: <20230626091258.24453-2-vik...@daynix.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
(cherry picked from commit ee071f67f7a103c66f85f68ffe083712929122e3)
Signed-off-by: Michael Tokarev 

diff --git a/hw/virtio/vhost-stub.c b/hw/virtio/vhost-stub.c
index c175148fce..aa858ef3fb 100644
--- a/hw/virtio/vhost-stub.c
+++ b/hw/virtio/vhost-stub.c
@@ -15,3 +15,7 @@ bool vhost_user_init(VhostUserState *user, CharBackend *chr, 
Error **errp)
 void vhost_user_cleanup(VhostUserState *user)
 {
 }
+
+void vhost_toggle_device_iotlb(VirtIODevice *vdev)
+{
+}
diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index 69a7b5592a..480e7f8048 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -781,7 +781,6 @@ static void vhost_iommu_region_add(MemoryListener *listener,
 Int128 end;
 int iommu_idx;
 IOMMUMemoryRegion *iommu_mr;
-int ret;
 
 if (!memory_region_is_iommu(section->mr)) {
 return;
@@ -796,7 +795,9 @@ static void vhost_iommu_region_add(MemoryListener *listener,
 iommu_idx = memory_region_iommu_attrs_to_index(iommu_mr,
MEMTXATTRS_UNSPECIFIED);
 iommu_notifier_init(&iommu->n, vhost_iommu_unmap_notify,
-IOMMU_NOTIFIER_DEVIOTLB_UNMAP,
+dev->vdev->device_iotlb_enabled ?
+IOMMU_NOTIFIER_DEVIOTLB_UNMAP :
+IOMMU_NOTIFIER_UNMAP,
 section->offset_within_region,
 int128_get64(end),
 iommu_idx);
@@ -804,16 +805,8 @@ static void vhost_iommu_region_add(MemoryListener 
*listener,
 iommu->iommu_offset = section->offset_within_address_space -
   section->offset_within_region;
 iommu->hdev = dev;
-ret = memory_region_register_iommu_notifier(section->mr, &iommu->n, NULL);
-if (ret) {
-/*
- * Some vIOMMUs do not support dev-iotlb yet.  If so, try to use the
- * UNMAP legacy message
- */
-iommu->n.notifier_flags = IOMMU_NOTIFIER_UNMAP;
-memory_region_register_iommu_notifier(section->mr, &iommu->n,
-  &error_fatal);
-}
+memory_region_register_iommu_notifier(section->mr, &iommu->n,
+  &error_fatal);
 QLIST_INSERT_HEAD(>iommu_list, iommu, iommu_next);
 /* TODO: can replay help performance here? */
 }
@@ -841,6 +834,27 @@ static void vhost_iommu_region_del(MemoryListener 
*listener,
 }
 }
 
+void vhost_toggle_device_iotlb(VirtIODevice *vdev)
+{
+VirtioDeviceClass *vdc = VIRTIO_DEVICE_GET_CLASS(vdev);
+struct vhost_dev *dev;
+struct vhost_iommu *iommu;
+
+if (vdev->vhost_started) {
+dev = vdc->get_vhost(vdev);
+} else {
+return;
+}
+
+QLIST_FOREACH(iommu, &dev->iommu_list, iommu_next) {
+memory_region_unregister_iommu_notifier(iommu->mr, &iommu->n);
+iommu->n.notifier_flags = vdev->device_iotlb_enabled ?
+IOMMU_NOTIFIER_DEVIOTLB_UNMAP : IOMMU_NOTIFIER_UNMAP;
+memory_region_register_iommu_notifier(iommu->mr, &iommu->n,
+  &error_fatal);
+}
+}
+
 static int vhost_virtqueue_set_addr(struct vhost_dev *dev,
 struct vhost_virtqueue *vq,
 unsigned idx, bool enable_log)
diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
index a52f273347..0a07f4435e 100644
--- a/include/hw/virtio/vhost.h
+++ b/include/hw/virtio/vhost.h
@@ -320,6 +320,7 @@ bool vhost_has_free_slot(void);
 int vhost_net_set_backend(struct vhost_dev *hdev,
   struct vhost_vring_file *file);
 
+void vhost_toggle_device_iotlb(VirtIODevice *vdev);
 int vhost_device_iotlb_miss(struct vhost_dev *dev, uint64_t iova, int write);
 
 int vhost_virtqueue_start(struct vhost_dev *dev, struct VirtIODevice *vdev,
-- 
2.39.2




[Stable-8.0.4 43/63] target/ppc: Disable goto_tb with architectural singlestep

2023-08-04 Thread Michael Tokarev
From: Richard Henderson 

The change to use translator_use_goto_tb went too far, as the
CF_SINGLE_STEP flag managed by the translator only handles
gdb single stepping and not the architectural single stepping
modeled in DisasContext.singlestep_enabled.

Fixes: 6e9cc373ec5 ("target/ppc: Use translator_use_goto_tb")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1795
Reviewed-by: Cédric Le Goater 
Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Richard Henderson 
(cherry picked from commit 2e718e665706d5fcc3e3501bda26f277f055ed85)
Signed-off-by: Michael Tokarev 

diff --git a/target/ppc/translate.c b/target/ppc/translate.c
index 49a6b91842..26222e9078 100644
--- a/target/ppc/translate.c
+++ b/target/ppc/translate.c
@@ -4132,6 +4132,9 @@ static void pmu_count_insns(DisasContext *ctx)
 
 static inline bool use_goto_tb(DisasContext *ctx, target_ulong dest)
 {
+if (unlikely(ctx->singlestep_enabled)) {
+return false;
+}
 return translator_use_goto_tb(&ctx->base, dest);
 }
 
-- 
2.39.2




[Stable-8.0.4 63/63] include/hw/i386/x86-iommu: Fix struct X86IOMMU_MSIMessage for big endian hosts

2023-08-04 Thread Michael Tokarev
From: Thomas Huth 

The first bitfield here is supposed to be used as a 64-bit equivalent
to the "uint64_t msi_addr" in the union. To make this work correctly
on big endian hosts, too, the __addr_hi field has to be part of the
bitfield, and the bitfield members must be declared with "uint64_t"
instead of "uint32_t" - otherwise the values are placed in the wrong
bytes on big endian hosts.

Same applies to the 32-bit "msi_data" field: __resved1 must be part
of the bitfield, and the members must be declared with "uint32_t"
instead of "uint16_t".

Signed-off-by: Thomas Huth 
Message-Id: <20230802135723.178083-7-th...@redhat.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
Reviewed-by: Peter Xu 
(cherry picked from commit e1e56c07d1fa24aa37a7e89e6633768fc8ea8705)
Signed-off-by: Michael Tokarev 

diff --git a/include/hw/i386/x86-iommu.h b/include/hw/i386/x86-iommu.h
index 8d8d53b18b..bfd21649d0 100644
--- a/include/hw/i386/x86-iommu.h
+++ b/include/hw/i386/x86-iommu.h
@@ -87,40 +87,42 @@ struct X86IOMMU_MSIMessage {
 union {
 struct {
 #if HOST_BIG_ENDIAN
-uint32_t __addr_head:12; /* 0xfee */
-uint32_t dest:8;
-uint32_t __reserved:8;
-uint32_t redir_hint:1;
-uint32_t dest_mode:1;
-uint32_t __not_used:2;
+uint64_t __addr_hi:32;
+uint64_t __addr_head:12; /* 0xfee */
+uint64_t dest:8;
+uint64_t __reserved:8;
+uint64_t redir_hint:1;
+uint64_t dest_mode:1;
+uint64_t __not_used:2;
 #else
-uint32_t __not_used:2;
-uint32_t dest_mode:1;
-uint32_t redir_hint:1;
-uint32_t __reserved:8;
-uint32_t dest:8;
-uint32_t __addr_head:12; /* 0xfee */
+uint64_t __not_used:2;
+uint64_t dest_mode:1;
+uint64_t redir_hint:1;
+uint64_t __reserved:8;
+uint64_t dest:8;
+uint64_t __addr_head:12; /* 0xfee */
+uint64_t __addr_hi:32;
 #endif
-uint32_t __addr_hi;
 } QEMU_PACKED;
 uint64_t msi_addr;
 };
 union {
 struct {
 #if HOST_BIG_ENDIAN
-uint16_t trigger_mode:1;
-uint16_t level:1;
-uint16_t __resved:3;
-uint16_t delivery_mode:3;
-uint16_t vector:8;
+uint32_t __resved1:16;
+uint32_t trigger_mode:1;
+uint32_t level:1;
+uint32_t __resved:3;
+uint32_t delivery_mode:3;
+uint32_t vector:8;
 #else
-uint16_t vector:8;
-uint16_t delivery_mode:3;
-uint16_t __resved:3;
-uint16_t level:1;
-uint16_t trigger_mode:1;
+uint32_t vector:8;
+uint32_t delivery_mode:3;
+uint32_t __resved:3;
+uint32_t level:1;
+uint32_t trigger_mode:1;
+uint32_t __resved1:16;
 #endif
-uint16_t __resved1;
 } QEMU_PACKED;
 uint32_t msi_data;
 };
-- 
2.39.2




[Stable-8.0.4 48/63] hw/xen: fix off-by-one in xen_evtchn_set_gsi()

2023-08-04 Thread Michael Tokarev
From: David Woodhouse 

Coverity points out (CID 1508128) a bounds checking error. We need to check
for gsi >= IOAPIC_NUM_PINS, not just greater-than.

Also fix up an assert() that has the same problem, that Coverity didn't see.

Fixes: 4f81baa33ed6 ("hw/xen: Support GSI mapping to PIRQ")
Signed-off-by: David Woodhouse 
Reviewed-by: Peter Maydell 
Reviewed-by: Philippe Mathieu-Daudé 
Message-Id: <20230801175747.145906-2-dw...@infradead.org>
Signed-off-by: Philippe Mathieu-Daudé 
(cherry picked from commit cf885b19579646d6a085470658bc83432d6786d2)
Signed-off-by: Michael Tokarev 

diff --git a/hw/i386/kvm/xen_evtchn.c b/hw/i386/kvm/xen_evtchn.c
index 3048329474..8c86c91a9e 100644
--- a/hw/i386/kvm/xen_evtchn.c
+++ b/hw/i386/kvm/xen_evtchn.c
@@ -1587,7 +1587,7 @@ static int allocate_pirq(XenEvtchnState *s, int type, int 
gsi)
  found:
 pirq_inuse_word(s, pirq) |= pirq_inuse_bit(pirq);
 if (gsi >= 0) {
-assert(gsi <= IOAPIC_NUM_PINS);
+assert(gsi < IOAPIC_NUM_PINS);
 s->gsi_pirq[gsi] = pirq;
 }
 s->pirq[pirq].gsi = gsi;
@@ -1601,7 +1601,7 @@ bool xen_evtchn_set_gsi(int gsi, int level)
 
 assert(qemu_mutex_iothread_locked());
 
-if (!s || gsi < 0 || gsi > IOAPIC_NUM_PINS) {
+if (!s || gsi < 0 || gsi >= IOAPIC_NUM_PINS) {
 return false;
 }
 
-- 
2.39.2




[Stable-8.0.4 52/63] hw/virtio-iommu: Fix potential OOB access in virtio_iommu_handle_command()

2023-08-04 Thread Michael Tokarev
From: Eric Auger 

In the virtio_iommu_handle_command() when a PROBE request is handled,
output_size takes a value greater than the tail size and on a subsequent
iteration we can get a stack out-of-bounds access. Initialize the
output_size on each iteration.

The issue was found with ASAN. Credits to:
Yiming Tao(Zhejiang University)
Gaoning Pan(Zhejiang University)

Fixes: 1733eebb9e7 ("virtio-iommu: Implement RESV_MEM probe request")
Signed-off-by: Eric Auger 
Reported-by: Mauro Matteo Cascella 
Cc: qemu-sta...@nongnu.org

Message-Id: <20230717162126.11693-1-eric.au...@redhat.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
(cherry picked from commit cf2f89edf36a59183166ae8721a8d7ab5cd286bd)
Signed-off-by: Michael Tokarev 

diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
index 1cd258135d..e84300d50c 100644
--- a/hw/virtio/virtio-iommu.c
+++ b/hw/virtio/virtio-iommu.c
@@ -727,13 +727,15 @@ static void virtio_iommu_handle_command(VirtIODevice 
*vdev, VirtQueue *vq)
 VirtIOIOMMU *s = VIRTIO_IOMMU(vdev);
 struct virtio_iommu_req_head head;
 struct virtio_iommu_req_tail tail = {};
-size_t output_size = sizeof(tail), sz;
 VirtQueueElement *elem;
 unsigned int iov_cnt;
 struct iovec *iov;
 void *buf = NULL;
+size_t sz;
 
 for (;;) {
+size_t output_size = sizeof(tail);
+
 elem = virtqueue_pop(vq, sizeof(VirtQueueElement));
 if (!elem) {
 return;
-- 
2.39.2




[Stable-8.0.4 49/63] target/nios2: Pass semihosting arg to exit

2023-08-04 Thread Michael Tokarev
From: Keith Packard 

Instead of using R_ARG0 (the semihost function number), use R_ARG1
(the provided exit status).

Signed-off-by: Keith Packard 
Reviewed-by: Peter Maydell 
Message-Id: <20230801152245.332749-1-kei...@keithp.com>
Signed-off-by: Philippe Mathieu-Daudé 
(cherry picked from commit c11d5bdae79a8edaf00dfcb2e49c064a50c67671)
Signed-off-by: Michael Tokarev 

diff --git a/target/nios2/nios2-semi.c b/target/nios2/nios2-semi.c
index 3738774976..f3b7aee4f1 100644
--- a/target/nios2/nios2-semi.c
+++ b/target/nios2/nios2-semi.c
@@ -133,8 +133,8 @@ void do_nios2_semihosting(CPUNios2State *env)
 args = env->regs[R_ARG1];
 switch (nr) {
 case HOSTED_EXIT:
-gdb_exit(env->regs[R_ARG0]);
-exit(env->regs[R_ARG0]);
+gdb_exit(env->regs[R_ARG1]);
+exit(env->regs[R_ARG1]);
 
 case HOSTED_OPEN:
 GET_ARG(0);
-- 
2.39.2




[Stable-8.0.4 00/63] Patch Round-up for stable 8.0.4, freeze on 2023-08-05

2023-08-04 Thread Michael Tokarev
The following patches are queued for QEMU stable v8.0.4:

  https://gitlab.com/qemu-project/qemu/-/commits/staging-8.0

Patch freeze is 2023-08-05, and the release is planned for 2023-08-07:

  https://wiki.qemu.org/Planning/8.0

Please respond here or CC qemu-sta...@nongnu.org on any additional patches
you think should (or shouldn't) be included in the release.

The changes which are staged for inclusion, with the original commit hash
from the master branch, are given below the bottom line.

Thanks!

/mjt

--
01* 4271f4038372 Laurent Vivier:
   virtio-net: correctly report maximum tx_queue_size value
02* ca2a5e630dc1 Fiona Ebner:
   qemu_cleanup: begin drained section after vm_shutdown()
03* 2ad2e113deb5 Nicholas Piggin:
   hw/ppc: Fix clock update drift
04 e0ddf8eac9f8 Helge Deller:
   linux-user: Fix fcntl() and fcntl64() to return O_LARGEFILE for 32-bit 
   targets
05* dca4c8384d68 Helge Deller:
   linux-user: Fix accept4(SOCK_NONBLOCK) syscall
06* 8af87a3ec7e4 Avihai Horon:
   vfio: Fix null pointer dereference bug in vfio_bars_finalize()
07* 110b1bac2ecd Ilya Leoshkevich:
   target/s390x: Fix EPSW CC reporting
08* fed9a4fe0ce0 Ilya Leoshkevich:
   target/s390x: Fix MDEB and MDEBR
09* 92a57534619a Ilya Leoshkevich:
   target/s390x: Fix MVCRL with a large value in R0
10* 6da311a60d58 Ilya Leoshkevich:
   target/s390x: Fix LRA overwriting the top 32 bits on DAT error
11* b0ef81062d24 Ilya Leoshkevich:
   target/s390x: Fix LRA when DAT is off
12* baf21eebc3e1 Marcin Nowakowski:
   target/mips: enable GINVx support for I6400 and I6500
13* 230dfd9257e9 Olaf Hering:
   hw/ide/piix: properly initialize the BMIBA register
14* 7a8d9f3a0e88 Pierrick Bouvier:
   linux-user/syscall: Implement execve without execveat
15* e18ed26ce785 Richard Henderson:
   tcg: Fix info_in_idx increment in layout_arg_by_ref
16* d713cf4d6c71 Philippe Mathieu-Daudé:
   linux-user/arm: Do not allocate a commpage at all for M-profile CPUs
17* d921fea338c1 Mauro Matteo Cascella:
   ui/vnc-clipboard: fix infinite loop in inflate_buffer (CVE-2023-3255)
18* d28b3c90cfad Andreas Schwab:
   linux-user: Make sure initial brk(0) is page-aligned
19* ea3c76f1494d Klaus Jensen:
   hw/nvme: fix endianness issue for shadow doorbells
20* 15ad98536ad9 Helge Deller:
   linux-user: Fix qemu brk() to not zero bytes on current page
21* dfe49864afb0 Helge Deller:
   linux-user: Prohibit brk() to to shrink below initial heap address
22* eac78a4b0b7d Helge Deller:
   linux-user: Fix signed math overflow in brk() syscall
23* 03b67621445d Denis V. Lunev:
   qemu-nbd: pass structure into nbd_client_thread instead of plain char*
24* 5c56dd27a2c9 Denis V. Lunev:
   qemu-nbd: fix regression with qemu-nbd --fork run over ssh
25 e5b815b0defc Denis V. Lunev:
   qemu-nbd: regression with arguments passing into nbd_client_thread()
26* 736a1588c104 Jordan Niethe:
   tcg/ppc: Fix race in goto_tb implementation
27* 22d2e5351a18 Ilya Leoshkevich:
   tcg/{i386, s390x}: Add earlyclobber to the op_add2's first output
28* 761b0aa9381e Ilya Leoshkevich:
   target/s390x: Make CKSM raise an exception if R2 is odd
29* 4b6e4c0b8223 Ilya Leoshkevich:
   target/s390x: Fix CLM with M3=0
30* 53684e344a27 Ilya Leoshkevich:
   target/s390x: Fix CONVERT TO LOGICAL/FIXED with out-of-range inputs
31* a2025557ed4d Ilya Leoshkevich:
   target/s390x: Fix ICM with M3=0
32* 9c028c057adc Ilya Leoshkevich:
   target/s390x: Make MC raise specification exception when class >= 16
33* ff537b0370ab Ilya Leoshkevich:
   target/s390x: Fix assertion failure in VFMIN/VFMAX with type 13
34 c34ad459926f Thomas Huth:
   target/loongarch: Fix the CSRRD CPUID instruction on big endian hosts
35 206e91d14330 Viktor Prutyanov:
   virtio-pci: add handling of PCI ATS and Device-TLB enable/disable
36 ee071f67f7a1 Viktor Prutyanov:
   vhost: register and change IOMMU flag depending on Device-TLB state
37 cd9b83468843 Viktor Prutyanov:
   virtio-net: pass Device-TLB enable/disable events to vhost
38 c6445544d4ce Peter Maydell:
   hw/arm/smmu: Handle big-endian hosts correctly
39 5d78893f39ca Peter Maydell:
   target/arm: Special case M-profile in debug_helper.c code
40 2b0d656ab648 Peter Maydell:
   target/arm: Avoid writing to constant TCGv in trans_CSEL()
41 055b86e0f0b4 Richard Henderson:
   util/interval-tree: Use qatomic_read for left/right while searching
42 4c8baa02d363 Richard Henderson:
   util/interval-tree: Use qatomic_set_mb in rb_link_node
43 2e718e665706 Richard Henderson:
   target/ppc: Disable goto_tb with architectural singlestep
44 38dd78c41eaf Helge Deller:
   linux-user/armeb: Fix __kernel_cmpxchg() for armeb
45 f4f71363fcdb Anthony PERARD:
   thread-pool: signal "request_cond" while locked
46 aa36243514a7 Anthony PERARD:
   xen-block: Avoid leaks on new error path
47 10be627d2b5e Daniel P. Berrangé:
   io: remove io watch if TLS channel is closed during handshake
48 cf885b195796 David Woodhouse:
   hw/xen: fix off-by-one in xen_evtchn_set_gsi()
49 c11d5bdae79a 

[Stable-8.0.4 50/63] target/nios2: Fix semihost lseek offset computation

2023-08-04 Thread Michael Tokarev
From: Keith Packard 

The arguments for deposit64 are (value, start, length, fieldval); this
appears to have thought they were (value, fieldval, start,
length). Reorder the parameters to match the actual function.

Signed-off-by: Keith Packard 
Reviewed-by: Philippe Mathieu-Daudé 
Fixes: d1e23cbaa403b2d ("target/nios2: Use semihosting/syscalls.h")
Reviewed-by: Peter Maydell 
Message-Id: <20230731235245.295513-1-kei...@keithp.com>
Signed-off-by: Philippe Mathieu-Daudé 
(cherry picked from commit 71e2dd6aa1bdbac19c661638a4ae91816002ac9e)
Signed-off-by: Michael Tokarev 

diff --git a/target/nios2/nios2-semi.c b/target/nios2/nios2-semi.c
index f3b7aee4f1..9d0241c758 100644
--- a/target/nios2/nios2-semi.c
+++ b/target/nios2/nios2-semi.c
@@ -169,7 +169,7 @@ void do_nios2_semihosting(CPUNios2State *env)
 GET_ARG64(2);
 GET_ARG64(3);
 semihost_sys_lseek(cs, nios2_semi_u64_cb, arg0,
-   deposit64(arg2, arg1, 32, 32), arg3);
+   deposit64(arg2, 32, 32, arg1), arg3);
 break;
 
 case HOSTED_RENAME:
-- 
2.39.2




[Stable-8.0.4 57/63] pci: do not respond config requests after PCI device eject

2023-08-04 Thread Michael Tokarev
From: Yuri Benditovich 

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=2224964

In a migration with VF failover on a Windows guest using ACPI hot
unplug, we do not need to satisfy config requests; otherwise
the guest immediately detects the device and brings up its
driver, and many network VFs get stuck on the guest PCI bus
after the migration.

Signed-off-by: Yuri Benditovich 
Message-Id: <20230728084049.191454-1-yuri.benditov...@daynix.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
(cherry picked from commit 348e354417b64c484877354ee7cc66f29fa6c7df)
Signed-off-by: Michael Tokarev 

diff --git a/hw/pci/pci_host.c b/hw/pci/pci_host.c
index dfd185bbb4..7b09847be0 100644
--- a/hw/pci/pci_host.c
+++ b/hw/pci/pci_host.c
@@ -62,6 +62,17 @@ static void pci_adjust_config_limit(PCIBus *bus, uint32_t 
*limit)
 }
 }
 
+static bool is_pci_dev_ejected(PCIDevice *pci_dev)
+{
+/*
+ * device unplug was requested and the guest acked it,
+ * so we stop responding config accesses even if the
+ * device is not deleted (failover flow)
+ */
+return pci_dev && pci_dev->partially_hotplugged &&
+   !pci_dev->qdev.pending_deleted_event;
+}
+
 void pci_host_config_write_common(PCIDevice *pci_dev, uint32_t addr,
   uint32_t limit, uint32_t val, uint32_t len)
 {
@@ -75,7 +86,7 @@ void pci_host_config_write_common(PCIDevice *pci_dev, 
uint32_t addr,
  * allowing direct removal of unexposed functions.
  */
 if ((pci_dev->qdev.hotplugged && !pci_get_function_0(pci_dev)) ||
-!pci_dev->has_power) {
+!pci_dev->has_power || is_pci_dev_ejected(pci_dev)) {
 return;
 }
 
@@ -100,7 +111,7 @@ uint32_t pci_host_config_read_common(PCIDevice *pci_dev, 
uint32_t addr,
  * allowing direct removal of unexposed functions.
  */
 if ((pci_dev->qdev.hotplugged && !pci_get_function_0(pci_dev)) ||
-!pci_dev->has_power) {
+!pci_dev->has_power || is_pci_dev_ejected(pci_dev)) {
 return ~0x0;
 }
 
-- 
2.39.2




[Stable-8.0.4 62/63] hw/i386/x86-iommu: Fix endianness issue in x86_iommu_irq_to_msi_message()

2023-08-04 Thread Michael Tokarev
From: Thomas Huth 

The values in "msg" are assembled in host endian byte order (the other
field are also not swapped), so we must not swap the __addr_head here.

Signed-off-by: Thomas Huth 
Message-Id: <20230802135723.178083-6-th...@redhat.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Peter Xu 
(cherry picked from commit 37cf5cecb039a063c0abe3b51ae30f969e73aa84)
Signed-off-by: Michael Tokarev 

diff --git a/hw/i386/x86-iommu.c b/hw/i386/x86-iommu.c
index 01d11325a6..726e9e1d16 100644
--- a/hw/i386/x86-iommu.c
+++ b/hw/i386/x86-iommu.c
@@ -63,7 +63,7 @@ void x86_iommu_irq_to_msi_message(X86IOMMUIrq *irq, 
MSIMessage *msg_out)
 msg.redir_hint = irq->redir_hint;
 msg.dest = irq->dest;
 msg.__addr_hi = irq->dest & 0xff00;
-msg.__addr_head = cpu_to_le32(0xfee);
+msg.__addr_head = 0xfee;
 /* Keep this from original MSI address bits */
 msg.__not_used = irq->msi_addr_last_bits;
 
-- 
2.39.2




[Stable-8.0.4 59/63] hw/i386/intel_iommu: Fix endianness problems related to VTD_IR_TableEntry

2023-08-04 Thread Michael Tokarev
From: Thomas Huth 

The code already tries to do some endianness handling here, but
currently fails badly:
- While it already swaps the data when logging errors / tracing, it fails
  to byteswap the value before e.g. accessing entry->irte.present
- entry->irte.source_id is swapped with le32_to_cpu(), though this is
  a 16-bit value
- The whole union is apparently supposed to be swapped via the 64-bit
  data[2] array, but the struct is a mixture between 32 bit values
  (the first 8 bytes) and 64 bit values (the second 8 bytes), so this
  cannot work as expected.

Fix it by converting the struct to two proper 64-bit bitfields, and
by swapping the values only once for everybody right after reading
the data from memory.

Signed-off-by: Thomas Huth 
Message-Id: <20230802135723.178083-3-th...@redhat.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
Reviewed-by: Peter Xu 
(cherry picked from commit 642ba89672279fbdd14016a90da239c85e845d18)
Signed-off-by: Michael Tokarev 

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 255a881ad0..03becd6384 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -3327,14 +3327,15 @@ static int vtd_irte_get(IntelIOMMUState *iommu, 
uint16_t index,
 return -VTD_FR_IR_ROOT_INVAL;
 }
 
-trace_vtd_ir_irte_get(index, le64_to_cpu(entry->data[1]),
-  le64_to_cpu(entry->data[0]));
+entry->data[0] = le64_to_cpu(entry->data[0]);
+entry->data[1] = le64_to_cpu(entry->data[1]);
+
+trace_vtd_ir_irte_get(index, entry->data[1], entry->data[0]);
 
 if (!entry->irte.present) {
 error_report_once("%s: detected non-present IRTE "
   "(index=%u, high=0x%" PRIx64 ", low=0x%" PRIx64 ")",
-  __func__, index, le64_to_cpu(entry->data[1]),
-  le64_to_cpu(entry->data[0]));
+  __func__, index, entry->data[1], entry->data[0]);
 return -VTD_FR_IR_ENTRY_P;
 }
 
@@ -3342,14 +3343,13 @@ static int vtd_irte_get(IntelIOMMUState *iommu, 
uint16_t index,
 entry->irte.__reserved_2) {
 error_report_once("%s: detected non-zero reserved IRTE "
   "(index=%u, high=0x%" PRIx64 ", low=0x%" PRIx64 ")",
-  __func__, index, le64_to_cpu(entry->data[1]),
-  le64_to_cpu(entry->data[0]));
+  __func__, index, entry->data[1], entry->data[0]);
 return -VTD_FR_IR_IRTE_RSVD;
 }
 
 if (sid != X86_IOMMU_SID_INVALID) {
 /* Validate IRTE SID */
-source_id = le32_to_cpu(entry->irte.source_id);
+source_id = entry->irte.source_id;
 switch (entry->irte.sid_vtype) {
 case VTD_SVT_NONE:
 break;
@@ -3403,7 +3403,7 @@ static int vtd_remap_irq_get(IntelIOMMUState *iommu, 
uint16_t index,
 irq->trigger_mode = irte.irte.trigger_mode;
 irq->vector = irte.irte.vector;
 irq->delivery_mode = irte.irte.delivery_mode;
-irq->dest = le32_to_cpu(irte.irte.dest_id);
+irq->dest = irte.irte.dest_id;
 if (!iommu->intr_eime) {
 #define  VTD_IR_APIC_DEST_MASK (0xff00ULL)
 #define  VTD_IR_APIC_DEST_SHIFT(8)
diff --git a/include/hw/i386/intel_iommu.h b/include/hw/i386/intel_iommu.h
index 89dcbc5e1e..7fa0a695c8 100644
--- a/include/hw/i386/intel_iommu.h
+++ b/include/hw/i386/intel_iommu.h
@@ -178,37 +178,39 @@ enum {
 union VTD_IR_TableEntry {
 struct {
 #if HOST_BIG_ENDIAN
-uint32_t __reserved_1:8; /* Reserved 1 */
-uint32_t vector:8;   /* Interrupt Vector */
-uint32_t irte_mode:1;/* IRTE Mode */
-uint32_t __reserved_0:3; /* Reserved 0 */
-uint32_t __avail:4;  /* Available spaces for software */
-uint32_t delivery_mode:3;/* Delivery Mode */
-uint32_t trigger_mode:1; /* Trigger Mode */
-uint32_t redir_hint:1;   /* Redirection Hint */
-uint32_t dest_mode:1;/* Destination Mode */
-uint32_t fault_disable:1;/* Fault Processing Disable */
-uint32_t present:1;  /* Whether entry present/available */
+uint64_t dest_id:32; /* Destination ID */
+uint64_t __reserved_1:8; /* Reserved 1 */
+uint64_t vector:8;   /* Interrupt Vector */
+uint64_t irte_mode:1;/* IRTE Mode */
+uint64_t __reserved_0:3; /* Reserved 0 */
+uint64_t __avail:4;  /* Available spaces for software */
+uint64_t delivery_mode:3;/* Delivery Mode */
+uint64_t trigger_mode:1; /* Trigger Mode */
+uint64_t redir_hint:1;   /* Redirection Hint */
+uint64_t dest_mode:1;/* Destination Mode */
+uint64_t fault_disable:1;/* Fault Processing Disable */
+uint64_t present:1;  /* Whether entry present/available */
 #else
-uint32_t present:1;  /* Whether entry 

[Stable-8.0.4 44/63] linux-user/armeb: Fix __kernel_cmpxchg() for armeb

2023-08-04 Thread Michael Tokarev
From: Helge Deller 

Commit 7f4f0d9ea870 ("linux-user/arm: Implement __kernel_cmpxchg with host
atomics") switched to use qatomic_cmpxchg() to swap a word with the memory
content, but missed to endianess-swap the oldval and newval values when
emulating an armeb CPU, which expects words to be stored in big endian in
the guest memory.

The bug can be verified with qemu >= v7.0 on any little-endian host, when
starting the armeb binary of the upx program, which just hangs without
this patch.

Cc: qemu-sta...@nongnu.org
Signed-off-by: Helge Deller 
Reported-by: "Markus F.X.J. Oberhumer" 
Reported-by: John Reiser 
Closes: https://github.com/upx/upx/issues/687
Message-Id: 
Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Richard Henderson 
Signed-off-by: Richard Henderson 
(cherry picked from commit 38dd78c41eaf08b490c9e7ec68fc508bbaa5cb1d)
Signed-off-by: Michael Tokarev 

diff --git a/linux-user/arm/cpu_loop.c b/linux-user/arm/cpu_loop.c
index a992423257..b404117ff3 100644
--- a/linux-user/arm/cpu_loop.c
+++ b/linux-user/arm/cpu_loop.c
@@ -117,8 +117,9 @@ static void arm_kernel_cmpxchg32_helper(CPUARMState *env)
 {
 uint32_t oldval, newval, val, addr, cpsr, *host_addr;
 
-oldval = env->regs[0];
-newval = env->regs[1];
+/* Swap if host != guest endianness, for the host cmpxchg below */
+oldval = tswap32(env->regs[0]);
+newval = tswap32(env->regs[1]);
 addr = env->regs[2];
 
 mmap_lock();
@@ -174,6 +175,10 @@ static void arm_kernel_cmpxchg64_helper(CPUARMState *env)
 return;
 }
 
+/* Swap if host != guest endianness, for the host cmpxchg below */
+oldval = tswap64(oldval);
+newval = tswap64(newval);
+
 #ifdef CONFIG_ATOMIC64
 val = qatomic_cmpxchg__nocheck(host_addr, oldval, newval);
 cpsr = (val == oldval) * CPSR_C;
-- 
2.39.2




[Stable-8.0.4 56/63] target/hppa: Move iaoq registers and thus reduce generated code size

2023-08-04 Thread Michael Tokarev
From: Helge Deller 

On hppa the Instruction Address Offset Queue (IAOQ) registers specify
the addresses of the next instructions to execute. Each generated TB writes those
registers at least once, so those registers are used heavily in generated
code.

Looking at the generated assembly, for a x86-64 host this code
to write the address $0x7ffe826f into iaoq_f is generated:
0x7f73e8000184:  c7 85 d4 01 00 00 6f 82  movl $0x7ffe826f, 0x1d4(%rbp)
0x7f73e800018c:  fe 7f
0x7f73e800018e:  c7 85 d8 01 00 00 73 82  movl $0x7ffe8273, 0x1d8(%rbp)
0x7f73e8000196:  fe 7f

With the trivial change, by moving the variables iaoq_f and iaoq_b to
the top of struct CPUArchState, the offset to %rbp is reduced (from
0x1d4 to 0), which allows the x86-64 tcg to generate 3 bytes less of
generated code per move instruction:
0x7fc1e800018c:  c7 45 00 6f 82 fe 7f movl $0x7ffe826f, (%rbp)
0x7fc1e8000193:  c7 45 04 73 82 fe 7f movl $0x7ffe8273, 4(%rbp)

Overall this is a reduction of generated code (not a reduction of
number of instructions).
A test run that checks the generated code size by running "/bin/ls"
with qemu-user shows that the code size shrinks from 1616767 to 1569273
bytes, which is ~97% of the former size.

Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Richard Henderson 
Signed-off-by: Helge Deller 
Cc: qemu-sta...@nongnu.org
(cherry picked from commit f8c0fd9804f435a20c3baa4c0c77ba9a02af24ef)
Signed-off-by: Michael Tokarev 

diff --git a/target/hppa/cpu.h b/target/hppa/cpu.h
index b595ef25a9..c7659e5b0d 100644
--- a/target/hppa/cpu.h
+++ b/target/hppa/cpu.h
@@ -168,6 +168,9 @@ typedef struct {
 } hppa_tlb_entry;
 
 typedef struct CPUArchState {
+target_ureg iaoq_f;  /* front */
+target_ureg iaoq_b;  /* back, aka next instruction */
+
 target_ureg gr[32];
 uint64_t fr[32];
 uint64_t sr[8];  /* stored shifted into place for gva */
@@ -186,8 +189,6 @@ typedef struct CPUArchState {
 target_ureg psw_cb;  /* in least significant bit of next nibble */
 target_ureg psw_cb_msb;  /* boolean */
 
-target_ureg iaoq_f;  /* front */
-target_ureg iaoq_b;  /* back, aka next instruction */
 uint64_t iasq_f;
 uint64_t iasq_b;
 
-- 
2.39.2




[Stable-8.0.4 38/63] hw/arm/smmu: Handle big-endian hosts correctly

2023-08-04 Thread Michael Tokarev
From: Peter Maydell 

The implementation of the SMMUv3 has multiple places where it reads a
data structure from the guest and directly operates on it without
doing a guest-to-host endianness conversion.  Since all SMMU data
structures are little-endian, this means that the SMMU doesn't work
on a big-endian host.  In particular, this causes the Avocado test
  machine_aarch64_virt.py:Aarch64VirtMachine.test_alpine_virt_tcg_gic_max
to fail on an s390x host.

Add appropriate byte-swapping on reads and writes of guest in-memory
data structures so that the device works correctly on big-endian
hosts.

As part of this we constrain queue_read() to operate only on Cmd
structs and queue_write() on Evt structs, because in practice these
are the only data structures the two functions are used with, and we
need to know what the data structure is to be able to byte-swap its
parts correctly.

Signed-off-by: Peter Maydell 
Tested-by: Thomas Huth 
Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Eric Auger 
Message-id: 20230717132641.764660-1-peter.mayd...@linaro.org
Cc: qemu-sta...@nongnu.org
(cherry picked from commit c6445544d4cea2628fbad3bad09f3d3a03c749d3)
Signed-off-by: Michael Tokarev 

diff --git a/hw/arm/smmu-common.c b/hw/arm/smmu-common.c
index e7f1c1f219..daa02ce798 100644
--- a/hw/arm/smmu-common.c
+++ b/hw/arm/smmu-common.c
@@ -192,8 +192,7 @@ static int get_pte(dma_addr_t baseaddr, uint32_t index, 
uint64_t *pte,
 dma_addr_t addr = baseaddr + index * sizeof(*pte);
 
 /* TODO: guarantee 64-bit single-copy atomicity */
-ret = dma_memory_read(&address_space_memory, addr, pte, sizeof(*pte),
-  MEMTXATTRS_UNSPECIFIED);
+ret = ldq_le_dma(&address_space_memory, addr, pte, MEMTXATTRS_UNSPECIFIED);
 
 if (ret != MEMTX_OK) {
 info->type = SMMU_PTW_ERR_WALK_EABT;
diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
index 270c80b665..cfb56725a6 100644
--- a/hw/arm/smmuv3.c
+++ b/hw/arm/smmuv3.c
@@ -98,20 +98,34 @@ static void smmuv3_write_gerrorn(SMMUv3State *s, uint32_t 
new_gerrorn)
 trace_smmuv3_write_gerrorn(toggled & pending, s->gerrorn);
 }
 
-static inline MemTxResult queue_read(SMMUQueue *q, void *data)
+static inline MemTxResult queue_read(SMMUQueue *q, Cmd *cmd)
 {
 dma_addr_t addr = Q_CONS_ENTRY(q);
+MemTxResult ret;
+int i;
 
-return dma_memory_read(&address_space_memory, addr, data, q->entry_size,
-   MEMTXATTRS_UNSPECIFIED);
+ret = dma_memory_read(&address_space_memory, addr, cmd, sizeof(Cmd),
+  MEMTXATTRS_UNSPECIFIED);
+if (ret != MEMTX_OK) {
+return ret;
+}
+for (i = 0; i < ARRAY_SIZE(cmd->word); i++) {
+le32_to_cpus(&cmd->word[i]);
+}
+return ret;
 }
 
-static MemTxResult queue_write(SMMUQueue *q, void *data)
+static MemTxResult queue_write(SMMUQueue *q, Evt *evt_in)
 {
 dma_addr_t addr = Q_PROD_ENTRY(q);
 MemTxResult ret;
+Evt evt = *evt_in;
+int i;
 
-ret = dma_memory_write(&address_space_memory, addr, data, q->entry_size,
+for (i = 0; i < ARRAY_SIZE(evt.word); i++) {
+cpu_to_le32s(&evt.word[i]);
+}
+ret = dma_memory_write(&address_space_memory, addr, &evt, sizeof(Evt),
MEMTXATTRS_UNSPECIFIED);
 if (ret != MEMTX_OK) {
 return ret;
@@ -291,7 +305,7 @@ static void smmuv3_init_regs(SMMUv3State *s)
 static int smmu_get_ste(SMMUv3State *s, dma_addr_t addr, STE *buf,
 SMMUEventInfo *event)
 {
-int ret;
+int ret, i;
 
 trace_smmuv3_get_ste(addr);
 /* TODO: guarantee 64-bit single-copy atomicity */
@@ -304,6 +318,9 @@ static int smmu_get_ste(SMMUv3State *s, dma_addr_t addr, 
STE *buf,
 event->u.f_ste_fetch.addr = addr;
 return -EINVAL;
 }
+for (i = 0; i < ARRAY_SIZE(buf->word); i++) {
+le32_to_cpus(&buf->word[i]);
+}
 return 0;
 
 }
@@ -313,7 +330,7 @@ static int smmu_get_cd(SMMUv3State *s, STE *ste, uint32_t 
ssid,
CD *buf, SMMUEventInfo *event)
 {
 dma_addr_t addr = STE_CTXPTR(ste);
-int ret;
+int ret, i;
 
 trace_smmuv3_get_cd(addr);
 /* TODO: guarantee 64-bit single-copy atomicity */
@@ -326,6 +343,9 @@ static int smmu_get_cd(SMMUv3State *s, STE *ste, uint32_t 
ssid,
 event->u.f_ste_fetch.addr = addr;
 return -EINVAL;
 }
+for (i = 0; i < ARRAY_SIZE(buf->word); i++) {
+le32_to_cpus(&buf->word[i]);
+}
 return 0;
 }
 
@@ -407,7 +427,7 @@ static int smmu_find_ste(SMMUv3State *s, uint32_t sid, STE 
*ste,
 return -EINVAL;
 }
 if (s->features & SMMU_FEATURE_2LVL_STE) {
-int l1_ste_offset, l2_ste_offset, max_l2_ste, span;
+int l1_ste_offset, l2_ste_offset, max_l2_ste, span, i;
 dma_addr_t l1ptr, l2ptr;
 STEDesc l1std;
 
@@ -431,6 +451,9 @@ static int smmu_find_ste(SMMUv3State *s, uint32_t sid, STE 
*ste,
 event->u.f_ste_fetch.addr = l1ptr;
 return -EINVAL;
 }
+for (i = 0; i < ARRAY_SIZE(l1std.word); i++) {
+  

[Stable-8.0.4 34/63] target/loongarch: Fix the CSRRD CPUID instruction on big endian hosts

2023-08-04 Thread Michael Tokarev
From: Thomas Huth 

The test in tests/avocado/machine_loongarch.py is currently failing
on big endian hosts like s390x. By comparing the traces between running
the QEMU_EFI.fd bios on a s390x and on a x86 host, it's quickly obvious
that the CSRRD instruction for the CPUID is behaving differently. And
indeed: The code currently does a long read (i.e. 64 bit) from the
address that points to the CPUState->cpu_index field (with tcg_gen_ld_tl()
in the trans_csrrd() function). But this cpu_index field is only an "int"
(i.e. 32 bit). While this dirty pointer magic works on little endian hosts,
it of course fails on big endian hosts. Fix it by using a proper helper
function instead.

Message-Id: <20230720175307.854460-1-th...@redhat.com>
Reviewed-by: Song Gao 
Signed-off-by: Thomas Huth 
(cherry picked from commit c34ad459926f6c600a55fe6782a27edfa405d60b)
Signed-off-by: Michael Tokarev 

diff --git a/target/loongarch/cpu.h b/target/loongarch/cpu.h
index e11c875188..4bf453e002 100644
--- a/target/loongarch/cpu.h
+++ b/target/loongarch/cpu.h
@@ -319,6 +319,7 @@ typedef struct CPUArchState {
 uint64_t CSR_DBG;
 uint64_t CSR_DERA;
 uint64_t CSR_DSAVE;
+uint64_t CSR_CPUID;
 
 #ifndef CONFIG_USER_ONLY
 LoongArchTLB  tlb[LOONGARCH_TLB_MAX];
diff --git a/target/loongarch/csr_helper.c b/target/loongarch/csr_helper.c
index 7e02787895..b778e6952d 100644
--- a/target/loongarch/csr_helper.c
+++ b/target/loongarch/csr_helper.c
@@ -36,6 +36,15 @@ target_ulong helper_csrrd_pgd(CPULoongArchState *env)
 return v;
 }
 
+target_ulong helper_csrrd_cpuid(CPULoongArchState *env)
+{
+LoongArchCPU *lac = env_archcpu(env);
+
+env->CSR_CPUID = CPU(lac)->cpu_index;
+
+return env->CSR_CPUID;
+}
+
 target_ulong helper_csrrd_tval(CPULoongArchState *env)
 {
 LoongArchCPU *cpu = env_archcpu(env);
diff --git a/target/loongarch/helper.h b/target/loongarch/helper.h
index 9c01823a26..f47b0f2d05 100644
--- a/target/loongarch/helper.h
+++ b/target/loongarch/helper.h
@@ -98,6 +98,7 @@ DEF_HELPER_1(rdtime_d, i64, env)
 #ifndef CONFIG_USER_ONLY
 /* CSRs helper */
 DEF_HELPER_1(csrrd_pgd, i64, env)
+DEF_HELPER_1(csrrd_cpuid, i64, env)
 DEF_HELPER_1(csrrd_tval, i64, env)
 DEF_HELPER_2(csrwr_estat, i64, env, tl)
 DEF_HELPER_2(csrwr_asid, i64, env, tl)
diff --git a/target/loongarch/insn_trans/trans_privileged.c.inc 
b/target/loongarch/insn_trans/trans_privileged.c.inc
index 5a04352b01..71d7f37717 100644
--- a/target/loongarch/insn_trans/trans_privileged.c.inc
+++ b/target/loongarch/insn_trans/trans_privileged.c.inc
@@ -99,13 +99,7 @@ static const CSRInfo csr_info[] = {
 CSR_OFF(PWCH),
 CSR_OFF(STLBPS),
 CSR_OFF(RVACFG),
-[LOONGARCH_CSR_CPUID] = {
-.offset = (int)offsetof(CPUState, cpu_index)
-  - (int)offsetof(LoongArchCPU, env),
-.flags = CSRFL_READONLY,
-.readfn = NULL,
-.writefn = NULL
-},
+CSR_OFF_FUNCS(CPUID, CSRFL_READONLY, gen_helper_csrrd_cpuid, NULL),
 CSR_OFF_FLAGS(PRCFG1, CSRFL_READONLY),
 CSR_OFF_FLAGS(PRCFG2, CSRFL_READONLY),
 CSR_OFF_FLAGS(PRCFG3, CSRFL_READONLY),
-- 
2.39.2




[Stable-8.0.4 45/63] thread-pool: signal "request_cond" while locked

2023-08-04 Thread Michael Tokarev
From: Anthony PERARD 

thread_pool_free() might have been called on the `pool`, which would
be a reason for worker_thread() to quit. In this case,
`pool->request_cond` has been destroyed.

If worker_thread() didn't manage to signal `request_cond` before it
was destroyed by thread_pool_free(), we got:
util/qemu-thread-posix.c:198: qemu_cond_signal: Assertion 
`cond->initialized' failed.

One backtrace:
__GI___assert_fail (assertion=0x5614abcb "cond->initialized", 
file=0x5614ab88 "util/qemu-thread-posix.c", line=198,
function=0x5614ad80 <__PRETTY_FUNCTION__.17104> "qemu_cond_signal") 
at assert.c:101
qemu_cond_signal (cond=0x7fffb800db30) at util/qemu-thread-posix.c:198
worker_thread (opaque=0x7fffb800dab0) at util/thread-pool.c:129
qemu_thread_start (args=0x7fffb8000b20) at util/qemu-thread-posix.c:505
start_thread (arg=) at pthread_create.c:486

Reported here:
https://lore.kernel.org/all/ZJwoK50FcnTSfFZ8@MacBook-Air-de-Roger.local/T/#u

To avoid the issue, keep the lock held while signalling `request_cond`.
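
For illustration, a minimal pthread sketch of the invariant being
restored (one condition variable instead of the pool's two; the point is
that the final signal happens before the worker drops the lock the
destroyer must take):

#include <pthread.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  cond = PTHREAD_COND_INITIALIZER;
static int cur_threads = 1;

static void *worker(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&lock);
    cur_threads--;
    pthread_cond_signal(&cond);   /* still locked: destroyer is held off */
    pthread_mutex_unlock(&lock);  /* only now can the destroyer proceed */
    return NULL;
}

int main(void)
{
    pthread_t t;
    pthread_create(&t, NULL, worker, NULL);

    pthread_mutex_lock(&lock);
    while (cur_threads > 0) {
        pthread_cond_wait(&cond, &lock);
    }
    pthread_mutex_unlock(&lock);
    pthread_cond_destroy(&cond);  /* safe: nobody can still be signalling */

    pthread_join(t, NULL);
    return 0;
}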

Fixes: 900fa208f506 ("thread-pool: replace semaphore with condition variable")
Signed-off-by: Anthony PERARD 
Reviewed-by: Stefan Hajnoczi 
Message-Id: <20230714152720.5077-1-anthony.per...@citrix.com>
Signed-off-by: Anthony PERARD 
(cherry picked from commit f4f71363fcdb1092ff64d2bba6f9af39570c2f2b)
Signed-off-by: Michael Tokarev 

diff --git a/util/thread-pool.c b/util/thread-pool.c
index 31113b5860..39accc9ebe 100644
--- a/util/thread-pool.c
+++ b/util/thread-pool.c
@@ -120,13 +120,13 @@ static void *worker_thread(void *opaque)
 
 pool->cur_threads--;
 qemu_cond_signal(&pool->worker_stopped);
-qemu_mutex_unlock(&pool->lock);
 
 /*
  * Wake up another thread, in case we got a wakeup but decided
  * to exit due to pool->cur_threads > pool->max_threads.
  */
 qemu_cond_signal(&pool->request_cond);
+qemu_mutex_unlock(&pool->lock);
 return NULL;
 }
 
-- 
2.39.2




[Stable-8.0.4 53/63] vhost: fix the fd leak

2023-08-04 Thread Michael Tokarev
From: Li Feng 

When vhost-user reconnects to the backend, the notifier should be
cleaned up. Otherwise, the fd resources will be exhausted.

Fixes: f9a09ca3ea ("vhost: add support for configure interrupt")

Signed-off-by: Li Feng 
Reviewed-by: Raphael Norwitz 
Message-Id: <20230731121018.2856310-2-fen...@smartx.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
Tested-by: Fiona Ebner 
(cherry picked from commit 18f2971ce403008d5e1c2875b483c9d1778143dc)
Signed-off-by: Michael Tokarev 

diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index 480e7f8048..f394d69a0f 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -2059,6 +2059,8 @@ void vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice 
*vdev, bool vrings)
 event_notifier_test_and_clear(
 &hdev->vqs[VHOST_QUEUE_NUM_CONFIG_INR].masked_config_notifier);
 event_notifier_test_and_clear(&vdev->config_notifier);
+event_notifier_cleanup(
+&hdev->vqs[VHOST_QUEUE_NUM_CONFIG_INR].masked_config_notifier);
 
 trace_vhost_dev_stop(hdev, vdev->name, vrings);
 
-- 
2.39.2




[Stable-8.0.4 55/63] cryptodev: Handle unexpected request to avoid crash

2023-08-04 Thread Michael Tokarev
From: zhenwei pi 

Generally the guest side should discover which services the device is
able to offer before issuing requests to the device.

However, it's also possible for a guest to break this rule. Handle
unexpected requests here to avoid a NULL pointer dereference.

Fixes: e7a775fd ('cryptodev: Account statistics')
Cc: Gonglei 
Cc: Mauro Matteo Cascella 
Cc: Xiao Lei 
Cc: Yongkang Jia 
Reported-by: Yiming Tao 
Signed-off-by: zhenwei pi 
Message-Id: <20230803024314.29962-3-pizhen...@bytedance.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
(cherry picked from commit 15b11a1da6a4b7c6b8bb37883f52b544dee2b8fd)
Signed-off-by: Michael Tokarev 

diff --git a/backends/cryptodev.c b/backends/cryptodev.c
index 94ca393cee..d3fe92d8c0 100644
--- a/backends/cryptodev.c
+++ b/backends/cryptodev.c
@@ -191,6 +191,11 @@ static int cryptodev_backend_account(CryptoDevBackend 
*backend,
 if (algtype == QCRYPTODEV_BACKEND_ALG_ASYM) {
 CryptoDevBackendAsymOpInfo *asym_op_info = op_info->u.asym_op_info;
 len = asym_op_info->src_len;
+
+if (unlikely(!backend->asym_stat)) {
+error_report("cryptodev: Unexpected asym operation");
+return -VIRTIO_CRYPTO_NOTSUPP;
+}
 switch (op_info->op_code) {
 case VIRTIO_CRYPTO_AKCIPHER_ENCRYPT:
 CryptodevAsymStatIncEncrypt(backend, len);
@@ -210,6 +215,11 @@ static int cryptodev_backend_account(CryptoDevBackend 
*backend,
 } else if (algtype == QCRYPTODEV_BACKEND_ALG_SYM) {
 CryptoDevBackendSymOpInfo *sym_op_info = op_info->u.sym_op_info;
 len = sym_op_info->src_len;
+
+if (unlikely(!backend->sym_stat)) {
+error_report("cryptodev: Unexpected sym operation");
+return -VIRTIO_CRYPTO_NOTSUPP;
+}
 switch (op_info->op_code) {
 case VIRTIO_CRYPTO_CIPHER_ENCRYPT:
 CryptodevSymStatIncEncrypt(backend, len);
-- 
2.39.2




[Stable-8.0.4 58/63] hw/i386/intel_iommu: Fix trivial endianness problems

2023-08-04 Thread Michael Tokarev
From: Thomas Huth 

After reading the guest memory with dma_memory_read(), we have
to make sure that we byteswap the little endian data to the host's
byte order.

Signed-off-by: Thomas Huth 
Message-Id: <20230802135723.178083-2-th...@redhat.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Peter Xu 
(cherry picked from commit cc2a08480e19007c05be8fe5b6893e20448954dc)
Signed-off-by: Michael Tokarev 

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index a62896759c..255a881ad0 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -755,6 +755,8 @@ static int vtd_get_pdire_from_pdir_table(dma_addr_t 
pasid_dir_base,
 return -VTD_FR_PASID_TABLE_INV;
 }
 
+pdire->val = le64_to_cpu(pdire->val);
+
 return 0;
 }
 
@@ -779,6 +781,9 @@ static int vtd_get_pe_in_pasid_leaf_table(IntelIOMMUState 
*s,
 pe, entry_size, MEMTXATTRS_UNSPECIFIED)) {
 return -VTD_FR_PASID_TABLE_INV;
 }
+for (size_t i = 0; i < ARRAY_SIZE(pe->val); i++) {
+pe->val[i] = le64_to_cpu(pe->val[i]);
+}
 
 /* Do translation type check */
 if (!vtd_pe_type_check(x86_iommu, pe)) {
-- 
2.39.2




[Stable-8.0.4 40/63] target/arm: Avoid writing to constant TCGv in trans_CSEL()

2023-08-04 Thread Michael Tokarev
From: Peter Maydell 

In commit 0b188ea05acb5 we changed the implementation of
trans_CSEL() to use tcg_constant_i32(). However, this change
was incorrect, because the implementation of the function
sets up the TCGv_i32 rn and rm to be either zero or else
a TCG temp created in load_reg(), and these TCG temps are
then in both cases written to by the emitted TCG ops.
The result is that we hit a TCG assertion:

qemu-system-arm: ../../tcg/tcg.c:4455: tcg_reg_alloc_mov: Assertion 
`!temp_readonly(ots)' failed.

(or on a non-debug build, just produce a garbage result)

Adjust the code so that rn and rm are always writeable
temporaries whether the instruction is using the special
case "0" or a normal register as input.

Cc: qemu-sta...@nongnu.org
Fixes: 0b188ea05acb5 ("target/arm: Use tcg_constant in trans_CSEL")
Signed-off-by: Peter Maydell 
Reviewed-by: Richard Henderson 
Message-id: 20230727103906.2641264-1-peter.mayd...@linaro.org
(cherry picked from commit 2b0d656ab6484cae7f174e194215a6d50343ecd2)
Signed-off-by: Michael Tokarev 

diff --git a/target/arm/tcg/translate.c b/target/arm/tcg/translate.c
index 7468476724..1e4d94e58a 100644
--- a/target/arm/tcg/translate.c
+++ b/target/arm/tcg/translate.c
@@ -8814,7 +8814,7 @@ static bool trans_IT(DisasContext *s, arg_IT *a)
 /* v8.1M CSEL/CSINC/CSNEG/CSINV */
 static bool trans_CSEL(DisasContext *s, arg_CSEL *a)
 {
-TCGv_i32 rn, rm, zero;
+TCGv_i32 rn, rm;
 DisasCompare c;
 
 if (!arm_dc_feature(s, ARM_FEATURE_V8_1M)) {
@@ -8832,16 +8832,17 @@ static bool trans_CSEL(DisasContext *s, arg_CSEL *a)
 }
 
 /* In this insn input reg fields of 0b mean "zero", not "PC" */
-zero = tcg_constant_i32(0);
+rn = tcg_temp_new_i32();
+rm = tcg_temp_new_i32();
 if (a->rn == 15) {
-rn = zero;
+tcg_gen_movi_i32(rn, 0);
 } else {
-rn = load_reg(s, a->rn);
+load_reg_var(s, rn, a->rn);
 }
 if (a->rm == 15) {
-rm = zero;
+tcg_gen_movi_i32(rm, 0);
 } else {
-rm = load_reg(s, a->rm);
+load_reg_var(s, rm, a->rm);
 }
 
 switch (a->op) {
@@ -8861,7 +8862,7 @@ static bool trans_CSEL(DisasContext *s, arg_CSEL *a)
 }
 
 arm_test_cc(&c, a->fcond);
-tcg_gen_movcond_i32(c.cond, rn, c.value, zero, rn, rm);
+tcg_gen_movcond_i32(c.cond, rn, c.value, tcg_constant_i32(0), rn, rm);
 
 store_reg(s, a->rd, rn);
 return true;
-- 
2.39.2




[Stable-8.0.4 46/63] xen-block: Avoid leaks on new error path

2023-08-04 Thread Michael Tokarev
From: Anthony PERARD 

Commit 189829399070 ("xen-block: Use specific blockdev driver")
introduced a new error path, without taking care of allocated
resources.

So only allocate the qdicts after the error check, and free both
`filename` and `driver` when we are about to return, thus taking
care of both the success and the error path.

Coverity only spotted the leak of qdicts (*_layer variables).

Reported-by: Peter Maydell 
Fixes: Coverity CID 1508722, 1398649
Fixes: 189829399070 ("xen-block: Use specific blockdev driver")
Signed-off-by: Anthony PERARD 
Reviewed-by: Paul Durrant 
Reviewed-by: Peter Maydell 
Message-Id: <20230704171819.42564-1-anthony.per...@citrix.com>
Signed-off-by: Anthony PERARD 
(cherry picked from commit aa36243514a777f76c8b8a19b1f8a71f27ec6c78)
Signed-off-by: Michael Tokarev 

diff --git a/hw/block/xen-block.c b/hw/block/xen-block.c
index f5a744589d..6ccb8a4a32 100644
--- a/hw/block/xen-block.c
+++ b/hw/block/xen-block.c
@@ -763,14 +763,15 @@ static XenBlockDrive *xen_block_drive_create(const char *id,
 drive = g_new0(XenBlockDrive, 1);
 drive->id = g_strdup(id);
 
-file_layer = qdict_new();
-driver_layer = qdict_new();
-
 rc = stat(filename, );
 if (rc) {
 error_setg_errno(errp, errno, "Could not stat file '%s'", filename);
 goto done;
 }
+
+file_layer = qdict_new();
+driver_layer = qdict_new();
+
 if (S_ISBLK(st.st_mode)) {
 qdict_put_str(file_layer, "driver", "host_device");
 } else {
@@ -778,7 +779,6 @@ static XenBlockDrive *xen_block_drive_create(const char *id,
 }
 
 qdict_put_str(file_layer, "filename", filename);
-g_free(filename);
 
 if (mode && *mode != 'w') {
 qdict_put_bool(file_layer, "read-only", true);
@@ -813,7 +813,6 @@ static XenBlockDrive *xen_block_drive_create(const char *id,
 qdict_put_str(file_layer, "locking", "off");
 
 qdict_put_str(driver_layer, "driver", driver);
-g_free(driver);
 
 qdict_put(driver_layer, "file", file_layer);
 
@@ -824,6 +823,8 @@ static XenBlockDrive *xen_block_drive_create(const char *id,
 qobject_unref(driver_layer);
 
 done:
+g_free(filename);
+g_free(driver);
 if (*errp) {
 xen_block_drive_destroy(drive, NULL);
 return NULL;
-- 
2.39.2




[Stable-8.0.4 41/63] util/interval-tree: Use qatomic_read for left/right while searching

2023-08-04 Thread Michael Tokarev
From: Richard Henderson 

Fixes a race condition (generally without optimization) in which
the subtree is re-read after the protecting if condition.

Cc: qemu-sta...@nongnu.org
Reviewed-by: Peter Maydell 
Signed-off-by: Richard Henderson 
(cherry picked from commit 055b86e0f0b4325117055d8d31c49011258f4af3)
Signed-off-by: Michael Tokarev 

diff --git a/util/interval-tree.c b/util/interval-tree.c
index 4c0baf108f..5a0ad21b2d 100644
--- a/util/interval-tree.c
+++ b/util/interval-tree.c
@@ -745,8 +745,9 @@ static IntervalTreeNode *interval_tree_subtree_search(IntervalTreeNode *node,
  * Loop invariant: start <= node->subtree_last
  * (Cond2 is satisfied by one of the subtree nodes)
  */
-if (node->rb.rb_left) {
-IntervalTreeNode *left = rb_to_itree(node->rb.rb_left);
+RBNode *tmp = qatomic_read(>rb.rb_left);
+if (tmp) {
+IntervalTreeNode *left = rb_to_itree(tmp);
 
 if (start <= left->subtree_last) {
 /*
@@ -765,8 +766,9 @@ static IntervalTreeNode *interval_tree_subtree_search(IntervalTreeNode *node,
 if (start <= node->last) { /* Cond2 */
 return node; /* node is leftmost match */
 }
-if (node->rb.rb_right) {
-node = rb_to_itree(node->rb.rb_right);
+tmp = qatomic_read(>rb.rb_right);
+if (tmp) {
+node = rb_to_itree(tmp);
 if (start <= node->subtree_last) {
 continue;
 }
@@ -814,8 +816,9 @@ IntervalTreeNode *interval_tree_iter_first(IntervalTreeRoot *root,
 IntervalTreeNode *interval_tree_iter_next(IntervalTreeNode *node,
   uint64_t start, uint64_t last)
 {
-RBNode *rb = node->rb.rb_right, *prev;
+RBNode *rb, *prev;
 
+rb = qatomic_read(>rb.rb_right);
 while (true) {
 /*
  * Loop invariants:
@@ -840,7 +843,7 @@ IntervalTreeNode *interval_tree_iter_next(IntervalTreeNode *node,
 }
 prev = >rb;
 node = rb_to_itree(rb);
-rb = node->rb.rb_right;
+rb = qatomic_read(>rb.rb_right);
 } while (prev == rb);
 
 /* Check if the node intersects [start;last] */
-- 
2.39.2




[Stable-8.0.4 42/63] util/interval-tree: Use qatomic_set_mb in rb_link_node

2023-08-04 Thread Michael Tokarev
From: Richard Henderson 

Ensure that the stores to rb_left and rb_right are complete before
inserting the new node into the tree.  Otherwise a concurrent reader
could see garbage in the new leaf.
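
The same publication idiom in plain C11 atomics, as an analogy (assuming
link is declared _Atomic; qatomic_mb_set issues a full barrier, which is
at least as strong as the release ordering shown here):

    node->rb_left = node->rb_right = NULL;        /* plain init stores */
    atomic_store_explicit(&link, node,
                          memory_order_release);  /* init is visible before
                                                   * the node is reachable */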

Cc: qemu-sta...@nongnu.org
Reviewed-by: Peter Maydell 
Signed-off-by: Richard Henderson 
(cherry picked from commit 4c8baa02d36379507afd17bdea87aabe0aa32ed3)
Signed-off-by: Michael Tokarev 
(Mjt: s/qatomic_set_mb/qatomic_mb_set/ for 8.0 - it was renamed later)

diff --git a/util/interval-tree.c b/util/interval-tree.c
index 5a0ad21b2d..2000cd2935 100644
--- a/util/interval-tree.c
+++ b/util/interval-tree.c
@@ -128,7 +128,11 @@ static inline void rb_link_node(RBNode *node, RBNode *parent, RBNode **rb_link)
 node->rb_parent_color = (uintptr_t)parent;
 node->rb_left = node->rb_right = NULL;
 
-qatomic_set(rb_link, node);
+/*
+ * Ensure that node is initialized before insertion,
+ * as viewed by a concurrent search.
+ */
+qatomic_mb_set(rb_link, node);
 }
 
 static RBNode *rb_next(RBNode *node)
-- 
2.39.2




[Stable-8.0.4 37/63] virtio-net: pass Device-TLB enable/disable events to vhost

2023-08-04 Thread Michael Tokarev
From: Viktor Prutyanov 

If vhost is enabled for virtio-net, Device-TLB enable/disable events
must be passed to vhost for proper IOMMU unmap flag selection.

Signed-off-by: Viktor Prutyanov 
Acked-by: Jason Wang 
Message-Id: <20230626091258.24453-3-vik...@daynix.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
(cherry picked from commit cd9b8346884353ba9ae6560b44b7cccdf00a6633)
Signed-off-by: Michael Tokarev 

diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index 5c0a771170..3b66c97e3d 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -3950,6 +3950,7 @@ static void virtio_net_class_init(ObjectClass *klass, void *data)
 vdc->vmsd = _virtio_net_device;
 vdc->primary_unplug_pending = primary_unplug_pending;
 vdc->get_vhost = virtio_net_get_vhost;
+vdc->toggle_device_iotlb = vhost_toggle_device_iotlb;
 }
 
 static const TypeInfo virtio_net_info = {
-- 
2.39.2




[Stable-8.0.4 39/63] target/arm: Special case M-profile in debug_helper.c code

2023-08-04 Thread Michael Tokarev
From: Peter Maydell 

A lot of the code called from helper_exception_bkpt_insn() is written
assuming A-profile, but we will also call this helper on M-profile
CPUs when they execute a BKPT insn.  This used to work by accident,
but recent changes mean that we will hit an assert when some of this
code calls down into lower level functions that end up calling
arm_security_space_below_el3(), arm_el_is_aa64(), and other functions
that now explicitly assert that the guest CPU is not M-profile.

Handle M-profile directly to avoid the assertions:
 * in arm_debug_target_el(), M-profile debug exceptions always
   go to EL1
 * in arm_debug_exception_fsr(), M-profile always uses the short
   format FSR (compare commit d7fe699be54b2, though in this case
   the code in arm_v7m_cpu_do_interrupt() does not need to
   look at the FSR value at all)

Cc: qemu-sta...@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1775
Signed-off-by: Peter Maydell 
Reviewed-by: Richard Henderson 
Message-id: 20230721143239.1753066-1-peter.mayd...@linaro.org
(cherry picked from commit 5d78893f39caf94c8587141e2219b57a7d63dd5c)
Signed-off-by: Michael Tokarev 

diff --git a/target/arm/debug_helper.c b/target/arm/debug_helper.c
index dfc8b2a1a5..0cbc8171d5 100644
--- a/target/arm/debug_helper.c
+++ b/target/arm/debug_helper.c
@@ -21,6 +21,10 @@ static int arm_debug_target_el(CPUARMState *env)
 bool secure = arm_is_secure(env);
 bool route_to_el2 = false;
 
+if (arm_feature(env, ARM_FEATURE_M)) {
+return 1;
+}
+
 if (arm_is_el2_enabled(env)) {
 route_to_el2 = env->cp15.hcr_el2 & HCR_TGE ||
env->cp15.mdcr_el2 & MDCR_TDE;
@@ -434,18 +438,20 @@ static uint32_t arm_debug_exception_fsr(CPUARMState *env)
 {
 ARMMMUFaultInfo fi = { .type = ARMFault_Debug };
 int target_el = arm_debug_target_el(env);
-bool using_lpae = false;
+bool using_lpae;
 
-if (target_el == 2 || arm_el_is_aa64(env, target_el)) {
+if (arm_feature(env, ARM_FEATURE_M)) {
+using_lpae = false;
+} else if (target_el == 2 || arm_el_is_aa64(env, target_el)) {
 using_lpae = true;
 } else if (arm_feature(env, ARM_FEATURE_PMSA) &&
arm_feature(env, ARM_FEATURE_V8)) {
 using_lpae = true;
+} else if (arm_feature(env, ARM_FEATURE_LPAE) &&
+   (env->cp15.tcr_el[target_el] & TTBCR_EAE)) {
+using_lpae = true;
 } else {
-if (arm_feature(env, ARM_FEATURE_LPAE) &&
-(env->cp15.tcr_el[target_el] & TTBCR_EAE)) {
-using_lpae = true;
-}
+using_lpae = false;
 }
 
 if (using_lpae) {
-- 
2.39.2




[Stable-8.0.4 47/63] io: remove io watch if TLS channel is closed during handshake

2023-08-04 Thread Michael Tokarev
From: Daniel P. Berrangé 

The TLS handshake may take some time to complete, during which time an
I/O watch might be registered with the main loop. If the owner of the
I/O channel invokes qio_channel_close() while the handshake is waiting
to continue, the I/O watch must be removed. Failing to remove it will
later trigger the completion callback which the owner is not expecting
to receive. In the case of the VNC server, this results in a SEGV as
vnc_disconnect_start() tries to shutdown a client connection that is
already gone / NULL.
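
The lifecycle rule the fix enforces, sketched with plain GLib calls (chan,
on_ready and state are hypothetical): keep the source tag returned at
registration and clear it before freeing anything the callback touches:

    guint tag = g_io_add_watch(chan, G_IO_IN, on_ready, state);
    /* ... owner tears the channel down while the watch is still armed ... */
    if (tag) {
        g_clear_handle_id(&tag, g_source_remove);  /* no callback after free */
    }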

CVE-2023-3354
Reported-by: jiangyegen 
Signed-off-by: Daniel P. Berrangé 
(cherry picked from commit 10be627d2b5ec2d6b3dce045144aa739eef678b4)
Signed-off-by: Michael Tokarev 

diff --git a/include/io/channel-tls.h b/include/io/channel-tls.h
index 5672479e9e..26c67f17e2 100644
--- a/include/io/channel-tls.h
+++ b/include/io/channel-tls.h
@@ -48,6 +48,7 @@ struct QIOChannelTLS {
 QIOChannel *master;
 QCryptoTLSSession *session;
 QIOChannelShutdown shutdown;
+guint hs_ioc_tag;
 };
 
 /**
diff --git a/io/channel-tls.c b/io/channel-tls.c
index 9805dd0a3f..847d5297c3 100644
--- a/io/channel-tls.c
+++ b/io/channel-tls.c
@@ -198,12 +198,13 @@ static void qio_channel_tls_handshake_task(QIOChannelTLS *ioc,
 }
 
 trace_qio_channel_tls_handshake_pending(ioc, status);
-qio_channel_add_watch_full(ioc->master,
-   condition,
-   qio_channel_tls_handshake_io,
-   data,
-   NULL,
-   context);
+ioc->hs_ioc_tag =
+qio_channel_add_watch_full(ioc->master,
+   condition,
+   qio_channel_tls_handshake_io,
+   data,
+   NULL,
+   context);
 }
 }
 
@@ -218,6 +219,7 @@ static gboolean qio_channel_tls_handshake_io(QIOChannel *ioc,
 QIOChannelTLS *tioc = QIO_CHANNEL_TLS(
 qio_task_get_source(task));
 
+tioc->hs_ioc_tag = 0;
 g_free(data);
 qio_channel_tls_handshake_task(tioc, task, context);
 
@@ -378,6 +380,10 @@ static int qio_channel_tls_close(QIOChannel *ioc,
 {
 QIOChannelTLS *tioc = QIO_CHANNEL_TLS(ioc);
 
+if (tioc->hs_ioc_tag) {
+g_clear_handle_id(>hs_ioc_tag, g_source_remove);
+}
+
 return qio_channel_close(tioc->master, errp);
 }
 
-- 
2.39.2




[Stable-8.0.4 25/63] qemu-nbd: regression with arguments passing into nbd_client_thread()

2023-08-04 Thread Michael Tokarev
From: "Denis V. Lunev" 

Unfortunately
commit 03b67621445d601c9cdc7dfe25812e9f19b81488
(8.0:  feb0814b3b48e75b336ad72eb303f9d579c94083)
Author: Denis V. Lunev 
Date:   Mon Jul 17 16:55:40 2023 +0200
qemu-nbd: pass structure into nbd_client_thread instead of plain char*
has introduced a regression. struct NbdClientOpts resides on stack inside
'if' block. This specifically means that this stack space could be reused
once the execution will leave that block of the code.

This means that parameters passed into nbd_client_thread could be
overwritten at any moment.

The patch moves the data to the scope of the main() function, effectively
preserving it for the whole process lifetime.
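
A self-contained reduction of the bug class (a hypothetical program, not
qemu-nbd itself): handing a thread the address of an object that lives in
an inner block is a use-after-scope once the block exits:

    #include <pthread.h>
    #include <stdio.h>

    struct opts { const char *device; };

    static void *worker(void *arg)
    {
        struct opts *o = arg;         /* may run after the block below exits */
        printf("%s\n", o->device);    /* use-after-scope: undefined behavior */
        return NULL;
    }

    int main(void)
    {
        pthread_t t;
        {
            struct opts o = { .device = "/dev/nbd0" }; /* dies with the block */
            pthread_create(&t, NULL, worker, &o);
        }                             /* o's stack slot may be reused here */
        pthread_join(t, NULL);
        return 0;
    }

Hoisting the object into main()'s own scope, as the patch does, keeps the
storage alive for as long as the thread can touch it.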

Signed-off-by: Denis V. Lunev 
CC: Eric Blake 
CC: Vladimir Sementsov-Ogievskiy 
CC: 
Reviewed-by: Eric Blake 
Message-ID: <20230727105828.324314-1-...@openvz.org>
Signed-off-by: Eric Blake 
(cherry picked from commit e5b815b0defcc3617f473ba70c3e675ef0ee69c2)
Signed-off-by: Michael Tokarev 
(Mjt: add reference to feb0814b3b48e75b336ad72eb303f9d579c94083 for 8.0 branch)

diff --git a/qemu-nbd.c b/qemu-nbd.c
index e64f45f767..1039809e9c 100644
--- a/qemu-nbd.c
+++ b/qemu-nbd.c
@@ -584,6 +584,9 @@ int main(int argc, char **argv)
 const char *pid_file_name = NULL;
 const char *selinux_label = NULL;
 BlockExportOptions *export_opts;
+#if HAVE_NBD_DEVICE
+struct NbdClientOpts opts;
+#endif
 
 #ifdef CONFIG_POSIX
 os_setup_early_signal_handling();
@@ -1120,7 +1123,7 @@ int main(int argc, char **argv)
 if (device) {
 #if HAVE_NBD_DEVICE
 int ret;
-struct NbdClientOpts opts = {
+opts = (struct NbdClientOpts) {
 .device = device,
 .fork_process = fork_process,
 };
-- 
2.39.2




[Stable-7.2.5 05/36] qemu-nbd: regression with arguments passing into nbd_client_thread()

2023-08-04 Thread Michael Tokarev
From: "Denis V. Lunev" 

Unfortunately
commit 03b67621445d601c9cdc7dfe25812e9f19b81488
(7.2:  6e216d21b56a7545a05080a370b5ca7491fecfb3)
Author: Denis V. Lunev 
Date:   Mon Jul 17 16:55:40 2023 +0200
qemu-nbd: pass structure into nbd_client_thread instead of plain char*
has introduced a regression: struct NbdClientOpts resides on the stack
inside an 'if' block. This specifically means that this stack space could
be reused once execution leaves that block of code.

This means that parameters passed into nbd_client_thread could be
overwritten at any moment.

The patch moves the data to the scope of the main() function, effectively
preserving it for the whole process lifetime.

Signed-off-by: Denis V. Lunev 
CC: Eric Blake 
CC: Vladimir Sementsov-Ogievskiy 
CC: 
Reviewed-by: Eric Blake 
Message-ID: <20230727105828.324314-1-...@openvz.org>
Signed-off-by: Eric Blake 
(cherry picked from commit e5b815b0defcc3617f473ba70c3e675ef0ee69c2)
Signed-off-by: Michael Tokarev 
(Mjt: add reference to 6e216d21b56a7545a05080a370b5ca7491fecfb3 for 7.2 branch)

diff --git a/qemu-nbd.c b/qemu-nbd.c
index bcdb74ff13..f71f5125d8 100644
--- a/qemu-nbd.c
+++ b/qemu-nbd.c
@@ -584,6 +584,9 @@ int main(int argc, char **argv)
 const char *pid_file_name = NULL;
 const char *selinux_label = NULL;
 BlockExportOptions *export_opts;
+#if HAVE_NBD_DEVICE
+struct NbdClientOpts opts;
+#endif
 
 #ifdef CONFIG_POSIX
 os_setup_early_signal_handling();
@@ -1122,7 +1125,7 @@ int main(int argc, char **argv)
 if (device) {
 #if HAVE_NBD_DEVICE
 int ret;
-struct NbdClientOpts opts = {
+opts = (struct NbdClientOpts) {
 .device = device,
 .fork_process = fork_process,
 };
-- 
2.39.2




[Stable-7.2.5 31/36] hw/i386/intel_iommu: Fix trivial endianness problems

2023-08-04 Thread Michael Tokarev
From: Thomas Huth 

After reading the guest memory with dma_memory_read(), we have
to make sure that we byteswap the little endian data to the host's
byte order.
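
As a minimal sketch of the convention (assuming the usual dma_memory_read()
signature; as and addr stand in for the address space and guest address):
guest-written VT-d structures are little endian in memory, so each word is
converted exactly once, right where it is read:

    uint64_t val;
    if (dma_memory_read(as, addr, &val, sizeof(val),
                        MEMTXATTRS_UNSPECIFIED) != MEMTX_OK) {
        return -VTD_FR_PASID_TABLE_INV;
    }
    val = le64_to_cpu(val);   /* no-op on LE hosts, byteswap on BE hosts */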

Signed-off-by: Thomas Huth 
Message-Id: <20230802135723.178083-2-th...@redhat.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Peter Xu 
(cherry picked from commit cc2a08480e19007c05be8fe5b6893e20448954dc)
Signed-off-by: Michael Tokarev 

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index d025ef2873..6dca977464 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -755,6 +755,8 @@ static int vtd_get_pdire_from_pdir_table(dma_addr_t pasid_dir_base,
 return -VTD_FR_PASID_TABLE_INV;
 }
 
+pdire->val = le64_to_cpu(pdire->val);
+
 return 0;
 }
 
@@ -779,6 +781,9 @@ static int vtd_get_pe_in_pasid_leaf_table(IntelIOMMUState *s,
 pe, entry_size, MEMTXATTRS_UNSPECIFIED)) {
 return -VTD_FR_PASID_TABLE_INV;
 }
+for (size_t i = 0; i < ARRAY_SIZE(pe->val); i++) {
+pe->val[i] = le64_to_cpu(pe->val[i]);
+}
 
 /* Do translation type check */
 if (!vtd_pe_type_check(x86_iommu, pe)) {
-- 
2.39.2




[Stable-7.2.5 22/36] xen-block: Avoid leaks on new error path

2023-08-04 Thread Michael Tokarev
From: Anthony PERARD 

Commit 189829399070 ("xen-block: Use specific blockdev driver")
introduced a new error path, without taking care of allocated
resources.

So only allocate the qdicts after the error check, and free both
`filename` and `driver` when we are about to return, thus taking
care of both the success and error paths.

Coverity only spotted the leak of qdicts (*_layer variables).

Reported-by: Peter Maydell 
Fixes: Coverity CID 1508722, 1398649
Fixes: 189829399070 ("xen-block: Use specific blockdev driver")
Signed-off-by: Anthony PERARD 
Reviewed-by: Paul Durrant 
Reviewed-by: Peter Maydell 
Message-Id: <20230704171819.42564-1-anthony.per...@citrix.com>
Signed-off-by: Anthony PERARD 
(cherry picked from commit aa36243514a777f76c8b8a19b1f8a71f27ec6c78)
Signed-off-by: Michael Tokarev 

diff --git a/hw/block/xen-block.c b/hw/block/xen-block.c
index 345b284d70..5e45a8b729 100644
--- a/hw/block/xen-block.c
+++ b/hw/block/xen-block.c
@@ -759,14 +759,15 @@ static XenBlockDrive *xen_block_drive_create(const char *id,
 drive = g_new0(XenBlockDrive, 1);
 drive->id = g_strdup(id);
 
-file_layer = qdict_new();
-driver_layer = qdict_new();
-
 rc = stat(filename, );
 if (rc) {
 error_setg_errno(errp, errno, "Could not stat file '%s'", filename);
 goto done;
 }
+
+file_layer = qdict_new();
+driver_layer = qdict_new();
+
 if (S_ISBLK(st.st_mode)) {
 qdict_put_str(file_layer, "driver", "host_device");
 } else {
@@ -774,7 +775,6 @@ static XenBlockDrive *xen_block_drive_create(const char *id,
 }
 
 qdict_put_str(file_layer, "filename", filename);
-g_free(filename);
 
 if (mode && *mode != 'w') {
 qdict_put_bool(file_layer, "read-only", true);
@@ -809,7 +809,6 @@ static XenBlockDrive *xen_block_drive_create(const char *id,
 qdict_put_str(file_layer, "locking", "off");
 
 qdict_put_str(driver_layer, "driver", driver);
-g_free(driver);
 
 qdict_put(driver_layer, "file", file_layer);
 
@@ -820,6 +819,8 @@ static XenBlockDrive *xen_block_drive_create(const char *id,
 qobject_unref(driver_layer);
 
 done:
+g_free(filename);
+g_free(driver);
 if (*errp) {
 xen_block_drive_destroy(drive, NULL);
 return NULL;
-- 
2.39.2




[Stable-7.2.5 28/36] virtio-crypto: verify src buffer length for sym request

2023-08-04 Thread Michael Tokarev
From: zhenwei pi 

For symmetric algorithms, the length of the ciphertext must be the same
as that of the plaintext.
The missing verification of the src_len and the dst_len in
virtio_crypto_sym_op_helper() may lead to a buffer overflow or data
disclosure.

This patch was originally written by Yiming Tao for QEMU-SECURITY;
it is resent here (with a few changes to the error message) in qemu-devel.

Fixes: CVE-2023-3180
Fixes: 04b9b37edda("virtio-crypto: add data queue processing handler")
Cc: Gonglei 
Cc: Mauro Matteo Cascella 
Cc: Yiming Tao 
Signed-off-by: zhenwei pi 
Message-Id: <20230803024314.29962-2-pizhen...@bytedance.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
(cherry picked from commit 9d38a8434721a6479fe03fb5afb150ca793d3980)
Signed-off-by: Michael Tokarev 

diff --git a/hw/virtio/virtio-crypto.c b/hw/virtio/virtio-crypto.c
index a6dbdd32da..406b4e5fd0 100644
--- a/hw/virtio/virtio-crypto.c
+++ b/hw/virtio/virtio-crypto.c
@@ -635,6 +635,11 @@ virtio_crypto_sym_op_helper(VirtIODevice *vdev,
 return NULL;
 }
 
+if (unlikely(src_len != dst_len)) {
+virtio_error(vdev, "sym request src len is different from dst len");
+return NULL;
+}
+
 max_len = (uint64_t)iv_len + aad_len + src_len + dst_len + hash_result_len;
 if (unlikely(max_len > vcrypto->conf.max_size)) {
 virtio_error(vdev, "virtio-crypto too big length");
-- 
2.39.2




[Stable-7.2.5 36/36] include/hw/i386/x86-iommu: Fix struct X86IOMMU_MSIMessage for big endian hosts

2023-08-04 Thread Michael Tokarev
From: Thomas Huth 

The first bitfield here is supposed to be used as a 64-bit equivalent
to the "uint64_t msi_addr" in the union. To make this work correctly
on big endian hosts, too, the __addr_hi field has to be part of the
bitfield, and the bitfield members must be declared with "uint64_t"
instead of "uint32_t" - otherwise the values are placed in the wrong
bytes on big endian hosts.

Same applies to the 32-bit "msi_data" field: __resved1 must be part
of the bitfield, and the members must be declared with "uint32_t"
instead of "uint16_t".

Signed-off-by: Thomas Huth 
Message-Id: <20230802135723.178083-7-th...@redhat.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
Reviewed-by: Peter Xu 
(cherry picked from commit e1e56c07d1fa24aa37a7e89e6633768fc8ea8705)
Signed-off-by: Michael Tokarev 

diff --git a/include/hw/i386/x86-iommu.h b/include/hw/i386/x86-iommu.h
index 7637edb430..02dc2fe9ee 100644
--- a/include/hw/i386/x86-iommu.h
+++ b/include/hw/i386/x86-iommu.h
@@ -88,40 +88,42 @@ struct X86IOMMU_MSIMessage {
 union {
 struct {
 #if HOST_BIG_ENDIAN
-uint32_t __addr_head:12; /* 0xfee */
-uint32_t dest:8;
-uint32_t __reserved:8;
-uint32_t redir_hint:1;
-uint32_t dest_mode:1;
-uint32_t __not_used:2;
+uint64_t __addr_hi:32;
+uint64_t __addr_head:12; /* 0xfee */
+uint64_t dest:8;
+uint64_t __reserved:8;
+uint64_t redir_hint:1;
+uint64_t dest_mode:1;
+uint64_t __not_used:2;
 #else
-uint32_t __not_used:2;
-uint32_t dest_mode:1;
-uint32_t redir_hint:1;
-uint32_t __reserved:8;
-uint32_t dest:8;
-uint32_t __addr_head:12; /* 0xfee */
+uint64_t __not_used:2;
+uint64_t dest_mode:1;
+uint64_t redir_hint:1;
+uint64_t __reserved:8;
+uint64_t dest:8;
+uint64_t __addr_head:12; /* 0xfee */
+uint64_t __addr_hi:32;
 #endif
-uint32_t __addr_hi;
 } QEMU_PACKED;
 uint64_t msi_addr;
 };
 union {
 struct {
 #if HOST_BIG_ENDIAN
-uint16_t trigger_mode:1;
-uint16_t level:1;
-uint16_t __resved:3;
-uint16_t delivery_mode:3;
-uint16_t vector:8;
+uint32_t __resved1:16;
+uint32_t trigger_mode:1;
+uint32_t level:1;
+uint32_t __resved:3;
+uint32_t delivery_mode:3;
+uint32_t vector:8;
 #else
-uint16_t vector:8;
-uint16_t delivery_mode:3;
-uint16_t __resved:3;
-uint16_t level:1;
-uint16_t trigger_mode:1;
+uint32_t vector:8;
+uint32_t delivery_mode:3;
+uint32_t __resved:3;
+uint32_t level:1;
+uint32_t trigger_mode:1;
+uint32_t __resved1:16;
 #endif
-uint16_t __resved1;
 } QEMU_PACKED;
 uint32_t msi_data;
 };
-- 
2.39.2




Re: [PATCH v3 16/17] i386: Use CPUCacheInfo.share_level to encode CPUID[0x8000001D].EAX[bits 25:14]

2023-08-04 Thread Moger, Babu
Hi Zhao,

On 8/4/23 04:56, Zhao Liu wrote:
> Hi Babu,
> 
> On Thu, Aug 03, 2023 at 03:44:13PM -0500, Moger, Babu wrote:
>> Date: Thu, 3 Aug 2023 15:44:13 -0500
>> From: "Moger, Babu" 
>> Subject: Re: [PATCH v3 16/17] i386: Use CPUCacheInfo.share_level to encode
>>  CPUID[0x8000001D].EAX[bits 25:14]
>>
>> Hi Zhao,
>>   Please copy the thread to k...@vger.kernel.org also.  It makes it easier
>> to browse.
>>
> 
> OK. I'm not sure how to cc, should I forward all mail to KVM for the
> current version (v3), or should I cc the kvm mail list for the next
> version (v4)?

Yes. From v4.
Thanks
Babu
> 
>>
>> On 8/1/23 05:35, Zhao Liu wrote:
>>> From: Zhao Liu 
>>>
>>> CPUID[0x8000001D].EAX[bits 25:14] is used to represent the cache
>>> topology for AMD CPUs.
>> Please change this to:
>>
>>
>> CPUID[0x8000001D].EAX[bits 25:14] NumSharingCache: number of logical
>> processors sharing cache. The number of logical processors sharing
>> this cache is NumSharingCache + 1.
> 
> OK.
> 
> Thanks,
> Zhao
> 
>>
>>>
>>> After cache models have topology information, we can use
>>> CPUCacheInfo.share_level to decide which topology level to be encoded
>>> into CPUID[0x8000001D].EAX[bits 25:14].
>>>
>>> Signed-off-by: Zhao Liu 
>>> ---
>>> Changes since v1:
>>>  * Use cache->share_level as the parameter in
>>>max_processor_ids_for_cache().
>>> ---
>>>  target/i386/cpu.c | 10 +-
>>>  1 file changed, 1 insertion(+), 9 deletions(-)
>>>
>>> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
>>> index f67b6be10b8d..6eee0274ade4 100644
>>> --- a/target/i386/cpu.c
>>> +++ b/target/i386/cpu.c
>>> @@ -361,20 +361,12 @@ static void encode_cache_cpuid8000001d(CPUCacheInfo *cache,
>>> uint32_t *eax, uint32_t *ebx,
>>> uint32_t *ecx, uint32_t *edx)
>>>  {
>>> -uint32_t num_apic_ids;
>>>  assert(cache->size == cache->line_size * cache->associativity *
>>>cache->partitions * cache->sets);
>>>  
>>>  *eax = CACHE_TYPE(cache->type) | CACHE_LEVEL(cache->level) |
>>> (cache->self_init ? CACHE_SELF_INIT_LEVEL : 0);
>>> -
>>> -/* L3 is shared among multiple cores */
>>> -if (cache->level == 3) {
>>> -num_apic_ids = 1 << apicid_die_offset(topo_info);
>>> -} else {
>>> -num_apic_ids = 1 << apicid_core_offset(topo_info);
>>> -}
>>> -*eax |= (num_apic_ids - 1) << 14;
>>> +*eax |= max_processor_ids_for_cache(topo_info, cache->share_level) << 
>>> 14;
>>>  
>>>  assert(cache->line_size > 0);
>>>  assert(cache->partitions > 0);
>>
>> -- 
>> Thanks
>> Babu Moger

-- 
Thanks
Babu Moger



[Stable-7.2.5 35/36] hw/i386/x86-iommu: Fix endianness issue in x86_iommu_irq_to_msi_message()

2023-08-04 Thread Michael Tokarev
From: Thomas Huth 

The values in "msg" are assembled in host endian byte order (the other
field are also not swapped), so we must not swap the __addr_head here.

Signed-off-by: Thomas Huth 
Message-Id: <20230802135723.178083-6-th...@redhat.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Peter Xu 
(cherry picked from commit 37cf5cecb039a063c0abe3b51ae30f969e73aa84)
Signed-off-by: Michael Tokarev 

diff --git a/hw/i386/x86-iommu.c b/hw/i386/x86-iommu.c
index 01d11325a6..726e9e1d16 100644
--- a/hw/i386/x86-iommu.c
+++ b/hw/i386/x86-iommu.c
@@ -63,7 +63,7 @@ void x86_iommu_irq_to_msi_message(X86IOMMUIrq *irq, MSIMessage *msg_out)
 msg.redir_hint = irq->redir_hint;
 msg.dest = irq->dest;
 msg.__addr_hi = irq->dest & 0xff00;
-msg.__addr_head = cpu_to_le32(0xfee);
+msg.__addr_head = 0xfee;
 /* Keep this from original MSI address bits */
 msg.__not_used = irq->msi_addr_last_bits;
 
-- 
2.39.2




[Stable-7.2.5 32/36] hw/i386/intel_iommu: Fix endianness problems related to VTD_IR_TableEntry

2023-08-04 Thread Michael Tokarev
From: Thomas Huth 

The code already tries to do some endianness handling here, but
currently fails badly:
- While it already swaps the data when logging errors / tracing, it fails
  to byteswap the value before e.g. accessing entry->irte.present
- entry->irte.source_id is swapped with le32_to_cpu(), though this is
  a 16-bit value
- The whole union is apparently supposed to be swapped via the 64-bit
  data[2] array, but the struct is a mixture between 32 bit values
  (the first 8 bytes) and 64 bit values (the second 8 bytes), so this
  cannot work as expected.

Fix it by converting the struct to two proper 64-bit bitfields, and
by swapping the values only once for everybody right after reading
the data from memory.
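
A small illustration of the width mismatch called out above (hypothetical
values): applying a 32-bit swap to a 16-bit quantity does not merely
reorder two bytes, it shifts the value out of range entirely on a big
endian host:

    uint16_t sid = 0x1234;
    uint16_t bad = le32_to_cpu(sid);  /* BE: bswap32(0x00001234) = 0x34120000,
                                       * truncated to 0x0000; value destroyed */
    uint16_t ok  = le16_to_cpu(sid);  /* BE: 0x3412, the intended swap */

Converting the two 64-bit words once, immediately after they are read from
guest memory, removes the need for any per-field swapping afterwards.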

Signed-off-by: Thomas Huth 
Message-Id: <20230802135723.178083-3-th...@redhat.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
Reviewed-by: Peter Xu 
(cherry picked from commit 642ba89672279fbdd14016a90da239c85e845d18)
Signed-off-by: Michael Tokarev 

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 6dca977464..e8d25e2f19 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -3323,14 +3323,15 @@ static int vtd_irte_get(IntelIOMMUState *iommu, uint16_t index,
 return -VTD_FR_IR_ROOT_INVAL;
 }
 
-trace_vtd_ir_irte_get(index, le64_to_cpu(entry->data[1]),
-  le64_to_cpu(entry->data[0]));
+entry->data[0] = le64_to_cpu(entry->data[0]);
+entry->data[1] = le64_to_cpu(entry->data[1]);
+
+trace_vtd_ir_irte_get(index, entry->data[1], entry->data[0]);
 
 if (!entry->irte.present) {
 error_report_once("%s: detected non-present IRTE "
   "(index=%u, high=0x%" PRIx64 ", low=0x%" PRIx64 ")",
-  __func__, index, le64_to_cpu(entry->data[1]),
-  le64_to_cpu(entry->data[0]));
+  __func__, index, entry->data[1], entry->data[0]);
 return -VTD_FR_IR_ENTRY_P;
 }
 
@@ -3338,14 +3339,13 @@ static int vtd_irte_get(IntelIOMMUState *iommu, uint16_t index,
 entry->irte.__reserved_2) {
 error_report_once("%s: detected non-zero reserved IRTE "
   "(index=%u, high=0x%" PRIx64 ", low=0x%" PRIx64 ")",
-  __func__, index, le64_to_cpu(entry->data[1]),
-  le64_to_cpu(entry->data[0]));
+  __func__, index, entry->data[1], entry->data[0]);
 return -VTD_FR_IR_IRTE_RSVD;
 }
 
 if (sid != X86_IOMMU_SID_INVALID) {
 /* Validate IRTE SID */
-source_id = le32_to_cpu(entry->irte.source_id);
+source_id = entry->irte.source_id;
 switch (entry->irte.sid_vtype) {
 case VTD_SVT_NONE:
 break;
@@ -3399,7 +3399,7 @@ static int vtd_remap_irq_get(IntelIOMMUState *iommu, uint16_t index,
 irq->trigger_mode = irte.irte.trigger_mode;
 irq->vector = irte.irte.vector;
 irq->delivery_mode = irte.irte.delivery_mode;
-irq->dest = le32_to_cpu(irte.irte.dest_id);
+irq->dest = irte.irte.dest_id;
 if (!iommu->intr_eime) {
 #define  VTD_IR_APIC_DEST_MASK (0xff00ULL)
 #define  VTD_IR_APIC_DEST_SHIFT(8)
diff --git a/include/hw/i386/intel_iommu.h b/include/hw/i386/intel_iommu.h
index 46d973e629..7660dda768 100644
--- a/include/hw/i386/intel_iommu.h
+++ b/include/hw/i386/intel_iommu.h
@@ -142,37 +142,39 @@ enum {
 union VTD_IR_TableEntry {
 struct {
 #if HOST_BIG_ENDIAN
-uint32_t __reserved_1:8; /* Reserved 1 */
-uint32_t vector:8;   /* Interrupt Vector */
-uint32_t irte_mode:1;/* IRTE Mode */
-uint32_t __reserved_0:3; /* Reserved 0 */
-uint32_t __avail:4;  /* Available spaces for software */
-uint32_t delivery_mode:3;/* Delivery Mode */
-uint32_t trigger_mode:1; /* Trigger Mode */
-uint32_t redir_hint:1;   /* Redirection Hint */
-uint32_t dest_mode:1;/* Destination Mode */
-uint32_t fault_disable:1;/* Fault Processing Disable */
-uint32_t present:1;  /* Whether entry present/available */
+uint64_t dest_id:32; /* Destination ID */
+uint64_t __reserved_1:8; /* Reserved 1 */
+uint64_t vector:8;   /* Interrupt Vector */
+uint64_t irte_mode:1;/* IRTE Mode */
+uint64_t __reserved_0:3; /* Reserved 0 */
+uint64_t __avail:4;  /* Available spaces for software */
+uint64_t delivery_mode:3;/* Delivery Mode */
+uint64_t trigger_mode:1; /* Trigger Mode */
+uint64_t redir_hint:1;   /* Redirection Hint */
+uint64_t dest_mode:1;/* Destination Mode */
+uint64_t fault_disable:1;/* Fault Processing Disable */
+uint64_t present:1;  /* Whether entry present/available */
 #else
-uint32_t present:1;  /* Whether entry present/available */

[Stable-7.2.5 13/36] virtio-pci: add handling of PCI ATS and Device-TLB enable/disable

2023-08-04 Thread Michael Tokarev
From: Viktor Prutyanov 

According to the PCIe Address Translation Services specification,
section 5.1.3, the ATS Control Register has an Enable bit to
enable/disable ATS. The guest may enable/disable PCI ATS and,
accordingly, the Device-TLB for the VirtIO PCI
device. So, raise/lower a flag and call a trigger function to pass this
event to a device implementation.

Signed-off-by: Viktor Prutyanov 
Message-Id: <20230512135122.70403-2-vik...@daynix.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
(cherry picked from commit 206e91d143301414df2deb48a411e402414ba6db)
Signed-off-by: Michael Tokarev 
(Mjt: include/hw/virtio/virtio.h: skip extra struct field added in 8.0)

diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
index a1c9dfa7bb..67e771c373 100644
--- a/hw/virtio/virtio-pci.c
+++ b/hw/virtio/virtio-pci.c
@@ -631,6 +631,38 @@ virtio_address_space_read(VirtIOPCIProxy *proxy, hwaddr 
addr,
 }
 }
 
+static void virtio_pci_ats_ctrl_trigger(PCIDevice *pci_dev, bool enable)
+{
+VirtIOPCIProxy *proxy = VIRTIO_PCI(pci_dev);
+VirtIODevice *vdev = virtio_bus_get_device(>bus);
+VirtioDeviceClass *k = VIRTIO_DEVICE_GET_CLASS(vdev);
+
+vdev->device_iotlb_enabled = enable;
+
+if (k->toggle_device_iotlb) {
+k->toggle_device_iotlb(vdev);
+}
+}
+
+static void pcie_ats_config_write(PCIDevice *dev, uint32_t address,
+  uint32_t val, int len)
+{
+uint32_t off;
+uint16_t ats_cap = dev->exp.ats_cap;
+
+if (!ats_cap || address < ats_cap) {
+return;
+}
+off = address - ats_cap;
+if (off >= PCI_EXT_CAP_ATS_SIZEOF) {
+return;
+}
+
+if (range_covers_byte(off, len, PCI_ATS_CTRL + 1)) {
+virtio_pci_ats_ctrl_trigger(dev, !!(val & PCI_ATS_CTRL_ENABLE));
+}
+}
+
 static void virtio_write_config(PCIDevice *pci_dev, uint32_t address,
 uint32_t val, int len)
 {
@@ -644,6 +676,10 @@ static void virtio_write_config(PCIDevice *pci_dev, uint32_t address,
 pcie_cap_flr_write_config(pci_dev, address, val, len);
 }
 
+if (proxy->flags & VIRTIO_PCI_FLAG_ATS) {
+pcie_ats_config_write(pci_dev, address, val, len);
+}
+
 if (range_covers_byte(address, len, PCI_COMMAND)) {
 if (!(pci_dev->config[PCI_COMMAND] & PCI_COMMAND_MASTER)) {
 virtio_set_disabled(vdev, true);
diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
index acfd4df125..96a56430a6 100644
--- a/include/hw/virtio/virtio.h
+++ b/include/hw/virtio/virtio.h
@@ -135,6 +135,7 @@ struct VirtIODevice
 AddressSpace *dma_as;
 QLIST_HEAD(, VirtQueue) *vector_queues;
 QTAILQ_ENTRY(VirtIODevice) next;
+bool device_iotlb_enabled;
 };
 
 struct VirtioDeviceClass {
@@ -192,6 +193,7 @@ struct VirtioDeviceClass {
 const VMStateDescription *vmsd;
 bool (*primary_unplug_pending)(void *opaque);
 struct vhost_dev *(*get_vhost)(VirtIODevice *vdev);
+void (*toggle_device_iotlb)(VirtIODevice *vdev);
 };
 
 void virtio_instance_init_common(Object *proxy_obj, void *data,
-- 
2.39.2




[Stable-7.2.5 00/36] Patch Round-up for stable 7.2.5, freeze on 2023-08-05

2023-08-04 Thread Michael Tokarev
The following patches are queued for QEMU stable v7.2.5:

  https://gitlab.com/qemu-project/qemu/-/commits/staging-7.2

Patch freeze is 2023-08-05, and the release is planned for 2023-08-07:

  https://wiki.qemu.org/Planning/7.2

Please respond here or CC qemu-sta...@nongnu.org on any additional patches
you think should (or shouldn't) be included in the release.

The changes which are staging for inclusion, with the original commit hash
from master branch, are given below the bottom line.

Thanks!

/mjt

--
01* 230dfd9257e9 Olaf Hering:
   hw/ide/piix: properly initialize the BMIBA register
02* d921fea338c1 Mauro Matteo Cascella:
   ui/vnc-clipboard: fix infinite loop in inflate_buffer (CVE-2023-3255)
03* 03b67621445d Denis V. Lunev:
   qemu-nbd: pass structure into nbd_client_thread instead of plain char*
04* 5c56dd27a2c9 Denis V. Lunev:
   qemu-nbd: fix regression with qemu-nbd --fork run over ssh
05 e5b815b0defc Denis V. Lunev:
   qemu-nbd: regression with arguments passing into nbd_client_thread()
06* 761b0aa9381e Ilya Leoshkevich:
   target/s390x: Make CKSM raise an exception if R2 is odd
07* 4b6e4c0b8223 Ilya Leoshkevich:
   target/s390x: Fix CLM with M3=0
08* 53684e344a27 Ilya Leoshkevich:
   target/s390x: Fix CONVERT TO LOGICAL/FIXED with out-of-range inputs
09* a2025557ed4d Ilya Leoshkevich:
   target/s390x: Fix ICM with M3=0
10* 9c028c057adc Ilya Leoshkevich:
   target/s390x: Make MC raise specification exception when class >= 16
11* ff537b0370ab Ilya Leoshkevich:
   target/s390x: Fix assertion failure in VFMIN/VFMAX with type 13
12* c34ad459926f Thomas Huth:
   target/loongarch: Fix the CSRRD CPUID instruction on big endian hosts
13 206e91d14330 Viktor Prutyanov:
   virtio-pci: add handling of PCI ATS and Device-TLB enable/disable
14* ee071f67f7a1 Viktor Prutyanov:
   vhost: register and change IOMMU flag depending on Device-TLB state
15* cd9b83468843 Viktor Prutyanov:
   virtio-net: pass Device-TLB enable/disable events to vhost
16 c6445544d4ce Peter Maydell:
   hw/arm/smmu: Handle big-endian hosts correctly
17 2b0d656ab648 Peter Maydell:
   target/arm: Avoid writing to constant TCGv in trans_CSEL()
18 2e718e665706 Richard Henderson:
   target/ppc: Disable goto_tb with architectural singlestep
19 38dd78c41eaf Helge Deller:
   linux-user/armeb: Fix __kernel_cmpxchg() for armeb
20 07ce178a2b07 Konstantin Kostiuk:
   qga/win32: Use rundll for VSS installation
21 f4f71363fcdb Anthony PERARD:
   thread-pool: signal "request_cond" while locked
22 aa36243514a7 Anthony PERARD:
   xen-block: Avoid leaks on new error path
23 10be627d2b5e Daniel P. Berrangé:
   io: remove io watch if TLS channel is closed during handshake
24 c11d5bdae79a Keith Packard:
   target/nios2: Pass semihosting arg to exit
25 71e2dd6aa1bd Keith Packard:
   target/nios2: Fix semihost lseek offset computation
26 8caaae7319a5 Peter Maydell:
   target/m68k: Fix semihost lseek offset computation
27 cf2f89edf36a Eric Auger:
   hw/virtio-iommu: Fix potential OOB access in virtio_iommu_handle_command()
28 9d38a8434721 zhenwei pi:
   virtio-crypto: verify src buffer length for sym request
29 f8c0fd9804f4 Helge Deller:
   target/hppa: Move iaoq registers and thus reduce generated code size
30 348e354417b6 Yuri Benditovich:
   pci: do not respond config requests after PCI device eject
31 cc2a08480e19 Thomas Huth:
   hw/i386/intel_iommu: Fix trivial endianness problems
32 642ba8967227 Thomas Huth:
   hw/i386/intel_iommu: Fix endianness problems related to VTD_IR_TableEntry
33 4572b22cf9ba Thomas Huth:
   hw/i386/intel_iommu: Fix struct VTDInvDescIEC on big endian hosts
34 fcd802742330 Thomas Huth:
   hw/i386/intel_iommu: Fix index calculation in vtd_interrupt_remap_msi()
35 37cf5cecb039 Thomas Huth:
   hw/i386/x86-iommu: Fix endianness issue in x86_iommu_irq_to_msi_message()
36 e1e56c07d1fa Thomas Huth:
   include/hw/i386/x86-iommu: Fix struct X86IOMMU_MSIMessage for big endian 
   hosts

(commit(s) marked with * were in previous series and are not resent)



[Stable-7.2.5 34/36] hw/i386/intel_iommu: Fix index calculation in vtd_interrupt_remap_msi()

2023-08-04 Thread Michael Tokarev
From: Thomas Huth 

The values in "addr" are populated locally in this function in host
endian byte order, so we must not swap the index_l field here.

Signed-off-by: Thomas Huth 
Message-Id: <20230802135723.178083-5-th...@redhat.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Peter Xu 
(cherry picked from commit fcd8027423300b201b37842b88393dc5c6c8ee9e)
Signed-off-by: Michael Tokarev 

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index e8d25e2f19..6640b669e2 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -3454,7 +3454,7 @@ static int vtd_interrupt_remap_msi(IntelIOMMUState *iommu,
 goto out;
 }
 
-index = addr.addr.index_h << 15 | le16_to_cpu(addr.addr.index_l);
+index = addr.addr.index_h << 15 | addr.addr.index_l;
 
 #define  VTD_IR_MSI_DATA_SUBHANDLE   (0x)
 #define  VTD_IR_MSI_DATA_RESERVED(0x)
-- 
2.39.2




[Stable-7.2.5 33/36] hw/i386/intel_iommu: Fix struct VTDInvDescIEC on big endian hosts

2023-08-04 Thread Michael Tokarev
From: Thomas Huth 

On big endian hosts, we need to reverse the bitfield order in the
struct VTDInvDescIEC, just like it is already done for the other
bitfields in the various structs of the intel-iommu device.

Signed-off-by: Thomas Huth 
Message-Id: <20230802135723.178083-4-th...@redhat.com>
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
Reviewed-by: Peter Xu 
(cherry picked from commit 4572b22cf9ba432fa3955686853c706a1821bbc7)
Signed-off-by: Michael Tokarev 

diff --git a/hw/i386/intel_iommu_internal.h b/hw/i386/intel_iommu_internal.h
index f090e61e11..e4d43ce48c 100644
--- a/hw/i386/intel_iommu_internal.h
+++ b/hw/i386/intel_iommu_internal.h
@@ -321,12 +321,21 @@ typedef enum VTDFaultReason {
 
 /* Interrupt Entry Cache Invalidation Descriptor: VT-d 6.5.2.7. */
 struct VTDInvDescIEC {
+#if HOST_BIG_ENDIAN
+uint64_t reserved_2:16;
+uint64_t index:16;  /* Start index to invalidate */
+uint64_t index_mask:5;  /* 2^N for continuous int invalidation */
+uint64_t resved_1:22;
+uint64_t granularity:1; /* If set, it's global IR invalidation */
+uint64_t type:4;/* Should always be 0x4 */
+#else
 uint32_t type:4;/* Should always be 0x4 */
 uint32_t granularity:1; /* If set, it's global IR invalidation */
 uint32_t resved_1:22;
 uint32_t index_mask:5;  /* 2^N for continuous int invalidation */
 uint32_t index:16;  /* Start index to invalidate */
 uint32_t reserved_2:16;
+#endif
 };
 typedef struct VTDInvDescIEC VTDInvDescIEC;
 
-- 
2.39.2



