Re: [PATCH v2 20/35] docs: add a new section to outline emulation support

2023-01-24 Thread Thomas Huth

On 24/01/2023 19.01, Alex Bennée wrote:

This affects both system and user mode emulation so we should probably
list it up front.

Acked-by: Richard Henderson 
Signed-off-by: Alex Bennée 

---
v2
   - HPs -> HP's
   - MIPs-like -> MIPS-like
---
  docs/about/emulation.rst  | 103 ++
  docs/about/index.rst  |   1 +
  docs/devel/tcg-plugins.rst|   2 +
  docs/system/arm/emulation.rst |   2 +
  4 files changed, 108 insertions(+)
  create mode 100644 docs/about/emulation.rst

diff --git a/docs/about/emulation.rst b/docs/about/emulation.rst
new file mode 100644
index 00..bdc0630b35
--- /dev/null
+++ b/docs/about/emulation.rst
@@ -0,0 +1,103 @@
+Emulation
+=========
+
+QEMU's Tiny Code Generator (TCG) gives it the ability to emulate a


I'd maybe rather say "provides" instead of "gives it".


+number of CPU architectures on any supported platform. Both


I'd maybe add a "host" between "supported" and "platform".


+:ref:`System Emulation` and :ref:`User Mode Emulation` are supported
+depending on the guest architecture.
+
+.. list-table:: Supported Guest Architectures for Emulation
+  :widths: 30 10 10 50
+  :header-rows: 1
+
+  * - Architecture (qemu name)
+- System
+- User-mode


Maybe just use "User" instead of "User-mode" to make the column smaller?


+- Notes
+  * - Alpha
+- Yes
+- Yes
+- Legacy 64 bit RISC ISA developed by DEC
+  * - Arm (arm, aarch64)
+- Yes
+- Yes
+- Wide range of features, see :ref:`Arm Emulation` for details
+  * - AVR
+- Yes
+- No
+- 8 bit micro controller, often used in maker projects
+  * - Cris
+- Yes
+- Yes
+- Embedded RISC chip developed by AXIS
+  * - Hexagon
+- No
+- Yes
+- Family of DSPs by Qualcomm
+  * - PA-RISC (hppa)
+- Yes
+- Yes
+- A legacy RISC system used in HP's old minicomputers
+  * - x86 (i386, x86_64)
+- Yes
+- Yes
+- The ubiquitous desktop PC CPU architecture, 32 and 64 bit.
+  * - Loongarch
+- Yes
+- Yes
+- A MIPS-like 64bit RISC architecture developed in China
+  * - m68k
+- Yes


Would it be possible to link the "Yes" entries in the "System" column to 
corresponding target-*.rst files? E.g. docs/system/target-m68k.rst for the 
m68k entry?



+- Yes
+- Motorola 68000 variants and ColdFire
+  * - Microblaze
+- Yes
+- Yes
+- RISC based soft-core by Xilinx
+  * - MIPS (mips, mipsel, mips64, mips64el)


The table renders very badly for me, the last column is cut off and you need 
to scroll to see its contents. This seems mainly to happen since this MIPS 
entry is very long. Could the information in the parentheses maybe be 
shortened to "(mips*)" or be dropped completely?


 Thomas




Re: [PATCH v2 22/35] docs: add an introduction to the system docs

2023-01-24 Thread Thomas Huth

On 24/01/2023 19.01, Alex Bennée wrote:

Drop the frankly misleading quickstart section for a more rounded
introduction section. This new section gives an overview of the
accelerators as well as a high level introduction to some of the key
features of the emulator. We also expand on a general form for a QEMU
command line with a hopefully not too scary worked example of what
this looks like.

Acked-by: Richard Henderson 
Signed-off-by: Alex Bennée 
Reviewed-by: Kashyap Chamarthy 

...

+That said, the general form of a QEMU command line can be expressed
+as:
+
+.. parsed-literal::
+
+  $ |qemu_system| [machine opts] \\
+  [cpu opts] \\
+  [accelerator opts] \\
+  [device opts] \\
+  [backend opts] \\
+  [interface opts] \\
+  [boot opts]


FYI: My "git am" complains about a trailing white space after "[boot opts]" 
here. Please remove it.


 Thomas




Re: [PATCH v2 09/35] gitlab: add lsan suppression file to workaround tcmalloc issues

2023-01-24 Thread Thomas Huth

On 24/01/2023 19.01, Alex Bennée wrote:

The upcoming upgrade to Fedora 37 will bring in libtcmalloc as a
dependency of libglusterfs, which confuses our fuzz run. Rather than
disable the build, let's use LSAN's suppression mechanism to prevent the
job from failing.

Signed-off-by: Alex Bennée 
Cc: Daniel P. Berrangé 
---
  .gitlab-ci.d/buildtest.yml | 1 +
  scripts/oss-fuzz/lsan_suppressions.txt | 2 ++
  2 files changed, 3 insertions(+)
  create mode 100644 scripts/oss-fuzz/lsan_suppressions.txt

diff --git a/.gitlab-ci.d/buildtest.yml b/.gitlab-ci.d/buildtest.yml
index f09a898c3e..9a6ba1fe3b 100644
--- a/.gitlab-ci.d/buildtest.yml
+++ b/.gitlab-ci.d/buildtest.yml
@@ -511,6 +511,7 @@ build-oss-fuzz:
  IMAGE: fedora
script:
  - mkdir build-oss-fuzz
+- export LSAN_OPTIONS=suppressions=scripts/oss-fuzz/lsan_suppressions.txt
  - CC="clang" CXX="clang++" CFLAGS="-fsanitize=address"
./scripts/oss-fuzz/build.sh
  - export ASAN_OPTIONS="fast_unwind_on_malloc=0"
diff --git a/scripts/oss-fuzz/lsan_suppressions.txt 
b/scripts/oss-fuzz/lsan_suppressions.txt
new file mode 100644
index 00..02ec0a6ed5
--- /dev/null
+++ b/scripts/oss-fuzz/lsan_suppressions.txt
@@ -0,0 +1,2 @@
+# The tcmalloc on Fedora37 confuses things
+leak:/lib64/libtcmalloc_minimal.so.4


Reviewed-by: Thomas Huth 




Re: [PATCH] target/arm: Propagate errno when writing list

2023-01-24 Thread Akihiko Odaki

On 2023/01/25 1:12, Peter Maydell wrote:

On Thu, 1 Dec 2022 at 10:33, Akihiko Odaki  wrote:


Before this change, write_kvmstate_to_list() and
write_list_to_kvmstate() tolerated failures to access individual
registers, and returned a bool indicating whether any of the register
accesses failed. However, it does not make sense not to fail early, as
the callers check the returned value and fail early anyway.

So let write_kvmstate_to_list() and write_list_to_kvmstate() fail early
too. This will allow errno to be propagated to the callers and logged
if appropriate.


(Sorry this one didn't get reviewed earlier.)

I agree that all the callers of these functions check for
failure, so there's no major benefit from doing the
don't-fail-early logic. But is there a reason why we should
actively make this change?

In particular, these functions form part of a family with the
similar write_cpustate_to_list() and write_list_to_cpustate(),
and it's inconsistent to have the kvmstate ones return
negative-errno while the cpustate ones still return bool.
For the cpustate ones we *do* rely in some places on
the "don't fail early" behaviour. The kvmstate ones do the
same thing I think mostly for consistency.

So unless there's a specific reason why changing these
functions improves behaviour as seen by users, I think
I favour retaining the consistency.

thanks
-- PMM


I withdraw this patch. Its only benefit is that it allows errno to be
logged when reporting the error, and that benefit is negligible compared
to the consistency.


Regards,
Akihiko Odaki
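
For readers following the thread, the two return conventions being compared
can be sketched as below. This is only an illustration: reg_access_fn,
sync_all_tolerant(), sync_all_fail_early() and fail_on_two() are hypothetical
names made up for this sketch, not QEMU functions, and the real
write_*_to_list() helpers take different arguments.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one register access; returns 0 or -errno. */
typedef int (*reg_access_fn)(size_t index);

/*
 * "Don't fail early" convention: try every register and only report
 * whether all of the accesses succeeded. The errno value is lost.
 */
static bool sync_all_tolerant(reg_access_fn access, size_t n)
{
    bool ok = true;

    for (size_t i = 0; i < n; i++) {
        if (access(i) < 0) {
            ok = false;     /* remember the failure, keep going */
        }
    }
    return ok;
}

/*
 * "Fail early" convention: stop at the first failing access and hand
 * its negative errno back to the caller.
 */
static int sync_all_fail_early(reg_access_fn access, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        int ret = access(i);

        if (ret < 0) {
            return ret;
        }
    }
    return 0;
}

/* Example access function: counts calls and fails on index 2. */
static size_t calls;

static int fail_on_two(size_t index)
{
    calls++;
    return index == 2 ? -EINVAL : 0;
}
```

With five registers and a failure on index 2, the tolerant variant still
visits all five and returns false, while the fail-early variant stops after
three accesses and returns -EINVAL. That difference is why callers relying on
the "don't fail early" behaviour cannot simply be switched over.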



[PATCH v4 08/13] Hexagon (tests/tcg/hexagon) Remove __builtin from scatter_gather

2023-01-24 Thread Taylor Simpson
Replace __builtin_* with inline assembly
The __builtin's are subject to change with different compiler
releases, so might break
Mark arrays as aligned when accessed as HVX vectors
Clean up comments

Signed-off-by: Taylor Simpson 
---
 tests/tcg/hexagon/scatter_gather.c | 513 +++--
 1 file changed, 271 insertions(+), 242 deletions(-)

diff --git a/tests/tcg/hexagon/scatter_gather.c 
b/tests/tcg/hexagon/scatter_gather.c
index b93eb18133..bf8b5e0317 100644
--- a/tests/tcg/hexagon/scatter_gather.c
+++ b/tests/tcg/hexagon/scatter_gather.c
@@ -1,5 +1,5 @@
 /*
- *  Copyright(c) 2019-2021 Qualcomm Innovation Center, Inc. All Rights 
Reserved.
+ *  Copyright(c) 2019-2023 Qualcomm Innovation Center, Inc. All Rights 
Reserved.
  *
  *  This program is free software; you can redistribute it and/or modify
  *  it under the terms of the GNU General Public License as published by
@@ -40,47 +40,6 @@ typedef long HVX_VectorPair   
__attribute__((__vector_size__(256)))
 typedef long HVX_VectorPred   __attribute__((__vector_size__(128)))
   __attribute__((aligned(128)));
 
-#define VSCATTER_16(BASE, RGN, OFF, VALS) \
-__builtin_HEXAGON_V6_vscattermh_128B((int)BASE, RGN, OFF, VALS)
-#define VSCATTER_16_MASKED(MASK, BASE, RGN, OFF, VALS) \
-__builtin_HEXAGON_V6_vscattermhq_128B(MASK, (int)BASE, RGN, OFF, VALS)
-#define VSCATTER_32(BASE, RGN, OFF, VALS) \
-__builtin_HEXAGON_V6_vscattermw_128B((int)BASE, RGN, OFF, VALS)
-#define VSCATTER_32_MASKED(MASK, BASE, RGN, OFF, VALS) \
-__builtin_HEXAGON_V6_vscattermwq_128B(MASK, (int)BASE, RGN, OFF, VALS)
-#define VSCATTER_16_32(BASE, RGN, OFF, VALS) \
-__builtin_HEXAGON_V6_vscattermhw_128B((int)BASE, RGN, OFF, VALS)
-#define VSCATTER_16_32_MASKED(MASK, BASE, RGN, OFF, VALS) \
-__builtin_HEXAGON_V6_vscattermhwq_128B(MASK, (int)BASE, RGN, OFF, VALS)
-#define VSCATTER_16_ACC(BASE, RGN, OFF, VALS) \
-__builtin_HEXAGON_V6_vscattermh_add_128B((int)BASE, RGN, OFF, VALS)
-#define VSCATTER_32_ACC(BASE, RGN, OFF, VALS) \
-__builtin_HEXAGON_V6_vscattermw_add_128B((int)BASE, RGN, OFF, VALS)
-#define VSCATTER_16_32_ACC(BASE, RGN, OFF, VALS) \
-__builtin_HEXAGON_V6_vscattermhw_add_128B((int)BASE, RGN, OFF, VALS)
-
-#define VGATHER_16(DSTADDR, BASE, RGN, OFF) \
-__builtin_HEXAGON_V6_vgathermh_128B(DSTADDR, (int)BASE, RGN, OFF)
-#define VGATHER_16_MASKED(DSTADDR, MASK, BASE, RGN, OFF) \
-__builtin_HEXAGON_V6_vgathermhq_128B(DSTADDR, MASK, (int)BASE, RGN, OFF)
-#define VGATHER_32(DSTADDR, BASE, RGN, OFF) \
-__builtin_HEXAGON_V6_vgathermw_128B(DSTADDR, (int)BASE, RGN, OFF)
-#define VGATHER_32_MASKED(DSTADDR, MASK, BASE, RGN, OFF) \
-__builtin_HEXAGON_V6_vgathermwq_128B(DSTADDR, MASK, (int)BASE, RGN, OFF)
-#define VGATHER_16_32(DSTADDR, BASE, RGN, OFF) \
-__builtin_HEXAGON_V6_vgathermhw_128B(DSTADDR, (int)BASE, RGN, OFF)
-#define VGATHER_16_32_MASKED(DSTADDR, MASK, BASE, RGN, OFF) \
-__builtin_HEXAGON_V6_vgathermhwq_128B(DSTADDR, MASK, (int)BASE, RGN, OFF)
-
-#define VSHUFF_H(V) \
-__builtin_HEXAGON_V6_vshuffh_128B(V)
-#define VSPLAT_H(X) \
-__builtin_HEXAGON_V6_lvsplath_128B(X)
-#define VAND_VAL(PRED, VAL) \
-__builtin_HEXAGON_V6_vandvrt_128B(PRED, VAL)
-#define VDEAL_H(V) \
-__builtin_HEXAGON_V6_vdealh_128B(V)
-
 int err;
 
 /* define the number of rows/cols in a square matrix */
@@ -108,22 +67,22 @@ unsigned short vscatter16_32_ref[SCATTER_BUFFER_SIZE];
 unsigned short vgather16_32_ref[MATRIX_SIZE];
 
 /* declare the arrays of offsets */
-unsigned short half_offsets[MATRIX_SIZE];
-unsigned int   word_offsets[MATRIX_SIZE];
+unsigned short half_offsets[MATRIX_SIZE] __attribute__((aligned(128)));
+unsigned int   word_offsets[MATRIX_SIZE] __attribute__((aligned(128)));
 
 /* declare the arrays of values */
-unsigned short half_values[MATRIX_SIZE];
-unsigned short half_values_acc[MATRIX_SIZE];
-unsigned short half_values_masked[MATRIX_SIZE];
-unsigned int   word_values[MATRIX_SIZE];
-unsigned int   word_values_acc[MATRIX_SIZE];
-unsigned int   word_values_masked[MATRIX_SIZE];
+unsigned short half_values[MATRIX_SIZE] __attribute__((aligned(128)));
+unsigned short half_values_acc[MATRIX_SIZE] __attribute__((aligned(128)));
+unsigned short half_values_masked[MATRIX_SIZE] __attribute__((aligned(128)));
+unsigned int   word_values[MATRIX_SIZE] __attribute__((aligned(128)));
+unsigned int   word_values_acc[MATRIX_SIZE] __attribute__((aligned(128)));
+unsigned int   word_values_masked[MATRIX_SIZE] __attribute__((aligned(128)));
 
 /* declare the arrays of predicates */
-unsigned short half_predicates[MATRIX_SIZE];
-unsigned int   word_predicates[MATRIX_SIZE];
+unsigned short half_predicates[MATRIX_SIZE] __attribute__((aligned(128)));
+unsigned int   word_predicates[MATRIX_SIZE] __attribute__((aligned(128)));
 
-/* make this big enough for all the intrinsics */
+/* make this big enough for all the operations */
 const size_t region_len = sizeof(vtcm);
 
 /* optionally add sync 

[PATCH v4 12/13] Hexagon (target/hexagon) Reduce manipulation of slot_cancelled

2023-01-24 Thread Taylor Simpson
We only need to track slot for predicated stores and predicated HVX
instructions.

Add arguments to the probe helper functions to indicate if the slot
is predicated.
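
As a rough illustration of how the extra "is predicated" information can
travel in a single int helper argument, here is a minimal sketch. The unpack
side mirrors the op_helper.c hunk in this patch (mmu_idx in bits 1:0, the
predicated flag in bit 2). The pack_probe_args() and should_probe() names are
hypothetical, and the pack side is an assumption about what the translator
does, since that part of the diff is not shown here.

```c
#include <stdbool.h>

/* Hypothetical packer: mmu_idx in bits 1:0, is_predicated in bit 2. */
static inline int pack_probe_args(int mmu_idx, bool is_predicated)
{
    return (mmu_idx & 0x3) | ((int)is_predicated << 2);
}

/* Unpack side, matching the layout used by the helper in the patch. */
static inline int probe_args_mmu_idx(int args)
{
    return args & 0x3;
}

static inline bool probe_args_is_predicated(int args)
{
    return (args >> 2) & 1;
}

/*
 * The guard from probe_store(): slot_cancelled is only consulted when
 * the store is predicated; an unconditional store is always probed.
 */
static inline bool should_probe(unsigned slot_cancelled, int slot,
                                bool is_predicated)
{
    return !is_predicated || !(slot_cancelled & (1u << slot));
}
```

The point of the guard is that slot_cancelled no longer has to be kept up to
date for packets without predicated stores, because its value is ignored in
that case.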

Signed-off-by: Taylor Simpson 
---
 target/hexagon/macros.h |  2 +-
 target/hexagon/op_helper.h  |  1 -
 target/hexagon/idef-parser/parser-helpers.c |  1 -
 target/hexagon/op_helper.c  | 23 +--
 target/hexagon/translate.c  | 25 ++---
 target/hexagon/idef-parser/idef-parser.lex  |  4 ++--
 target/hexagon/idef-parser/idef-parser.y|  7 +++---
 7 files changed, 40 insertions(+), 23 deletions(-)

diff --git a/target/hexagon/macros.h b/target/hexagon/macros.h
index 8f1f82f8da..e3b2ea2d42 100644
--- a/target/hexagon/macros.h
+++ b/target/hexagon/macros.h
@@ -205,7 +205,7 @@ static inline void gen_cancel(uint32_t slot)
 
 #define CANCEL gen_cancel(slot);
 #else
-#define CANCEL cancel_slot(env, slot)
+#define CANCEL do { } while (0)
 #endif
 
 #define LOAD_CANCEL(EA) do { CANCEL; } while (0)
diff --git a/target/hexagon/op_helper.h b/target/hexagon/op_helper.h
index 02347edee8..906420e5a0 100644
--- a/target/hexagon/op_helper.h
+++ b/target/hexagon/op_helper.h
@@ -19,7 +19,6 @@
 #define HEXAGON_OP_HELPER_H
 
 /* Misc functions */
-void cancel_slot(CPUHexagonState *env, uint32_t slot);
 void write_new_pc(CPUHexagonState *env, bool pkt_has_multi_cof, target_ulong 
addr);
 
 uint8_t mem_load1(CPUHexagonState *env, uint32_t slot, target_ulong vaddr);
diff --git a/target/hexagon/idef-parser/parser-helpers.c 
b/target/hexagon/idef-parser/parser-helpers.c
index eb652d6a7a..c44d3a238f 100644
--- a/target/hexagon/idef-parser/parser-helpers.c
+++ b/target/hexagon/idef-parser/parser-helpers.c
@@ -1901,7 +1901,6 @@ void gen_cancel(Context *c, YYLTYPE *locp)
 
 void gen_load_cancel(Context *c, YYLTYPE *locp)
 {
-gen_cancel(c, locp);
 OUT(c, locp, "if (insn->slot == 0 && pkt->pkt_has_store_s1) {\n");
 OUT(c, locp, "ctx->s1_store_processed = false;\n");
 OUT(c, locp, "process_store(ctx, 1);\n");
diff --git a/target/hexagon/op_helper.c b/target/hexagon/op_helper.c
index 8ddad35de7..15619a642c 100644
--- a/target/hexagon/op_helper.c
+++ b/target/hexagon/op_helper.c
@@ -415,9 +415,10 @@ int32_t HELPER(vacsh_pred)(CPUHexagonState *env,
 return PeV;
 }
 
-static void probe_store(CPUHexagonState *env, int slot, int mmu_idx)
+static void probe_store(CPUHexagonState *env, int slot, int mmu_idx,
+bool is_predicated)
 {
-if (!(env->slot_cancelled & (1 << slot))) {
+if (!is_predicated || !(env->slot_cancelled & (1 << slot))) {
 size1u_t width = env->mem_log_stores[slot].width;
 target_ulong va = env->mem_log_stores[slot].va;
 uintptr_t ra = GETPC();
@@ -437,9 +438,11 @@ void HELPER(probe_noshuf_load)(CPUHexagonState *env, 
target_ulong va,
 }
 
 /* Called during packet commit when there are two scalar stores */
-void HELPER(probe_pkt_scalar_store_s0)(CPUHexagonState *env, int mmu_idx)
+void HELPER(probe_pkt_scalar_store_s0)(CPUHexagonState *env, int args)
 {
-probe_store(env, 0, mmu_idx);
+int mmu_idx = args & 0x3;
+bool is_predicated = (args >> 2) & 1;
+probe_store(env, 0, mmu_idx, is_predicated);
 }
 
 void HELPER(probe_hvx_stores)(CPUHexagonState *env, int mmu_idx)
@@ -489,12 +492,14 @@ void HELPER(probe_pkt_scalar_hvx_stores)(CPUHexagonState 
*env, int mask,
 bool has_st0= (mask >> 0) & 1;
 bool has_st1= (mask >> 1) & 1;
 bool has_hvx_stores = (mask >> 2) & 1;
+bool s0_is_pred = (mask >> 3) & 1;
+bool s1_is_pred = (mask >> 4) & 1;
 
 if (has_st0) {
-probe_store(env, 0, mmu_idx);
+probe_store(env, 0, mmu_idx, s0_is_pred);
 }
 if (has_st1) {
-probe_store(env, 1, mmu_idx);
+probe_store(env, 1, mmu_idx, s1_is_pred);
 }
 if (has_hvx_stores) {
 HELPER(probe_hvx_stores)(env, mmu_idx);
@@ -1399,12 +1404,6 @@ void HELPER(vwhist128qm)(CPUHexagonState *env, int32_t 
uiV)
 }
 }
 
-void cancel_slot(CPUHexagonState *env, uint32_t slot)
-{
-HEX_DEBUG_LOG("Slot %d cancelled\n", slot);
-env->slot_cancelled |= (1 << slot);
-}
-
 /* These macros can be referenced in the generated helper functions */
 #define warn(...) /* Nothing */
 #define fatal(...) g_assert_not_reached();
diff --git a/target/hexagon/translate.c b/target/hexagon/translate.c
index 30a06bd442..8ac1f5cabc 100644
--- a/target/hexagon/translate.c
+++ b/target/hexagon/translate.c
@@ -247,7 +247,16 @@ static bool check_for_attrib(Packet *pkt, int attrib)
 
 static bool need_slot_cancelled(Packet *pkt)
 {
-return check_for_attrib(pkt, A_CONDEXEC);
+/* We only need slot_cancelled for conditional store and HVX instructions 
*/
+for (int i = 0; i < pkt->num_insns; i++) {
+uint16_t opcode = pkt->insn[i].opcode;
+if (GET_ATTRIB(opcode, A_CONDEXEC) &&
+(GET_ATTRIB(opcode, A_STORE) ||
+   

[PATCH v4 09/13] Hexagon (tests/tcg/hexagon) Enable HVX tests

2023-01-24 Thread Taylor Simpson
Made possible by new toolchain container

Signed-off-by: Taylor Simpson 
---
 tests/tcg/hexagon/Makefile.target | 13 -
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/tests/tcg/hexagon/Makefile.target 
b/tests/tcg/hexagon/Makefile.target
index 18e6a5969e..f753b39d91 100644
--- a/tests/tcg/hexagon/Makefile.target
+++ b/tests/tcg/hexagon/Makefile.target
@@ -1,5 +1,5 @@
 ##
-##  Copyright(c) 2019-2022 Qualcomm Innovation Center, Inc. All Rights 
Reserved.
+##  Copyright(c) 2019-2023 Qualcomm Innovation Center, Inc. All Rights 
Reserved.
 ##
 ##  This program is free software; you can redistribute it and/or modify
 ##  it under the terms of the GNU General Public License as published by
@@ -45,6 +45,10 @@ HEX_TESTS += fpstuff
 HEX_TESTS += overflow
 HEX_TESTS += signal_context
 HEX_TESTS += reg_mut
+HEX_TESTS += vector_add_int
+HEX_TESTS += scatter_gather
+HEX_TESTS += hvx_misc
+HEX_TESTS += hvx_histogram
 
 HEX_TESTS += test_abs
 HEX_TESTS += test_bitcnt
@@ -78,3 +82,10 @@ TESTS += $(HEX_TESTS)
 usr: usr.c
$(CC) $(CFLAGS) -mv67t -O2 -Wno-inline-asm -Wno-expansion-to-defined $< 
-o $@ $(LDFLAGS)
 
+scatter_gather: CFLAGS += -mhvx
+vector_add_int: CFLAGS += -mhvx -fvectorize
+hvx_misc: CFLAGS += -mhvx
+hvx_histogram: CFLAGS += -mhvx -Wno-gnu-folding-constant
+
+hvx_histogram: hvx_histogram.c hvx_histogram_row.S
+   $(CC) $(CFLAGS) $(CROSS_CC_GUEST_CFLAGS) $^ -o $@
-- 
2.17.1



[PATCH v4 02/13] Hexagon (target/hexagon) Add overrides for callr

2023-01-24 Thread Taylor Simpson
Add overrides for
J2_callr
J2_callrt
J2_callrf

Signed-off-by: Taylor Simpson 
---
 target/hexagon/gen_tcg.h |  6 ++
 target/hexagon/macros.h  | 12 +---
 target/hexagon/genptr.c  | 20 
 3 files changed, 27 insertions(+), 11 deletions(-)

diff --git a/target/hexagon/gen_tcg.h b/target/hexagon/gen_tcg.h
index d644e59a63..9e8f3373ad 100644
--- a/target/hexagon/gen_tcg.h
+++ b/target/hexagon/gen_tcg.h
@@ -614,11 +614,17 @@
 
 #define fGEN_TCG_J2_call(SHORTCODE) \
 gen_call(ctx, riV)
+#define fGEN_TCG_J2_callr(SHORTCODE) \
+gen_callr(ctx, RsV)
 
 #define fGEN_TCG_J2_callt(SHORTCODE) \
 gen_cond_call(ctx, PuV, TCG_COND_EQ, riV)
 #define fGEN_TCG_J2_callf(SHORTCODE) \
 gen_cond_call(ctx, PuV, TCG_COND_NE, riV)
+#define fGEN_TCG_J2_callrt(SHORTCODE) \
+gen_cond_callr(ctx, TCG_COND_EQ, PuV, RsV)
+#define fGEN_TCG_J2_callrf(SHORTCODE) \
+gen_cond_callr(ctx, TCG_COND_NE, PuV, RsV)
 
 #define fGEN_TCG_J2_endloop0(SHORTCODE) \
 gen_endloop0(ctx)
diff --git a/target/hexagon/macros.h b/target/hexagon/macros.h
index cd64bb8eec..8f1f82f8da 100644
--- a/target/hexagon/macros.h
+++ b/target/hexagon/macros.h
@@ -1,5 +1,5 @@
 /*
- *  Copyright(c) 2019-2022 Qualcomm Innovation Center, Inc. All Rights 
Reserved.
+ *  Copyright(c) 2019-2023 Qualcomm Innovation Center, Inc. All Rights 
Reserved.
  *
  *  This program is free software; you can redistribute it and/or modify
  *  it under the terms of the GNU General Public License as published by
@@ -421,16 +421,6 @@ static inline TCGv gen_read_ireg(TCGv result, TCGv val, 
int shift)
 #define fBRANCH(LOC, TYPE)  fWRITE_NPC(LOC)
 #define fJUMPR(REGNO, TARGET, TYPE) fBRANCH(TARGET, COF_TYPE_JUMPR)
 #define fHINTJR(TARGET) { /* Not modelled in qemu */}
-#define fCALL(A) \
-do { \
-fWRITE_LR(fREAD_NPC()); \
-fBRANCH(A, COF_TYPE_CALL); \
-} while (0)
-#define fCALLR(A) \
-do { \
-fWRITE_LR(fREAD_NPC()); \
-fBRANCH(A, COF_TYPE_CALLR); \
-} while (0)
 #define fWRITE_LOOP_REGS0(START, COUNT) \
 do { \
 WRITE_RREG(HEX_REG_LC0, COUNT);  \
diff --git a/target/hexagon/genptr.c b/target/hexagon/genptr.c
index 23fb808e37..360bcd0a19 100644
--- a/target/hexagon/genptr.c
+++ b/target/hexagon/genptr.c
@@ -710,6 +710,14 @@ static void gen_call(DisasContext *ctx, int pc_off)
 gen_write_new_pc_pcrel(ctx, pc_off, TCG_COND_ALWAYS, NULL);
 }
 
+static void gen_callr(DisasContext *ctx, TCGv new_pc)
+{
+TCGv next_PC =
+tcg_constant_tl(ctx->pkt->pc + ctx->pkt->encod_pkt_size_in_bytes);
+gen_log_reg_write(HEX_REG_LR, next_PC);
+gen_write_new_pc_addr(ctx, new_pc, TCG_COND_ALWAYS, NULL);
+}
+
 static void gen_cond_call(DisasContext *ctx, TCGv pred,
   TCGCond cond, int pc_off)
 {
@@ -726,6 +734,18 @@ static void gen_cond_call(DisasContext *ctx, TCGv pred,
 gen_set_label(skip);
 }
 
+static void gen_cond_callr(DisasContext *ctx,
+   TCGCond cond, TCGv pred, TCGv new_pc)
+{
+TCGv lsb = tcg_temp_new();
+TCGLabel *skip = gen_new_label();
+tcg_gen_andi_tl(lsb, pred, 1);
+tcg_gen_brcondi_tl(cond, lsb, 0, skip);
+tcg_temp_free(lsb);
+gen_callr(ctx, new_pc);
+gen_set_label(skip);
+}
+
 static void gen_endloop0(DisasContext *ctx)
 {
 TCGv lpcfg = tcg_temp_local_new();
-- 
2.17.1



[PATCH v4 06/13] Hexagon (target/hexagon) Analyze packet for HVX

2023-01-24 Thread Taylor Simpson
Extend the analyze_ functions for HVX vector and predicate writes
Remove calls to ctx_log_vreg_write[_pair] from gen_tcg_funcs.py
During gen_start_packet, reload the predicated HVX registers into
future_VRegs and tmp_VRegs

Signed-off-by: Taylor Simpson 
---
 target/hexagon/translate.h  | 14 --
 target/hexagon/translate.c  | 30 +
 target/hexagon/gen_analyze_funcs.py | 17 +---
 target/hexagon/gen_tcg_funcs.py | 18 -
 4 files changed, 52 insertions(+), 27 deletions(-)

diff --git a/target/hexagon/translate.h b/target/hexagon/translate.h
index d45d3a4bb0..e997f74278 100644
--- a/target/hexagon/translate.h
+++ b/target/hexagon/translate.h
@@ -54,6 +54,8 @@ typedef struct DisasContext {
 DECLARE_BITMAP(vregs_updated_tmp, NUM_VREGS);
 DECLARE_BITMAP(vregs_updated, NUM_VREGS);
 DECLARE_BITMAP(vregs_select, NUM_VREGS);
+DECLARE_BITMAP(predicated_future_vregs, NUM_VREGS);
+DECLARE_BITMAP(predicated_tmp_vregs, NUM_VREGS);
 int qreg_log[NUM_QREGS];
 bool qreg_is_predicated[NUM_QREGS];
 int qreg_log_idx;
@@ -98,12 +100,6 @@ static inline void ctx_log_reg_write_pair(DisasContext 
*ctx, int rnum,
 ctx_log_reg_write(ctx, rnum + 1, is_predicated);
 }
 
-static inline bool is_vreg_preloaded(DisasContext *ctx, int num)
-{
-return test_bit(num, ctx->vregs_updated) ||
-   test_bit(num, ctx->vregs_updated_tmp);
-}
-
 intptr_t ctx_future_vreg_off(DisasContext *ctx, int regnum,
  int num, bool alloc_ok);
 intptr_t ctx_tmp_vreg_off(DisasContext *ctx, int regnum,
@@ -119,12 +115,18 @@ static inline void ctx_log_vreg_write(DisasContext *ctx,
 ctx->vreg_log_idx++;
 
 set_bit(rnum, ctx->vregs_updated);
+if (is_predicated) {
+set_bit(rnum, ctx->predicated_future_vregs);
+}
 }
 if (type == EXT_NEW) {
 set_bit(rnum, ctx->vregs_select);
 }
 if (type == EXT_TMP) {
 set_bit(rnum, ctx->vregs_updated_tmp);
+if (is_predicated) {
+set_bit(rnum, ctx->predicated_tmp_vregs);
+}
 }
 }
 
diff --git a/target/hexagon/translate.c b/target/hexagon/translate.c
index 8f3436d69a..30a06bd442 100644
--- a/target/hexagon/translate.c
+++ b/target/hexagon/translate.c
@@ -362,6 +362,8 @@ static void gen_start_packet(DisasContext *ctx)
 bitmap_zero(ctx->vregs_updated_tmp, NUM_VREGS);
 bitmap_zero(ctx->vregs_updated, NUM_VREGS);
 bitmap_zero(ctx->vregs_select, NUM_VREGS);
+bitmap_zero(ctx->predicated_future_vregs, NUM_VREGS);
+bitmap_zero(ctx->predicated_tmp_vregs, NUM_VREGS);
 ctx->qreg_log_idx = 0;
 for (i = 0; i < STORES_MAX; i++) {
 ctx->store_width[i] = 0;
@@ -409,6 +411,34 @@ static void gen_start_packet(DisasContext *ctx)
 }
 }
 
+/* Preload the predicated HVX registers into future_VRegs and tmp_VRegs */
+if (!bitmap_empty(ctx->predicated_future_vregs, NUM_VREGS)) {
+int i = find_first_bit(ctx->predicated_future_vregs, NUM_VREGS);
+while (i < NUM_VREGS) {
+const intptr_t VdV_off =
+ctx_future_vreg_off(ctx, i, 1, true);
+intptr_t src_off = offsetof(CPUHexagonState, VRegs[i]);
+tcg_gen_gvec_mov(MO_64, VdV_off,
+ src_off,
+ sizeof(MMVector),
+ sizeof(MMVector));
+i = find_next_bit(ctx->predicated_future_vregs, NUM_VREGS, i + 1);
+}
+}
+if (!bitmap_empty(ctx->predicated_tmp_vregs, NUM_VREGS)) {
+int i = find_first_bit(ctx->predicated_tmp_vregs, NUM_VREGS);
+while (i < NUM_VREGS) {
+const intptr_t VdV_off =
+ctx_tmp_vreg_off(ctx, i, 1, true);
+intptr_t src_off = offsetof(CPUHexagonState, VRegs[i]);
+tcg_gen_gvec_mov(MO_64, VdV_off,
+ src_off,
+ sizeof(MMVector),
+ sizeof(MMVector));
+i = find_next_bit(ctx->predicated_tmp_vregs, NUM_VREGS, i + 1);
+}
+}
+
 if (pkt->pkt_has_hvx) {
 tcg_gen_movi_tl(hex_VRegs_updated, 0);
 tcg_gen_movi_tl(hex_QRegs_updated, 0);
diff --git a/target/hexagon/gen_analyze_funcs.py 
b/target/hexagon/gen_analyze_funcs.py
index 67a8e5e5e2..7b05b165a1 100755
--- a/target/hexagon/gen_analyze_funcs.py
+++ b/target/hexagon/gen_analyze_funcs.py
@@ -83,9 +83,16 @@ def analyze_opn_old(f, tag, regtype, regid, regno):
 else:
 print("Bad register parse: ", regtype, regid)
 elif (regtype == "V"):
+newv = "EXT_DFL"
+if (hex_common.is_new_result(tag)):
+newv = "EXT_NEW"
+elif (hex_common.is_tmp_result(tag)):
+newv = "EXT_TMP"
 if (regid in {"dd", "xx"}):
-f.write("//const int %s = insn->regno[%d];\n" %\
+f.write("const int %s = insn->regno[%d];\n" 

[PATCH v4 04/13] Hexagon (target/hexagon) Add overrides for dealloc-return instructions

2023-01-24 Thread Taylor Simpson
These instructions perform a deallocframe+return (jumpr r31)

Add overrides for
L4_return
SL2_return
L4_return_t
L4_return_f
L4_return_tnew_pt
L4_return_fnew_pt
L4_return_tnew_pnt
L4_return_fnew_pnt
SL2_return_t
SL2_return_f
SL2_return_tnew
SL2_return_fnew

This patch eliminates the last helper that uses write_new_pc, so we
remove it from op_helper.c

This patch also eliminates the last helper for load instructions, so we
remove the pkt_has_store_s1 runtime field as well as the mem_load[1248]
functions.
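
For orientation, the architectural effect of a dealloc-return can be modelled
in plain C roughly as follows. This is a sketch under assumptions, not the TCG
code from the patch: the DeallocResult type and the function names are
hypothetical, the unscramble step follows the "frame ^= (int64_t)FRAMEKEY <<
32" comment in genptr.c, and the "sp = ea + 8" update and the r31:30 pair
split reflect my reading of the deallocframe semantics.

```c
#include <stdint.h>

/* Hypothetical container for the registers a dealloc-return updates. */
typedef struct {
    uint32_t fp;    /* r30: restored frame pointer (low word of pair)  */
    uint32_t lr;    /* r31: return address (high word of pair)         */
    uint32_t sp;    /* r29: stack pointer after the frame is released  */
} DeallocResult;

/* frame ^= (int64_t)FRAMEKEY << 32, as in the genptr.c comment. */
static uint64_t frame_unscramble(uint64_t frame, uint32_t framekey)
{
    return frame ^ ((uint64_t)framekey << 32);
}

/*
 * ea is the old frame pointer, raw_frame is the 64-bit value loaded
 * from memory at ea. The jump to lr is left to the caller.
 */
static DeallocResult dealloc_return_model(uint32_t ea, uint64_t raw_frame,
                                          uint32_t framekey)
{
    uint64_t frame = frame_unscramble(raw_frame, framekey);
    DeallocResult r = {
        .fp = (uint32_t)frame,
        .lr = (uint32_t)(frame >> 32),
        .sp = ea + 8,   /* assumption: frame record is 8 bytes */
    };
    return r;
}
```

Because the key is applied with XOR, unscrambling twice with the same key
gives back the original value. Only the upper word, which holds the return
address, is affected by the key.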

Signed-off-by: Taylor Simpson 
---
 target/hexagon/cpu.h   |  3 +-
 target/hexagon/gen_tcg.h   | 54 
 target/hexagon/genptr.c| 86 ++
 target/hexagon/op_helper.c | 69 --
 target/hexagon/translate.c |  6 +--
 5 files changed, 142 insertions(+), 76 deletions(-)

diff --git a/target/hexagon/cpu.h b/target/hexagon/cpu.h
index 34c0ae0a67..8df5b5a236 100644
--- a/target/hexagon/cpu.h
+++ b/target/hexagon/cpu.h
@@ -1,5 +1,5 @@
 /*
- *  Copyright(c) 2019-2022 Qualcomm Innovation Center, Inc. All Rights 
Reserved.
+ *  Copyright(c) 2019-2023 Qualcomm Innovation Center, Inc. All Rights 
Reserved.
  *
  *  This program is free software; you can redistribute it and/or modify
  *  it under the terms of the GNU General Public License as published by
@@ -98,7 +98,6 @@ typedef struct CPUArchState {
 target_ulong pred_written;
 
 MemLog mem_log_stores[STORES_MAX];
-target_ulong pkt_has_store_s1;
 target_ulong dczero_addr;
 
 float_status fp_status;
diff --git a/target/hexagon/gen_tcg.h b/target/hexagon/gen_tcg.h
index 6267f51ccc..8282ff3fc5 100644
--- a/target/hexagon/gen_tcg.h
+++ b/target/hexagon/gen_tcg.h
@@ -508,6 +508,60 @@
 #define fGEN_TCG_S2_storerinew_pcr(SHORTCODE) \
 fGEN_TCG_STORE_pcr(2, fSTORE(1, 4, EA, NtN))
 
+/*
+ * dealloc_return
+ * Assembler mapped to
+ * r31:30 = dealloc_return(r30):raw
+ */
+#define fGEN_TCG_L4_return(SHORTCODE) \
+gen_return(ctx, RddV, RsV)
+
+/*
+ * sub-instruction version (no RddV, so handle it manually)
+ */
+#define fGEN_TCG_SL2_return(SHORTCODE) \
+do { \
+TCGv_i64 RddV = tcg_temp_new_i64(); \
+gen_return(ctx, RddV, hex_gpr[HEX_REG_FP]); \
+gen_log_reg_write_pair(HEX_REG_FP, RddV); \
+tcg_temp_free_i64(RddV); \
+} while (0)
+
+/*
+ * Conditional returns follow this naming convention
+ * _t predicate true
+ * _f predicate false
+ * _tnew_pt   predicate.new true predict taken
+ * _fnew_pt   predicate.new false predict taken
+ * _tnew_pnt  predicate.new true predict not taken
+ * _fnew_pnt  predicate.new false predict not taken
+ * Predictions are not modelled in QEMU
+ *
+ * Example:
+ * if (p1) r31:30 = dealloc_return(r30):raw
+ */
+#define fGEN_TCG_L4_return_t(SHORTCODE) \
+gen_cond_return(ctx, RddV, RsV, PvV, TCG_COND_EQ);
+#define fGEN_TCG_L4_return_f(SHORTCODE) \
+gen_cond_return(ctx, RddV, RsV, PvV, TCG_COND_NE)
+#define fGEN_TCG_L4_return_tnew_pt(SHORTCODE) \
+gen_cond_return(ctx, RddV, RsV, PvN, TCG_COND_EQ)
+#define fGEN_TCG_L4_return_fnew_pt(SHORTCODE) \
+gen_cond_return(ctx, RddV, RsV, PvN, TCG_COND_NE)
+#define fGEN_TCG_L4_return_tnew_pnt(SHORTCODE) \
+gen_cond_return(ctx, RddV, RsV, PvN, TCG_COND_EQ)
+#define fGEN_TCG_L4_return_fnew_pnt(SHORTCODE) \
+gen_cond_return(ctx, RddV, RsV, PvN, TCG_COND_NE)
+
+#define fGEN_TCG_SL2_return_t(SHORTCODE) \
+gen_cond_return_subinsn(ctx, TCG_COND_EQ, hex_pred[0])
+#define fGEN_TCG_SL2_return_f(SHORTCODE) \
+gen_cond_return_subinsn(ctx, TCG_COND_NE, hex_pred[0])
+#define fGEN_TCG_SL2_return_tnew(SHORTCODE) \
+gen_cond_return_subinsn(ctx, TCG_COND_EQ, hex_new_pred_value[0])
+#define fGEN_TCG_SL2_return_fnew(SHORTCODE) \
+gen_cond_return_subinsn(ctx, TCG_COND_NE, hex_new_pred_value[0])
+
 /*
  * Mathematical operations with more than one definition require
  * special handling
diff --git a/target/hexagon/genptr.c b/target/hexagon/genptr.c
index e17ac93a59..efd36f760f 100644
--- a/target/hexagon/genptr.c
+++ b/target/hexagon/genptr.c
@@ -746,6 +746,92 @@ static void gen_cond_callr(DisasContext *ctx,
 gen_set_label(skip);
 }
 
+/* frame ^= (int64_t)FRAMEKEY << 32 */
+static void gen_frame_unscramble(TCGv_i64 frame)
+{
+TCGv_i64 framekey = tcg_temp_new_i64();
+tcg_gen_extu_i32_i64(framekey, hex_gpr[HEX_REG_FRAMEKEY]);
+tcg_gen_shli_i64(framekey, framekey, 32);
+tcg_gen_xor_i64(frame, frame, framekey);
+tcg_temp_free_i64(framekey);
+}
+
+static void gen_load_frame(DisasContext *ctx, TCGv_i64 frame, TCGv EA)
+{
+Insn *insn = ctx->insn;  /* Needed for CHECK_NOSHUF */
+CHECK_NOSHUF(EA, 8);
+tcg_gen_qemu_ld64(frame, EA, ctx->mem_idx);
+}
+
+static void gen_return_base(DisasContext *ctx, TCGv_i64 dst, TCGv src,
+TCGv r29)
+{
+/*
+ * 

[PATCH v4 05/13] Hexagon (target/hexagon) Analyze packet before generating TCG

2023-01-24 Thread Taylor Simpson
We create a new generator that creates an analyze_ function for
each instruction.  Currently, these functions record the writes to
R, P, and C registers by calling ctx_log_reg_write[_pair] or
ctx_log_pred_write.

During gen_start_packet, we invoke the analyze_ function for
each instruction in the packet, and we mark the implicit register
and predicate writes.

Doing the analysis up front has several advantages:
- We remove calls to ctx_log_* from gen_tcg_funcs.py and genptr.c
- After the analysis is performed, we can initialize hex_new_value
  for each of the predicated assignments rather than during TCG
  generation for the instructions
- This is a stepping stone for future work where the analysis will
  include the set of registers that are read.  In cases where
  the packet doesn't have an overlap between the registers that are
  written and registers that are read, we can avoid the intermediate
  step of writing to hex_new_value.  Note that other checks will also
  be needed (e.g., no instructions can raise an exception).
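
The bookkeeping described above can be sketched in standalone C as follows.
This is a simplified model, not the code from translate.h: the WriteLog type,
NUM_REGS value and function names are hypothetical, and plain 32-bit masks
stand in for QEMU's DECLARE_BITMAP/set_bit/test_bit helpers.

```c
#include <stdbool.h>
#include <stdint.h>

#define NUM_REGS 32     /* hypothetical register count for this sketch */

typedef struct {
    int reg_log[NUM_REGS];      /* registers written, in first-write order */
    int reg_log_idx;            /* number of valid entries in reg_log      */
    uint32_t regs_written;      /* bit n set: register n is in reg_log     */
    uint32_t predicated_regs;   /* bit n set: some write to n is predicated */
} WriteLog;

/* Record a write; a register is logged once even if written twice. */
static void log_reg_write(WriteLog *log, int rnum, bool is_predicated)
{
    uint32_t bit = UINT32_C(1) << rnum;

    if (!(log->regs_written & bit)) {
        log->reg_log[log->reg_log_idx++] = rnum;
        log->regs_written |= bit;
    }
    if (is_predicated) {
        log->predicated_regs |= bit;
    }
}

/* A register pair is logged as two consecutive registers. */
static void log_reg_write_pair(WriteLog *log, int rnum, bool is_predicated)
{
    log_reg_write(log, rnum, is_predicated);
    log_reg_write(log, rnum + 1, is_predicated);
}
```

Once every instruction in the packet has been analyzed this way, the
predicated_regs mask identifies the registers whose staged new value has to
be initialized before any TCG is generated, which is the second advantage
listed above.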

Signed-off-by: Taylor Simpson 
---
 target/hexagon/translate.h  |  46 ++--
 target/hexagon/genptr.c |   5 +-
 target/hexagon/idef-parser/parser-helpers.c |   7 +-
 target/hexagon/translate.c  | 160 --
 target/hexagon/README   |  10 +-
 target/hexagon/gen_analyze_funcs.py | 225 
 target/hexagon/gen_tcg_funcs.py |  23 +-
 target/hexagon/meson.build  |  11 +-
 8 files changed, 370 insertions(+), 117 deletions(-)
 create mode 100755 target/hexagon/gen_analyze_funcs.py

diff --git a/target/hexagon/translate.h b/target/hexagon/translate.h
index d971f4f095..d45d3a4bb0 100644
--- a/target/hexagon/translate.h
+++ b/target/hexagon/translate.h
@@ -1,5 +1,5 @@
 /*
- *  Copyright(c) 2019-2022 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *  Copyright(c) 2019-2023 Qualcomm Innovation Center, Inc. All Rights Reserved.
  *
  *  This program is free software; you can redistribute it and/or modify
  *  it under the terms of the GNU General Public License as published by
@@ -38,6 +38,7 @@ typedef struct DisasContext {
 int reg_log[REG_WRITES_MAX];
 int reg_log_idx;
 DECLARE_BITMAP(regs_written, TOTAL_PER_THREAD_REGS);
+DECLARE_BITMAP(predicated_regs, TOTAL_PER_THREAD_REGS);
 int preg_log[PRED_WRITES_MAX];
 int preg_log_idx;
 DECLARE_BITMAP(pregs_written, NUM_PREGS);
@@ -62,32 +63,39 @@ typedef struct DisasContext {
 bool is_tight_loop;
 } DisasContext;
 
-static inline void ctx_log_reg_write(DisasContext *ctx, int rnum)
+static inline void ctx_log_pred_write(DisasContext *ctx, int pnum)
 {
-if (test_bit(rnum, ctx->regs_written)) {
-HEX_DEBUG_LOG("WARNING: Multiple writes to r%d\n", rnum);
+if (!test_bit(pnum, ctx->pregs_written)) {
+ctx->preg_log[ctx->preg_log_idx] = pnum;
+ctx->preg_log_idx++;
+set_bit(pnum, ctx->pregs_written);
 }
-ctx->reg_log[ctx->reg_log_idx] = rnum;
-ctx->reg_log_idx++;
-set_bit(rnum, ctx->regs_written);
-}
-
-static inline void ctx_log_reg_write_pair(DisasContext *ctx, int rnum)
-{
-ctx_log_reg_write(ctx, rnum);
-ctx_log_reg_write(ctx, rnum + 1);
 }
 
-static inline void ctx_log_pred_write(DisasContext *ctx, int pnum)
+static inline void ctx_log_reg_write(DisasContext *ctx, int rnum,
+ bool is_predicated)
 {
-ctx->preg_log[ctx->preg_log_idx] = pnum;
-ctx->preg_log_idx++;
-set_bit(pnum, ctx->pregs_written);
+if (rnum == HEX_REG_P3_0_ALIASED) {
+for (int i = 0; i < NUM_PREGS; i++) {
+ctx_log_pred_write(ctx, i);
+}
+} else {
+if (!test_bit(rnum, ctx->regs_written)) {
+ctx->reg_log[ctx->reg_log_idx] = rnum;
+ctx->reg_log_idx++;
+set_bit(rnum, ctx->regs_written);
+}
+if (is_predicated) {
+set_bit(rnum, ctx->predicated_regs);
+}
+}
 }
 
-static inline bool is_preloaded(DisasContext *ctx, int num)
+static inline void ctx_log_reg_write_pair(DisasContext *ctx, int rnum,
+  bool is_predicated)
 {
-return test_bit(num, ctx->regs_written);
+ctx_log_reg_write(ctx, rnum, is_predicated);
+ctx_log_reg_write(ctx, rnum + 1, is_predicated);
 }
 
 static inline bool is_vreg_preloaded(DisasContext *ctx, int num)
diff --git a/target/hexagon/genptr.c b/target/hexagon/genptr.c
index efd36f760f..67ec3ac551 100644
--- a/target/hexagon/genptr.c
+++ b/target/hexagon/genptr.c
@@ -189,6 +189,7 @@ void gen_log_pred_write(DisasContext *ctx, int pnum, TCGv val)
hex_new_pred_value[pnum], base_val);
 }
 tcg_gen_ori_tl(hex_pred_written, hex_pred_written, 1 << pnum);
+set_bit(pnum, ctx->pregs_written);
 
 tcg_temp_free(base_val);
 }
@@ -271,7 +272,6 @@ static void gen_write_p3_0(DisasContext *ctx, TCGv control_reg)

[PATCH v4 01/13] Hexagon (target/hexagon) Add overrides for jumpr31 instructions

2023-01-24 Thread Taylor Simpson
Add overrides for
    SL2_jumpr31        Unconditional
    SL2_jumpr31_t      Predicated true (old value)
    SL2_jumpr31_f      Predicated false (old value)
    SL2_jumpr31_tnew   Predicated true (new value)
    SL2_jumpr31_fnew   Predicated false (new value)

Signed-off-by: Taylor Simpson 
---
 target/hexagon/gen_tcg.h | 15 ++-
 target/hexagon/genptr.c  | 10 +-
 2 files changed, 23 insertions(+), 2 deletions(-)

diff --git a/target/hexagon/gen_tcg.h b/target/hexagon/gen_tcg.h
index 19697b42a5..d644e59a63 100644
--- a/target/hexagon/gen_tcg.h
+++ b/target/hexagon/gen_tcg.h
@@ -1,5 +1,5 @@
 /*
- *  Copyright(c) 2019-2022 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *  Copyright(c) 2019-2023 Qualcomm Innovation Center, Inc. All Rights Reserved.
  *
  *  This program is free software; you can redistribute it and/or modify
  *  it under the terms of the GNU General Public License as published by
@@ -1015,6 +1015,19 @@
 #define fGEN_TCG_S2_asl_r_r_sat(SHORTCODE) \
 gen_asl_r_r_sat(RdV, RsV, RtV)
 
+#define fGEN_TCG_SL2_jumpr31(SHORTCODE) \
+gen_jumpr(ctx, hex_gpr[HEX_REG_LR])
+
+#define fGEN_TCG_SL2_jumpr31_t(SHORTCODE) \
+gen_cond_jumpr31(ctx, TCG_COND_EQ, hex_pred[0])
+#define fGEN_TCG_SL2_jumpr31_f(SHORTCODE) \
+gen_cond_jumpr31(ctx, TCG_COND_NE, hex_pred[0])
+
+#define fGEN_TCG_SL2_jumpr31_tnew(SHORTCODE) \
+gen_cond_jumpr31(ctx, TCG_COND_EQ, hex_new_pred_value[0])
+#define fGEN_TCG_SL2_jumpr31_fnew(SHORTCODE) \
+gen_cond_jumpr31(ctx, TCG_COND_NE, hex_new_pred_value[0])
+
 /* Floating point */
 #define fGEN_TCG_F2_conv_sf2df(SHORTCODE) \
 gen_helper_conv_sf2df(RddV, cpu_env, RsV)
diff --git a/target/hexagon/genptr.c b/target/hexagon/genptr.c
index 90db99024f..23fb808e37 100644
--- a/target/hexagon/genptr.c
+++ b/target/hexagon/genptr.c
@@ -1,5 +1,5 @@
 /*
- *  Copyright(c) 2019-2022 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *  Copyright(c) 2019-2023 Qualcomm Innovation Center, Inc. All Rights Reserved.
  *
  *  This program is free software; you can redistribute it and/or modify
  *  it under the terms of the GNU General Public License as published by
@@ -593,6 +593,14 @@ static void gen_cond_jumpr(DisasContext *ctx, TCGv dst_pc,
 gen_write_new_pc_addr(ctx, dst_pc, cond, pred);
 }
 
+static void gen_cond_jumpr31(DisasContext *ctx, TCGCond cond, TCGv pred)
+{
+TCGv LSB = tcg_temp_new();
+tcg_gen_andi_tl(LSB, pred, 1);
+gen_cond_jumpr(ctx, hex_gpr[HEX_REG_LR], cond, LSB);
+tcg_temp_free(LSB);
+}
+
 static void gen_cond_jump(DisasContext *ctx, TCGCond cond, TCGv pred,
   int pc_off)
 {
-- 
2.17.1



[PATCH v4 00/13] Hexagon: COF overrides, new generator, test/bug update

2023-01-24 Thread Taylor Simpson
The idef-parser skips the change-of-flow (COF) instructions, so add
overrides for them.

 Changes in v2 
Add a new generator for analyze_ instructions.  Populate the
DisasContext ahead of generating code.

 Changes in v3 
Cleanup of analysis code
Added test updates enabled by new toolchain container

 Changes in v4 
Additional patch for bug fix
Remove pkt_has_store_s1 from runtime state with dealloc-return patch
New patches to utilize new analyzer to improve predicated instructions


Taylor Simpson (13):
  Hexagon (target/hexagon) Add overrides for jumpr31 instructions
  Hexagon (target/hexagon) Add overrides for callr
  Hexagon (target/hexagon) Add overrides for endloop1/endloop01
  Hexagon (target/hexagon) Add overrides for dealloc-return instructions
  Hexagon (target/hexagon) Analyze packet before generating TCG
  Hexagon (target/hexagon) Analyze packet for HVX
  Hexagon (tests/tcg/hexagon) Update preg_alias.c
  Hexagon (tests/tcg/hexagon) Remove __builtin from scatter_gather
  Hexagon (tests/tcg/hexagon) Enable HVX tests
  Hexagon (target/hexagon) Change subtract from zero to change sign
  Hexagon (target/hexagon) Remove gen_log_predicated_reg_write[_pair]
  Hexagon (target/hexagon) Reduce manipulation of slot_cancelled
  Hexagon (target/hexagon) Improve code gen for predicated HVX
instructions

 target/hexagon/cpu.h|   6 +-
 target/hexagon/gen_tcg.h|  78 ++-
 target/hexagon/gen_tcg_hvx.h|  17 +-
 target/hexagon/macros.h |  14 +-
 target/hexagon/op_helper.h  |   1 -
 target/hexagon/translate.h  |  76 +--
 target/hexagon/genptr.c | 307 +++-
 target/hexagon/idef-parser/parser-helpers.c |  12 +-
 target/hexagon/op_helper.c  |  96 +---
 target/hexagon/translate.c  | 271 ++-
 tests/tcg/hexagon/fpstuff.c |  31 +-
 tests/tcg/hexagon/preg_alias.c  |  10 +-
 tests/tcg/hexagon/scatter_gather.c  | 513 +++-
 target/hexagon/README   |  38 +-
 target/hexagon/gen_analyze_funcs.py | 235 +
 target/hexagon/gen_tcg_funcs.py | 128 ++---
 target/hexagon/idef-parser/idef-parser.lex  |   4 +-
 target/hexagon/idef-parser/idef-parser.y|   7 +-
 target/hexagon/meson.build  |  11 +-
 tests/tcg/hexagon/Makefile.target   |  13 +-
 20 files changed, 1098 insertions(+), 770 deletions(-)
 create mode 100755 target/hexagon/gen_analyze_funcs.py

-- 
2.17.1



[PATCH v4 03/13] Hexagon (target/hexagon) Add overrides for endloop1/endloop01

2023-01-24 Thread Taylor Simpson
Signed-off-by: Taylor Simpson 
---
 target/hexagon/gen_tcg.h |  4 ++
 target/hexagon/genptr.c  | 79 
 2 files changed, 83 insertions(+)

diff --git a/target/hexagon/gen_tcg.h b/target/hexagon/gen_tcg.h
index 9e8f3373ad..6267f51ccc 100644
--- a/target/hexagon/gen_tcg.h
+++ b/target/hexagon/gen_tcg.h
@@ -628,6 +628,10 @@
 
 #define fGEN_TCG_J2_endloop0(SHORTCODE) \
 gen_endloop0(ctx)
+#define fGEN_TCG_J2_endloop1(SHORTCODE) \
+gen_endloop1(ctx)
+#define fGEN_TCG_J2_endloop01(SHORTCODE) \
+gen_endloop01(ctx)
 
 /*
  * Compound compare and jump instructions
diff --git a/target/hexagon/genptr.c b/target/hexagon/genptr.c
index 360bcd0a19..e17ac93a59 100644
--- a/target/hexagon/genptr.c
+++ b/target/hexagon/genptr.c
@@ -803,6 +803,85 @@ static void gen_endloop0(DisasContext *ctx)
 tcg_temp_free(lpcfg);
 }
 
+static void gen_endloop1(DisasContext *ctx)
+{
+/*
+ *if (hex_gpr[HEX_REG_LC1] > 1) {
+ *PC = hex_gpr[HEX_REG_SA1];
+ *hex_new_value[HEX_REG_LC1] = hex_gpr[HEX_REG_LC1] - 1;
+ *}
+ */
+TCGLabel *label = gen_new_label();
+tcg_gen_brcondi_tl(TCG_COND_LEU, hex_gpr[HEX_REG_LC1], 1, label);
+{
+gen_jumpr(ctx, hex_gpr[HEX_REG_SA1]);
+tcg_gen_subi_tl(hex_new_value[HEX_REG_LC1], hex_gpr[HEX_REG_LC1], 1);
+}
+gen_set_label(label);
+}
+
+static void gen_endloop01(DisasContext *ctx)
+{
+TCGv lpcfg = tcg_temp_local_new();
+
+GET_USR_FIELD(USR_LPCFG, lpcfg);
+
+/*
+ *if (lpcfg == 1) {
+ *hex_new_pred_value[3] = 0xff;
+ *hex_pred_written |= 1 << 3;
+ *}
+ */
+TCGLabel *label1 = gen_new_label();
+tcg_gen_brcondi_tl(TCG_COND_NE, lpcfg, 1, label1);
+{
+tcg_gen_movi_tl(hex_new_pred_value[3], 0xff);
+tcg_gen_ori_tl(hex_pred_written, hex_pred_written, 1 << 3);
+}
+gen_set_label(label1);
+
+/*
+ *if (lpcfg) {
+ *SET_USR_FIELD(USR_LPCFG, lpcfg - 1);
+ *}
+ */
+TCGLabel *label2 = gen_new_label();
+tcg_gen_brcondi_tl(TCG_COND_EQ, lpcfg, 0, label2);
+{
+tcg_gen_subi_tl(lpcfg, lpcfg, 1);
+SET_USR_FIELD(USR_LPCFG, lpcfg);
+}
+gen_set_label(label2);
+
+/*
+ *if (hex_gpr[HEX_REG_LC0] > 1) {
+ *PC = hex_gpr[HEX_REG_SA0];
+ *hex_new_value[HEX_REG_LC0] = hex_gpr[HEX_REG_LC0] - 1;
+ *} else {
+ *if (hex_gpr[HEX_REG_LC1] > 1) {
+ *hex_next_pc = hex_gpr[HEX_REG_SA1];
+ *hex_new_value[HEX_REG_LC1] = hex_gpr[HEX_REG_LC1] - 1;
+ *}
+ *}
+ */
+TCGLabel *label3 = gen_new_label();
+TCGLabel *done = gen_new_label();
+tcg_gen_brcondi_tl(TCG_COND_LEU, hex_gpr[HEX_REG_LC0], 1, label3);
+{
+gen_jumpr(ctx, hex_gpr[HEX_REG_SA0]);
+tcg_gen_subi_tl(hex_new_value[HEX_REG_LC0], hex_gpr[HEX_REG_LC0], 1);
+tcg_gen_br(done);
+}
+gen_set_label(label3);
+tcg_gen_brcondi_tl(TCG_COND_LEU, hex_gpr[HEX_REG_LC1], 1, done);
+{
+gen_jumpr(ctx, hex_gpr[HEX_REG_SA1]);
+tcg_gen_subi_tl(hex_new_value[HEX_REG_LC1], hex_gpr[HEX_REG_LC1], 1);
+}
+gen_set_label(done);
+tcg_temp_free(lpcfg);
+}
+
 static void gen_cmp_jumpnv(DisasContext *ctx,
TCGCond cond, TCGv val, TCGv src, int pc_off)
 {
-- 
2.17.1



[PATCH v4 13/13] Hexagon (target/hexagon) Improve code gen for predicated HVX instructions

2023-01-24 Thread Taylor Simpson
The following improvements are made for predicated HVX instructions:
- During gen_commit_hvx, unconditionally move the "new" value into
  the dest
- Don't set slot_cancelled
- Remove runtime bookkeeping of which registers were updated
- Reduce the cases where gen_log_vreg_write[_pair] is called
  (it's only needed for special operands VxxV and VyV)
- Remove gen_log_qreg_write

Signed-off-by: Taylor Simpson 
---
 target/hexagon/cpu.h|  3 --
 target/hexagon/gen_tcg_hvx.h| 17 +---
 target/hexagon/translate.h  | 16 +++-
 target/hexagon/genptr.c | 51 ++--
 target/hexagon/translate.c  | 60 +++--
 target/hexagon/README   | 28 --
 target/hexagon/gen_analyze_funcs.py |  3 +-
 target/hexagon/gen_tcg_funcs.py | 32 ---
 8 files changed, 33 insertions(+), 177 deletions(-)

diff --git a/target/hexagon/cpu.h b/target/hexagon/cpu.h
index 8df5b5a236..43206f8bce 100644
--- a/target/hexagon/cpu.h
+++ b/target/hexagon/cpu.h
@@ -110,11 +110,8 @@ typedef struct CPUArchState {
 MMVector future_VRegs[VECTOR_TEMPS_MAX] QEMU_ALIGNED(16);
 MMVector tmp_VRegs[VECTOR_TEMPS_MAX] QEMU_ALIGNED(16);
 
-VRegMask VRegs_updated;
-
 MMQReg QRegs[NUM_QREGS] QEMU_ALIGNED(16);
 MMQReg future_QRegs[NUM_QREGS] QEMU_ALIGNED(16);
-QRegMask QRegs_updated;
 
 /* Temporaries used within instructions */
 MMVectorPair VuuV QEMU_ALIGNED(16);
diff --git a/target/hexagon/gen_tcg_hvx.h b/target/hexagon/gen_tcg_hvx.h
index 083f4d92c6..3154c65ce1 100644
--- a/target/hexagon/gen_tcg_hvx.h
+++ b/target/hexagon/gen_tcg_hvx.h
@@ -1,5 +1,5 @@
 /*
- *  Copyright(c) 2019-2022 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *  Copyright(c) 2019-2023 Qualcomm Innovation Center, Inc. All Rights Reserved.
  *
  *  This program is free software; you can redistribute it and/or modify
  *  it under the terms of the GNU General Public License as published by
@@ -133,17 +133,12 @@ static inline void assert_vhist_tmp(DisasContext *ctx)
 do { \
 TCGv lsb = tcg_temp_new(); \
 TCGLabel *false_label = gen_new_label(); \
-TCGLabel *end_label = gen_new_label(); \
 tcg_gen_andi_tl(lsb, PsV, 1); \
 tcg_gen_brcondi_tl(TCG_COND_NE, lsb, PRED, false_label); \
 tcg_temp_free(lsb); \
 tcg_gen_gvec_mov(MO_64, VdV_off, VuV_off, \
  sizeof(MMVector), sizeof(MMVector)); \
-tcg_gen_br(end_label); \
 gen_set_label(false_label); \
-tcg_gen_ori_tl(hex_slot_cancelled, hex_slot_cancelled, \
-   1 << insn->slot); \
-gen_set_label(end_label); \
 } while (0)
 
 
@@ -560,18 +555,13 @@ static inline void assert_vhist_tmp(DisasContext *ctx)
 do { \
 TCGv LSB = tcg_temp_new(); \
 TCGLabel *false_label = gen_new_label(); \
-TCGLabel *end_label = gen_new_label(); \
 GET_EA; \
 PRED; \
 tcg_gen_brcondi_tl(TCG_COND_EQ, LSB, 0, false_label); \
 tcg_temp_free(LSB); \
 gen_vreg_load(ctx, DSTOFF, EA, true); \
 INC; \
-tcg_gen_br(end_label); \
 gen_set_label(false_label); \
-tcg_gen_ori_tl(hex_slot_cancelled, hex_slot_cancelled, \
-   1 << insn->slot); \
-gen_set_label(end_label); \
 } while (0)
 
 #define fGEN_TCG_PRED_VEC_LOAD_pred_pi \
@@ -731,18 +721,13 @@ static inline void assert_vhist_tmp(DisasContext *ctx)
 do { \
 TCGv LSB = tcg_temp_new(); \
 TCGLabel *false_label = gen_new_label(); \
-TCGLabel *end_label = gen_new_label(); \
 GET_EA; \
 PRED; \
 tcg_gen_brcondi_tl(TCG_COND_EQ, LSB, 0, false_label); \
 tcg_temp_free(LSB); \
 gen_vreg_store(ctx, EA, SRCOFF, insn->slot, ALIGN); \
 INC; \
-tcg_gen_br(end_label); \
 gen_set_label(false_label); \
-tcg_gen_ori_tl(hex_slot_cancelled, hex_slot_cancelled, \
-   1 << insn->slot); \
-gen_set_label(end_label); \
 } while (0)
 
 #define fGEN_TCG_PRED_VEC_STORE_pred_pi(ALIGN) \
diff --git a/target/hexagon/translate.h b/target/hexagon/translate.h
index e997f74278..89761273be 100644
--- a/target/hexagon/translate.h
+++ b/target/hexagon/translate.h
@@ -49,7 +49,6 @@ typedef struct DisasContext {
 int tmp_vregs_idx;
 int tmp_vregs_num[VECTOR_TEMPS_MAX];
 int vreg_log[NUM_VREGS];
-bool vreg_is_predicated[NUM_VREGS];
 int vreg_log_idx;
 DECLARE_BITMAP(vregs_updated_tmp, NUM_VREGS);
 DECLARE_BITMAP(vregs_updated, NUM_VREGS);
@@ -57,7 +56,6 @@ typedef struct DisasContext {
 DECLARE_BITMAP(predicated_future_vregs, NUM_VREGS);
 DECLARE_BITMAP(predicated_tmp_vregs, NUM_VREGS);
 int qreg_log[NUM_QREGS];
-bool qreg_is_predicated[NUM_QREGS];
 int qreg_log_idx;
 bool pre_commit;
 TCGCond branch_cond;
@@ -110,11 +108,12 @@ static 

[PATCH v4 11/13] Hexagon (target/hexagon) Remove gen_log_predicated_reg_write[_pair]

2023-01-24 Thread Taylor Simpson
We assign the instruction destination register to hex_new_value[num]
instead of a TCG temp that gets copied back to hex_new_value[num].

Since we preload hex_new_value for predicated instructions, we don't
need the check for slot_cancelled.  So, we call gen_log_reg_write instead.

Here is a simple example of the differences in the TCG code generated:

IN:
0x00400094:  0xf900c102 {   if (P0) R2 = and(R0,R1) }

BEFORE
  00400094
 mov_i32 pkt_has_store_s1,$0x0
 mov_i32 slot_cancelled,$0x0
 mov_i32 new_r2,r2
 mov_i32 loc2,$0x0
 and_i32 tmp0,p0,$0x1
 brcond_i32 tmp0,$0x0,eq,$L1
 and_i32 tmp0,r0,r1
 mov_i32 loc2,tmp0
 br $L2
 set_label $L1
 or_i32 slot_cancelled,slot_cancelled,$0x8
 set_label $L2
 and_i32 tmp0,slot_cancelled,$0x8
 movcond_i32 new_r2,tmp0,$0x0,loc2,new_r2,eq
 mov_i32 r2,new_r2

AFTER
  00400094
 mov_i32 slot_cancelled,$0x0
 mov_i32 new_r2,r2
 and_i32 tmp0,p0,$0x1
 brcond_i32 tmp0,$0x0,eq,$L1
 and_i32 tmp0,r0,r1
 mov_i32 new_r2,tmp0
 br $L2
 set_label $L1
 or_i32 slot_cancelled,slot_cancelled,$0x8
 set_label $L2
 mov_i32 r2,new_r2

We'll remove the unnecessary manipulation of slot_cancelled in a
subsequent patch.
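To make the equivalence of the two TCG listings easier to see, here is a small Python model of both sequences for the example packet { if (P0) R2 = and(R0,R1) }. It is only a sketch: the function names are hypothetical, registers are plain integers, and the slot mask 0x8 is taken from the listings above.

```python
def predicated_and_before(p0, r0, r1, r2):
    """Models the BEFORE listing: temp result plus a slot_cancelled check."""
    slot_cancelled = 0
    new_r2 = r2                  # preload hex_new_value with the old value
    loc2 = 0                     # TCG temp that holds the instruction result
    if p0 & 1:
        loc2 = r0 & r1
    else:
        slot_cancelled |= 0x8    # predicate false: cancel this slot
    if (slot_cancelled & 0x8) == 0:
        new_r2 = loc2            # movcond: copy the temp only if not cancelled
    return new_r2                # committed to r2 at the end of the packet


def predicated_and_after(p0, r0, r1, r2):
    """Models the AFTER listing: write hex_new_value directly."""
    new_r2 = r2                  # preload makes the cancelled case a no-op
    if p0 & 1:
        new_r2 = r0 & r1
    return new_r2
```

Both functions return the same value for every input, which is why the movcond on slot_cancelled can be dropped once hex_new_value is preloaded.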

Signed-off-by: Taylor Simpson 
---
 target/hexagon/gen_tcg.h|   3 +-
 target/hexagon/genptr.c | 120 
 target/hexagon/idef-parser/parser-helpers.c |   4 +-
 target/hexagon/gen_tcg_funcs.py |  55 +
 4 files changed, 52 insertions(+), 130 deletions(-)

diff --git a/target/hexagon/gen_tcg.h b/target/hexagon/gen_tcg.h
index 8282ff3fc5..63df79e006 100644
--- a/target/hexagon/gen_tcg.h
+++ b/target/hexagon/gen_tcg.h
@@ -521,10 +521,9 @@
  */
 #define fGEN_TCG_SL2_return(SHORTCODE) \
 do { \
-TCGv_i64 RddV = tcg_temp_new_i64(); \
+TCGv_i64 RddV = get_result_gpr_pair(ctx, HEX_REG_FP); \
 gen_return(ctx, RddV, hex_gpr[HEX_REG_FP]); \
 gen_log_reg_write_pair(HEX_REG_FP, RddV); \
-tcg_temp_free_i64(RddV); \
 } while (0)
 
 /*
diff --git a/target/hexagon/genptr.c b/target/hexagon/genptr.c
index 67ec3ac551..f937a17b24 100644
--- a/target/hexagon/genptr.c
+++ b/target/hexagon/genptr.c
@@ -70,28 +70,17 @@ static inline void gen_masked_reg_write(TCGv new_val, TCGv cur_val,
 }
 }
 
-static inline void gen_log_predicated_reg_write(int rnum, TCGv val,
-uint32_t slot)
+static TCGv get_result_gpr(DisasContext *ctx, int rnum)
 {
-TCGv zero = tcg_constant_tl(0);
-TCGv slot_mask = tcg_temp_new();
-
-tcg_gen_andi_tl(slot_mask, hex_slot_cancelled, 1 << slot);
-tcg_gen_movcond_tl(TCG_COND_EQ, hex_new_value[rnum], slot_mask, zero,
-   val, hex_new_value[rnum]);
-if (HEX_DEBUG) {
-/*
- * Do this so HELPER(debug_commit_end) will know
- *
- * Note that slot_mask indicates the value is not written
- * (i.e., slot was cancelled), so we create a true/false value before
- * or'ing with hex_reg_written[rnum].
- */
-tcg_gen_setcond_tl(TCG_COND_EQ, slot_mask, slot_mask, zero);
-tcg_gen_or_tl(hex_reg_written[rnum], hex_reg_written[rnum], slot_mask);
-}
+return hex_new_value[rnum];
+}
 
-tcg_temp_free(slot_mask);
+static TCGv_i64 get_result_gpr_pair(DisasContext *ctx, int rnum)
+{
+TCGv_i64 result = tcg_temp_local_new_i64();
+tcg_gen_concat_i32_i64(result, hex_new_value[rnum],
+   hex_new_value[rnum + 1]);
+return result;
 }
 
 void gen_log_reg_write(int rnum, TCGv val)
@@ -106,42 +95,6 @@ void gen_log_reg_write(int rnum, TCGv val)
 }
 }
 
-static void gen_log_predicated_reg_write_pair(int rnum, TCGv_i64 val,
-  uint32_t slot)
-{
-TCGv val32 = tcg_temp_new();
-TCGv zero = tcg_constant_tl(0);
-TCGv slot_mask = tcg_temp_new();
-
-tcg_gen_andi_tl(slot_mask, hex_slot_cancelled, 1 << slot);
-/* Low word */
-tcg_gen_extrl_i64_i32(val32, val);
-tcg_gen_movcond_tl(TCG_COND_EQ, hex_new_value[rnum],
-   slot_mask, zero,
-   val32, hex_new_value[rnum]);
-/* High word */
-tcg_gen_extrh_i64_i32(val32, val);
-tcg_gen_movcond_tl(TCG_COND_EQ, hex_new_value[rnum + 1],
-   slot_mask, zero,
-   val32, hex_new_value[rnum + 1]);
-if (HEX_DEBUG) {
-/*
- * Do this so HELPER(debug_commit_end) will know
- *
- * Note that slot_mask indicates the value is not written
- * (i.e., slot was cancelled), so we create a true/false value before
- * or'ing with hex_reg_written[rnum].
- */
-tcg_gen_setcond_tl(TCG_COND_EQ, slot_mask, slot_mask, zero);
-tcg_gen_or_tl(hex_reg_written[rnum], hex_reg_written[rnum], slot_mask);
-tcg_gen_or_tl(hex_reg_written[rnum + 1], hex_reg_written[rnum + 1],
-  slot_mask);
-}
-
-

[PATCH v4 10/13] Hexagon (target/hexagon) Change subtract from zero to change sign

2023-01-24 Thread Taylor Simpson
The F2_sffms instruction [r0 -= sfmpy(r1, r2)] doesn't properly
handle -0.  Previously we would negate the input operand by subtracting
from zero.  Instead, we negate by changing the sign bit.

Test case added to tests/tcg/hexagon/fpstuff.c
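The difference between the two ways of negating can be reproduced outside QEMU. The Python sketch below is an illustration under stated assumptions, not the helper itself: the function names are hypothetical, Python floats are doubles rather than float32, and the multiply-add is not fused, which does not matter here because every product in the example is an exact zero.

```python
import struct


def f32_bits(x):
    """Return the IEEE 754 single-precision bit pattern of x."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def negate_by_subtraction(x):
    # Old approach: 0 - x.  For x == +0 this yields +0, not -0.
    return 0.0 - x


def negate_by_sign_flip(x):
    # New approach: toggle only the sign bit, so +0 becomes -0.
    return struct.unpack("<f", struct.pack("<I", f32_bits(x) ^ 0x80000000))[0]


def sffms_model(rx, rs, rt, negate):
    """Rough model of Rx -= sfmpy(Rs, Rt), computed as (-Rs) * Rt + Rx."""
    return negate(rs) * rt + rx
```

With Rx = -0 and Rs = Rt = +0, the sign-flip version keeps the result at -0 (bit pattern 0x80000000), while the subtraction version produces +0.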

Signed-off-by: Taylor Simpson 
---
 target/hexagon/op_helper.c  |  4 ++--
 tests/tcg/hexagon/fpstuff.c | 31 ++-
 2 files changed, 32 insertions(+), 3 deletions(-)

diff --git a/target/hexagon/op_helper.c b/target/hexagon/op_helper.c
index cb43519edf..8ddad35de7 100644
--- a/target/hexagon/op_helper.c
+++ b/target/hexagon/op_helper.c
@@ -1,5 +1,5 @@
 /*
- *  Copyright(c) 2019-2022 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *  Copyright(c) 2019-2023 Qualcomm Innovation Center, Inc. All Rights Reserved.
  *
  *  This program is free software; you can redistribute it and/or modify
  *  it under the terms of the GNU General Public License as published by
@@ -1124,7 +1124,7 @@ float32 HELPER(sffms)(CPUHexagonState *env, float32 RxV,
 {
 float32 neg_RsV;
 arch_fpop_start(env);
-neg_RsV = float32_sub(float32_zero, RsV, &env->fp_status);
+neg_RsV = float32_set_sign(RsV, float32_is_neg(RsV) ? 0 : 1);
 RxV = internal_fmafx(neg_RsV, RtV, RxV, 0, &env->fp_status);
 arch_fpop_end(env);
 return RxV;
diff --git a/tests/tcg/hexagon/fpstuff.c b/tests/tcg/hexagon/fpstuff.c
index 56bf562a40..90ce9a6ef3 100644
--- a/tests/tcg/hexagon/fpstuff.c
+++ b/tests/tcg/hexagon/fpstuff.c
@@ -1,5 +1,5 @@
 /*
- *  Copyright(c) 2020-2022 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *  Copyright(c) 2020-2023 Qualcomm Innovation Center, Inc. All Rights Reserved.
  *
  *  This program is free software; you can redistribute it and/or modify
  *  it under the terms of the GNU General Public License as published by
@@ -40,6 +40,7 @@ const int SF_HEX_NAN =0x;
 const int SF_small_neg =  0xab98fba8;
 const int SF_denorm = 0x0001;
 const int SF_random = 0x346001d6;
+const int SF_neg_zero =   0x80000000;
 
 const long long DF_QNaN = 0x7ff8ULL;
 const long long DF_SNaN = 0x7ff7ULL;
@@ -536,6 +537,33 @@ static void check_sffixupd(void)
 check32(result, 0x146001d6);
 }
 
+static void check_sffms(void)
+{
+int result;
+
+/* Check that sffms properly deals with -0 */
+result = SF_neg_zero;
+asm ("%0 -= sfmpy(%1 , %2)\n\t"
+: "+r"(result)
+: "r"(SF_ZERO), "r"(SF_ZERO)
+: "r12", "r8");
+check32(result, SF_neg_zero);
+
+result = SF_ZERO;
+asm ("%0 -= sfmpy(%1 , %2)\n\t"
+: "+r"(result)
+: "r"(SF_neg_zero), "r"(SF_ZERO)
+: "r12", "r8");
+check32(result, SF_ZERO);
+
+result = SF_ZERO;
+asm ("%0 -= sfmpy(%1 , %2)\n\t"
+: "+r"(result)
+: "r"(SF_ZERO), "r"(SF_neg_zero)
+: "r12", "r8");
+check32(result, SF_ZERO);
+}
+
 static void check_float2int_convs()
 {
 int res32;
@@ -688,6 +716,7 @@ int main()
 check_invsqrta();
 check_sffixupn();
 check_sffixupd();
+check_sffms();
 check_float2int_convs();
 
 puts(err ? "FAIL" : "PASS");
-- 
2.17.1



[PATCH v4 07/13] Hexagon (tests/tcg/hexagon) Update preg_alias.c

2023-01-24 Thread Taylor Simpson
Add control registers (c4, c5) to clobbers list
Made possible by new toolchain container

Signed-off-by: Taylor Simpson 
---
 tests/tcg/hexagon/preg_alias.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/tests/tcg/hexagon/preg_alias.c b/tests/tcg/hexagon/preg_alias.c
index b44a8112b4..8798fbcaf3 100644
--- a/tests/tcg/hexagon/preg_alias.c
+++ b/tests/tcg/hexagon/preg_alias.c
@@ -1,5 +1,5 @@
 /*
- *  Copyright(c) 2019-2022 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *  Copyright(c) 2019-2023 Qualcomm Innovation Center, Inc. All Rights Reserved.
  *
  *  This program is free software; you can redistribute it and/or modify
  *  it under the terms of the GNU General Public License as published by
@@ -65,7 +65,7 @@ static inline void creg_alias(int cval, PRegs *pregs)
   : "=r"(pregs->pregs.p0), "=r"(pregs->pregs.p1),
 "=r"(pregs->pregs.p2), "=r"(pregs->pregs.p3)
   : "r"(cval)
-  : "p0", "p1", "p2", "p3");
+  : "c4", "p0", "p1", "p2", "p3");
 }
 
 int err;
@@ -92,7 +92,7 @@ static inline void creg_alias_pair(unsigned int cval, PRegs *pregs)
: "=r"(pregs->pregs.p0), "=r"(pregs->pregs.p1),
  "=r"(pregs->pregs.p2), "=r"(pregs->pregs.p3), "=r"(c5)
: "r"(cval_pair)
-   : "p0", "p1", "p2", "p3");
+   : "c4", "c5", "p0", "p1", "p2", "p3");
 
   check(c5, 0xdeadbeef);
 }
@@ -117,7 +117,7 @@ static void test_packet(void)
  "}\n\t"
  : "+r"(result)
  : "r"(0x), "r"(0xff00), "r"(0x837ed653)
- : "p0", "p1", "p2", "p3");
+ : "c4", "p0", "p1", "p2", "p3");
 check(result, old_val);
 
 /* Test a predicated store */
@@ -129,7 +129,7 @@ static void test_packet(void)
  "}\n\t"
  :
  : "r"(0), "r"(0x), "r"()
- : "p0", "p1", "p2", "p3", "memory");
+ : "c4", "p0", "p1", "p2", "p3", "memory");
 check(result, 0x0);
 }
 
-- 
2.17.1



[PULL 4/7] python/qmp: increase read buffer size

2023-01-24 Thread John Snow
From: Maksim Davydov 

The current 256KB limit is not enough for some real cases. As a possible
solution, the limit can be set to the same value libvirt uses (10MB).
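The effect of a read buffer limit can be seen with plain asyncio. The sketch below assumes that the limit ends up as the `limit` of an `asyncio.StreamReader`, which is how asyncio stream limits generally work; the helper name line_fits is hypothetical and the sizes are scaled down for illustration.

```python
import asyncio


async def _read_one_line(payload: bytes, limit: int) -> bytes:
    # Create the reader inside the running loop and feed it a canned reply.
    reader = asyncio.StreamReader(limit=limit)
    reader.feed_data(payload)
    reader.feed_eof()
    return await reader.readline()


def line_fits(payload: bytes, limit: int) -> bool:
    """Return True if a newline-terminated payload can be read under limit."""
    try:
        asyncio.run(_read_one_line(payload, limit))
    except ValueError:
        # readline() raises ValueError when the line exceeds the limit.
        return False
    return True
```

A single oversized line is enough to fail the read, so the limit has to cover the largest reply expected from the server.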

Signed-off-by: Maksim Davydov 
Reviewed-by: John Snow 
Message-id: 20230112152805.33109-3-davydov-...@yandex-team.ru
Signed-off-by: John Snow 
---
 python/qemu/qmp/qmp_client.py | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/python/qemu/qmp/qmp_client.py b/python/qemu/qmp/qmp_client.py
index 5dcda04a756..b5772e7f32b 100644
--- a/python/qemu/qmp/qmp_client.py
+++ b/python/qemu/qmp/qmp_client.py
@@ -197,8 +197,8 @@ async def run(self, address='/tmp/qemu.socket'):
 #: Logger object used for debugging messages.
 logger = logging.getLogger(__name__)
 
-# Read buffer limit; large enough to accept query-qmp-schema
-_limit = (256 * 1024)
+# Read buffer limit; 10MB like libvirt default
+_limit = (10 * 1024 * 1024)
 
 # Type alias for pending execute() result items
 _PendingT = Union[Message, ExecInterruptedError]
-- 
2.39.0




[PULL 7/7] python/qemu/machine: use socketpair() for QMP by default

2023-01-24 Thread John Snow
From: Marc-André Lureau 

When no monitor address is given, establish the QMP communication through
a socketpair(). (The API is also supported on Windows since Python 3.5.)
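For readers unfamiliar with socketpair(), here is a minimal sketch of the mechanism using only the standard library. The function name socketpair_roundtrip is hypothetical; in the patch one end's file descriptor is handed to QEMU and the other end is kept by the Python side, whereas this example keeps both ends in one process.

```python
import socket


def socketpair_roundtrip(message: bytes) -> bytes:
    """Send a short message through a connected pair of sockets."""
    # socketpair() returns two already-connected sockets; no path or port
    # has to be chosen, so nothing needs cleaning up on the filesystem.
    near_end, far_end = socket.socketpair()
    try:
        far_end.sendall(message)
        return near_end.recv(len(message))
    finally:
        near_end.close()
        far_end.close()
```

Avoiding a named socket path also sidesteps path-length limits for AF_UNIX sockets.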

Signed-off-by: Marc-André Lureau 
Reviewed-by: Daniel P. Berrangé 
Message-id: 20230111080101.969151-4-marcandre.lur...@redhat.com
[Resolved conflicts, fixed typing error. --js]
Signed-off-by: John Snow 
---
 python/qemu/machine/machine.py | 25 +
 1 file changed, 17 insertions(+), 8 deletions(-)

diff --git a/python/qemu/machine/machine.py b/python/qemu/machine/machine.py
index a71d87ead40..e57c2544842 100644
--- a/python/qemu/machine/machine.py
+++ b/python/qemu/machine/machine.py
@@ -158,17 +158,13 @@ def __init__(self,
 self._qmp_timer = qmp_timer
 
 self._name = name or f"{id(self):x}"
+self._sock_pair: Optional[Tuple[socket.socket, socket.socket]] = None
 self._temp_dir: Optional[str] = None
 self._base_temp_dir = base_temp_dir
 self._sock_dir = sock_dir
 self._log_dir = log_dir
 
-if monitor_address is not None:
-self._monitor_address = monitor_address
-else:
-self._monitor_address = os.path.join(
-self.sock_dir, f"{self._name}.qmp"
-)
+self._monitor_address = monitor_address
 
 self._console_log_path = console_log
 if self._console_log_path:
@@ -303,7 +299,11 @@ def _base_args(self) -> List[str]:
 args = ['-display', 'none', '-vga', 'none']
 
 if self._qmp_set:
-if isinstance(self._monitor_address, tuple):
+if self._sock_pair:
+fd = self._sock_pair[0].fileno()
+os.set_inheritable(fd, True)
+moncdev = f"socket,id=mon,fd={fd}"
+elif isinstance(self._monitor_address, tuple):
 moncdev = "socket,id=mon,host={},port={}".format(
 *self._monitor_address
 )
@@ -337,10 +337,17 @@ def _pre_launch(self) -> None:
 self._remove_files.append(self._console_address)
 
 if self._qmp_set:
+monitor_address = None
+sock = None
+if self._monitor_address is None:
+self._sock_pair = socket.socketpair()
+sock = self._sock_pair[1]
 if isinstance(self._monitor_address, str):
 self._remove_files.append(self._monitor_address)
+monitor_address = self._monitor_address
 self._qmp_connection = QEMUMonitorProtocol(
-self._monitor_address,
+address=monitor_address,
+sock=sock,
 server=True,
 nickname=self._name
 )
@@ -360,6 +367,8 @@ def _pre_launch(self) -> None:
 ))
 
 def _post_launch(self) -> None:
+if self._sock_pair:
+self._sock_pair[0].close()
 if self._qmp_connection:
 self._qmp.accept(self._qmp_timer)
 
-- 
2.39.0




[PULL 2/7] python: QEMUMachine: enable qmp accept timeout by default

2023-01-24 Thread John Snow
From: Vladimir Sementsov-Ogievskiy 

I spent a lot of time trying to debug a hanging pipeline in GitLab. I
started from the idea that the problem was in the code of my series (which
has some timeouts). Finally I found that the problem was that I had used
the QEMUMachine class directly to avoid qtest, and didn't add the necessary
arguments. QEMU fails and we wait for the QMP accept endlessly. In GitLab
the job is just stopped by a timeout (one hour) with no sign of what went
wrong.

With the timeout enabled, GitLab doesn't wait for an hour and prints all
the needed information.
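The failure mode is easy to reproduce in miniature. The following sketch uses a future that never completes as a stand-in for a peer that never connects; accept_with_timeout is a hypothetical name and is not part of the QEMUMachine API.

```python
import asyncio


async def accept_with_timeout(timeout):
    """Wait for a connection that never arrives, bounded by timeout."""
    never_connects = asyncio.get_running_loop().create_future()
    try:
        await asyncio.wait_for(never_connects, timeout)
    except asyncio.TimeoutError:
        # With a finite timeout the caller regains control and can report
        # what went wrong instead of hanging until the CI job is killed.
        return "timed out"
    return "connected"
```

Passing None as the timeout to asyncio.wait_for waits indefinitely, which corresponds to the behaviour of the old default.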

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: John Snow 
Message-Id: <20220624195252.175249-1-vsement...@yandex-team.ru>
[Fixed typing. --js]
Signed-off-by: John Snow 
---
 python/qemu/machine/machine.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/python/qemu/machine/machine.py b/python/qemu/machine/machine.py
index 748a0d807c9..c759db03e43 100644
--- a/python/qemu/machine/machine.py
+++ b/python/qemu/machine/machine.py
@@ -131,7 +131,7 @@ def __init__(self,
  drain_console: bool = False,
  console_log: Optional[str] = None,
  log_dir: Optional[str] = None,
- qmp_timer: Optional[float] = None):
+ qmp_timer: Optional[float] = 30):
 '''
 Initialize a QEMUMachine
 
-- 
2.39.0




[PULL 6/7] python/qmp/legacy: make QEMUMonitorProtocol accept a socket

2023-01-24 Thread John Snow
From: Marc-André Lureau 

Teach QEMUMonitorProtocol to accept an existing socket.

Signed-off-by: Marc-André Lureau 
Reviewed-by: Daniel P. Berrangé 
Message-id: 20230111080101.969151-3-marcandre.lur...@redhat.com
Signed-off-by: John Snow 
---
 python/qemu/qmp/legacy.py | 18 +++---
 1 file changed, 15 insertions(+), 3 deletions(-)

diff --git a/python/qemu/qmp/legacy.py b/python/qemu/qmp/legacy.py
index 1951754455a..8b09ee7dbb5 100644
--- a/python/qemu/qmp/legacy.py
+++ b/python/qemu/qmp/legacy.py
@@ -22,6 +22,7 @@
 #
 
 import asyncio
+import socket
 from types import TracebackType
 from typing import (
 Any,
@@ -69,22 +70,32 @@ class QEMUMonitorProtocol:
 
 :param address:  QEMU address, can be either a unix socket path (string)
  or a tuple in the form ( address, port ) for a TCP
- connection
+ connection or None
+:param sock: a socket or None
 :param server:   Act as the socket server. (See 'accept')
 :param nickname: Optional nickname used for logging.
 """
 
-def __init__(self, address: SocketAddrT,
+def __init__(self,
+ address: Optional[SocketAddrT] = None,
+ sock: Optional[socket.socket] = None,
  server: bool = False,
  nickname: Optional[str] = None):
 
+assert address or sock
 self._qmp = QMPClient(nickname)
 self._aloop = asyncio.get_event_loop()
 self._address = address
+self._sock = sock
 self._timeout: Optional[float] = None
 
 if server:
-self._sync(self._qmp.start_server(self._address))
+if sock:
+assert self._sock is not None
+self._sync(self._qmp.open_with_socket(self._sock))
+else:
+assert self._address is not None
+self._sync(self._qmp.start_server(self._address))
 
 _T = TypeVar('_T')
 
@@ -139,6 +150,7 @@ def connect(self, negotiate: bool = True) -> Optional[QMPMessage]:
 :return: QMP greeting dict, or None if negotiate is false
 :raise ConnectError: on connection errors
 """
+assert self._address is not None
 self._qmp.await_greeting = negotiate
 self._qmp.negotiate = negotiate
 
-- 
2.39.0




[PULL 5/7] python/qmp/protocol: add open_with_socket()

2023-01-24 Thread John Snow
From: Marc-André Lureau 

Instead of listening for incoming connections with a SocketAddr, add a
new method open_with_socket() that accepts an existing socket.
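The underlying asyncio call can be tried in isolation. This sketch wraps one end of a socketpair() with asyncio.open_connection(sock=...), the same call the new method uses, and reads a line sent from the other end. The function name and the greeting payload are made up for the example.

```python
import asyncio
import socket


async def read_greeting_over_socket(greeting: bytes) -> bytes:
    """Wrap an already-connected socket in asyncio streams and read a line."""
    ours, theirs = socket.socketpair()
    # No connect or listen step: asyncio takes over the connected socket.
    reader, writer = await asyncio.open_connection(sock=ours)
    try:
        theirs.sendall(greeting)
        theirs.close()           # EOF ends the read even without a newline
        return await reader.readline()
    finally:
        writer.close()
        await writer.wait_closed()
```

Since the socket is already connected, there is nothing to accept at the transport level, which matches the change to accept() that skips _do_accept() when a reader already exists.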

Signed-off-by: Marc-André Lureau 
Reviewed-by: Daniel P. Berrangé 
Message-id: 20230111080101.969151-2-marcandre.lur...@redhat.com
Signed-off-by: John Snow 
---
 python/qemu/qmp/protocol.py | 25 -
 1 file changed, 20 insertions(+), 5 deletions(-)

diff --git a/python/qemu/qmp/protocol.py b/python/qemu/qmp/protocol.py
index 15909b7dbad..6d3d739daa7 100644
--- a/python/qemu/qmp/protocol.py
+++ b/python/qemu/qmp/protocol.py
@@ -18,6 +18,7 @@
 from enum import Enum
 from functools import wraps
 import logging
+import socket
 from ssl import SSLContext
 from typing import (
 Any,
@@ -296,6 +297,19 @@ async def start_server_and_accept(
 await self.accept()
 assert self.runstate == Runstate.RUNNING
 
+@upper_half
+@require(Runstate.IDLE)
+async def open_with_socket(self, sock: socket.socket) -> None:
+"""
+Start connection with given socket.
+
+:param sock: A socket.
+
+:raise StateError: When the `Runstate` is not `IDLE`.
+"""
+self._reader, self._writer = await asyncio.open_connection(sock=sock)
+self._set_state(Runstate.CONNECTING)
+
 @upper_half
 @require(Runstate.IDLE)
 async def start_server(self, address: SocketAddrT,
@@ -343,11 +357,12 @@ async def accept(self) -> None:
 protocol-level failure occurs while establishing a new
 session, the wrapped error may also be an `QMPError`.
 """
-if self._accepted is None:
-raise QMPError("Cannot call accept() before start_server().")
-await self._session_guard(
-self._do_accept(),
-'Failed to establish connection')
+if not self._reader:
+if self._accepted is None:
+raise QMPError("Cannot call accept() before start_server().")
+await self._session_guard(
+self._do_accept(),
+'Failed to establish connection')
 await self._session_guard(
 self._establish_session(),
 'Failed to establish session')
-- 
2.39.0
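
For readers of the open_with_socket() patch above, here is a minimal
sketch of the underlying mechanism: wrapping an already-connected socket
in asyncio streams via asyncio.open_connection(sock=...), the same call
the new method uses. The socketpair() peer, the greeting bytes and the
function names are assumptions made for illustration; this is not the
QMPClient code itself.

```python
import asyncio
import socket


async def read_greeting_over_socketpair() -> bytes:
    # Two already-connected sockets; no filesystem path is involved.
    ours, theirs = socket.socketpair()
    try:
        # Wrap our end in asyncio streams instead of listening on an address.
        reader, writer = await asyncio.open_connection(sock=ours)
        # Stand-in for the peer (QEMU in the real case): send one line.
        theirs.sendall(b'{"QMP": {}}\n')
        line = await reader.readline()
        # Closing the writer also closes the wrapped socket.
        writer.close()
        await writer.wait_closed()
        return line
    finally:
        theirs.close()


def run() -> bytes:
    """Synchronous entry point for the sketch."""
    return asyncio.run(read_greeting_over_socketpair())
```

Because both ends exist before any stream is created, there is no
listen/accept step, which is presumably why accept() in the patch skips
_do_accept() when a reader is already set.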




[PULL 3/7] python/machine: Fix AF_UNIX path too long on macOS

2023-01-24 Thread John Snow
From: Peter Delevoryas 

On macOS, private $TMPDIR's are the default. These $TMPDIR's are
generated from a user's unix UID and UUID [1], which can create a
relatively long path:

/var/folders/d7/rz20f6hd709c1ty8f6_6y_z4gn/T/

QEMU's avocado tests create a temporary directory prefixed by
"avo_qemu_sock_", and create QMP sockets within _that_ as well.
The QMP socket is unnecessarily long, because a temporary directory
is created for every QEMUMachine object.

/avo_qemu_sock_uh3w_dgc/qemu-37331-10bacf110-monitor.sock

The path limit for unix sockets on macOS is 104: [2]

/*
 * [XSI] Definitions for UNIX IPC domain.
 */
struct  sockaddr_un {
unsigned char   sun_len;/* sockaddr len including null */
sa_family_t sun_family; /* [XSI] AF_UNIX */
charsun_path[104];  /* [XSI] path name (gag) */
};

This results in avocado tests failing on macOS because the QMP unix
socket can't be created, because the path is too long:

ERROR| Failed to establish connection: OSError: AF_UNIX path too long

This change resolves this by reducing the size of the socket directory
prefix and the suffix on the QMP and console socket names.

The result is paths like this:

pdel@pdel-mbp:/var/folders/d7/rz20f6hd709c1ty8f6_6y_z4gn/T
$ tree qemu*
qemu_df4evjeq
qemu_jbxel3gy
qemu_ml9s_gg7
qemu_oc7h7f3u
qemu_oqb1yf97
├── 10a004050.con
└── 10a004050.qmp

[1] 
https://apple.stackexchange.com/questions/353832/why-is-mac-osx-temp-directory-in-weird-path
[2] /Library/Developer/CommandLineTools/SDKs/MacOSX12.3.sdk/usr/include/sys/un.h

Signed-off-by: Peter Delevoryas 
Reviewed-by: Daniel P. Berrangé 
Reviewed-by: Philippe Mathieu-Daudé 
Message-id: 20230110082930.42129-2-pe...@pjd.dev
Signed-off-by: John Snow 
---
 python/qemu/machine/machine.py | 6 +++---
 tests/avocado/avocado_qemu/__init__.py | 2 +-
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/python/qemu/machine/machine.py b/python/qemu/machine/machine.py
index c759db03e43..a71d87ead40 100644
--- a/python/qemu/machine/machine.py
+++ b/python/qemu/machine/machine.py
@@ -157,7 +157,7 @@ def __init__(self,
 self._wrapper = wrapper
 self._qmp_timer = qmp_timer
 
-self._name = name or f"qemu-{os.getpid()}-{id(self):02x}"
+self._name = name or f"{id(self):x}"
 self._temp_dir: Optional[str] = None
 self._base_temp_dir = base_temp_dir
 self._sock_dir = sock_dir
@@ -167,7 +167,7 @@ def __init__(self,
 self._monitor_address = monitor_address
 else:
 self._monitor_address = os.path.join(
-self.sock_dir, f"{self._name}-monitor.sock"
+self.sock_dir, f"{self._name}.qmp"
 )
 
 self._console_log_path = console_log
@@ -192,7 +192,7 @@ def __init__(self,
 self._console_set = False
 self._console_device_type: Optional[str] = None
 self._console_address = os.path.join(
-self.sock_dir, f"{self._name}-console.sock"
+self.sock_dir, f"{self._name}.con"
 )
 self._console_socket: Optional[socket.socket] = None
 self._remove_files: List[str] = []
diff --git a/tests/avocado/avocado_qemu/__init__.py b/tests/avocado/avocado_qemu/__init__.py
index 910f3ba1eab..25a546842fa 100644
--- a/tests/avocado/avocado_qemu/__init__.py
+++ b/tests/avocado/avocado_qemu/__init__.py
@@ -306,7 +306,7 @@ def require_netdev(self, netdevname):
 self.cancel('no support for user networking')
 
 def _new_vm(self, name, *args):
-self._sd = tempfile.TemporaryDirectory(prefix="avo_qemu_sock_")
+self._sd = tempfile.TemporaryDirectory(prefix="qemu_")
 vm = QEMUMachine(self.qemu_bin, base_temp_dir=self.workdir,
  sock_dir=self._sd.name, log_dir=self.logdir)
 self.log.debug('QEMUMachine "%s" created', name)
-- 
2.39.0
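
The length budget discussed in the AF_UNIX patch above can be sketched
as follows. The limit of 104 comes from the sockaddr_un definition
quoted in the commit message; the helper names, and the assumption that
the terminating NUL byte must also fit in sun_path, are illustrative
and not part of QEMU.

```python
import os

# Size of sun_path in the macOS header quoted in the commit message.
MACOS_SUN_PATH = 104


def fits_sun_path(path: str, limit: int = MACOS_SUN_PATH) -> bool:
    """Return True if the encoded path plus a trailing NUL fits in sun_path."""
    return len(os.fsencode(path)) + 1 <= limit


def socket_names(obj_id: int, pid: int) -> tuple:
    """Build the old and new QMP socket file names from the diff above."""
    old = f"qemu-{pid}-{obj_id:02x}-monitor.sock"
    new = f"{obj_id:x}.qmp"
    return old, new
```

With the identifiers from the commit message, socket_names(0x10bacf110,
37331) gives "qemu-37331-10bacf110-monitor.sock" and "10bacf110.qmp",
so the file name alone shrinks by 20 characters before the shorter
directory prefix is counted.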




[PULL 1/7] Fix some typos

2023-01-24 Thread John Snow
From: Dongdong Zhang 

Fix some typos in 'python' directory.

Signed-off-by: Dongdong Zhang 
Reviewed-by: Philippe Mathieu-Daudé 
Message-id: 20221130015358.6998-2-zhangdongd...@eswincomputing.com
[Fixed additional typo spotted by Max Filippov. --js]
Reviewed-by: John Snow 
Signed-off-by: John Snow 
---
 python/qemu/machine/console_socket.py | 2 +-
 python/qemu/machine/qtest.py  | 2 +-
 python/qemu/qmp/protocol.py   | 2 +-
 python/qemu/qmp/qmp_tui.py| 6 +++---
 4 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/python/qemu/machine/console_socket.py b/python/qemu/machine/console_socket.py
index 8c4ff598ad7..4e28ba9bb23 100644
--- a/python/qemu/machine/console_socket.py
+++ b/python/qemu/machine/console_socket.py
@@ -68,7 +68,7 @@ def _thread_start(self) -> threading.Thread:
 """Kick off a thread to drain the socket."""
 # Configure socket to not block and timeout.
 # This allows our drain thread to not block
-# on recieve and exit smoothly.
+# on receive and exit smoothly.
 socket.socket.setblocking(self, False)
 socket.socket.settimeout(self, 1)
 drain_thread = threading.Thread(target=self._drain_fn)
diff --git a/python/qemu/machine/qtest.py b/python/qemu/machine/qtest.py
index 1a1fc6c9b08..1c46138bd0c 100644
--- a/python/qemu/machine/qtest.py
+++ b/python/qemu/machine/qtest.py
@@ -42,7 +42,7 @@ class QEMUQtestProtocol:
 :raise socket.error: on socket connection errors
 
 .. note::
-   No conection is estabalished by __init__(), this is done
+   No connection is established by __init__(), this is done
by the connect() or accept() methods.
 """
 def __init__(self, address: SocketAddrT,
diff --git a/python/qemu/qmp/protocol.py b/python/qemu/qmp/protocol.py
index 6ea86650ad2..15909b7dbad 100644
--- a/python/qemu/qmp/protocol.py
+++ b/python/qemu/qmp/protocol.py
@@ -812,7 +812,7 @@ async def _bh_flush_writer(self) -> None:
 
 @bottom_half
 async def _bh_close_stream(self, error_pathway: bool = False) -> None:
-# NB: Closing the writer also implcitly closes the reader.
+# NB: Closing the writer also implicitly closes the reader.
 if not self._writer:
 return
 
diff --git a/python/qemu/qmp/qmp_tui.py b/python/qemu/qmp/qmp_tui.py
index ce239d8979b..83691447231 100644
--- a/python/qemu/qmp/qmp_tui.py
+++ b/python/qemu/qmp/qmp_tui.py
@@ -71,7 +71,7 @@ def format_json(msg: str) -> str:
 due to an decoding error then a simple string manipulation is done to
 achieve a single line JSON string.
 
-Converting into single line is more asthetically pleasing when looking
+Converting into single line is more aesthetically pleasing when looking
 along with error messages.
 
 Eg:
@@ -91,7 +91,7 @@ def format_json(msg: str) -> str:
 
 [1, true, 3]: QMP message is not a JSON object.
 
-The single line mode is more asthetically pleasing.
+The single line mode is more aesthetically pleasing.
 
 :param msg:
 The message to formatted into single line.
@@ -498,7 +498,7 @@ def __init__(self, parent: App) -> None:
 class HistoryBox(urwid.ListBox):
 """
 This widget is modelled using the ListBox widget, contains the list of
-all messages both QMP messages and log messsages to be shown in the TUI.
+all messages both QMP messages and log messages to be shown in the TUI.
 
 The messages are urwid.Text widgets. On every append of a message, the
 focus is shifted to the last appended message.
-- 
2.39.0




[PULL 0/7] Python patches

2023-01-24 Thread John Snow
The following changes since commit 13356edb87506c148b163b8c7eb0695647d00c2a:

  Merge tag 'block-pull-request' of https://gitlab.com/stefanha/qemu into staging (2023-01-24 09:45:33 +)

are available in the Git repository at:

  https://gitlab.com/jsnow/qemu.git tags/python-pull-request

for you to fetch changes up to bd4c0ef409140bd1be393407c04005ac077d4574:

  python/qemu/machine: use socketpair() for QMP by default (2023-01-24 13:37:13 -0500)


Python

Bits and pieces, kibbles'n'bits



Dongdong Zhang (1):
  Fix some typos

Maksim Davydov (1):
  python/qmp: increase read buffer size

Marc-André Lureau (3):
  python/qmp/protocol: add open_with_socket()
  python/qmp/legacy: make QEMUMonitorProtocol accept a socket
  python/qemu/machine: use socketpair() for QMP by default

Peter Delevoryas (1):
  python/machine: Fix AF_UNIX path too long on macOS

Vladimir Sementsov-Ogievskiy (1):
  python: QEMUMachine: enable qmp accept timeout by default

 python/qemu/machine/console_socket.py  |  2 +-
 python/qemu/machine/machine.py | 31 +-
 python/qemu/machine/qtest.py   |  2 +-
 python/qemu/qmp/legacy.py  | 18 ---
 python/qemu/qmp/protocol.py| 27 +-
 python/qemu/qmp/qmp_client.py  |  4 ++--
 python/qemu/qmp/qmp_tui.py |  6 ++---
 tests/avocado/avocado_qemu/__init__.py |  2 +-
 8 files changed, 64 insertions(+), 28 deletions(-)

-- 
2.39.0





Re: [PATCH 5.10 00/98] 5.10.165-rc2 review

2023-01-24 Thread Naresh Kamboju
+ qemu-devel

On Tue, 24 Jan 2023 at 15:22, Naresh Kamboju  wrote:
>
> On Mon, 23 Jan 2023 at 15:22, Greg Kroah-Hartman
>  wrote:
> >
> > This is the start of the stable review cycle for the 5.10.165 release.
> > There are 98 patches in this series, all will be posted as a response
> > to this one.  If anyone has any issues with these being applied, please
> > let me know.
> >
> > Responses should be made by Wed, 25 Jan 2023 09:48:53 +.
> > Anything received after that time might be too late.
> >
> > The whole patch series can be found in one patch at:
> > 
> > https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.10.165-rc2.gz
> > or in the git tree and branch at:
> > 
> > git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git 
> > linux-5.10.y
> > and the diffstat can be found below.
> >
> > thanks,
> >
> > greg k-h
>
>
> Results from Linaro’s test farm.
> Regressions found on arm64 for both 5.15.90-rc2 and 5.10.165-rc2.
>
> * qemu-arm64-mte, kselftest-arm64
>   - arm64_check_buffer_fill
>   - arm64_check_child_memory
>   - arm64_check_ksm_options
>   - arm64_check_mmap_options
>   - arm64_check_tags_inclusion
>
> Reported-by: Linux Kernel Functional Testing 
>
> We are in a process to bisect this problem and there are updates coming
> from kselftest rootfs.

Here are some interesting findings.

The bisect process was not successful.

I investigated the infrastructure changes and found that qemu was updated
to 7.2. So either:
 1. qemu version 7.2 caused these regressions,
 or, to put it another way,
 2. qemu version 7.2 has better coverage, so it finds more regressions.

With the old qemu version 7.1, the selftests/arm64 tests passed.

lava-dispatcher, installed at version: 2022.11.1
qemu-system-arm, installed at version: 1:7.1+dfsg-2~bpo11+3, host
architecture: arm64


With the new qemu version 7.2, the selftests/arm64 tests failed.

* qemu-arm64-mte, kselftest-arm64 failed tests,
  - arm64_check_buffer_fill
  - arm64_check_child_memory
  - arm64_check_ksm_options
  - arm64_check_mmap_options
  - arm64_check_tags_inclusion

lava-dispatcher, installed at version: 2023.01
qemu-system-arm, installed at version: 1:7.2+dfsg-1~bpo11+2, host
architecture: arm64

With reference to my previous emails,
This is not a kernel regression on stable-rc 6.1, 5.15, and 5.10.

-- 
Looking into the 7.2 release blog post, I found the following highlights:

QEMU version 7.2.0 released
14 DEC 2022
We’d like to announce the availability of the QEMU 7.2.0 release. This
release contains 1800+ commits from 205 authors.

You can grab the tarball from our download page. The full list of
changes are available in the Wiki.

Highlights include:

* ARM: emulation support for the following CPU features: Enhanced
Translation Synchronization, PMU Extensions v3.5, Guest Translation
Granule size, Hardware management of access flag/dirty bit state, and
Preventing EL0 access to halves of address maps
* ARM: emulation support for Cortex-A35 CPUs
* LoongArch: support for fw_cfg DMA functionality, memory hotplug, and
TPM device emulation
* OpenRISC: support for multi-threaded TCG, stability improvements,
and new ‘virt’ machine type for CI/device testing.
* RISC-V: ‘virt’ machine support for booting S-mode firmware from
pflash, and general device tree improvements
* s390x: support for Message-Security-Assist Extension 5 (RNG via PRNO
instruction), SHA-512 via KIMD/KLMD instructions, and enhanced zPCI
interpretation support for KVM guests
* x86: TCG performance improvements, including SSE
* x86: TCG support for AVX, AVX2, F16C, FMA3, and VAES instructions
* x86: KVM support for “notify vmexit” mechanism to prevent processor
bugs from hanging whole system
* LUKS block device headers are validated more strictly, creating LUKS
images is supported on macOS
* Memory backends now support NUMA-awareness when preallocating memory
and lots more…
 - https://www.qemu.org/blog/




>
> Test logs,
> # selftests: arm64: check_buffer_fill
> # 1..20
> # ok 1 Check buffer correctness by byte with sync err mode and mmap memory
> # ok 2 Check buffer correctness by byte with async err mode and mmap memory
> # ok 3 Check buffer correctness by byte with sync err mode and
> mmap/mprotect memory
> # ok 4 Check buffer correctness by byte with async err mode and
> mmap/mprotect memory
> # not ok 5 Check buffer write underflow by byte with sync mode and mmap memory
> # not ok 6 Check buffer write underflow by byte with async mode and mmap 
> memory
> # ok 7 Check buffer write underflow by byte with tag check fault
> ignore and mmap memory
> # ok 8 Check buffer write underflow by byte with sync mode and mmap memory
> # ok 9 Check buffer write underflow by byte with async mode and mmap memory
> # ok 10 Check buffer write underflow by byte with tag check fault
> ignore and mmap memory
> # not ok 11 Check buffer write overflow by byte with sync mode and mmap memory
> # not ok 12 Check buffer write overflow by byte with async mode and mmap 
> memory
> # ok 13 Check buffer write 

Re: [PATCH 1/1] modules: load modules from /var/run/qemu/ directory firstly

2023-01-24 Thread Philippe Mathieu-Daudé

On 24/1/23 19:39, Siddhi Katage wrote:

From: Siddhi Katage 

An old running QEMU will try to load modules with new build-id first, this
will fail as expected, then QEMU will fallback to load the old modules that


You corrected the comma/space typo :)


matches its build-id from /var/run/qemu/ directory.
Make /var/run/qemu/ directory as first search path to load modules.

Fixes: bd83c861c0 ("modules: load modules from versioned /var/run dir")
Signed-off-by: Siddhi Katage 
---
  util/module.c | 10 +-
  1 file changed, 5 insertions(+), 5 deletions(-)


Reviewed-by: Philippe Mathieu-Daudé 




Re: [PATCH v10 0/9] KVM: mm: fd-based approach for supporting KVM

2023-01-24 Thread Sean Christopherson
On Tue, Jan 24, 2023, Liam Merwick wrote:
> On 14/01/2023 00:37, Sean Christopherson wrote:
> > On Fri, Dec 02, 2022, Chao Peng wrote:
> > > This patch series implements KVM guest private memory for confidential
> > > computing scenarios like Intel TDX[1]. If a TDX host accesses
> > > TDX-protected guest memory, machine check can happen which can further
> > > crash the running host system, this is terrible for multi-tenant
> > > configurations. The host accesses include those from KVM userspace like
> > > QEMU. This series addresses KVM userspace induced crash by introducing
> > > new mm and KVM interfaces so KVM userspace can still manage guest memory
> > > via a fd-based approach, but it can never access the guest memory
> > > content.
> > > 
> > > The patch series touches both core mm and KVM code. I appreciate
> > > Andrew/Hugh and Paolo/Sean can review and pick these patches. Any other
> > > reviews are always welcome.
> > >- 01: mm change, target for mm tree
> > >- 02-09: KVM change, target for KVM tree
> > 
> > A version with all of my feedback, plus reworked versions of Vishal's 
> > selftest,
> > is available here:
> > 
> >g...@github.com:sean-jc/linux.git x86/upm_base_support
> > 
> > It compiles and passes the selftest, but it's otherwise barely tested.  
> > There are
> > a few todos (2 I think?) and many of the commits need changelogs, i.e. it's 
> > still
> > a WIP.
> > 
> 
> When running LTP (https://github.com/linux-test-project/ltp) on the v10
> bits (and also with Sean's branch above) I encounter the following NULL
> pointer dereference with testcases/kernel/syscalls/madvise/madvise01
> (100% reproducible).
> 
> It appears that in restrictedmem_error_page() inode->i_mapping->private_data
> is NULL
> in the list_for_each_entry_safe(inode, next, &sb->s_inodes, i_sb_list)
> but I don't know why.

Kirill, can you take a look?  Or pass the buck to someone who can? :-)



Re: [PATCH RFC 02/21] util: Include osdep.h first in util/mmap-alloc.c

2023-01-24 Thread Philippe Mathieu-Daudé

On 17/1/23 23:08, Peter Xu wrote:

Without it, we never have CONFIG_LINUX defined even if on linux, so
linux/mman.h is never really included.

Signed-off-by: Peter Xu 
---
  util/mmap-alloc.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)


Reviewed-by: Philippe Mathieu-Daudé 




Re: [PATCH] hw/arm: Use TYPE_ARM_SMMUV3

2023-01-24 Thread Philippe Mathieu-Daudé

On 25/1/23 00:20, Richard Henderson wrote:

Use the macro instead of two explicit string literals.

Signed-off-by: Richard Henderson 
---
  hw/arm/sbsa-ref.c | 3 ++-
  hw/arm/virt.c | 2 +-
  2 files changed, 3 insertions(+), 2 deletions(-)


Reviewed-by: Philippe Mathieu-Daudé 




Re: [PATCH v4 13/36] tcg: Add temp allocation for TCGv_i128

2023-01-24 Thread Philippe Mathieu-Daudé

On 8/1/23 03:36, Richard Henderson wrote:

This enables allocation of i128.  The type is not yet
usable, as we have not yet added data movement ops.

Signed-off-by: Richard Henderson 
---
  include/tcg/tcg.h | 32 +
  tcg/tcg.c | 60 +--
  2 files changed, 74 insertions(+), 18 deletions(-)


Reviewed-by: Philippe Mathieu-Daudé 




Re: [PATCH v4 08/36] include/qemu/int128: Use Int128 structure for TCI

2023-01-24 Thread Philippe Mathieu-Daudé

On 8/1/23 03:36, Richard Henderson wrote:

We are about to allow passing Int128 to/from tcg helper functions,
but libffi doesn't support __int128_t, so use the structure.

In order for atomic128.h to continue working, we must provide
a mechanism to frob between real __int128_t and the structure.
Provide a new union, Int128Alias, for this.  We cannot modify
Int128 itself, as any changed alignment would also break libffi.

Signed-off-by: Richard Henderson 
---
  include/qemu/atomic128.h | 29 +--
  include/qemu/int128.h| 25 +---
  util/int128.c| 42 
  3 files changed, 87 insertions(+), 9 deletions(-)


Reviewed-by: Philippe Mathieu-Daudé 





Re: [PATCH v4 01/36] tcg: Define TCG_TYPE_I128 and related helper macros

2023-01-24 Thread Philippe Mathieu-Daudé

On 8/1/23 03:36, Richard Henderson wrote:

Begin staging in support for TCGv_i128 with Int128.
Define the type enumerator, the typedef, and the
helper-head.h macros.

This cannot yet be used, because you can't allocate
temporaries of this new type.

Signed-off-by: Richard Henderson 
---
  include/exec/helper-head.h |  7 +++
  include/tcg/tcg.h  | 17 ++---
  2 files changed, 17 insertions(+), 7 deletions(-)


Reviewed-by: Philippe Mathieu-Daudé 




[PATCH] hw/arm: Use TYPE_ARM_SMMUV3

2023-01-24 Thread Richard Henderson
Use the macro instead of two explicit string literals.

Signed-off-by: Richard Henderson 
---
 hw/arm/sbsa-ref.c | 3 ++-
 hw/arm/virt.c | 2 +-
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/hw/arm/sbsa-ref.c b/hw/arm/sbsa-ref.c
index 4bb444684f..8378441dbb 100644
--- a/hw/arm/sbsa-ref.c
+++ b/hw/arm/sbsa-ref.c
@@ -29,6 +29,7 @@
 #include "exec/hwaddr.h"
 #include "kvm_arm.h"
 #include "hw/arm/boot.h"
+#include "hw/arm/smmuv3.h"
 #include "hw/block/flash.h"
 #include "hw/boards.h"
 #include "hw/ide/internal.h"
@@ -574,7 +575,7 @@ static void create_smmu(const SBSAMachineState *sms, PCIBus *bus)
 DeviceState *dev;
 int i;
 
-dev = qdev_new("arm-smmuv3");
+dev = qdev_new(TYPE_ARM_SMMUV3);
 
 object_property_set_link(OBJECT(dev), "primary-bus", OBJECT(bus),
   &error_abort);
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 5f1fddd210..d103de8c2e 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -1344,7 +1344,7 @@ static void create_smmu(const VirtMachineState *vms,
 return;
 }
 
-dev = qdev_new("arm-smmuv3");
+dev = qdev_new(TYPE_ARM_SMMUV3);
 
 object_property_set_link(OBJECT(dev), "primary-bus", OBJECT(bus),
   &error_abort);
-- 
2.34.1




Re: [PATCH 01/32] monitor: Drop unnecessary includes

2023-01-24 Thread Stefan Berger




On 1/24/23 07:19, Markus Armbruster wrote:

Signed-off-by: Markus Armbruster 


Reviewed-by: Stefan Berger 




Re: [PATCH 20/32] tpm: Move HMP commands from monitor/ to softmmu/

2023-01-24 Thread Stefan Berger




On 1/24/23 07:19, Markus Armbruster wrote:

This moves these commands from MAINTAINERS section "Human
Monitor (HMP)" to "TPM".

Signed-off-by: Markus Armbruster 


Reviewed-by: Stefan Berger 


---
  MAINTAINERS|  2 +-
  monitor/hmp-cmds.c | 54 ---
  softmmu/tpm-hmp-cmds.c | 65 ++
  softmmu/meson.build|  1 +
  4 files changed, 67 insertions(+), 55 deletions(-)
  create mode 100644 softmmu/tpm-hmp-cmds.c

diff --git a/MAINTAINERS b/MAINTAINERS
index 3bd4d101d3..dab4def753 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -3067,7 +3067,7 @@ T: git https://github.com/stefanha/qemu.git tracing
  TPM
  M: Stefan Berger 
  S: Maintained
-F: softmmu/tpm.c
+F: softmmu/tpm*
  F: hw/tpm/*
  F: include/hw/acpi/tpm.h
  F: include/sysemu/tpm*
diff --git a/monitor/hmp-cmds.c b/monitor/hmp-cmds.c
index 6b1d5358f7..81f63fa8ec 100644
--- a/monitor/hmp-cmds.c
+++ b/monitor/hmp-cmds.c
@@ -22,7 +22,6 @@
  #include "qapi/qapi-commands-misc.h"
  #include "qapi/qapi-commands-run-state.h"
  #include "qapi/qapi-commands-stats.h"
-#include "qapi/qapi-commands-tpm.h"
  #include "qapi/qmp/qdict.h"
  #include "qapi/qmp/qerror.h"
  #include "qemu/cutils.h"
@@ -126,59 +125,6 @@ void hmp_info_pic(Monitor *mon, const QDict *qdict)
 hmp_info_pic_foreach, mon);
  }

-void hmp_info_tpm(Monitor *mon, const QDict *qdict)
-{
-#ifdef CONFIG_TPM
-TPMInfoList *info_list, *info;
-Error *err = NULL;
-unsigned int c = 0;
-TPMPassthroughOptions *tpo;
-TPMEmulatorOptions *teo;
-
-info_list = qmp_query_tpm();
-if (err) {
-monitor_printf(mon, "TPM device not supported\n");
-error_free(err);
-return;
-}
-
-if (info_list) {
-monitor_printf(mon, "TPM device:\n");
-}
-
-for (info = info_list; info; info = info->next) {
-TPMInfo *ti = info->value;
-monitor_printf(mon, " tpm%d: model=%s\n",
-   c, TpmModel_str(ti->model));
-
-monitor_printf(mon, "  \\ %s: type=%s",
-   ti->id, TpmType_str(ti->options->type));
-
-switch (ti->options->type) {
-case TPM_TYPE_PASSTHROUGH:
-tpo = ti->options->u.passthrough.data;
-monitor_printf(mon, "%s%s%s%s",
-   tpo->path ? ",path=" : "",
-   tpo->path ?: "",
-   tpo->cancel_path ? ",cancel-path=" : "",
-   tpo->cancel_path ?: "");
-break;
-case TPM_TYPE_EMULATOR:
-teo = ti->options->u.emulator.data;
-monitor_printf(mon, ",chardev=%s", teo->chardev);
-break;
-case TPM_TYPE__MAX:
-break;
-}
-monitor_printf(mon, "\n");
-c++;
-}
-qapi_free_TPMInfoList(info_list);
-#else
-monitor_printf(mon, "TPM device not supported\n");
-#endif /* CONFIG_TPM */
-}
-
  void hmp_quit(Monitor *mon, const QDict *qdict)
  {
  monitor_suspend(mon);
diff --git a/softmmu/tpm-hmp-cmds.c b/softmmu/tpm-hmp-cmds.c
new file mode 100644
index 00..9ed6ad6c4d
--- /dev/null
+++ b/softmmu/tpm-hmp-cmds.c
@@ -0,0 +1,65 @@
+/*
+ * HMP commands related to TPM
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or
+ * (at your option) any later version.
+ */
+
+#include "qemu/osdep.h"
+#include "qapi/qapi-commands-tpm.h"
+#include "monitor/monitor.h"
+#include "monitor/hmp.h"
+#include "qapi/error.h"
+
+void hmp_info_tpm(Monitor *mon, const QDict *qdict)
+{
+#ifdef CONFIG_TPM
+TPMInfoList *info_list, *info;
+Error *err = NULL;
+unsigned int c = 0;
+TPMPassthroughOptions *tpo;
+TPMEmulatorOptions *teo;
+
+info_list = qmp_query_tpm();
+if (err) {
+monitor_printf(mon, "TPM device not supported\n");
+error_free(err);
+return;
+}
+
+if (info_list) {
+monitor_printf(mon, "TPM device:\n");
+}
+
+for (info = info_list; info; info = info->next) {
+TPMInfo *ti = info->value;
+monitor_printf(mon, " tpm%d: model=%s\n",
+   c, TpmModel_str(ti->model));
+
+monitor_printf(mon, "  \\ %s: type=%s",
+   ti->id, TpmType_str(ti->options->type));
+
+switch (ti->options->type) {
+case TPM_TYPE_PASSTHROUGH:
+tpo = ti->options->u.passthrough.data;
+monitor_printf(mon, "%s%s%s%s",
+   tpo->path ? ",path=" : "",
+   tpo->path ?: "",
+   tpo->cancel_path ? ",cancel-path=" : "",
+   tpo->cancel_path ?: "");
+break;
+case TPM_TYPE_EMULATOR:
+teo = ti->options->u.emulator.data;
+monitor_printf(mon, ",chardev=%s", teo->chardev);
+break;
+case TPM_TYPE__MAX:
+break;
+}
+monitor_printf(mon, "\n");
+ 

Re: [PATCH RFC 12/21] migration: Introduce page size for-migration-only

2023-01-24 Thread Peter Xu
On Tue, Jan 24, 2023 at 04:36:20PM -0500, Peter Xu wrote:
> On Tue, Jan 24, 2023 at 01:20:37PM +, Dr. David Alan Gilbert wrote:
> > > @@ -3970,7 +3984,8 @@ int ram_load_postcopy(QEMUFile *f, int channel)
> > >  break;
> > >  }
> > >  tmp_page->target_pages++;
> > > -matches_target_page_size = block->page_size == 
> > > TARGET_PAGE_SIZE;
> > > +matches_target_page_size =
> > > +migration_ram_pagesize(block) == TARGET_PAGE_SIZE;
> > >  /*
> > >   * Postcopy requires that we place whole host pages 
> > > atomically;
> > >   * these may be huge pages for RAMBlocks that are backed by
> > 
> > Hmm do you really want this change?
> 
> Yes that's intended.  I want to reuse the same logic here when receiving
> small pages from huge pages, just like when we're receiving small pages on
> non-hugetlb mappings.
> 
> matches_target_page_size majorly affects two things:
> 
>   1) For a small zero page, whether we want to pre-set the page_buffer, or
>  simply use postcopy_place_page_zero():
>   
> case RAM_SAVE_FLAG_ZERO:
> ch = qemu_get_byte(f);
> /*
>  * Can skip to set page_buffer when
>  * this is a zero page and (block->page_size == TARGET_PAGE_SIZE).
>  */
> if (ch || !matches_target_page_size) {
> memset(page_buffer, ch, TARGET_PAGE_SIZE);
> }
> 
>   2) For normal page, whether we need to use a page buffer or we can
>  directly reuse the page buffer in QEMUFile:
> 
> if (!matches_target_page_size) {
> /* For huge pages, we always use temporary buffer */
> qemu_get_buffer(f, page_buffer, TARGET_PAGE_SIZE);
> } else {
> /*
>  * For small pages that matches target page size, we
>  * avoid the qemu_file copy.  Instead we directly use
>  * the buffer of QEMUFile to place the page.  Note: we
>  * cannot do any QEMUFile operation before using that
>  * buffer to make sure the buffer is valid when
>  * placing the page.
>  */
> qemu_get_buffer_in_place(f, (uint8_t **)&place_source,
>  TARGET_PAGE_SIZE);
> }
> 
> Here:
> 
> I want 1) to reuse postcopy_place_page_zero().  For the doublemap case,
> it'll reuse postcopy_tmp_zero_page() (because qemu_ram_is_uf_zeroable()
> will return false for such a ramblock).
> 
> I want 2) to reuse qemu_get_buffer_in_place(), so we avoid a copy process
> for the small page which is faster (even if it's hugetlb backed, now we can
> reuse the qemufile buffer safely).

While at it, one more thing worth mentioning: I don't actually know
whether the original code is always correct when the target and host small
page sizes don't match.  This is the original line:

  matches_target_page_size = block->page_size == TARGET_PAGE_SIZE;

The problem is we're comparing block page size against target page size,
however block page size should be in host page size granule:

  RAMBlock *qemu_ram_alloc_internal()
  {
new_block->page_size = qemu_real_host_page_size();

IOW, I am not sure whether postcopy will run at all in that case.  For
example, when we run an Alpha emulator upon x86_64, we can have target
psize 8K while host psize 4K.

The migration protocol should be TARGET_PAGE_SIZE based.  It means, for
postcopy when receiving a single page for Alpha VM being migrated, maybe we
should call UFFDIO_COPY (or UFFDIO_CONTINUE; doesn't matter here) twice
because one guest page contains two host pages.
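
The arithmetic behind the Alpha example can be sketched as below. It
only illustrates the "one guest page contains two host pages"
observation; the function name is hypothetical and this is not QEMU
code.

```python
def placements_per_target_page(target_page_size: int,
                               host_page_size: int) -> int:
    """Number of host-page-sized placements needed to fill one target page.

    Uses ceiling division, so a target page no larger than a host page
    still counts as a single placement.
    """
    if target_page_size <= 0 or host_page_size <= 0:
        raise ValueError("page sizes must be positive")
    return -(-target_page_size // host_page_size)


# Alpha guest (8K pages) on an x86_64 host (4K pages): two placements.
ALPHA_ON_X86 = placements_per_target_page(8192, 4096)
```

When the two sizes match the result is 1, which is the only case the
existing equality check against TARGET_PAGE_SIZE treats as "small".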

I'm not sure whether I get all these right.. if so, we have two options:

  a) Forbid postcopy as a whole when detecting qemu_real_host_page_size()
 != TARGET_PAGE_SIZE.

  b) Implement postcopy for that case

I'd go with a) even if it is an issue: it would mean no one has migrated
such a setup with postcopy in the past N years, which suggests that b)
may not be worth it.

-- 
Peter Xu




Re: [PATCH v4 00/36] tcg: Support for Int128 with helpers

2023-01-24 Thread Richard Henderson

On 1/7/23 16:36, Richard Henderson wrote:

Patches requiring review:
   01-tcg-Define-TCG_TYPE_I128-and-related-helper-macro.patch
   02-tcg-Handle-dh_typecode_i128-with-TCG_CALL_-RET-AR.patch
   03-tcg-Allocate-objects-contiguously-in-temp_allocat.patch
   05-tcg-Add-TCG_CALL_-RET-ARG-_BY_REF.patch
   07-tcg-Add-TCG_CALL_RET_BY_VEC.patch
   08-include-qemu-int128-Use-Int128-structure-for-TCI.patch
   09-tcg-i386-Add-TCG_TARGET_CALL_-RET-ARG-_I128.patch
   10-tcg-tci-Fix-big-endian-return-register-ordering.patch
   11-tcg-tci-Add-TCG_TARGET_CALL_-RET-ARG-_I128.patch
   13-tcg-Add-temp-allocation-for-TCGv_i128.patch
   14-tcg-Add-basic-data-movement-for-TCGv_i128.patch
   15-tcg-Add-guest-load-store-primitives-for-TCGv_i128.patch
   16-tcg-Add-tcg_gen_-non-atomic_cmpxchg_i128.patch
   17-tcg-Split-out-tcg_gen_nonatomic_cmpxchg_i-32-64.patch
   24-target-s390x-Use-a-single-return-for-helper_divs3.patch
   31-target-s390x-Use-Int128-for-passing-float128.patch
   32-target-s390x-Use-tcg_gen_atomic_cmpxchg_i128-for-.patch
   33-target-s390x-Implement-CC_OP_NZ-in-gen_op_calc_cc.patch
   34-target-i386-Split-out-gen_cmpxchg8b-gen_cmpxchg16.patch
   35-target-i386-Inline-cmpxchg8b.patch
   36-target-i386-Inline-cmpxchg16b.patch


Ping.  Only 2, 3, 10, 14 reviewed in the past 2 weeks.
There is a very minor patch conflict now in patch 4, nothing worth re-posting over.


r~



Re: [PATCH v4 00/36] tcg: Support for Int128 with helpers

2023-01-24 Thread Richard Henderson

On 1/10/23 13:12, Mark Cave-Ayland wrote:
Now that the TCG documentation is more visible, would it be possible to add a patch to 
update the relevant parts of docs/devel/tcg-ops.rst to reflect the new Int128 support?


For avoidance of doubt, this document covers the intermediate representation and some 
backend specifics.  There are no changes to either of these at this time.  The TCGv_i128 
type is lowered to TCG_TYPE_REG (either I32 or I64 per host) during translation of guest 
instructions to intermediate opcodes.


Not to say another document shouldn't be written covering the translation 
interface...


r~



Re: [PATCH RFC 12/21] migration: Introduce page size for-migration-only

2023-01-24 Thread Peter Xu
On Tue, Jan 24, 2023 at 01:20:37PM +, Dr. David Alan Gilbert wrote:
> > @@ -3970,7 +3984,8 @@ int ram_load_postcopy(QEMUFile *f, int channel)
> >  break;
> >  }
> >  tmp_page->target_pages++;
> > -matches_target_page_size = block->page_size == 
> > TARGET_PAGE_SIZE;
> > +matches_target_page_size =
> > +migration_ram_pagesize(block) == TARGET_PAGE_SIZE;
> >  /*
> >   * Postcopy requires that we place whole host pages atomically;
> >   * these may be huge pages for RAMBlocks that are backed by
> 
> Hmm do you really want this change?

Yes that's intended.  I want to reuse the same logic here when receiving
small pages from huge pages, just like when we're receiving small pages on
non-hugetlb mappings.

matches_target_page_size mainly affects two things:

  1) For a small zero page, whether we want to pre-set the page_buffer, or
 simply use postcopy_place_page_zero():
  
case RAM_SAVE_FLAG_ZERO:
ch = qemu_get_byte(f);
/*
 * Can skip to set page_buffer when
 * this is a zero page and (block->page_size == TARGET_PAGE_SIZE).
 */
if (ch || !matches_target_page_size) {
memset(page_buffer, ch, TARGET_PAGE_SIZE);
}

  2) For normal page, whether we need to use a page buffer or we can
 directly reuse the page buffer in QEMUFile:

if (!matches_target_page_size) {
/* For huge pages, we always use temporary buffer */
qemu_get_buffer(f, page_buffer, TARGET_PAGE_SIZE);
} else {
/*
 * For small pages that matches target page size, we
 * avoid the qemu_file copy.  Instead we directly use
 * the buffer of QEMUFile to place the page.  Note: we
 * cannot do any QEMUFile operation before using that
 * buffer to make sure the buffer is valid when
 * placing the page.
 */
qemu_get_buffer_in_place(f, (uint8_t **)_source,
 TARGET_PAGE_SIZE);
}

Here:

I want 1) to reuse postcopy_place_page_zero().  For the doublemap case,
it'll reuse postcopy_tmp_zero_page() (because qemu_ram_is_uf_zeroable()
will return false for such a ramblock).

I want 2) to reuse qemu_get_buffer_in_place(), so we avoid a copy process
for the small page which is faster (even if it's hugetlb backed, now we can
reuse the qemufile buffer safely).
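
The two decisions above can be condensed into one small stand-alone function. This is only a sketch of the branching quoted from ram_load_postcopy(), not the real code; the enum labels and the function name are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical labels for the four outcomes; not names from QEMU. */
typedef enum {
    ZERO_SKIP_BUFFER,     /* leave page_buffer alone, place a zero page directly */
    ZERO_FILL_BUFFER,     /* memset() page_buffer with the fill byte */
    DATA_USE_IN_PLACE,    /* borrow the stream's own buffer, no copy */
    DATA_COPY_TO_BUFFER   /* copy the page into page_buffer */
} LoadAction;

/* Mirror of the two "matches_target_page_size" tests quoted above. */
static LoadAction choose_load_action(bool zero_flag, unsigned char fill,
                                     size_t migration_page_size,
                                     size_t target_page_size)
{
    bool matches = migration_page_size == target_page_size;

    if (zero_flag) {
        return (fill || !matches) ? ZERO_FILL_BUFFER : ZERO_SKIP_BUFFER;
    }
    return matches ? DATA_USE_IN_PLACE : DATA_COPY_TO_BUFFER;
}

/* Usage: a small page takes the cheap path in both cases. */
void load_action_demo(void)
{
    assert(choose_load_action(true, 0, 4096, 4096) == ZERO_SKIP_BUFFER);
    assert(choose_load_action(false, 0, 4096, 4096) == DATA_USE_IN_PLACE);
}
```

With the patch, a hugetlb-backed block in doublemap mode would report the small size as its migration page size, so it would land on the same cheap paths.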

Thanks,

-- 
Peter Xu




Re: [PATCH v3 12/14] RISC-V: Add initial support for T-Head C906

2023-01-24 Thread Richard Henderson

On 1/24/23 09:59, Christoph Muellner wrote:

+++ b/target/riscv/cpu.h
@@ -27,6 +27,7 @@
  #include "qom/object.h"
  #include "qemu/int128.h"
  #include "cpu_bits.h"
+#include "cpu_vendorid.h"


I don't see that this ID is required for all users of riscv/cpu.h.
This include should be limited to cpu.c.


r~



[PATCH v4 0/3] hw/riscv: misc cleanups

2023-01-24 Thread Daniel Henrique Barboza
Hi,

These are the last 3 patches from the series

"[PATCH v3 0/7] riscv: fdt related cleanups"

that can be sent separately from the fdt work. All patches are acked.

Changes from v3:
- patches 1,2,3:
  - former patches 5, 6 and 7 from "[PATCH v3 0/7] riscv: fdt related cleanups"
- v3 link: https://lists.gnu.org/archive/html/qemu-devel/2023-01/msg04464.html

Daniel Henrique Barboza (3):
  hw/riscv/virt.c: calculate socket count once in create_fdt_imsic()
  hw/riscv/virt.c: rename MachineState 'mc' pointers to 'ms'
  hw/riscv/spike.c: rename MachineState 'mc' pointers to 'ms'

 hw/riscv/spike.c |  18 +-
 hw/riscv/virt.c  | 462 ---
 2 files changed, 242 insertions(+), 238 deletions(-)

-- 
2.39.1




[PATCH v4 3/3] hw/riscv/spike.c: rename MachineState 'mc' pointers to 'ms'

2023-01-24 Thread Daniel Henrique Barboza
Follow the QEMU convention of naming MachineState pointers as 'ms' by
renaming the instances where we're calling it 'mc'.

Suggested-by: Philippe Mathieu-Daudé 
Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Alistair Francis 
Signed-off-by: Daniel Henrique Barboza 
---
 hw/riscv/spike.c | 18 +-
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/hw/riscv/spike.c b/hw/riscv/spike.c
index 483581e05f..4cc877bea9 100644
--- a/hw/riscv/spike.c
+++ b/hw/riscv/spike.c
@@ -56,7 +56,7 @@ static void create_fdt(SpikeState *s, const MemMapEntry 
*memmap,
 uint64_t addr, size;
 unsigned long clint_addr;
 int cpu, socket;
-MachineState *mc = MACHINE(s);
+MachineState *ms = MACHINE(s);
 uint32_t *clint_cells;
 uint32_t cpu_phandle, intc_phandle, phandle = 1;
 char *name, *mem_name, *clint_name, *clust_name;
@@ -65,7 +65,7 @@ static void create_fdt(SpikeState *s, const MemMapEntry 
*memmap,
 "sifive,clint0", "riscv,clint0"
 };
 
-fdt = mc->fdt = create_device_tree(_size);
+fdt = ms->fdt = create_device_tree(_size);
 if (!fdt) {
 error_report("create_device_tree() failed");
 exit(1);
@@ -96,7 +96,7 @@ static void create_fdt(SpikeState *s, const MemMapEntry 
*memmap,
 qemu_fdt_setprop_cell(fdt, "/cpus", "#address-cells", 0x1);
 qemu_fdt_add_subnode(fdt, "/cpus/cpu-map");
 
-for (socket = (riscv_socket_count(mc) - 1); socket >= 0; socket--) {
+for (socket = (riscv_socket_count(ms) - 1); socket >= 0; socket--) {
 clust_name = g_strdup_printf("/cpus/cpu-map/cluster%d", socket);
 qemu_fdt_add_subnode(fdt, clust_name);
 
@@ -121,7 +121,7 @@ static void create_fdt(SpikeState *s, const MemMapEntry 
*memmap,
 qemu_fdt_setprop_cell(fdt, cpu_name, "reg",
 s->soc[socket].hartid_base + cpu);
 qemu_fdt_setprop_string(fdt, cpu_name, "device_type", "cpu");
-riscv_socket_fdt_write_id(mc, cpu_name, socket);
+riscv_socket_fdt_write_id(ms, cpu_name, socket);
 qemu_fdt_setprop_cell(fdt, cpu_name, "phandle", cpu_phandle);
 
 intc_name = g_strdup_printf("%s/interrupt-controller", cpu_name);
@@ -147,14 +147,14 @@ static void create_fdt(SpikeState *s, const MemMapEntry 
*memmap,
 g_free(cpu_name);
 }
 
-addr = memmap[SPIKE_DRAM].base + riscv_socket_mem_offset(mc, socket);
-size = riscv_socket_mem_size(mc, socket);
+addr = memmap[SPIKE_DRAM].base + riscv_socket_mem_offset(ms, socket);
+size = riscv_socket_mem_size(ms, socket);
 mem_name = g_strdup_printf("/memory@%lx", (long)addr);
 qemu_fdt_add_subnode(fdt, mem_name);
 qemu_fdt_setprop_cells(fdt, mem_name, "reg",
 addr >> 32, addr, size >> 32, size);
 qemu_fdt_setprop_string(fdt, mem_name, "device_type", "memory");
-riscv_socket_fdt_write_id(mc, mem_name, socket);
+riscv_socket_fdt_write_id(ms, mem_name, socket);
 g_free(mem_name);
 
 clint_addr = memmap[SPIKE_CLINT].base +
@@ -167,14 +167,14 @@ static void create_fdt(SpikeState *s, const MemMapEntry 
*memmap,
 0x0, clint_addr, 0x0, memmap[SPIKE_CLINT].size);
 qemu_fdt_setprop(fdt, clint_name, "interrupts-extended",
 clint_cells, s->soc[socket].num_harts * sizeof(uint32_t) * 4);
-riscv_socket_fdt_write_id(mc, clint_name, socket);
+riscv_socket_fdt_write_id(ms, clint_name, socket);
 
 g_free(clint_name);
 g_free(clint_cells);
 g_free(clust_name);
 }
 
-riscv_socket_fdt_write_distance_matrix(mc);
+riscv_socket_fdt_write_distance_matrix(ms);
 
 qemu_fdt_add_subnode(fdt, "/chosen");
 qemu_fdt_setprop_string(fdt, "/chosen", "stdout-path", "/htif");
-- 
2.39.1




[PATCH v4 1/3] hw/riscv/virt.c: calculate socket count once in create_fdt_imsic()

2023-01-24 Thread Daniel Henrique Barboza
riscv_socket_count() returns either ms->numa_state->num_nodes or 1
depending on NUMA support. In any case the value can be retrieved only
once and used in the rest of the function.

This will also alleviate the rename we're going to do next by reducing
the instances of MachineState 'mc' inside hw/riscv/virt.c.

Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Alistair Francis 
Signed-off-by: Daniel Henrique Barboza 
---
 hw/riscv/virt.c | 34 +++---
 1 file changed, 19 insertions(+), 15 deletions(-)

diff --git a/hw/riscv/virt.c b/hw/riscv/virt.c
index 48326406fd..f0fdb295e0 100644
--- a/hw/riscv/virt.c
+++ b/hw/riscv/virt.c
@@ -505,13 +505,14 @@ static void create_fdt_imsic(RISCVVirtState *s, const 
MemMapEntry *memmap,
 int cpu, socket;
 char *imsic_name;
 MachineState *mc = MACHINE(s);
+int socket_count = riscv_socket_count(mc);
 uint32_t imsic_max_hart_per_socket, imsic_guest_bits;
 uint32_t *imsic_cells, *imsic_regs, imsic_addr, imsic_size;
 
 *msi_m_phandle = (*phandle)++;
 *msi_s_phandle = (*phandle)++;
 imsic_cells = g_new0(uint32_t, mc->smp.cpus * 2);
-imsic_regs = g_new0(uint32_t, riscv_socket_count(mc) * 4);
+imsic_regs = g_new0(uint32_t, socket_count * 4);
 
 /* M-level IMSIC node */
 for (cpu = 0; cpu < mc->smp.cpus; cpu++) {
@@ -519,7 +520,7 @@ static void create_fdt_imsic(RISCVVirtState *s, const 
MemMapEntry *memmap,
 imsic_cells[cpu * 2 + 1] = cpu_to_be32(IRQ_M_EXT);
 }
 imsic_max_hart_per_socket = 0;
-for (socket = 0; socket < riscv_socket_count(mc); socket++) {
+for (socket = 0; socket < socket_count; socket++) {
 imsic_addr = memmap[VIRT_IMSIC_M].base +
  socket * VIRT_IMSIC_GROUP_MAX_SIZE;
 imsic_size = IMSIC_HART_SIZE(0) * s->soc[socket].num_harts;
@@ -545,14 +546,14 @@ static void create_fdt_imsic(RISCVVirtState *s, const 
MemMapEntry *memmap,
 qemu_fdt_setprop(mc->fdt, imsic_name, "interrupts-extended",
 imsic_cells, mc->smp.cpus * sizeof(uint32_t) * 2);
 qemu_fdt_setprop(mc->fdt, imsic_name, "reg", imsic_regs,
-riscv_socket_count(mc) * sizeof(uint32_t) * 4);
+socket_count * sizeof(uint32_t) * 4);
 qemu_fdt_setprop_cell(mc->fdt, imsic_name, "riscv,num-ids",
 VIRT_IRQCHIP_NUM_MSIS);
-if (riscv_socket_count(mc) > 1) {
+if (socket_count > 1) {
 qemu_fdt_setprop_cell(mc->fdt, imsic_name, "riscv,hart-index-bits",
 imsic_num_bits(imsic_max_hart_per_socket));
 qemu_fdt_setprop_cell(mc->fdt, imsic_name, "riscv,group-index-bits",
-imsic_num_bits(riscv_socket_count(mc)));
+imsic_num_bits(socket_count));
 qemu_fdt_setprop_cell(mc->fdt, imsic_name, "riscv,group-index-shift",
 IMSIC_MMIO_GROUP_MIN_SHIFT);
 }
@@ -567,7 +568,7 @@ static void create_fdt_imsic(RISCVVirtState *s, const 
MemMapEntry *memmap,
 }
 imsic_guest_bits = imsic_num_bits(s->aia_guests + 1);
 imsic_max_hart_per_socket = 0;
-for (socket = 0; socket < riscv_socket_count(mc); socket++) {
+for (socket = 0; socket < socket_count; socket++) {
 imsic_addr = memmap[VIRT_IMSIC_S].base +
  socket * VIRT_IMSIC_GROUP_MAX_SIZE;
 imsic_size = IMSIC_HART_SIZE(imsic_guest_bits) *
@@ -594,18 +595,18 @@ static void create_fdt_imsic(RISCVVirtState *s, const 
MemMapEntry *memmap,
 qemu_fdt_setprop(mc->fdt, imsic_name, "interrupts-extended",
 imsic_cells, mc->smp.cpus * sizeof(uint32_t) * 2);
 qemu_fdt_setprop(mc->fdt, imsic_name, "reg", imsic_regs,
-riscv_socket_count(mc) * sizeof(uint32_t) * 4);
+socket_count * sizeof(uint32_t) * 4);
 qemu_fdt_setprop_cell(mc->fdt, imsic_name, "riscv,num-ids",
 VIRT_IRQCHIP_NUM_MSIS);
 if (imsic_guest_bits) {
 qemu_fdt_setprop_cell(mc->fdt, imsic_name, "riscv,guest-index-bits",
 imsic_guest_bits);
 }
-if (riscv_socket_count(mc) > 1) {
+if (socket_count > 1) {
 qemu_fdt_setprop_cell(mc->fdt, imsic_name, "riscv,hart-index-bits",
 imsic_num_bits(imsic_max_hart_per_socket));
 qemu_fdt_setprop_cell(mc->fdt, imsic_name, "riscv,group-index-bits",
-imsic_num_bits(riscv_socket_count(mc)));
+imsic_num_bits(socket_count));
 qemu_fdt_setprop_cell(mc->fdt, imsic_name, "riscv,group-index-shift",
 IMSIC_MMIO_GROUP_MIN_SHIFT);
 }
@@ -733,6 +734,7 @@ static void create_fdt_sockets(RISCVVirtState *s, const 
MemMapEntry *memmap,
 MachineState *mc = MACHINE(s);
 uint32_t msi_m_phandle = 0, msi_s_phandle = 0;
 uint32_t *intc_phandles, xplic_phandles[MAX_NODES];
+int socket_count = riscv_socket_count(mc);
 
 qemu_fdt_add_subnode(mc->fdt, "/cpus");
 qemu_fdt_setprop_cell(mc->fdt, "/cpus", "timebase-frequency",
@@ -744,7 +746,7 @@ static void create_fdt_sockets(RISCVVirtState *s, const 
MemMapEntry *memmap,
 intc_phandles = 

[PATCH v4 2/3] hw/riscv/virt.c: rename MachineState 'mc' pointers to 'ms'

2023-01-24 Thread Daniel Henrique Barboza
We have a convention in other QEMU boards/archs to name MachineState
pointers as either 'machine' or 'ms'. MachineClass pointers are usually
called 'mc'.

The 'virt' RISC-V machine has a lot of instances where MachineState
pointers are named 'mc'. There is nothing wrong with that, but we gain
more compatibility with the rest of the QEMU code base, and easier
reviews, if we follow QEMU conventions.

Rename all 'mc' MachineState pointers to 'ms'. This is a very tedious
and mechanical patch that was produced by doing the following:

- find/replace all 'MachineState *mc' to 'MachineState *ms';
- find/replace all 'mc->fdt' to 'ms->fdt';
- find/replace all 'mc->smp.cpus' to 'ms->smp.cpus';
- replace any remaining occurrences of 'mc' that the compiler complained
about.

Suggested-by: Philippe Mathieu-Daudé 
Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Alistair Francis 
Signed-off-by: Daniel Henrique Barboza 
---
 hw/riscv/virt.c | 434 
 1 file changed, 217 insertions(+), 217 deletions(-)

diff --git a/hw/riscv/virt.c b/hw/riscv/virt.c
index f0fdb295e0..02edfcf71f 100644
--- a/hw/riscv/virt.c
+++ b/hw/riscv/virt.c
@@ -227,7 +227,7 @@ static void create_fdt_socket_cpus(RISCVVirtState *s, int 
socket,
 {
 int cpu;
 uint32_t cpu_phandle;
-MachineState *mc = MACHINE(s);
+MachineState *ms = MACHINE(s);
 char *name, *cpu_name, *core_name, *intc_name;
 bool is_32_bit = riscv_is_32bit(>soc[0]);
 
@@ -236,40 +236,40 @@ static void create_fdt_socket_cpus(RISCVVirtState *s, int 
socket,
 
 cpu_name = g_strdup_printf("/cpus/cpu@%d",
 s->soc[socket].hartid_base + cpu);
-qemu_fdt_add_subnode(mc->fdt, cpu_name);
+qemu_fdt_add_subnode(ms->fdt, cpu_name);
 if (riscv_feature(>soc[socket].harts[cpu].env,
   RISCV_FEATURE_MMU)) {
-qemu_fdt_setprop_string(mc->fdt, cpu_name, "mmu-type",
+qemu_fdt_setprop_string(ms->fdt, cpu_name, "mmu-type",
 (is_32_bit) ? "riscv,sv32" : "riscv,sv48");
 } else {
-qemu_fdt_setprop_string(mc->fdt, cpu_name, "mmu-type",
+qemu_fdt_setprop_string(ms->fdt, cpu_name, "mmu-type",
 "riscv,none");
 }
 name = riscv_isa_string(>soc[socket].harts[cpu]);
-qemu_fdt_setprop_string(mc->fdt, cpu_name, "riscv,isa", name);
+qemu_fdt_setprop_string(ms->fdt, cpu_name, "riscv,isa", name);
 g_free(name);
-qemu_fdt_setprop_string(mc->fdt, cpu_name, "compatible", "riscv");
-qemu_fdt_setprop_string(mc->fdt, cpu_name, "status", "okay");
-qemu_fdt_setprop_cell(mc->fdt, cpu_name, "reg",
+qemu_fdt_setprop_string(ms->fdt, cpu_name, "compatible", "riscv");
+qemu_fdt_setprop_string(ms->fdt, cpu_name, "status", "okay");
+qemu_fdt_setprop_cell(ms->fdt, cpu_name, "reg",
 s->soc[socket].hartid_base + cpu);
-qemu_fdt_setprop_string(mc->fdt, cpu_name, "device_type", "cpu");
-riscv_socket_fdt_write_id(mc, cpu_name, socket);
-qemu_fdt_setprop_cell(mc->fdt, cpu_name, "phandle", cpu_phandle);
+qemu_fdt_setprop_string(ms->fdt, cpu_name, "device_type", "cpu");
+riscv_socket_fdt_write_id(ms, cpu_name, socket);
+qemu_fdt_setprop_cell(ms->fdt, cpu_name, "phandle", cpu_phandle);
 
 intc_phandles[cpu] = (*phandle)++;
 
 intc_name = g_strdup_printf("%s/interrupt-controller", cpu_name);
-qemu_fdt_add_subnode(mc->fdt, intc_name);
-qemu_fdt_setprop_cell(mc->fdt, intc_name, "phandle",
+qemu_fdt_add_subnode(ms->fdt, intc_name);
+qemu_fdt_setprop_cell(ms->fdt, intc_name, "phandle",
 intc_phandles[cpu]);
-qemu_fdt_setprop_string(mc->fdt, intc_name, "compatible",
+qemu_fdt_setprop_string(ms->fdt, intc_name, "compatible",
 "riscv,cpu-intc");
-qemu_fdt_setprop(mc->fdt, intc_name, "interrupt-controller", NULL, 0);
-qemu_fdt_setprop_cell(mc->fdt, intc_name, "#interrupt-cells", 1);
+qemu_fdt_setprop(ms->fdt, intc_name, "interrupt-controller", NULL, 0);
+qemu_fdt_setprop_cell(ms->fdt, intc_name, "#interrupt-cells", 1);
 
 core_name = g_strdup_printf("%s/core%d", clust_name, cpu);
-qemu_fdt_add_subnode(mc->fdt, core_name);
-qemu_fdt_setprop_cell(mc->fdt, core_name, "cpu", cpu_phandle);
+qemu_fdt_add_subnode(ms->fdt, core_name);
+qemu_fdt_setprop_cell(ms->fdt, core_name, "cpu", cpu_phandle);
 
 g_free(core_name);
 g_free(intc_name);
@@ -282,16 +282,16 @@ static void create_fdt_socket_memory(RISCVVirtState *s,
 {
 char *mem_name;
 uint64_t addr, size;
-MachineState *mc = MACHINE(s);
+MachineState *ms = MACHINE(s);
 
-addr = memmap[VIRT_DRAM].base + riscv_socket_mem_offset(mc, socket);
-size = riscv_socket_mem_size(mc, socket);
+addr = 

Re: [PATCH v3 09/14] RISC-V: Adding T-Head MemIdx extension

2023-01-24 Thread Richard Henderson

On 1/24/23 09:59, Christoph Muellner wrote:

+/* XTheadMemIdx */
+
+/*
+ * Load with memop from indexed address and add (imm5 << imm2) to rs1.
+ * If !preinc, then the load address is rs1.
+ * If  preinc, then the load address is rs1 + (imm5 << imm2).
+ */
+static bool gen_load_inc(DisasContext *ctx, arg_th_meminc *a, MemOp memop,
+ bool preinc)
+{
+TCGv rd = dest_gpr(ctx, a->rd);
+TCGv addr = get_address(ctx, a->rs1, preinc ? a->imm5 << a->imm2 : 0);
+
+tcg_gen_qemu_ld_tl(rd, addr, ctx->mem_idx, memop);
+addr = get_address(ctx, a->rs1, !preinc ? a->imm5 << a->imm2 : 0);


First, you're leaking the previous 'addr' temporary.
Second, get_address may make modifications to 'addr' which you don't want to 
write back.
Third, you are not checking for rd != rs1.

I think what you want is

int imm = a->imm5 << a->imm2;
TCGv addr = get_address(ctx, a->rs1, preinc ? imm : 0);
TCGv rd = dest_gpr(ctx, a->rd);
TCGv rs1 = get_gpr(ctx, a->rs1, EXT_NONE);

tcg_gen_qemu_ld_tl(rd, addr, ctx->mem_idx, memop);
tcg_gen_addi_tl(rs1, rs1, imm);
gen_set_gpr(ctx, a->rd, rd);
gen_set_gpr(ctx, a->rs1, rs1);
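
For reference, the intended semantics can be written down as a plain C model that is easy to test outside of TCG. Everything here (the `ToyCpu` type, the function names, the choice to reject rd == rs1) is an assumption of the sketch, not taken from the patch or the manual.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_NUM_REGS 32
#define TOY_MEM_WORDS 16

/* Hypothetical toy machine: plain registers and 8-byte words of memory. */
typedef struct {
    uint64_t gpr[TOY_NUM_REGS];
    uint64_t mem[TOY_MEM_WORDS];   /* word i lives at byte address i * 8 */
} ToyCpu;

/*
 * Load a 64-bit word and add (imm5 << imm2) to rs1.
 * preinc: load from rs1 + imm; otherwise load from the old rs1.
 * Returns false without side effects when rd == rs1 (an assumption of
 * this sketch; the ISA manual decides what really happens).
 */
static bool toy_load_inc(ToyCpu *cpu, int rd, int rs1, int imm5, int imm2,
                         bool preinc)
{
    /* Multiply instead of shifting so a negative imm5 stays well defined. */
    int64_t imm = (int64_t)imm5 * ((int64_t)1 << imm2);
    uint64_t addr;

    if (rd == rs1) {
        return false;
    }
    addr = cpu->gpr[rs1] + (preinc ? (uint64_t)imm : 0);
    assert(addr % 8 == 0 && addr / 8 < TOY_MEM_WORDS);

    cpu->gpr[rd] = cpu->mem[addr / 8];   /* the load itself */
    cpu->gpr[rs1] += (uint64_t)imm;      /* write-back of the base register */
    return true;
}

/* Usage: post-increment reads the old address, pre-increment the new one. */
void toy_load_inc_demo(void)
{
    ToyCpu cpu = { {0}, {0} };
    bool ok;

    cpu.mem[2] = 0xAA;   /* byte address 16 */
    cpu.mem[3] = 0xBB;   /* byte address 24 */

    cpu.gpr[5] = 16;
    ok = toy_load_inc(&cpu, 6, 5, 1, 3, false);
    assert(ok && cpu.gpr[6] == 0xAA && cpu.gpr[5] == 24);

    cpu.gpr[5] = 16;
    ok = toy_load_inc(&cpu, 6, 5, 1, 3, true);
    assert(ok && cpu.gpr[6] == 0xBB && cpu.gpr[5] == 24);
}
```

In both modes rs1 ends up at the same value; only the address used for the load differs.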


r~



Re: [PATCH RFC 11/21] migration: Add hugetlb-doublemap cap

2023-01-24 Thread Peter Xu
On Tue, Jan 24, 2023 at 12:45:38PM +, Dr. David Alan Gilbert wrote:
> * Peter Xu (pet...@redhat.com) wrote:
> > Add a new cap to allow mapping hugetlbfs backed RAMs in small page sizes.
> > 
> > Signed-off-by: Peter Xu 
> 
> 
> Reviewed-by: Dr. David Alan Gilbert 

Thanks.

> 
> although, I'm curious if the protocol actually changes

Yes it does.

It differs not in the form of a changed header or any frame definitions,
but in the format of how huge pages are sent.  The old binary can only send
a huge page by sending all the small pages sequentially starting from index
0 to index N_HUGE-1; while the new binary can send the huge page out of
order.  For the latter it's the same as when huge page is not used.
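
A toy model of the difference, assuming a receiver only looks at the order in which the small-page indexes of one huge page arrive. Both function names are invented for this sketch, and the "every index must show up" rule for the new receiver is a simplification.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Old-style receiver: small page i of a huge page must arrive i-th. */
static bool sequential_receiver_accepts(const size_t *order, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (order[i] != i) {
            return false;
        }
    }
    return true;
}

/*
 * New-style receiver: any arrival order is fine as long as every index in
 * 0..n-1 shows up. Limited to n <= 64 so a single bitmask can track them.
 */
static bool any_order_receiver_accepts(const size_t *order, size_t n)
{
    uint64_t seen = 0;
    uint64_t all;

    assert(n >= 1 && n <= 64);
    all = (n == 64) ? UINT64_MAX : ((UINT64_C(1) << n) - 1);
    for (size_t i = 0; i < n; i++) {
        if (order[i] >= n) {
            return false;
        }
        seen |= UINT64_C(1) << order[i];
    }
    return seen == all;
}

/* Usage: an out-of-order stream is only acceptable to the new receiver. */
void receiver_demo(void)
{
    const size_t shuffled[] = { 2, 0, 3, 1 };

    assert(!sequential_receiver_accepts(shuffled, 4));
    assert(any_order_receiver_accepts(shuffled, 4));
}
```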

> or whether a doublepage enabled destination would work with an unmodified
> source?

This is an interesting question.

I would expect old -> new to work as usual, because the page frames are not
modified so the dest node will just see pages being migrated in a
sequential manner.  The latency of page requests will be the same as with
the old binary though, because even if the dest host can handle small pages
it won't be able to get the pages it wants as soon as possible - the src
host decides which page to send.

Meanwhile new -> old shouldn't work I think, as described above, because the
dest host should see weird things happening, e.g., a huge page was sent not
starting from index 0 but index X (0

> I guess potentially you can get away without the dirty clearing
> of the partially sent hugepages that the source normally does.

Good point. It's actually more relevant to the other patch later on
reworking the discard logic.  I kept it as-is mainly for two reasons:

 1) It is still not 100% confirmed on how MADV_DONTNEED should behave on
HGM enabled memory ranges where huge pages used to be mapped.  It's
part of the discussion upstream on the kernel patchset.  I think it's
settling, but in the current series I kept it in a form so it'll work
in all cases.

 2) Not dirtying the partially sent huge pages can always reduce small
pages being migrated, but it can also change the content of discard
messages due to the frame format of MIG_CMD_POSTCOPY_RAM_DISCARD, in
that we can have a lot more scattered ranges, so a lot more messaging
can be needed.  With the existing logic, since we'll always re-dirty
the partially sent pages, the ranges are more likely to be efficient.

* CMD_POSTCOPY_RAM_DISCARD consist of:
*  byte   version (0)
*  byte   Length of name field (not including 0)
*  n x byte   RAM block name
*  byte   0 terminator (just for safety)
*  n xByte ranges within the named RAMBlock
*  be64   Start of the range
*  be64   Length
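
To make the cost of scattered ranges concrete, here is a minimal encoder for just the payload laid out above. It is a sketch under the assumption that nothing else precedes or follows these fields; the `DiscardRange` type and the function names are hypothetical and not QEMU's.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One (start, length) pair inside the named RAMBlock. Hypothetical type. */
typedef struct {
    uint64_t start;
    uint64_t length;
} DiscardRange;

/* Store a 64-bit value in big-endian byte order. */
static void put_be64(uint8_t *p, uint64_t v)
{
    for (int i = 0; i < 8; i++) {
        p[i] = (uint8_t)(v >> (56 - 8 * i));
    }
}

/* Returns the number of bytes written, or 0 if the frame does not fit. */
static size_t encode_discard(uint8_t *buf, size_t cap, const char *name,
                             const DiscardRange *ranges, size_t nr)
{
    size_t name_len = strlen(name);
    size_t need, off = 0;

    if (name_len > 255) {
        return 0;                      /* length field is a single byte */
    }
    need = 1 + 1 + name_len + 1 + nr * 16;
    if (need > cap) {
        return 0;
    }
    buf[off++] = 0;                    /* version */
    buf[off++] = (uint8_t)name_len;    /* name length, terminator excluded */
    memcpy(buf + off, name, name_len);
    off += name_len;
    buf[off++] = 0;                    /* terminator */
    for (size_t i = 0; i < nr; i++) {
        put_be64(buf + off, ranges[i].start);
        put_be64(buf + off + 8, ranges[i].length);
        off += 16;
    }
    return off;
}

/* Usage: every extra range costs another 16 bytes on the wire. */
void discard_demo(void)
{
    uint8_t buf[64];
    DiscardRange two[] = { { 0x1000, 0x2000 }, { 0x8000, 0x1000 } };

    assert(encode_discard(buf, sizeof(buf), "pc.ram", two, 1) == 25);
    assert(encode_discard(buf, sizeof(buf), "pc.ram", two, 2) == 41);
}
```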

I think 1) may not hold as the kernel series evolves, so it may not be true
anymore.  2) may still be true, but I think it is worth some testing
(especially on 1G pages) to see how it could interfere with the discard
procedure.  Maybe it won't be as bad as I think.  Even if it is, we can
evaluate the tradeoff between "slower discard sync" and "fewer pages to
send".  E.g., we can consider changing the frame layout by boosting
postcopy_ram_discard_version.

I'll take a note on this one and provide an update in the next version.

-- 
Peter Xu




Re: [PATCH v7 6/7] mac_newworld: Deprecate mac99 "via" option

2023-01-24 Thread Warner Losh


> On Jan 4, 2023, at 2:59 PM, BALATON Zoltan  wrote:
> 
> Setting emulated machine type with a property called "via" is
> confusing users so deprecate the "via" option in favour of newly added
> explicit machine types. The default via=cuda option is not a valid
> config (no real Mac has this combination of hardware) so no machine
> type could be defined for that therefore it is kept for backwards
> compatibility with older QEMU versions for now but other options
> resembling real machines are deprecated.
> 
> Signed-off-by: BALATON Zoltan 
> ---
> hw/ppc/mac_newworld.c | 9 +
> 1 file changed, 9 insertions(+)
> 
> diff --git a/hw/ppc/mac_newworld.c b/hw/ppc/mac_newworld.c
> index f07c37328b..adf185bd3a 100644
> --- a/hw/ppc/mac_newworld.c
> +++ b/hw/ppc/mac_newworld.c
> @@ -169,6 +169,15 @@ static void ppc_core99_init(MachineState *machine)
> if (PPC_INPUT(env) == PPC_FLAGS_INPUT_970) {
> warn_report("mac99 with G5 CPU is deprecated, "
> "use powermac7_3 instead");
> +} else {
> +if (core99_machine->via_config == CORE99_VIA_CONFIG_PMU) {
> +warn_report("mac99,via=pmu is deprecated, "
> +"use powermac3_1 instead");

So should I use ‘-m mac99,via=powermac3_1’, ‘-m powermac3_1’ or ‘-m mac99,powermac3_1’?

I have no clue which one I’m supposed to use. It would be better to tell the
user which of these three possibilities they should really use. From the other
patches in the series, I’m guessing it’s the middle one, but even after looking
at the code, I’m unsure.

> +}
> +if (core99_machine->via_config == CORE99_VIA_CONFIG_PMU_ADB) {
> +warn_report("mac99,via=pmu-adb is deprecated, "
> +"use powerbook3_2 instead");

Same basic comment here.

I think adding ‘-m’ or ‘machine type’ before powerbook… in both of these
would resolve it.

Warner

> +}
> }
> }
> /* allocate RAM */
> --
> 2.30.6
> 
> 





Re: [PATCH] nubus-device: fix memory leak in nubus_device_realize

2023-01-24 Thread Mauro Matteo Cascella
On Thu, Dec 22, 2022 at 6:29 PM Mauro Matteo Cascella
 wrote:
>
> Local variable "name" is allocated through strdup_printf and should be
> freed with g_free() to avoid memory leak.
>
> Fixes: 3616f424 ("nubus-device: add romfile property for loading declaration 
> ROMs")
> Signed-off-by: Mauro Matteo Cascella 
> ---
>  hw/nubus/nubus-device.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/hw/nubus/nubus-device.c b/hw/nubus/nubus-device.c
> index 0f1852f671..49008e4938 100644
> --- a/hw/nubus/nubus-device.c
> +++ b/hw/nubus/nubus-device.c
> @@ -80,6 +80,7 @@ static void nubus_device_realize(DeviceState *dev, Error 
> **errp)
> _abort);
>  ret = load_image_mr(path, >decl_rom);
>  g_free(path);
> +g_free(name);
>  if (ret < 0) {
>  error_setg(errp, "could not load romfile \"%s\"", nd->romfile);
>  return;
> --
> 2.38.1

Hi, any updates here? Is this patch going to be merged?

Thanks,

--
Mauro Matteo Cascella
Red Hat Product Security
PGP-Key ID: BB3410B0




Re: [PATCH RFC 10/21] ramblock: Add ramblock_file_map()

2023-01-24 Thread Peter Xu
On Tue, Jan 24, 2023 at 10:06:48AM +, Dr. David Alan Gilbert wrote:
> * Peter Xu (pet...@redhat.com) wrote:
> > Add a helper to do mmap() for a ramblock based on the cached information.
> > 
> > A trivial thing to mention is we need to move ramblock->fd setup to be
> > earlier, before the ramblock_file_map() call, because it'll need to
> > reference the fd being mapped.  However that should not be a problem at
> > all, mainly because the fd won't be freed if successful, and if it failed
> > the fd will be freed (or to be explicit, close()ed) by the caller.
> > 
> > Export it - prepare to be used outside this file.
> > 
> > Signed-off-by: Peter Xu 
> > ---
> >  include/exec/ram_addr.h |  1 +
> >  softmmu/physmem.c   | 25 +
> >  2 files changed, 18 insertions(+), 8 deletions(-)
> > 
> > diff --git a/include/exec/ram_addr.h b/include/exec/ram_addr.h
> > index 0bf9cfc659..56db25009a 100644
> > --- a/include/exec/ram_addr.h
> > +++ b/include/exec/ram_addr.h
> > @@ -98,6 +98,7 @@ bool ramblock_is_pmem(RAMBlock *rb);
> >  
> >  long qemu_minrampagesize(void);
> >  long qemu_maxrampagesize(void);
> > +void *ramblock_file_map(RAMBlock *block);
> >  
> >  /**
> >   * qemu_ram_alloc_from_file,
> > diff --git a/softmmu/physmem.c b/softmmu/physmem.c
> > index 6096eac286..cdda7eaea5 100644
> > --- a/softmmu/physmem.c
> > +++ b/softmmu/physmem.c
> > @@ -1532,17 +1532,31 @@ static int file_ram_open(const char *path,
> >  return fd;
> >  }
> >  
> > +/* Do the mmap() for a ramblock based on information already setup */
> > +void *ramblock_file_map(RAMBlock *block)
> > +{
> > +uint32_t qemu_map_flags;
> > +
> > +qemu_map_flags = (block->flags & RAM_READONLY) ? QEMU_MAP_READONLY : 0;
> > +qemu_map_flags |= (block->flags & RAM_SHARED) ? QEMU_MAP_SHARED : 0;
> > +qemu_map_flags |= (block->flags & RAM_PMEM) ? QEMU_MAP_SYNC : 0;
> > +qemu_map_flags |= (block->flags & RAM_NORESERVE) ? QEMU_MAP_NORESERVE 
> > : 0;
> > +
> > +return qemu_ram_mmap(block->fd, block->mmap_length, block->mr->align,
> > + qemu_map_flags, block->file_offset);
> > +}
> > +
> >  static void *file_ram_alloc(RAMBlock *block,
> >  int fd,
> >  bool truncate,
> >  off_t offset,
> >  Error **errp)
> >  {
> > -uint32_t qemu_map_flags;
> >  void *area;
> >  
> >  /* Remember the offset just in case we'll need to map the range again 
> > */
> 
> Note that this comment is now wrong; you need to always set that for the
> map call.

This line is added in patch 7.  After this patch, a ramblock should always
be mapped with ramblock_file_map(), so it keeps being true?

> 
> Other than that,
> 
> Reviewed-by: Dr. David Alan Gilbert 

Thanks,

-- 
Peter Xu




Re: [PATCH v3 08/14] RISC-V: Adding T-Head MemPair extension

2023-01-24 Thread Richard Henderson

On 1/24/23 09:59, Christoph Muellner wrote:

+static bool gen_loadpair_tl(DisasContext *ctx, arg_th_pair *a, MemOp memop,
+int shamt)
+{
+TCGv rd1 = dest_gpr(ctx, a->rd1);
+TCGv rd2 = dest_gpr(ctx, a->rd2);
+TCGv addr1 = tcg_temp_new();
+TCGv addr2 = tcg_temp_new();
+
+addr1 = get_address(ctx, a->rs, a->sh2 << shamt);
+if ((memop & MO_SIZE) == MO_64) {
+addr2 = get_address(ctx, a->rs, 8 + (a->sh2 << shamt));
+} else {
+addr2 = get_address(ctx, a->rs, 4 + (a->sh2 << shamt));
+}
+
+tcg_gen_qemu_ld_tl(rd1, addr1, ctx->mem_idx, memop);
+tcg_gen_qemu_ld_tl(rd2, addr2, ctx->mem_idx, memop);
+gen_set_gpr(ctx, a->rd1, rd1);
+gen_set_gpr(ctx, a->rd2, rd2);


Since dest_gpr may return cpu_gpr[n], this may update the rd1 before recognizing the 
exception that the second load may generate.  Is that correct?


The manual says that rd1, rd2, and rs1 must not be the same, but you do not 
check this.
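
Both points can be illustrated with a plain C model: check the register constraint first, then read both words into locals and only commit once neither access has faulted. This assumes the answer to the question above is that neither destination may change on a fault; the `PairCpu` type and all function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAIR_MEM_WORDS 8

/* Hypothetical toy machine with a tiny memory of 8-byte words. */
typedef struct {
    uint64_t gpr[32];
    uint64_t mem[PAIR_MEM_WORDS];   /* word i lives at byte address i * 8 */
} PairCpu;

/* True when rd1, rd2 and rs are pairwise different. */
static bool pair_regs_distinct(int rd1, int rd2, int rs)
{
    return rd1 != rd2 && rd1 != rs && rd2 != rs;
}

/* Returns false to model a fault on a misaligned or out-of-range access. */
static bool pair_read(const PairCpu *cpu, uint64_t addr, uint64_t *val)
{
    if (addr % 8 != 0 || addr / 8 >= PAIR_MEM_WORDS) {
        return false;
    }
    *val = cpu->mem[addr / 8];
    return true;
}

/* 64-bit pair load: second word sits 8 bytes after the first. */
static bool toy_load_pair64(PairCpu *cpu, int rd1, int rd2, int rs,
                            unsigned sh2, unsigned shamt)
{
    uint64_t addr1, v1, v2;

    if (!pair_regs_distinct(rd1, rd2, rs)) {
        return false;
    }
    addr1 = cpu->gpr[rs] + ((uint64_t)sh2 << shamt);
    if (!pair_read(cpu, addr1, &v1) || !pair_read(cpu, addr1 + 8, &v2)) {
        return false;               /* nothing has been written yet */
    }
    cpu->gpr[rd1] = v1;             /* commit only after both loads worked */
    cpu->gpr[rd2] = v2;
    return true;
}

/* Usage: a fault on the second word leaves rd1 untouched. */
void load_pair_demo(void)
{
    PairCpu cpu = { {0}, {0} };
    bool ok;

    cpu.gpr[11] = 99;
    cpu.gpr[10] = 56;               /* last valid word; addr + 8 is out of range */
    ok = toy_load_pair64(&cpu, 11, 12, 10, 0, 4);
    assert(!ok && cpu.gpr[11] == 99);
}
```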


r~



Re: [PATCH v4 0/2] riscv: Add support for Zicbo[m,z,p] instructions

2023-01-24 Thread Sudip Mukherjee
Hi Christoph,

On Wed, Feb 16, 2022 at 04:48:37PM +0100, Christoph Muellner wrote:
> The RISC-V base cache management operation ISA extension has been
> ratified [1]. This patchset adds support for the defined instructions.
> 
> As the exception behavior of these instructions depend on the PMP
> configuration, the first patch introduces a new API to probe the access
> of an address range with a specified size with optional nonfaulting
> behavior.
> 
> The Zicbo[m,z,p] patch should be straight-forward and has been reviewed
> in previous versions of this patchset.

I have not seen any v5 yet, unless I have missed it. Are you planning to
send one?
FWIW, I rebased them on top of v7.2.0 and tested that it works.

-- 
Regards
Sudip



Re: [PATCH RFC 08/21] ramblock: Cache the length to do file mmap() on ramblocks

2023-01-24 Thread Peter Xu
On Mon, Jan 23, 2023 at 06:51:51PM +, Dr. David Alan Gilbert wrote:
> * Peter Xu (pet...@redhat.com) wrote:
> > We do proper page size alignment for file backed mmap()s for ramblocks.
> > Even if it's as simple as that, cache the value because it'll be used in
> > multiple places.
> > 
> > Since at it, drop size for file_ram_alloc() and just use max_length because
> > that's always true for file-backed ramblocks.
> 
> Having a length previously called 'memory' was a bit odd!

:-D

> 
> > Signed-off-by: Peter Xu 
> 
> Reviewed-by: Dr. David Alan Gilbert 

Thanks,

-- 
Peter Xu




Re: [PATCH v3 02/14] RISC-V: Adding XTheadSync ISA extension

2023-01-24 Thread Richard Henderson

On 1/24/23 09:59, Christoph Muellner wrote:

+static bool trans_th_sfence_vmas(DisasContext *ctx, arg_th_sfence_vmas *a)
+{
+(void) a;
+REQUIRE_XTHEADSYNC(ctx);
+
+#ifndef CONFIG_USER_ONLY
+REQUIRE_PRIV_MS(ctx);
+decode_save_opc(ctx);
+gen_helper_tlb_flush_all(cpu_env);


Why are you using decode_save_opc() when helper_tlb_flush_all() cannot raise an 
exception?


r~



[PATCH] linux-user: un-parent OBJECT(cpu) when closing thread

2023-01-24 Thread Richard Henderson
This reinstates commit 52f0c1607671293afcdb2acc2f83e9bccbfa74bb:

While forcing the CPU to unrealize by hand does trigger the clean-up
code we never fully free resources because refcount never reaches
zero. This is because QOM automatically added objects without an
explicit parent to /unattached/, incrementing the refcount.

Instead of manually triggering unrealization just unparent the object
and let the device machinery deal with that for us.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/866
Signed-off-by: Alex Bennée 
Reviewed-by: Laurent Vivier 
Message-Id: <20220811151413.3350684-2-alex.ben...@linaro.org>

The original patch tickled a problem in target/arm, and was reverted.
But that problem is fixed as of commit 3b07a936d3bf.

Signed-off-by: Richard Henderson 
---
 linux-user/syscall.c | 13 +++--
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index 1f8c10f8ef..4ca1b59343 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -8642,7 +8642,13 @@ static abi_long do_syscall1(CPUArchState *cpu_env, int 
num, abi_long arg1,
 if (CPU_NEXT(first_cpu)) {
 TaskState *ts = cpu->opaque;
 
-object_property_set_bool(OBJECT(cpu), "realized", false, NULL);
+if (ts->child_tidptr) {
+put_user_u32(0, ts->child_tidptr);
+do_sys_futex(g2h(cpu, ts->child_tidptr),
+ FUTEX_WAKE, INT_MAX, NULL, NULL, 0);
+}
+
+object_unparent(OBJECT(cpu));
 object_unref(OBJECT(cpu));
 /*
  * At this point the CPU should be unrealized and removed
@@ -8652,11 +8658,6 @@ static abi_long do_syscall1(CPUArchState *cpu_env, int 
num, abi_long arg1,
 
 pthread_mutex_unlock(_lock);
 
-if (ts->child_tidptr) {
-put_user_u32(0, ts->child_tidptr);
-do_sys_futex(g2h(cpu, ts->child_tidptr),
- FUTEX_WAKE, INT_MAX, NULL, NULL, 0);
-}
 thread_cpu = NULL;
 g_free(ts);
 rcu_unregister_thread();
-- 
2.34.1




[PATCH v3 03/14] RISC-V: Adding XTheadBa ISA extension

2023-01-24 Thread Christoph Muellner
From: Christoph Müllner 

This patch adds support for the XTheadBa ISA extension.
The patch uses the T-Head specific decoder and translation.

Co-developed-by: Philipp Tomsich 
Co-developed-by: LIU Zhiwei 
Reviewed-by: Alistair Francis 
Signed-off-by: Christoph Müllner 
---
Changes in v2:
- Add ISA_EXT_DATA_ENTRY()
- Split XtheadB* extension into individual commits
- Use single decoder for XThead extensions

 target/riscv/cpu.c |  2 ++
 target/riscv/cpu.h |  1 +
 target/riscv/insn_trans/trans_xthead.c.inc | 39 ++
 target/riscv/translate.c   |  3 +-
 target/riscv/xthead.decode | 22 
 5 files changed, 66 insertions(+), 1 deletion(-)

diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index ae2009e89c..4b46130c5b 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -109,6 +109,7 @@ static const struct isa_ext_data isa_edata_arr[] = {
 ISA_EXT_DATA_ENTRY(svinval, true, PRIV_VERSION_1_12_0, ext_svinval),
 ISA_EXT_DATA_ENTRY(svnapot, true, PRIV_VERSION_1_12_0, ext_svnapot),
 ISA_EXT_DATA_ENTRY(svpbmt, true, PRIV_VERSION_1_12_0, ext_svpbmt),
+ISA_EXT_DATA_ENTRY(xtheadba, true, PRIV_VERSION_1_11_0, ext_xtheadba),
 ISA_EXT_DATA_ENTRY(xtheadcmo, true, PRIV_VERSION_1_11_0, ext_xtheadcmo),
 ISA_EXT_DATA_ENTRY(xtheadsync, true, PRIV_VERSION_1_11_0, ext_xtheadsync),
 ISA_EXT_DATA_ENTRY(xventanacondops, true, PRIV_VERSION_1_12_0, ext_XVentanaCondOps),
@@ -1073,6 +1074,7 @@ static Property riscv_cpu_extensions[] = {
 DEFINE_PROP_BOOL("zmmul", RISCVCPU, cfg.ext_zmmul, false),
 
 /* Vendor-specific custom extensions */
+DEFINE_PROP_BOOL("xtheadba", RISCVCPU, cfg.ext_xtheadba, false),
 DEFINE_PROP_BOOL("xtheadcmo", RISCVCPU, cfg.ext_xtheadcmo, false),
 DEFINE_PROP_BOOL("xtheadsync", RISCVCPU, cfg.ext_xtheadsync, false),
 DEFINE_PROP_BOOL("xventanacondops", RISCVCPU, cfg.ext_XVentanaCondOps, false),
diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
index d0ab5c7bb0..d3191bf27b 100644
--- a/target/riscv/cpu.h
+++ b/target/riscv/cpu.h
@@ -473,6 +473,7 @@ struct RISCVCPUConfig {
 uint64_t mimpid;
 
 /* Vendor-specific custom extensions */
+bool ext_xtheadba;
 bool ext_xtheadcmo;
 bool ext_xtheadsync;
 bool ext_XVentanaCondOps;
diff --git a/target/riscv/insn_trans/trans_xthead.c.inc b/target/riscv/insn_trans/trans_xthead.c.inc
index bf5b39c749..a7da156869 100644
--- a/target/riscv/insn_trans/trans_xthead.c.inc
+++ b/target/riscv/insn_trans/trans_xthead.c.inc
@@ -16,6 +16,12 @@
  * this program.  If not, see .
  */
 
+#define REQUIRE_XTHEADBA(ctx) do {   \
+if (!ctx->cfg_ptr->ext_xtheadba) {   \
+return false;\
+}\
+} while (0)
+
 #define REQUIRE_XTHEADCMO(ctx) do {  \
 if (!ctx->cfg_ptr->ext_xtheadcmo) {  \
 return false;\
@@ -28,6 +34,39 @@
 }\
 } while (0)
 
+/* XTheadBa */
+
+/*
+ * th.addsl is similar to sh[123]add (from Zba), but not an
+ * alternative encoding: while sh[123] applies the shift to rs1,
+ * th.addsl shifts rs2.
+ */
+
+#define GEN_TH_ADDSL(SHAMT) \
+static void gen_th_addsl##SHAMT(TCGv ret, TCGv arg1, TCGv arg2) \
+{   \
+TCGv t = tcg_temp_new();\
+tcg_gen_shli_tl(t, arg2, SHAMT);\
+tcg_gen_add_tl(ret, t, arg1);   \
+tcg_temp_free(t);   \
+}
+
+GEN_TH_ADDSL(1)
+GEN_TH_ADDSL(2)
+GEN_TH_ADDSL(3)
+
+#define GEN_TRANS_TH_ADDSL(SHAMT)   \
+static bool trans_th_addsl##SHAMT(DisasContext *ctx,\
+  arg_th_addsl##SHAMT * a)  \
+{   \
+REQUIRE_XTHEADBA(ctx);  \
+return gen_arith(ctx, a, EXT_NONE, gen_th_addsl##SHAMT, NULL);  \
+}
+
+GEN_TRANS_TH_ADDSL(1)
+GEN_TRANS_TH_ADDSL(2)
+GEN_TRANS_TH_ADDSL(3)
+
 /* XTheadCmo */
 
 static inline int priv_level(DisasContext *ctx)
diff --git a/target/riscv/translate.c b/target/riscv/translate.c
index dcda7cfd22..68baf84807 100644
--- a/target/riscv/translate.c
+++ b/target/riscv/translate.c
@@ -130,7 +130,8 @@ static bool always_true_p(DisasContext *ctx  __attribute__((__unused__)))
 
 static bool has_xthead_p(DisasContext *ctx  __attribute__((__unused__)))
 {
-return ctx->cfg_ptr->ext_xtheadcmo || ctx->cfg_ptr->ext_xtheadsync;
+return ctx->cfg_ptr->ext_xtheadba || ctx->cfg_ptr->ext_xtheadcmo ||
+   ctx->cfg_ptr->ext_xtheadsync;
 }
 
 #define 

[PATCH v3 04/14] RISC-V: Adding XTheadBb ISA extension

2023-01-24 Thread Christoph Muellner
From: Christoph Müllner 

This patch adds support for the XTheadBb ISA extension.
The patch uses the T-Head specific decoder and translation.
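
As an illustration of the bitfield extraction that th.ext and th.extu
perform, here is a small reference model. It mirrors the
(lsb, msb - lsb + 1) arguments passed to the TCG extract ops in the patch
below. The function names are made up for this sketch, RV64 is assumed,
and the model only covers the lsb <= msb case (the translation in the
patch leaves rd untouched otherwise).

```c
#include <assert.h>
#include <stdint.h>

/* th.extu: zero-extended extraction of bits [msb:lsb] of rs1. */
uint64_t th_extu(uint64_t rs1, unsigned msb, unsigned lsb)
{
    assert(lsb <= msb && msb < 64);
    unsigned len = msb - lsb + 1;
    uint64_t field = rs1 >> lsb;

    if (len < 64) {
        field &= (UINT64_C(1) << len) - 1;   /* keep the low len bits */
    }
    return field;
}

/* th.ext: sign-extended extraction of bits [msb:lsb] of rs1. */
int64_t th_ext(uint64_t rs1, unsigned msb, unsigned lsb)
{
    uint64_t field = th_extu(rs1, msb, lsb);
    uint64_t sign = UINT64_C(1) << (msb - lsb);  /* top bit of the field */

    /*
     * (field ^ sign) - sign sign-extends a len-bit value. The final
     * conversion to int64_t assumes the usual two's complement behaviour.
     */
    return (int64_t)((field ^ sign) - sign);
}
```

For example, bits [11:4] of 0xABCD are 0xBC, which th.extu returns as 188
and th.ext as -68.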

Co-developed-by: Philipp Tomsich 
Co-developed-by: LIU Zhiwei 
Reviewed-by: Alistair Francis 
Signed-off-by: Christoph Müllner 
---
Changes in v2:
- Add ISA_EXT_DATA_ENTRY()
- Split XtheadB* extension into individual commits
- Make implementation compatible with RV32.
- Use single decoder for XThead extensions

 target/riscv/cpu.c |   2 +
 target/riscv/cpu.h |   1 +
 target/riscv/insn_trans/trans_xthead.c.inc | 124 +
 target/riscv/translate.c   |   4 +-
 target/riscv/xthead.decode |  20 
 5 files changed, 149 insertions(+), 2 deletions(-)

diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index 4b46130c5b..b995470dd6 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -110,6 +110,7 @@ static const struct isa_ext_data isa_edata_arr[] = {
 ISA_EXT_DATA_ENTRY(svnapot, true, PRIV_VERSION_1_12_0, ext_svnapot),
 ISA_EXT_DATA_ENTRY(svpbmt, true, PRIV_VERSION_1_12_0, ext_svpbmt),
 ISA_EXT_DATA_ENTRY(xtheadba, true, PRIV_VERSION_1_11_0, ext_xtheadba),
+ISA_EXT_DATA_ENTRY(xtheadbb, true, PRIV_VERSION_1_11_0, ext_xtheadbb),
 ISA_EXT_DATA_ENTRY(xtheadcmo, true, PRIV_VERSION_1_11_0, ext_xtheadcmo),
 ISA_EXT_DATA_ENTRY(xtheadsync, true, PRIV_VERSION_1_11_0, ext_xtheadsync),
 ISA_EXT_DATA_ENTRY(xventanacondops, true, PRIV_VERSION_1_12_0, ext_XVentanaCondOps),
@@ -1075,6 +1076,7 @@ static Property riscv_cpu_extensions[] = {
 
 /* Vendor-specific custom extensions */
 DEFINE_PROP_BOOL("xtheadba", RISCVCPU, cfg.ext_xtheadba, false),
+DEFINE_PROP_BOOL("xtheadbb", RISCVCPU, cfg.ext_xtheadbb, false),
 DEFINE_PROP_BOOL("xtheadcmo", RISCVCPU, cfg.ext_xtheadcmo, false),
 DEFINE_PROP_BOOL("xtheadsync", RISCVCPU, cfg.ext_xtheadsync, false),
 DEFINE_PROP_BOOL("xventanacondops", RISCVCPU, cfg.ext_XVentanaCondOps, false),
diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
index d3191bf27b..ff92705010 100644
--- a/target/riscv/cpu.h
+++ b/target/riscv/cpu.h
@@ -474,6 +474,7 @@ struct RISCVCPUConfig {
 
 /* Vendor-specific custom extensions */
 bool ext_xtheadba;
+bool ext_xtheadbb;
 bool ext_xtheadcmo;
 bool ext_xtheadsync;
 bool ext_XVentanaCondOps;
diff --git a/target/riscv/insn_trans/trans_xthead.c.inc b/target/riscv/insn_trans/trans_xthead.c.inc
index a7da156869..ea6cd6e305 100644
--- a/target/riscv/insn_trans/trans_xthead.c.inc
+++ b/target/riscv/insn_trans/trans_xthead.c.inc
@@ -22,6 +22,12 @@
 }\
 } while (0)
 
+#define REQUIRE_XTHEADBB(ctx) do {   \
+if (!ctx->cfg_ptr->ext_xtheadbb) {   \
+return false;\
+}\
+} while (0)
+
 #define REQUIRE_XTHEADCMO(ctx) do {  \
 if (!ctx->cfg_ptr->ext_xtheadcmo) {  \
 return false;\
@@ -67,6 +73,124 @@ GEN_TRANS_TH_ADDSL(1)
 GEN_TRANS_TH_ADDSL(2)
 GEN_TRANS_TH_ADDSL(3)
 
+/* XTheadBb */
+
+/* th.srri is an alternate encoding for rori (from Zbb) */
+static bool trans_th_srri(DisasContext *ctx, arg_th_srri * a)
+{
+REQUIRE_XTHEADBB(ctx);
+return gen_shift_imm_fn_per_ol(ctx, a, EXT_NONE,
+   tcg_gen_rotri_tl, gen_roriw, NULL);
+}
+
+/* th.srriw is an alternate encoding for roriw (from Zbb) */
+static bool trans_th_srriw(DisasContext *ctx, arg_th_srriw *a)
+{
+REQUIRE_XTHEADBB(ctx);
+REQUIRE_64BIT(ctx);
+ctx->ol = MXL_RV32;
+return gen_shift_imm_fn(ctx, a, EXT_NONE, gen_roriw, NULL);
+}
+
+/* th.ext and th.extu perform signed/unsigned bitfield extraction */
+static bool gen_th_bfextract(DisasContext *ctx, arg_th_bfext *a,
+ void (*f)(TCGv, TCGv, unsigned int, unsigned int))
+{
+TCGv dest = dest_gpr(ctx, a->rd);
+TCGv source = get_gpr(ctx, a->rs1, EXT_ZERO);
+
+if (a->lsb <= a->msb) {
+f(dest, source, a->lsb, a->msb - a->lsb + 1);
+gen_set_gpr(ctx, a->rd, dest);
+}
+return true;
+}
+
+static bool trans_th_ext(DisasContext *ctx, arg_th_ext *a)
+{
+REQUIRE_XTHEADBB(ctx);
+return gen_th_bfextract(ctx, a, tcg_gen_sextract_tl);
+}
+
+static bool trans_th_extu(DisasContext *ctx, arg_th_extu *a)
+{
+REQUIRE_XTHEADBB(ctx);
+return gen_th_bfextract(ctx, a, tcg_gen_extract_tl);
+}
+
+/* th.ff0: find first zero (clz on an inverted input) */
+static bool gen_th_ff0(DisasContext *ctx, arg_th_ff0 *a, DisasExtend ext)
+{
+TCGv dest = dest_gpr(ctx, a->rd);
+TCGv src1 = get_gpr(ctx, a->rs1, ext);
+
+int olen = get_olen(ctx);
+TCGv t = tcg_temp_new();
+
+tcg_gen_not_tl(t, src1);
+if (olen != TARGET_LONG_BITS) {
+if (olen == 32) {
+gen_clzw(dest, t);
+} else {
+

[PATCH v3 12/14] RISC-V: Add initial support for T-Head C906

2023-01-24 Thread Christoph Muellner
From: Christoph Müllner 

This patch adds the T-Head C906 to the list of known CPUs.
Selecting this CPU will automatically enable the ISA extensions
available on it (incl. vendor extensions).

Co-developed-by: LIU Zhiwei 
Reviewed-by: Alistair Francis 
Signed-off-by: Christoph Müllner 
---
Changes in v2:
- Drop C910 as it does not differ from C906
- Set priv version to 1.11 (new fmin/fmax behaviour)

Changes in v3:
- Removed setting dropped 'xtheadxmae' extension

 target/riscv/cpu.c  | 30 ++
 target/riscv/cpu.h  |  2 ++
 target/riscv/cpu_vendorid.h |  6 ++
 3 files changed, 38 insertions(+)
 create mode 100644 target/riscv/cpu_vendorid.h

diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index b18df9fa2a..627512a184 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -279,6 +279,35 @@ static void rv64_sifive_e_cpu_init(Object *obj)
 cpu->cfg.mmu = false;
 }
 
+static void rv64_thead_c906_cpu_init(Object *obj)
+{
+CPURISCVState *env = &RISCV_CPU(obj)->env;
+RISCVCPU *cpu = RISCV_CPU(obj);
+
+set_misa(env, MXL_RV64, RVI | RVM | RVA | RVF | RVD | RVC | RVS | RVU);
+set_priv_version(env, PRIV_VERSION_1_11_0);
+
+cpu->cfg.ext_g = true;
+cpu->cfg.ext_c = true;
+cpu->cfg.ext_u = true;
+cpu->cfg.ext_s = true;
+cpu->cfg.ext_icsr = true;
+cpu->cfg.ext_zfh = true;
+cpu->cfg.mmu = true;
+cpu->cfg.ext_xtheadba = true;
+cpu->cfg.ext_xtheadbb = true;
+cpu->cfg.ext_xtheadbs = true;
+cpu->cfg.ext_xtheadcmo = true;
+cpu->cfg.ext_xtheadcondmov = true;
+cpu->cfg.ext_xtheadfmemidx = true;
+cpu->cfg.ext_xtheadmac = true;
+cpu->cfg.ext_xtheadmemidx = true;
+cpu->cfg.ext_xtheadmempair = true;
+cpu->cfg.ext_xtheadsync = true;
+
+cpu->cfg.mvendorid = THEAD_VENDOR_ID;
+}
+
 static void rv128_base_cpu_init(Object *obj)
 {
 if (qemu_tcg_mttcg_enabled()) {
@@ -1320,6 +1349,7 @@ static const TypeInfo riscv_cpu_type_infos[] = {
 DEFINE_CPU(TYPE_RISCV_CPU_SIFIVE_E51,   rv64_sifive_e_cpu_init),
 DEFINE_CPU(TYPE_RISCV_CPU_SIFIVE_U54,   rv64_sifive_u_cpu_init),
 DEFINE_CPU(TYPE_RISCV_CPU_SHAKTI_C, rv64_sifive_u_cpu_init),
+DEFINE_CPU(TYPE_RISCV_CPU_THEAD_C906,   rv64_thead_c906_cpu_init),
 DEFINE_CPU(TYPE_RISCV_CPU_BASE128,  rv128_base_cpu_init),
 #endif
 };
diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
index b0ec5fcf9e..134dc29c6e 100644
--- a/target/riscv/cpu.h
+++ b/target/riscv/cpu.h
@@ -27,6 +27,7 @@
 #include "qom/object.h"
 #include "qemu/int128.h"
 #include "cpu_bits.h"
+#include "cpu_vendorid.h"
 
 #define TCG_GUEST_DEFAULT_MO 0
 
@@ -53,6 +54,7 @@
 #define TYPE_RISCV_CPU_SIFIVE_E51   RISCV_CPU_TYPE_NAME("sifive-e51")
 #define TYPE_RISCV_CPU_SIFIVE_U34   RISCV_CPU_TYPE_NAME("sifive-u34")
 #define TYPE_RISCV_CPU_SIFIVE_U54   RISCV_CPU_TYPE_NAME("sifive-u54")
+#define TYPE_RISCV_CPU_THEAD_C906   RISCV_CPU_TYPE_NAME("thead-c906")
 #define TYPE_RISCV_CPU_HOST RISCV_CPU_TYPE_NAME("host")
 
 #if defined(TARGET_RISCV32)
diff --git a/target/riscv/cpu_vendorid.h b/target/riscv/cpu_vendorid.h
new file mode 100644
index 00..a5aa249bc9
--- /dev/null
+++ b/target/riscv/cpu_vendorid.h
@@ -0,0 +1,6 @@
+#ifndef TARGET_RISCV_CPU_VENDORID_H
+#define TARGET_RISCV_CPU_VENDORID_H
+
+#define THEAD_VENDOR_ID 0x5b7
+
+#endif /*  TARGET_RISCV_CPU_VENDORID_H */
-- 
2.39.0




[PATCH v3 06/14] RISC-V: Adding XTheadCondMov ISA extension

2023-01-24 Thread Christoph Muellner
From: Christoph Müllner 

This patch adds support for the XTheadCondMov ISA extension.
The patch uses the T-Head specific decoder and translation.
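
To make the semantics concrete, here is a minimal reference model of the
two conditional moves, following the "if (rs2 == 0) rd = rs1;" comments in
the patch below. The movcond() helper imitates the operand order of the
TCG op used in the patch (compare a with b, select v_true or v_false). All
names here are made up for this sketch and RV64 is assumed.

```c
#include <assert.h>
#include <stdint.h>

enum cond { COND_EQ, COND_NE };

/* Select v_true if (a cond b) holds, v_false otherwise. */
uint64_t movcond(enum cond c, uint64_t a, uint64_t b,
                 uint64_t v_true, uint64_t v_false)
{
    switch (c) {
    case COND_EQ:
        return a == b ? v_true : v_false;
    case COND_NE:
        return a != b ? v_true : v_false;
    }
    assert(0 && "unknown condition");
    return v_false;
}

/* th.mveqz: if (rs2 == 0) rd = rs1; otherwise rd keeps its old value. */
uint64_t th_mveqz(uint64_t rd, uint64_t rs1, uint64_t rs2)
{
    return movcond(COND_EQ, rs2, 0, rs1, rd);
}

/* th.mvnez: if (rs2 != 0) rd = rs1; otherwise rd keeps its old value. */
uint64_t th_mvnez(uint64_t rd, uint64_t rs1, uint64_t rs2)
{
    return movcond(COND_NE, rs2, 0, rs1, rd);
}
```

Note that the old value of rd is an input to the operation, which is why
the translation below reads rd before writing it.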

Co-developed-by: LIU Zhiwei 
Reviewed-by: Alistair Francis 
Signed-off-by: Christoph Müllner 
---
Changes in v2:
- Add ISA_EXT_DATA_ENTRY()
- Fix invalid use of register from dest_gpr()
- Use single decoder for XThead extensions

 target/riscv/cpu.c |  2 ++
 target/riscv/cpu.h |  1 +
 target/riscv/insn_trans/trans_xthead.c.inc | 35 ++
 target/riscv/translate.c   |  2 +-
 target/riscv/xthead.decode |  4 +++
 5 files changed, 43 insertions(+), 1 deletion(-)

diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index 805fec4d76..b3ede7223a 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -113,6 +113,7 @@ static const struct isa_ext_data isa_edata_arr[] = {
 ISA_EXT_DATA_ENTRY(xtheadbb, true, PRIV_VERSION_1_11_0, ext_xtheadbb),
 ISA_EXT_DATA_ENTRY(xtheadbs, true, PRIV_VERSION_1_11_0, ext_xtheadbs),
 ISA_EXT_DATA_ENTRY(xtheadcmo, true, PRIV_VERSION_1_11_0, ext_xtheadcmo),
+ISA_EXT_DATA_ENTRY(xtheadcondmov, true, PRIV_VERSION_1_11_0, ext_xtheadcondmov),
 ISA_EXT_DATA_ENTRY(xtheadsync, true, PRIV_VERSION_1_11_0, ext_xtheadsync),
 ISA_EXT_DATA_ENTRY(xventanacondops, true, PRIV_VERSION_1_12_0, ext_XVentanaCondOps),
 };
@@ -1080,6 +1081,7 @@ static Property riscv_cpu_extensions[] = {
 DEFINE_PROP_BOOL("xtheadbb", RISCVCPU, cfg.ext_xtheadbb, false),
 DEFINE_PROP_BOOL("xtheadbs", RISCVCPU, cfg.ext_xtheadbs, false),
 DEFINE_PROP_BOOL("xtheadcmo", RISCVCPU, cfg.ext_xtheadcmo, false),
+DEFINE_PROP_BOOL("xtheadcondmov", RISCVCPU, cfg.ext_xtheadcondmov, false),
 DEFINE_PROP_BOOL("xtheadsync", RISCVCPU, cfg.ext_xtheadsync, false),
 DEFINE_PROP_BOOL("xventanacondops", RISCVCPU, cfg.ext_XVentanaCondOps, false),
 
diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
index 2f92211d9f..5286bd487c 100644
--- a/target/riscv/cpu.h
+++ b/target/riscv/cpu.h
@@ -477,6 +477,7 @@ struct RISCVCPUConfig {
 bool ext_xtheadbb;
 bool ext_xtheadbs;
 bool ext_xtheadcmo;
+bool ext_xtheadcondmov;
 bool ext_xtheadsync;
 bool ext_XVentanaCondOps;
 
diff --git a/target/riscv/insn_trans/trans_xthead.c.inc b/target/riscv/insn_trans/trans_xthead.c.inc
index 339a54e3d6..894b95a741 100644
--- a/target/riscv/insn_trans/trans_xthead.c.inc
+++ b/target/riscv/insn_trans/trans_xthead.c.inc
@@ -40,6 +40,12 @@
 }\
 } while (0)
 
+#define REQUIRE_XTHEADCONDMOV(ctx) do {  \
+if (!ctx->cfg_ptr->ext_xtheadcondmov) {  \
+return false;\
+}\
+} while (0)
+
 #define REQUIRE_XTHEADSYNC(ctx) do { \
 if (!ctx->cfg_ptr->ext_xtheadsync) { \
 return false;\
@@ -264,6 +270,35 @@ NOP_PRIVCHECK(th_l2cache_call, REQUIRE_XTHEADCMO, REQUIRE_PRIV_MS)
 NOP_PRIVCHECK(th_l2cache_ciall, REQUIRE_XTHEADCMO, REQUIRE_PRIV_MS)
 NOP_PRIVCHECK(th_l2cache_iall, REQUIRE_XTHEADCMO, REQUIRE_PRIV_MS)
 
+/* XTheadCondMov */
+
+static bool gen_th_condmove(DisasContext *ctx, arg_r *a, TCGCond cond)
+{
+TCGv src1 = get_gpr(ctx, a->rs1, EXT_NONE);
+TCGv src2 = get_gpr(ctx, a->rs2, EXT_NONE);
+TCGv old = get_gpr(ctx, a->rd, EXT_NONE);
+TCGv dest = dest_gpr(ctx, a->rd);
+
+tcg_gen_movcond_tl(cond, dest, src2, ctx->zero, src1, old);
+
+gen_set_gpr(ctx, a->rd, dest);
+return true;
+}
+
+/* th.mveqz: "if (rs2 == 0) rd = rs1;" */
+static bool trans_th_mveqz(DisasContext *ctx, arg_th_mveqz *a)
+{
+REQUIRE_XTHEADCONDMOV(ctx);
+return gen_th_condmove(ctx, a, TCG_COND_EQ);
+}
+
+/* th.mvnez: "if (rs2 != 0) rd = rs1;" */
+static bool trans_th_mvnez(DisasContext *ctx, arg_th_mveqz *a)
+{
+REQUIRE_XTHEADCONDMOV(ctx);
+return gen_th_condmove(ctx, a, TCG_COND_NE);
+}
+
 /* XTheadSync */
 
 static bool trans_th_sfence_vmas(DisasContext *ctx, arg_th_sfence_vmas *a)
diff --git a/target/riscv/translate.c b/target/riscv/translate.c
index 96bdf5fb73..d61705e775 100644
--- a/target/riscv/translate.c
+++ b/target/riscv/translate.c
@@ -132,7 +132,7 @@ static bool has_xthead_p(DisasContext *ctx  __attribute__((__unused__)))
 {
 return ctx->cfg_ptr->ext_xtheadba || ctx->cfg_ptr->ext_xtheadbb ||
ctx->cfg_ptr->ext_xtheadbs || ctx->cfg_ptr->ext_xtheadcmo ||
-   ctx->cfg_ptr->ext_xtheadsync;
+   ctx->cfg_ptr->ext_xtheadcondmov || ctx->cfg_ptr->ext_xtheadsync;
 }
 
 #define MATERIALISE_EXT_PREDICATE(ext)  \
diff --git a/target/riscv/xthead.decode b/target/riscv/xthead.decode
index 8494805611..a8ebd8a18b 100644
--- a/target/riscv/xthead.decode
+++ b/target/riscv/xthead.decode
@@ -84,6 +84,10 @@ th_l2cache_call  000 10101 0 000 0 0001011
 th_l2cache_ciall 000 10111 0 000 0 0001011
 th_l2cache_iall  

[PATCH v3 14/14] target/riscv: add a MAINTAINERS entry for XThead* extension support

2023-01-24 Thread Christoph Muellner
From: Christoph Müllner 

The XThead* extensions are maintained by T-Head and VRULL.
Add a point of contact from both companies.

Signed-off-by: LIU Zhiwei 
Reviewed-by: Alistair Francis 
Signed-off-by: Christoph Müllner 
---
 MAINTAINERS | 8 
 1 file changed, 8 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 6982be48c6..f16916fd07 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -295,6 +295,14 @@ F: include/hw/riscv/
 F: linux-user/host/riscv32/
 F: linux-user/host/riscv64/
 
+RISC-V XThead* extensions
+M: Christoph Muellner 
+M: LIU Zhiwei 
+L: qemu-ri...@nongnu.org
+S: Supported
+F: target/riscv/insn_trans/trans_xthead.c.inc
+F: target/riscv/xthead*.decode
+
 RISC-V XVentanaCondOps extension
 M: Philipp Tomsich 
 L: qemu-ri...@nongnu.org
-- 
2.39.0




[PATCH v3 11/14] RISC-V: Set minimum priv version for Zfh to 1.11

2023-01-24 Thread Christoph Muellner
From: Christoph Müllner 

There are no differences in the floating-point instructions between priv
versions 1.11 and 1.12, and Zfh does not depend on priv version 1.12.
Therefore allow Zfh to be enabled for priv version 1.11.

Acked-by: Alistair Francis 
Signed-off-by: Christoph Müllner 
---
 target/riscv/cpu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index 6121a5e4ba..b18df9fa2a 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -77,7 +77,7 @@ static const struct isa_ext_data isa_edata_arr[] = {
 ISA_EXT_DATA_ENTRY(zifencei, true, PRIV_VERSION_1_10_0, ext_ifencei),
 ISA_EXT_DATA_ENTRY(zihintpause, true, PRIV_VERSION_1_10_0, ext_zihintpause),
 ISA_EXT_DATA_ENTRY(zawrs, true, PRIV_VERSION_1_12_0, ext_zawrs),
-ISA_EXT_DATA_ENTRY(zfh, true, PRIV_VERSION_1_12_0, ext_zfh),
+ISA_EXT_DATA_ENTRY(zfh, true, PRIV_VERSION_1_11_0, ext_zfh),
 ISA_EXT_DATA_ENTRY(zfhmin, true, PRIV_VERSION_1_12_0, ext_zfhmin),
 ISA_EXT_DATA_ENTRY(zfinx, true, PRIV_VERSION_1_12_0, ext_zfinx),
 ISA_EXT_DATA_ENTRY(zdinx, true, PRIV_VERSION_1_12_0, ext_zdinx),
-- 
2.39.0




[PATCH v3 13/14] RISC-V: Adding XTheadFmv ISA extension

2023-01-24 Thread Christoph Muellner
From: Christoph Müllner 

This patch adds support for the XTheadFmv ISA extension.
The patch uses the T-Head specific decoder and translation.
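
As a rough illustration, the two instructions in the patch below move the
upper 32 bits of a 64-bit FP register to or from a GPR on RV32. Here is a
minimal model of that data movement, following the deposit/extract at bit
position 32 in the translation code. The function names, including the
build_fpr_from_halves() helper, are made up for this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* th.fmv.hw.x: write the GPR value into bits [63:32] of the FP register. */
uint64_t th_fmv_hw_x(uint64_t fpr, uint32_t gpr)
{
    return (fpr & UINT64_C(0xFFFFFFFF)) | ((uint64_t)gpr << 32);
}

/* th.fmv.x.hw: read bits [63:32] of the FP register into a GPR. */
uint32_t th_fmv_x_hw(uint64_t fpr)
{
    return (uint32_t)(fpr >> 32);
}

/*
 * Hypothetical usage: assemble a 64-bit FP register image from two
 * 32-bit halves. The low half starts out NaN-boxed, as a 32-bit FP move
 * would leave it, and the high half is then overwritten.
 */
uint64_t build_fpr_from_halves(uint32_t lo, uint32_t hi)
{
    uint64_t fpr = UINT64_C(0xFFFFFFFF00000000) | lo;

    fpr = th_fmv_hw_x(fpr, hi);
    assert(th_fmv_x_hw(fpr) == hi);   /* the round trip returns the high half */
    return fpr;
}
```

The low 32 bits are left untouched by th.fmv.hw.x, so a double can be
assembled in two steps without going through memory.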

Signed-off-by: LIU Zhiwei 
Reviewed-by: Alistair Francis 
Signed-off-by: Christoph Müllner 
---
 target/riscv/cpu.c |  2 +
 target/riscv/cpu.h |  1 +
 target/riscv/insn_trans/trans_xthead.c.inc | 45 ++
 target/riscv/translate.c   |  6 +--
 target/riscv/xthead.decode |  4 ++
 5 files changed, 55 insertions(+), 3 deletions(-)

diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index 627512a184..1878c17a59 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -115,6 +115,7 @@ static const struct isa_ext_data isa_edata_arr[] = {
 ISA_EXT_DATA_ENTRY(xtheadcmo, true, PRIV_VERSION_1_11_0, ext_xtheadcmo),
 ISA_EXT_DATA_ENTRY(xtheadcondmov, true, PRIV_VERSION_1_11_0, ext_xtheadcondmov),
 ISA_EXT_DATA_ENTRY(xtheadfmemidx, true, PRIV_VERSION_1_11_0, ext_xtheadfmemidx),
+ISA_EXT_DATA_ENTRY(xtheadfmv, true, PRIV_VERSION_1_11_0, ext_xtheadfmv),
 ISA_EXT_DATA_ENTRY(xtheadmac, true, PRIV_VERSION_1_11_0, ext_xtheadmac),
 ISA_EXT_DATA_ENTRY(xtheadmemidx, true, PRIV_VERSION_1_11_0, ext_xtheadmemidx),
 ISA_EXT_DATA_ENTRY(xtheadmempair, true, PRIV_VERSION_1_11_0, ext_xtheadmempair),
@@ -1116,6 +1117,7 @@ static Property riscv_cpu_extensions[] = {
 DEFINE_PROP_BOOL("xtheadcmo", RISCVCPU, cfg.ext_xtheadcmo, false),
 DEFINE_PROP_BOOL("xtheadcondmov", RISCVCPU, cfg.ext_xtheadcondmov, false),
 DEFINE_PROP_BOOL("xtheadfmemidx", RISCVCPU, cfg.ext_xtheadfmemidx, false),
+DEFINE_PROP_BOOL("xtheadfmv", RISCVCPU, cfg.ext_xtheadfmv, false),
 DEFINE_PROP_BOOL("xtheadmac", RISCVCPU, cfg.ext_xtheadmac, false),
 DEFINE_PROP_BOOL("xtheadmemidx", RISCVCPU, cfg.ext_xtheadmemidx, false),
 DEFINE_PROP_BOOL("xtheadmempair", RISCVCPU, cfg.ext_xtheadmempair, false),
diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
index 134dc29c6e..04630f3b79 100644
--- a/target/riscv/cpu.h
+++ b/target/riscv/cpu.h
@@ -481,6 +481,7 @@ struct RISCVCPUConfig {
 bool ext_xtheadcmo;
 bool ext_xtheadcondmov;
 bool ext_xtheadfmemidx;
+bool ext_xtheadfmv;
 bool ext_xtheadmac;
 bool ext_xtheadmemidx;
 bool ext_xtheadmempair;
diff --git a/target/riscv/insn_trans/trans_xthead.c.inc b/target/riscv/insn_trans/trans_xthead.c.inc
index dc82a9fc03..0403e90d7a 100644
--- a/target/riscv/insn_trans/trans_xthead.c.inc
+++ b/target/riscv/insn_trans/trans_xthead.c.inc
@@ -52,6 +52,12 @@
 }\
 } while (0)
 
+#define REQUIRE_XTHEADFMV(ctx) do {  \
+if (!ctx->cfg_ptr->ext_xtheadfmv) {  \
+return false;\
+}\
+} while (0)
+
 #define REQUIRE_XTHEADMAC(ctx) do {  \
 if (!ctx->cfg_ptr->ext_xtheadmac) {  \
 return false;\
@@ -449,6 +455,45 @@ static bool trans_th_fsurw(DisasContext *ctx, arg_th_memidx *a)
 return gen_fstore_idx(ctx, a, MO_TEUL, true);
 }
 
+/* XTheadFmv */
+
+static bool trans_th_fmv_hw_x(DisasContext *ctx, arg_th_fmv_hw_x *a)
+{
+REQUIRE_XTHEADFMV(ctx);
+REQUIRE_32BIT(ctx);
+REQUIRE_FPU;
+REQUIRE_EXT(ctx, RVD);
+
+TCGv src1 = get_gpr(ctx, a->rs1, EXT_ZERO);
+TCGv_i64 t1 = tcg_temp_new_i64();
+
+tcg_gen_extu_tl_i64(t1, src1);
+tcg_gen_deposit_i64(cpu_fpr[a->rd], cpu_fpr[a->rd], t1, 32, 32);
+tcg_temp_free_i64(t1);
+mark_fs_dirty(ctx);
+return true;
+}
+
+static bool trans_th_fmv_x_hw(DisasContext *ctx, arg_th_fmv_x_hw *a)
+{
+REQUIRE_XTHEADFMV(ctx);
+REQUIRE_32BIT(ctx);
+REQUIRE_FPU;
+REQUIRE_EXT(ctx, RVD);
+TCGv dst;
+TCGv_i64 t1;
+
+dst = dest_gpr(ctx, a->rd);
+t1 = tcg_temp_new_i64();
+
+tcg_gen_extract_i64(t1, cpu_fpr[a->rs1], 32, 32);
+tcg_gen_trunc_i64_tl(dst, t1);
+gen_set_gpr(ctx, a->rd, dst);
+tcg_temp_free_i64(t1);
+mark_fs_dirty(ctx);
+return true;
+}
+
 /* XTheadMac */
 
 static bool gen_th_mac(DisasContext *ctx, arg_r *a,
diff --git a/target/riscv/translate.c b/target/riscv/translate.c
index c52bc5e0af..d6163daeb2 100644
--- a/target/riscv/translate.c
+++ b/target/riscv/translate.c
@@ -133,9 +133,9 @@ static bool has_xthead_p(DisasContext *ctx  __attribute__((__unused__)))
 return ctx->cfg_ptr->ext_xtheadba || ctx->cfg_ptr->ext_xtheadbb ||
ctx->cfg_ptr->ext_xtheadbs || ctx->cfg_ptr->ext_xtheadcmo ||
ctx->cfg_ptr->ext_xtheadcondmov ||
-   ctx->cfg_ptr->ext_xtheadfmemidx || ctx->cfg_ptr->ext_xtheadmac ||
-   ctx->cfg_ptr->ext_xtheadmemidx || ctx->cfg_ptr->ext_xtheadmempair ||
-   ctx->cfg_ptr->ext_xtheadsync;
+   ctx->cfg_ptr->ext_xtheadfmemidx || ctx->cfg_ptr->ext_xtheadfmv ||
+   ctx->cfg_ptr->ext_xtheadmac || ctx->cfg_ptr->ext_xtheadmemidx ||
+  

[PATCH v3 10/14] RISC-V: Adding T-Head FMemIdx extension

2023-01-24 Thread Christoph Muellner
From: Christoph Müllner 

This patch adds support for the T-Head FMemIdx instructions.
The patch uses the T-Head specific decoder and translation.
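
The address calculation shared by the indexed loads and stores can be
sketched as follows, based on the comments in the patch below: the address
is rs1 + (rs2 << imm2), or rs1 + (zext(rs2[31:0]) << imm2) for the
zero-extending variants. The function name is made up for this sketch,
RV64 is assumed, and the 0..3 range for imm2 is an assumption derived from
the field name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Address of an indexed access:
 *   !zext_offs: rs1 + (rs2 << imm2)
 *    zext_offs: rs1 + (zext(rs2[31:0]) << imm2)
 */
uint64_t th_address_indexed(uint64_t rs1, uint64_t rs2, unsigned imm2,
                            bool zext_offs)
{
    assert(imm2 <= 3);   /* assumed: imm2 is a 2-bit field */

    /* Optionally keep only the low 32 bits of the index register. */
    uint64_t offs = zext_offs ? (uint64_t)(uint32_t)rs2 : rs2;

    return rs1 + (offs << imm2);
}
```

The two variants only differ when the upper 32 bits of rs2 are non-zero.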

Co-developed-by: LIU Zhiwei 
Reviewed-by: Alistair Francis 
Signed-off-by: Christoph Müllner 
---
Changes in v2:
- Add ISA_EXT_DATA_ENTRY()
- Use single decoder for XThead extensions
- Use get_th_address_indexed for address calculations

 target/riscv/cpu.c |   2 +
 target/riscv/cpu.h |   1 +
 target/riscv/insn_trans/trans_xthead.c.inc | 108 +
 target/riscv/translate.c   |   3 +-
 target/riscv/xthead.decode |  10 ++
 5 files changed, 123 insertions(+), 1 deletion(-)

diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index eb8bbfa436..6121a5e4ba 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -114,6 +114,7 @@ static const struct isa_ext_data isa_edata_arr[] = {
 ISA_EXT_DATA_ENTRY(xtheadbs, true, PRIV_VERSION_1_11_0, ext_xtheadbs),
 ISA_EXT_DATA_ENTRY(xtheadcmo, true, PRIV_VERSION_1_11_0, ext_xtheadcmo),
 ISA_EXT_DATA_ENTRY(xtheadcondmov, true, PRIV_VERSION_1_11_0, ext_xtheadcondmov),
+ISA_EXT_DATA_ENTRY(xtheadfmemidx, true, PRIV_VERSION_1_11_0, ext_xtheadfmemidx),
 ISA_EXT_DATA_ENTRY(xtheadmac, true, PRIV_VERSION_1_11_0, ext_xtheadmac),
 ISA_EXT_DATA_ENTRY(xtheadmemidx, true, PRIV_VERSION_1_11_0, ext_xtheadmemidx),
 ISA_EXT_DATA_ENTRY(xtheadmempair, true, PRIV_VERSION_1_11_0, ext_xtheadmempair),
@@ -1085,6 +1086,7 @@ static Property riscv_cpu_extensions[] = {
 DEFINE_PROP_BOOL("xtheadbs", RISCVCPU, cfg.ext_xtheadbs, false),
 DEFINE_PROP_BOOL("xtheadcmo", RISCVCPU, cfg.ext_xtheadcmo, false),
 DEFINE_PROP_BOOL("xtheadcondmov", RISCVCPU, cfg.ext_xtheadcondmov, false),
+DEFINE_PROP_BOOL("xtheadfmemidx", RISCVCPU, cfg.ext_xtheadfmemidx, false),
 DEFINE_PROP_BOOL("xtheadmac", RISCVCPU, cfg.ext_xtheadmac, false),
 DEFINE_PROP_BOOL("xtheadmemidx", RISCVCPU, cfg.ext_xtheadmemidx, false),
 DEFINE_PROP_BOOL("xtheadmempair", RISCVCPU, cfg.ext_xtheadmempair, false),
diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
index 4882b9a9cc..b0ec5fcf9e 100644
--- a/target/riscv/cpu.h
+++ b/target/riscv/cpu.h
@@ -478,6 +478,7 @@ struct RISCVCPUConfig {
 bool ext_xtheadbs;
 bool ext_xtheadcmo;
 bool ext_xtheadcondmov;
+bool ext_xtheadfmemidx;
 bool ext_xtheadmac;
 bool ext_xtheadmemidx;
 bool ext_xtheadmempair;
diff --git a/target/riscv/insn_trans/trans_xthead.c.inc b/target/riscv/insn_trans/trans_xthead.c.inc
index e41f3be9a6..dc82a9fc03 100644
--- a/target/riscv/insn_trans/trans_xthead.c.inc
+++ b/target/riscv/insn_trans/trans_xthead.c.inc
@@ -46,6 +46,12 @@
 }\
 } while (0)
 
+#define REQUIRE_XTHEADFMEMIDX(ctx) do {  \
+if (!ctx->cfg_ptr->ext_xtheadfmemidx) {  \
+return false;\
+}\
+} while (0)
+
 #define REQUIRE_XTHEADMAC(ctx) do {  \
 if (!ctx->cfg_ptr->ext_xtheadmac) {  \
 return false;\
@@ -341,6 +347,108 @@ static bool trans_th_mvnez(DisasContext *ctx, arg_th_mveqz *a)
 return gen_th_condmove(ctx, a, TCG_COND_NE);
 }
 
+/* XTheadFMemIdx */
+
+/*
+ * Load 64-bit float from indexed address.
+ * If !zext_offs, then address is rs1 + (rs2 << imm2).
+ * If  zext_offs, then address is rs1 + (zext(rs2[31:0]) << imm2).
+ */
+static bool gen_fload_idx(DisasContext *ctx, arg_th_memidx *a, MemOp memop,
+  bool zext_offs)
+{
+TCGv_i64 rd = cpu_fpr[a->rd];
+TCGv addr = get_th_address_indexed(ctx, a->rs1, a->rs2, a->imm2, zext_offs);
+
+tcg_gen_qemu_ld_i64(rd, addr, ctx->mem_idx, memop);
+if ((memop & MO_SIZE) == MO_32) {
+gen_nanbox_s(rd, rd);
+}
+
+mark_fs_dirty(ctx);
+return true;
+}
+
+/*
+ * Store 64-bit float to indexed address.
+ * If !zext_offs, then address is rs1 + (rs2 << imm2).
+ * If  zext_offs, then address is rs1 + (zext(rs2[31:0]) << imm2).
+ */
+static bool gen_fstore_idx(DisasContext *ctx, arg_th_memidx *a, MemOp memop,
+   bool zext_offs)
+{
+TCGv_i64 rd = cpu_fpr[a->rd];
+TCGv addr = get_th_address_indexed(ctx, a->rs1, a->rs2, a->imm2, zext_offs);
+
+tcg_gen_qemu_st_i64(rd, addr, ctx->mem_idx, memop);
+
+return true;
+}
+
+static bool trans_th_flrd(DisasContext *ctx, arg_th_memidx *a)
+{
+REQUIRE_XTHEADFMEMIDX(ctx);
+REQUIRE_FPU;
+REQUIRE_EXT(ctx, RVD);
+return gen_fload_idx(ctx, a, MO_TEUQ, false);
+}
+
+static bool trans_th_flrw(DisasContext *ctx, arg_th_memidx *a)
+{
+REQUIRE_XTHEADFMEMIDX(ctx);
+REQUIRE_FPU;
+REQUIRE_EXT(ctx, RVF);
+return gen_fload_idx(ctx, a, MO_TEUL, false);
+}
+
+static bool trans_th_flurd(DisasContext *ctx, arg_th_memidx *a)
+{
+REQUIRE_XTHEADFMEMIDX(ctx);
+REQUIRE_FPU;
+REQUIRE_EXT(ctx, RVD);

[PATCH v3 02/14] RISC-V: Adding XTheadSync ISA extension

2023-01-24 Thread Christoph Muellner
From: Christoph Müllner 

This patch adds support for the XTheadSync ISA extension.
The patch uses the T-Head specific decoder and translation.

The implementation introduces a helper to execute synchronization tasks:
helper_tlb_flush_all() performs a synchronized TLB flush on all CPUs.

Co-developed-by: LIU Zhiwei 
Reviewed-by: Alistair Francis 
Signed-off-by: Christoph Müllner 
---
Changes in v2:
- Add ISA_EXT_DATA_ENTRY()
- Use helper to synchronize CPUs and perform TLB flushes
- Change implementation to follow latest spec update
- Use single decoder for XThead extensions

Changes in v3:
- Adjust for renamed REQUIRE_PRIV_* test macros

 target/riscv/cpu.c |  2 +
 target/riscv/cpu.h |  1 +
 target/riscv/helper.h  |  1 +
 target/riscv/insn_trans/trans_xthead.c.inc | 86 ++
 target/riscv/op_helper.c   |  6 ++
 target/riscv/translate.c   |  2 +-
 target/riscv/xthead.decode |  9 +++
 7 files changed, 106 insertions(+), 1 deletion(-)

diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index 43a3b9218f..ae2009e89c 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -110,6 +110,7 @@ static const struct isa_ext_data isa_edata_arr[] = {
 ISA_EXT_DATA_ENTRY(svnapot, true, PRIV_VERSION_1_12_0, ext_svnapot),
 ISA_EXT_DATA_ENTRY(svpbmt, true, PRIV_VERSION_1_12_0, ext_svpbmt),
 ISA_EXT_DATA_ENTRY(xtheadcmo, true, PRIV_VERSION_1_11_0, ext_xtheadcmo),
+ISA_EXT_DATA_ENTRY(xtheadsync, true, PRIV_VERSION_1_11_0, ext_xtheadsync),
 ISA_EXT_DATA_ENTRY(xventanacondops, true, PRIV_VERSION_1_12_0, ext_XVentanaCondOps),
 };
 
@@ -1073,6 +1074,7 @@ static Property riscv_cpu_extensions[] = {
 
 /* Vendor-specific custom extensions */
 DEFINE_PROP_BOOL("xtheadcmo", RISCVCPU, cfg.ext_xtheadcmo, false),
+DEFINE_PROP_BOOL("xtheadsync", RISCVCPU, cfg.ext_xtheadsync, false),
 DEFINE_PROP_BOOL("xventanacondops", RISCVCPU, cfg.ext_XVentanaCondOps, false),
 
 /* These are experimental so mark with 'x-' */
diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
index 680dd3dfbd..d0ab5c7bb0 100644
--- a/target/riscv/cpu.h
+++ b/target/riscv/cpu.h
@@ -474,6 +474,7 @@ struct RISCVCPUConfig {
 
 /* Vendor-specific custom extensions */
 bool ext_xtheadcmo;
+bool ext_xtheadsync;
 bool ext_XVentanaCondOps;
 
 uint8_t pmu_num;
diff --git a/target/riscv/helper.h b/target/riscv/helper.h
index 227c7122ef..d22656698a 100644
--- a/target/riscv/helper.h
+++ b/target/riscv/helper.h
@@ -109,6 +109,7 @@ DEF_HELPER_1(sret, tl, env)
 DEF_HELPER_1(mret, tl, env)
 DEF_HELPER_1(wfi, void, env)
 DEF_HELPER_1(tlb_flush, void, env)
+DEF_HELPER_1(tlb_flush_all, void, env)
 /* Native Debug */
 DEF_HELPER_1(itrigger_match, void, env)
 #endif
diff --git a/target/riscv/insn_trans/trans_xthead.c.inc b/target/riscv/insn_trans/trans_xthead.c.inc
index 24acaf188c..bf5b39c749 100644
--- a/target/riscv/insn_trans/trans_xthead.c.inc
+++ b/target/riscv/insn_trans/trans_xthead.c.inc
@@ -22,6 +22,12 @@
 }\
 } while (0)
 
+#define REQUIRE_XTHEADSYNC(ctx) do { \
+if (!ctx->cfg_ptr->ext_xtheadsync) { \
+return false;\
+}\
+} while (0)
+
 /* XTheadCmo */
 
 static inline int priv_level(DisasContext *ctx)
@@ -79,3 +85,83 @@ NOP_PRIVCHECK(th_icache_iva, REQUIRE_XTHEADCMO, REQUIRE_PRIV_MSU)
 NOP_PRIVCHECK(th_l2cache_call, REQUIRE_XTHEADCMO, REQUIRE_PRIV_MS)
 NOP_PRIVCHECK(th_l2cache_ciall, REQUIRE_XTHEADCMO, REQUIRE_PRIV_MS)
 NOP_PRIVCHECK(th_l2cache_iall, REQUIRE_XTHEADCMO, REQUIRE_PRIV_MS)
+
+/* XTheadSync */
+
+static bool trans_th_sfence_vmas(DisasContext *ctx, arg_th_sfence_vmas *a)
+{
+(void) a;
+REQUIRE_XTHEADSYNC(ctx);
+
+#ifndef CONFIG_USER_ONLY
+REQUIRE_PRIV_MS(ctx);
+decode_save_opc(ctx);
+gen_helper_tlb_flush_all(cpu_env);
+return true;
+#else
+return false;
+#endif
+}
+
+#ifndef CONFIG_USER_ONLY
+static void gen_th_sync_local(DisasContext *ctx)
+{
+/*
+ * Emulate out-of-order barriers with pipeline flush
+ * by exiting the translation block.
+ */
+gen_set_pc_imm(ctx, ctx->pc_succ_insn);
+tcg_gen_exit_tb(NULL, 0);
+ctx->base.is_jmp = DISAS_NORETURN;
+}
+#endif
+
+static bool trans_th_sync(DisasContext *ctx, arg_th_sync *a)
+{
+(void) a;
+REQUIRE_XTHEADSYNC(ctx);
+
+#ifndef CONFIG_USER_ONLY
+REQUIRE_PRIV_MSU(ctx);
+
+/*
+ * th.sync is an out-of-order barrier.
+ */
+gen_th_sync_local(ctx);
+
+return true;
+#else
+return false;
+#endif
+}
+
+static bool trans_th_sync_i(DisasContext *ctx, arg_th_sync_i *a)
+{
+(void) a;
+REQUIRE_XTHEADSYNC(ctx);
+
+#ifndef CONFIG_USER_ONLY
+REQUIRE_PRIV_MSU(ctx);
+
+/*
+ * th.sync.i is th.sync plus pipeline flush.
+ */
+gen_th_sync_local(ctx);
+
+return true;
+#else
+   

[PATCH v3 09/14] RISC-V: Adding T-Head MemIdx extension

2023-01-24 Thread Christoph Muellner
From: Christoph Müllner 

This patch adds support for the T-Head MemIdx instructions.
The patch uses the T-Head specific decoder and translation.
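
To illustrate the increment forms, here is a small model of the address
arithmetic as described by the comments on gen_load_inc/gen_store_inc in
the patch below: rs1 is advanced by (imm5 << imm2), and the access uses
either the old rs1 (post-increment) or the updated one (pre-increment).
The struct and function names are made up for this sketch, RV64 is
assumed, imm5 is taken as a signed 5-bit immediate (per the v2 changelog),
and imm2 is assumed to be a 2-bit field.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct th_inc_addrs {
    uint64_t access;     /* address used for the load or store */
    uint64_t rs1_after;  /* value written back to rs1 */
};

struct th_inc_addrs th_meminc_addrs(uint64_t rs1, int imm5, unsigned imm2,
                                    bool preinc)
{
    assert(imm5 >= -16 && imm5 <= 15);  /* signed 5-bit immediate */
    assert(imm2 <= 3);                  /* assumed 2-bit shift amount */

    /* Multiply instead of shifting so negative imm5 stays well defined. */
    int64_t inc = (int64_t)imm5 * ((int64_t)1 << imm2);
    uint64_t next = rs1 + (uint64_t)inc;

    struct th_inc_addrs r = {
        .access = preinc ? next : rs1,
        .rs1_after = next,
    };
    return r;
}
```

This models the documented behaviour from the comments only and does not
perform the memory access itself.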

Co-developed-by: LIU Zhiwei 
Reviewed-by: Alistair Francis 
Signed-off-by: Christoph Müllner 
---
Changes in v2:
- Add ISA_EXT_DATA_ENTRY()
- Use single decoder for XThead extensions
- Avoid signed-bitfield-extraction by using signed immediate field imm5
- Use get_address() to calculate addresses
- Introduce helper get_th_address_indexed for rs1+(rs2ext_xtheadmemidx) {   \
+return false;\
+}\
+} while (0)
+
 #define REQUIRE_XTHEADMEMPAIR(ctx) do {  \
 if (!ctx->cfg_ptr->ext_xtheadmempair) {  \
 return false;\
@@ -64,6 +70,30 @@
 }\
 } while (0)
 
+/*
+ * Calculate and return the address for indexed mem operations:
+ * If !zext_offs, then the address is rs1 + (rs2 << imm2).
+ * If  zext_offs, then the address is rs1 + (zext(rs2[31:0]) << imm2).
+ */
+static TCGv get_th_address_indexed(DisasContext *ctx, int rs1, int rs2,
+   int imm2, bool zext_offs)
+{
+TCGv src2 = get_gpr(ctx, rs2, EXT_NONE);
+TCGv offs = tcg_temp_new();
+
+if (zext_offs) {
+tcg_gen_extract_tl(offs, src2, 0, 32);
+tcg_gen_shli_tl(offs, offs, imm2);
+} else {
+tcg_gen_shli_tl(offs, src2, imm2);
+}
+
+TCGv addr = get_address_indexed(ctx, rs1, offs);
+
+tcg_temp_free(offs);
+return addr;
+}
+
 /* XTheadBa */
 
 /*
@@ -388,6 +418,353 @@ static bool trans_th_mulsw(DisasContext *ctx, 
arg_th_mulsw *a)
 return gen_th_mac(ctx, a, tcg_gen_sub_tl, NULL);
 }
 
+/* XTheadMemIdx */
+
+/*
+ * Load with memop from indexed address and add (imm5 << imm2) to rs1.
+ * If !preinc, then the load address is rs1.
+ * If  preinc, then the load address is rs1 + (imm5 << imm2).
+ */
+static bool gen_load_inc(DisasContext *ctx, arg_th_meminc *a, MemOp memop,
+ bool preinc)
+{
+TCGv rd = dest_gpr(ctx, a->rd);
+TCGv addr = get_address(ctx, a->rs1, preinc ? a->imm5 << a->imm2 : 0);
+
+tcg_gen_qemu_ld_tl(rd, addr, ctx->mem_idx, memop);
+addr = get_address(ctx, a->rs1, !preinc ? a->imm5 << a->imm2 : 0);
+gen_set_gpr(ctx, a->rd, rd);
+gen_set_gpr(ctx, a->rs1, addr);
+
+return true;
+}
+
+/*
+ * Store with memop to indexed address and add (imm5 << imm2) to rs1.
+ * If !preinc, then the store address is rs1.
+ * If  preinc, then the store address is rs1 + (imm5 << imm2).
+ */
+static bool gen_store_inc(DisasContext *ctx, arg_th_meminc *a, MemOp memop,
+  bool preinc)
+{
+TCGv data = get_gpr(ctx, a->rd, EXT_NONE);
+TCGv addr = get_address(ctx, a->rs1, preinc ? a->imm5 << a->imm2 : 0);
+
+tcg_gen_qemu_st_tl(data, addr, ctx->mem_idx, memop);
+addr = get_address(ctx, a->rs1, !preinc ? a->imm5 << a->imm2 : 0);
+gen_set_gpr(ctx, a->rs1, addr);
+
+return true;
+}
+
+static bool trans_th_ldia(DisasContext *ctx, arg_th_meminc *a)
+{
+REQUIRE_XTHEADMEMIDX(ctx);
+REQUIRE_64BIT(ctx);
+return gen_load_inc(ctx, a, MO_TESQ, false);
+}
+
+static bool trans_th_ldib(DisasContext *ctx, arg_th_meminc *a)
+{
+REQUIRE_XTHEADMEMIDX(ctx);
+REQUIRE_64BIT(ctx);
+return gen_load_inc(ctx, a, MO_TESQ, true);
+}
+
+static bool trans_th_lwia(DisasContext *ctx, arg_th_meminc *a)
+{
+REQUIRE_XTHEADMEMIDX(ctx);
+return gen_load_inc(ctx, a, MO_TESL, false);
+}
+
+static bool trans_th_lwib(DisasContext *ctx, arg_th_meminc *a)
+{
+REQUIRE_XTHEADMEMIDX(ctx);
+return gen_load_inc(ctx, a, MO_TESL, true);
+}
+
+static bool trans_th_lwuia(DisasContext *ctx, arg_th_meminc *a)
+{
+REQUIRE_XTHEADMEMIDX(ctx);
+REQUIRE_64BIT(ctx);
+return gen_load_inc(ctx, a, MO_TEUL, false);
+}
+
+static bool trans_th_lwuib(DisasContext *ctx, arg_th_meminc *a)
+{
+REQUIRE_XTHEADMEMIDX(ctx);
+REQUIRE_64BIT(ctx);
+return gen_load_inc(ctx, a, MO_TEUL, true);
+}
+
+static bool trans_th_lhia(DisasContext *ctx, arg_th_meminc *a)
+{
+REQUIRE_XTHEADMEMIDX(ctx);
+return gen_load_inc(ctx, a, MO_TESW, false);
+}
+
+static bool trans_th_lhib(DisasContext *ctx, arg_th_meminc *a)
+{
+REQUIRE_XTHEADMEMIDX(ctx);
+return gen_load_inc(ctx, a, MO_TESW, true);
+}
+
+static bool trans_th_lhuia(DisasContext *ctx, arg_th_meminc *a)
+{
+REQUIRE_XTHEADMEMIDX(ctx);
+return gen_load_inc(ctx, a, MO_TEUW, false);
+}
+
+static bool trans_th_lhuib(DisasContext *ctx, arg_th_meminc *a)
+{
+REQUIRE_XTHEADMEMIDX(ctx);
+return gen_load_inc(ctx, a, MO_TEUW, true);
+}
+
+static bool trans_th_lbia(DisasContext *ctx, arg_th_meminc *a)
+{
+REQUIRE_XTHEADMEMIDX(ctx);
+return gen_load_inc(ctx, a, MO_SB, false);
+}
+
+static bool trans_th_lbib(DisasContext *ctx, arg_th_meminc *a)
+{
+REQUIRE_XTHEADMEMIDX(ctx);
+return gen_load_inc(ctx, a, 
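
The address arithmetic documented in the comments of the MemIdx patch above
can be modelled outside of QEMU. The following is only a sketch in plain C,
not TCG code: the th_model_* names are hypothetical, a 64-bit register width
is assumed, and the increment helper follows the comment text ("add
(imm5 << imm2) to rs1" for both the pre- and post-increment forms).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Illustrative model only (hypothetical names, 64-bit registers assumed).
 * Address for indexed memory operations:
 *   !zext_offs: rs1 + (rs2 << imm2)
 *    zext_offs: rs1 + (zext(rs2[31:0]) << imm2)
 */
static uint64_t th_model_addr_indexed(uint64_t rs1, uint64_t rs2,
                                      unsigned imm2, bool zext_offs)
{
    uint64_t offs = zext_offs ? (rs2 & 0xffffffffu) : rs2;

    return rs1 + (offs << imm2);
}

/* Result of an increment-style access: where to access, and the new rs1. */
struct th_model_inc {
    uint64_t access_addr;
    uint64_t new_rs1;
};

/*
 * Load/store with increment, as described in the comments:
 *   !preinc: access at rs1
 *    preinc: access at rs1 + (imm5 << imm2)
 * In both cases rs1 becomes rs1 + (imm5 << imm2).
 * imm5 is a signed 5-bit immediate, imm2 is in the range 0..3.
 */
static struct th_model_inc th_model_inc_addr(uint64_t rs1, int imm5,
                                             unsigned imm2, bool preinc)
{
    struct th_model_inc r;
    /* Multiply rather than shift: shifting a negative int is undefined in C. */
    uint64_t step = (uint64_t)((int64_t)imm5 * (1 << imm2));

    r.new_rs1 = rs1 + step;
    r.access_addr = preinc ? r.new_rs1 : rs1;
    return r;
}
```

For example, th_model_inc_addr(0x1000, -2, 3, true) accesses 0xff0 and leaves
0xff0 in rs1, while the post-increment form with the same operands accesses
0x1000 and then updates rs1 to 0xff0.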

[PATCH v3 05/14] RISC-V: Adding XTheadBs ISA extension

2023-01-24 Thread Christoph Muellner
From: Christoph Müllner 

This patch adds support for the XTheadBs ISA extension.
The patch uses the T-Head specific decoder and translation.

Co-developed-by: Philipp Tomsich 
Co-developed-by: LIU Zhiwei 
Reviewed-by: Alistair Francis 
Signed-off-by: Christoph Müllner 
---
Changes in v2:
- Add ISA_EXT_DATA_ENTRY()
- Split XtheadB* extension into individual commits
- Use single decoder for XThead extensions

 target/riscv/cpu.c |  2 ++
 target/riscv/cpu.h |  1 +
 target/riscv/insn_trans/trans_xthead.c.inc | 15 +++
 target/riscv/translate.c   |  3 ++-
 target/riscv/xthead.decode |  3 +++
 5 files changed, 23 insertions(+), 1 deletion(-)

diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index b995470dd6..805fec4d76 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -111,6 +111,7 @@ static const struct isa_ext_data isa_edata_arr[] = {
 ISA_EXT_DATA_ENTRY(svpbmt, true, PRIV_VERSION_1_12_0, ext_svpbmt),
 ISA_EXT_DATA_ENTRY(xtheadba, true, PRIV_VERSION_1_11_0, ext_xtheadba),
 ISA_EXT_DATA_ENTRY(xtheadbb, true, PRIV_VERSION_1_11_0, ext_xtheadbb),
+ISA_EXT_DATA_ENTRY(xtheadbs, true, PRIV_VERSION_1_11_0, ext_xtheadbs),
 ISA_EXT_DATA_ENTRY(xtheadcmo, true, PRIV_VERSION_1_11_0, ext_xtheadcmo),
 ISA_EXT_DATA_ENTRY(xtheadsync, true, PRIV_VERSION_1_11_0, ext_xtheadsync),
 ISA_EXT_DATA_ENTRY(xventanacondops, true, PRIV_VERSION_1_12_0, 
ext_XVentanaCondOps),
@@ -1077,6 +1078,7 @@ static Property riscv_cpu_extensions[] = {
 /* Vendor-specific custom extensions */
 DEFINE_PROP_BOOL("xtheadba", RISCVCPU, cfg.ext_xtheadba, false),
 DEFINE_PROP_BOOL("xtheadbb", RISCVCPU, cfg.ext_xtheadbb, false),
+DEFINE_PROP_BOOL("xtheadbs", RISCVCPU, cfg.ext_xtheadbs, false),
 DEFINE_PROP_BOOL("xtheadcmo", RISCVCPU, cfg.ext_xtheadcmo, false),
 DEFINE_PROP_BOOL("xtheadsync", RISCVCPU, cfg.ext_xtheadsync, false),
 DEFINE_PROP_BOOL("xventanacondops", RISCVCPU, cfg.ext_XVentanaCondOps, 
false),
diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
index ff92705010..2f92211d9f 100644
--- a/target/riscv/cpu.h
+++ b/target/riscv/cpu.h
@@ -475,6 +475,7 @@ struct RISCVCPUConfig {
 /* Vendor-specific custom extensions */
 bool ext_xtheadba;
 bool ext_xtheadbb;
+bool ext_xtheadbs;
 bool ext_xtheadcmo;
 bool ext_xtheadsync;
 bool ext_XVentanaCondOps;
diff --git a/target/riscv/insn_trans/trans_xthead.c.inc 
b/target/riscv/insn_trans/trans_xthead.c.inc
index ea6cd6e305..339a54e3d6 100644
--- a/target/riscv/insn_trans/trans_xthead.c.inc
+++ b/target/riscv/insn_trans/trans_xthead.c.inc
@@ -28,6 +28,12 @@
 }\
 } while (0)
 
+#define REQUIRE_XTHEADBS(ctx) do {   \
+if (!ctx->cfg_ptr->ext_xtheadbs) {   \
+return false;\
+}\
+} while (0)
+
 #define REQUIRE_XTHEADCMO(ctx) do {  \
 if (!ctx->cfg_ptr->ext_xtheadcmo) {  \
 return false;\
@@ -191,6 +197,15 @@ static bool trans_th_tstnbz(DisasContext *ctx, 
arg_th_tstnbz *a)
 return gen_unary(ctx, a, EXT_ZERO, gen_th_tstnbz);
 }
 
+/* XTheadBs */
+
+/* th.tst is an alternate encoding for bexti (from Zbs) */
+static bool trans_th_tst(DisasContext *ctx, arg_th_tst *a)
+{
+REQUIRE_XTHEADBS(ctx);
+return gen_shift_imm_tl(ctx, a, EXT_NONE, gen_bext);
+}
+
 /* XTheadCmo */
 
 static inline int priv_level(DisasContext *ctx)
diff --git a/target/riscv/translate.c b/target/riscv/translate.c
index 3bae961be0..96bdf5fb73 100644
--- a/target/riscv/translate.c
+++ b/target/riscv/translate.c
@@ -131,7 +131,8 @@ static bool always_true_p(DisasContext *ctx  
__attribute__((__unused__)))
 static bool has_xthead_p(DisasContext *ctx  __attribute__((__unused__)))
 {
 return ctx->cfg_ptr->ext_xtheadba || ctx->cfg_ptr->ext_xtheadbb ||
-   ctx->cfg_ptr->ext_xtheadcmo || ctx->cfg_ptr->ext_xtheadsync;
+   ctx->cfg_ptr->ext_xtheadbs || ctx->cfg_ptr->ext_xtheadcmo ||
+   ctx->cfg_ptr->ext_xtheadsync;
 }
 
 #define MATERIALISE_EXT_PREDICATE(ext)  \
diff --git a/target/riscv/xthead.decode b/target/riscv/xthead.decode
index 8cd140891b..8494805611 100644
--- a/target/riscv/xthead.decode
+++ b/target/riscv/xthead.decode
@@ -58,6 +58,9 @@ th_rev   101 0 . 001 . 0001011 @r2
 th_revw  1001000 0 . 001 . 0001011 @r2
 th_tstnbz100 0 . 001 . 0001011 @r2
 
+# XTheadBs
+th_tst   100010 .. . 001 . 0001011 @sh6
+
 # XTheadCmo
 th_dcache_call   000 1 0 000 0 0001011
 th_dcache_ciall  000 00011 0 000 0 0001011
-- 
2.39.0
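
The patch above describes th.tst as an alternate encoding for bexti from Zbs,
i.e. a single-bit extract. As a rough standalone illustration of that
operation (not the QEMU implementation; th_model_tst is a hypothetical name
and 64-bit registers are assumed):

```c
#include <assert.h>
#include <stdint.h>

/*
 * Illustrative model only: return bit 'shamt' of rs1 as 0 or 1.
 * The shift amount is reduced modulo 64, matching a 6-bit immediate.
 */
static uint64_t th_model_tst(uint64_t rs1, unsigned shamt)
{
    return (rs1 >> (shamt & 63)) & 1;
}
```

With rs1 = 0x8 the model yields 1 for shamt = 3 and 0 for shamt = 2.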




[PATCH v3 08/14] RISC-V: Adding T-Head MemPair extension

2023-01-24 Thread Christoph Muellner
From: Christoph Müllner 

This patch adds support for the T-Head MemPair instructions.
The patch uses the T-Head specific decoder and translation.

Co-developed-by: LIU Zhiwei 
Reviewed-by: Alistair Francis 
Signed-off-by: Christoph Müllner 
---
Changes in v2:
- Add ISA_EXT_DATA_ENTRY()
- Use single decoder for XThead extensions
- Use get_address() to calculate addresses

 target/riscv/cpu.c |  2 +
 target/riscv/cpu.h |  1 +
 target/riscv/insn_trans/trans_xthead.c.inc | 88 ++
 target/riscv/translate.c   |  2 +-
 target/riscv/xthead.decode | 13 
 5 files changed, 105 insertions(+), 1 deletion(-)

diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index 2ce8eb6a6f..e3a10f782c 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -115,6 +115,7 @@ static const struct isa_ext_data isa_edata_arr[] = {
 ISA_EXT_DATA_ENTRY(xtheadcmo, true, PRIV_VERSION_1_11_0, ext_xtheadcmo),
 ISA_EXT_DATA_ENTRY(xtheadcondmov, true, PRIV_VERSION_1_11_0, 
ext_xtheadcondmov),
 ISA_EXT_DATA_ENTRY(xtheadmac, true, PRIV_VERSION_1_11_0, ext_xtheadmac),
+ISA_EXT_DATA_ENTRY(xtheadmempair, true, PRIV_VERSION_1_11_0, 
ext_xtheadmempair),
 ISA_EXT_DATA_ENTRY(xtheadsync, true, PRIV_VERSION_1_11_0, ext_xtheadsync),
 ISA_EXT_DATA_ENTRY(xventanacondops, true, PRIV_VERSION_1_12_0, 
ext_XVentanaCondOps),
 };
@@ -1084,6 +1085,7 @@ static Property riscv_cpu_extensions[] = {
 DEFINE_PROP_BOOL("xtheadcmo", RISCVCPU, cfg.ext_xtheadcmo, false),
 DEFINE_PROP_BOOL("xtheadcondmov", RISCVCPU, cfg.ext_xtheadcondmov, false),
 DEFINE_PROP_BOOL("xtheadmac", RISCVCPU, cfg.ext_xtheadmac, false),
+DEFINE_PROP_BOOL("xtheadmempair", RISCVCPU, cfg.ext_xtheadmempair, false),
 DEFINE_PROP_BOOL("xtheadsync", RISCVCPU, cfg.ext_xtheadsync, false),
 DEFINE_PROP_BOOL("xventanacondops", RISCVCPU, cfg.ext_XVentanaCondOps, 
false),
 
diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
index 55aea777a0..4f5f3b2c20 100644
--- a/target/riscv/cpu.h
+++ b/target/riscv/cpu.h
@@ -479,6 +479,7 @@ struct RISCVCPUConfig {
 bool ext_xtheadcmo;
 bool ext_xtheadcondmov;
 bool ext_xtheadmac;
+bool ext_xtheadmempair;
 bool ext_xtheadsync;
 bool ext_XVentanaCondOps;
 
diff --git a/target/riscv/insn_trans/trans_xthead.c.inc 
b/target/riscv/insn_trans/trans_xthead.c.inc
index 1c583ea8ec..7ab2a7a48e 100644
--- a/target/riscv/insn_trans/trans_xthead.c.inc
+++ b/target/riscv/insn_trans/trans_xthead.c.inc
@@ -52,6 +52,12 @@
 }\
 } while (0)
 
+#define REQUIRE_XTHEADMEMPAIR(ctx) do {  \
+if (!ctx->cfg_ptr->ext_xtheadmempair) {  \
+return false;\
+}\
+} while (0)
+
 #define REQUIRE_XTHEADSYNC(ctx) do { \
 if (!ctx->cfg_ptr->ext_xtheadsync) { \
 return false;\
@@ -382,6 +388,88 @@ static bool trans_th_mulsw(DisasContext *ctx, arg_th_mulsw 
*a)
 return gen_th_mac(ctx, a, tcg_gen_sub_tl, NULL);
 }
 
+/* XTheadMemPair */
+
+static bool gen_loadpair_tl(DisasContext *ctx, arg_th_pair *a, MemOp memop,
+int shamt)
+{
+TCGv rd1 = dest_gpr(ctx, a->rd1);
+TCGv rd2 = dest_gpr(ctx, a->rd2);
+TCGv addr1 = tcg_temp_new();
+TCGv addr2 = tcg_temp_new();
+
+addr1 = get_address(ctx, a->rs, a->sh2 << shamt);
+if ((memop & MO_SIZE) == MO_64) {
+addr2 = get_address(ctx, a->rs, 8 + (a->sh2 << shamt));
+} else {
+addr2 = get_address(ctx, a->rs, 4 + (a->sh2 << shamt));
+}
+
+tcg_gen_qemu_ld_tl(rd1, addr1, ctx->mem_idx, memop);
+tcg_gen_qemu_ld_tl(rd2, addr2, ctx->mem_idx, memop);
+gen_set_gpr(ctx, a->rd1, rd1);
+gen_set_gpr(ctx, a->rd2, rd2);
+
+tcg_temp_free(addr1);
+tcg_temp_free(addr2);
+return true;
+}
+
+static bool trans_th_ldd(DisasContext *ctx, arg_th_pair *a)
+{
+REQUIRE_XTHEADMEMPAIR(ctx);
+REQUIRE_64BIT(ctx);
+return gen_loadpair_tl(ctx, a, MO_TESQ, 4);
+}
+
+static bool trans_th_lwd(DisasContext *ctx, arg_th_pair *a)
+{
+REQUIRE_XTHEADMEMPAIR(ctx);
+return gen_loadpair_tl(ctx, a, MO_TESL, 3);
+}
+
+static bool trans_th_lwud(DisasContext *ctx, arg_th_pair *a)
+{
+REQUIRE_XTHEADMEMPAIR(ctx);
+return gen_loadpair_tl(ctx, a, MO_TEUL, 3);
+}
+
+static bool gen_storepair_tl(DisasContext *ctx, arg_th_pair *a, MemOp memop,
+ int shamt)
+{
+TCGv data1 = get_gpr(ctx, a->rd1, EXT_NONE);
+TCGv data2 = get_gpr(ctx, a->rd2, EXT_NONE);
+TCGv addr1 = tcg_temp_new();
+TCGv addr2 = tcg_temp_new();
+
+addr1 = get_address(ctx, a->rs, a->sh2 << shamt);
+if ((memop & MO_SIZE) == MO_64) {
+addr2 = get_address(ctx, a->rs, 8 + (a->sh2 << shamt));
+} else {
+addr2 = get_address(ctx, a->rs, 4 + (a->sh2 << shamt));
+}
+
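
The pair address computation in gen_loadpair_tl() and gen_storepair_tl()
above can be summarised as: the first access is at rs + (sh2 << shamt) and the
second access follows directly after it (8 bytes later for 64-bit accesses,
4 bytes later otherwise). A minimal sketch in plain C, with hypothetical
th_model_* names and the access size passed in explicitly:

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative model only: the two addresses touched by a pair access. */
struct th_model_pair {
    uint64_t addr1;
    uint64_t addr2;
};

/*
 * rs           base register value
 * sh2          2-bit immediate from the instruction
 * shamt        per-instruction shift (4 for th.ldd, 3 for th.lwd/th.lwud
 *              in the patch above)
 * access_bytes size of one element: 8 for 64-bit, 4 for 32-bit accesses
 */
static struct th_model_pair th_model_pair_addrs(uint64_t rs, unsigned sh2,
                                                unsigned shamt,
                                                unsigned access_bytes)
{
    struct th_model_pair p;

    p.addr1 = rs + ((uint64_t)sh2 << shamt);
    p.addr2 = p.addr1 + access_bytes;
    return p;
}
```

For instance, a th.ldd-style access with rs = 0x2000 and sh2 = 1 touches
0x2010 and 0x2018 in this model.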
+

[PATCH v3 00/14] Add support for the T-Head vendor extensions

2023-01-24 Thread Christoph Muellner
From: Christoph Müllner 

This series introduces support for the T-Head vendor extensions,
which are implemented e.g. in the XuanTie C906 and XuanTie C910:
* XTheadBa
* XTheadBb
* XTheadBs
* XTheadCmo
* XTheadCondMov
* XTheadFMemIdx
* XTheadFmv
* XTheadMac
* XTheadMemIdx
* XTheadMemPair
* XTheadSync

The xthead* extensions are documented here:
  https://github.com/T-head-Semi/thead-extension-spec/releases/latest

The "th." instruction prefix prevents future conflicts with standard
extensions and has been documented in the RISC-V toolchain conventions:
  https://github.com/riscv-non-isa/riscv-toolchain-conventions

Note that the T-Head vendor extensions do not contain all
vendor-specific functionality of the T-Head SoCs
(e.g. no vendor-specific CSRs are included).
Instead, the extensions cover coherent functionality
that is exposed to S and U mode.

To enable the extensions above, the following two methods are possible:
* add the extension to the arch string
  E.g. QEMU_CPU="any,xtheadcmo=true,xtheadsync=true"
* implicitly select the extensions via CPU selection
  E.g. QEMU_CPU="thead-c906"

Major changes in v2:
- Add ISA_EXT_DATA_ENTRY()s
- Use single decoder for XThead extensions
- Simplify a lot of translation functions
- Fix RV32 behaviour
- Added XTheadFmv
- Addressed all comments of v1

Major changes in v3:
- Drop XMAE patch
- Rename priv level test macros

Christoph Müllner (14):
  RISC-V: Adding XTheadCmo ISA extension
  RISC-V: Adding XTheadSync ISA extension
  RISC-V: Adding XTheadBa ISA extension
  RISC-V: Adding XTheadBb ISA extension
  RISC-V: Adding XTheadBs ISA extension
  RISC-V: Adding XTheadCondMov ISA extension
  RISC-V: Adding T-Head multiply-accumulate instructions
  RISC-V: Adding T-Head MemPair extension
  RISC-V: Adding T-Head MemIdx extension
  RISC-V: Adding T-Head FMemIdx extension
  RISC-V: Set minimum priv version for Zfh to 1.11
  RISC-V: Add initial support for T-Head C906
  RISC-V: Adding XTheadFmv ISA extension
  target/riscv: add a MAINTAINERS entry for XThead* extension support

 MAINTAINERS|8 +
 target/riscv/cpu.c |   54 +-
 target/riscv/cpu.h |   13 +
 target/riscv/cpu_vendorid.h|6 +
 target/riscv/helper.h  |1 +
 target/riscv/insn_trans/trans_xthead.c.inc | 1081 
 target/riscv/meson.build   |1 +
 target/riscv/op_helper.c   |6 +
 target/riscv/translate.c   |   31 +
 target/riscv/xthead.decode |  185 
 10 files changed, 1385 insertions(+), 1 deletion(-)
 create mode 100644 target/riscv/cpu_vendorid.h
 create mode 100644 target/riscv/insn_trans/trans_xthead.c.inc
 create mode 100644 target/riscv/xthead.decode

-- 
2.39.0




[PATCH v3 01/14] RISC-V: Adding XTheadCmo ISA extension

2023-01-24 Thread Christoph Muellner
From: Christoph Müllner 

This patch adds support for the XTheadCmo ISA extension.
To avoid interfering with standard extensions, decoder and translation
are in their own xthead* specific files.
Future patches should be able to easily add additional T-Head extensions.

The implementation does not have much functionality (besides accepting
the instructions and not qualifying them as illegal instructions if
the hart executes in the required privilege level for the instruction),
as QEMU does not model CPU caches and instructions are documented
to not raise any exceptions.

Co-developed-by: LIU Zhiwei 
Signed-off-by: Christoph Müllner 
---
Changes in v2:
- Add ISA_EXT_DATA_ENTRY()
- Explicit test for PRV_U
- Encapsulate access to env-priv in inline function
- Use single decoder for XThead extensions

Changes in v3:
- Applying mask TB_FLAGS_PRIV_MMU_MASK to use of ctx->mem_idx
- Removing code from test macro REQUIRE_PRIV_MSU()
- Removing PRV_H from test macro REQUIRE_PRIV_MS()
- Remove unrelated clean-up
- Reorder decoder includes

 target/riscv/cpu.c |  2 +
 target/riscv/cpu.h |  1 +
 target/riscv/insn_trans/trans_xthead.c.inc | 81 ++
 target/riscv/meson.build   |  1 +
 target/riscv/translate.c   |  8 +++
 target/riscv/xthead.decode | 38 ++
 6 files changed, 131 insertions(+)
 create mode 100644 target/riscv/insn_trans/trans_xthead.c.inc
 create mode 100644 target/riscv/xthead.decode

diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index cc75ca7667..43a3b9218f 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -109,6 +109,7 @@ static const struct isa_ext_data isa_edata_arr[] = {
 ISA_EXT_DATA_ENTRY(svinval, true, PRIV_VERSION_1_12_0, ext_svinval),
 ISA_EXT_DATA_ENTRY(svnapot, true, PRIV_VERSION_1_12_0, ext_svnapot),
 ISA_EXT_DATA_ENTRY(svpbmt, true, PRIV_VERSION_1_12_0, ext_svpbmt),
+ISA_EXT_DATA_ENTRY(xtheadcmo, true, PRIV_VERSION_1_11_0, ext_xtheadcmo),
 ISA_EXT_DATA_ENTRY(xventanacondops, true, PRIV_VERSION_1_12_0, 
ext_XVentanaCondOps),
 };
 
@@ -1071,6 +1072,7 @@ static Property riscv_cpu_extensions[] = {
 DEFINE_PROP_BOOL("zmmul", RISCVCPU, cfg.ext_zmmul, false),
 
 /* Vendor-specific custom extensions */
+DEFINE_PROP_BOOL("xtheadcmo", RISCVCPU, cfg.ext_xtheadcmo, false),
 DEFINE_PROP_BOOL("xventanacondops", RISCVCPU, cfg.ext_XVentanaCondOps, 
false),
 
 /* These are experimental so mark with 'x-' */
diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
index f5609b62a2..680dd3dfbd 100644
--- a/target/riscv/cpu.h
+++ b/target/riscv/cpu.h
@@ -473,6 +473,7 @@ struct RISCVCPUConfig {
 uint64_t mimpid;
 
 /* Vendor-specific custom extensions */
+bool ext_xtheadcmo;
 bool ext_XVentanaCondOps;
 
 uint8_t pmu_num;
diff --git a/target/riscv/insn_trans/trans_xthead.c.inc 
b/target/riscv/insn_trans/trans_xthead.c.inc
new file mode 100644
index 00..24acaf188c
--- /dev/null
+++ b/target/riscv/insn_trans/trans_xthead.c.inc
@@ -0,0 +1,81 @@
+/*
+ * RISC-V translation routines for the T-Head vendor extensions (xthead*).
+ *
+ * Copyright (c) 2022 VRULL GmbH.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2 or later, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program.  If not, see .
+ */
+
+#define REQUIRE_XTHEADCMO(ctx) do {  \
+if (!ctx->cfg_ptr->ext_xtheadcmo) {  \
+return false;\
+}\
+} while (0)
+
+/* XTheadCmo */
+
+static inline int priv_level(DisasContext *ctx)
+{
+#ifdef CONFIG_USER_ONLY
+return PRV_U;
+#else
+ /* Priv level is part of mem_idx. */
+return ctx->mem_idx & TB_FLAGS_PRIV_MMU_MASK;
+#endif
+}
+
+/* Test if priv level is M, S, or U (cannot fail). */
+#define REQUIRE_PRIV_MSU(ctx)
+
+/* Test if priv level is M or S. */
+#define REQUIRE_PRIV_MS(ctx)\
+do {\
+int priv = priv_level(ctx); \
+if (!(priv == PRV_M ||  \
+  priv == PRV_S)) { \
+return false;   \
+}   \
+} while (0)
+
+#define NOP_PRIVCHECK(insn, extcheck, privcheck)\
+static bool trans_ ## insn(DisasContext *ctx, 
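
The acceptance logic described in the commit message (the instructions do
nothing, but are only accepted if the extension is enabled and the hart runs
at a sufficient privilege level) can be sketched as a standalone C model.
This is not the QEMU code: the MODEL_* names are hypothetical, the privilege
encodings are the RISC-V ones (U=0, S=1, M=3), and the mask value of 3 is an
assumption standing in for TB_FLAGS_PRIV_MMU_MASK.

```c
#include <assert.h>
#include <stdbool.h>

/* RISC-V privilege level encodings. */
enum { MODEL_PRV_U = 0, MODEL_PRV_S = 1, MODEL_PRV_M = 3 };

/* Assumed mask: the low two bits of mem_idx carry the privilege level. */
#define MODEL_PRIV_MMU_MASK 3

/* Extract the privilege level, dropping any extra flag bits in mem_idx. */
static int model_priv_level(int mem_idx)
{
    return mem_idx & MODEL_PRIV_MMU_MASK;
}

/*
 * Returns true if a NOP-style cache instruction would be accepted,
 * false if it would be treated as an illegal instruction.
 * require_ms selects the "M or S only" check; otherwise M, S and U pass.
 */
static bool model_nop_accepted(bool ext_enabled, int mem_idx, bool require_ms)
{
    int priv = model_priv_level(mem_idx);

    if (!ext_enabled) {
        return false;
    }
    if (require_ms && !(priv == MODEL_PRV_M || priv == MODEL_PRV_S)) {
        return false;
    }
    return true;
}
```

The masking matters when mem_idx carries additional bits: in this model,
mem_idx = 3 | 4 is still recognised as M mode, whereas an unmasked comparison
against MODEL_PRV_M would fail.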

[PATCH v3 07/14] RISC-V: Adding T-Head multiply-accumulate instructions

2023-01-24 Thread Christoph Muellner
From: Christoph Müllner 

This patch adds support for the T-Head MAC instructions.
The patch uses the T-Head specific decoder and translation.

Co-developed-by: LIU Zhiwei 
Reviewed-by: Alistair Francis 
Signed-off-by: Christoph Müllner 
---
Changes in v2:
- Add ISA_EXT_DATA_ENTRY()
- Use single decoder for XThead extensions

 target/riscv/cpu.c |  2 +
 target/riscv/cpu.h |  1 +
 target/riscv/insn_trans/trans_xthead.c.inc | 83 ++
 target/riscv/translate.c   |  3 +-
 target/riscv/xthead.decode |  8 +++
 5 files changed, 96 insertions(+), 1 deletion(-)

diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index b3ede7223a..2ce8eb6a6f 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -114,6 +114,7 @@ static const struct isa_ext_data isa_edata_arr[] = {
 ISA_EXT_DATA_ENTRY(xtheadbs, true, PRIV_VERSION_1_11_0, ext_xtheadbs),
 ISA_EXT_DATA_ENTRY(xtheadcmo, true, PRIV_VERSION_1_11_0, ext_xtheadcmo),
 ISA_EXT_DATA_ENTRY(xtheadcondmov, true, PRIV_VERSION_1_11_0, 
ext_xtheadcondmov),
+ISA_EXT_DATA_ENTRY(xtheadmac, true, PRIV_VERSION_1_11_0, ext_xtheadmac),
 ISA_EXT_DATA_ENTRY(xtheadsync, true, PRIV_VERSION_1_11_0, ext_xtheadsync),
 ISA_EXT_DATA_ENTRY(xventanacondops, true, PRIV_VERSION_1_12_0, 
ext_XVentanaCondOps),
 };
@@ -1082,6 +1083,7 @@ static Property riscv_cpu_extensions[] = {
 DEFINE_PROP_BOOL("xtheadbs", RISCVCPU, cfg.ext_xtheadbs, false),
 DEFINE_PROP_BOOL("xtheadcmo", RISCVCPU, cfg.ext_xtheadcmo, false),
 DEFINE_PROP_BOOL("xtheadcondmov", RISCVCPU, cfg.ext_xtheadcondmov, false),
+DEFINE_PROP_BOOL("xtheadmac", RISCVCPU, cfg.ext_xtheadmac, false),
 DEFINE_PROP_BOOL("xtheadsync", RISCVCPU, cfg.ext_xtheadsync, false),
 DEFINE_PROP_BOOL("xventanacondops", RISCVCPU, cfg.ext_XVentanaCondOps, 
false),
 
diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
index 5286bd487c..55aea777a0 100644
--- a/target/riscv/cpu.h
+++ b/target/riscv/cpu.h
@@ -478,6 +478,7 @@ struct RISCVCPUConfig {
 bool ext_xtheadbs;
 bool ext_xtheadcmo;
 bool ext_xtheadcondmov;
+bool ext_xtheadmac;
 bool ext_xtheadsync;
 bool ext_XVentanaCondOps;
 
diff --git a/target/riscv/insn_trans/trans_xthead.c.inc 
b/target/riscv/insn_trans/trans_xthead.c.inc
index 894b95a741..1c583ea8ec 100644
--- a/target/riscv/insn_trans/trans_xthead.c.inc
+++ b/target/riscv/insn_trans/trans_xthead.c.inc
@@ -46,6 +46,12 @@
 }\
 } while (0)
 
+#define REQUIRE_XTHEADMAC(ctx) do {  \
+if (!ctx->cfg_ptr->ext_xtheadmac) {  \
+return false;\
+}\
+} while (0)
+
 #define REQUIRE_XTHEADSYNC(ctx) do { \
 if (!ctx->cfg_ptr->ext_xtheadsync) { \
 return false;\
@@ -299,6 +305,83 @@ static bool trans_th_mvnez(DisasContext *ctx, arg_th_mveqz 
*a)
 return gen_th_condmove(ctx, a, TCG_COND_NE);
 }
 
+/* XTheadMac */
+
+static bool gen_th_mac(DisasContext *ctx, arg_r *a,
+   void (*accumulate_func)(TCGv, TCGv, TCGv),
+   void (*extend_operand_func)(TCGv, TCGv))
+{
+TCGv dest = dest_gpr(ctx, a->rd);
+TCGv src0 = get_gpr(ctx, a->rd, EXT_NONE);
+TCGv src1 = get_gpr(ctx, a->rs1, EXT_NONE);
+TCGv src2 = get_gpr(ctx, a->rs2, EXT_NONE);
+TCGv tmp = tcg_temp_new();
+
+if (extend_operand_func) {
+TCGv tmp2 = tcg_temp_new();
+extend_operand_func(tmp, src1);
+extend_operand_func(tmp2, src2);
+tcg_gen_mul_tl(tmp, tmp, tmp2);
+tcg_temp_free(tmp2);
+} else {
+tcg_gen_mul_tl(tmp, src1, src2);
+}
+
+accumulate_func(dest, src0, tmp);
+gen_set_gpr(ctx, a->rd, dest);
+tcg_temp_free(tmp);
+
+return true;
+}
+
+/* th.mula: "rd = rd + rs1 * rs2" */
+static bool trans_th_mula(DisasContext *ctx, arg_th_mula *a)
+{
+REQUIRE_XTHEADMAC(ctx);
+return gen_th_mac(ctx, a, tcg_gen_add_tl, NULL);
+}
+
+/* th.mulah: "rd = sext.w(rd + sext.w(rs1[15:0]) * sext.w(rs2[15:0]))" */
+static bool trans_th_mulah(DisasContext *ctx, arg_th_mulah *a)
+{
+REQUIRE_XTHEADMAC(ctx);
+ctx->ol = MXL_RV32;
+return gen_th_mac(ctx, a, tcg_gen_add_tl, tcg_gen_ext16s_tl);
+}
+
+/* th.mulaw: "rd = sext.w(rd + rs1 * rs2)" */
+static bool trans_th_mulaw(DisasContext *ctx, arg_th_mulaw *a)
+{
+REQUIRE_XTHEADMAC(ctx);
+REQUIRE_64BIT(ctx);
+ctx->ol = MXL_RV32;
+return gen_th_mac(ctx, a, tcg_gen_add_tl, NULL);
+}
+
+/* th.muls: "rd = rd - rs1 * rs2" */
+static bool trans_th_muls(DisasContext *ctx, arg_th_muls *a)
+{
+REQUIRE_XTHEADMAC(ctx);
+return gen_th_mac(ctx, a, tcg_gen_sub_tl, NULL);
+}
+
+/* th.mulsh: "rd = sext.w(rd - sext.w(rs1[15:0]) * sext.w(rs2[15:0]))" */
+static bool trans_th_mulsh(DisasContext *ctx, arg_th_mulsh *a)
+{
+REQUIRE_XTHEADMAC(ctx);
+ctx->ol = 
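
The per-instruction comments in the MAC patch above give the arithmetic for
each operation. As a hedged illustration, the same formulas can be written
as a plain C model for a 64-bit target (the th_model_* and sext* helper names
are hypothetical, and this does not reflect how the TCG ops are generated):

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extend the low 32 bits of v to 64 bits, using unsigned arithmetic. */
static uint64_t sext32(uint64_t v)
{
    uint64_t x = v & 0xffffffffu;

    return (x ^ 0x80000000u) - 0x80000000u;
}

/* Sign-extend the low 16 bits of v to 64 bits, using unsigned arithmetic. */
static uint64_t sext16(uint64_t v)
{
    uint64_t x = v & 0xffffu;

    return (x ^ 0x8000u) - 0x8000u;
}

/* th.mula: rd = rd + rs1 * rs2 */
static uint64_t th_model_mula(uint64_t rd, uint64_t rs1, uint64_t rs2)
{
    return rd + rs1 * rs2;
}

/* th.muls: rd = rd - rs1 * rs2 */
static uint64_t th_model_muls(uint64_t rd, uint64_t rs1, uint64_t rs2)
{
    return rd - rs1 * rs2;
}

/* th.mulaw: rd = sext.w(rd + rs1 * rs2) */
static uint64_t th_model_mulaw(uint64_t rd, uint64_t rs1, uint64_t rs2)
{
    return sext32(rd + rs1 * rs2);
}

/* th.mulah: rd = sext.w(rd + sext.w(rs1[15:0]) * sext.w(rs2[15:0])) */
static uint64_t th_model_mulah(uint64_t rd, uint64_t rs1, uint64_t rs2)
{
    return sext32(rd + sext16(rs1) * sext16(rs2));
}
```

Unsigned 64-bit arithmetic is used throughout so that wrap-around is well
defined in C; the sign extension helpers then restore the signed view of the
low bits.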

Re: [PATCH v2 01/15] RISC-V: Adding XTheadCmo ISA extension

2023-01-24 Thread Christoph Müllner
On Tue, Jan 24, 2023 at 6:31 PM Christoph Müllner <
christoph.muell...@vrull.eu> wrote:

>
>
> On Mon, Jan 23, 2023 at 11:50 PM Alistair Francis 
> wrote:
>
>> On Sat, Dec 24, 2022 at 4:09 AM Christoph Muellner
>>  wrote:
>> >
>> > From: Christoph Müllner 
>> >
>> > This patch adds support for the XTheadCmo ISA extension.
>> > To avoid interfering with standard extensions, decoder and translation
>> > are in its own xthead* specific files.
>> > Future patches should be able to easily add additional T-Head extension.
>> >
>> > The implementation does not have much functionality (besides accepting
>> > the instructions and not qualifying them as illegal instructions if
>> > the hart executes in the required privilege level for the instruction),
>> > as QEMU does not model CPU caches and instructions are documented
>> > to not raise any exceptions.
>> >
>> > Changes in v2:
>> > - Add ISA_EXT_DATA_ENTRY()
>> > - Explicit test for PRV_U
>> > - Encapsule access to env-priv in inline function
>> > - Use single decoder for XThead extensions
>> >
>> > Co-developed-by: LIU Zhiwei 
>> > Signed-off-by: Christoph Müllner 
>> > ---
>> >  target/riscv/cpu.c |  2 +
>> >  target/riscv/cpu.h |  1 +
>> >  target/riscv/insn_trans/trans_xthead.c.inc | 89 ++
>> >  target/riscv/meson.build   |  1 +
>> >  target/riscv/translate.c   | 15 +++-
>> >  target/riscv/xthead.decode | 38 +
>> >  6 files changed, 143 insertions(+), 3 deletions(-)
>> >  create mode 100644 target/riscv/insn_trans/trans_xthead.c.inc
>> >  create mode 100644 target/riscv/xthead.decode
>> >
>> > diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
>> > index 6fe176e483..a90b82c5c5 100644
>> > --- a/target/riscv/cpu.c
>> > +++ b/target/riscv/cpu.c
>> > @@ -108,6 +108,7 @@ static const struct isa_ext_data isa_edata_arr[] = {
>> >  ISA_EXT_DATA_ENTRY(svinval, true, PRIV_VERSION_1_12_0,
>> ext_svinval),
>> >  ISA_EXT_DATA_ENTRY(svnapot, true, PRIV_VERSION_1_12_0,
>> ext_svnapot),
>> >  ISA_EXT_DATA_ENTRY(svpbmt, true, PRIV_VERSION_1_12_0, ext_svpbmt),
>> > +ISA_EXT_DATA_ENTRY(xtheadcmo, true, PRIV_VERSION_1_11_0,
>> ext_xtheadcmo),
>> >  ISA_EXT_DATA_ENTRY(xventanacondops, true, PRIV_VERSION_1_12_0,
>> ext_XVentanaCondOps),
>> >  };
>> >
>> > @@ -1060,6 +1061,7 @@ static Property riscv_cpu_extensions[] = {
>> >  DEFINE_PROP_BOOL("zmmul", RISCVCPU, cfg.ext_zmmul, false),
>> >
>> >  /* Vendor-specific custom extensions */
>> > +DEFINE_PROP_BOOL("xtheadcmo", RISCVCPU, cfg.ext_xtheadcmo, false),
>> >  DEFINE_PROP_BOOL("xventanacondops", RISCVCPU,
>> cfg.ext_XVentanaCondOps, false),
>> >
>> >  /* These are experimental so mark with 'x-' */
>> > diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
>> > index 443d15a47c..ad1c19f870 100644
>> > --- a/target/riscv/cpu.h
>> > +++ b/target/riscv/cpu.h
>> > @@ -465,6 +465,7 @@ struct RISCVCPUConfig {
>> >  uint64_t mimpid;
>> >
>> >  /* Vendor-specific custom extensions */
>> > +bool ext_xtheadcmo;
>> >  bool ext_XVentanaCondOps;
>> >
>> >  uint8_t pmu_num;
>> > diff --git a/target/riscv/insn_trans/trans_xthead.c.inc
>> b/target/riscv/insn_trans/trans_xthead.c.inc
>> > new file mode 100644
>> > index 00..00e75c7dca
>> > --- /dev/null
>> > +++ b/target/riscv/insn_trans/trans_xthead.c.inc
>> > @@ -0,0 +1,89 @@
>> > +/*
>> > + * RISC-V translation routines for the T-Head vendor extensions
>> (xthead*).
>> > + *
>> > + * Copyright (c) 2022 VRULL GmbH.
>> > + *
>> > + * This program is free software; you can redistribute it and/or
>> modify it
>> > + * under the terms and conditions of the GNU General Public License,
>> > + * version 2 or later, as published by the Free Software Foundation.
>> > + *
>> > + * This program is distributed in the hope it will be useful, but
>> WITHOUT
>> > + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
>> or
>> > + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
>> License for
>> > + * more details.
>> > + *
>> > + * You should have received a copy of the GNU General Public License
>> along with
>> > + * this program.  If not, see .
>> > + */
>> > +
>> > +#define REQUIRE_XTHEADCMO(ctx) do {  \
>> > +if (!ctx->cfg_ptr->ext_xtheadcmo) {  \
>> > +return false;\
>> > +}\
>> > +} while (0)
>> > +
>> > +/* XTheadCmo */
>> > +
>> > +static inline int priv_level(DisasContext *ctx)
>> > +{
>> > +#ifdef CONFIG_USER_ONLY
>> > +return PRV_U;
>> > +#else
>> > + /* Priv level equals mem_idx -- see cpu_mmu_index. */
>> > +return ctx->mem_idx;
>>
>> This should be ANDed with TB_FLAGS_PRIV_MMU_MASK as sometimes this can
>> include hypervisor priv access information
>>
>
> Ok.
>
>
>>
>> > +#endif
>> > +}
>> > +
>> > +#define 

Re: [PATCH v2 01/35] scripts/ci: update gitlab-runner playbook to use latest runner

2023-01-24 Thread Richard Henderson

On 1/24/23 08:00, Alex Bennée wrote:

We were using quite an old runner on our machines and running into
issues with stalling jobs. GitLab in the meantime now reliably provides
the latest packaged versions of the runner under a stable URL. This
update:

   - creates a per-arch subdir for builds
   - switches from binary tarballs to deb packages
   - re-uses the same binary for the secondary runner
   - updates distro check for second to 22.04

Note this script isn't fully idempotent as we end up accumulating
runners especially during testing. However we also want to be able to
run twice with different GitLab keys (e.g. project and personal) so I
think we just have to be mindful of that during testing.

Signed-off-by: Alex Bennée

---
v2
   - only register aarch32 runner, move service start post both registers
   - tested on s390x
---
  scripts/ci/setup/gitlab-runner.yml | 56 +++---
  scripts/ci/setup/vars.yml.template |  2 --
  2 files changed, 13 insertions(+), 45 deletions(-)


Acked-by: Richard Henderson 


r~



Re: [PATCH v2 34/35] cpu-exec: assert that plugin_mem_cbs is NULL after execution

2023-01-24 Thread Richard Henderson

On 1/24/23 08:01, Alex Bennée wrote:

From: Emilio Cota

Fixes: #1381

Signed-off-by: Emilio Cota
Message-Id:<20230108165107.62488-1-c...@braap.org>
[AJB: manually applied follow-up fix]
Signed-off-by: Alex Bennée
---
  include/qemu/plugin.h | 4 
  accel/tcg/cpu-exec.c  | 2 ++
  2 files changed, 6 insertions(+)


Reviewed-by: Richard Henderson 

r~



Re: [PATCH v2 33/35] tcg: exclude non-memory effecting helpers from instrumentation

2023-01-24 Thread Richard Henderson

On 1/24/23 08:01, Alex Bennée wrote:

From: Emilio Cota

There are actually a whole bunch of helpers that don't affect memory
that we shouldn't instrument. They are helpfully identified by the
TCG_CALL_NO_SIDE_EFFECTS flag which marks out lookup_tb_ptr as well as
a lot of the maths helpers. To avoid the string compare we introduce a
new flag for plugin internals so we skip that too.

Related: #1381
Signed-off-by: Emilio Cota
Message-Id:<20230108164731.61469-4-c...@braap.org>
[AJB: updated to skip all no SE plugins, add flag for plugin helper]
Signed-off-by: Alex Bennée

---
v2
   - use TCG_CALL_NO_SIDE_EFFECTS as suggested by rth
   - add flag for plugin specific helpers
---
  accel/tcg/plugin-helpers.h | 4 ++--
  include/tcg/tcg.h  | 2 ++
  tcg/tcg.c  | 6 --
  3 files changed, 8 insertions(+), 4 deletions(-)


Reviewed-by: Richard Henderson 


r~



Re: [PATCH] hw/pci-host/mv64361: Reuse pci_swizzle_map_irq_fn

2023-01-24 Thread Daniel Henrique Barboza




On 1/21/23 17:56, Bernhard Beschow wrote:



Am 6. Januar 2023 11:39:27 UTC schrieb Bernhard Beschow :

mv64361_pcihost_map_irq() is a reimplementation of
pci_swizzle_map_irq_fn(). Resolve this redundancy.

Signed-off-by: Bernhard Beschow 


Ping

Patch is reviewed. Who will queue it? Daniel?


Queued in gitlab.com/danielhb/qemu/tree/ppc-next. Thanks,


Daniel



Best regards,
Bernhard


---
Testing done:
* `qemu-system-ppc -machine pegasos2 \
   -rtc base=localtime \
   -device ati-vga,guest_hwcursor=true,romfile="" \
   -cdrom morphos-3.17.iso \
   -kernel morphos-3.17/boot.img`
---
hw/pci-host/mv64361.c | 7 +--
1 file changed, 1 insertion(+), 6 deletions(-)

diff --git a/hw/pci-host/mv64361.c b/hw/pci-host/mv64361.c
index cc9c4d6d3b..70db142ec3 100644
--- a/hw/pci-host/mv64361.c
+++ b/hw/pci-host/mv64361.c
@@ -72,11 +72,6 @@ struct MV64361PCIState {
 uint64_t remap[5];
};

-static int mv64361_pcihost_map_irq(PCIDevice *pci_dev, int n)
-{
-return (n + PCI_SLOT(pci_dev->devfn)) % PCI_NUM_PINS;
-}
-
static void mv64361_pcihost_set_irq(void *opaque, int n, int level)
{
 MV64361PCIState *s = opaque;
@@ -97,7 +92,7 @@ static void mv64361_pcihost_realize(DeviceState *dev, Error 
**errp)
 g_free(name);
 name = g_strdup_printf("pci.%d", s->index);
 h->bus = pci_register_root_bus(dev, name, mv64361_pcihost_set_irq,
-   mv64361_pcihost_map_irq, dev,
+   pci_swizzle_map_irq_fn, dev,
 &s->mem, &s->io, 0, 4, TYPE_PCI_BUS);
 g_free(name);
 pci_create_simple(h->bus, 0, TYPE_MV64361_PCI_BRIDGE);
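
For readers unfamiliar with the swizzle, the arithmetic shared by the removed
helper and pci_swizzle_map_irq_fn() can be sketched as below. This is a
standalone illustration rather than QEMU code: swizzle_map_irq() and
SKETCH_NUM_PINS are names made up here, and the value 4 is assumed to match
PCI_NUM_PINS.

```c
/* Assumed to mirror PCI_NUM_PINS (INTA..INTD); hypothetical name. */
#define SKETCH_NUM_PINS 4

/* Rotate the device's interrupt pin by its slot number. */
static int swizzle_map_irq(int slot, int pin)
{
    return (pin + slot) % SKETCH_NUM_PINS;
}
```

With this mapping a device in slot 1 raising pin 3 (INTD) lands on line 0,
the same result the removed mv64361_pcihost_map_irq() produced.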






Re: MinGW and libfdt (was: Re: MSYS2 and libfdt)

2023-01-24 Thread Marc-André Lureau
Hi

On Tue, Jan 24, 2023 at 7:08 PM Daniel P. Berrangé  wrote:
>
> On Tue, Jan 24, 2023 at 03:43:25PM +0100, Thomas Huth wrote:
> > On 23/01/2023 17.23, Daniel P. Berrangé wrote:
> > > On Fri, Jan 20, 2023 at 05:57:29PM +0400, Marc-André Lureau wrote:
> > ...
> > > > > > On Thu, Jan 19, 2023 at 12:31 PM Thomas Huth  
> > > > > > wrote:
> > > > > > >
> > > > > > >
> > > > > > > Hi all,
> > > > > > >
> > > > > > > in some spare minutes, I started playing with a patch to try to 
> > > > > > > remove the
> > > > > > > dtc submodule from the QEMU git repository - according to
> > > > > > > https://repology.org/project/dtc/versions our supported build 
> > > > > > > platforms
> > > > > > > should now all provide the minimum required version.
> > ...
> > > So in theory we can try to drop the submodule for dtc now
> >
> > The dtc package is also still missing in the MinGW cross compiler suite in
> > Fedora ... does anybody know what's the right way to request it there?
>
> Someone will need to write a specfile, and submit it for review. I can do
> the submission, or the review, but not both (can't mark your own homework)
>

It's already been in rawhide for a few months. We can probably merge
and update f37.
https://packages.fedoraproject.org/pkgs/dtc/dtc/


-- 
Marc-André Lureau



[PATCH 1/1] modules: load modules from /var/run/qemu/ directory firstly

2023-01-24 Thread Siddhi Katage
From: Siddhi Katage 

An old, still-running QEMU first tries to load modules carrying the new
build-id. This fails as expected, and QEMU then falls back to loading the
old modules that match its build-id from the /var/run/qemu/ directory.
Make the /var/run/qemu/ directory the first search path when loading modules.

Fixes: bd83c861c0 ("modules: load modules from versioned /var/run dir")
Signed-off-by: Siddhi Katage 
---
 util/module.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/util/module.c b/util/module.c
index 32e2631..b723d65 100644
--- a/util/module.c
+++ b/util/module.c
@@ -233,17 +233,17 @@ int module_load(const char *prefix, const char *name, 
Error **errp)
 g_hash_table_add(loaded_modules, module_name);
 
 search_dir = getenv("QEMU_MODULE_DIR");
-if (search_dir != NULL) {
-dirs[n_dirs++] = g_strdup_printf("%s", search_dir);
-}
-dirs[n_dirs++] = get_relocated_path(CONFIG_QEMU_MODDIR);
-
 #ifdef CONFIG_MODULE_UPGRADES
 version_dir = g_strcanon(g_strdup(QEMU_PKGVERSION),
  G_CSET_A_2_Z G_CSET_a_2_z G_CSET_DIGITS "+-.~",
  '_');
 dirs[n_dirs++] = g_strdup_printf("/var/run/qemu/%s", version_dir);
 #endif
+if (search_dir != NULL) {
+dirs[n_dirs++] = g_strdup_printf("%s", search_dir);
+}
+dirs[n_dirs++] = get_relocated_path(CONFIG_QEMU_MODDIR);
+
 assert(n_dirs <= ARRAY_SIZE(dirs));
 
 /* end of resources managed by the out: label */
-- 
1.8.3.1
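
The resulting search order can be sketched as follows. This is only an
illustration of the ordering, not the code in module_load():
build_search_dirs() and its parameters are hypothetical names, and the
strings are borrowed rather than duplicated.

```c
#include <stddef.h>

/*
 * Fill dirs[] in the order proposed by the patch and return the number
 * of entries written. All names here are hypothetical.
 */
static size_t build_search_dirs(const char *dirs[], size_t max,
                                const char *versioned_dir,
                                const char *env_dir,
                                const char *default_dir)
{
    size_t n = 0;

    if (versioned_dir != NULL && n < max) {
        dirs[n++] = versioned_dir;   /* /var/run/qemu/<version> comes first */
    }
    if (env_dir != NULL && n < max) {
        dirs[n++] = env_dir;         /* QEMU_MODULE_DIR, if set */
    }
    if (default_dir != NULL && n < max) {
        dirs[n++] = default_dir;     /* configured module directory */
    }
    return n;
}
```

Passing NULL for versioned_dir models a build without CONFIG_MODULE_UPGRADES,
in which case the order is the same as before the patch.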




[PATCH v2 15/35] tests/tcg: skip the vma-pthread test on CI

2023-01-24 Thread Alex Bennée
We are getting a lot of failures that are not related to changes so
this could be a flaky test.

Reviewed-by: Richard Henderson 
Signed-off-by: Alex Bennée 
---
 tests/tcg/multiarch/Makefile.target | 9 +
 1 file changed, 9 insertions(+)

diff --git a/tests/tcg/multiarch/Makefile.target 
b/tests/tcg/multiarch/Makefile.target
index e7213af492..ae8b3d7268 100644
--- a/tests/tcg/multiarch/Makefile.target
+++ b/tests/tcg/multiarch/Makefile.target
@@ -42,6 +42,15 @@ munmap-pthread: LDFLAGS+=-pthread
 vma-pthread: CFLAGS+=-pthread
 vma-pthread: LDFLAGS+=-pthread
 
+# The vma-pthread test seems very sensitive on gitlab and we currently
+# don't know if it's exposing a real bug or the test is flaky.
+ifneq ($(GITLAB_CI),)
+run-vma-pthread: vma-pthread
+   $(call skip-test, $<, "flaky on CI?")
+run-plugin-vma-pthread-with-%: vma-pthread
+   $(call skip-test, $<, "flaky on CI?")
+endif
+
 # We define the runner for test-mmap after the individual
 # architectures have defined their supported pages sizes. If no
 # additional page sizes are defined we only run the default test.
-- 
2.34.1




[PATCH v2 29/35] util/qht: use striped locks under TSAN

2023-01-24 Thread Alex Bennée
From: Emilio Cota 

Fixes this tsan crash, easy to reproduce with any large enough program:

$ tests/unit/test-qht
1..2
ThreadSanitizer: CHECK failed: sanitizer_deadlock_detector.h:67 
"((n_all_locks_)) < 
(((sizeof(all_locks_with_contexts_)/sizeof((all_locks_with_contexts_)[0]" 
(0x40, 0x40) (tid=1821568)
#0 __tsan::CheckUnwind() ../../../../src/libsanitizer/tsan/tsan_rtl.cpp:353 
(libtsan.so.2+0x90034)
#1 __sanitizer::CheckFailed(char const*, int, char const*, unsigned long 
long, unsigned long long) 
../../../../src/libsanitizer/sanitizer_common/sanitizer_termination.cpp:86 
(libtsan.so.2+0xca555)
#2 __sanitizer::DeadlockDetectorTLS<__sanitizer::TwoLevelBitVector<1ul, 
__sanitizer::BasicBitVector > >::addLock(unsigned long, unsigned 
long, unsigned int) 
../../../../src/libsanitizer/sanitizer_common/sanitizer_deadlock_detector.h:67 
(libtsan.so.2+0xb3616)
#3 __sanitizer::DeadlockDetectorTLS<__sanitizer::TwoLevelBitVector<1ul, 
__sanitizer::BasicBitVector > >::addLock(unsigned long, unsigned 
long, unsigned int) 
../../../../src/libsanitizer/sanitizer_common/sanitizer_deadlock_detector.h:59 
(libtsan.so.2+0xb3616)
#4 __sanitizer::DeadlockDetector<__sanitizer::TwoLevelBitVector<1ul, 
__sanitizer::BasicBitVector > 
>::onLockAfter(__sanitizer::DeadlockDetectorTLS<__sanitizer::TwoLevelBitVector<1ul,
 __sanitizer::BasicBitVector > >*, unsigned long, unsigned int) 
../../../../src/libsanitizer/sanitizer_common/sanitizer_deadlock_detector.h:216 
(libtsan.so.2+0xb3616)
#5 __sanitizer::DD::MutexAfterLock(__sanitizer::DDCallback*, 
__sanitizer::DDMutex*, bool, bool) 
../../../../src/libsanitizer/sanitizer_common/sanitizer_deadlock_detector1.cpp:169
 (libtsan.so.2+0xb3616)
#6 __tsan::MutexPostLock(__tsan::ThreadState*, unsigned long, unsigned 
long, unsigned int, int) 
../../../../src/libsanitizer/tsan/tsan_rtl_mutex.cpp:200 (libtsan.so.2+0xa3382)
#7 __tsan_mutex_post_lock 
../../../../src/libsanitizer/tsan/tsan_interface_ann.cpp:384 
(libtsan.so.2+0x76bc3)
#8 qemu_spin_lock /home/cota/src/qemu/include/qemu/thread.h:259 
(test-qht+0x44a97)
#9 qht_map_lock_buckets ../util/qht.c:253 (test-qht+0x44a97)
#10 do_qht_iter ../util/qht.c:809 (test-qht+0x45f33)
#11 qht_iter ../util/qht.c:821 (test-qht+0x45f33)
#12 iter_check ../tests/unit/test-qht.c:121 (test-qht+0xe473)
#13 qht_do_test ../tests/unit/test-qht.c:202 (test-qht+0xe473)
#14 qht_test ../tests/unit/test-qht.c:240 (test-qht+0xe7c1)
#15 test_default ../tests/unit/test-qht.c:246 (test-qht+0xe828)
#16   (libglib-2.0.so.0+0x7daed)
#17   (libglib-2.0.so.0+0x7d80a)
#18   (libglib-2.0.so.0+0x7d80a)
#19 g_test_run_suite  (libglib-2.0.so.0+0x7dfe9)
#20 g_test_run  (libglib-2.0.so.0+0x7e055)
#21 main ../tests/unit/test-qht.c:259 (test-qht+0xd2c6)
#22 __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58 
(libc.so.6+0x29d8f)
#23 __libc_start_main_impl ../csu/libc-start.c:392 (libc.so.6+0x29e3f)
#24 _start  (test-qht+0xdb44)

Signed-off-by: Emilio Cota 
Reviewed-by: Richard Henderson 
Message-Id: <2023051628.320011-5-c...@braap.org>
Signed-off-by: Alex Bennée 
---
 util/qht.c | 95 ++
 1 file changed, 81 insertions(+), 14 deletions(-)

diff --git a/util/qht.c b/util/qht.c
index 15866299e6..92c6b78759 100644
--- a/util/qht.c
+++ b/util/qht.c
@@ -151,6 +151,22 @@ struct qht_bucket {
 
 QEMU_BUILD_BUG_ON(sizeof(struct qht_bucket) > QHT_BUCKET_ALIGN);
 
+/*
+ * Under TSAN, we use striped locks instead of one lock per bucket chain.
+ * This avoids crashing under TSAN, since TSAN aborts the program if more than
+ * 64 locks are held (this is a hardcoded limit in TSAN).
+ * When resizing a QHT we grab all the buckets' locks, which can easily
+ * go over TSAN's limit. By using striped locks, we avoid this problem.
+ *
+ * Note: this number must be a power of two for easy index computation.
+ */
+#define QHT_TSAN_BUCKET_LOCKS_BITS 4
+#define QHT_TSAN_BUCKET_LOCKS (1 << QHT_TSAN_BUCKET_LOCKS_BITS)
+
+struct qht_tsan_lock {
+QemuSpin lock;
+} QEMU_ALIGNED(QHT_BUCKET_ALIGN);
+
 /**
  * struct qht_map - structure to track an array of buckets
  * @rcu: used by RCU. Keep it as the top field in the struct to help valgrind
@@ -160,6 +176,7 @@ QEMU_BUILD_BUG_ON(sizeof(struct qht_bucket) > 
QHT_BUCKET_ALIGN);
  * @n_added_buckets: number of added (i.e. "non-head") buckets
  * @n_added_buckets_threshold: threshold to trigger an upward resize once the
  * number of added buckets surpasses it.
+ * @tsan_bucket_locks: Array of striped locks to be used only under TSAN.
  *
  * Buckets are tracked in what we call a "map", i.e. this structure.
  */
@@ -169,6 +186,9 @@ struct qht_map {
 size_t n_buckets;
 size_t n_added_buckets;
 size_t n_added_buckets_threshold;
+#ifdef CONFIG_TSAN
+struct qht_tsan_lock tsan_bucket_locks[QHT_TSAN_BUCKET_LOCKS];
+#endif
 };
 
 /* trigger a resize when 
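
The index computation that the power-of-two note refers to can be sketched
as below. This is an assumption-laden illustration: the archived diff is cut
off before the lookup code, so stripe_for_bucket() and keying on the bucket
index are guesses at the general technique, not the patch's implementation.

```c
#include <stddef.h>

#define STRIPE_BITS  4                     /* mirrors QHT_TSAN_BUCKET_LOCKS_BITS */
#define STRIPE_COUNT (1u << STRIPE_BITS)   /* 16 locks in total */

/* Map a bucket index to its stripe; a mask works because STRIPE_COUNT is 2^k. */
static unsigned stripe_for_bucket(size_t bucket_index)
{
    return (unsigned)(bucket_index & (STRIPE_COUNT - 1));
}
```

Because every bucket maps to one of 16 locks, taking "all" locks during a
resize holds at most 16, well under the 64-lock limit mentioned above.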

[PATCH v2 16/35] tests/tcg: Use SIGKILL for timeout

2023-01-24 Thread Alex Bennée
From: Richard Henderson 

linux-user blocks all signals while attempting to handle guest
signals (e.g. ABRT), which means that the default TERM sent by timeout
has no effect -- KILL instead.

Signed-off-by: Richard Henderson 
Message-Id: <20230117035701.168514-2-richard.hender...@linaro.org>
[AJB: expanded commit message from cover letter]
Signed-off-by: Alex Bennée 
---
 tests/tcg/Makefile.target | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tests/tcg/Makefile.target b/tests/tcg/Makefile.target
index 14bc013181..a3b0aaf8af 100644
--- a/tests/tcg/Makefile.target
+++ b/tests/tcg/Makefile.target
@@ -54,10 +54,10 @@ cc-option = if $(call cc-test, $1); then \
 
 # $1 = test name, $2 = cmd, $3 = desc
 ifeq ($(filter %-softmmu, $(TARGET)),)
-run-test = $(call quiet-command, timeout --foreground $(TIMEOUT) $2 > $1.out, \
+run-test = $(call quiet-command, timeout -s KILL --foreground $(TIMEOUT) $2 > $1.out, \
TEST,$(or $3, $*, $<) on $(TARGET_NAME))
 else
-run-test = $(call quiet-command, timeout --foreground $(TIMEOUT) $2, \
+run-test = $(call quiet-command, timeout -s KILL --foreground $(TIMEOUT) $2, \
 TEST,$(or $3, $*, $<) on $(TARGET_NAME))
 endif
 
-- 
2.34.1




[PATCH v2 25/35] tests/tcg: add memory-sve test for aarch64

2023-01-24 Thread Alex Bennée
This will be helpful in debugging problems with tracking SVE memory
accesses via the TCG plugins system.

Reviewed-by: Richard Henderson 
Signed-off-by: Alex Bennée 
Cc: Robert Henry 
Cc: Aaron Lindsay 
---
 tests/tcg/aarch64/Makefile.softmmu-target | 7 +++
 tests/tcg/aarch64/system/boot.S   | 3 ++-
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/tests/tcg/aarch64/Makefile.softmmu-target 
b/tests/tcg/aarch64/Makefile.softmmu-target
index a1368905f5..df9747bae8 100644
--- a/tests/tcg/aarch64/Makefile.softmmu-target
+++ b/tests/tcg/aarch64/Makefile.softmmu-target
@@ -36,6 +36,13 @@ config-cc.mak: Makefile
 
 memory: CFLAGS+=-DCHECK_UNALIGNED=1
 
+memory-sve: memory.c $(LINK_SCRIPT) $(CRT_OBJS) $(MINILIB_OBJS)
+   $(CC) $(CFLAGS) $(EXTRA_CFLAGS) $< -o $@ $(LDFLAGS)
+
+memory-sve: CFLAGS+=-DCHECK_UNALIGNED=1 -march=armv8.1-a+sve -O3 -fno-tree-loop-distribute-patterns
+
+TESTS+=memory-sve
+
 # Running
 QEMU_BASE_MACHINE=-M virt -cpu max -display none
 QEMU_OPTS+=$(QEMU_BASE_MACHINE) -semihosting-config 
enable=on,target=native,chardev=output -kernel
diff --git a/tests/tcg/aarch64/system/boot.S b/tests/tcg/aarch64/system/boot.S
index e190b1efa6..f136363d2a 100644
--- a/tests/tcg/aarch64/system/boot.S
+++ b/tests/tcg/aarch64/system/boot.S
@@ -179,12 +179,13 @@ __start:
isb
 
/*
-* Enable FP registers. The standard C pre-amble will be
+* Enable FP/SVE registers. The standard C pre-amble will be
 * saving these and A-profile compilers will use AdvSIMD
 * registers unless we tell it not to.
*/
mrs x0, cpacr_el1
orr x0, x0, #(3 << 20)
+   orr x0, x0, #(3 << 16)
msr cpacr_el1, x0
 
/* Setup some stack space and enter the test code.
-- 
2.34.1
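
The two orr instructions in boot.S set two 2-bit fields of CPACR_EL1. A
small C sketch of the same arithmetic follows; the macro and function names
are made up for illustration, while the field positions (FPEN at bits 21:20,
ZEN at bits 17:16) are taken from the Arm architecture manual.

```c
#include <stdint.h>

/* Field positions in CPACR_EL1; the macro names are hypothetical. */
#define CPACR_FPEN_SHIFT 20   /* FP/AdvSIMD trap control, bits [21:20] */
#define CPACR_ZEN_SHIFT  16   /* SVE trap control, bits [17:16] */

/* Return cpacr with FP/AdvSIMD and SVE access untrapped at EL0 and EL1. */
static uint64_t cpacr_enable_fp_sve(uint64_t cpacr)
{
    cpacr |= UINT64_C(3) << CPACR_FPEN_SHIFT;
    cpacr |= UINT64_C(3) << CPACR_ZEN_SHIFT;
    return cpacr;
}
```

Starting from zero this yields 0x330000, i.e. the value of
(3 << 20) | (3 << 16) that the assembly ORs into x0.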




[PATCH v2 27/35] util/qht: add missing atomic_set(hashes[i])

2023-01-24 Thread Alex Bennée
From: Emilio Cota 

We forgot to add this one in "a890643958 util/qht: atomically set b->hashes".

Detected with tsan.

Reviewed-by: Richard Henderson 
Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Alex Bennée 
Signed-off-by: Emilio Cota 
Message-Id: <2023051628.320011-3-c...@braap.org>
Signed-off-by: Alex Bennée 
---
 util/qht.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/util/qht.c b/util/qht.c
index 065fc501f4..15866299e6 100644
--- a/util/qht.c
+++ b/util/qht.c
@@ -688,7 +688,7 @@ static inline void qht_bucket_remove_entry(struct 
qht_bucket *orig, int pos)
 int i;
 
 if (qht_entry_is_last(orig, pos)) {
-orig->hashes[pos] = 0;
+qatomic_set(&orig->hashes[pos], 0);
 qatomic_set(&orig->pointers[pos], NULL);
 return;
 }
-- 
2.34.1
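
As a rough illustration of what the change buys, here is the same pattern
written with C11 atomics. It is a sketch under assumptions: the struct,
SKETCH_ENTRIES and sketch_clear_entry() are hypothetical, and a relaxed
atomic_store_explicit() is used as a stand-in for qatomic_set().

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_ENTRIES 4   /* hypothetical; not QHT's real bucket size */

struct sketch_bucket {
    _Atomic uint32_t hashes[SKETCH_ENTRIES];
    void *_Atomic pointers[SKETCH_ENTRIES];
};

/*
 * Clear one slot using relaxed atomic stores, so a concurrent lockless
 * reader performing atomic loads does not race with a plain write.
 */
static void sketch_clear_entry(struct sketch_bucket *b, int pos)
{
    atomic_store_explicit(&b->hashes[pos], 0, memory_order_relaxed);
    atomic_store_explicit(&b->pointers[pos], NULL, memory_order_relaxed);
}
```

A plain assignment to hashes[pos] would store the same value, but it is the
kind of unannotated concurrent access that tsan reports.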




[PATCH v2 35/35] plugins: Iterate on cb_lists in qemu_plugin_user_exit

2023-01-24 Thread Alex Bennée
From: Richard Henderson 

Rather than iterate over all plugins for all events,
iterate over plugins that have registered a given event.

Signed-off-by: Richard Henderson 
Message-Id: <20230117035701.168514-4-richard.hender...@linaro.org>
---
 plugins/core.c | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/plugins/core.c b/plugins/core.c
index 728bacef95..e04ffa1ba4 100644
--- a/plugins/core.c
+++ b/plugins/core.c
@@ -514,9 +514,10 @@ void qemu_plugin_user_exit(void)
 /* un-register all callbacks except the final AT_EXIT one */
 for (ev = 0; ev < QEMU_PLUGIN_EV_MAX; ev++) {
 if (ev != QEMU_PLUGIN_EV_ATEXIT) {
-struct qemu_plugin_ctx *ctx;
-QTAILQ_FOREACH(ctx, &plugin.ctxs, entry) {
-plugin_unregister_cb__locked(ctx, ev);
+struct qemu_plugin_cb *cb, *next;
+
+QLIST_FOREACH_SAFE_RCU(cb, &plugin.cb_lists[ev], entry, next) {
+plugin_unregister_cb__locked(cb->ctx, ev);
 }
 }
 }
-- 
2.34.1




[PATCH v2 18/35] MAINTAINERS: Fix the entry for tests/tcg/nios2

2023-01-24 Thread Alex Bennée
From: Thomas Huth 

tests/tcg/nios2/Makefile.target has accidentally been added
to the Microblaze section. Move it into the correct nios2
section instead - and while we're at it, it should also cover
the whole folder, and not only the Makefile.

Fixes: 67f80eb4d0 ("tests/tcg: enable debian-nios2-cross for test building")
Signed-off-by: Thomas Huth 
Reviewed-by: Philippe Mathieu-Daudé 
Message-Id: <20230119130326.2030297-1-th...@redhat.com>
Signed-off-by: Alex Bennée 
---
 MAINTAINERS | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index c581c11a64..629ab5bbb1 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -240,7 +240,6 @@ F: target/microblaze/
 F: hw/microblaze/
 F: disas/microblaze.c
 F: tests/docker/dockerfiles/debian-microblaze-cross.d/build-toolchain.sh
-F: tests/tcg/nios2/Makefile.target
 
 MIPS TCG CPUs
 M: Philippe Mathieu-Daudé 
@@ -262,6 +261,7 @@ F: hw/nios2/
 F: disas/nios2.c
 F: configs/devices/nios2-softmmu/default.mak
 F: tests/docker/dockerfiles/debian-nios2-cross.d/build-toolchain.sh
+F: tests/tcg/nios2/
 
 OpenRISC TCG CPUs
 M: Stafford Horne 
-- 
2.34.1




[PATCH v2 31/35] plugins: fix optimization in plugin_gen_disable_mem_helpers

2023-01-24 Thread Alex Bennée
From: Emilio Cota 

We were mistakenly checking tcg_ctx->plugin_insn as a canary to know
whether the TB had emitted helpers that might have accessed memory.

The problem is that tcg_ctx->plugin_insn gets updated on every
instruction in the TB, which results in us wrongly performing the
optimization (i.e. not clearing cpu->plugin_mem_cbs) way too often,
since it's not rare that the last instruction in the TB doesn't
use helpers.

Fix it by tracking a per-TB canary.

While at it, expand documentation.

Related: #1381

Signed-off-by: Emilio Cota 
Message-Id: <20230108164731.61469-2-c...@braap.org>
Signed-off-by: Alex Bennée 
---
 include/qemu/plugin.h  |  7 +++
 accel/tcg/plugin-gen.c | 26 ++
 2 files changed, 25 insertions(+), 8 deletions(-)

diff --git a/include/qemu/plugin.h b/include/qemu/plugin.h
index a772e14193..e0ebedef84 100644
--- a/include/qemu/plugin.h
+++ b/include/qemu/plugin.h
@@ -118,7 +118,10 @@ struct qemu_plugin_insn {
 void *haddr;
 GArray *cbs[PLUGIN_N_CB_TYPES][PLUGIN_N_CB_SUBTYPES];
 bool calls_helpers;
+
+/* if set, the instruction calls helpers that might access guest memory */
 bool mem_helper;
+
 bool mem_only;
 };
 
@@ -158,6 +161,10 @@ struct qemu_plugin_tb {
 void *haddr1;
 void *haddr2;
 bool mem_only;
+
+/* if set, the TB calls helpers that might access guest memory */
+bool mem_helper;
+
 GArray *cbs[PLUGIN_N_CB_SUBTYPES];
 };
 
diff --git a/accel/tcg/plugin-gen.c b/accel/tcg/plugin-gen.c
index c7d6514840..17a686bd9e 100644
--- a/accel/tcg/plugin-gen.c
+++ b/accel/tcg/plugin-gen.c
@@ -579,7 +579,8 @@ static void inject_mem_helper(TCGOp *begin_op, GArray *arr)
  * is possible that the code we generate after the instruction is
  * dead, we also add checks before generating tb_exit etc.
  */
-static void inject_mem_enable_helper(struct qemu_plugin_insn *plugin_insn,
+static void inject_mem_enable_helper(struct qemu_plugin_tb *ptb,
+ struct qemu_plugin_insn *plugin_insn,
  TCGOp *begin_op)
 {
 GArray *cbs[2];
@@ -599,6 +600,7 @@ static void inject_mem_enable_helper(struct 
qemu_plugin_insn *plugin_insn,
 rm_ops(begin_op);
 return;
 }
+ptb->mem_helper = true;
 
 arr = g_array_sized_new(false, false,
 sizeof(struct qemu_plugin_dyn_cb), n_cbs);
@@ -626,15 +628,22 @@ void plugin_gen_disable_mem_helpers(void)
 {
 TCGv_ptr ptr;
 
-if (likely(tcg_ctx->plugin_insn == NULL ||
-   !tcg_ctx->plugin_insn->mem_helper)) {
+/*
+ * We could emit the clearing unconditionally and be done. However, this can
+ * be wasteful if for instance plugins don't track memory accesses, or if
+ * most TBs don't use helpers. Instead, emit the clearing iff the TB calls
+ * helpers that might access guest memory.
+ *
+ * Note: we do not reset plugin_tb->mem_helper here; a TB might have several
+ * exit points, and we want to emit the clearing from all of them.
+ */
+if (!tcg_ctx->plugin_tb->mem_helper) {
 return;
 }
 ptr = tcg_const_ptr(NULL);
 tcg_gen_st_ptr(ptr, cpu_env, offsetof(CPUState, plugin_mem_cbs) -
  offsetof(ArchCPU, env));
 tcg_temp_free_ptr(ptr);
-tcg_ctx->plugin_insn->mem_helper = false;
 }
 
 static void plugin_gen_tb_udata(const struct qemu_plugin_tb *ptb,
@@ -682,14 +691,14 @@ static void plugin_gen_mem_inline(const struct 
qemu_plugin_tb *ptb,
 inject_inline_cb(cbs, begin_op, op_rw);
 }
 
-static void plugin_gen_enable_mem_helper(const struct qemu_plugin_tb *ptb,
+static void plugin_gen_enable_mem_helper(struct qemu_plugin_tb *ptb,
  TCGOp *begin_op, int insn_idx)
 {
 struct qemu_plugin_insn *insn = g_ptr_array_index(ptb->insns, insn_idx);
-inject_mem_enable_helper(insn, begin_op);
+inject_mem_enable_helper(ptb, insn, begin_op);
 }
 
-static void plugin_gen_disable_mem_helper(const struct qemu_plugin_tb *ptb,
+static void plugin_gen_disable_mem_helper(struct qemu_plugin_tb *ptb,
   TCGOp *begin_op, int insn_idx)
 {
 struct qemu_plugin_insn *insn = g_ptr_array_index(ptb->insns, insn_idx);
@@ -750,7 +759,7 @@ static void pr_ops(void)
 #endif
 }
 
-static void plugin_gen_inject(const struct qemu_plugin_tb *plugin_tb)
+static void plugin_gen_inject(struct qemu_plugin_tb *plugin_tb)
 {
 TCGOp *op;
 int insn_idx = -1;
@@ -870,6 +879,7 @@ bool plugin_gen_tb_start(CPUState *cpu, const 
DisasContextBase *db,
 ptb->haddr1 = db->host_addr[0];
 ptb->haddr2 = NULL;
 ptb->mem_only = mem_only;
+ptb->mem_helper = false;
 
 plugin_gen_empty_callback(PLUGIN_GEN_FROM_TB);
 }
-- 
2.34.1
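
The difference between the two canaries can be shown with a toy model. The
sketch below is not QEMU code: the structs and both functions are
hypothetical, and it only models "which flag is consulted", not how TCG ops
are emitted.

```c
#include <stdbool.h>
#include <stddef.h>

struct sketch_insn {
    bool mem_helper;   /* this instruction calls a helper that may touch memory */
};

struct sketch_tb {
    const struct sketch_insn *insns;
    size_t n_insns;
    bool mem_helper;   /* per-TB canary, as added by the patch */
};

/* Old check: consult only the most recently translated instruction. */
static bool needs_clear_old(const struct sketch_tb *tb)
{
    return tb->n_insns > 0 && tb->insns[tb->n_insns - 1].mem_helper;
}

/* New check: the TB-level flag is set if any instruction used a helper. */
static bool needs_clear_new(struct sketch_tb *tb)
{
    tb->mem_helper = false;
    for (size_t i = 0; i < tb->n_insns; i++) {
        if (tb->insns[i].mem_helper) {
            tb->mem_helper = true;
        }
    }
    return tb->mem_helper;
}
```

For a TB whose first instruction uses a memory helper and whose last one
does not, the old check skips the clearing while the new one emits it.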




[PATCH v2 34/35] cpu-exec: assert that plugin_mem_cbs is NULL after execution

2023-01-24 Thread Alex Bennée
From: Emilio Cota 

Fixes: #1381

Signed-off-by: Emilio Cota 
Message-Id: <20230108165107.62488-1-c...@braap.org>
[AJB: manually applied follow-up fix]
Signed-off-by: Alex Bennée 
---
 include/qemu/plugin.h | 4 
 accel/tcg/cpu-exec.c  | 2 ++
 2 files changed, 6 insertions(+)

diff --git a/include/qemu/plugin.h b/include/qemu/plugin.h
index e0ebedef84..fb338ba576 100644
--- a/include/qemu/plugin.h
+++ b/include/qemu/plugin.h
@@ -59,6 +59,8 @@ get_plugin_meminfo_rw(qemu_plugin_meminfo_t i)
 #ifdef CONFIG_PLUGIN
 extern QemuOptsList qemu_plugin_opts;
 
+#define QEMU_PLUGIN_ASSERT(cond) g_assert(cond)
+
 static inline void qemu_plugin_add_opts(void)
 {
 qemu_add_opts(&qemu_plugin_opts);
@@ -250,6 +252,8 @@ void qemu_plugin_user_postfork(bool is_child);
 
 #else /* !CONFIG_PLUGIN */
 
+#define QEMU_PLUGIN_ASSERT(cond)
+
 static inline void qemu_plugin_add_opts(void)
 { }
 
diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
index 25ec73ef9a..9c857eeb07 100644
--- a/accel/tcg/cpu-exec.c
+++ b/accel/tcg/cpu-exec.c
@@ -504,6 +504,7 @@ static void cpu_exec_exit(CPUState *cpu)
 if (cc->tcg_ops->cpu_exec_exit) {
 cc->tcg_ops->cpu_exec_exit(cpu);
 }
+QEMU_PLUGIN_ASSERT(cpu->plugin_mem_cbs == NULL);
 }
 
 void cpu_exec_step_atomic(CPUState *cpu)
@@ -980,6 +981,7 @@ cpu_exec_loop(CPUState *cpu, SyncClocks *sc)
 
 cpu_loop_exec_tb(cpu, tb, pc, &last_tb, &tb_exit);
 
+QEMU_PLUGIN_ASSERT(cpu->plugin_mem_cbs == NULL);
 /* Try to align the host and virtual clocks
if the guest is in advance */
 align_clocks(sc, cpu);
-- 
2.34.1
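
The macro pair follows the usual "assert that compiles away" pattern. A
minimal standalone sketch, assuming a hypothetical SKETCH_PLUGIN switch in
place of CONFIG_PLUGIN and the standard assert() in place of g_assert():

```c
#include <assert.h>
#include <stddef.h>

/* SKETCH_PLUGIN stands in for CONFIG_PLUGIN; define it to enable the check. */
#ifdef SKETCH_PLUGIN
#define SKETCH_PLUGIN_ASSERT(cond) assert(cond)
#else
#define SKETCH_PLUGIN_ASSERT(cond) ((void)0)
#endif

struct sketch_cpu {
    void *plugin_mem_cbs;
};

/* Returns 1 when the invariant holds; aborts first if the check is enabled. */
static int sketch_exec_exit(struct sketch_cpu *cpu)
{
    SKETCH_PLUGIN_ASSERT(cpu->plugin_mem_cbs == NULL);
    return cpu->plugin_mem_cbs == NULL;
}
```

Builds without the switch pay nothing for the check, which is why it can sit
on the hot path after every executed TB.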




[PATCH v2 23/35] semihosting: Write back semihosting data before completion callback

2023-01-24 Thread Alex Bennée
From: Keith Packard 

'lock_user' allocates a host buffer to shadow a target buffer,
'unlock_user' copies that host buffer back to the target and frees the
host memory. If the completion function uses the target buffer, it
must be called after unlock_user to ensure the data are present.

This caused the arm-compatible TARGET_SYS_READC to fail as the
completion function, common_semi_readc_cb, pulled data from the target
buffer, which would not yet have received the console data.

I decided to fix all instances of this pattern instead of just the
console_read function to make things consistent and potentially fix
bugs in other cases.

Signed-off-by: Keith Packard 
Reviewed-by: Richard Henderson 
Message-Id: <20221012014822.1242170-1-kei...@keithp.com>
Signed-off-by: Alex Bennée 
---
 semihosting/syscalls.c | 20 ++--
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/semihosting/syscalls.c b/semihosting/syscalls.c
index 5893c760c5..ba28194b59 100644
--- a/semihosting/syscalls.c
+++ b/semihosting/syscalls.c
@@ -319,11 +319,11 @@ static void host_read(CPUState *cs, 
gdb_syscall_complete_cb complete,
 }
 ret = RETRY_ON_EINTR(read(gf->hostfd, ptr, len));
 if (ret == -1) {
-complete(cs, -1, errno);
 unlock_user(ptr, buf, 0);
+complete(cs, -1, errno);
 } else {
-complete(cs, ret, 0);
 unlock_user(ptr, buf, ret);
+complete(cs, ret, 0);
 }
 }
 
@@ -339,8 +339,8 @@ static void host_write(CPUState *cs, 
gdb_syscall_complete_cb complete,
 return;
 }
 ret = write(gf->hostfd, ptr, len);
-complete(cs, ret, ret == -1 ? errno : 0);
 unlock_user(ptr, buf, 0);
+complete(cs, ret, ret == -1 ? errno : 0);
 }
 
 static void host_lseek(CPUState *cs, gdb_syscall_complete_cb complete,
@@ -426,8 +426,8 @@ static void host_stat(CPUState *cs, gdb_syscall_complete_cb 
complete,
 ret = -1;
 }
 }
-complete(cs, ret, err);
 unlock_user(name, fname, 0);
+complete(cs, ret, err);
 }
 
 static void host_remove(CPUState *cs, gdb_syscall_complete_cb complete,
@@ -444,8 +444,8 @@ static void host_remove(CPUState *cs, 
gdb_syscall_complete_cb complete,
 }
 
 ret = remove(p);
-complete(cs, ret, ret ? errno : 0);
 unlock_user(p, fname, 0);
+complete(cs, ret, ret ? errno : 0);
 }
 
 static void host_rename(CPUState *cs, gdb_syscall_complete_cb complete,
@@ -469,9 +469,9 @@ static void host_rename(CPUState *cs, 
gdb_syscall_complete_cb complete,
 }
 
 ret = rename(ostr, nstr);
-complete(cs, ret, ret ? errno : 0);
 unlock_user(ostr, oname, 0);
 unlock_user(nstr, nname, 0);
+complete(cs, ret, ret ? errno : 0);
 }
 
 static void host_system(CPUState *cs, gdb_syscall_complete_cb complete,
@@ -488,8 +488,8 @@ static void host_system(CPUState *cs, 
gdb_syscall_complete_cb complete,
 }
 
 ret = system(p);
-complete(cs, ret, ret == -1 ? errno : 0);
 unlock_user(p, cmd, 0);
+complete(cs, ret, ret == -1 ? errno : 0);
 }
 
 static void host_gettimeofday(CPUState *cs, gdb_syscall_complete_cb complete,
@@ -554,8 +554,8 @@ static void staticfile_read(CPUState *cs, 
gdb_syscall_complete_cb complete,
 }
 memcpy(ptr, gf->staticfile.data + gf->staticfile.off, len);
 gf->staticfile.off += len;
-complete(cs, len, 0);
 unlock_user(ptr, buf, len);
+complete(cs, len, 0);
 }
 
 static void staticfile_lseek(CPUState *cs, gdb_syscall_complete_cb complete,
@@ -608,8 +608,8 @@ static void console_read(CPUState *cs, 
gdb_syscall_complete_cb complete,
 return;
 }
 ret = qemu_semihosting_console_read(cs, ptr, len);
-complete(cs, ret, 0);
 unlock_user(ptr, buf, ret);
+complete(cs, ret, 0);
 }
 
 static void console_write(CPUState *cs, gdb_syscall_complete_cb complete,
@@ -624,8 +624,8 @@ static void console_write(CPUState *cs, 
gdb_syscall_complete_cb complete,
 return;
 }
 ret = qemu_semihosting_console_write(ptr, len);
-complete(cs, ret ? ret : -1, ret ? 0 : EIO);
 unlock_user(ptr, buf, 0);
+complete(cs, ret ? ret : -1, ret ? 0 : EIO);
 }
 
 static void console_fstat(CPUState *cs, gdb_syscall_complete_cb complete,
-- 
2.34.1
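
The ordering problem can be reproduced with a toy model of the shadow
buffer. Everything below is hypothetical scaffolding (two static arrays
stand in for guest memory and the lock_user() buffer); it only demonstrates
why the write-back has to happen before the completion callback runs.

```c
#include <stddef.h>
#include <string.h>

static char target_buf[8];      /* stands in for guest memory */
static char host_shadow[8];     /* stands in for the lock_user() buffer */
static char seen_by_callback;   /* what the completion callback observed */

static char *sketch_lock_user(void)
{
    return host_shadow;
}

/* Copy len bytes of the shadow back to the "guest", like unlock_user(). */
static void sketch_unlock_user(size_t len)
{
    memcpy(target_buf, host_shadow, len);
}

/* Completion callback that reads the guest buffer, like common_semi_readc_cb. */
static void sketch_complete(void)
{
    seen_by_callback = target_buf[0];
}

/* Read one byte, either writing back before or after the callback. */
static char sketch_read(int unlock_first)
{
    char *p;

    memset(target_buf, 0, sizeof(target_buf));
    p = sketch_lock_user();
    p[0] = 'A';                 /* data arriving from the console */
    if (unlock_first) {
        sketch_unlock_user(1);
        sketch_complete();
    } else {
        sketch_complete();
        sketch_unlock_user(1);
    }
    return seen_by_callback;
}
```

With the old order, sketch_read(0), the callback sees a zero byte; with the
order this patch establishes, sketch_read(1), it sees the data that was read.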




[PATCH v2 20/35] docs: add a new section to outline emulation support

2023-01-24 Thread Alex Bennée
This affects both system and user mode emulation so we should probably
list it up front.

Acked-by: Richard Henderson 
Signed-off-by: Alex Bennée 

---
v2
  - HPs -> HP's
  - MIPs-like -> MIPS-like
---
 docs/about/emulation.rst  | 103 ++
 docs/about/index.rst  |   1 +
 docs/devel/tcg-plugins.rst|   2 +
 docs/system/arm/emulation.rst |   2 +
 4 files changed, 108 insertions(+)
 create mode 100644 docs/about/emulation.rst

diff --git a/docs/about/emulation.rst b/docs/about/emulation.rst
new file mode 100644
index 00..bdc0630b35
--- /dev/null
+++ b/docs/about/emulation.rst
@@ -0,0 +1,103 @@
+Emulation
+=========
+
+QEMU's Tiny Code Generator (TCG) gives it the ability to emulate a
+number of CPU architectures on any supported platform. Both
+:ref:`System Emulation` and :ref:`User Mode Emulation` are supported
+depending on the guest architecture.
+
+.. list-table:: Supported Guest Architectures for Emulation
+  :widths: 30 10 10 50
+  :header-rows: 1
+
+  * - Architecture (qemu name)
+- System
+- User-mode
+- Notes
+  * - Alpha
+- Yes
+- Yes
+- Legacy 64 bit RISC ISA developed by DEC
+  * - Arm (arm, aarch64)
+- Yes
+- Yes
+- Wide range of features, see :ref:`Arm Emulation` for details
+  * - AVR
+- Yes
+- No
+- 8 bit micro controller, often used in maker projects
+  * - Cris
+- Yes
+- Yes
+- Embedded RISC chip developed by AXIS
+  * - Hexagon
+- No
+- Yes
+- Family of DSPs by Qualcomm
+  * - PA-RISC (hppa)
+- Yes
+- Yes
+- A legacy RISC system used in HP's old minicomputers
+  * - x86 (i386, x86_64)
+- Yes
+- Yes
+- The ubiquitous desktop PC CPU architecture, 32 and 64 bit.
+  * - Loongarch
+- Yes
+- Yes
+- A MIPS-like 64bit RISC architecture developed in China
+  * - m68k
+- Yes
+- Yes
+- Motorola 68000 variants and ColdFire
+  * - Microblaze
+- Yes
+- Yes
+- RISC based soft-core by Xilinx
+  * - MIPS (mips, mipsel, mips64, mips64el)
+- Yes
+- Yes
+- Venerable RISC architecture originally out of Stanford University
+  * - Nios2
+- Yes
+- Yes
+- 32 bit embedded soft-core by Altera
+  * - OpenRISC
+- Yes
+- Yes
+- Open source RISC architecture developed by the OpenRISC community
+  * - Power (ppc, ppc64)
+- Yes
+- Yes
+- A general purpose RISC architecture now managed by IBM
+  * - RISC-V
+- Yes
+- Yes
+- An open standard RISC ISA maintained by RISC-V International
+  * - RX
+- Yes
+- No
+- A 32 bit micro controller developed by Renesas
+  * - s390x
+- Yes
+- Yes
+- A 64 bit CPU found in IBM's System Z mainframes
+  * - sh4
+- Yes
+- Yes
+- A 32 bit RISC embedded CPU developed by Hitachi
+  * - SPARC (sparc, sparc64)
+- Yes
+- Yes
+- A RISC ISA originally developed by Sun Microsystems
+  * - Tricore
+- Yes
+- No
+- A 32 bit RISC/uController/DSP developed by Infineon
+  * - Xtensa
+- Yes
+- Yes
+- A configurable 32 bit soft core now owned by Cadence
+
+A number of features are only available when running under
+emulation including :ref:`Record/Replay` and :ref:`TCG Plugins`.
diff --git a/docs/about/index.rst b/docs/about/index.rst
index bae1309cc6..b00b584b31 100644
--- a/docs/about/index.rst
+++ b/docs/about/index.rst
@@ -23,6 +23,7 @@ allows you to create, convert and modify disk images.
:maxdepth: 2
 
build-platforms
+   emulation
deprecated
removed-features
license
diff --git a/docs/devel/tcg-plugins.rst b/docs/devel/tcg-plugins.rst
index 9740a70406..81dcd43a61 100644
--- a/docs/devel/tcg-plugins.rst
+++ b/docs/devel/tcg-plugins.rst
@@ -3,6 +3,8 @@
Copyright (c) 2019, Linaro Limited
Written by Emilio Cota and Alex Bennée
 
+.. _TCG Plugins:
+
 QEMU TCG Plugins
 
 
diff --git a/docs/system/arm/emulation.rst b/docs/system/arm/emulation.rst
index b33d7c28dc..b87e064d9d 100644
--- a/docs/system/arm/emulation.rst
+++ b/docs/system/arm/emulation.rst
@@ -1,3 +1,5 @@
+.. _Arm Emulation:
+
 A-profile CPU architecture support
 ==================================
 
-- 
2.34.1




[PATCH v2 19/35] docs: add hotlinks to about preface text

2023-01-24 Thread Alex Bennée
Make it easier to navigate the documentation.

Reviewed-by: Peter Maydell 
Acked-by: Richard Henderson 
Signed-off-by: Alex Bennée 
---
 docs/about/index.rst  | 16 
 docs/system/index.rst |  2 ++
 docs/tools/index.rst  |  2 ++
 docs/user/index.rst   |  2 ++
 4 files changed, 14 insertions(+), 8 deletions(-)

diff --git a/docs/about/index.rst b/docs/about/index.rst
index 5bea653c07..bae1309cc6 100644
--- a/docs/about/index.rst
+++ b/docs/about/index.rst
@@ -5,19 +5,19 @@ About QEMU
 QEMU is a generic and open source machine emulator and virtualizer.
 
 QEMU can be used in several different ways. The most common is for
-"system emulation", where it provides a virtual model of an
+:ref:`System Emulation`, where it provides a virtual model of an
 entire machine (CPU, memory and emulated devices) to run a guest OS.
-In this mode the CPU may be fully emulated, or it may work with
-a hypervisor such as KVM, Xen, Hax or Hypervisor.Framework to
-allow the guest to run directly on the host CPU.
+In this mode the CPU may be fully emulated, or it may work with a
+hypervisor such as KVM, Xen, Hax or Hypervisor.Framework to allow the
+guest to run directly on the host CPU.
 
-The second supported way to use QEMU is "user mode emulation",
+The second supported way to use QEMU is :ref:`User Mode Emulation`,
 where QEMU can launch processes compiled for one CPU on another CPU.
 In this mode the CPU is always emulated.
 
-QEMU also provides a number of standalone commandline utilities,
-such as the ``qemu-img`` disk image utility that allows you to create,
-convert and modify disk images.
+QEMU also provides a number of standalone :ref:`command line
+utilities<Tools>`, such as the ``qemu-img`` disk image utility that
+allows you to create, convert and modify disk images.
 
 .. toctree::
:maxdepth: 2
diff --git a/docs/system/index.rst b/docs/system/index.rst
index e3695649c5..282b6ffb56 100644
--- a/docs/system/index.rst
+++ b/docs/system/index.rst
@@ -1,3 +1,5 @@
+.. _System Emulation:
+
 
 System Emulation
 
diff --git a/docs/tools/index.rst b/docs/tools/index.rst
index 1edd5a8054..2151adcf78 100644
--- a/docs/tools/index.rst
+++ b/docs/tools/index.rst
@@ -1,3 +1,5 @@
+.. _Tools:
+
 -----
 Tools
 -----
diff --git a/docs/user/index.rst b/docs/user/index.rst
index 2c4e29f3db..782d27cda2 100644
--- a/docs/user/index.rst
+++ b/docs/user/index.rst
@@ -1,3 +1,5 @@
+.. _User Mode Emulation:
+
 -------------------
 User Mode Emulation
 -------------------
-- 
2.34.1




[PATCH v2 33/35] tcg: exclude non-memory effecting helpers from instrumentation

2023-01-24 Thread Alex Bennée
From: Emilio Cota 

There are actually a whole bunch of helpers that don't affect memory
that we shouldn't instrument. They are helpfully identified by the
TCG_CALL_NO_SIDE_EFFECTS flag which marks out lookup_tb_ptr as well as
a lot of the maths helpers. To avoid the string compare we introduce a
new flag for plugin internals so we skip that too.

Related: #1381
Signed-off-by: Emilio Cota 
Message-Id: <20230108164731.61469-4-c...@braap.org>
[AJB: updated to skip all no SE plugins, add flag for plugin helper]
Signed-off-by: Alex Bennée 

---
v2
  - use TCG_CALL_NO_SIDE_EFFECTS as suggested by rth
  - add flag for plugin specific helpers
---
 accel/tcg/plugin-helpers.h | 4 ++--
 include/tcg/tcg.h  | 2 ++
 tcg/tcg.c  | 6 --
 3 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/accel/tcg/plugin-helpers.h b/accel/tcg/plugin-helpers.h
index 9829abe4a9..8e685e0654 100644
--- a/accel/tcg/plugin-helpers.h
+++ b/accel/tcg/plugin-helpers.h
@@ -1,4 +1,4 @@
 #ifdef CONFIG_PLUGIN
-DEF_HELPER_FLAGS_2(plugin_vcpu_udata_cb, TCG_CALL_NO_RWG, void, i32, ptr)
-DEF_HELPER_FLAGS_4(plugin_vcpu_mem_cb, TCG_CALL_NO_RWG, void, i32, i32, i64, 
ptr)
+DEF_HELPER_FLAGS_2(plugin_vcpu_udata_cb, TCG_CALL_NO_RWG | TCG_CALL_PLUGIN, 
void, i32, ptr)
+DEF_HELPER_FLAGS_4(plugin_vcpu_mem_cb, TCG_CALL_NO_RWG | TCG_CALL_PLUGIN, 
void, i32, i32, i64, ptr)
 #endif
diff --git a/include/tcg/tcg.h b/include/tcg/tcg.h
index 6f497172f8..8dc291d030 100644
--- a/include/tcg/tcg.h
+++ b/include/tcg/tcg.h
@@ -405,6 +405,8 @@ typedef TCGv_ptr TCGv_env;
 #define TCG_CALL_NO_SIDE_EFFECTS0x0004
 /* Helper is G_NORETURN.  */
 #define TCG_CALL_NO_RETURN  0x0008
+/* Helper is part of Plugins.  */
+#define TCG_CALL_PLUGIN 0x0010
 
 /* convenience version of most used call flags */
 #define TCG_CALL_NO_RWG TCG_CALL_NO_READ_GLOBALS
diff --git a/tcg/tcg.c b/tcg/tcg.c
index d502327be2..fd557d55d3 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -1674,8 +1674,10 @@ void tcg_gen_callN(void *func, TCGTemp *ret, int nargs, 
TCGTemp **args)
 op = tcg_op_alloc(INDEX_op_call, total_args);
 
 #ifdef CONFIG_PLUGIN
-/* detect non-plugin helpers */
-if (tcg_ctx->plugin_insn && unlikely(strncmp(info->name, "plugin_", 7))) {
+/* Flag helpers that may affect guest state */
+if (tcg_ctx->plugin_insn &&
+!(info->flags & TCG_CALL_PLUGIN) &&
+!(info->flags & TCG_CALL_NO_SIDE_EFFECTS)) {
 tcg_ctx->plugin_insn->calls_helpers = true;
 }
 #endif
-- 
2.34.1
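
The new test in tcg_gen_callN() reduces to a pure predicate on the helper's
flag word. A minimal standalone sketch, assuming only the two flag values
visible in the include/tcg/tcg.h hunk above; the SKETCH_ names and the
function helper_may_affect_guest_state() are hypothetical and are not part
of QEMU:

```c
#include <stdbool.h>

/* Values copied from the include/tcg/tcg.h hunk; names are local to this sketch. */
#define SKETCH_CALL_NO_SIDE_EFFECTS 0x0004u
#define SKETCH_CALL_PLUGIN          0x0010u

/*
 * Hypothetical predicate: true when a helper call should mark the
 * instruction as "calls helpers", i.e. the helper is neither a
 * plugin-internal helper nor flagged as free of side effects.
 */
static bool helper_may_affect_guest_state(unsigned flags)
{
    return !(flags & SKETCH_CALL_PLUGIN) &&
           !(flags & SKETCH_CALL_NO_SIDE_EFFECTS);
}
```

Compared with the old strncmp() on the helper name, this is a pair of bit
tests, and it also excludes every helper already marked side-effect free.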




[PATCH v2 28/35] thread: de-const qemu_spin_destroy

2023-01-24 Thread Alex Bennée
From: Emilio Cota 

Reviewed-by: Alex Bennée 
Signed-off-by: Emilio Cota 
Reviewed-by: Richard Henderson 
Message-Id: <2023051628.320011-4-c...@braap.org>
Signed-off-by: Alex Bennée 
---
 include/qemu/thread.h | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/include/qemu/thread.h b/include/qemu/thread.h
index 7c6703bce3..7841084199 100644
--- a/include/qemu/thread.h
+++ b/include/qemu/thread.h
@@ -237,11 +237,10 @@ static inline void qemu_spin_init(QemuSpin *spin)
 #endif
 }
 
-/* const parameter because the only purpose here is the TSAN annotation */
-static inline void qemu_spin_destroy(const QemuSpin *spin)
+static inline void qemu_spin_destroy(QemuSpin *spin)
 {
 #ifdef CONFIG_TSAN
-__tsan_mutex_destroy((void *)spin, __tsan_mutex_not_static);
+__tsan_mutex_destroy(spin, __tsan_mutex_not_static);
 #endif
 }
 
-- 
2.34.1




[PATCH v2 32/35] translator: always pair plugin_gen_insn_{start, end} calls

2023-01-24 Thread Alex Bennée
From: Emilio Cota 

Related: #1381

Signed-off-by: Emilio Cota 
Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Richard Henderson 
Message-Id: <20230108164731.61469-3-c...@braap.org>
Signed-off-by: Alex Bennée 
---
 accel/tcg/translator.c | 15 ++-
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/accel/tcg/translator.c b/accel/tcg/translator.c
index 061519691f..ef5193c67e 100644
--- a/accel/tcg/translator.c
+++ b/accel/tcg/translator.c
@@ -100,19 +100,24 @@ void translator_loop(CPUState *cpu, TranslationBlock *tb, 
int max_insns,
 ops->translate_insn(db, cpu);
 }
 
-/* Stop translation if translate_insn so indicated.  */
-if (db->is_jmp != DISAS_NEXT) {
-break;
-}
-
 /*
  * We can't instrument after instructions that change control
  * flow although this only really affects post-load operations.
+ *
+ * Calling plugin_gen_insn_end() before we possibly stop translation
+ * is important. Even if this ends up as dead code, plugin generation
+ * needs to see a matching plugin_gen_insn_{start,end}() pair in order
+ * to accurately track instrumented helpers that might access memory.
  */
 if (plugin_enabled) {
 plugin_gen_insn_end();
 }
 
+/* Stop translation if translate_insn so indicated.  */
+if (db->is_jmp != DISAS_NEXT) {
+break;
+}
+
 /* Stop translation if the output buffer is full,
or we have executed all of the allowed instructions.  */
 if (tcg_op_buf_full() || db->num_insns >= db->max_insns) {
-- 
2.34.1




[PATCH v2 21/35] semihosting: add semihosting section to the docs

2023-01-24 Thread Alex Bennée
The main reason to do this is to document our O_BINARY implementation
decision somewhere. However I've also moved some of the implementation
details out of qemu-options and added links between the two. As a
bonus I've highlighted the scary warnings about host access with the
appropriate RST tags.

Acked-by: Richard Henderson 
Reviewed-by: Richard Henderson 
Signed-off-by: Alex Bennée 

---
v2
  - moved inside the generic emulation section
  - make it clearer semihosting is specified by the architecture
  - more expansive description for O_BINARY
  - s/mips/MIPS/
---
 docs/about/emulation.rst | 89 
 qemu-options.hx  | 27 +---
 2 files changed, 99 insertions(+), 17 deletions(-)

diff --git a/docs/about/emulation.rst b/docs/about/emulation.rst
index bdc0630b35..dde892a226 100644
--- a/docs/about/emulation.rst
+++ b/docs/about/emulation.rst
@@ -101,3 +101,92 @@ depending on the guest architecture.
 
 A number of features are are only available when running under
 emulation including :ref:`Record/Replay` and :ref:`TCG Plugins`.
+
+.. _Semihosting:
+
+Semihosting
+-----------
+
+Semihosting is a feature defined by the owner of the architecture to
+allow programs to interact with a debugging host system. On real
+hardware this is usually provided by an In-circuit emulator (ICE)
+hooked directly to the board. QEMU's implementation allows for
+semihosting calls to be passed to the host system or via the
+``gdbstub``.
+
+Generally semihosting makes it easier to bring up low level code before a
+more fully functional operating system has been enabled. On QEMU it
+also allows for embedded micro-controller code which typically doesn't
+have a full libc to be run as "bare-metal" code under QEMU's user-mode
+emulation. It is also useful for writing test cases and indeed a
+number of compiler suites as well as QEMU itself use semihosting calls
+to exit test code while reporting the success state.
+
+Semihosting is only available using TCG emulation. This is because the
+instructions to trigger a semihosting call are typically reserved
+causing most hypervisors to trap and fault on them.
+
+.. warning::
+   Semihosting inherently bypasses any isolation there may be between
+   the guest and the host. As a result a program using semihosting can
+   happily trash your host system. You should only ever run trusted
+   code with semihosting enabled.
+
+Redirection
+~~~~~~~~~~~
+
+Semihosting calls can be re-directed to a (potentially remote) gdb
+during debugging via the :ref:`gdbstub`. Output to the
+semihosting console is configured as a ``chardev`` so can be
+redirected to a file, pipe or socket like any other ``chardev``
+device.
+
+See :ref:`Semihosting Options` for details.
+
+Supported Targets
+~~~~~~~~~~~~~~~~~
+
+Most targets offer similar semihosting implementations with some
+minor changes to define the appropriate instruction to encode the
+semihosting call and which registers hold the parameters. They tend to
+present a simple POSIX-like API which allows your program to read and
+write files, access the console and some other basic interactions.
+
+For full details of the ABI for a particular target, and the set of
+calls it provides, you should consult the semihosting specification
+for that architecture.
+
+.. note::
+   QEMU makes an implementation decision to implement all file
+   access in ``O_BINARY`` mode. The user-visible effect of this is that,
+   regardless of the text/binary mode the program sets, QEMU will
+   always select a binary mode, ensuring no line-terminator conversion
+   is performed on input or output. This is because gdb semihosting
+   support doesn't make the distinction between the modes and
+   magically processing line endings can be confusing.
+
+.. list-table:: Guest Architectures supporting Semihosting
+  :widths: 10 10 80
+  :header-rows: 1
+
+  * - Architecture
+- Modes
+- Specification
+  * - Arm
+- System and User-mode
+- 
https://github.com/ARM-software/abi-aa/blob/main/semihosting/semihosting.rst
+  * - m68k
+- System
+- 
https://sourceware.org/git/?p=newlib-cygwin.git;a=blob;f=libgloss/m68k/m68k-semi.txt;hb=HEAD
+  * - MIPS
+- System
+- Unified Hosting Interface (MD01069)
+  * - Nios II
+- System
+- 
https://sourceware.org/git/gitweb.cgi?p=newlib-cygwin.git;a=blob;f=libgloss/nios2/nios2-semi.txt;hb=HEAD
+  * - RISC-V
+- System and User-mode
+- 
https://github.com/riscv/riscv-semihosting-spec/blob/main/riscv-semihosting-spec.adoc
+  * - Xtensa
+- System
+- Tensilica ISS SIMCALL
diff --git a/qemu-options.hx b/qemu-options.hx
index d59d19704b..4508a00c59 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -4633,10 +4633,13 @@ DEF("semihosting", 0, QEMU_OPTION_semihosting,
 QEMU_ARCH_MIPS | QEMU_ARCH_NIOS2 | QEMU_ARCH_RISCV)
 SRST
 ``-semihosting``
-Enable semihosting mode (ARM, M68K, Xtensa, MIPS, Nios II, RISC-V only).
+Enable :ref:`Semihosting` mode (ARM, 

[PATCH v2 17/35] gitlab: wrap up test results for custom runners

2023-01-24 Thread Alex Bennée
Instead of spewing the whole log to stdout let's just define them as
build artefacts so we can examine them later. Where we are running
check-tcg run it first as those tests are yet to be integrated into
meson. To avoid confusion we don't run multiple check-tcg tests at
once.

Reviewed-by: Thomas Huth 
Signed-off-by: Alex Bennée 

---
v2
  - mention we don't parallelise check-tcg
---
 .gitlab-ci.d/custom-runners.yml | 11 +++
 .gitlab-ci.d/custom-runners/ubuntu-20.04-s390x.yml  | 13 ++---
 .../custom-runners/ubuntu-22.04-aarch32.yml |  2 +-
 .../custom-runners/ubuntu-22.04-aarch64.yml | 13 ++---
 4 files changed, 24 insertions(+), 15 deletions(-)

diff --git a/.gitlab-ci.d/custom-runners.yml b/.gitlab-ci.d/custom-runners.yml
index 97f99e29c2..9fdc476c48 100644
--- a/.gitlab-ci.d/custom-runners.yml
+++ b/.gitlab-ci.d/custom-runners.yml
@@ -13,6 +13,17 @@
 variables:
   GIT_STRATEGY: clone
 
+# All custom runners can extend this template to upload the testlog
+# data as an artifact and also feed the junit report
+.custom_artifacts_template:
+  artifacts:
+name: "$CI_JOB_NAME-$CI_COMMIT_REF_SLUG"
+expire_in: 7 days
+paths:
+  - build/meson-logs/testlog.txt
+reports:
+  junit: build/meson-logs/testlog.junit.xml
+
 include:
   - local: '/.gitlab-ci.d/custom-runners/ubuntu-20.04-s390x.yml'
   - local: '/.gitlab-ci.d/custom-runners/ubuntu-22.04-aarch64.yml'
diff --git a/.gitlab-ci.d/custom-runners/ubuntu-20.04-s390x.yml 
b/.gitlab-ci.d/custom-runners/ubuntu-20.04-s390x.yml
index fcaef9e5ef..f512eaeaa3 100644
--- a/.gitlab-ci.d/custom-runners/ubuntu-20.04-s390x.yml
+++ b/.gitlab-ci.d/custom-runners/ubuntu-20.04-s390x.yml
@@ -3,6 +3,7 @@
 # "Install basic packages to build QEMU on Ubuntu 20.04/20.04"
 
 ubuntu-20.04-s390x-all-linux-static:
+ extends: .custom_artifacts_template
  needs: []
  stage: build
  tags:
@@ -19,12 +20,11 @@ ubuntu-20.04-s390x-all-linux-static:
  - ../configure --enable-debug --static --disable-system --disable-glusterfs 
--disable-libssh
|| { cat config.log meson-logs/meson-log.txt; exit 1; }
  - make --output-sync -j`nproc`
+ - make --output-sync check-tcg
  - make --output-sync -j`nproc` check
-   || { cat meson-logs/testlog.txt; exit 1; } ;
- - make --output-sync -j`nproc` check-tcg
-   || { cat meson-logs/testlog.txt; exit 1; } ;
 
 ubuntu-20.04-s390x-all:
+ extends: .custom_artifacts_template
  needs: []
  stage: build
  tags:
@@ -41,9 +41,9 @@ ubuntu-20.04-s390x-all:
|| { cat config.log meson-logs/meson-log.txt; exit 1; }
  - make --output-sync -j`nproc`
  - make --output-sync -j`nproc` check
-   || { cat meson-logs/testlog.txt; exit 1; } ;
 
 ubuntu-20.04-s390x-alldbg:
+ extends: .custom_artifacts_template
  needs: []
  stage: build
  tags:
@@ -64,9 +64,9 @@ ubuntu-20.04-s390x-alldbg:
  - make clean
  - make --output-sync -j`nproc`
  - make --output-sync -j`nproc` check
-   || { cat meson-logs/testlog.txt; exit 1; } ;
 
 ubuntu-20.04-s390x-clang:
+ extends: .custom_artifacts_template
  needs: []
  stage: build
  tags:
@@ -86,7 +86,6 @@ ubuntu-20.04-s390x-clang:
|| { cat config.log meson-logs/meson-log.txt; exit 1; }
  - make --output-sync -j`nproc`
  - make --output-sync -j`nproc` check
-   || { cat meson-logs/testlog.txt; exit 1; } ;
 
 ubuntu-20.04-s390x-tci:
  needs: []
@@ -109,6 +108,7 @@ ubuntu-20.04-s390x-tci:
  - make --output-sync -j`nproc`
 
 ubuntu-20.04-s390x-notcg:
+ extends: .custom_artifacts_template
  needs: []
  stage: build
  tags:
@@ -128,4 +128,3 @@ ubuntu-20.04-s390x-notcg:
|| { cat config.log meson-logs/meson-log.txt; exit 1; }
  - make --output-sync -j`nproc`
  - make --output-sync -j`nproc` check
-   || { cat meson-logs/testlog.txt; exit 1; } ;
diff --git a/.gitlab-ci.d/custom-runners/ubuntu-22.04-aarch32.yml 
b/.gitlab-ci.d/custom-runners/ubuntu-22.04-aarch32.yml
index 2c386fa3e9..42137aaf2a 100644
--- a/.gitlab-ci.d/custom-runners/ubuntu-22.04-aarch32.yml
+++ b/.gitlab-ci.d/custom-runners/ubuntu-22.04-aarch32.yml
@@ -3,6 +3,7 @@
 # "Install basic packages to build QEMU on Ubuntu 20.04"
 
 ubuntu-22.04-aarch32-all:
+ extends: .custom_artifacts_template
  needs: []
  stage: build
  tags:
@@ -22,4 +23,3 @@ ubuntu-22.04-aarch32-all:
|| { cat config.log meson-logs/meson-log.txt; exit 1; }
  - make --output-sync -j`nproc --ignore=40`
  - make --output-sync -j`nproc --ignore=40` check
-   || { cat meson-logs/testlog.txt; exit 1; } ;
diff --git a/.gitlab-ci.d/custom-runners/ubuntu-22.04-aarch64.yml 
b/.gitlab-ci.d/custom-runners/ubuntu-22.04-aarch64.yml
index 725ca8ffea..8ba85be440 100644
--- a/.gitlab-ci.d/custom-runners/ubuntu-22.04-aarch64.yml
+++ b/.gitlab-ci.d/custom-runners/ubuntu-22.04-aarch64.yml
@@ -3,6 +3,7 @@
 # "Install basic packages to build QEMU on Ubuntu 20.04"
 
 ubuntu-22.04-aarch64-all-linux-static:
+ extends: .custom_artifacts_template
  needs: []
  stage: build
  tags:
@@ -19,12 +20,11 @@ ubuntu-22.04-aarch64-all-linux-static:
  - ../configure 

[PATCH v2 30/35] plugins: make qemu_plugin_user_exit's locking order consistent with fork_start's

2023-01-24 Thread Alex Bennée
From: Emilio Cota 

To fix potential deadlocks as reported by tsan.

Reviewed-by: Richard Henderson 
Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Emilio Cota 
Message-Id: <2023051628.320011-6-c...@braap.org>
Signed-off-by: Alex Bennée 
---
 plugins/core.c | 16 +++-
 1 file changed, 11 insertions(+), 5 deletions(-)

diff --git a/plugins/core.c b/plugins/core.c
index ccb770a485..728bacef95 100644
--- a/plugins/core.c
+++ b/plugins/core.c
@@ -500,10 +500,17 @@ void qemu_plugin_user_exit(void)
 enum qemu_plugin_event ev;
 CPUState *cpu;
 
-QEMU_LOCK_GUARD(&plugin.lock);
-
+/*
+ * Locking order: we must acquire locks in an order that is consistent
+ * with the one in fork_start(). That is:
+ * - start_exclusive(), which acquires qemu_cpu_list_lock,
+ *   must be called before acquiring plugin.lock.
+ * - tb_flush(), which acquires mmap_lock(), must be called
+ *   while plugin.lock is not held.
+ */
 start_exclusive();
 
+qemu_rec_mutex_lock(&plugin.lock);
 /* un-register all callbacks except the final AT_EXIT one */
 for (ev = 0; ev < QEMU_PLUGIN_EV_MAX; ev++) {
 if (ev != QEMU_PLUGIN_EV_ATEXIT) {
@@ -513,13 +520,12 @@ void qemu_plugin_user_exit(void)
 }
 }
 }
-
-tb_flush(current_cpu);
-
 CPU_FOREACH(cpu) {
 qemu_plugin_disable_mem_helpers(cpu);
 }
+qemu_rec_mutex_unlock(&plugin.lock);
 
+tb_flush(current_cpu);
 end_exclusive();
 
 /* now it's safe to handle the exit case */
-- 
2.34.1




[PATCH v2 24/35] semihosting: add O_BINARY flag in host_open for NT compatibility

2023-01-24 Thread Alex Bennée
From: Evgeny Iakovlev 

Windows open(2) implementation opens files in text mode by default and
needs a Windows-only O_BINARY flag to open files as binary. QEMU already
knows about that flag in osdep and it is defined to 0 on non-Windows,
so we can just add it to the host_flags for better compatibility.

Signed-off-by: Evgeny Iakovlev 
Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Bin Meng 
Message-Id: <20230106102018.20520-1-eiakov...@linux.microsoft.com>
Signed-off-by: Alex Bennée 
---
 semihosting/syscalls.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/semihosting/syscalls.c b/semihosting/syscalls.c
index ba28194b59..e89992cf90 100644
--- a/semihosting/syscalls.c
+++ b/semihosting/syscalls.c
@@ -253,7 +253,7 @@ static void host_open(CPUState *cs, gdb_syscall_complete_cb 
complete,
 {
 CPUArchState *env G_GNUC_UNUSED = cs->env_ptr;
 char *p;
-int ret, host_flags;
+int ret, host_flags = O_BINARY;
 
 ret = validate_lock_user_string(&p, cs, fname, fname_len);
 if (ret < 0) {
@@ -262,11 +262,11 @@ static void host_open(CPUState *cs, 
gdb_syscall_complete_cb complete,
 }
 
 if (gdb_flags & GDB_O_WRONLY) {
-host_flags = O_WRONLY;
+host_flags |= O_WRONLY;
 } else if (gdb_flags & GDB_O_RDWR) {
-host_flags = O_RDWR;
+host_flags |= O_RDWR;
 } else {
-host_flags = O_RDONLY;
+host_flags |= O_RDONLY;
 }
 if (gdb_flags & GDB_O_CREAT) {
 host_flags |= O_CREAT;
-- 
2.34.1




[PATCH v2 13/35] tests/docker: Install flex in debian-tricore-cross

2023-01-24 Thread Alex Bennée
From: Philippe Mathieu-Daudé 

When flex is not available, binutils sources default to the
'missing' script, but the current script available is not in
the format expected by the 'configure' script:

  $ ./configure
  ...
  /usr/src/binutils/missing: Unknown `--run' option
  Try `/usr/src/binutils/missing --help' for more information
  configure: WARNING: `missing' script is too old or missing
  ...
  checking for bison... bison -y
  checking for flex... no
  checking for lex... no
  checking for flex... /usr/src/binutils/missing flex

  $ make
  ...
  updating ldgram.h
  gcc -DHAVE_CONFIG_H -I. -I. -I. -D_GNU_SOURCE -I. -I. -I../bfd -I./../bfd 
-I./../include -I./../intl -I../intl  -w 
-DLOCALEDIR="\"/usr/local/share/locale\""   -W -Wall -Wstrict-prototypes 
-Wmissing-prototypes -w -c `test -f 'ldgram.c' || echo './'`ldgram.c
  `test -f ldlex.l || echo './'`ldlex.l
  /bin/sh: 1: ldlex.l: not found
  make[3]: *** [Makefile:662: ldlex.c] Error 127
  make[3]: Leaving directory '/usr/src/binutils/ld'
  make[2]: *** [Makefile:799: all-recursive] Error 1

Bypass the use of the 'missing' script by directly installing 'flex'
in the container.

Reported-by: Peter Maydell 
Suggested-by: Peter Maydell 
Signed-off-by: Philippe Mathieu-Daudé 
Message-Id: <20230112155643.7408-1-phi...@linaro.org>
Reviewed-by: Bastian-Koppelmann 
Signed-off-by: Alex Bennée 
---
 tests/docker/dockerfiles/debian-tricore-cross.docker | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tests/docker/dockerfiles/debian-tricore-cross.docker 
b/tests/docker/dockerfiles/debian-tricore-cross.docker
index 34b2cea4e3..5ae58efa09 100644
--- a/tests/docker/dockerfiles/debian-tricore-cross.docker
+++ b/tests/docker/dockerfiles/debian-tricore-cross.docker
@@ -20,6 +20,7 @@ RUN apt update && \
bzip2 \
ca-certificates \
ccache \
+   flex \
g++ \
gcc \
git \
-- 
2.34.1




[PATCH v2 12/35] lcitool: drop texinfo from QEMU project/dependencies

2023-01-24 Thread Alex Bennée
From: Marc-André Lureau 

Signed-off-by: Marc-André Lureau 
Reviewed-by: Daniel P. Berrangé 
Message-Id: <20230110132700.833690-9-marcandre.lur...@redhat.com>
Signed-off-by: Alex Bennée 
---
 .gitlab-ci.d/cirrus/freebsd-12.vars   | 2 +-
 .gitlab-ci.d/cirrus/freebsd-13.vars   | 2 +-
 .gitlab-ci.d/cirrus/macos-12.vars | 2 +-
 tests/docker/dockerfiles/alpine.docker| 1 -
 tests/docker/dockerfiles/centos8.docker   | 1 -
 tests/docker/dockerfiles/debian-amd64-cross.docker| 3 +--
 tests/docker/dockerfiles/debian-amd64.docker  | 1 -
 tests/docker/dockerfiles/debian-arm64-cross.docker| 3 +--
 tests/docker/dockerfiles/debian-armel-cross.docker| 3 +--
 tests/docker/dockerfiles/debian-armhf-cross.docker| 3 +--
 tests/docker/dockerfiles/debian-mips64el-cross.docker | 3 +--
 tests/docker/dockerfiles/debian-mipsel-cross.docker   | 3 +--
 tests/docker/dockerfiles/debian-ppc64el-cross.docker  | 3 +--
 tests/docker/dockerfiles/debian-s390x-cross.docker| 3 +--
 tests/docker/dockerfiles/debian-toolchain.docker  | 1 -
 tests/docker/dockerfiles/fedora-win32-cross.docker| 1 -
 tests/docker/dockerfiles/fedora-win64-cross.docker| 1 -
 tests/docker/dockerfiles/fedora.docker| 1 -
 tests/docker/dockerfiles/opensuse-leap.docker | 1 -
 tests/docker/dockerfiles/ubuntu2004.docker| 1 -
 tests/lcitool/projects/qemu.yml   | 1 -
 21 files changed, 11 insertions(+), 29 deletions(-)

diff --git a/.gitlab-ci.d/cirrus/freebsd-12.vars 
b/.gitlab-ci.d/cirrus/freebsd-12.vars
index f32f01a954..8934e5d57f 100644
--- a/.gitlab-ci.d/cirrus/freebsd-12.vars
+++ b/.gitlab-ci.d/cirrus/freebsd-12.vars
@@ -11,6 +11,6 @@ MAKE='/usr/local/bin/gmake'
 NINJA='/usr/local/bin/ninja'
 PACKAGING_COMMAND='pkg'
 PIP3='/usr/local/bin/pip-3.8'
-PKGS='alsa-lib bash bison bzip2 ca_root_nss capstone4 ccache 
cdrkit-genisoimage cmocka ctags curl cyrus-sasl dbus diffutils dtc flex 
fusefs-libs3 gettext git glib gmake gnutls gsed gtk3 json-c libepoxy libffi 
libgcrypt libjpeg-turbo libnfs libslirp libspice-server libssh libtasn1 llvm 
lzo2 meson ncurses nettle ninja opencv pixman pkgconf png py39-numpy 
py39-pillow py39-pip py39-sphinx py39-sphinx_rtd_theme py39-yaml python3 
rpm2cpio sdl2 sdl2_image snappy sndio spice-protocol tesseract texinfo usbredir 
virglrenderer vte3 zstd'
+PKGS='alsa-lib bash bison bzip2 ca_root_nss capstone4 ccache 
cdrkit-genisoimage cmocka ctags curl cyrus-sasl dbus diffutils dtc flex 
fusefs-libs3 gettext git glib gmake gnutls gsed gtk3 json-c libepoxy libffi 
libgcrypt libjpeg-turbo libnfs libslirp libspice-server libssh libtasn1 llvm 
lzo2 meson ncurses nettle ninja opencv pixman pkgconf png py39-numpy 
py39-pillow py39-pip py39-sphinx py39-sphinx_rtd_theme py39-yaml python3 
rpm2cpio sdl2 sdl2_image snappy sndio spice-protocol tesseract usbredir 
virglrenderer vte3 zstd'
 PYPI_PKGS=''
 PYTHON='/usr/local/bin/python3'
diff --git a/.gitlab-ci.d/cirrus/freebsd-13.vars 
b/.gitlab-ci.d/cirrus/freebsd-13.vars
index 813c051616..65ce456c48 100644
--- a/.gitlab-ci.d/cirrus/freebsd-13.vars
+++ b/.gitlab-ci.d/cirrus/freebsd-13.vars
@@ -11,6 +11,6 @@ MAKE='/usr/local/bin/gmake'
 NINJA='/usr/local/bin/ninja'
 PACKAGING_COMMAND='pkg'
 PIP3='/usr/local/bin/pip-3.8'
-PKGS='alsa-lib bash bison bzip2 ca_root_nss capstone4 ccache 
cdrkit-genisoimage cmocka ctags curl cyrus-sasl dbus diffutils dtc flex 
fusefs-libs3 gettext git glib gmake gnutls gsed gtk3 json-c libepoxy libffi 
libgcrypt libjpeg-turbo libnfs libslirp libspice-server libssh libtasn1 llvm 
lzo2 meson ncurses nettle ninja opencv pixman pkgconf png py39-numpy 
py39-pillow py39-pip py39-sphinx py39-sphinx_rtd_theme py39-yaml python3 
rpm2cpio sdl2 sdl2_image snappy sndio spice-protocol tesseract texinfo usbredir 
virglrenderer vte3 zstd'
+PKGS='alsa-lib bash bison bzip2 ca_root_nss capstone4 ccache 
cdrkit-genisoimage cmocka ctags curl cyrus-sasl dbus diffutils dtc flex 
fusefs-libs3 gettext git glib gmake gnutls gsed gtk3 json-c libepoxy libffi 
libgcrypt libjpeg-turbo libnfs libslirp libspice-server libssh libtasn1 llvm 
lzo2 meson ncurses nettle ninja opencv pixman pkgconf png py39-numpy 
py39-pillow py39-pip py39-sphinx py39-sphinx_rtd_theme py39-yaml python3 
rpm2cpio sdl2 sdl2_image snappy sndio spice-protocol tesseract usbredir 
virglrenderer vte3 zstd'
 PYPI_PKGS=''
 PYTHON='/usr/local/bin/python3'
diff --git a/.gitlab-ci.d/cirrus/macos-12.vars 
b/.gitlab-ci.d/cirrus/macos-12.vars
index 33bb4e1040..65b78fa08f 100644
--- a/.gitlab-ci.d/cirrus/macos-12.vars
+++ b/.gitlab-ci.d/cirrus/macos-12.vars
@@ -11,6 +11,6 @@ MAKE='/opt/homebrew/bin/gmake'
 NINJA='/opt/homebrew/bin/ninja'
 PACKAGING_COMMAND='brew'
 PIP3='/opt/homebrew/bin/pip3'
-PKGS='bash bc bison bzip2 capstone ccache cmocka ctags curl dbus diffutils dtc 
flex gcovr gettext git glib gnu-sed gnutls gtk+3 jemalloc jpeg-turbo json-c 
libepoxy libffi libgcrypt libiscsi 

[PATCH v2 02/35] gitlab: add FF_SCRIPT_SECTIONS for timings

2023-01-24 Thread Alex Bennée
From: Mark Cave-Ayland 

Suggested-by: Mark Cave-Ayland 
Signed-off-by: Alex Bennée 
Reviewed-by: Thomas Huth 
---
 .gitlab-ci.d/base.yml | 5 +
 1 file changed, 5 insertions(+)

diff --git a/.gitlab-ci.d/base.yml b/.gitlab-ci.d/base.yml
index 69b36c148a..50fb59e147 100644
--- a/.gitlab-ci.d/base.yml
+++ b/.gitlab-ci.d/base.yml
@@ -6,6 +6,11 @@
 # most restrictive to least restrictive
 #
 .base_job_template:
+  variables:
+# Each script line will be in a collapsible section in the job output
+# and show the duration of each line.
+FF_SCRIPT_SECTIONS: 1
+
   rules:
 #
 # Stage 1: exclude scenarios where we definitely don't
-- 
2.34.1



