[PATCH 23/84] tcg/loongarch64: Remove TARGET_LONG_BITS, TCG_TYPE_TL

2023-05-03 Thread Richard Henderson
All uses replaced with TCGContext.addr_type. Signed-off-by: Richard Henderson --- tcg/loongarch64/tcg-target.c.inc | 9 + 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/tcg/loongarch64/tcg-target.c.inc b/tcg/loongarch64/tcg-target.c.inc index ea5f2a8f00..2e2428bc30 100644

[PATCH v4 06/57] accel/tcg: Honor atomicity of loads

2023-05-03 Thread Richard Henderson
Create ldst_atomicity.c.inc. Not required for user-only code loads, because we've ensured that the page is read-only before beginning to translate code. Reviewed-by: Alex Bennée Signed-off-by: Richard Henderson --- accel/tcg/cputlb.c | 170 +++--- accel/tcg/user-exec.c

[PATCH 04/84] tcg: Widen helper_{ld,st}_i128 addresses to uint64_t

2023-05-03 Thread Richard Henderson
Always pass the target address as uint64_t. Signed-off-by: Richard Henderson --- accel/tcg/tcg-runtime.h | 4 ++-- accel/tcg/cputlb.c | 5 ++--- accel/tcg/user-exec.c | 5 ++--- tcg/tcg-op-ldst.c | 26 -- 4 files changed, 30 insertions(+), 10 deletions(-)

[PATCH 14/84] tcg: Split INDEX_op_qemu_{ld, st}* for guest address size

2023-05-03 Thread Richard Henderson
For 32-bit hosts, we cannot simply rely on TCGContext.addr_bits, as we need one or two host registers to represent the guest address. Create the new opcodes and update all users. Since we have not yet eliminated TARGET_LONG_BITS, only one of the two opcodes will ever be used, so we can get away

[PATCH 07/84] accel/tcg: Merge gen_mem_wrapped with plugin_gen_empty_mem_callback

2023-05-03 Thread Richard Henderson
As gen_mem_wrapped is only used in plugin_gen_empty_mem_callback, we can avoid the curiosity of union mem_gen_fn by inlining it. Signed-off-by: Richard Henderson --- accel/tcg/plugin-gen.c | 30 ++ 1 file changed, 6 insertions(+), 24 deletions(-) diff --git

[PATCH 35/84] tcg: Remove TCG_TARGET_TLB_DISPLACEMENT_BITS

2023-05-03 Thread Richard Henderson
The last use was removed by e77c89fb086a. Fixes: e77c89fb086a ("cputlb: Remove static tlb sizing") Signed-off-by: Richard Henderson --- tcg/aarch64/tcg-target.h | 1 - tcg/arm/tcg-target.h | 1 - tcg/i386/tcg-target.h| 1 - tcg/mips/tcg-target.h| 1 - tcg/ppc/tcg-target.h | 1 -

[PATCH 54/84] tcg: Add insn_start_words to TCGContext

2023-05-03 Thread Richard Henderson
This will enable replacement of TARGET_INSN_START_WORDS in tcg.c. Split out "tcg/insn-start-words.h" and use it in target/. Signed-off-by: Richard Henderson --- include/tcg/insn-start-words.h | 17 + include/tcg/tcg-op.h | 8 include/tcg/tcg-opc.h |

[PULL v2 01/12] softmmu: Tidy dirtylimit_dirty_ring_full_time

2023-05-03 Thread Richard Henderson
Drop inline marker: let compiler decide. Change return type to uint64_t: this matches the computation in the return statement and the local variable assignment in the caller. Rename local to dirty_ring_size_MB to fix typo. Simplify conversion to MiB via qemu_target_page_bits and right shift.
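The MiB conversion described above can be sketched as follows. This is a minimal sketch, assuming a ring of `ring_entries` page-sized entries and a page size of `1 << page_bits` bytes; the function name and its parameters are hypothetical and are not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: size in MiB of a ring holding ring_entries pages. */
static uint64_t ring_size_mib(uint64_t ring_entries, unsigned page_bits)
{
    /* 1 MiB is 1 << 20 bytes; this sketch assumes pages no larger than that. */
    assert(page_bits <= 20);
    /* (entries << page_bits) >> 20, folded into a single right shift. */
    return ring_entries >> (20 - page_bits);
}
```

Folding the two shifts into one avoids the intermediate byte count, so a large entry count cannot overflow on the way to MiB.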

[PULL v2 06/12] tcg: Add tcg_gen_gvec_rotrs

2023-05-03 Thread Richard Henderson
From: Nazar Kazakov Add tcg expander and helper functions for rotate right vector with scalar operand. Signed-off-by: Nazar Kazakov Message-Id: <20230428144757.57530-10-lawrence.hun...@codethink.co.uk> [rth: Split out of larger patch; mask rotation count.] Signed-off-by: Richard Henderson ---
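The operation named above, rotating every vector element right by one scalar count with the count masked, can be sketched in plain C. This is an illustration under assumptions, not the helper from the patch: the names `rotr32_masked` and `vec_rotrs32` are hypothetical, and only 32-bit elements are shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Rotate one 32-bit element right; the count is masked to 0..31. */
static uint32_t rotr32_masked(uint32_t x, unsigned count)
{
    count &= 31;
    if (count == 0) {
        return x;               /* avoid an undefined shift by 32 below */
    }
    return (x >> count) | (x << (32 - count));
}

/* Hypothetical vector form: the same scalar count applies to every element. */
static void vec_rotrs32(uint32_t *d, const uint32_t *a, size_t n,
                        unsigned count)
{
    assert(d != NULL && a != NULL);
    for (size_t i = 0; i < n; i++) {
        d[i] = rotr32_masked(a[i], count);
    }
}
```

Masking the count first means a count equal to or larger than the element width behaves as that count modulo the width instead of invoking undefined behavior.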

[PATCH 37/84] *: Add missing includes of qemu/error-report.h

2023-05-03 Thread Richard Henderson
This had been pulled in from tcg/tcg.h, via exec/cpu_ldst.h, via exec/exec-all.h, but the include of tcg.h will be removed. Signed-off-by: Richard Henderson --- target/avr/helper.c | 1 + 1 file changed, 1 insertion(+) diff --git a/target/avr/helper.c b/target/avr/helper.c index

[PATCH 59/84] exec-all: Widen tb_page_addr_t for user-only

2023-05-03 Thread Richard Henderson
This is a step toward making TranslationBlock agnostic to the address size of the guest. Signed-off-by: Richard Henderson --- include/exec/exec-all.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h index

[PATCH 00/84] tcg: Build once for system, once for user

2023-05-03 Thread Richard Henderson
Based-on: 20230503070656.1746170-1-richard.hender...@linaro.org ("[PATCH v4 00/57] tcg: Improve atomicity support") and also Based-on: 20230502160846.1289975-1-richard.hender...@linaro.org ("[PATCH 00/16] tcg: Remove TARGET_ALIGNED_ONLY") The goal here is only tcg/, leaving accel/tcg/ for

[PATCH v4 21/57] tcg/i386: Use full load/store helpers in user-only mode

2023-05-03 Thread Richard Henderson
Instead of using helper_unaligned_{ld,st}, use the full load/store helpers. This will allow the fast path to increase alignment to implement atomicity while not immediately raising an alignment exception. Signed-off-by: Richard Henderson --- tcg/i386/tcg-target.c.inc | 52

[PATCH 06/84] tcg: Widen tcg_gen_code pc_start argument to uint64_t

2023-05-03 Thread Richard Henderson
Signed-off-by: Richard Henderson --- include/tcg/tcg.h | 2 +- tcg/tcg.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/include/tcg/tcg.h b/include/tcg/tcg.h index 7c6a613364..7d6df5eabe 100644 --- a/include/tcg/tcg.h +++ b/include/tcg/tcg.h @@ -852,7 +852,7 @@

[PATCH 25/84] tcg/ppc: Remove TARGET_LONG_BITS, TCG_TYPE_TL

2023-05-03 Thread Richard Henderson
All uses replaced with TCGContext.addr_type. Signed-off-by: Richard Henderson --- tcg/ppc/tcg-target.c.inc | 21 +++-- 1 file changed, 11 insertions(+), 10 deletions(-) diff --git a/tcg/ppc/tcg-target.c.inc b/tcg/ppc/tcg-target.c.inc index 6bda1358ef..33237368e4 100644 ---

[PATCH 34/84] tcg: Add tlb_fast_offset to TCGContext

2023-05-03 Thread Richard Henderson
Disconnect the layout of ArchCPU from TCG compilation. Pass the relative offset of 'env' and 'neg.tlb.f' as a parameter. Signed-off-by: Richard Henderson --- include/exec/cpu-defs.h | 39 +- include/exec/tlb-common.h| 56

[PATCH v4 25/57] tcg/riscv: Use full load/store helpers in user-only mode

2023-05-03 Thread Richard Henderson
Instead of using helper_unaligned_{ld,st}, use the full load/store helpers. This will allow the fast path to increase alignment to implement atomicity while not immediately raising an alignment exception. Signed-off-by: Richard Henderson --- tcg/riscv/tcg-target.c.inc | 29

[PATCH 47/84] tcg: Remove outdated comments in helper-head.h

2023-05-03 Thread Richard Henderson
Signed-off-by: Richard Henderson --- include/exec/helper-head.h | 18 +++--- 1 file changed, 3 insertions(+), 15 deletions(-) diff --git a/include/exec/helper-head.h b/include/exec/helper-head.h index f863a6ef5d..a355ef8ebe 100644 --- a/include/exec/helper-head.h +++

[PATCH v4 45/54] tcg/mips: Remove MO_BSWAP handling

2023-05-03 Thread Richard Henderson
While performing the load in the delay slot of the call to the common bswap helper function is cute, it is not worth the added complexity. Signed-off-by: Richard Henderson --- tcg/mips/tcg-target.h | 4 +- tcg/mips/tcg-target.c.inc | 284 ++ 2 files

[PATCH 28/84] tcg/sparc64: Remove TARGET_LONG_BITS, TCG_TYPE_TL

2023-05-03 Thread Richard Henderson
All uses replaced with TCGContext.addr_type. Signed-off-by: Richard Henderson --- tcg/sparc64/tcg-target.c.inc | 7 --- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/tcg/sparc64/tcg-target.c.inc b/tcg/sparc64/tcg-target.c.inc index 79ca667559..ccbf4a179c 100644 ---

[PATCH 01/84] tcg: Split out memory ops to tcg-op-ldst.c

2023-05-03 Thread Richard Henderson
Signed-off-by: Richard Henderson --- tcg/tcg-op-ldst.c | 1017 + tcg/tcg-op.c | 985 --- tcg/meson.build |1 + 3 files changed, 1018 insertions(+), 985 deletions(-) create mode 100644

[PATCH v4 40/54] tcg/mips: Convert tcg_out_qemu_{ld,st}_slow_path

2023-05-03 Thread Richard Henderson
Use tcg_out_ld_helper_args, tcg_out_ld_helper_ret, and tcg_out_st_helper_args. This allows our local tcg_out_arg_* infrastructure to be removed. We are no longer filling the call or return branch delay slots, nor are we tail-calling for the store, but this seems a small price to pay.

[PATCH 42/84] tcg: Split out tcg/oversized-guest.h

2023-05-03 Thread Richard Henderson
Move a use of TARGET_LONG_BITS out of tcg/tcg.h. Include the new file only where required. Signed-off-by: Richard Henderson --- include/exec/cpu_ldst.h | 3 +-- include/tcg/oversized-guest.h | 23 +++ include/tcg/tcg.h | 9 - accel/tcg/cputlb.c

[PULL v2 05/12] tcg: Add tcg_gen_gvec_andcs

2023-05-03 Thread Richard Henderson
From: Nazar Kazakov Add tcg expander and helper functions for and-complement vector with scalar operand. Signed-off-by: Nazar Kazakov Message-Id: <20230428144757.57530-10-lawrence.hun...@codethink.co.uk> [rth: Split out of larger patch.] Signed-off-by: Richard Henderson ---
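An and-complement with a scalar operand computes `a & ~c` for each element, with the same scalar `c` applied across the vector. A minimal sketch in plain C, assuming 64-bit elements; the name `vec_andcs64` is hypothetical and is not the helper added by the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sketch: d[i] = a[i] & ~c for every element. */
static void vec_andcs64(uint64_t *d, const uint64_t *a, size_t n, uint64_t c)
{
    assert(d != NULL && a != NULL);
    uint64_t mask = ~c;         /* complement the scalar once, outside the loop */
    for (size_t i = 0; i < n; i++) {
        d[i] = a[i] & mask;
    }
}
```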

[PATCH 68/84] target/arm: Tidy helpers for translation

2023-05-03 Thread Richard Henderson
Move most includes from *translate*.c to translate.h, ensuring that we get the ordering correct. Ensure cpu.h is first. Use disas/disas.h instead of exec/log.h. Drop otherwise unused includes. Signed-off-by: Richard Henderson --- target/arm/tcg/translate.h| 3 +++

[PATCH v4 24/54] tcg/riscv: Introduce prepare_host_addr

2023-05-03 Thread Richard Henderson
Merge tcg_out_tlb_load, add_qemu_ldst_label, tcg_out_test_alignment, and some code that lived in both tcg_out_qemu_ld and tcg_out_qemu_st into one function that returns TCGReg and TCGLabelQemuLdst. Signed-off-by: Richard Henderson --- tcg/riscv/tcg-target.c.inc | 253

[PULL v2 04/12] qemu/host-utils.h: Add clz and ctz functions for lower-bit integers

2023-05-03 Thread Richard Henderson
From: Kiran Ostrolenk This is for use in the RISC-V vclz and vctz instructions (implemented in proceeding commit). Signed-off-by: Kiran Ostrolenk Reviewed-by: Richard Henderson Message-Id: <20230428144757.57530-11-lawrence.hun...@codethink.co.uk> Signed-off-by: Richard Henderson ---
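Counting leading and trailing zeros on an 8-bit value can be sketched portably as below. This is an illustration only: the names `clz8_sketch` and `ctz8_sketch` are hypothetical, the loops stand in for whatever the patch actually uses, and returning 8 for a zero input is an assumption of this sketch.

```c
#include <stdint.h>

/* Count leading zero bits of an 8-bit value; returns 8 when v == 0. */
static int clz8_sketch(uint8_t v)
{
    int n = 0;
    for (uint8_t m = 0x80; m != 0 && !(v & m); m >>= 1) {
        n++;
    }
    return n;
}

/* Count trailing zero bits of an 8-bit value; returns 8 when v == 0. */
static int ctz8_sketch(uint8_t v)
{
    int n = 0;
    /* m wraps to 0 after 0x80 because it is stored as uint8_t. */
    for (uint8_t m = 0x01; m != 0 && !(v & m); m <<= 1) {
        n++;
    }
    return n;
}
```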

[PATCH v4 13/57] meson: Detect atomic128 support with optimization

2023-05-03 Thread Richard Henderson
There is an edge condition prior to gcc13 for which optimization is required to generate 16-byte atomic sequences. Detect this. Signed-off-by: Richard Henderson --- accel/tcg/ldst_atomicity.c.inc | 38 ++--- meson.build| 52

[PULL v2 07/12] qemu/int128: Re-shuffle Int128Alias members

2023-05-03 Thread Richard Henderson
Clang 14, with --enable-tcg-interpreter errors with include/qemu/int128.h:487:16: error: alignment of field 'i' (128 bits) does not match the alignment of the first field in transparent union; transparent_union attribute ignored [-Werror,-Wignored-attributes] __int128_t i;

[PATCH 15/84] tcg/tci: Eliminate TARGET_LONG_BITS, target_ulong

2023-05-03 Thread Richard Henderson
We now have the address size as part of the opcode, so we no longer need to test TARGET_LONG_BITS. We can use uint64_t for target_ulong, as passed into load/store helpers. Signed-off-by: Richard Henderson --- tcg/tci.c| 61 +---

[PATCH v4 00/57] tcg: Improve atomicity support

2023-05-03 Thread Richard Henderson
v1: https://lore.kernel.org/qemu-devel/20221118094754.242910-1-richard.hender...@linaro.org/ v2: https://lore.kernel.org/qemu-devel/20230216025739.1211680-1-richard.hender...@linaro.org/ v3: https://lore.kernel.org/qemu-devel/20230425193146.2106111-1-richard.hender...@linaro.org/ Based-on:

[PATCH v4 02/57] accel/tcg: Add cpu_in_serial_context

2023-05-03 Thread Richard Henderson
Like cpu_in_exclusive_context, but also true if there is no other cpu against which we could race. Use it in tb_flush as a direct replacement. Use it in cpu_loop_exit_atomic to ensure that there is no loop against cpu_exec_step_atomic. Reviewed-by: Alex Bennée Reviewed-by: Philippe

[PULL v2 02/12] accel/tcg: Uncache the host address for instruction fetch when tlb size < 1

2023-05-03 Thread Richard Henderson
From: Weiwei Li When a PMP entry overlaps part of the page, we'll set the tlb_size to 1, which will make the address in the tlb entry set with TLB_INVALID_MASK, and the next access will again go through tlb_fill. However, this will not work in tb_gen_code() => get_page_addr_code_hostp(): the TLB

[PATCH v4 18/57] tcg/aarch64: Detect have_lse, have_lse2 for darwin

2023-05-03 Thread Richard Henderson
These features are present for Apple M1. Tested-by: Philippe Mathieu-Daudé Reviewed-by: Philippe Mathieu-Daudé Signed-off-by: Richard Henderson --- tcg/aarch64/tcg-target.c.inc | 28 1 file changed, 28 insertions(+) diff --git a/tcg/aarch64/tcg-target.c.inc

[PATCH v4 17/57] tcg/aarch64: Detect have_lse, have_lse2 for linux

2023-05-03 Thread Richard Henderson
Notice when the host has additional atomic instructions. The new variables will also be used in generated code. Reviewed-by: Philippe Mathieu-Daudé Signed-off-by: Richard Henderson --- tcg/aarch64/tcg-target.h | 3 +++ tcg/aarch64/tcg-target.c.inc | 12 2 files changed, 15

[PATCH 02/84] tcg: Widen gen_insn_data to uint64_t

2023-05-03 Thread Richard Henderson
We already pass uint64_t to restore_state_to_opc; this changes all of the other uses from insn_start through the encoding to decoding. Signed-off-by: Richard Henderson --- include/tcg/tcg-op.h | 39 +-- include/tcg/tcg-opc.h | 2 +-

[PATCH v4 31/54] tcg: Replace REG_P with arg_loc_reg_p

2023-05-03 Thread Richard Henderson
An inline function is safer than a macro, and REG_P was rather too generic. Reviewed-by: Philippe Mathieu-Daudé Signed-off-by: Richard Henderson --- tcg/tcg-internal.h | 4 tcg/tcg.c | 16 +--- 2 files changed, 13 insertions(+), 7 deletions(-) diff --git

[PATCH 03/84] accel/tcg: Widen tcg-ldst.h addresses to uint64_t

2023-05-03 Thread Richard Henderson
Always pass the target address as uint64_t. Adjust tcg_out_{ld,st}_helper_args to match. Signed-off-by: Richard Henderson --- include/tcg/tcg-ldst.h | 26 +- accel/tcg/cputlb.c | 26 +- accel/tcg/user-exec.c | 26 +- tcg/tcg.c |

[PATCH 09/84] tcg: Reduce copies for plugin_gen_mem_callbacks

2023-05-03 Thread Richard Henderson
We only need to make copies for loads, when the destination overlaps the address. For now, only eliminate the copy for stores and 128-bit loads. Rename plugin_prep_mem_callbacks to plugin_maybe_preserve_addr, returning NULL if no copy is made. Signed-off-by: Richard Henderson ---

[PATCH v4 40/57] tcg: Add INDEX_op_qemu_{ld,st}_i128

2023-05-03 Thread Richard Henderson
Add opcodes for backend support for 128-bit memory operations. Reviewed-by: Philippe Mathieu-Daudé Signed-off-by: Richard Henderson --- include/tcg/tcg-opc.h| 8 + tcg/aarch64/tcg-target.h | 2 ++ tcg/arm/tcg-target.h | 2 ++ tcg/i386/tcg-target.h| 2 ++

[PATCH v4 39/57] tcg: Introduce tcg_target_has_memory_bswap

2023-05-03 Thread Richard Henderson
Replace the unparameterized TCG_TARGET_HAS_MEMORY_BSWAP macro with a function with a memop argument. Signed-off-by: Richard Henderson --- tcg/aarch64/tcg-target.h | 1 - tcg/arm/tcg-target.h | 1 - tcg/i386/tcg-target.h| 3 --- tcg/loongarch64/tcg-target.h

[PATCH v4 42/57] tcg: Introduce atom_and_align_for_opc

2023-05-03 Thread Richard Henderson
Examine MemOp for atomicity and alignment, adjusting alignment as required to implement atomicity on the host. Signed-off-by: Richard Henderson --- tcg/tcg.c | 69 +++ 1 file changed, 69 insertions(+) diff --git a/tcg/tcg.c b/tcg/tcg.c index

[PATCH v4 01/57] include/exec/memop: Add bits describing atomicity

2023-05-03 Thread Richard Henderson
These bits may be used to describe the precise atomicity requirements of the guest, which may then be used to constrain the methods by which it may be emulated by the host. For instance, the AArch64 LDP (32-bit) instruction changes semantics with ARMv8.4 LSE2, from MO_64 | MO_ATMAX_4 |

[PATCH v4 44/57] tcg/aarch64: Use atom_and_align_for_opc

2023-05-03 Thread Richard Henderson
Signed-off-by: Richard Henderson --- tcg/aarch64/tcg-target.c.inc | 38 +++- 1 file changed, 20 insertions(+), 18 deletions(-) diff --git a/tcg/aarch64/tcg-target.c.inc b/tcg/aarch64/tcg-target.c.inc index 8e5f3d3688..1d6d382edd 100644 ---

[PATCH v4 41/54] tcg/ppc: Convert tcg_out_qemu_{ld,st}_slow_path

2023-05-03 Thread Richard Henderson
Use tcg_out_ld_helper_args, tcg_out_ld_helper_ret, and tcg_out_st_helper_args. Reviewed-by: Daniel Henrique Barboza Signed-off-by: Richard Henderson --- tcg/ppc/tcg-target.c.inc | 88 1 file changed, 26 insertions(+), 62 deletions(-) diff --git

[PATCH v4 16/57] accel/tcg: Add aarch64 specific support in ldst_atomicity

2023-05-03 Thread Richard Henderson
We have code in atomic128.h noting that through GCC 8, there was no support for atomic operations on __uint128. This has been fixed in GCC 10. But we can still improve over any basic compare-and-swap loop using the ldxp/stxp instructions. Signed-off-by: Richard Henderson ---

[PATCH v4 31/57] tcg/sparc64: Rename tcg_out_movi_imm13 to tcg_out_movi_s13

2023-05-03 Thread Richard Henderson
Emphasize that the constant is signed. Signed-off-by: Richard Henderson --- tcg/sparc64/tcg-target.c.inc | 30 +++--- 1 file changed, 15 insertions(+), 15 deletions(-) diff --git a/tcg/sparc64/tcg-target.c.inc b/tcg/sparc64/tcg-target.c.inc index 64464ab363..2e6127d506

[PATCH v4 46/57] tcg/loongarch64: Use atom_and_align_for_opc

2023-05-03 Thread Richard Henderson
Signed-off-by: Richard Henderson --- tcg/loongarch64/tcg-target.c.inc | 8 +++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/tcg/loongarch64/tcg-target.c.inc b/tcg/loongarch64/tcg-target.c.inc index 62bf823084..43341524f2 100644 --- a/tcg/loongarch64/tcg-target.c.inc +++

[PATCH v4 55/57] tcg/aarch64: Support 128-bit load/store

2023-05-03 Thread Richard Henderson
Use LDXP+STXP when LSE2 is not present and 16-byte atomicity is required, and LDP/STP otherwise. This requires allocating a second general-purpose temporary, as Rs cannot overlap Rn in STXP. Signed-off-by: Richard Henderson --- tcg/aarch64/tcg-target-con-set.h | 2 + tcg/aarch64/tcg-target.h

[PATCH v4 50/57] tcg/s390x: Use atom_and_align_for_opc

2023-05-03 Thread Richard Henderson
Signed-off-by: Richard Henderson --- tcg/s390x/tcg-target.c.inc | 14 ++ 1 file changed, 10 insertions(+), 4 deletions(-) diff --git a/tcg/s390x/tcg-target.c.inc b/tcg/s390x/tcg-target.c.inc index 22f0206b5a..ddd9860a6a 100644 --- a/tcg/s390x/tcg-target.c.inc +++

[PATCH v4 54/57] tcg/aarch64: Rename temporaries

2023-05-03 Thread Richard Henderson
We will need to allocate a second general-purpose temporary. Rename the existing temps to add a distinguishing number. Signed-off-by: Richard Henderson --- tcg/aarch64/tcg-target.c.inc | 50 ++-- 1 file changed, 25 insertions(+), 25 deletions(-) diff --git

[PATCH v4 05/54] tcg/i386: Introduce tcg_out_testi

2023-05-03 Thread Richard Henderson
Split out a helper for choosing testb vs testl. Reviewed-by: Philippe Mathieu-Daudé Signed-off-by: Richard Henderson --- tcg/i386/tcg-target.c.inc | 30 ++ 1 file changed, 18 insertions(+), 12 deletions(-) diff --git a/tcg/i386/tcg-target.c.inc

[PATCH v4 53/57] tcg/i386: Support 128-bit load/store with have_atomic16

2023-05-03 Thread Richard Henderson
Signed-off-by: Richard Henderson --- tcg/i386/tcg-target.h | 3 +- tcg/i386/tcg-target.c.inc | 184 +- 2 files changed, 182 insertions(+), 5 deletions(-) diff --git a/tcg/i386/tcg-target.h b/tcg/i386/tcg-target.h index 943af6775e..7f69997e30 100644 ---

[PATCH v4 25/54] tcg/s390x: Pass TCGType to tcg_out_qemu_{ld,st}

2023-05-03 Thread Richard Henderson
We need to set this in TCGLabelQemuLdst, so plumb this all the way through from tcg_out_op. Reviewed-by: Philippe Mathieu-Daudé Signed-off-by: Richard Henderson --- tcg/s390x/tcg-target.c.inc | 22 ++ 1 file changed, 14 insertions(+), 8 deletions(-) diff --git

[PATCH v4 11/57] tcg/tci: Use helper_{ld,st}*_mmu for user-only

2023-05-03 Thread Richard Henderson
We can now fold these two pieces of code. Signed-off-by: Richard Henderson --- tcg/tci.c | 89 --- 1 file changed, 89 deletions(-) diff --git a/tcg/tci.c b/tcg/tci.c index 5bde2e1f2e..15f2f8c463 100644 --- a/tcg/tci.c +++ b/tcg/tci.c @@

[PATCH v4 12/57] tcg: Add 128-bit guest memory primitives

2023-05-03 Thread Richard Henderson
Signed-off-by: Richard Henderson --- accel/tcg/tcg-runtime.h| 3 + include/tcg/tcg-ldst.h | 4 + accel/tcg/cputlb.c | 392 + accel/tcg/user-exec.c | 94 ++-- tcg/tcg-op.c | 184 +++-

[PATCH v4 07/54] tcg/i386: Use indexed addressing for softmmu fast path

2023-05-03 Thread Richard Henderson
Since tcg_out_{ld,st}_helper_args, the slow path no longer requires the address argument to be set up by the tlb load sequence. Use a plain load for the addend and indexed addressing with the original input address register. Signed-off-by: Richard Henderson --- tcg/i386/tcg-target.c.inc | 25

[PATCH v4 23/57] tcg/ppc: Use full load/store helpers in user-only mode

2023-05-03 Thread Richard Henderson
Instead of using helper_unaligned_{ld,st}, use the full load/store helpers. This will allow the fast path to increase alignment to implement atomicity while not immediately raising an alignment exception. Signed-off-by: Richard Henderson --- tcg/ppc/tcg-target.c.inc | 44

[PATCH v4 57/57] tcg/s390x: Support 128-bit load/store

2023-05-03 Thread Richard Henderson
Use LPQ/STPQ when 16-byte atomicity is required. Note that these instructions require 16-byte alignment. Signed-off-by: Richard Henderson --- tcg/s390x/tcg-target-con-set.h | 2 + tcg/s390x/tcg-target.h | 2 +- tcg/s390x/tcg-target.c.inc | 100 -

[PATCH v4 03/54] tcg/i386: Introduce HostAddress

2023-05-03 Thread Richard Henderson
Collect the 4 potential parts of the host address into a struct. Reorg tcg_out_qemu_{ld,st}_direct to use it. Reorg guest_base handling to use it. Reviewed-by: Philippe Mathieu-Daudé Signed-off-by: Richard Henderson --- tcg/i386/tcg-target.c.inc | 165 +- 1

[PATCH v4 42/54] tcg/riscv: Convert tcg_out_qemu_{ld,st}_slow_path

2023-05-03 Thread Richard Henderson
Use tcg_out_ld_helper_args, tcg_out_ld_helper_ret, and tcg_out_st_helper_args. Reviewed-by: Daniel Henrique Barboza Signed-off-by: Richard Henderson --- tcg/riscv/tcg-target.c.inc | 37 ++--- 1 file changed, 10 insertions(+), 27 deletions(-) diff --git

[PATCH v4 22/57] tcg/aarch64: Use full load/store helpers in user-only mode

2023-05-03 Thread Richard Henderson
Instead of using helper_unaligned_{ld,st}, use the full load/store helpers. This will allow the fast path to increase alignment to implement atomicity while not immediately raising an alignment exception. Signed-off-by: Richard Henderson --- tcg/aarch64/tcg-target.c.inc | 35

[PATCH v4 33/57] tcg/sparc64: Split out tcg_out_movi_s32

2023-05-03 Thread Richard Henderson
Signed-off-by: Richard Henderson --- tcg/sparc64/tcg-target.c.inc | 10 -- 1 file changed, 8 insertions(+), 2 deletions(-) diff --git a/tcg/sparc64/tcg-target.c.inc b/tcg/sparc64/tcg-target.c.inc index e244209890..4375a06377 100644 --- a/tcg/sparc64/tcg-target.c.inc +++

[PATCH v4 52/57] tcg/i386: Honor 64-bit atomicity in 32-bit mode

2023-05-03 Thread Richard Henderson
Use the fpu to perform 64-bit loads and stores. Signed-off-by: Richard Henderson --- tcg/i386/tcg-target.c.inc | 44 +-- 1 file changed, 38 insertions(+), 6 deletions(-) diff --git a/tcg/i386/tcg-target.c.inc b/tcg/i386/tcg-target.c.inc index

[PATCH v4 34/57] tcg/sparc64: Use standard slow path for softmmu

2023-05-03 Thread Richard Henderson
Drop the target-specific trampolines for the standard slow path. This lets us use tcg_out_helper_{ld,st}_args, and handles the new atomicity bits within MemOp. At the same time, use the full load/store helpers for user-only mode. Drop inline unaligned access support for user-only mode, as it does

[PATCH v4 34/54] tcg: Add routines for calling slow-path helpers

2023-05-03 Thread Richard Henderson
Add tcg_out_ld_helper_args, tcg_out_ld_helper_ret, and tcg_out_st_helper_args. These and their subroutines use the existing knowledge of the host function call abi to load the function call arguments and return results. These will be used to simplify the backends in turn. Signed-off-by: Richard

[PATCH v4 35/57] accel/tcg: Remove helper_unaligned_{ld,st}

2023-05-03 Thread Richard Henderson
These functions are now unused. Signed-off-by: Richard Henderson --- include/tcg/tcg-ldst.h | 6 -- accel/tcg/user-exec.c | 10 -- 2 files changed, 16 deletions(-) diff --git a/include/tcg/tcg-ldst.h b/include/tcg/tcg-ldst.h index 64f48e6990..7dd57013e9 100644 ---

[PATCH v4 32/54] tcg: Introduce arg_slot_stk_ofs

2023-05-03 Thread Richard Henderson
Unify all computation of argument stack offset in one function. This requires that we adjust ref_slot to be in the same units, by adding max_reg_slots during init_call_layout. Reviewed-by: Philippe Mathieu-Daudé Signed-off-by: Richard Henderson --- tcg/tcg.c | 29 +

[PATCH v4 48/57] tcg/ppc: Use atom_and_align_for_opc

2023-05-03 Thread Richard Henderson
Signed-off-by: Richard Henderson --- tcg/ppc/tcg-target.c.inc | 17 - 1 file changed, 16 insertions(+), 1 deletion(-) diff --git a/tcg/ppc/tcg-target.c.inc b/tcg/ppc/tcg-target.c.inc index f0a4118bbb..60375804cd 100644 --- a/tcg/ppc/tcg-target.c.inc +++

[PATCH v4 56/57] tcg/ppc: Support 128-bit load/store

2023-05-03 Thread Richard Henderson
Use LQ/STQ with ISA v2.07 when 16-byte atomicity is required. Note that these instructions do not require 16-byte alignment. Signed-off-by: Richard Henderson --- tcg/ppc/tcg-target-con-set.h | 2 + tcg/ppc/tcg-target-con-str.h | 1 + tcg/ppc/tcg-target.h | 3 +-

[PATCH v4 38/57] tcg/riscv: Support softmmu unaligned accesses

2023-05-03 Thread Richard Henderson
The system is required to emulate unaligned accesses, even if the hardware does not support it. The resulting trap may or may not be more efficient than the qemu slow path. There are linux kernel patches in flight to allow userspace to query hardware support; we can re-evaluate whether to enable

[PATCH v4 27/57] tcg/arm: Use full load/store helpers in user-only mode

2023-05-03 Thread Richard Henderson
Instead of using helper_unaligned_{ld,st}, use the full load/store helpers. This will allow the fast path to increase alignment to implement atomicity while not immediately raising an alignment exception. Signed-off-by: Richard Henderson --- tcg/arm/tcg-target.c.inc | 45

[PATCH v4 24/57] tcg/loongarch64: Use full load/store helpers in user-only mode

2023-05-03 Thread Richard Henderson
Instead of using helper_unaligned_{ld,st}, use the full load/store helpers. This will allow the fast path to increase alignment to implement atomicity while not immediately raising an alignment exception. Signed-off-by: Richard Henderson --- tcg/loongarch64/tcg-target.c.inc | 30

[PATCH v4 45/57] tcg/arm: Use atom_and_align_for_opc

2023-05-03 Thread Richard Henderson
No change to the ultimate load/store routines yet, so some atomicity conditions not yet honored, but plumbs the change to alignment through the relevant functions. Signed-off-by: Richard Henderson --- tcg/arm/tcg-target.c.inc | 13 +++-- 1 file changed, 11 insertions(+), 2 deletions(-)

[PATCH v4 43/57] tcg/i386: Use atom_and_align_for_opc

2023-05-03 Thread Richard Henderson
No change to the ultimate load/store routines yet, so some atomicity conditions not yet honored, but plumbs the change to alignment through the relevant functions. Signed-off-by: Richard Henderson --- tcg/i386/tcg-target.c.inc | 34 ++ 1 file changed, 22

[PATCH v4 21/54] tcg/ppc: Introduce prepare_host_addr

2023-05-03 Thread Richard Henderson
Merge tcg_out_tlb_load, add_qemu_ldst_label, tcg_out_test_alignment, and some code that lived in both tcg_out_qemu_ld and tcg_out_qemu_st into one function that returns HostAddress and TCGLabelQemuLdst structures. Signed-off-by: Richard Henderson --- tcg/ppc/tcg-target.c.inc | 377

[PATCH v4 03/57] accel/tcg: Introduce tlb_read_idx

2023-05-03 Thread Richard Henderson
Instead of playing with offsetof in various places, use MMUAccessType to index an array. This is easily defined instead of the previous dummy padding array in the union. Reviewed-by: Alex Bennée Reviewed-by: Philippe Mathieu-Daudé Signed-off-by: Richard Henderson --- include/exec/cpu-defs.h
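The idea of indexing an array by access type, instead of computing field offsets, can be sketched as below. This is a hedged illustration: `AccessType`, `EntrySketch`, and `entry_read_idx` are hypothetical stand-ins, the field names are only modeled on a TLB entry, and the real layout in the patch may differ.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical access-type enum, used directly as an array index. */
typedef enum {
    ACCESS_LOAD = 0,
    ACCESS_STORE = 1,
    ACCESS_FETCH = 2,
    ACCESS_COUNT = 3
} AccessType;

/* The named comparators and the indexed array share the same storage. */
typedef struct {
    union {
        struct {
            uint64_t addr_read;
            uint64_t addr_write;
            uint64_t addr_code;
        };
        uint64_t addr_idx[ACCESS_COUNT];
    };
} EntrySketch;

/* Select the comparator for an access type by plain array indexing. */
static uint64_t entry_read_idx(const EntrySketch *e, AccessType t)
{
    assert(t >= ACCESS_LOAD && t < ACCESS_COUNT);
    return e->addr_idx[t];
}
```

With the array view in the union, a caller that knows the access type needs no per-field offset arithmetic, and the named fields stay available where a fixed access type reads more clearly.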

[PATCH v4 22/54] tcg/riscv: Require TCG_TARGET_REG_BITS == 64

2023-05-03 Thread Richard Henderson
The port currently does not support "oversize" guests, which means riscv32 can only target 32-bit guests. We will soon be building TCG once for all guests. This implies that we can only support riscv64. Since all Linux distributions target riscv64 not riscv32, this is not much of a restriction

[PATCH v4 47/54] tcg/mips: Simplify constraints on qemu_ld/st

2023-05-03 Thread Richard Henderson
The softmmu tlb uses TCG_REG_TMP[0-3], not any of the normally available registers. Now that we handle overlap between inputs and helper arguments, and have eliminated use of A0, we can allow any allocatable reg. Signed-off-by: Richard Henderson --- tcg/mips/tcg-target-con-set.h | 13

[PATCH v4 08/57] target/loongarch: Do not include tcg-ldst.h

2023-05-03 Thread Richard Henderson
This header is supposed to be private to tcg and in fact does not need to be included here at all. Reviewed-by: Song Gao Signed-off-by: Richard Henderson --- target/loongarch/csr_helper.c | 1 - target/loongarch/iocsr_helper.c | 1 - 2 files changed, 2 deletions(-) diff --git

[PATCH v4 29/54] tcg/sparc64: Pass TCGType to tcg_out_qemu_{ld,st}

2023-05-03 Thread Richard Henderson
We need to set this in TCGLabelQemuLdst, so plumb this all the way through from tcg_out_op. Reviewed-by: Philippe Mathieu-Daudé Signed-off-by: Richard Henderson --- tcg/sparc64/tcg-target.c.inc | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git

[PATCH v4 04/57] accel/tcg: Reorg system mode load helpers

2023-05-03 Thread Richard Henderson
Instead of trying to unify all operations on uint64_t, pull out mmu_lookup() to perform the basic tlb hit and resolution. Create individual functions to handle access by size. Reviewed-by: Alex Bennée Signed-off-by: Richard Henderson --- accel/tcg/cputlb.c | 644

[PATCH v4 17/54] tcg/mips: Rationalize args to tcg_out_qemu_{ld,st}

2023-05-03 Thread Richard Henderson
Interpret the variable argument placement in the caller. There are several places where we already convert back from bool to type. Clean things up by using type throughout. Reviewed-by: Philippe Mathieu-Daudé Signed-off-by: Richard Henderson --- tcg/mips/tcg-target.c.inc | 186

[PATCH v4 05/57] accel/tcg: Reorg system mode store helpers

2023-05-03 Thread Richard Henderson
Instead of trying to unify all operations on uint64_t, use mmu_lookup() to perform the basic tlb hit and resolution. Create individual functions to handle access by size. Signed-off-by: Richard Henderson --- accel/tcg/cputlb.c | 408 + 1 file changed,

[PATCH v4 48/54] tcg/ppc: Reorg tcg_out_tlb_read

2023-05-03 Thread Richard Henderson
Allocate TCG_REG_TMP2. Use R0, TMP1, TMP2 instead of any of the normally allocated registers for the tlb load. Reviewed-by: Daniel Henrique Barboza Signed-off-by: Richard Henderson --- tcg/ppc/tcg-target.c.inc | 84 1 file changed, 51 insertions(+), 33

[PATCH v4 19/57] accel/tcg: Add have_lse2 support in ldst_atomicity

2023-05-03 Thread Richard Henderson
Add fast paths for FEAT_LSE2, using the detection in tcg. Signed-off-by: Richard Henderson --- accel/tcg/ldst_atomicity.c.inc | 37 ++ 1 file changed, 33 insertions(+), 4 deletions(-) diff --git a/accel/tcg/ldst_atomicity.c.inc b/accel/tcg/ldst_atomicity.c.inc

[PATCH v4 11/54] tcg/arm: Rationalize args to tcg_out_qemu_{ld,st}

2023-05-03 Thread Richard Henderson
Interpret the variable argument placement in the caller. Pass data_type instead of is_64. We need to set this in TCGLabelQemuLdst, so plumb this all the way through from tcg_out_op. Reviewed-by: Philippe Mathieu-Daudé Signed-off-by: Richard Henderson --- tcg/arm/tcg-target.c.inc | 113

[PATCH v4 50/54] tcg/ppc: Remove unused constraints A, B, C, D

2023-05-03 Thread Richard Henderson
These constraints have not been used for quite some time. Fixes: 77b73de67632 ("Use rem/div[u]_i32 drop div[u]2_i32") Reviewed-by: Daniel Henrique Barboza Reviewed-by: Philippe Mathieu-Daudé Signed-off-by: Richard Henderson --- tcg/ppc/tcg-target-con-str.h | 4 1 file changed, 4

[PATCH v4 14/57] tcg/i386: Add have_atomic16

2023-05-03 Thread Richard Henderson
Notice when Intel or AMD have guaranteed that vmovdqa is atomic. The new variable will also be used in generated code. Signed-off-by: Richard Henderson --- include/qemu/cpuid.h | 18 ++ tcg/i386/tcg-target.h | 1 + tcg/i386/tcg-target.c.inc | 27

[PATCH v4 19/54] tcg/ppc: Rationalize args to tcg_out_qemu_{ld,st}

2023-05-03 Thread Richard Henderson
Interpret the variable argument placement in the caller. Pass data_type instead of is64 -- there are several places where we already convert back from bool to type. Clean things up by using type throughout. Reviewed-by: Philippe Mathieu-Daudé Reviewed-by: Daniel Henrique Barboza

[PATCH v4 00/54] tcg: Simplify calls to load/store helpers

2023-05-03 Thread Richard Henderson
v1: https://lore.kernel.org/qemu-devel/20230408024314.3357414-1-richard.hender...@linaro.org/ v2: https://lore.kernel.org/qemu-devel/20230411010512.5375-1-richard.hender...@linaro.org/ v3: https://lore.kernel.org/qemu-devel/20230424054105.1579315-1-richard.hender...@linaro.org/ There are

[PATCH v4 46/54] tcg/mips: Reorg tlb load within prepare_host_addr

2023-05-03 Thread Richard Henderson
Compare the address vs the tlb entry with sign-extended values. This simplifies the page+alignment mask constant, and the generation of the last byte address for the misaligned test. Move the tlb addend load up, and the zero-extension down. This frees up a register, which allows us to use TMP3 as

[PATCH v4 18/54] tcg/mips: Introduce prepare_host_addr

2023-05-03 Thread Richard Henderson
Merge tcg_out_tlb_load, add_qemu_ldst_label, tcg_out_test_alignment, and some code that lived in both tcg_out_qemu_ld and tcg_out_qemu_st into one function that returns HostAddress and TCGLabelQemuLdst structures. Signed-off-by: Richard Henderson --- tcg/mips/tcg-target.c.inc | 404

[PATCH v4 38/54] tcg/arm: Convert tcg_out_qemu_{ld,st}_slow_path

2023-05-03 Thread Richard Henderson
Use tcg_out_ld_helper_args, tcg_out_ld_helper_ret, and tcg_out_st_helper_args. This allows our local tcg_out_arg_* infrastructure to be removed. Signed-off-by: Richard Henderson --- tcg/arm/tcg-target.c.inc | 140 +-- 1 file changed, 18 insertions(+), 122

[PATCH v4 51/54] tcg/ppc: Remove unused constraint J

2023-05-03 Thread Richard Henderson
Never used since its introduction. Fixes: 3d582c6179c ("tcg-ppc64: Rearrange integer constant constraints") Signed-off-by: Richard Henderson --- tcg/ppc/tcg-target-con-str.h | 1 - tcg/ppc/tcg-target.c.inc | 3 --- 2 files changed, 4 deletions(-) diff --git a/tcg/ppc/tcg-target-con-str.h

[PATCH v4 10/57] accel/tcg: Implement helper_{ld, st}*_mmu for user-only

2023-05-03 Thread Richard Henderson
TCG backends may need to defer to a helper to implement the atomicity required by a given operation. Mirror the interface used in system mode. Signed-off-by: Richard Henderson --- include/tcg/tcg-ldst.h | 6 +- accel/tcg/user-exec.c | 393 - tcg/tcg.c

[PATCH v4 09/54] tcg/aarch64: Introduce HostAddress

2023-05-03 Thread Richard Henderson
Collect the 3 potential parts of the host address into a struct. Reorg tcg_out_qemu_{ld,st}_direct to use it. Reviewed-by: Philippe Mathieu-Daudé Signed-off-by: Richard Henderson --- tcg/aarch64/tcg-target.c.inc | 86 +--- 1 file changed, 59 insertions(+), 27
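
The idea of gathering the parts of a host address into one struct can be sketched as below. This is only an illustration under stated assumptions: the type name HostAddrSketch, its fields, and host_addr_resolve() are hypothetical and are not taken from the patch. The real backend presumably records register numbers for code emission, whereas this sketch holds plain values so that the combination rule can be shown directly.

```c
#include <stdint.h>

/*
 * Hypothetical model of a three-part host address. The field names are
 * assumptions made for illustration; they are not the patch's names.
 */
typedef struct {
    uint64_t base;          /* base part of the address */
    uint64_t index;         /* index part added to the base */
    int index_is_32bit;     /* nonzero: zero-extend the low 32 bits of index */
} HostAddrSketch;

/* Combine the three parts into a single effective address. */
static uint64_t host_addr_resolve(const HostAddrSketch *h)
{
    uint64_t idx = h->index_is_32bit ? (uint64_t)(uint32_t)h->index
                                     : h->index;
    return h->base + idx;
}
```

Passing one struct rather than three loose arguments keeps the load and store emitters' signatures identical, which is the kind of cleanup the commit message describes.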

[PATCH v4 20/57] tcg: Introduce TCG_OPF_TYPE_MASK

2023-05-03 Thread Richard Henderson
Reorg TCG_OPF_64BIT and TCG_OPF_VECTOR into a two-bit field so that we can add TCG_OPF_128BIT without requiring another bit. Reviewed-by: Philippe Mathieu-Daudé Signed-off-by: Richard Henderson --- include/tcg/tcg.h| 22 -- tcg/optimize.c | 15
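
A minimal sketch of why a two-bit field helps, assuming a hypothetical flag layout: two independent flag bits can only mark "64-bit" and "vector", while a two-bit field encodes four mutually exclusive types in the same space. All names and values below (OPF_TYPE_*, OPF_SIDE_EFFECTS, opf_make, opf_type) are illustrative and are not QEMU's actual definitions.

```c
#include <assert.h>

/* Hypothetical opcode flag layout: bits 0-1 hold the type, bit 2 is a flag. */
enum {
    OPF_TYPE_MASK    = 3u,       /* two-bit type field */
    OPF_TYPE_32      = 0u,
    OPF_TYPE_64      = 1u,
    OPF_TYPE_VECTOR  = 2u,
    OPF_TYPE_128     = 3u,       /* fits without taking another bit */
    OPF_SIDE_EFFECTS = 1u << 2   /* an unrelated single-bit flag */
};

/* Build a flags word from a type value and other flag bits. */
static unsigned opf_make(unsigned type, unsigned other)
{
    assert((type & ~OPF_TYPE_MASK) == 0);   /* type must fit in the field */
    assert((other & OPF_TYPE_MASK) == 0);   /* other flags must not overlap */
    return type | other;
}

/* Extract the type field; callers compare it against OPF_TYPE_*. */
static unsigned opf_type(unsigned flags)
{
    return flags & OPF_TYPE_MASK;
}
```

With this layout a test such as "is this a 64-bit op" becomes an equality comparison on the masked field, opf_type(flags) == OPF_TYPE_64, instead of a single-bit test.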
