Version 1 was back in November: https://lore.kernel.org/qemu-devel/20221118094754.242910-1-richard.hender...@linaro.org/
The prerequisites, and there were many, are now upstream. The changes are too many to list, but I have at least fixed the clang and darwin build problems Phil reported.

The main objective here is to support Arm FEAT_LSE2, which says that any single memory access that does not cross a 16-byte boundary is atomic. This is the MO_ATOM_WITHIN16 control.

While I'm touching all of this, a secondary objective is to handle the atomicity of the IBM machines. Both Power and s390x treat misaligned accesses as atomic on the lsb of the pointer. For instance, an 8-byte access at ptr % 8 == 4 will appear as two atomic 4-byte accesses, and ptr % 4 == 2 will appear as four 2-byte accesses. This is the MO_ATOM_SUBALIGN control.

By default, accesses are atomic only if aligned, which is the current behaviour of the tcg code generator (mostly, anyway; there were bugs). This is the MO_ATOM_IFALIGN control.

Further, one can say that a large memory access is really a set of contiguous smaller accesses, and we need not provide more atomicity than that (modulo MO_ATOM_WITHIN16). This is the MO_ATMAX_* control.

While I've had a go at documenting all of this, I'm certain it could be improved, so suggestions are welcome.
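To make the three rules concrete, here is a minimal sketch of the address arithmetic they imply. It is an illustration only, not code from the series: the helper names (within16, ifalign_atomic, subalign_atom_size) are hypothetical, and it assumes access sizes are powers of two no larger than 16 bytes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helpers, not part of QEMU.  They only model the
 * rules described in the text; they perform no memory access.
 */

/* WITHIN16 rule: the access [addr, addr + size) stays inside one
 * 16-byte block, so it would be treated as a single atomic access. */
static inline bool within16(uintptr_t addr, size_t size)
{
    assert(size >= 1 && size <= 16);
    return (addr & 15) + size <= 16;
}

/* IFALIGN rule: the access is atomic as a whole only when it is
 * naturally aligned to its own (power-of-two) size. */
static inline bool ifalign_atomic(uintptr_t addr, size_t size)
{
    assert(size != 0 && (size & (size - 1)) == 0);
    return (addr & (size - 1)) == 0;
}

/* SUBALIGN rule: a misaligned access appears as a sequence of
 * atomic pieces whose size is the lowest set bit of the address,
 * capped at the access size. */
static inline size_t subalign_atom_size(uintptr_t addr, size_t size)
{
    assert(size != 0 && (size & (size - 1)) == 0);
    uintptr_t lsb = addr & -addr;   /* 0 when addr == 0 */

    if (lsb == 0 || lsb >= size) {
        return size;                /* aligned: one atomic access */
    }
    return (size_t)lsb;
}
```

With these definitions, subalign_atom_size(4, 8) is 4 (two 4-byte pieces) and subalign_atom_size(2, 8) is 2 (four 2-byte pieces), matching the examples above, while within16(12, 8) is false because the access crosses a 16-byte boundary.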

r~

Richard Henderson (30):
  include/qemu/cpuid: Introduce xgetbv_low
  include/exec/memop: Add bits describing atomicity
  accel/tcg: Add cpu_in_serial_context
  accel/tcg: Introduce tlb_read_idx
  accel/tcg: Reorg system mode load helpers
  accel/tcg: Reorg system mode store helpers
  accel/tcg: Honor atomicity of loads
  accel/tcg: Honor atomicity of stores
  tcg/tci: Use cpu_{ld,st}_mmu
  tcg: Unify helper_{be,le}_{ld,st}*
  accel/tcg: Implement helper_{ld,st}*_mmu for user-only
  tcg: Add 128-bit guest memory primitives
  meson: Detect atomic128 support with optimization
  tcg/i386: Add have_atomic16
  accel/tcg: Use have_atomic16 in ldst_atomicity.c.inc
  accel/tcg: Add aarch64 specific support in ldst_atomicity
  tcg/aarch64: Detect have_lse, have_lse2 for linux
  tcg/aarch64: Detect have_lse, have_lse2 for darwin
  accel/tcg: Add have_lse2 support in ldst_atomicity
  tcg: Introduce TCG_OPF_TYPE_MASK
  tcg: Add INDEX_op_qemu_{ld,st}_i128
  tcg/i386: Introduce tcg_out_mov2
  tcg/i386: Introduce tcg_out_testi
  tcg/i386: Use full load/store helpers in user-only mode
  tcg/i386: Replace is64 with type in qemu_ld/st routines
  tcg/i386: Mark Win64 call-saved vector regs as reserved
  tcg/i386: Examine MemOp for atomicity and alignment
  tcg/i386: Support 128-bit load/store with have_atomic16
  tcg/i386: Add vex_v argument to tcg_out_vex_modrm_pool
  tcg/i386: Honor 64-bit atomicity in 32-bit mode

 docs/devel/loads-stores.rst      |   36 +-
 docs/devel/tcg-ops.rst           |   11 +-
 meson.build                      |   52 +-
 accel/tcg/internal.h             |    5 +
 accel/tcg/tcg-runtime.h          |    3 +
 include/exec/cpu-defs.h          |    7 +-
 include/exec/cpu_ldst.h          |   26 +-
 include/exec/memop.h             |   36 +
 include/qemu/cpuid.h             |   25 +
 include/tcg/tcg-ldst.h           |   70 +-
 include/tcg/tcg-opc.h            |    8 +
 include/tcg/tcg.h                |   22 +-
 tcg/aarch64/tcg-target.h         |    5 +
 tcg/arm/tcg-target.h             |    2 +
 tcg/i386/tcg-target.h            |    4 +
 tcg/loongarch64/tcg-target.h     |    1 +
 tcg/mips/tcg-target.h            |    2 +
 tcg/ppc/tcg-target.h             |    2 +
 tcg/riscv/tcg-target.h           |    2 +
 tcg/s390x/tcg-target.h           |    2 +
 tcg/sparc64/tcg-target.h         |    2 +
 tcg/tci/tcg-target.h             |    2 +
 accel/tcg/cpu-exec-common.c      |    3 +
 accel/tcg/cputlb.c               | 1767 ++++++++++++++++++------------
 accel/tcg/tb-maint.c             |    2 +-
 accel/tcg/user-exec.c            |  478 +++++---
 tcg/optimize.c                   |   15 +-
 tcg/tcg-op.c                     |  246 +++--
 tcg/tcg.c                        |    8 +-
 tcg/tci.c                        |  127 +--
 util/bufferiszero.c              |    3 +-
 accel/tcg/ldst_atomicity.c.inc   | 1370 +++++++++++++++++++++++
 tcg/aarch64/tcg-target.c.inc     |   87 +-
 tcg/arm/tcg-target.c.inc         |   45 +-
 tcg/i386/tcg-target.c.inc        | 1228 ++++++++++++++-------
 tcg/loongarch64/tcg-target.c.inc |   25 +-
 tcg/mips/tcg-target.c.inc        |   40 +-
 tcg/ppc/tcg-target.c.inc         |   30 +-
 tcg/riscv/tcg-target.c.inc       |   51 +-
 tcg/s390x/tcg-target.c.inc       |   38 +-
 tcg/sparc64/tcg-target.c.inc     |   37 +-
 tcg/tci/tcg-target.c.inc         |    3 +-
 42 files changed, 4236 insertions(+), 1692 deletions(-)
 create mode 100644 accel/tcg/ldst_atomicity.c.inc

-- 
2.34.1