This patch series provides the third and fourth slices of the MVE implementation, which gives us complete coverage of all instructions and brings us to the point where we can actually enable it.

In this series:
 * fixes for minor bugs in a couple of the insns already upstream
 * all the remaining integer instructions
 * the remaining loads and stores (scatter-gather and interleaving)
 * the floating point instructions
 * patch enabling MVE for the Cortex-M55

Things still to do:
 * MVE loads/stores should check alignment (this will depend on the
   patchset that RTH just sent out, and I didn't want to entangle the
   two features unnecessarily)
 * gdbstub support (blocked on the gdb folks nailing down what the
   XML for it should be)
 * optimization: many of the insns should have inline versions to use
   when we know we aren't doing any predication

But none of those are blockers for this landing upstream once we reopen
for 6.2.

Still to review: 03, 07, 10, 21, 26, and the new patches 36-53

thanks
-- PMM

Peter Maydell (53):
  target/arm: Note that we handle VMOVL as a special case of VSHLL
  target/arm: Print MVE VPR in CPU dumps
  target/arm: Fix MVE VSLI by 0 and VSRI by <dt>
  target/arm: Fix signed VADDV
  target/arm: Fix mask handling for MVE narrowing operations
  target/arm: Fix 48-bit saturating shifts
  target/arm: Fix MVE 48-bit SQRSHRL for small right shifts
  target/arm: Fix calculation of LTP mask when LR is 0
  target/arm: Factor out mve_eci_mask()
  target/arm: Fix VPT advance when ECI is non-zero
  target/arm: Fix VLDRB/H/W for predicated elements
  target/arm: Implement MVE VMULL (polynomial)
  target/arm: Implement MVE incrementing/decrementing dup insns
  target/arm: Factor out gen_vpst()
  target/arm: Implement MVE integer vector comparisons
  target/arm: Implement MVE integer vector-vs-scalar comparisons
  target/arm: Implement MVE VPSEL
  target/arm: Implement MVE VMLAS
  target/arm: Implement MVE shift-by-scalar
  target/arm: Move 'x' and 'a' bit definitions into vmlaldav formats
  target/arm: Implement MVE integer min/max across vector
  target/arm: Implement MVE VABAV
  target/arm: Implement MVE narrowing moves
  target/arm: Rename MVEGenDualAccOpFn to MVEGenLongDualAccOpFn
  target/arm: Implement MVE VMLADAV and VMLSLDAV
  target/arm: Implement MVE VMLA
  target/arm: Implement MVE saturating doubling multiply accumulates
  target/arm: Implement MVE VQABS, VQNEG
  target/arm: Implement MVE VMAXA, VMINA
  target/arm: Implement MVE VMOV to/from 2 general-purpose registers
  target/arm: Implement MVE VPNOT
  target/arm: Implement MVE VCTP
  target/arm: Implement MVE scatter-gather insns
  target/arm: Implement MVE scatter-gather immediate forms
  target/arm: Implement MVE interleaving loads/stores
  target/arm: Implement MVE VADD (floating-point)
  target/arm: Implement MVE VSUB, VMUL, VABD, VMAXNM, VMINNM
  target/arm: Implement MVE VCADD
  target/arm: Implement MVE VFMA and VFMS
  target/arm: Implement MVE VCMUL and VCMLA
  target/arm: Implement MVE VMAXNMA and VMINNMA
  target/arm: Implement MVE scalar fp insns
  target/arm: Implement MVE fp-with-scalar VFMA, VFMAS
  softfloat: Remove assertion preventing silencing of NaN in default-NaN mode
  target/arm: Implement MVE FP max/min across vector
  target/arm: Implement MVE fp vector comparisons
  target/arm: Implement MVE fp scalar comparisons
  target/arm: Implement MVE VCVT between floating and fixed point
  target/arm: Implement MVE VCVT between fp and integer
  target/arm: Implement MVE VCVT with specified rounding mode
  target/arm: Implement MVE VCVT between single and half precision
  target/arm: Implement MVE VRINT insns
  target/arm: Enable MVE in Cortex-M55

 docs/system/arm/emulation.rst  |    1 +
 target/arm/helper-mve.h        |  425 +++++++
 target/arm/translate-a32.h     |    2 +
 target/arm/translate.h         |    6 +
 target/arm/vec_internal.h      |   11 +
 target/arm/mve.decode          |  463 +++++++-
 target/arm/t32.decode          |    1 +
 target/arm/cpu.c               |    3 +
 target/arm/cpu_tcg.c           |    7 +-
 target/arm/mve_helper.c        | 1899 +++++++++++++++++++++++++++++++-
 target/arm/translate-mve.c     | 1154 ++++++++++++++++++-
 target/arm/translate-neon.c    |    6 -
 target/arm/translate-vfp.c     |    2 +-
 target/arm/translate.c         |   33 +
 target/arm/vec_helper.c        |   14 +-
 fpu/softfloat-specialize.c.inc |    1 -
 16 files changed, 3911 insertions(+), 117 deletions(-)

--
2.20.1