On 6/28/21 6:58 AM, Peter Maydell wrote:
The initial implementation of the MVE VRMLALDAVH and VRMLSLDAVH
insns had some bugs:
* the 32x32 multiply of elements was being done as 32x32->32,
  not 32x32->64
* we were incorrectly maintaining the accumulator in its full
  72-bit form across all 4 beats of the insn; in the pseudocode
  it is squashed back into the 64 bits of the RdaHi:RdaLo
  registers after each beat
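As a rough illustration of the first bug (not the code from the patch), the sketch below contrasts a multiply truncated to 32 bits with one widened to 64 bits. The function names are hypothetical and exist only for this example.

```c
#include <stdint.h>

/*
 * Hypothetical illustration of the buggy shape: the product is formed
 * in 32 bits, so its high half is already lost before the result is
 * widened to 64 bits.
 */
static int64_t mul_32x32_to_32(int32_t n, int32_t m)
{
    return (int32_t)((uint32_t)n * (uint32_t)m);
}

/*
 * Hypothetical illustration of the intended shape: one operand is
 * widened first, so the multiply is carried out in 64 bits and the
 * full product is kept.
 */
static int64_t mul_32x32_to_64(int32_t n, int32_t m)
{
    return (int64_t)n * m;
}
```

With both inputs equal to 0x10000, the truncating version returns 0 while the widening version returns 0x100000000, which is why the problem only shows up with larger element values.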
In particular, fixing the second of these allows us to recast
the implementation to avoid 128-bit arithmetic entirely.
Since the element size here is always 4, we can also drop the
parameterization of ESIZE to make the code a little more readable.
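A minimal sketch of why squashing after each beat removes the need for 128-bit arithmetic follows. It is not the helper from the patch. It assumes that the 64-bit RdaHi:RdaLo value is the top 64 bits of the 72-bit accumulator and that a rounding constant of 1 << 7 is added on each beat. It also ignores predication and the exchange and subtract variants, and it relies on >> of a negative value being an arithmetic shift, as it is with GCC and Clang. All function names are hypothetical.

```c
#include <stdint.h>

/*
 * Hypothetical per-beat step. Under the assumptions above, one beat
 * computes ((acc << 8) + product + 0x80) >> 8, which equals
 * acc + ((product + 0x80) >> 8). The rounded shift is written as
 * (product >> 8) + bit 7 of product so the addition cannot overflow.
 */
static int64_t fold_beat(int64_t acc, int64_t product)
{
    int64_t rounded = (product >> 8) + ((product >> 7) & 1);
    /* Wrap modulo 2^64, as a 64-bit register pair would. */
    return (int64_t)((uint64_t)acc + (uint64_t)rounded);
}

/*
 * Reference form that spells out the wider accumulator. It is only
 * valid while acc * 256 fits in 64 bits, so it serves as a check for
 * small values, not as a replacement.
 */
static int64_t fold_beat_wide(int64_t acc, int64_t product)
{
    int64_t wide = acc * 256 + product + 0x80;
    return wide >> 8;
}

/* Simplified four-beat loop: one 32-bit element pair per beat. */
static int64_t accumulate(int64_t acc, const int32_t n[4],
                          const int32_t m[4])
{
    for (int e = 0; e < 4; e++) {
        acc = fold_beat(acc, (int64_t)n[e] * m[e]);
    }
    return acc;
}
```

Because each beat only ever adds a value that fits in 64 bits to a 64-bit accumulator, nothing wider than int64_t is needed at any point in the loop.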
Suggested-by: Richard Henderson <richard.hender...@linaro.org>
Signed-off-by: Peter Maydell <peter.mayd...@linaro.org>
---
Richard suggested this change in review of v1 of the original
MVE-slice-1 series, but at that time I was incorrectly reading the
pseudocode as requiring the 72-bit accumulation over all four beats.
Testing with a wider range of inputs showed I was wrong...
---
target/arm/mve_helper.c | 38 +++++++++++++++++++++-----------------
1 file changed, 21 insertions(+), 17 deletions(-)
Reviewed-by: Richard Henderson <richard.hender...@linaro.org>
r~