On 6/28/21 6:58 AM, Peter Maydell wrote:
The initial implementation of the MVE VRMLALDAVH and VRMLSLDAVH
insns had some bugs:
  * the 32x32 multiply of elements was being done as 32x32->32,
    not 32x32->64
  * we were incorrectly maintaining the accumulator in its full
    72-bit form across all 4 beats of the insn; in the pseudocode
    it is squashed back into the 64 bits of the RdaHi:RdaLo
    registers after each beat

In particular, fixing the second of these allows us to recast
the implementation to avoid 128-bit arithmetic entirely.
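
A minimal sketch of the two points above, not the code from the patch. It assumes each beat adds the 64-bit product into the accumulator at an 8-bit offset with a rounding constant of 1 << 7, and that the low 8 bits are dropped again when the result is written back to RdaHi:RdaLo. That is one reading of the pseudocode and should be checked against the Arm ARM. The function names are hypothetical.

```c
#include <stdint.h>

/* The first bug in miniature: a 32x32->32 multiply loses the high half.
 * The unsigned casts only avoid signed-overflow undefined behaviour. */
static int32_t mul_narrow(int32_t n, int32_t m)
{
    return (int32_t)((uint32_t)n * (uint32_t)m);
}

/* The intended 32x32->64 multiply: widen one operand first. */
static int64_t mul_wide(int32_t n, int32_t m)
{
    return (int64_t)n * m;
}

/*
 * One beat of the accumulation, under the assumption stated above:
 *     acc' = ((acc << 8) + mul + (1 << 7)) >> 8
 * Because acc is squashed back to 64 bits after every beat, this equals
 *     acc' = acc + (mul >> 8) + ((mul >> 7) & 1)
 * which needs nothing wider than 64 bits. The right shift of a negative
 * value is assumed to be arithmetic, as it is on the usual compilers.
 */
static uint64_t accumulate_beat(uint64_t acc, int32_t n, int32_t m)
{
    int64_t mul = mul_wide(n, m);

    mul = (mul >> 8) + ((mul >> 7) & 1);
    return acc + (uint64_t)mul;
}
```

For example, mul_narrow(0x10000, 0x10000) wraps to 0 while mul_wide gives 1 << 32, and accumulate_beat(0, 1, 128) rounds up to 1 while accumulate_beat(0, 1, 127) stays at 0.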

Since the element size here is always 4, we can also drop the
parameterization of ESIZE to make the code a little more readable.

Suggested-by: Richard Henderson <richard.hender...@linaro.org>
Signed-off-by: Peter Maydell <peter.mayd...@linaro.org>
---
Richard suggested this change in review of v1 of the original
MVE-slice-1 series, but at that time I was incorrectly reading the
pseudocode as requiring the 72-bit accumulation over all four beats.
Testing with a wider range of inputs showed I was wrong...
---
  target/arm/mve_helper.c | 38 +++++++++++++++++++++-----------------
  1 file changed, 21 insertions(+), 17 deletions(-)

Reviewed-by: Richard Henderson <richard.hender...@linaro.org>

r~
