On 5/6/25 08:26, WANG Rui wrote:
This patch fixes incorrect results for [xv]fnm{add,sub}.{s,d}
instructions when rounding toward zero, postive, negative.
According to the LoongArch ISA specification, these instructions perform
a fused multiply-add followed by a negation of the final result.
Previously, the sign inversion was applied before fused operation, which
interfered with rounding decisions that depend on the result sign --
leading to deviations from the expected behavior. This patch corrects
the implementation by applying the negation after fused multiply-add,
ensuring that rounding is performed on the correct intermediate result.
Reported-by: mengqinggang <mengqingg...@loongson.cn>
Signed-off-by: WANG Rui <wang...@loongson.cn>
---
target/loongarch/tcg/fpu_helper.c | 8 ++++++++
target/loongarch/tcg/vec_helper.c | 7 ++++++-
tests/tcg/loongarch64/Makefile.target | 2 ++
tests/tcg/loongarch64/test_fnmsub.c | 20 ++++++++++++++++++++
tests/tcg/loongarch64/test_vfnmsub.c | 27 +++++++++++++++++++++++++++
5 files changed, 63 insertions(+), 1 deletion(-)
create mode 100644 tests/tcg/loongarch64/test_fnmsub.c
create mode 100644 tests/tcg/loongarch64/test_vfnmsub.c
diff --git a/target/loongarch/tcg/fpu_helper.c
b/target/loongarch/tcg/fpu_helper.c
index fc3fd0561e..970b88ac56 100644
--- a/target/loongarch/tcg/fpu_helper.c
+++ b/target/loongarch/tcg/fpu_helper.c
@@ -389,9 +389,13 @@ uint64_t helper_fmuladd_s(CPULoongArchState *env, uint64_t
fj,
uint64_t fk, uint64_t fa, uint32_t flag)
{
uint64_t fd;
+ uint32_t neg_res;
+ neg_res = flag & float_muladd_negate_result;
+ flag ^= neg_res;
fd = nanbox_s(float32_muladd((uint32_t)fj, (uint32_t)fk,
(uint32_t)fa, flag, &env->fp_status));
+ fd |= neg_res ? 0x80000000ULL : 0;
|= cannot possibly be correct -- it would have to be ^=.
Is this a problem with the sign of a NaN result? That's the only case that isn't handled
by floatN_muladd. target/arm/ does not use the float_muladd_* flags for exactly this
reason. The LoongArch manual has
FNMSUB.D:
FR[fd] = -FP64_fusedMultiplyAdd(FR[fj], FR[fk], -FR[fa])
which matches the language in the Arm manual. Which does suggest that the negation before
on fa and the negation after on the result should be unconditional.
However, for this test case, if I'm interpreting these numbers correctly,
+ *(uint64_t *)&x = 0x4ff0000000000000UL;
+ *(uint64_t *)&y = 0x4ff0000000000000UL;
0x1.0p256
+ *(uint64_t *)&z = 0x2ff0000000000000UL;
0x1.0p-256
+
+ fesetround(FE_DOWNWARD);
+ asm("fnmsub.d %[x], %[x], %[y], %[z]\n\t"
+ :[x]"+f"(x)
+ :[y]"f"(y), [z]"f"(z));
+
+ assert(*(uint64_t *)&x == 0xdfefffffffffffffUL);
-(0x1.0p512 - epsilon)
That really should have worked as-is. I'll have a quick look.
r~