On 8/25/22 09:48, Paolo Bonzini wrote:
The DPPS (Dot Product) instruction is defined to first sum pairs of
intermediate results, then sum those values to get the final result.
i.e. (A+B)+(C+D)

We incrementally sum the results, i.e. ((A+B)+C)+D, which can result
in incorrect rounding.
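
To illustrate why the order matters, here is a minimal standalone sketch
(not the code from the patch). The helper names sum_pairwise and
sum_incremental are hypothetical, and the inputs are chosen by hand so
that single-precision rounding differs between the two orders. The
volatile temporaries are an assumption to force each intermediate to be
rounded to float even on hosts that evaluate with excess precision.

```c
/* Pairwise order as described for DPPS: (a + b) + (c + d). */
float sum_pairwise(float a, float b, float c, float d)
{
    volatile float lo = a + b;  /* rounded to single precision */
    volatile float hi = c + d;  /* rounded to single precision */
    volatile float r = lo + hi;
    return r;
}

/* Incremental order: ((a + b) + c) + d. */
float sum_incremental(float a, float b, float c, float d)
{
    volatile float r = a + b;
    r = r + c;                  /* each step rounds again */
    r = r + d;
    return r;
}
```

With a = 16777216.0f (2^24), b = 0.0f, c = 1.0f, d = 1.0f, the pairwise
order adds 2.0f to 2^24 and yields 16777218.0f, while the incremental
order adds 1.0f twice, each time rounding back to 16777216.0f.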

For consistency, also change the variable names to the ones used
in the Intel SDM and implement DPPD following the manual.

Based on a patch by Paul Brook <p...@nowt.org>.

Signed-off-by: Paolo Bonzini <pbonz...@redhat.com>
---
  target/i386/ops_sse.h | 67 ++++++++++++++++++++++---------------------
  1 file changed, 35 insertions(+), 32 deletions(-)

Reviewed-by: Richard Henderson <richard.hender...@linaro.org>


r~
