Bug ID: 85819
           Summary: conversion from __v[48]su to __v[48]sf should use FMA
           Product: gcc
           Version: 9.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot
          Reporter: kretz at kde dot org
  Target Milestone: ---

Testcase (cf.

using T = float;
using To [[gnu::vector_size(32)]] = T;
using From [[gnu::vector_size(32)]] = unsigned;

#define A2(I) (T)a[I], (T)a[1+I]
#define A4(I) A2(I), A2(2+I)
#define A8(I) A4(I), A4(4+I)

To f(From a) {
    return To{A8(0)};
}

This compiles to:
  vpand .LC0(%rip), %ymm0, %ymm1
  vpsrld $16, %ymm0, %ymm0
  vcvtdq2ps %ymm0, %ymm0
  vcvtdq2ps %ymm1, %ymm1
  vmulps .LC1(%rip), %ymm0, %ymm0
  vaddps %ymm0, %ymm1, %ymm0

The last vmulps and vaddps can be contracted into a single vfmadd132ps that
takes .LC1(%rip) and %ymm1 as its source operands.

The same is true for vector_size(16).
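
For illustration, a scalar sketch of what the sequence above computes per lane,
and of the proposed contraction. The report does not show the constants, so the
values used here are assumptions: .LC0 is taken to be the mask 0xffff and .LC1
to be 65536.0f. The function names are hypothetical. Because the product
hi * 65536.0f is exactly representable, the separate multiply and add round only
once, so the fused form should give the same result.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Per-lane model of vpand / vpsrld / vcvtdq2ps (x2) / vmulps / vaddps.
// Assumed constants (not shown in the report): mask 0xffff, scale 65536.0f.
inline float u32_to_float_mul_add(std::uint32_t x) {
    // Both halves fit in 16 bits, so the signed conversion is exact.
    float lo = static_cast<float>(static_cast<std::int32_t>(x & 0xffffu));
    float hi = static_cast<float>(static_cast<std::int32_t>(x >> 16));
    return hi * 65536.0f + lo;  // vmulps followed by vaddps
}

// Same computation with the multiply and add fused (models vfmadd132ps).
inline float u32_to_float_fma(std::uint32_t x) {
    float lo = static_cast<float>(static_cast<std::int32_t>(x & 0xffffu));
    float hi = static_cast<float>(static_cast<std::int32_t>(x >> 16));
    return std::fma(hi, 65536.0f, lo);
}

// Checks that both variants agree with the built-in conversion for one input.
inline void check(std::uint32_t x) {
    assert(u32_to_float_mul_add(x) == static_cast<float>(x));
    assert(u32_to_float_fma(x) == static_cast<float>(x));
}
```

Calling check on boundary values such as 0, 0xffff, 0x10000 and 0xffffffff
exercises both halves of the split.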
