https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93828

            Bug ID: 93828
           Summary: [10 Regression] incorrect shufps instruction emitted
                    for -march=k8
           Product: gcc
           Version: 10.0
            Status: UNCONFIRMED
          Keywords: wrong-code
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: kretz at kde dot org
  Target Milestone: ---
            Target: x86_64-*-*

Test case (https://godbolt.org/z/ramAe3):

using float2 [[gnu::vector_size(8)]] = float;
using int2 [[gnu::vector_size(8)]] = int;
float2 y = {2, 2};

int main() {
  const auto k = y == float2{2, 2};
  if (k[1] == 0)
    __builtin_abort();
  const auto a = k & int2{2, 2};
  return a[0] - 2;
}

Compile with `-O2 -march=k8`. The resulting instruction sequence uses:
  movlps xmm0, QWORD PTR y[rip]
  shufps xmm1, xmm0, 0xe5
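to place y[0] in xmm0[0] and y[1] in xmm1[0]. The shufps is missing a
preceding `movaps xmm1, xmm0` to work correctly, though: shufps selects its two
low result elements from the destination register, so xmm1[0] ends up with
whatever xmm1[1] held before the shuffle. Most/all other -march flags load
individual floats (using movss) from y.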
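
For illustration, a minimal scalar sketch of what `shufps dst, src, imm8` does
(Intel operand order). This is not GCC code; `Vec4` and `shufps_model` are
hypothetical names made up for this example, and the lane values are chosen
only to make the lanes distinguishable (the test case itself uses y = {2, 2}).

```cpp
#include <array>
#include <cstdint>

// Hypothetical 4 x float register model, lane 0 first.
using Vec4 = std::array<float, 4>;

// Scalar model of `shufps dst, src, imm8`:
// result lanes 0 and 1 are selected from dst, lanes 2 and 3 from src,
// each by a 2-bit field of imm8 (lowest field selects lane 0).
inline Vec4 shufps_model(const Vec4& dst, const Vec4& src, std::uint8_t imm) {
  return {dst[imm & 3], dst[(imm >> 2) & 3],
          src[(imm >> 4) & 3], src[(imm >> 6) & 3]};
}

// With imm8 = 0xe5 (fields 3, 2, 1, 1 from high to low), result lane 0 is
// dst[1]. Returns the value that lands in lane 0 of xmm1.
inline float lane0_after_shuffle(const Vec4& xmm1_before, const Vec4& xmm0) {
  return shufps_model(xmm1_before, xmm0, 0xe5)[0];
}
```

Assuming xmm0 = {2, 3, 0, 0} after the movlps and xmm1 holds unrelated data,
`lane0_after_shuffle(xmm1, xmm0)` yields the old xmm1[1], not y[1]. Passing
xmm0 for both arguments models the sequence with the `movaps xmm1, xmm0` in
place and yields 3, i.e. y[1].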
