https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108064

            Bug ID: 108064
           Summary: [13 Regression] apache-arrow-cpp-9.0.0 is vectored
                    incorrectly: arithmetic shift instead of logical
           Product: gcc
           Version: 13.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: slyfox at gcc dot gnu.org
  Target Milestone: ---

Initially observed the failure as an array test failure in
apache-arrow-cpp-9.0.0:

    [  FAILED  ] TestSwapEndianArrayData.RandomData

There, an array of int16_t gets byte-swapped element by element. Minimized
example:

// $ cat a.cc
typedef short int i16;

static inline i16 ByteSwap16(i16 value) {
  constexpr auto m = static_cast<i16>(0xff);
  return static_cast<i16>(((value >> 8) & m) | ((value & m) << 8));
}

__attribute__((noipa))
void swab16(i16 * d, const i16* s) {
  for (unsigned long i = 0; i < 4; i++) {
    d[i] = ByteSwap16(s[i]);
  }
}

__attribute__((noipa))
int main(void) {
  /* need to align inputs to make sure the vectorized part
     of the loop gets executed. */
  alignas(16) i16 a[4] = {0xff, 0, 0, 0};
  alignas(16) i16 b[4];
  alignas(16) i16 c[4];

  swab16(b, a);
  swab16(c, b);

  /* Contents of 'a' should be equal to 'c',
     but the gcc bug generates invalid vectorized shifts.  */
  if (a[0] != c[0])
    __builtin_trap();
}

The weekly gcc-13 snapshot (and the master branch) generates invalid code for it:

    $ ./gcc-git/bin/g++ -O3 a.cc -o a && ./a
    Illegal instruction (core dumped)
    $ ./gcc-git/bin/g++ -O0 a.cc -o a && ./a

AFAIU swab16() gets miscompiled:

  Dump of assembler code for function _Z6swab16PsPKs:
   ...
    movq   (%rsi),%xmm0
    movdqa %xmm0,%xmm1
    psllw  $0x8,%xmm0
    psraw  $0x8,%xmm1 ; <<<- should be psrlw!
    por    %xmm1,%xmm0
    movq   %xmm0,(%rdi)

Here 'gcc' loads 64 bits at a time and swaps even and odd bytes:
- 'psllw' moves odd bytes (zero-filling, ok)
- 'psraw' moves even bytes (sign-extending, bug)

As a result, whenever the sign bit of a 16-bit lane is set, 'psraw' fills the
upper byte of that lane with ones, and 'por' then overwrites the byte that
'psllw' moved there with 0xff.
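
To make the difference concrete, here is a minimal scalar sketch of what a
single 16-bit lane goes through. The helper names (lane_sll8, lane_srl8,
lane_sra8, swap_expected, swap_observed) are hypothetical and not taken from
gcc or from the test case. They only model the shift semantics of 'psllw',
'psrlw' and 'psraw' on one lane, on the assumption that the generated code is
exactly the sequence shown in the disassembly above.

```cpp
#include <cstdint>

// Hypothetical helpers: each one models a single 16-bit lane of the
// corresponding SSE2 instruction with an immediate shift count of 8.

// psllw $8: shift left, the low byte is zero-filled.
constexpr std::uint16_t lane_sll8(std::uint16_t x) {
  return static_cast<std::uint16_t>(x << 8);
}

// psrlw $8: logical shift right, the high byte is zero-filled.
constexpr std::uint16_t lane_srl8(std::uint16_t x) {
  return static_cast<std::uint16_t>(x >> 8);
}

// psraw $8: arithmetic shift right, the high byte is filled with
// copies of the sign bit (modelled explicitly to stay portable).
constexpr std::uint16_t lane_sra8(std::uint16_t x) {
  return static_cast<std::uint16_t>((x >> 8) | ((x & 0x8000u) ? 0xff00u : 0u));
}

// What ByteSwap16() asks for: psllw | psrlw.
constexpr std::uint16_t swap_expected(std::uint16_t x) {
  return static_cast<std::uint16_t>(lane_sll8(x) | lane_srl8(x));
}

// What the miscompiled loop computes: psllw | psraw.
constexpr std::uint16_t swap_observed(std::uint16_t x) {
  return static_cast<std::uint16_t>(lane_sll8(x) | lane_sra8(x));
}

// First swab16() call: a[0] = 0x00ff has a clear sign bit, both agree.
static_assert(swap_expected(0x00ff) == 0xff00, "logical swap of 0x00ff");
static_assert(swap_observed(0x00ff) == 0xff00, "arithmetic swap of 0x00ff");

// Second swab16() call: b[0] = 0xff00 has the sign bit set, they differ.
static_assert(swap_expected(0xff00) == 0x00ff, "logical swap of 0xff00");
static_assert(swap_observed(0xff00) == 0xffff, "arithmetic swap of 0xff00");
```

In this model the first call still produces the right value, which is why the
reproducer needs two swab16() calls: only the second one sees an input with
the sign bit set, so c[0] ends up as 0xffff instead of 0x00ff.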

$ ./gcc-git/bin/g++ -v |& unnix
Using built-in specs.
COLLECT_GCC=/<<NIX>>/gcc-13.0.0/bin/g++
COLLECT_LTO_WRAPPER=/<<NIX>>/gcc-13.0.0/libexec/gcc/x86_64-unknown-linux-gnu/13.0.0/lto-wrapper
Target: x86_64-unknown-linux-gnu
Configured with:
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 13.0.0 20221211 (experimental) (GCC)
