[Bug tree-optimization/101139] New: Unable to remove double byteswap in fast path

steinar+gcc at gunderson dot no via Gcc-bugs Sun, 20 Jun 2021 02:37:16 -0700

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101139


            Bug ID: 101139
           Summary: Unable to remove double byteswap in fast path
           Product: gcc
           Version: 10.2.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: steinar+gcc at gunderson dot no
  Target Milestone: ---

The following code is reduced from a real interpreter:

extern void (*a[])();
int d, e, h, l;
typedef struct {
  char ab;
} f;
f g;
short i();
short m68ki_read_imm_16() {
  short j, k;
  int b = d;
  f f = g;
  if (b < h)
    return __builtin_bswap16((&f.ab)[0]);
  k = i();
  short c = k;
  j = __builtin_bswap16(c);
  return j;
}
int b() {
  short m;
  do {
    m = m68ki_read_imm_16();
    short c = m;
    l = __builtin_bswap16(c);
    a[l]();
  } while (e);
  return e;
}

Compiling with arm-linux-gnueabihf-gcc-10 -O2 yields this sequence in b(),
where the two byteswaps end up back to back:

        b       .L11
.L15:
        ldrb    r3, [r5, #8]    @ zero_extendqisi2
        rev16   r3, r3
        uxth    r3, r3
.L10:
        rev16   r3, r3
        uxth    r3, r3

The intent of the original code was to have a reusable function that returns
its result in big-endian, while a specific caller that only needs the value
for a table lookup could swap it back, so that the double swap cancels out
entirely. GCC can normally do that, but the branch in m68ki_read_imm_16()
seems to get in the way. To be clear, I expect no rev16 instructions at all
in b() once m68ki_read_imm_16() is inlined.
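
To make the expectation concrete, here is a minimal sketch of the pattern,
not the actual interpreter code. The names read_be16, lookup_index_swapped
and lookup_index_direct are hypothetical, and the two values stand in for
the fast-path load and the slow-path call to i(). Both arms of the branch
end in a byteswap, so swapping the merged result again should be equivalent
to simply selecting the unswapped value:

```c
#include <stdint.h>

/* Hypothetical stand-in for m68ki_read_imm_16(): both paths return a
   byteswapped value. */
uint16_t read_be16(int fast, uint16_t fast_val, uint16_t slow_val) {
  if (fast)
    return __builtin_bswap16(fast_val);
  return __builtin_bswap16(slow_val);
}

/* What the caller does: swap the helper's result once more. */
uint16_t lookup_index_swapped(int fast, uint16_t fast_val, uint16_t slow_val) {
  return __builtin_bswap16(read_be16(fast, fast_val, slow_val));
}

/* What I would expect the above to reduce to after inlining: the two
   swaps cancel on each path, leaving a plain select. */
uint16_t lookup_index_direct(int fast, uint16_t fast_val, uint16_t slow_val) {
  return fast ? fast_val : slow_val;
}
```

This only illustrates the equivalence; I have not checked whether GCC
generates the same code for this sketch as for the reduced test case above.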

The problem is not ARM-specific; x86-64 shows a similarly problematic sequence:

        leaq    a(%rip), %rbx
        jmp     .L11
        .p2align 4,,10
        .p2align 3
.L15:
        movsbw  g(%rip), %ax
        rolw    $8, %ax
.L10:
        rolw    $8, %ax
        movzwl  %ax, %edx

Also verified with

gcc version 12.0.0 20210527 (experimental) [master revision
262e75d22c3:7bb6b9b2f47:9d3a953ec4d2695e9a6bfa5f22655e2aea47a973] (Debian
20210527-1)
