https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102393

            Bug ID: 102393
           Summary: Failure to optimize 2 8-bit stores into a single
                    16-bit store
           Product: gcc
           Version: 12.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: gabravier at gmail dot com
  Target Milestone: ---

#include <stdint.h>

void HeaderWriteU16LE(int offset, uint16_t value, uint8_t *RomHeader)
{
    RomHeader[offset] = value;
    RomHeader[offset + 1] = value >> 8;
}

Notwithstanding aliasing, on a little-endian target this can be optimized to
`*(uint16_t *)(RomHeader + offset) = value`. LLVM performs this
transformation, but GCC does not.
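
For reference, here is a minimal sketch of the two forms side by side. The
name HeaderWriteU16LE_merged is hypothetical and not part of the original
test case. It uses memcpy rather than the pointer cast, so the aliasing and
alignment caveats do not apply, and it assumes a little-endian target, which
is the only case where the two functions store the same bytes.

```c
#include <stdint.h>
#include <string.h>

/* Byte-wise version from the report: two separate 8-bit stores. */
void HeaderWriteU16LE(int offset, uint16_t value, uint8_t *RomHeader)
{
    RomHeader[offset] = value;          /* low byte (truncating conversion) */
    RomHeader[offset + 1] = value >> 8; /* high byte */
}

/* Hypothetical merged form: a single 16-bit store expressed with memcpy.
   Stores the same bytes as the function above only on little-endian
   targets. */
void HeaderWriteU16LE_merged(int offset, uint16_t value, uint8_t *RomHeader)
{
    memcpy(RomHeader + offset, &value, sizeof value);
}
```

Compilers commonly lower a fixed-size memcpy like this to a plain store, so
the second function is one way to write the merged store by hand.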

Sample AMD64 output for this from GCC:

HeaderWriteU16LE:
  movsx rdi, edi
  mov eax, esi
  mov BYTE PTR [rdx+rdi], sil
  mov BYTE PTR [rdx+1+rdi], ah
  ret

And from LLVM:

HeaderWriteU16LE:
  movsxd rax, edi
  mov word ptr [rdx + rax], si
  ret

PS: The equivalent pattern for 4 8-bit stores gets optimized into a single
32-bit store.
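
The exact 4-store test case is not included in this report. The following is
a sketch of what such a pattern presumably looks like; the name
HeaderWriteU32LE and the function body are assumptions modelled on the
16-bit case above.

```c
#include <stdint.h>

/* Hypothetical 32-bit analogue of HeaderWriteU16LE: four 8-bit stores
   that together write `value` in little-endian byte order. */
void HeaderWriteU32LE(int offset, uint32_t value, uint8_t *RomHeader)
{
    RomHeader[offset] = value;           /* bits 0..7   */
    RomHeader[offset + 1] = value >> 8;  /* bits 8..15  */
    RomHeader[offset + 2] = value >> 16; /* bits 16..23 */
    RomHeader[offset + 3] = value >> 24; /* bits 24..31 */
}
```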
