https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92979
Bug ID: 92979
Summary: bswap not finding a bswap with a memory load at the beginning of the instruction stream
Product: gcc
Version: 10.0
Status: UNCONFIRMED
Keywords: missed-optimization
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: pinskia at gcc dot gnu.org
Target Milestone: ---
Target: aarch64-linux-gnu

Take these two functions:

unsigned g(unsigned *a)
{
  unsigned M0 = *a & 0xff;
  unsigned M1 = (*a>>8) & 0xff;
  unsigned M2 = (*a>>16) & 0xff;
  unsigned M3 = (*a>>24) & 0xff;
  unsigned t = 0;
  t |= M0;
  t <<= 8;
  t |= M1;
  t <<= 8;
  t |= M2;
  t <<= 8;
  t |= M3;
  return t;
}

unsigned g1(unsigned a)
{
  unsigned M0 = a & 0xff;
  unsigned M1 = (a>>8) & 0xff;
  unsigned M2 = (a>>16) & 0xff;
  unsigned M3 = (a>>24) & 0xff;
  unsigned t = 0;
  t |= M0;
  t <<= 8;
  t |= M1;
  t <<= 8;
  t |= M2;
  t <<= 8;
  t |= M3;
  return t;
}

----- CUT ---

Only g1 is detected as a bswap; g is not.
The problem is that bswap is too "aggressive" in looking through the instruction stream to find loads.
I found this while implementing lowering of bit-fields.
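
For reference, here is a minimal sketch of what both functions are expected to compute. It assumes a 32-bit unsigned (written as uint32_t here) and compares against GCC's __builtin_bswap32. The names swap_value, swap_from_memory and matches_bswap32 are hypothetical and only used for illustration; they correspond to g1, g and a checking helper respectively.

```c
#include <stdint.h>

/* Value variant, same shape as g1 in the report. */
static uint32_t swap_value(uint32_t a)
{
  uint32_t t = 0;
  t |= a & 0xff;          /* lowest byte first ... */
  t <<= 8;
  t |= (a >> 8) & 0xff;
  t <<= 8;
  t |= (a >> 16) & 0xff;
  t <<= 8;
  t |= (a >> 24) & 0xff;  /* ... highest byte last */
  return t;
}

/* Pointer variant, same shape as g in the report: the bytes are
   extracted from a value loaded from memory. */
static uint32_t swap_from_memory(const uint32_t *a)
{
  uint32_t t = 0;
  t |= *a & 0xff;
  t <<= 8;
  t |= (*a >> 8) & 0xff;
  t <<= 8;
  t |= (*a >> 16) & 0xff;
  t <<= 8;
  t |= (*a >> 24) & 0xff;
  return t;
}

/* Returns nonzero when both variants agree with the GCC builtin. */
static int matches_bswap32(uint32_t x)
{
  return swap_value(x) == __builtin_bswap32(x)
      && swap_from_memory(&x) == __builtin_bswap32(x);
}
```

Since both variants are semantically a 32-bit byte swap, one would expect each of them to compile to a single byte-reverse instruction (rev on aarch64) at -O2, with the pointer variant preceded by a load. Per this report, only the value variant currently gets that treatment.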