https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110249

            Bug ID: 110249
           Summary: __builtin_unreachable helps optimisation at -O1 but
                    not at -O2
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: david at westcontrol dot com
  Target Milestone: ---

Sometimes it can be useful to use __builtin_unreachable() to give the compiler
hints that can improve optimisation.  For example, it can be used here to tell
the compiler that the parameter is always 8-byte aligned:

#include <stdint.h>
#include <string.h>

uint64_t read64(const uint64_t * p) {
    /* uintptr_t avoids a pointer-to-integer size mismatch on 32-bit targets */
    if ((uintptr_t) p % 8) {
        __builtin_unreachable();
    }
    uint64_t value;
    memcpy(&value, p, sizeof(uint64_t));
    return value;
}

For some targets, such as 32-bit ARM and especially RISC-V, this can make a
difference to the generated code.  In testing, at -O1 the compiler takes
advantage of the explicit undefined behaviour to see that the pointer is
aligned, and generates a single 64-bit load.  At -O2, however, that
information seems to be lost (perhaps due to earlier optimisation passes), and
slow unaligned-load code is generated instead.

Ideally, such optimisation information from undefined behaviour (explicit via a
builtin, or implicit via other code) should be kept, so that -O2 has at least
as much information as -O1.  An alternative would be to add a more directed
"__builtin_assume" function for this purpose.
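
To illustrate the kind of usage meant here, the following is a minimal sketch
of a user-level "assume" helper built on the existing builtin.  The names
ASSUME and read64_assume are hypothetical and are not part of GCC; the macro
only wraps the same __builtin_unreachable() pattern as above, so it would be
subject to the same -O1/-O2 difference.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not a GCC builtin: the condition is checked with
   assert() in debug builds, and becomes an optimiser hint via
   __builtin_unreachable() when NDEBUG is defined. */
#ifdef NDEBUG
#define ASSUME(cond) do { if (!(cond)) __builtin_unreachable(); } while (0)
#else
#define ASSUME(cond) assert(cond)
#endif

uint64_t read64_assume(const uint64_t * p) {
    ASSUME((uintptr_t) p % 8 == 0);   /* caller promises 8-byte alignment */
    uint64_t value;
    memcpy(&value, p, sizeof(uint64_t));
    return value;
}
```

Checking the condition in debug builds is a design choice rather than a
requirement; it makes a broken promise fail loudly instead of silently
invoking undefined behaviour.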

(I know that in this particular case, __builtin_assume_aligned is available and
works exactly as intended at -O1 and -O2, but I think this is a more general
issue than just alignments.)
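
For comparison, a sketch of the __builtin_assume_aligned variant mentioned
above.  The function name read64_aligned is only illustrative; the builtin
returns its first argument, and the compiler may assume that the returned
pointer has at least the given alignment.

```c
#include <stdint.h>
#include <string.h>

uint64_t read64_aligned(const uint64_t * p) {
    /* Use the returned pointer, not p: the alignment assumption is
       attached to the builtin's return value. */
    const uint64_t * q = __builtin_assume_aligned(p, 8);
    uint64_t value;
    memcpy(&value, q, sizeof(uint64_t));
    return value;
}
```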
