https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110249
Bug ID: 110249
Summary: __builtin_unreachable helps optimisation at -O1 but not at -O2
Product: gcc
Version: unknown
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
Assignee: unassigned at gcc dot gnu.org
Reporter: david at westcontrol dot com
Target Milestone: ---

Sometimes it can be useful to use __builtin_unreachable() to give the compiler hints that can improve optimisation. For example, it can be used here to tell the compiler that the parameter is always 8-byte aligned:

#include <stdint.h>
#include <string.h>

uint64_t read64(const uint64_t * p) {
    if ((uint64_t) p % 8 ) {
        __builtin_unreachable();
    }
    uint64_t value;
    memcpy( &value, p, sizeof(uint64_t) );
    return value;
}

For some targets, such as 32-bit ARM and especially RISC-V, this can make a difference to the generated code. In testing, when given -O1 the compiler takes advantage of the explicit undefined behaviour to see that the pointer is aligned, and generates a single 64-bit load. With -O2, however, it seems that information is lost - perhaps due to earlier optimisation passes - and now slow unaligned load code is generated.

Ideally, such optimisation information from undefined behaviour (explicit via a builtin, or implicit via other code) should be kept - -O2 should have at least as much information as -O1. An alternative would be the addition of a more directed "__builtin_assume" function that could be used.

(I know that in this particular case, __builtin_assume_aligned is available and works exactly as intended at -O1 and -O2, but I think this is a more general issue than just alignments.)
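
For comparison, here is a minimal sketch of the two forms discussed in this report. The ASSUME macro and the function names read64_assume and read64_assume_aligned are hypothetical names chosen for illustration, not GCC builtins. The macro only wraps the same __builtin_unreachable() idiom as read64, so it would presumably be affected in the same way at -O2. The second function uses the existing __builtin_assume_aligned builtin. No claim is made here about the code generated for any particular target.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper macro (an assumption for illustration, not part of GCC):
   states an assumption by making its negation unreachable. */
#define ASSUME(cond) do { if (!(cond)) __builtin_unreachable(); } while (0)

/* Same technique as read64 in the report, expressed through the macro. */
uint64_t read64_assume(const uint64_t *p)
{
    ASSUME((uintptr_t) p % 8 == 0);
    uint64_t value;
    memcpy(&value, p, sizeof value);
    return value;
}

/* Variant using __builtin_assume_aligned, which returns its first argument
   and lets the compiler assume the result is aligned to 8 bytes. */
uint64_t read64_assume_aligned(const uint64_t *p)
{
    const uint64_t *q = __builtin_assume_aligned(p, 8);
    uint64_t value;
    memcpy(&value, q, sizeof value);
    return value;
}
```

Both functions have undefined behaviour if called with a pointer that is not 8-byte aligned, which is exactly the promise being made to the compiler.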