https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80568
--- Comment #2 from Peter Cordes <peter at cordes dot ca> ---

Using ISA-extension options removes some microarchitectures from the set of CPUs that can run the code, so it would be appropriate for them to have some effect on tuning. A "generic AVX2 CPU" is much more specific than a "generic x86-64 CPU". For example, rep ret is useless with -mavx, since PhenomII doesn't support AVX (or SSE4, actually).

As it stands now, gcc doesn't have a way to tune for a "generic avx2 CPU". (i.e. only try to avoid problems on Haswell, Skylake, KNL, Excavator, and Ryzen. Don't care about things that are slow on IvyBridge, Steamroller, or Atom.) -mtune=haswell tells gcc that bsf/bsr are fast, but that's not the case on Excavator (at least it isn't on Steamroller). So -mtune=intel or -mtune=haswell aren't necessarily appropriate, especially if we're just talking about -mavx, not -mavx2.

---

In the absence of any -mtune or -march options, -mavx could imply -mtune=generic-avx, the way -march implies a tuning but can be overridden with -march=foo -mtune=bar.

Or maybe the default -mtune option should be changed to -mtune=generic-isa, so users can think of it as a tuning that looks at what -m options are enabled to decide which uarches it can ignore.

It might be easier to maintain if those tune options are limited to only disabling workaround-options like rep ret and splitting 256b loads/stores. Or maybe this suggestion would already add too much maintenance work.

---

I don't know whether -mavx256-split-unaligned-load/store is still worth it if we take SnB/IvB out of the picture. If it helps a lot for Excavator/Zen, then maybe. It probably hurts for KNL, which easily bottlenecks on decode throughput according to Agner Fog, so more instructions is definitely bad.

---

I didn't find any related bug reports, searching even on closed bugs for split unaligned load, or for -mavx256-split-unaligned-load.
I did search some (including in git for the commit that changed this), but didn't find anything. Thanks for confirming that it was an intentional bugfix.
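
For anyone who wants to look at the code-gen in question, here is a minimal test file, offered only as a sketch. The function names (add_arrays, lowest_set_bit) and the file name tune-test.c are made up for illustration and are not from the bug report or from gcc. Which instructions come out depends on the gcc version and its tuning tables, so compare the asm yourself rather than assuming any particular result.

```c
/* tune-test.c: hypothetical test file, names are illustrative only.
 *
 * Compare the generated asm under different tuning, for example:
 *   gcc -O3 -mavx -S tune-test.c                  (default -mtune=generic)
 *   gcc -O3 -mavx -mtune=haswell -S tune-test.c
 *   gcc -O3 -mavx -mno-avx256-split-unaligned-load \
 *       -mno-avx256-split-unaligned-store -S tune-test.c
 */
#include <assert.h>
#include <stddef.h>

/* The pointers are only guaranteed to be float-aligned, so if gcc
 * auto-vectorizes this loop with 256-bit vectors it has to use
 * unaligned loads and stores. Whether each 256-bit access is emitted
 * as one instruction or split into two 128-bit halves is what the
 * -mavx256-split-unaligned-load/store tuning options control. */
void add_arrays(float *restrict dst, const float *restrict a,
                const float *restrict b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = a[i] + b[i];
}

/* Index of the lowest set bit. How gcc implements __builtin_ctz can
 * depend on -mtune and on the enabled ISA extensions, which is where
 * the "bsf/bsr are fast" tuning assumption comes in. */
unsigned lowest_set_bit(unsigned x)
{
    assert(x != 0); /* __builtin_ctz(0) is undefined */
    return (unsigned)__builtin_ctz(x);
}
```

Diffing the .s output of the first two commands shows what a given gcc version changes between generic and Haswell tuning for the same -mavx ISA baseline.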