https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77308
--- Comment #59 from wilco at gcc dot gnu.org ---
(In reply to Bernd Edlinger from comment #58)
> (In reply to wilco from comment #57)
> > (In reply to Bernd Edlinger from comment #56)
> > > Agreed, I can split the patch.
> > >
> > > From what I understand, we should never emit ldrd/strd out of
> > > the memmovdi2 pattern when optimizing for speed and disable
> > > the peephole in the way I proposed it in the patch.
> >
> > No that's incorrect. Not generating LDRD when optimizing for speed means a
> > slowdown on most cores, so it is essential we keep generating LDRD whenever
> > possible.
>
> But if that is true, the current setting of prefer_lrdr_strd is wrong
> in most cores, and should be fixed.

The meaning is really: "prefer using ldrd/strd over ldm/stm in function
prolog/epilog and inlined memcpy". So it says something about performance of
large LDMs vs multiple LDRDs, rather than about performance of a single LDRD vs
2x LDR (basically LDRD doubles available memory bandwidth so is pretty much
always a good idea).

And yes I wouldn't be surprised if the setting is non-optimal for some cores.
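
To make the two cases concrete, here is a minimal sketch (the function names load_u64 and copy16 are made up for illustration and are not from the patch or the testcase). The first function is the "single LDRD vs 2x LDR" case; the second is a small fixed-size copy of the kind that may be expanded as an inlined memcpy, which is where the choice between LDM/STM and multiple LDRD/STRD would apply. What actually gets emitted depends on the target, the tuning and the known alignment, so inspect the assembly rather than assume it.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical example: one 64-bit load. On an ARM target that has LDRD,
   the compiler may use a single LDRD here instead of two LDR instructions. */
uint64_t load_u64(const uint64_t *p)
{
    return *p;
}

/* Hypothetical example: a small copy with a constant size. The compiler may
   expand this inline; on ARM the expansion could use LDM/STM or several
   LDRD/STRD pairs, depending on the tuning in effect. */
void copy16(void *dst, const void *src)
{
    memcpy(dst, src, 16);
}
```

Both functions are plain portable C, so they behave the same everywhere; only the generated instructions differ between tunings.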