[Bug target/77308] surprisingly large stack usage for sha512 on arm

wilco at gcc dot gnu.org Thu, 03 Nov 2016 08:43:11 -0700

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77308


--- Comment #59 from wilco at gcc dot gnu.org ---
(In reply to Bernd Edlinger from comment #58)
> (In reply to wilco from comment #57)
> > (In reply to Bernd Edlinger from comment #56)
> > > Agreed, I can split the patch.
> > > 
> > > From what I understand, we should never emit ldrd/strd out of
> > > the memmovdi2 pattern when optimizing for speed and disable
> > > the peephole in the way I proposed it in the patch.
> > 
> > No that's incorrect. Not generating LDRD when optimizing for speed means a
> > slowdown on most cores, so it is essential we keep generating LDRD whenever
> > possible.
> 
> But if that is true, the current setting of prefer_ldrd_strd is wrong
> for most cores, and should be fixed.

What the setting really means is: "prefer using ldrd/strd over ldm/stm in
function prologue/epilogue and inlined memcpy". So it describes the performance
of large LDMs versus multiple LDRDs, not the performance of a single LDRD
versus 2x LDR. LDRD basically doubles the available memory bandwidth, so it is
pretty much always a good idea. And yes, I wouldn't be surprised if the setting
is non-optimal for some cores.
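
To make the two cases concrete, here is a minimal sketch in C. It is not taken
from the bug report, and the function names are hypothetical. The first
function is the "single LDRD vs 2x LDR" case: one 64-bit load and store. The
second is the kind of small fixed-size copy that may be inlined, which is where
the choice between LDM/STM and several LDRD/STRD would apply. Which
instructions are actually emitted depends on the compiler version, the
optimization level and the selected -mcpu/-mtune, so check the generated
assembly rather than assuming a particular sequence.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical example: a single 64-bit load and store. On 32-bit ARM
   this can be done with one LDRD/STRD pair or with two LDR/STR pairs. */
void copy_u64(uint64_t *dst, const uint64_t *src)
{
    *dst = *src;
}

/* Hypothetical example: a small constant-size copy that the compiler
   may expand inline instead of calling memcpy. The expansion could use
   LDM/STM or a series of LDRD/STRD, depending on the tuning. */
void copy_block32(void *dst, const void *src)
{
    memcpy(dst, src, 32);
}
```

Compiling this with an ARM cross compiler using -O2 -S and different -mtune
values should show whether the output changes for a given core.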
