https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115842

--- Comment #12 from Tamar Christina <tnfchris at gcc dot gnu.org> ---
(In reply to Hongtao Liu from comment #11)
> (In reply to Tamar Christina from comment #9)
> > (In reply to Hongtao Liu from comment #8)
> > > (In reply to Tamar Christina from comment #7)
> > > > (In reply to Hongtao Liu from comment #6)
> > > > > I noticed some double-counting of cost in group-candidate (regarding loop
> > > > > invariant expressions), this modification reduces the number of instructions
> > > > > executed by ~8% for exchange_r binary compiled with -march=x86-64-v3 -O2.
> > > > > 
> > > > 
> > > > Note that this patch causes regressions on AArch64.  While exchange improves
> > > > slightly I see regressions in: leela, -5%, mcf, xz, x264, deepsjeng -2%,
> > > > geomean -1%
> > > 
> > > What options do you use? We have an AmpereOne machine and would like to see
> > > whether it's reproducible on it.
> > 
> > This was on Neoverse-V2, but it is probably reproducible on AmpereOne. The flags
> > were -mcpu=native -Ofast -fomit-frame-pointer -flto=auto
> 
> I tested my patch against the latest trunk with the same options, and I can't
> reproduce those regressions on AWS Graviton4.
> 

Sorry for the slow response. I rebased and retried with the latest trunk, and
indeed I no longer see any slowdowns.
