[Bug target/110625] [AArch64] Vect: SLP fails to vectorize a loop as the reduction_latency calculated by new costs is too large

rguenth at gcc dot gnu.org via Gcc-bugs Tue, 11 Jul 2023 03:41:48 -0700

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110625


Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Target|                            |aarch64
           Keywords|                            |missed-optimization

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
Well, I think count is handled correctly even for SLP.  Given we accumulate
'short' to 'double' we likely perform 'count' adds to the m's here and those
are chained in a simple way.  We specifically avoid creating more
reduction variables because of register pressure issues with and without SLP
if possible.  Note when you have for example three scalar reductions we will
up the number of IVs to use with SLP, so using 'count' isn't always 100%
accurate, but in the case of the testcase it should be.

But I'm not sure what "reduction-latency" tries to measure.
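
For illustration only, here is a minimal sketch of the kind of loop being
discussed. It is not the testcase from the PR (which is not quoted in this
comment); the struct, the function name and the field names m0..m3 are
hypothetical. It shows 'short' inputs accumulated into 'double' reduction
variables, where each variable forms one simple dependent chain of adds.
Presumably, once vectorized, one vector of shorts has to be widened into
several vectors of doubles, which would be where multiple chained adds per
accumulator come from.

```c
#include <stddef.h>

/* Hypothetical stand-in for the PR's testcase: four 'double'
   accumulators (the "m's") fed from 'short' inputs.  */
struct sums
{
  double m0, m1, m2, m3;
};

struct sums
accumulate (const short *in, size_t n)
{
  struct sums s = { 0.0, 0.0, 0.0, 0.0 };

  for (size_t i = 0; i < n; i++)
    {
      /* Each m is its own reduction chain: one add per iteration that
         depends on the previous iteration's value of the same m.  The
         four statements together form an SLP group.  */
      s.m0 += in[4 * i + 0];
      s.m1 += in[4 * i + 1];
      s.m2 += in[4 * i + 2];
      s.m3 += in[4 * i + 3];
    }

  return s;
}
```

Whether the vectorizer actually treats such a loop as described depends on the
target and the cost model, so this is only a sketch of the shape of the
problem.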
