https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57858
Andrew Pinski changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |8.0
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57858
--- Comment #8 from Andrew Pinski ---
(In reply to Richard Biener from comment #7)
> It was fixed by adding another loop header copying pass before
> vectorization, aka ch_vect.
But that went in way back in GCC 6 (r6-1951), but the loop header
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57858
Richard Biener changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57858
Andrew Pinski changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
--- Comment #6 from
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57858
--- Comment #4 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Actually, it isn't vectorized at all, because PRE attempts to be smart, figures
out that for the first iteration of the loop it can avoid computing the sqrt
because the result will be
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57858
--- Comment #5 from vincenzo Innocente <vincenzo.innocente at cern dot ch> ---
I remember something similar in the past:
--param max-completely-peel-times=1
sort of fixes it… (why doesn't PRE recognize that 1/(1+0) == 1, btw??
of course it is just
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57858
Jakub Jelinek <jakub at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57858
--- Comment #2 from vincenzo Innocente <vincenzo.innocente at cern dot ch> ---
actually the code for div and sqr is different already for standard SSE
c++ -std=c++11 -Ofast -S avx2sqrt.cc -ftree-vectorizer-verbose=1 -Wall ; cat avx2sqrt.s
.L2:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57858
--- Comment #3 from Marc Glisse <glisse at gcc dot gnu.org> ---
-fno-tree-pre lets it vectorize sqr as well. PRE creates a jump to the middle
of the loop body, which is nice but prevents vectorization.
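
To make the "jump to the middle of the loop body" concrete, here is a rough sketch. The reporter's test case is not quoted in this thread, so sqr_like below is a hypothetical stand-in, chosen only because its first iteration evaluates 1/sqrt(1+0). sqr_like_pre_shape is a hand-written illustration of the control flow described in comments #3 and #4, not actual GCC output; the names N, sqr_like and sqr_like_pre_shape are made up for this example.

```cpp
#include <cmath>
#include <cstddef>

constexpr std::size_t N = 1024;

// Hypothetical stand-in for the "sqr" loop: a plain counted loop.
// For i == 0 the operand is 1 + 0*step, so the result is 1.
void sqr_like(float* out, float step) {
    for (std::size_t i = 0; i < N; ++i)
        out[i] = 1.0f / std::sqrt(1.0f + static_cast<float>(i) * step);
}

// Hand-written illustration (an assumption, not compiler output) of the
// shape described above: the first-iteration value is treated as known,
// so control enters the loop body *after* the sqrt computation.
void sqr_like_pre_shape(float* out, float step) {
    std::size_t i = 0;
    float r = 1.0f;  // value of 1/sqrt(1 + 0*step), no sqrt needed
    goto store;      // jump into the middle of the loop body
    do {
        r = 1.0f / std::sqrt(1.0f + static_cast<float>(i) * step);
    store:
        out[i] = r;
        ++i;
    } while (i < N);
}
```

Both functions fill the array with the same values; the second one simply has two entries into the loop body, which is the kind of shape comment #3 says prevents vectorization. Whether GCC treats this stand-in the same way as the reporter's code is not something the sketch establishes; comparing the output of the command from comment #2 with and without -fno-tree-pre would be the way to check.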