https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88540
Richard Biener <rguenth at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
             Target|                            |x86_64-*-*, i?86-*-*
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2018-12-19
                 CC|                            |jakub at gcc dot gnu.org,
                   |                            |rguenth at gcc dot gnu.org
          Component|c                           |tree-optimization
             Blocks|                            |53947
     Ever confirmed|0                           |1
--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
This is because without -ffast-math the completely unrolled loop isn't
if-converted to MIN and thus basic-block vectorization fails. With
loop vectorization we apply if-conversion:
  _1 = (long unsigned int) n_20;
  _2 = _1 * 8;
  _3 = d1_12(D) + _2;
  _4 = *_3;
  _5 = d2_13(D) + _2;
  _6 = *_5;
  iftmp.0_9 = _4 < _6 ? _4 : _6;
  _7 = d3_14(D) + _2;
  *_7 = iftmp.0_9;
  n_16 = n_20 + 1;
and vectorize it as
  vect_iftmp.7_43 = VEC_COND_EXPR <vect__4.3_39 < vect__6.6_42, vect__4.3_39, vect__6.6_42>;
ending up as
(insn 12 11 13 (set (reg:V2DF 98 [ vect_iftmp.7 ])
        (unspec:V2DF [
                (reg:V2DF 87 [ vect__4.3 ])
                (reg:V2DF 88 [ vect__6.6 ])
            ] UNSPEC_IEEE_MIN)) "t.c":7 -1
     (nil))
and exactly the same assembly as with -ffast-math.
So the issue is either that phiopt does not if-convert the MIN pattern
to use a COND_EXPR [when the target has an IEEE MIN we can use], or
that basic-block vectorization does not perform if-conversion on
non-loop code.
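
For illustration, here is a small C sketch of why the conditional cannot
be folded to a plain MIN while NaNs and signed zeros are honored. The
helper name cond_min is hypothetical; its body is the same expression as
in the loop. As far as I can tell this operand-order dependence is also
what the SSE min instructions do (they return the second source operand
when either input is a NaN or both are zeros), which would explain why an
IEEE MIN on the target is usable here, but treat that as my reading and
not something this comment states.

```c
#include <math.h>

/* Hypothetical helper: the same conditional as in the loop body.
   Whenever the comparison is false, including when either operand
   is a NaN or both are zeros, the second operand is returned. */
static double cond_min(double a, double b)
{
    return a < b ? a : b;
}

/* Hypothetical helper: nonzero when swapping the operands changes the
   result, i.e. when the conditional does not behave like a symmetric
   MIN for this pair of inputs. */
static int order_matters(double a, double b)
{
    double x = cond_min(a, b);
    double y = cond_min(b, a);

    if (isnan(x) || isnan(y))
        return !(isnan(x) && isnan(y));
    /* +0.0 and -0.0 compare equal, so also compare the sign bits. */
    return x != y || !signbit(x) != !signbit(y);
}
```

Without -ffast-math, cond_min(NAN, 1.0) gives 1.0 while cond_min(1.0, NAN)
gives a NaN, and cond_min(0.0, -0.0) gives -0.0 while cond_min(-0.0, 0.0)
gives +0.0, so order_matters() is nonzero for both pairs. For ordinary
values such as 1.0 and 2.0 it is zero.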
You can work around this in your code with
#pragma GCC unroll 0
  for (int n = 0; n < SIZE; ++n)
    {
      d3[n] = d1[n] < d2[n] ? d1[n] : d2[n];
    }
keeping the loop and using loop vectorization.
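
A self-contained version of that workaround might look like the sketch
below. The function name min_arrays, the restrict qualifiers, and the
value of SIZE are assumptions for illustration; the original testcase is
not quoted in this comment, so adjust them to match it.

```c
/* Assumed value; the report's actual SIZE is not quoted here. */
#define SIZE 2

/* Hypothetical wrapper around the loop from the comment.  The restrict
   qualifiers are an assumption so that the arrays cannot alias. */
void min_arrays(const double *restrict d1, const double *restrict d2,
                double *restrict d3)
{
    /* An unroll factor of 0 (or 1) asks GCC not to unroll this loop,
       so the loop is still there when the loop vectorizer runs. */
#pragma GCC unroll 0
    for (int n = 0; n < SIZE; ++n)
    {
        d3[n] = d1[n] < d2[n] ? d1[n] : d2[n];
    }
}
```

Compiling with something like gcc -O3 -fopt-info-vec t.c should report
whether the loop was vectorized; the exact output depends on the GCC
version and target.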
Note that the backend could implement the fmin/fmax optabs, which would
allow more optimizations. Also, minmax_replacement in phiopt could make
use of the FMIN/FMAX IFNs when HONOR_NANS || HONOR_SIGNED_ZEROS
and the direct IFN is available.
Referenced Bugs:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
[Bug 53947] [meta-bug] vectorizer missed-optimizations