https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85324
Richard Biener changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2018-04-11
     Ever confirmed|0                           |1
--- Comment #1 from Richard Biener ---
_2 = __builtin_ia32_cvttps2dq ({ 1.0e+0, 1.0e+0, 1.0e+0, 1.0e+0 });
_3 = _2 + { 1, 1, 1, 1 };
..
_2 = __builtin_ia32_cvttps2qq128_mask ({ 1.0e+0, 1.0e+0, 1.0e+0, 1.0e+0 }, { 0, 0 }, 255);
_3 = _2 + { 1, 1 };
..
_2 = __builtin_ia32_cvttpd2dq ({ 1.0e+0, 1.0e+0 });
_3 = _2 + { 1, 1, 1, 1 };
..
_2 = __builtin_ia32_cvttpd2qq128_mask ({ 1.0e+0, 1.0e+0 }, { 0, 0 }, 255);
_3 = _2 + { 1, 1 };
..
_2 = __builtin_ia32_pmovdw128_mask ({ 1, 1, 1, 1 }, { 0, 0, 0, 0, 0, 0, 0, 0 }, 255);
_3 = _2 + { 1, 1, 1, 1, 1, 1, 1, 1 };
The middle-end has representations for all of these and can constant-fold them.
I suggest folding the builtins to middle-end codes in the target's
gimple_fold_builtin hook. For the masked cases with a mask that is not
all-ones the story may be different (exposing the mask to the middle-end
requires a two-vector "permutation" which might not combine back into the
desired masked ops), but maybe even then constant folding is beneficial in
some cases (and then good enough with the middle-end exposure?).