[Bug target/85324] missing constant propagation on SSE/AVX conversion intrinsics

2021-09-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85324

Andrew Pinski  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|normal                      |enhancement
   Last reconfirmed|2018-04-11 00:00:00         |2021-9-4

[Bug target/85324] missing constant propagation on SSE/AVX conversion intrinsics

2018-04-11 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85324

Richard Biener  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2018-04-11
     Ever confirmed|0                           |1

--- Comment #1 from Richard Biener  ---
  _2 = __builtin_ia32_cvttps2dq ({ 1.0e+0, 1.0e+0, 1.0e+0, 1.0e+0 });
  _3 = _2 + { 1, 1, 1, 1 };
..
  _2 = __builtin_ia32_cvttps2qq128_mask ({ 1.0e+0, 1.0e+0, 1.0e+0, 1.0e+0 }, { 0, 0 }, 255);
  _3 = _2 + { 1, 1 };
..
  _2 = __builtin_ia32_cvttpd2dq ({ 1.0e+0, 1.0e+0 });
  _3 = _2 + { 1, 1, 1, 1 };
..
  _2 = __builtin_ia32_cvttpd2qq128_mask ({ 1.0e+0, 1.0e+0 }, { 0, 0 }, 255);
  _3 = _2 + { 1, 1 };
..
  _2 = __builtin_ia32_pmovdw128_mask ({ 1, 1, 1, 1 }, { 0, 0, 0, 0, 0, 0, 0, 0 }, 255);
  _3 = _2 + { 1, 1, 1, 1, 1, 1, 1, 1 };

The middle-end has representations for all of these operations and can
constant-fold them.

I suggest folding the builtins to middle-end tree codes in the target's
gimple_fold_builtin hook.  For the masked variants whose mask does not
enable every lane, the story may be different: exposing the operation to
the middle-end requires a two-vector "permutation", which might not combine
back to the desired instructions.  Even then, constant folding may be
beneficial in some cases (and the middle-end exposure may then be good
enough?).
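
For illustration only, here is a rough sketch of what the "middle-end
representation" looks like when written by hand with the GNU C vector
extensions instead of the ia32 builtins.  It is not the proposed patch.
The typedef and function names (v4sf, v4si, trunc_plus_one, masked_trunc)
are made up for this example, and the bitwise select in masked_trunc is
just one assumed way to model a partial mask.  __builtin_convertvector
needs a reasonably recent GCC or Clang.

```c
/* Generic 128-bit vector types; the names are local to this sketch.  */
typedef float v4sf __attribute__ ((vector_size (16)));
typedef int   v4si __attribute__ ((vector_size (16)));

/* Truncating float -> int conversion expressed without a target builtin,
   followed by the same "+ 1" as in the dumps above.  */
v4si
trunc_plus_one (v4sf x)
{
  v4si t = __builtin_convertvector (x, v4si);
  return t + (v4si) { 1, 1, 1, 1 };
}

/* Assumed model of a masked conversion: lanes whose lane_mask element is
   all-ones (-1) take the converted value, lanes with 0 keep FALLBACK.
   This is the two-input select that a partial mask would expose.  */
v4si
masked_trunc (v4sf x, v4si fallback, v4si lane_mask)
{
  v4si t = __builtin_convertvector (x, v4si);
  return (t & lane_mask) | (fallback & ~lane_mask);
}
```

Calling trunc_plus_one on { 1.0f, 1.0f, 1.0f, 1.0f } yields { 2, 2, 2, 2 }.
Because the conversion is no longer hidden behind a target builtin, the
compiler has a chance to fold that to a constant when the argument is
known.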