[Bug tree-optimization/112457] Possible better vectorization of different reduction min/max reduction
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112457

--- Comment #5 from Richard Biener ---
You want to find the duplicate bug report for the min/max + index reductions. IIRC the issue is that we fail the reduction detection because of multi-use, and we should really have two conditional reductions, one on the value and one on the index, without trying to be too clever by combining them into a single one. That is, don't try to invent something completely new based on what LLVM does, but understand what's missing in GCC's handling of conditional reductions (it can do conditional value and conditional index reductions just fine, just not both at the same time, IIRC).
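To make the distinction in comment #5 concrete, here is a minimal C sketch of the three loop shapes being discussed. The function names (max_value, last_index_above, argmax) are hypothetical and do not come from the bug report. The sketch only illustrates the source-level patterns and makes no claim about what any particular GCC version emits for them.

```c
/* Value-only reduction: the loop carries just the running maximum. */
int max_value(const int *a, int n)
{
    int max = a[0];
    for (int i = 1; i < n; ++i)
        if (max < a[i])
            max = a[i];
    return max;
}

/* Index-only conditional reduction: the loop carries just an index,
   and the condition does not depend on another reduction variable. */
int last_index_above(const int *a, int n, int threshold)
{
    int idx = -1;
    for (int i = 0; i < n; ++i)
        if (a[i] > threshold)
            idx = i;
    return idx;
}

/* Combined min/max + index: 'max' is used both by the comparison and
   by its own update, while the same condition also guards 'idx'.
   This is the multi-use shape the comment refers to. */
int argmax(const int *a, int n, int *max_out)
{
    int max = a[0];
    int idx = 0;
    for (int i = 1; i < n; ++i)
        if (max < a[i]) {
            max = a[i];
            idx = i;
        }
    *max_out = max;
    return idx;
}
```

Because argmax uses a strict comparison, it returns the index of the first maximal element. Any approach that splits it into two separate reductions has to preserve that tie-breaking.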
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112457

Xi Ruoyao changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |xry111 at gcc dot gnu.org

--- Comment #4 from Xi Ruoyao ---
There is also:

    double test (double *p)
    {
      double ret = p[0];
      for (int i = 1; i < 4; i++)
        ret = __builtin_fmin (ret, p[i]);
      return ret;
    }

This is not vectorized. And

    double test (double *p)
    {
      double ret = __builtin_inf(); /* or __builtin_nan("") */
      for (int i = 0; i < 4; i++)
        ret = __builtin_fmin (ret, p[i]);
      return ret;
    }

is compiled to:

    _16 = .REDUC_FMIN (vect__4.7_17);
    _22 = .REDUC_FMIN ({ Inf, Inf, Inf, Inf });
    _20 = .FMIN (_16, _22); [tail call]
    return _20;

So there is a redundant .FMIN operation.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112457

--- Comment #3 from JuzheZhong ---
Created attachment 56973
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56973=edit
min/max reduction approach with index

Hi, Richi. I have watched all the slides and videos from the 2023 LLVM Developers' Meeting. It turns out they already have a feasible approach for supporting min/max reduction with index. Is it OK if I add support for it by following the LLVM approach? The attachment is the slide deck from the LLVM Developers' Meeting that covers min/max reduction with index. Thanks.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112457

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|c                           |tree-optimization
             Blocks|                            |53947

--- Comment #2 from Richard Biener ---
Well, this is because MAX_EXPR detection fails when store motion inserts flags (the max = max is elided) to avoid store-data races. Also, when using -Ofast we avoid this, but then the next phiopt pass comes too late to discover the MAX after store motion is applied. The more practical example is

    int foo2 (int max, int n, int * __restrict a)
    {
      for (int i = 0; i < n; ++i)
        if (max < a[i])
          {
            max = a[i];
          }
      return max;
    }

and that's handled OK. For your second example, index reduction, there are already bug reports.

Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
[Bug 53947] [meta-bug] vectorizer missed-optimizations
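As a rough illustration of the store-motion point in comment #2, the sketch below contrasts a loop whose running maximum lives in memory with a hand-written approximation of the same loop after the store has been moved out under a flag. Both functions and their names (max_in_memory, max_after_store_motion) are assumptions made for illustration. The original test case is not quoted in this thread, and this is not actual compiler output.

```c
/* Assumed input shape: the running maximum is updated through memory. */
void max_in_memory(int *__restrict max, int n, const int *__restrict a)
{
    for (int i = 0; i < n; ++i)
        if (*max < a[i])
            *max = a[i];
}

/* Hand-written approximation of the loop after store motion with a
   flag: the value is kept in a temporary, and the store back to *max
   happens only if the loop stored at least once. The unconditional
   "max = max" write-back is thereby avoided, which is what prevents
   introducing a store-data race. */
void max_after_store_motion(int *__restrict max, int n,
                            const int *__restrict a)
{
    int tmp = *max;
    _Bool stored = 0;
    for (int i = 0; i < n; ++i)
        if (tmp < a[i]) {
            tmp = a[i];
            stored = 1;   /* extra flag update inside the conditional */
        }
    if (stored)
        *max = tmp;
}
```

In the second form the conditional block updates both the temporary and the flag, so the loop body is no longer a plain "max = max < a[i] ? a[i] : max" pattern. That is one way to read the comment's statement that MAX_EXPR detection fails here, whereas foo2, which keeps max in a local variable, has no store to move.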