[Bug tree-optimization/113281] [11/12/13 Regression] Latent wrong code due to vectorization of shift reduction and missing promotions since r9-1590

2024-05-31 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113281

--- Comment #30 from GCC Commits  ---
The releases/gcc-13 branch has been updated by Richard Sandiford:

https://gcc.gnu.org/g:2602b71103d5ef2ef86000cac832b31dad3dfe2b

commit r13-8813-g2602b71103d5ef2ef86000cac832b31dad3dfe2b
Author: Richard Sandiford 
Date:   Fri May 31 15:56:05 2024 +0100

vect: Tighten vect_determine_precisions_from_range [PR113281]

This was another PR caused by the way that
vect_determine_precisions_from_range handles shifts.  We tried to
narrow 32768 >> x to a 16-bit shift based on range information for
the inputs and outputs, with vect_recog_over_widening_pattern
(after PR110828) adjusting the shift amount.  But this doesn't
work for the case where x is in [16, 31], since then 32-bit
32768 >> x is a well-defined zero, whereas no well-defined
16-bit 32768 >> y will produce 0.
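
The asymmetry described above can be checked with a small scalar sketch. This is not vectorizer code: `shr32`, `shr16`, `wide_shift_is_zero` and `narrow_shift_can_be_zero` are hypothetical names, and `shr16` only models a 16-bit lane by restricting the shift amount to [0, 15].

```c
#include <assert.h>
#include <stdint.h>

/* 32-bit shift: well defined for any amount in [0, 31]. */
static uint32_t shr32(uint32_t value, unsigned amount)
{
    assert(amount < 32);
    return value >> amount;
}

/* Models one 16-bit vector lane (an assumption for illustration):
   only amounts in [0, 15] are treated as well defined. */
static uint16_t shr16(uint16_t value, unsigned amount)
{
    assert(amount < 16);
    return (uint16_t)(value >> amount);
}

/* Returns 1 if every 32-bit shift of 32768 by an amount in [16, 31]
   yields 0. */
int wide_shift_is_zero(void)
{
    for (unsigned x = 16; x < 32; x++)
        if (shr32(32768u, x) != 0)
            return 0;
    return 1;
}

/* Returns 1 if some well-defined 16-bit shift of 32768 yields 0. */
int narrow_shift_can_be_zero(void)
{
    for (unsigned y = 0; y < 16; y++)
        if (shr16(32768u, y) == 0)
            return 1;
    return 0;
}
```

The first check succeeds and the second finds no such amount, so no adjusted shift amount y can make the 16-bit form reproduce the 32-bit result for x in [16, 31].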

We could perhaps generate x < 16 ? 32768 >> x : 0 instead,
but since vect_determine_precisions_from_range was never really
supposed to rely on fix-ups, it seems better to fix that instead.

The patch also makes the code more selective about which codes
can be narrowed based on input and output ranges.  This showed
that vect_truncatable_operation_p was missing cases for
BIT_NOT_EXPR (equivalent to BIT_XOR_EXPR of -1) and NEGATE_EXPR
(equivalent to BIT_NOT_EXPR followed by a PLUS_EXPR of 1).
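
A minimal sketch of why these two codes are safe to truncate, assuming 32-bit wide and 16-bit narrow operations. `low16` and `not_and_negate_truncate` are hypothetical helpers for illustration only, not part of GCC.

```c
#include <assert.h>
#include <stdint.h>

/* Keep only the low 16 bits of a 32-bit result. */
static uint16_t low16(uint32_t v)
{
    return (uint16_t)(v & 0xffffu);
}

/* Returns 1 if, for this x:
   - ~x equals x ^ -1,
   - -x equals ~x + 1,
   - both operations give the same low 16 bits whether they are applied
     to the full 32-bit value or to its 16-bit truncation. */
int not_and_negate_truncate(uint32_t x)
{
    uint16_t narrow = low16(x);

    if (~x != (x ^ UINT32_MAX))
        return 0;
    if ((uint32_t)(0u - x) != (uint32_t)(~x + 1u))
        return 0;
    if (low16(~x) != low16(~(uint32_t)narrow))
        return 0;
    if (low16(0u - x) != low16(0u - (uint32_t)narrow))
        return 0;
    return 1;
}
```

The low bits of the result depend only on the low bits of the input, which is the property a truncatable operation needs.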

pr113281-1.c is the original testcase.  pr113281-[23].c failed
before the patch due to overly optimistic narrowing.  pr113281-[45].c
previously passed and are meant to protect against accidental
optimisation regressions.

gcc/
	PR target/113281
	* tree-vect-patterns.cc (vect_recog_over_widening_pattern): Remove
	workaround for right shifts.
	(vect_truncatable_operation_p): Handle NEGATE_EXPR and
	BIT_NOT_EXPR.
	(vect_determine_precisions_from_range): Be more selective about
	which codes can be narrowed based on their input and output ranges.
	For shifts, require at least one more bit of precision than the
	maximum shift amount.

gcc/testsuite/
	PR target/113281
	* gcc.dg/vect/pr113281-1.c: New test.
	* gcc.dg/vect/pr113281-2.c: Likewise.
	* gcc.dg/vect/pr113281-3.c: Likewise.
	* gcc.dg/vect/pr113281-4.c: Likewise.
	* gcc.dg/vect/pr113281-5.c: Likewise.

(cherry picked from commit 1a8261e047f7a2c2b0afb95716f7615cba718cd1)
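
The shift rule in the ChangeLog entry for vect_determine_precisions_from_range can be sketched as a precision bound. This is only an illustration of the stated rule, not the code in tree-vect-patterns.cc; `min_shift_precision` and its parameters are hypothetical names.

```c
#include <assert.h>

/* Hypothetical helper: the smallest precision (in bits) allowed for a
   narrowed shift, given the precision the value range needs and the
   largest shift amount the range information permits.  The result is
   at least one bit more than the maximum shift amount. */
unsigned min_shift_precision(unsigned value_precision, unsigned max_shift)
{
    unsigned shift_precision = max_shift + 1;
    return value_precision > shift_precision ? value_precision
                                             : shift_precision;
}
```

Under this sketch, a value that fits in 16 bits with shift amounts in [0, 15] may stay at 16 bits, while shift amounts up to 31 force 32 bits, which rules out the 16-bit narrowing of 32768 >> x.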

[Bug tree-optimization/113281] [11/12/13 Regression] Latent wrong code due to vectorization of shift reduction and missing promotions since r9-1590

2024-05-31 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113281

--- Comment #29 from Robin Dapp  ---
Just to document again:  The test case should not be vectorized and at some
point we will adjust the cost model so it is not going to be.  I'd prefer to
base that decision on real uarchs rather than adjust the generic cost model
right away though.

[Bug tree-optimization/113281] [11/12/13 Regression] Latent wrong code due to vectorization of shift reduction and missing promotions since r9-1590

2024-03-13 Thread juzhe.zhong at rivai dot ai via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113281

--- Comment #28 from JuzheZhong  ---
The original cost model I wrote worked for all cases, but after some
middle-end changes it no longer does.

I don't have time to figure out what's going on here.

Robin may be interested in it.

[Bug tree-optimization/113281] [11/12/13 Regression] Latent wrong code due to vectorization of shift reduction and missing promotions since r9-1590

2024-03-13 Thread patrick at rivosinc dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113281

--- Comment #27 from Patrick O'Neill  ---
(In reply to Andrew Pinski from comment #26)
> (In reply to Edwin Lu from comment #25)
> > It's still persisting on trunk (at least for pr113281-1.c
> > https://godbolt.org/z/M9EK44hKe)
> 
> I looked into what the vectorizer produces:
>   vect__22.13_31 = (vector(8) int) vect_vec_iv_.12_8;
>   _22 = (int) a.4_25;
>   vect__12.14_33 = { 32872, 32872, 32872, 32872, 32872, 32872, 32872, 32872 } >> vect__22.13_31;
>   _12 = 32872 >> _22;
>   vect_b_7.15_34 = (vector(8) short int) vect__12.14_33;
> 
> That is a valid thing to do: do the shift in `vector(8) int` and then
> truncate. The original issue was about doing the shift in
> `vector(8) short`, which is not happening here.

The regressed testcase looks like it's testing whether RISC-V vectorizes the
code at all (the first issue Juzhe noted in comment #3 and then fixed). So this
is a performance regression for RISC-V, not a correctness regression.

[Bug tree-optimization/113281] [11/12/13 Regression] Latent wrong code due to vectorization of shift reduction and missing promotions since r9-1590

2024-03-13 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113281

--- Comment #26 from Andrew Pinski  ---
(In reply to Edwin Lu from comment #25)
> It's still persisting on trunk (at least for pr113281-1.c
> https://godbolt.org/z/M9EK44hKe)

I looked into what the vectorizer produces:
  vect__22.13_31 = (vector(8) int) vect_vec_iv_.12_8;
  _22 = (int) a.4_25;
  vect__12.14_33 = { 32872, 32872, 32872, 32872, 32872, 32872, 32872, 32872 } >> vect__22.13_31;
  _12 = 32872 >> _22;
  vect_b_7.15_34 = (vector(8) short int) vect__12.14_33;

That is a valid thing to do: do the shift in `vector(8) int` and then
truncate. The original issue was about doing the shift in `vector(8) short`,
which is not happening here.
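
A scalar sketch of what that GIMPLE does per lane, under stated assumptions: `LANES` and `shift_then_truncate` are hypothetical names, and the result type is `uint16_t` rather than `short int` so that the truncation is fully defined by the C standard. Shift amounts are assumed to be below 32.

```c
#include <assert.h>
#include <stdint.h>

#define LANES 8

/* For each lane: widen the shift amount to int, shift the constant in
   int, then truncate the result to 16 bits.  Amounts must be < 32. */
void shift_then_truncate(const uint16_t amounts[LANES], uint16_t out[LANES])
{
    for (int i = 0; i < LANES; i++) {
        int wide_amount = (int)amounts[i];  /* like the (vector(8) int) cast */
        int wide = 32872 >> wide_amount;    /* shift performed in int */
        out[i] = (uint16_t)wide;            /* truncation to 16 bits */
    }
}
```

Because the shift happens in `int`, amounts of 16 and above correctly give 0, which is the behaviour a shift performed directly in 16-bit lanes could not guarantee.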

[Bug tree-optimization/113281] [11/12/13 Regression] Latent wrong code due to vectorization of shift reduction and missing promotions since r9-1590

2024-03-13 Thread ewlu at rivosinc dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113281

Edwin Lu  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ewlu at rivosinc dot com

--- Comment #25 from Edwin Lu  ---
(In reply to Richard Sandiford from comment #24)
> Fixed on trunk so far, but it's latent on branches.  I'll see what
> the trunk fallout is like before asking about backports.

It looks like we have a regression for RISC-V.

I was going through the scan dump failures on trunk and ended up revisiting
https://github.com/patrick-rivos/gcc-postcommit-ci/issues/463 where
gcc.dg/vect/costmodel/riscv/rvv/pr113281-[125].c are failing the scan-dump
checks. I didn't realize at the time that the scan dumps were checking code
correctness and ended up ignoring it. 

It's still persisting on trunk (at least for pr113281-1.c
https://godbolt.org/z/M9EK44hKe)

A bisection on https://github.com/patrick-rivos/gcc-postcommit-ci/issues/463
commit range suggests
https://gcc.gnu.org/g:1a8261e047f7a2c2b0afb95716f7615cba718cd1 introduced it.

# first bad commit: [1a8261e047f7a2c2b0afb95716f7615cba718cd1] vect: Tighten
vect_determine_precisions_from_range [PR113281]

Configuration:
../configure --prefix=$(pwd) --with-multilib-generator="rv64gcv-lp64d--"
make stamps/build-gcc-linux-stage1 -j 32

Testing:
./build-gcc-linux-stage1/gcc/cc1 \
  ../gcc/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/pr113281-1.c \
  -march=rv64gcv -mabi=lp64d -mtune=rocket -mcmodel=medlow \
  -fdiagnostics-plain-output -march=rv64gcv_zvl256b -mabi=lp64d -O3 \
  -ftree-vectorize -ffat-lto-objects -fno-ident -o pr113281-1.s

[Bug tree-optimization/113281] [11/12/13 Regression] Latent wrong code due to vectorization of shift reduction and missing promotions since r9-1590

2024-01-31 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113281

Jakub Jelinek  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P1                          |P2
   Target Milestone|14.0                        |11.5
            Summary|[14 Regression] Wrong code  |[11/12/13 Regression]
                   |due to vectorization of     |Latent wrong code due to
                   |shift reduction and missing |vectorization of shift
                   |promotions since r14-3027   |reduction and missing
                   |                            |promotions since r9-1590