[Bug target/109964] auto-vectorization of shift ignores integral promotions

2024-04-15 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109964

Richard Biener  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #8 from Richard Biener  ---
Fixed.

[Bug target/109964] auto-vectorization of shift ignores integral promotions

2024-04-15 Thread mkretz at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109964

--- Comment #7 from Matthias Kretz (Vir)  ---
looks good to me

[Bug target/109964] auto-vectorization of shift ignores integral promotions

2024-04-15 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109964

--- Comment #6 from Richard Biener  ---
I think this has been fixed now?

[Bug target/109964] auto-vectorization of shift ignores integral promotions

2023-07-28 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109964

--- Comment #5 from Richard Biener  ---
This now hits PR110838.

[Bug target/109964] auto-vectorization of shift ignores integral promotions

2023-05-25 Thread rsandifo at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109964

--- Comment #4 from rsandifo at gcc dot gnu.org  
---
(In reply to Richard Biener from comment #3)
> So the bug in the vectorizer is that it does
> 
> t.ii:14:5: note:   can narrow to signed:16 without loss of precision: _31 =
> 1 >> _30;
> t.ii:14:5: note:   only the low 16 bits of _30 are significant
> 
> while that's correct that's not the proper constraint to check for in
> vect_recog_over_widening_pattern I think.  That has code to deal with
> overflow for plus/minus/mult but no defense against shifts
> (that's also not a vect_truncatable_operation_p, so maybe it should simply
> check that)
Like you say, vect_truncatable_operation_p already checks for operations
for which truncation is distributive.  But the function is supposed
to handle other cases too.  E.g. I think the current code is correct
for division.

But yeah, right shifts are an awkward case, because the defined
range of the second operand is much smaller than the range of the
type, and yet the defined range depends on the range of the type.
Sorry for not thinking about that at the time.

[Bug target/109964] auto-vectorization of shift ignores integral promotions

2023-05-25 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109964

--- Comment #3 from Richard Biener  ---
So the bug in the vectorizer is that it does

t.ii:14:5: note:   can narrow to signed:16 without loss of precision: _31 = 1
>> _30;
t.ii:14:5: note:   only the low 16 bits of _30 are significant

while that's correct that's not the proper constraint to check for in
vect_recog_over_widening_pattern I think.  That has code to deal with
overflow for plus/minus/mult but no defense against shifts
(that's also not a vect_truncatable_operation_p, so maybe it should simply
check that)

[Bug target/109964] auto-vectorization of shift ignores integral promotions

2023-05-25 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109964

Richard Biener  changed:

   What|Removed |Added

 CC||amonakov at gcc dot gnu.org

--- Comment #2 from Richard Biener  ---
The simplest thing would be to make this a target bug for their vec_shr
expander not fulfilling the (undocumented) semantics.

[Bug target/109964] auto-vectorization of shift ignores integral promotions

2023-05-25 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109964

Richard Biener  changed:

   What|Removed |Added

   Last reconfirmed||2023-05-25
 Ever confirmed|0   |1
 Status|UNCONFIRMED |NEW
 CC||rguenth at gcc dot gnu.org,
   ||rsandifo at gcc dot gnu.org

--- Comment #1 from Richard Biener  ---
using V [[gnu::vector_size(16)]] = short;

V f(V s)
{
  return V{
short(1) >> s[0],
short(1) >> s[1],
short(1) >> s[2],
short(1) >> s[3],
short(1) >> s[4],
short(1) >> s[5],
short(1) >> s[6],
short(1) >> s[7]
};
}

the frontend gives us

return  = {(short int) (1 >> (int) VIEW_CONVERT_EXPR(s)[0]), (short int) (1 >> (int) VIEW_CONVERT_EXPR(s)[1]),
(short int) (1 >> (int) VIEW_CONVERT_EXPR(s)[2]), (short int) (1
>> (int) VIEW_CONVERT_EXPR(s)[3]), (short int) (1 >> (int)
VIEW_CONVERT_EXPR(s)[4]), (short int) (1 >> (int)
VIEW_CONVERT_EXPR(s)[5]), (short int) (1 >> (int)
VIEW_CONVERT_EXPR(s)[6]), (short int) (1 >> (int)
VIEW_CONVERT_EXPR(s)[7])};

and the promotion/demotion stays until eventually basic-block vectorization
vectorizes this without doing the promotion.

t.ii:14:5: note:   vect_recog_over_widening_pattern: detected: _3 = 1 >> _2;
t.ii:14:5: note:   demoting int to signed short
t.ii:14:5: note:   created pattern stmt: patt_36 = 1 >> _1;

I don't think there's any way to query the target about whether vector
shifts "behave".

We end up with

V f (V s)
{
  vector(8) signed short vect_patt_36.3;

   [local count: 1073741824]:
  vect_patt_36.3_62 = { 1, 1, 1, 1, 1, 1, 1, 1 } >> s_33(D);
  return vect_patt_36.3_62;