[Bug tree-optimization/113677] Missing `VEC_PERM_EXPR <{a, CST}, CST, {0, 1, 2, ...}>` optimization

2024-03-07 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113677

--- Comment #5 from Andrew Pinski  ---
(In reply to Andrew Pinski from comment #4)
> Note it is not just about constants either.

That is the same as what is mentioned in
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94301#c2 even :).

[Bug tree-optimization/113677] Missing `VEC_PERM_EXPR <{a, CST}, CST, {0, 1, 2, ...}>` optimization

2024-02-06 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113677

--- Comment #4 from Andrew Pinski  ---
Note it is not just about constants either.
Take:
```
#define vect64 __attribute__((vector_size(8) ))
#define vect128 __attribute__((vector_size(16) ))

vect128 unsigned int f(vect64 unsigned int a, vect64 unsigned int b)
{
  vect64 unsigned int zero={0, 0};
  return __builtin_shufflevector (a, b, 0, 1, 2, 3);
}
```

We get:
```
  _1 = {a_3(D), { 0, 0 }};
  _2 = {b_4(D), { 0, 0 }};
  _5 = VEC_PERM_EXPR <_1, _2, { 0, 1, 4, 5 }>;
```

Which obviously could be simplified to just:
`_5 = {a_3(D), b_4(D)};`

[Bug tree-optimization/113677] Missing `VEC_PERM_EXPR <{a, CST}, CST, {0, 1, 2, ...}>` optimization

2024-01-31 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113677

Richard Biener  changed:

   What|Removed |Added

   Last reconfirmed||2024-01-31
 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1

--- Comment #3 from Richard Biener  ---
Yeah, most of the code in forwprop/match doesn't deal with the "new" permutes
where the result isn't the same length as the inputs.
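
To illustrate the shape of such a permute, here is a scalar model in plain C. It is only a sketch based on the GIMPLE dump quoted in comment #4 (inputs padded with zeros to the result length, selector `{ 0, 1, 4, 5 }`). The function names are hypothetical, and the model ignores selector wrap-around.

```c
#include <stddef.h>

/* Scalar model of a two-input permute over n-element inputs:
   selector values 0..n-1 pick from v0, values n..2n-1 pick from v1. */
void perm_model(const unsigned *v0, const unsigned *v1,
                const unsigned *sel, unsigned *out, size_t n)
{
  for (size_t i = 0; i < n; i++)
    out[i] = sel[i] < n ? v0[sel[i]] : v1[sel[i] - n];
}

/* Model of the 2+2 -> 4 lane shuffle as it appears in the dump:
   pad each input to 4 lanes with zeros, then permute with {0, 1, 4, 5}. */
void widen_shuffle_model(const unsigned a[2], const unsigned b[2],
                         unsigned out[4])
{
  unsigned pa[4] = { a[0], a[1], 0, 0 };
  unsigned pb[4] = { b[0], b[1], 0, 0 };
  unsigned sel[4] = { 0, 1, 4, 5 };
  perm_model(pa, pb, sel, out, 4);
}
```

The padding lanes are never selected, which is why the whole sequence collapses to a plain concatenation of the two inputs.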

[Bug tree-optimization/113677] Missing `VEC_PERM_EXPR <{a, CST}, CST, {0, 1, 2, ...}>` optimization

2024-01-30 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113677

Andrew Pinski  changed:

   What|Removed |Added

 Target|x86_64  |x86_64 aarch64

--- Comment #2 from Andrew Pinski  ---
Here is another example, using 64/128 on aarch64:
```
#define vect64 __attribute__((vector_size(8) ))
#define vect128 __attribute__((vector_size(16) ))

vect128 unsigned int f(vect64 unsigned int a)
{
  vect64 unsigned int zero={0, 0};
  return __builtin_shufflevector (a, zero, 0, 1, 2, 3);
}
```

We get:
```
f:
        movi    v31.4s, 0
        fmov    d0, d0
        zip1    v0.2d, v0.2d, v31.2d
```

This should just produce the `fmov` for little-endian and `mov/ins` for
big-endian.

Note that, for this part of the issue, the aarch64 back-end represents zip
using an UNSPEC where it could use VEC_CONCAT instead, and that would do the
correct thing there ...

[Bug tree-optimization/113677] Missing `VEC_PERM_EXPR <{a, CST}, CST, {0, 1, 2, ...}>` optimization

2024-01-30 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113677

Andrew Pinski  changed:

   What|Removed |Added

 Target||x86_64

--- Comment #1 from Andrew Pinski  ---
I should note that I noticed this while working on adding V4QI support for
aarch64, but it is definitely a generic issue.