[Bug target/98792] Fail to use SHRN instructions for narrowing shift on aarch64

2023-12-15 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98792

Andrew Pinski  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #4 from Andrew Pinski  ---
It was fixed in GCC 12 by one of the following commits:
r12-7142-g83d7e720cd1d07
r12-7141-gbce43c0493f65d
r12-7140-g4057266ce5afc1
r12-7138-gaeef5c57f161ad

2021-09-02 Thread pinskia at gcc dot gnu.org via Gcc-bugs

--- Comment #3 from Andrew Pinski  ---
We now produce shrn2 for the high half, but still not shrn for the low half:

.L2:
ldp q0, q1, [x0]
add x0, x0, 32
ushr v0.4s, v0.4s, 3
xtn v0.4h, v0.4s
shrn2   v0.8h, v1.4s, 3
str q0, [x1], 16
cmp x0, x2
bne .L2
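
For context, a loop of roughly the following shape would be expected to vectorize into code like the above. This is only a sketch: the function name `narrow_shift` and its signature are assumptions, and the shift amount 3 and the unsigned 32-bit to 16-bit narrowing are inferred from the assembly rather than taken from the original test case.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reproducer: shift each 32-bit element right by 3 and
   keep the low 16 bits.  On aarch64 the ideal vector body is a
   shrn/shrn2 pair rather than ushr + xtn + shrn2.  */
void narrow_shift(uint16_t *dst, const uint32_t *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = (uint16_t)(src[i] >> 3);
}
```

Compiling such a function with optimization (for example `-O3`) on an aarch64 target and inspecting the loop body is one way to check which instructions a given GCC version selects.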

2021-03-06 Thread pinskia at gcc dot gnu.org via Gcc-bugs

Andrew Pinski  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|                            |2021-03-07
             Status|UNCONFIRMED                 |NEW
     Ever confirmed|0                           |1
           Severity|normal                      |enhancement

--- Comment #2 from Andrew Pinski  ---
Confirmed.


(insn 17 16 18 3 (set (reg:V8HI 109 [ vect__3.8 ])
        (vec_concat:V8HI (truncate:V4HI (reg:V4SI 105 [ vect__2.7 ]))
            (truncate:V4HI (reg:V4SI 107 [ vect__2.7 ])))) "t9.c":9:16 1942
{vec_pack_trunc_v4si}
     (expr_list:REG_DEAD (reg:V4SI 107 [ vect__2.7 ])
        (expr_list:REG_DEAD (reg:V4SI 105 [ vect__2.7 ])
            (nil))))
(insn 18 17 19 3 (set (mem:V8HI (post_inc:DI (reg:DI 92 [ ivtmp.16 ])) [2 MEM
 [(short unsigned int *)_7]+0 S16 A128])
        (reg:V8HI 109 [ vect__3.8 ])) "t9.c":9:16 1161 {*aarch64_simd_movv8hi}
     (expr_list:REG_DEAD (reg:V8HI 109 [ vect__3.8 ])
        (expr_list:REG_INC (reg:DI 92 [ ivtmp.16 ])
            (nil))))
Part of the problem is the above: the two truncates are wrapped in a single
vec_concat (vec_pack_trunc_v4si).
So this might need to be done at the gimple level, such that we don't generate
the vec_concat in the first place.
That is, if we had the RTL for:
ushr v1.4s, v1.4s, 3
ushr v0.4s, v0.4s, 3
xtn v2.4h, v1.4s
xtn v3.4h, v0.4s
stp d3, d2, [x1], 16
I think combine would have done its job.
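
To make the lane layout concrete, here is a portable scalar model (not NEON code) of the two sequences being compared. The function names are hypothetical, and the model assumes the semantics described in this thread: ushr followed by a truncating pack of two 4 x 32-bit vectors on one side, and shrn writing the low four 16-bit lanes plus shrn2 writing the high four on the other.

```c
#include <stdint.h>

/* Model of: ushr on both inputs, then vec_pack_trunc (vec_concat of two
   truncates).  lo fills lanes 0..3, hi fills lanes 4..7.  */
void model_ushr_pack(uint16_t out[8], const uint32_t lo[4],
                     const uint32_t hi[4], unsigned shift)
{
    uint32_t slo[4], shi[4];
    for (int i = 0; i < 4; i++) {
        slo[i] = lo[i] >> shift;      /* ushr on the first vector */
        shi[i] = hi[i] >> shift;      /* ushr on the second vector */
    }
    for (int i = 0; i < 4; i++) {
        out[i] = (uint16_t)slo[i];      /* truncate into the low half */
        out[i + 4] = (uint16_t)shi[i];  /* truncate into the high half */
    }
}

/* Model of: shrn (low half) followed by shrn2 (high half); each does the
   shift and the narrowing in one step.  */
void model_shrn_shrn2(uint16_t out[8], const uint32_t lo[4],
                      const uint32_t hi[4], unsigned shift)
{
    for (int i = 0; i < 4; i++)
        out[i] = (uint16_t)(lo[i] >> shift);      /* shrn */
    for (int i = 0; i < 4; i++)
        out[i + 4] = (uint16_t)(hi[i] >> shift);  /* shrn2 */
}
```

Both models produce the same eight lanes for any shift from 1 to 16, which is the range SHRN accepts for a 32-bit to 16-bit narrowing; the missed optimization is purely about instruction selection, not about the computed values.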

2021-01-22 Thread rguenth at gcc dot gnu.org via Gcc-bugs

Richard Biener  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Blocks|                            |53947

--- Comment #1 from Richard Biener  ---
Would need such a concept, like a named pattern and a vector pattern that
recognizes it.


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
[Bug 53947] [meta-bug] vectorizer missed-optimizations