[Bug target/92658] x86 lacks vector extend / truncate

2023-06-02 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92658

--- Comment #27 from CVS Commits  ---
The master branch has been updated by hongtao Liu :

https://gcc.gnu.org/g:49337040865269e13cdc2ead276d12ecb2e9f606

commit r14-1509-g49337040865269e13cdc2ead276d12ecb2e9f606
Author: liuhongt 
Date:   Thu Jun 1 15:08:02 2023 +0800

i386: Add missing vector truncate patterns [PR92658].

Add missing insn patterns for v2si -> v2hi/v2qi and v2hi-> v2qi vector
truncate.

gcc/ChangeLog:

PR target/92658
* config/i386/mmx.md (truncv2hiv2qi2): New define_insn.
(truncv2si2): Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr92658-avx512bw-trunc-2.c: New test.

[Bug target/92658] x86 lacks vector extend / truncate

2023-05-10 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92658

--- Comment #26 from CVS Commits  ---
The master branch has been updated by Uros Bizjak :

https://gcc.gnu.org/g:608e7f3ab47fe746279c552c3574147aa3d8ee76

commit r14-666-g608e7f3ab47fe746279c552c3574147aa3d8ee76
Author: Uros Bizjak 
Date:   Wed May 10 22:40:53 2023 +0200

i386: Add missing vector extend patterns [PR92658]

Add missing insn pattern for v2qi -> v2si vector extend and named
expanders to activate generation of vector extends to 8-byte and 4-byte
vectors.

gcc/ChangeLog:

PR target/92658
* config/i386/mmx.md (sse4_1_v2qiv2si2): New insn pattern.
(v4qiv4hi2): New expander.
(v2hiv2si2): Ditto.
(v2qiv2si2): Ditto.
(v2qiv2hi2): Ditto.

gcc/testsuite/ChangeLog:

PR target/92658
* gcc.target/i386/pr92658-sse4-4b.c: New test.
* gcc.target/i386/pr92658-sse4-8b.c: New test.

[Bug target/92658] x86 lacks vector extend / truncate

2022-08-18 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92658
Bug 92658 depends on bug 95201, which changed state.

Bug 95201 Summary: Some x86 vector-extend patterns are not exercised.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95201

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED

[Bug target/92658] x86 lacks vector extend / truncate

2021-12-14 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92658

--- Comment #25 from Hongtao.liu  ---
(In reply to Uroš Bizjak from comment #5)
> Created attachment 47927 [details]
> Prototype patch v2
> 
> A couple of typos fixed.
> 
> Still doesn't vectorize v4qi->v4si, v2qi->v2di, v2hi->v2di and v4qi->v4di.
> 
> Looks like tree-opt issue, as explained in Comment #4.

v4qi->v4si, v2hi->v2di and v4qi->v4di are vectorized, but v2qi->v2di is not,
guess it's related to 16-bit vector support.

[Bug target/92658] x86 lacks vector extend / truncate

2021-12-14 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92658

Andrew Pinski  changed:

   What|Removed |Added

   Severity|normal  |enhancement

[Bug target/92658] x86 lacks vector extend / truncate

2021-01-06 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92658

--- Comment #24 from CVS Commits  ---
The master branch has been updated by hongtao Liu :

https://gcc.gnu.org/g:1b5669752426d225b0088d57d1d2fffba9625032

commit r11-6514-g1b5669752426d225b0088d57d1d2fffba9625032
Author: Hongyu Wang 
Date:   Tue Dec 29 15:14:09 2020 +0800

Adjust testcase for PR 92658

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr92658-avx512bw.c: Add
-mprefer-vector-width=512 to avoid impact of different default
mtune which gcc is built with.
* gcc.target/i386/pr92658-avx512bw-2.c: Ditto.

[Bug target/92658] x86 lacks vector extend / truncate

2020-12-28 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92658

--- Comment #23 from Hongtao.liu  ---
(In reply to Uroš Bizjak from comment #22)
> (In reply to Hongtao.liu from comment #21)
> > Add define_code_attr like aarch64/iterators.md?
> > 
> > --
> > ;; Map rtl objects to optab names
> > (define_code_attr optab [(ashift "ashl")
> >  (ashiftrt "ashr")
> >  (lshiftrt "lshr")
> >  (rotatert "rotr")
> >  (sign_extend "extend")
> >  (zero_extend "zero_extend")
> 
> Yes. Please go ahead, the patch is preapproved.

Fixed by r11-6351

[Bug target/92658] x86 lacks vector extend / truncate

2020-12-25 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92658

--- Comment #22 from Uroš Bizjak  ---
(In reply to Hongtao.liu from comment #21)
> Add define_code_attr like aarch64/iterators.md?
> 
> --
> ;; Map rtl objects to optab names
> (define_code_attr optab [(ashift "ashl")
>(ashiftrt "ashr")
>(lshiftrt "lshr")
>(rotatert "rotr")
>(sign_extend "extend")
>(zero_extend "zero_extend")

Yes. Please go ahead, the patch is preapproved.

[Bug target/92658] x86 lacks vector extend / truncate

2020-12-24 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92658

--- Comment #21 from Hongtao.liu  ---
(In reply to CVS Commits from comment #14)
> The master branch has been updated by Uros Bizjak :
> 
> https://gcc.gnu.org/g:f6e40195ec3d3b402a5f6c58dbf359479bc4cbfa
> 
> commit r11-485-gf6e40195ec3d3b402a5f6c58dbf359479bc4cbfa
> Author: Uros Bizjak 
> Date:   Tue May 19 11:25:46 2020 +0200
> 
> i386: Add missing vector zero/sign extend expanders [PR92658]
> 
> 2020-05-19  Uroš Bizjak  
> 
> gcc/ChangeLog:
> PR target/92658
> * config/i386/sse.md (v16qiv16hi2): New expander.
> (v32qiv32hi2): Ditto.
> (v8qiv8hi2): Ditto.
> (v16qiv16si2): Ditto.
> (v8qiv8si2): Ditto.
> (v4qiv4si2): Ditto.
> (v16hiv16si2): Ditto.
> (v8hiv8si2): Ditto.
> (v4hiv4si2): Ditto.
> (v8qiv8di2): Ditto.
> (v4qiv4di2): Ditto.
> (v2qiv2di2): Ditto.
> (v8hiv8di2): Ditto.
> (v4hiv4di2): Ditto.
> (v2hiv2di2): Ditto.
> (v8siv8di2): Ditto.
> (v4siv4di2): Ditto.
> (v2siv2di2): Ditto.
> 

 for sign_extend is sign_extend, but the standard pattern names for
sign_extend is extendmn. We need to to refine those expanders

Add define_code_attr like aarch64/iterators.md?

--
;; Map rtl objects to optab names
(define_code_attr optab [(ashift "ashl")
 (ashiftrt "ashr")
 (lshiftrt "lshr")
 (rotatert "rotr")
 (sign_extend "extend")
 (zero_extend "zero_extend")
--

> gcc/testsuite/ChangeLog:
> PR target/92658
> * gcc.target/i386/pr92658-sse4.c: New test.
> * gcc.target/i386/pr92658-avx2.c: New test.
> * gcc.target/i386/pr92658-avx512bw.c: New test.

[Bug target/92658] x86 lacks vector extend / truncate

2020-05-22 Thread crazylht at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92658

--- Comment #20 from Hongtao.liu  ---
(In reply to Mark Wielaard from comment #19)
> (In reply to CVS Commits from comment #18)
> > gcc/testsuite/ChangeLog:
> > * gcc.target/i386/pr92658-avx512f.c: New test.
> > * gcc.target/i386/pr92658-avx512vl.c: Ditto.
> > * gcc.target/i386/pr92658-avx512bw-trunc.c: Ditto.
> 
> Note that the second one as committed has an extra closing brace which
> causes an error:
> 
> ERROR: gcc.target/i386/pr92658-avx512vl.c: unknown dg option: \} for "}"
> 
> diff --git a/gcc/testsuite/gcc.target/i386/pr92658-avx512vl.c
> b/gcc/testsuite/gcc.target/i386/pr92658-avx512vl.c
> index 50b32f968ac3..dc50084119b5 100644
> --- a/gcc/testsuite/gcc.target/i386/pr92658-avx512vl.c
> +++ b/gcc/testsuite/gcc.target/i386/pr92658-avx512vl.c
> @@ -121,7 +121,7 @@ truncdb_128 (v16qi * dst, v4si * __restrict src)
>dst[0] = *(v16qi *) tem;
>  }
>  
> -/* { dg-final { scan-assembler-times "vpmovqd" 2 } } } */
> +/* { dg-final { scan-assembler-times "vpmovqd" 2 } } */
>  /* { dg-final { scan-assembler-times "vpmovqw" 2 { xfail *-*-* } } } */
>  /* { dg-final { scan-assembler-times "vpmovqb" 2 { xfail *-*-* } } } */
>  /* { dg-final { scan-assembler-times "vpmovdw" 1 } } */

Oh, sorry for typo.

[Bug target/92658] x86 lacks vector extend / truncate

2020-05-22 Thread mark at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92658

Mark Wielaard  changed:

   What|Removed |Added

 CC||mark at gcc dot gnu.org

--- Comment #19 from Mark Wielaard  ---
(In reply to CVS Commits from comment #18)
> gcc/testsuite/ChangeLog:
> * gcc.target/i386/pr92658-avx512f.c: New test.
> * gcc.target/i386/pr92658-avx512vl.c: Ditto.
> * gcc.target/i386/pr92658-avx512bw-trunc.c: Ditto.

Note that the second one as committed has an extra closing brace which causes
an error:

ERROR: gcc.target/i386/pr92658-avx512vl.c: unknown dg option: \} for "}"

diff --git a/gcc/testsuite/gcc.target/i386/pr92658-avx512vl.c
b/gcc/testsuite/gcc.target/i386/pr92658-avx512vl.c
index 50b32f968ac3..dc50084119b5 100644
--- a/gcc/testsuite/gcc.target/i386/pr92658-avx512vl.c
+++ b/gcc/testsuite/gcc.target/i386/pr92658-avx512vl.c
@@ -121,7 +121,7 @@ truncdb_128 (v16qi * dst, v4si * __restrict src)
   dst[0] = *(v16qi *) tem;
 }

-/* { dg-final { scan-assembler-times "vpmovqd" 2 } } } */
+/* { dg-final { scan-assembler-times "vpmovqd" 2 } } */
 /* { dg-final { scan-assembler-times "vpmovqw" 2 { xfail *-*-* } } } */
 /* { dg-final { scan-assembler-times "vpmovqb" 2 { xfail *-*-* } } } */
 /* { dg-final { scan-assembler-times "vpmovdw" 1 } } */

[Bug target/92658] x86 lacks vector extend / truncate

2020-05-22 Thread cvs-commit at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92658

--- Comment #18 from CVS Commits  ---
The master branch has been updated by hongtao Liu :

https://gcc.gnu.org/g:e740f3d73144abbca1ad98a04825c6bd63314a0b

commit r11-571-ge740f3d73144abbca1ad98a04825c6bd63314a0b
Author: liuhongt 
Date:   Wed May 20 15:53:14 2020 +0800

Add missing vector truncmn2 expanders [PR92658]

2020-05-22  Hongtao.liu  

gcc/ChangeLog:
PR target/92658
* config/i386/sse.md (trunc2): New expander
(truncv32hiv32qi2): Ditto.
(trunc2): Ditto.
(trunc2): Ditto.
(trunc2): Ditto.
(truncv2div2si2): Ditto.
(truncv8div8qi2): Ditto.
(avx512f_v8div16qi2): Renaming from
*avx512f_v8div16qi2.
(avx512vl_v2div2si): Renaming from *avx512vl_v2div2si2.
(avx512vl_v2qi2): Renaming
from *avx512vl_vqi2.

gcc/testsuite/ChangeLog:
* gcc.target/i386/pr92658-avx512f.c: New test.
* gcc.target/i386/pr92658-avx512vl.c: Ditto.
* gcc.target/i386/pr92658-avx512bw-trunc.c: Ditto.

[Bug target/92658] x86 lacks vector extend / truncate

2020-05-20 Thread crazylht at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92658

--- Comment #17 from Hongtao.liu  ---
Created attachment 48570
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=48570=edit
0001-Add-missing-vector-truncmn2-expanders-PR92658.patch

Seems there're only truncmn2 for truncate, not expander for us_truncate and
ss_truncate, am i missing?

[Bug target/92658] x86 lacks vector extend / truncate

2020-05-19 Thread crazylht at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92658

--- Comment #16 from Hongtao.liu  ---
(In reply to Uroš Bizjak from comment #15)
> I will leave truncations (Down Converts in Intel speak) which are AVX512F
> instructions to someone else. It should be easy to add missing patterns and
> tests following the example of committed patch.

I'll take a look.

[Bug target/92658] x86 lacks vector extend / truncate

2020-05-19 Thread ubizjak at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92658

Uroš Bizjak  changed:

   What|Removed |Added

   Keywords||easyhack
   Assignee|ubizjak at gmail dot com   |unassigned at gcc dot 
gnu.org
 CC||ubizjak at gmail dot com
 Status|ASSIGNED|NEW

--- Comment #15 from Uroš Bizjak  ---
I will leave truncations (Down Converts in Intel speak) which are AVX512F
instructions to someone else. It should be easy to add missing patterns and
tests following the example of committed patch.

[Bug target/92658] x86 lacks vector extend / truncate

2020-05-19 Thread cvs-commit at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92658

--- Comment #14 from CVS Commits  ---
The master branch has been updated by Uros Bizjak :

https://gcc.gnu.org/g:f6e40195ec3d3b402a5f6c58dbf359479bc4cbfa

commit r11-485-gf6e40195ec3d3b402a5f6c58dbf359479bc4cbfa
Author: Uros Bizjak 
Date:   Tue May 19 11:25:46 2020 +0200

i386: Add missing vector zero/sign extend expanders [PR92658]

2020-05-19  Uroš Bizjak  

gcc/ChangeLog:
PR target/92658
* config/i386/sse.md (v16qiv16hi2): New expander.
(v32qiv32hi2): Ditto.
(v8qiv8hi2): Ditto.
(v16qiv16si2): Ditto.
(v8qiv8si2): Ditto.
(v4qiv4si2): Ditto.
(v16hiv16si2): Ditto.
(v8hiv8si2): Ditto.
(v4hiv4si2): Ditto.
(v8qiv8di2): Ditto.
(v4qiv4di2): Ditto.
(v2qiv2di2): Ditto.
(v8hiv8di2): Ditto.
(v4hiv4di2): Ditto.
(v2hiv2di2): Ditto.
(v8siv8di2): Ditto.
(v4siv4di2): Ditto.
(v2siv2di2): Ditto.

gcc/testsuite/ChangeLog:
PR target/92658
* gcc.target/i386/pr92658-sse4.c: New test.
* gcc.target/i386/pr92658-avx2.c: New test.
* gcc.target/i386/pr92658-avx512bw.c: New test.

[Bug target/92658] x86 lacks vector extend / truncate

2020-05-15 Thread crazylht at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92658

--- Comment #13 from Hongtao.liu  ---
*** Bug 92611 has been marked as a duplicate of this bug. ***

[Bug target/92658] x86 lacks vector extend / truncate

2020-05-14 Thread rsandifo at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92658

--- Comment #12 from rsandifo at gcc dot gnu.org  
---
(In reply to rguent...@suse.de from comment #11)
> On Thu, 14 May 2020, ubizjak at gmail dot com wrote:
> 
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92658
> > 
> > --- Comment #10 from Uroš Bizjak  ---
> > The patch is ready to be pushed, it is waiting for a decision what to do 
> > with
> > failed cases.
> > 
> > Richi, should this patch move forward (eventually XFAILing failed cases), 
> > or do
> > you plan to look at the fails from the generic vectorizer POV?
> 
> I think we should go forward with the patch, either XFAILing the testcases
> or splitting out the testcase (and backend patterns that do not get
> exercised due to the issue).
> 
> I've already said in comment#8 that the issue here is optabs working
> with modes and not vector types, so it's a bit hard to use the
> same mechanism to deal with the currently failing cases.  One
> possible route would be to add V4QImode similar to how we now
> do V2SFmode with SSE but of course where do we stop ...
> 
> As said we can try to tackle this incrementally.  Maybe Richard also
> has input here?
Nothing to add really, but: yeah, the idea was that the target
would provide smaller vector modes if it can efficiently load,
store and (at least) add them.  I think it would be good to do
that for aarch64 Advanced SIMD at some point.

There are three points behind using vector modes for this:

1. It gives the target the flexibility to decide how best to
   implement the operation (All at one end, and if so, which end?
   Or spread out evenly across the vector?)  On aarch64, the
   choice would differ between SVE and Advanced SIMD.

2. It means that __attribute__((vector_size)) vectors benefit.

3. The smaller modes can be used in conjunction with
   autovectorize_vector_modes to give the loop vectoriser
   a choice between operating on packed or unpacked vectors.
   This is mostly useful for VECT_COMPARE_COSTS but can help
   with low-trip-count loops even without.

So yeah, defining V4QI sounds like the way to go if the
architecture can process it efficiently.  I get the "where
do we stop" concern, but .md mode iterators should help.

[Bug target/92658] x86 lacks vector extend / truncate

2020-05-14 Thread rguenther at suse dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92658

--- Comment #11 from rguenther at suse dot de  ---
On Thu, 14 May 2020, ubizjak at gmail dot com wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92658
> 
> --- Comment #10 from Uroš Bizjak  ---
> The patch is ready to be pushed, it is waiting for a decision what to do with
> failed cases.
> 
> Richi, should this patch move forward (eventually XFAILing failed cases), or 
> do
> you plan to look at the fails from the generic vectorizer POV?

I think we should go forward with the patch, either XFAILing the testcases
or splitting out the testcase (and backend patterns that do not get
exercised due to the issue).

I've already said in comment#8 that the issue here is optabs working
with modes and not vector types, so it's a bit hard to use the
same mechanism to deal with the currently failing cases.  One
possible route would be to add V4QImode similar to how we now
do V2SFmode with SSE but of course where do we stop ...

As said we can try to tackle this incrementally.  Maybe Richard also
has input here?

[Bug target/92658] x86 lacks vector extend / truncate

2020-05-14 Thread ubizjak at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92658

--- Comment #10 from Uroš Bizjak  ---
The patch is ready to be pushed, it is waiting for a decision what to do with
failed cases.

Richi, should this patch move forward (eventually XFAILing failed cases), or do
you plan to look at the fails from the generic vectorizer POV?

[Bug target/92658] x86 lacks vector extend / truncate

2020-03-02 Thread ubizjak at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92658

--- Comment #9 from Uroš Bizjak  ---
(In reply to rguent...@suse.de from comment #8)
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92658
> > 
> > --- Comment #3 from Uroš Bizjak  ---
> > Richi, should the following test also vectorize?
> 
> In priciple yes.  I see both cases "vectorized" to a store
> from a CTOR but then my primitive pattern matching in forwprop
> not applying because supportable_convert_operation is confused
> about a vector(4) char type having SImode and bailing out at the
> 
> 292   if (!VECTOR_MODE_P (m1) || !VECTOR_MODE_P (m2))
> 293 return false;
> 
> check.  So it looks like with just SSE2 the backend doesn't
> consider V4QImode a supported vector type.  I'm not sure we
> want to fix that but then this means regular optab checks
> won't do it here.
> 
> Interesting cases nevertheless.
> 
> Would those conversions map to good assembly?

Yes, please see attached test cases, especially pr92658-sse4.c. We have
pmovzxbd for v4qi->v4si, pmovzxbq for v2qi->v2di, and pmovzxwd for v2hi->v2di,
in addition to AVX2 pmovzxbq for v4qi->v4di. All testcases should vectorize
with a single instruction. It is also possible to introduce additional
conversions to 64bit vectors (e.g. v2qi->v2si) for TARGET_MMX_WITH_SSE targets.

Please note that there are similar sign_extend patterns and corresponding
truncate patterns with AVX512F, currently not covered by attached patch and
testcases.

[Bug target/92658] x86 lacks vector extend / truncate

2020-03-01 Thread rguenther at suse dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92658

--- Comment #8 from rguenther at suse dot de  ---
On Fri, 28 Feb 2020, ubizjak at gmail dot com wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92658
> 
> --- Comment #3 from Uroš Bizjak  ---
> Richi, should the following test also vectorize?

In priciple yes.  I see both cases "vectorized" to a store
from a CTOR but then my primitive pattern matching in forwprop
not applying because supportable_convert_operation is confused
about a vector(4) char type having SImode and bailing out at the

292   if (!VECTOR_MODE_P (m1) || !VECTOR_MODE_P (m2))
293 return false;

check.  So it looks like with just SSE2 the backend doesn't
consider V4QImode a supported vector type.  I'm not sure we
want to fix that but then this means regular optab checks
won't do it here.

Interesting cases nevertheless.

Would those conversions map to good assembly?

> --cut here--
> typedef unsigned char v16qi __attribute__((vector_size (16)));
> typedef unsigned int v4si __attribute__((vector_size (16)));
> 
> void
> foo_u8_u32 (v4si * dst, v16qi * __restrict src)
> {
>   unsigned int tem[4];
>   tem[0] = (*src)[0];
>   tem[1] = (*src)[1];
>   tem[2] = (*src)[2];
>   tem[3] = (*src)[3];
>   dst[0] = *(v4si *) tem;
> }
> 
> void
> bar_u8_u32 (v4si * dst, v16qi src)
> {
>   unsigned int tem[4];
>   tem[0] = src[0];
>   tem[1] = src[1];
>   tem[2] = src[2];
>   tem[3] = src[3];
>   dst[0] = *(v4si *) tem;
> }
> --cut here--

[Bug target/92658] x86 lacks vector extend / truncate

2020-02-28 Thread ubizjak at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92658

Uroš Bizjak  changed:

   What|Removed |Added

 CC||richard.sandiford at arm dot 
com

--- Comment #7 from Uroš Bizjak  ---
CC author of the generic functionality part.

[Bug target/92658] x86 lacks vector extend / truncate

2020-02-28 Thread ubizjak at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92658

--- Comment #6 from Uroš Bizjak  ---
Created attachment 47928
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=47928=edit
Test cases

sse4, avx2 and avx512bw test cases.

Fails:

FAIL: gcc.target/i386/pr92658-avx2.c scan-assembler-times pmovzxbq 2
FAIL: gcc.target/i386/pr92658-sse4.c scan-assembler-times pmovzxbd 2
FAIL: gcc.target/i386/pr92658-sse4.c scan-assembler-times pmovzxbq 2
FAIL: gcc.target/i386/pr92658-sse4.c scan-assembler-times pmovzxwq 2

[Bug target/92658] x86 lacks vector extend / truncate

2020-02-28 Thread ubizjak at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92658

Uroš Bizjak  changed:

   What|Removed |Added

  Attachment #47924|0   |1
is obsolete||

--- Comment #5 from Uroš Bizjak  ---
Created attachment 47927
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=47927=edit
Prototype patch v2

A couple of typos fixed.

Still doesn't vectorize v4qi->v4si, v2qi->v2di, v2hi->v2di and v4qi->v4di.

Looks like tree-opt issue, as explained in Comment #4.

[Bug target/92658] x86 lacks vector extend / truncate

2020-02-28 Thread ubizjak at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92658

--- Comment #4 from Uroš Bizjak  ---
(In reply to Uroš Bizjak from comment #3)
> Richi, should the following test also vectorize?

It doesn't vectorize because supportable_convert_operation returns false for:

(gdb) p debug_generic_expr (vectype_out)
vector(4) unsigned int
$1 = void
(gdb) p debug_generic_expr (vectype_in)
vector(4) unsigned char
$2 = void

and TYPE_MODE for "vector(4) unsigned char" returns E_SImode:

(gdb) p m1
$3 = E_V4SImode
(gdb) p m2
$4 = E_SImode

and:

  if (!VECTOR_MODE_P (m1) || !VECTOR_MODE_P (m2))
return false;

[Bug target/92658] x86 lacks vector extend / truncate

2020-02-28 Thread ubizjak at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92658

--- Comment #3 from Uroš Bizjak  ---
Richi, should the following test also vectorize?

--cut here--
typedef unsigned char v16qi __attribute__((vector_size (16)));
typedef unsigned int v4si __attribute__((vector_size (16)));

void
foo_u8_u32 (v4si * dst, v16qi * __restrict src)
{
  unsigned int tem[4];
  tem[0] = (*src)[0];
  tem[1] = (*src)[1];
  tem[2] = (*src)[2];
  tem[3] = (*src)[3];
  dst[0] = *(v4si *) tem;
}

void
bar_u8_u32 (v4si * dst, v16qi src)
{
  unsigned int tem[4];
  tem[0] = src[0];
  tem[1] = src[1];
  tem[2] = src[2];
  tem[3] = src[3];
  dst[0] = *(v4si *) tem;
}
--cut here--

[Bug target/92658] x86 lacks vector extend / truncate

2020-02-27 Thread ubizjak at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92658

Uroš Bizjak  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |ubizjak at gmail dot com

--- Comment #2 from Uroš Bizjak  ---
Created attachment 47924
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=47924=edit
Prototype patch

Prototype patch that introduces missing expanders to generate PMOVZX and PMOVSX
packed moves. For the testcase in Comment #0, we generate (-O3 -mavx):

bar:
vpmovzxbw   (%rsi), %xmm0
vmovdqa %xmm0, (%rdi)
ret

foo:
vpmovzxbw   %xmm0, %xmm0
vmovdqa %xmm0, (%rdi)
ret

[Bug target/92658] x86 lacks vector extend / truncate

2020-01-30 Thread marxin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92658

Martin Liška  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2020-01-30
 CC||marxin at gcc dot gnu.org
 Ever confirmed|0   |1

[Bug target/92658] x86 lacks vector extend / truncate

2019-11-25 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92658

--- Comment #1 from Richard Biener  ---
It looks like there are already some avx512 patterns matching this but they
are not visible to the RTL expanders.

(define_insn "zero_extendv8qiv8hi2"
  [(set (match_operand:V8HI 0 "register_operand" "=x,v")
(zero_extend:V8HI (match_operand:V8QI 1 "register_operand" "x,v")))]
  "TARGET_SSE2"
  "@
   punpcklbw\t{%1, %0|%0, %1}
   vpunpcklbw\t{%1, %0|%0, %1}"
)

does it for the testcase pasted but truncation and other extends are also
missing, likewise for floats.