Re: [PATCH][PR92658] Add missing vector truncmn2 expanders for avx512f

2020-05-22 Thread Uros Bizjak via Gcc-patches
On Fri, May 22, 2020 at 11:52 AM Hongtao Liu  wrote:
> > On a related note, it looks that pmov stores are modelled in a wrong
> > way. For example, this pattern;
> >
> > (define_insn "*avx512f_v8div16qi2_store"
> >   [(set (match_operand:V16QI 0 "memory_operand" "=m")
> > (vec_concat:V16QI
> >   (any_truncate:V8QI
> > (match_operand:V8DI 1 "register_operand" "v"))
> >   (vec_select:V8QI
> > (match_dup 0)
> > (parallel [(const_int 8) (const_int 9)
> >(const_int 10) (const_int 11)
> >(const_int 12) (const_int 13)
> >(const_int 14) (const_int 15)]]
> >
> > models the store in 128bit mode, but according to ISA, it stores in 16bit mode.
> >
> according to ISA, it stores in 64bit mode
> vpmovqb xmm1/m64 {k1}{z}, zmm2.
>
> memory_operand is 128bit but upper 64bit is not changed which means it
> store only lower 64bits, just same meaning to ISA.

Sorry, I somehow mixed insn patterns. This is the right example:

(define_insn "*avx512vl_v2div2qi2_store"
  [(set (match_operand:V16QI 0 "memory_operand" "=m")
        (vec_concat:V16QI
          (any_truncate:V2QI
            (match_operand:V2DI 1 "register_operand" "v"))
          (vec_select:V14QI
            (match_dup 0)
            (parallel [(const_int 2) (const_int 3)
                       (const_int 4) (const_int 5)
                       (const_int 6) (const_int 7)
                       (const_int 8) (const_int 9)
                       (const_int 10) (const_int 11)
                       (const_int 12) (const_int 13)
                       (const_int 14) (const_int 15)]))))]
  "TARGET_AVX512VL"
  "vpmovqb\t{%1, %0|%w0, %1}"
  [(set_attr "type" "ssemov")
   (set_attr "memory" "store")
   (set_attr "prefix" "evex")
   (set_attr "mode" "TI")])

The ISA says:

EVEX.128.F3.0F38.W0 32 /r VPMOVQB xmm1/m16 {k1}{z}, xmm2

However, the pattern says that V16QImode is stored to memory. Because of
this, the insn template needs the %w modifier for the Intel dialect, which
is a sign that something is wrong with the pattern.

These conversions should be reimplemented with a nonimmediate_operand
output operand, and the memory operand should be split into a separate
insn using a pre-reload splitter. Please see how the sse4_1 conversions
handle their input operands.

Uros.


Re: [PATCH][PR92658] Add missing vector truncmn2 expanders for avx512f

2020-05-22 Thread Hongtao Liu via Gcc-patches
On Fri, May 22, 2020 at 2:41 PM Uros Bizjak  wrote:
>
> On Fri, May 22, 2020 at 6:55 AM Hongtao Liu  wrote:
> >
> > On Thu, May 21, 2020 at 7:18 PM Uros Bizjak  wrote:
> > >
> > > On Thu, May 21, 2020 at 7:35 AM Hongtao Liu  wrote:
> > > >
> > > > On Wed, May 20, 2020 at 11:43 PM Uros Bizjak  wrote:
> > > > >
> > > > > > On Wed, May 20, 2020 at 10:35 AM Hongtao Liu wrote:
> > > > > >
> > > > > > Hi:
> > > > > >   Bootstrap is ok, regression test on i386/x86-64 backend is ok.
> > > > > >
> > > > > > gcc/ChangeLog:
> > > > > > PR target/92658
> > > > > > * config/i386/sse.md
> > > > > > (trunc2, truncv32hiv32qi2,
> > > > > > trunc2): New expander.
> > > > > >
> > > > > > gcc/testsuite/ChangeLog:
> > > > > > * gcc.target/i386/pr92658-avx512f.c: New test.
> > > > > > * gcc.target/i386/pr92658-avx512vl.c: Ditto.
> > > > > > * gcc.target/i386/pr92658-avx512bw-trunc.c: Ditto.
> > > > >
> > > > > There are more conversions to be added. There are:
> > > > >
> > > > > V2DImode to V2QImode, V2HImode, V2SImode
> > > > > V4DImode to V4QImode, V4HImode, V4SImode
> > > > > V8DImode to V8QImode, V8HImode, V8SImode
> > > > >
> > > > > V4SImode to V4QImode, V4HImode
> > > > > V8SImode to V8QImode, V8HImode
> > > > > V16SImode to V16QImode, V16HImode
> > > > >
> > > > > V8HImode to V8QImode
> > > > > V16HImode to V16QImode
> > > > > V32HImode to V32QImode
> > > > >
> > > > All of them are added
> > > >
> > > > Vectorization failure: (Add xfail in testcase for them since they need
> > > > generic part)
> > > > V2DImode to V2QImode, V2HImode
> > > > V4DImode to V4QImode, V4HImode
> > > > V8DImode to V8QImode
> > > >
> > > > V4SImode to V4QImode, V4HImode
> > > > V8SImode to V8QImode
> > > >
> > > > V8HImode to V8QImode
> > > >
> > > > Vectorization success:
> > > > V2DImode to V2SImode (under TARGET_MMX_WITH_SSE)
> > > > V4DImode to V4SImode
> > > > V8DImode to V8HImode, V8SImode
> > > >
> > > > V8SImode to V8HImode
> > > > V16SImode to V16QImode, V16HImode
> > > >
> > > > V32HImode to V32QImode
> > > > V16HImode to V16QImode.
> > > >
> > > >
> > > > > Uros.
> > > >
> > > > Update patch.
> > > > Regression test on i386/x86-64 backend is ok, bootstrap is ok.
> > > >
> > > > gcc/ChangeLog:
> > > > PR target/92658
> > > > * config/i386/sse.md
> > > > (trunc2): New expander
> > > > (truncv32hiv32qi2): Ditto.
> > > > (trunc2): Ditto.
> > > > (trunc2): Ditto.
> > > > (trunc2): Ditto.
> > > > (truncv2div2si2): Ditto.
> > > > (truncv8div8qi2): Ditto.
> > > > (avx512f_v8div16qi2): Renaming
> > > > from *avx512f_v8div16qi2.
> > > > (avx512vl_v2div2si): Renaming
> > > > from *avx512vl_v2div2si2.
> > > > (avx512vl_v2qi2): Renaming
> > > > from *avx512vl_vqi2.
> > > >
> > > > gcc/testsuite/ChangeLog:
> > > > * gcc.target/i386/pr92658-avx512f.c: New test.
> > > > * gcc.target/i386/pr92658-avx512vl.c: Ditto.
> > > > * gcc.target/i386/pr92658-avx512bw-trunc.c: Ditto.
> > >
> > >
> > > +  rtx op = simplify_subreg (V16QImode, operands[0], mode, 0);
> > > +  operands[0] = op ? op : gen_rtx_SUBREG (V16QImode, operands[0], 0);
> > >
> > > You should use simplify_gen_subreg, without null op fixup:
> > >
> > > operands[0] = simplify_gen_subreg (V16QImode, operands[0], mode, 0);
> > >
> > Changed.
> > > +  "TARGET_MMX_WITH_SSE && TARGET_AVX512VL"
> > >
> > > Do you really need TARGET_MMX_WITH_SSE?  Narrow modes are active even
> > > without this flag.
> > >
> > Changed.
>
> +(define_expand "truncv8div8qi2"
> +  [(set (match_operand:V8QI 0 "register_operand")
> +(truncate:V8QI
> +(match_operand:V8DI 1 "register_operand")))]
> +  "TARGET_AVX512F && TARGET_MMX_WITH_SSE"
> +{
> +  operands[0] = simplify_gen_subreg (V16QImode, operands[0], V8QImode, 0);
> +  emit_insn (gen_avx512f_truncatev8div16qi2 (operands[0], operands[1]));
> +  DONE;
> +})
>
> You left one here.
>
> +/* { dg-final { scan-assembler-times "vpmovqd" 2 { target { ! ia32 } } } } */
>
> Target selector shouldn't be needed here.
>
> The patch is OK with the above changes.
>
Changed.
> On a related note, it looks that pmov stores are modelled in a wrong
> way. For example, this pattern;
>
> (define_insn "*avx512f_v8div16qi2_store"
>   [(set (match_operand:V16QI 0 "memory_operand" "=m")
> (vec_concat:V16QI
>   (any_truncate:V8QI
> (match_operand:V8DI 1 "register_operand" "v"))
>   (vec_select:V8QI
> (match_dup 0)
> (parallel [(const_int 8) (const_int 9)
>(const_int 10) (const_int 11)
>(const_int 12) (const_int 13)
>(const_int 14) (const_int 15)]]
>
> models the store in 128bit mode, but according to ISA, it stores in 16bit mode.
>
according to ISA, it stores in 64bit mode
vpmovqb xmm1/m64 {k1}{z}, zmm2.

memory_operand is 128bit but upper 64bit is not changed which means it

Re: [PATCH][PR92658] Add missing vector truncmn2 expanders for avx512f

2020-05-22 Thread Uros Bizjak via Gcc-patches
On Fri, May 22, 2020 at 6:55 AM Hongtao Liu  wrote:
>
> On Thu, May 21, 2020 at 7:18 PM Uros Bizjak  wrote:
> >
> > On Thu, May 21, 2020 at 7:35 AM Hongtao Liu  wrote:
> > >
> > > On Wed, May 20, 2020 at 11:43 PM Uros Bizjak  wrote:
> > > >
> > > > On Wed, May 20, 2020 at 10:35 AM Hongtao Liu  wrote:
> > > > >
> > > > > Hi:
> > > > >   Bootstrap is ok, regression test on i386/x86-64 backend is ok.
> > > > >
> > > > > gcc/ChangeLog:
> > > > > PR target/92658
> > > > > * config/i386/sse.md
> > > > > (trunc2, truncv32hiv32qi2,
> > > > > trunc2): New expander.
> > > > >
> > > > > gcc/testsuite/ChangeLog:
> > > > > * gcc.target/i386/pr92658-avx512f.c: New test.
> > > > > * gcc.target/i386/pr92658-avx512vl.c: Ditto.
> > > > > * gcc.target/i386/pr92658-avx512bw-trunc.c: Ditto.
> > > >
> > > > There are more conversions to be added. There are:
> > > >
> > > > V2DImode to V2QImode, V2HImode, V2SImode
> > > > V4DImode to V4QImode, V4HImode, V4SImode
> > > > V8DImode to V8QImode, V8HImode, V8SImode
> > > >
> > > > V4SImode to V4QImode, V4HImode
> > > > V8SImode to V8QImode, V8HImode
> > > > V16SImode to V16QImode, V16HImode
> > > >
> > > > V8HImode to V8QImode
> > > > V16HImode to V16QImode
> > > > V32HImode to V32QImode
> > > >
> > > All of them are added
> > >
> > > Vectorization failure: (Add xfail in testcase for them since they need
> > > generic part)
> > > V2DImode to V2QImode, V2HImode
> > > V4DImode to V4QImode, V4HImode
> > > V8DImode to V8QImode
> > >
> > > V4SImode to V4QImode, V4HImode
> > > V8SImode to V8QImode
> > >
> > > V8HImode to V8QImode
> > >
> > > Vectorization success:
> > > V2DImode to V2SImode (under TARGET_MMX_WITH_SSE)
> > > V4DImode to V4SImode
> > > V8DImode to V8HImode, V8SImode
> > >
> > > V8SImode to V8HImode
> > > V16SImode to V16QImode, V16HImode
> > >
> > > V32HImode to V32QImode
> > > V16HImode to V16QImode.
> > >
> > >
> > > > Uros.
> > >
> > > Update patch.
> > > Regression test on i386/x86-64 backend is ok, bootstrap is ok.
> > >
> > > gcc/ChangeLog:
> > > PR target/92658
> > > * config/i386/sse.md
> > > (trunc2): New expander
> > > (truncv32hiv32qi2): Ditto.
> > > (trunc2): Ditto.
> > > (trunc2): Ditto.
> > > (trunc2): Ditto.
> > > (truncv2div2si2): Ditto.
> > > (truncv8div8qi2): Ditto.
> > > (avx512f_v8div16qi2): Renaming
> > > from *avx512f_v8div16qi2.
> > > (avx512vl_v2div2si): Renaming
> > > from *avx512vl_v2div2si2.
> > > (avx512vl_v2qi2): Renaming
> > > from *avx512vl_vqi2.
> > >
> > > gcc/testsuite/ChangeLog:
> > > * gcc.target/i386/pr92658-avx512f.c: New test.
> > > * gcc.target/i386/pr92658-avx512vl.c: Ditto.
> > > * gcc.target/i386/pr92658-avx512bw-trunc.c: Ditto.
> >
> >
> > +  rtx op = simplify_subreg (V16QImode, operands[0], mode, 0);
> > +  operands[0] = op ? op : gen_rtx_SUBREG (V16QImode, operands[0], 0);
> >
> > You should use simplify_gen_subreg, without null op fixup:
> >
> > operands[0] = simplify_gen_subreg (V16QImode, operands[0], mode, 0);
> >
> Changed.
> > +  "TARGET_MMX_WITH_SSE && TARGET_AVX512VL"
> >
> > Do you really need TARGET_MMX_WITH_SSE?  Narrow modes are active even
> > without this flag.
> >
> Changed.

+(define_expand "truncv8div8qi2"
+  [(set (match_operand:V8QI 0 "register_operand")
+(truncate:V8QI
+(match_operand:V8DI 1 "register_operand")))]
+  "TARGET_AVX512F && TARGET_MMX_WITH_SSE"
+{
+  operands[0] = simplify_gen_subreg (V16QImode, operands[0], V8QImode, 0);
+  emit_insn (gen_avx512f_truncatev8div16qi2 (operands[0], operands[1]));
+  DONE;
+})

You left one here.

+/* { dg-final { scan-assembler-times "vpmovqd" 2 { target { ! ia32 } } } } */

Target selector shouldn't be needed here.

The patch is OK with the above changes.

On a related note, it looks like pmov stores are modelled the wrong
way. For example, this pattern:

(define_insn "*avx512f_v8div16qi2_store"
  [(set (match_operand:V16QI 0 "memory_operand" "=m")
(vec_concat:V16QI
  (any_truncate:V8QI
(match_operand:V8DI 1 "register_operand" "v"))
  (vec_select:V8QI
(match_dup 0)
(parallel [(const_int 8) (const_int 9)
   (const_int 10) (const_int 11)
   (const_int 12) (const_int 13)
   (const_int 14) (const_int 15)]]

models the store in 128bit mode, but according to ISA, it stores in 16bit mode.

Uros.


Re: [PATCH][PR92658] Add missing vector truncmn2 expanders for avx512f

2020-05-21 Thread Hongtao Liu via Gcc-patches
On Thu, May 21, 2020 at 7:18 PM Uros Bizjak  wrote:
>
> On Thu, May 21, 2020 at 7:35 AM Hongtao Liu  wrote:
> >
> > On Wed, May 20, 2020 at 11:43 PM Uros Bizjak  wrote:
> > >
> > > On Wed, May 20, 2020 at 10:35 AM Hongtao Liu  wrote:
> > > >
> > > > Hi:
> > > >   Bootstrap is ok, regression test on i386/x86-64 backend is ok.
> > > >
> > > > gcc/ChangeLog:
> > > > PR target/92658
> > > > * config/i386/sse.md
> > > > (trunc2, truncv32hiv32qi2,
> > > > trunc2): New expander.
> > > >
> > > > gcc/testsuite/ChangeLog:
> > > > * gcc.target/i386/pr92658-avx512f.c: New test.
> > > > * gcc.target/i386/pr92658-avx512vl.c: Ditto.
> > > > * gcc.target/i386/pr92658-avx512bw-trunc.c: Ditto.
> > >
> > > There are more conversions to be added. There are:
> > >
> > > V2DImode to V2QImode, V2HImode, V2SImode
> > > V4DImode to V4QImode, V4HImode, V4SImode
> > > V8DImode to V8QImode, V8HImode, V8SImode
> > >
> > > V4SImode to V4QImode, V4HImode
> > > V8SImode to V8QImode, V8HImode
> > > V16SImode to V16QImode, V16HImode
> > >
> > > V8HImode to V8QImode
> > > V16HImode to V16QImode
> > > V32HImode to V32QImode
> > >
> > All of them are added
> >
> > Vectorization failure: (Add xfail in testcase for them since they need
> > generic part)
> > V2DImode to V2QImode, V2HImode
> > V4DImode to V4QImode, V4HImode
> > V8DImode to V8QImode
> >
> > V4SImode to V4QImode, V4HImode
> > V8SImode to V8QImode
> >
> > V8HImode to V8QImode
> >
> > Vectorization success:
> > V2DImode to V2SImode (under TARGET_MMX_WITH_SSE)
> > V4DImode to V4SImode
> > V8DImode to V8HImode, V8SImode
> >
> > V8SImode to V8HImode
> > V16SImode to V16QImode, V16HImode
> >
> > V32HImode to V32QImode
> > V16HImode to V16QImode.
> >
> >
> > > Uros.
> >
> > Update patch.
> > Regression test on i386/x86-64 backend is ok, bootstrap is ok.
> >
> > gcc/ChangeLog:
> > PR target/92658
> > * config/i386/sse.md
> > (trunc2): New expander
> > (truncv32hiv32qi2): Ditto.
> > (trunc2): Ditto.
> > (trunc2): Ditto.
> > (trunc2): Ditto.
> > (truncv2div2si2): Ditto.
> > (truncv8div8qi2): Ditto.
> > (avx512f_v8div16qi2): Renaming
> > from *avx512f_v8div16qi2.
> > (avx512vl_v2div2si): Renaming
> > from *avx512vl_v2div2si2.
> > (avx512vl_v2qi2): Renaming
> > from *avx512vl_vqi2.
> >
> > gcc/testsuite/ChangeLog:
> > * gcc.target/i386/pr92658-avx512f.c: New test.
> > * gcc.target/i386/pr92658-avx512vl.c: Ditto.
> > * gcc.target/i386/pr92658-avx512bw-trunc.c: Ditto.
>
>
> +  rtx op = simplify_subreg (V16QImode, operands[0], mode, 0);
> +  operands[0] = op ? op : gen_rtx_SUBREG (V16QImode, operands[0], 0);
>
> You should use simplify_gen_subreg, without null op fixup:
>
> operands[0] = simplify_gen_subreg (V16QImode, operands[0], mode, 0);
>
Changed.
> +  "TARGET_MMX_WITH_SSE && TARGET_AVX512VL"
>
> Do you really need TARGET_MMX_WITH_SSE?  Narrow modes are active even
> without this flag.
>
Changed.

> Uros.

Update patch.

-- 
BR,
Hongtao


0001-Add-missing-vector-truncmn2-expanders-PR92658_V3.patch
Description: Binary data


Re: [PATCH][PR92658] Add missing vector truncmn2 expanders for avx512f

2020-05-21 Thread Uros Bizjak via Gcc-patches
On Thu, May 21, 2020 at 7:35 AM Hongtao Liu  wrote:
>
> On Wed, May 20, 2020 at 11:43 PM Uros Bizjak  wrote:
> >
> > On Wed, May 20, 2020 at 10:35 AM Hongtao Liu  wrote:
> > >
> > > Hi:
> > >   Bootstrap is ok, regression test on i386/x86-64 backend is ok.
> > >
> > > gcc/ChangeLog:
> > > PR target/92658
> > > * config/i386/sse.md
> > > (trunc2, truncv32hiv32qi2,
> > > trunc2): New expander.
> > >
> > > gcc/testsuite/ChangeLog:
> > > * gcc.target/i386/pr92658-avx512f.c: New test.
> > > * gcc.target/i386/pr92658-avx512vl.c: Ditto.
> > > * gcc.target/i386/pr92658-avx512bw-trunc.c: Ditto.
> >
> > There are more conversions to be added. There are:
> >
> > V2DImode to V2QImode, V2HImode, V2SImode
> > V4DImode to V4QImode, V4HImode, V4SImode
> > V8DImode to V8QImode, V8HImode, V8SImode
> >
> > V4SImode to V4QImode, V4HImode
> > V8SImode to V8QImode, V8HImode
> > V16SImode to V16QImode, V16HImode
> >
> > V8HImode to V8QImode
> > V16HImode to V16QImode
> > V32HImode to V32QImode
> >
> All of them are added
>
> Vectorization failure: (Add xfail in testcase for them since they need
> generic part)
> V2DImode to V2QImode, V2HImode
> V4DImode to V4QImode, V4HImode
> V8DImode to V8QImode
>
> V4SImode to V4QImode, V4HImode
> V8SImode to V8QImode
>
> V8HImode to V8QImode
>
> Vectorization success:
> V2DImode to V2SImode (under TARGET_MMX_WITH_SSE)
> V4DImode to V4SImode
> V8DImode to V8HImode, V8SImode
>
> V8SImode to V8HImode
> V16SImode to V16QImode, V16HImode
>
> V32HImode to V32QImode
> V16HImode to V16QImode.
>
>
> > Uros.
>
> Update patch.
> Regression test on i386/x86-64 backend is ok, bootstrap is ok.
>
> gcc/ChangeLog:
> PR target/92658
> * config/i386/sse.md
> (trunc2): New expander
> (truncv32hiv32qi2): Ditto.
> (trunc2): Ditto.
> (trunc2): Ditto.
> (trunc2): Ditto.
> (truncv2div2si2): Ditto.
> (truncv8div8qi2): Ditto.
> (avx512f_v8div16qi2): Renaming
> from *avx512f_v8div16qi2.
> (avx512vl_v2div2si): Renaming
> from *avx512vl_v2div2si2.
> (avx512vl_v2qi2): Renaming
> from *avx512vl_vqi2.
>
> gcc/testsuite/ChangeLog:
> * gcc.target/i386/pr92658-avx512f.c: New test.
> * gcc.target/i386/pr92658-avx512vl.c: Ditto.
> * gcc.target/i386/pr92658-avx512bw-trunc.c: Ditto.


+  rtx op = simplify_subreg (V16QImode, operands[0], mode, 0);
+  operands[0] = op ? op : gen_rtx_SUBREG (V16QImode, operands[0], 0);

You should use simplify_gen_subreg, without null op fixup:

operands[0] = simplify_gen_subreg (V16QImode, operands[0], mode, 0);

+  "TARGET_MMX_WITH_SSE && TARGET_AVX512VL"

Do you really need TARGET_MMX_WITH_SSE?  Narrow modes are active even
without this flag.

Uros.


Re: [PATCH][PR92658] Add missing vector truncmn2 expanders for avx512f

2020-05-20 Thread Hongtao Liu via Gcc-patches
On Wed, May 20, 2020 at 11:43 PM Uros Bizjak  wrote:
>
> On Wed, May 20, 2020 at 10:35 AM Hongtao Liu  wrote:
> >
> > Hi:
> >   Bootstrap is ok, regression test on i386/x86-64 backend is ok.
> >
> > gcc/ChangeLog:
> > PR target/92658
> > * config/i386/sse.md
> > (trunc2, truncv32hiv32qi2,
> > trunc2): New expander.
> >
> > gcc/testsuite/ChangeLog:
> > * gcc.target/i386/pr92658-avx512f.c: New test.
> > * gcc.target/i386/pr92658-avx512vl.c: Ditto.
> > * gcc.target/i386/pr92658-avx512bw-trunc.c: Ditto.
>
> There are more conversions to be added. There are:
>
> V2DImode to V2QImode, V2HImode, V2SImode
> V4DImode to V4QImode, V4HImode, V4SImode
> V8DImode to V8QImode, V8HImode, V8SImode
>
> V4SImode to V4QImode, V4HImode
> V8SImode to V8QImode, V8HImode
> V16SImode to V16QImode, V16HImode
>
> V8HImode to V8QImode
> V16HImode to V16QImode
> V32HImode to V32QImode
>
All of them are added

Vectorization failure: (Add xfail in testcase for them since they need
generic part)
V2DImode to V2QImode, V2HImode
V4DImode to V4QImode, V4HImode
V8DImode to V8QImode

V4SImode to V4QImode, V4HImode
V8SImode to V8QImode

V8HImode to V8QImode

Vectorization success:
V2DImode to V2SImode (under TARGET_MMX_WITH_SSE)
V4DImode to V4SImode
V8DImode to V8HImode, V8SImode

V8SImode to V8HImode
V16SImode to V16QImode, V16HImode

V32HImode to V32QImode
V16HImode to V16QImode.


> Uros.

Update patch.
Regression test on i386/x86-64 backend is ok, bootstrap is ok.

gcc/ChangeLog:
PR target/92658
* config/i386/sse.md
(trunc2): New expander
(truncv32hiv32qi2): Ditto.
(trunc2): Ditto.
(trunc2): Ditto.
(trunc2): Ditto.
(truncv2div2si2): Ditto.
(truncv8div8qi2): Ditto.
(avx512f_v8div16qi2): Renaming
from *avx512f_v8div16qi2.
(avx512vl_v2div2si): Renaming
from *avx512vl_v2div2si2.
(avx512vl_v2qi2): Renaming
from *avx512vl_vqi2.

gcc/testsuite/ChangeLog:
* gcc.target/i386/pr92658-avx512f.c: New test.
* gcc.target/i386/pr92658-avx512vl.c: Ditto.
* gcc.target/i386/pr92658-avx512bw-trunc.c: Ditto.

-- 
BR,
Hongtao
From 6abdd010e60f590eb3fdfde9e0835af2aaecbd17 Mon Sep 17 00:00:00 2001
From: liuhongt 
Date: Wed, 20 May 2020 15:53:14 +0800
Subject: [PATCH] Add missing vector truncmn2 expanders [PR92658]

2020-05-20  Hongtao.liu

gcc/ChangeLog:
	PR target/92658
	* config/i386/sse.md
	(trunc2): New expander
	(truncv32hiv32qi2): Ditto.
	(trunc2): Ditto.
	(trunc2): Ditto.
	(trunc2): Ditto.
	(truncv2div2si2): Ditto.
	(truncv8div8qi2): Ditto.
	(avx512f_v8div16qi2): Renaming
	from *avx512f_v8div16qi2.
	(avx512vl_v2div2si): Renaming
	from *avx512vl_v2div2si2.
	(avx512vl_v2qi2): Renaming
	from *avx512vl_vqi2.

gcc/testsuite/ChangeLog:
	* gcc.target/i386/pr92658-avx512f.c: New test.
	* gcc.target/i386/pr92658-avx512vl.c: Ditto.
	* gcc.target/i386/pr92658-avx512bw-trunc.c: Ditto.
---
 gcc/config/i386/sse.md                        |  81 ++-
 .../gcc.target/i386/pr92658-avx512bw-trunc.c  |  91 
 .../gcc.target/i386/pr92658-avx512f.c         | 106 ++
 .../gcc.target/i386/pr92658-avx512vl.c        | 129 ++
 4 files changed, 403 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr92658-avx512bw-trunc.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr92658-avx512f.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr92658-avx512vl.c

diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 9bf4361384a..78af0e2ea14 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -10491,6 +10491,12 @@
 (define_mode_attr pmov_suff_1
   [(V16QI "db") (V16HI "dw") (V8SI "qd") (V8HI "qw")])
 
+(define_expand "trunc2"
+  [(set (match_operand:PMOV_DST_MODE_1 0 "nonimmediate_operand")
+	(truncate:PMOV_DST_MODE_1
+	  (match_operand: 1 "register_operand")))]
+  "TARGET_AVX512F")
+
 (define_insn "*avx512f_2"
   [(set (match_operand:PMOV_DST_MODE_1 0 "nonimmediate_operand" "=v,m")
 	(any_truncate:PMOV_DST_MODE_1
@@ -10525,6 +10531,12 @@
   (match_operand: 2 "register_operand")))]
   "TARGET_AVX512F")
 
+(define_expand "truncv32hiv32qi2"
+  [(set (match_operand:V32QI 0 "nonimmediate_operand")
+	(truncate:V32QI
+	  (match_operand:V32HI 1 "register_operand")))]
+  "TARGET_AVX512BW")
+
 (define_insn "avx512bw_v32hiv32qi2"
   [(set (match_operand:V32QI 0 "nonimmediate_operand" "=v,m")
 	(any_truncate:V32QI
@@ -10564,6 +10576,12 @@
 (define_mode_attr pmov_suff_2
   [(V16QI "wb") (V8HI "dw") (V4SI "qd")])
 
+(define_expand "trunc2"
+  [(set (match_operand:PMOV_DST_MODE_2 0 "nonimmediate_operand")
+	(truncate:PMOV_DST_MODE_2
+	  (match_operand: 1 "register_operand")))]
+  "TARGET_AVX512VL")
+
 (define_insn "*avx512vl_2"
   [(set (match_operand:PMOV_DST_MODE_2 0 "nonimmediate_operand" "=v,m")
 	(any_truncate:PMOV_DST_MODE_2
@@ -10606,7 +10624,21 @@
 (define_mode_attr pmov_suff_3
   [(V4DI "qb") 

Re: [PATCH][PR92658] Add missing vector truncmn2 expanders for avx512f

2020-05-20 Thread Uros Bizjak via Gcc-patches
On Wed, May 20, 2020 at 10:35 AM Hongtao Liu  wrote:
>
> Hi:
>   Bootstrap is ok, regression test on i386/x86-64 backend is ok.
>
> gcc/ChangeLog:
> PR target/92658
> * config/i386/sse.md
> (trunc2, truncv32hiv32qi2,
> trunc2): New expander.
>
> gcc/testsuite/ChangeLog:
> * gcc.target/i386/pr92658-avx512f.c: New test.
> * gcc.target/i386/pr92658-avx512vl.c: Ditto.
> * gcc.target/i386/pr92658-avx512bw-trunc.c: Ditto.

There are more conversions to be added. There are:

V2DImode to V2QImode, V2HImode, V2SImode
V4DImode to V4QImode, V4HImode, V4SImode
V8DImode to V8QImode, V8HImode, V8SImode

V4SImode to V4QImode, V4HImode
V8SImode to V8QImode, V8HImode
V16SImode to V16QImode, V16HImode

V8HImode to V8QImode
V16HImode to V16QImode
V32HImode to V32QImode

Uros.