On Fri, May 22, 2020 at 6:55 AM Hongtao Liu <crazy...@gmail.com> wrote:
>
> On Thu, May 21, 2020 at 7:18 PM Uros Bizjak <ubiz...@gmail.com> wrote:
> >
> > On Thu, May 21, 2020 at 7:35 AM Hongtao Liu <crazy...@gmail.com> wrote:
> > >
> > > On Wed, May 20, 2020 at 11:43 PM Uros Bizjak <ubiz...@gmail.com> wrote:
> > > >
> > > > On Wed, May 20, 2020 at 10:35 AM Hongtao Liu <crazy...@gmail.com> wrote:
> > > > >
> > > > > Hi:
> > > > >   Bootstrap is ok, regression test on i386/x86-64 backend is ok.
> > > > >
> > > > > gcc/ChangeLog:
> > > > >         PR target/92658
> > > > >         * config/i386/sse.md
> > > > >         (trunc<pmov_src_lower><mode>2, truncv32hiv32qi2,
> > > > >         trunc<ssedoublemodelower><mode>2): New expander.
> > > > >
> > > > > gcc/testsuite/ChangeLog:
> > > > >         * gcc.target/i386/pr92658-avx512f.c: New test.
> > > > >         * gcc.target/i386/pr92658-avx512vl.c: Ditto.
> > > > >         * gcc.target/i386/pr92658-avx512bw-trunc.c: Ditto.
> > > >
> > > > There are more conversions to be added. There are:
> > > >
> > > > V2DImode to V2QImode, V2HImode, V2SImode
> > > > V4DImode to V4QImode, V4HImode, V4SImode
> > > > V8DImode to V8QImode, V8HImode, V8SImode
> > > >
> > > > V4SImode to V4QImode, V4HImode
> > > > V8SImode to V8QImode, V8HImode
> > > > V16SImode to V16QImode, V16HImode
> > > >
> > > > V8HImode to V8QImode
> > > > V16HImode to V16QImode
> > > > V32HImode to V32QImode
> > > >
> > > All of them are added
> > >
> > > Vectorization failure: (Add xfail in testcase for them since they need
> > > generic part)
> > > V2DImode to V2QImode, V2HImode
> > > V4DImode to V4QImode, V4HImode
> > > V8DImode to V8QImode
> > >
> > > V4SImode to V4QImode, V4HImode
> > > V8SImode to V8QImode
> > >
> > > V8HImode to V8QImode
> > >
> > > Vectorization success:
> > > V2DImode to V2SImode (under TARGET_MMX_WITH_SSE)
> > > V4DImode to V4SImode
> > > V8DImode to V8HImode, V8SImode
> > >
> > > V8SImode to V8HImode
> > > V16SImode to V16QImode, V16HImode
> > >
> > > V32HImode to V32QImode
> > > V16HImode to V16HImode.
> > >
> > >
> > > > Uros.
> > >
> > > Update patch.
> > > Regression test on i386/x86-64 backend is ok, bootstrap is ok.
> > >
> > > gcc/ChangeLog:
> > >         PR target/92658
> > >         * config/i386/sse.md
> > >         (trunc<pmov_src_lower><mode>2): New expander
> > >         (truncv32hiv32qi2): Ditto.
> > >         (trunc<ssedoublemodelower><mode>2): Ditto.
> > >         (trunc<mode><pmov_dst_3>2): Ditto.
> > >         (trunc<mode><pmov_dst_mode_4>2): Ditto.
> > >         (truncv2div2si2): Ditto.
> > >         (truncv8div8qi2): Ditto.
> > >         (avx512f_<code>v8div16qi2): Renaming
> > >         from *avx512f_<code>v8div16qi2.
> > >         (avx512vl_<code>v2div2si): Renaming
> > >         from *avx512vl_<code>v2div2si2.
> > >         (avx512vl_<code><mode>v2<ssecakarnum>qi2): Renaming
> > >         from *avx512vl_<code><mode>v<ssescalarnum>qi2.
> > >
> > > gcc/testsuite/ChangeLog:
> > >         * gcc.target/i386/pr92658-avx512f.c: New test.
> > >         * gcc.target/i386/pr92658-avx512vl.c: Ditto.
> > >         * gcc.target/i386/pr92658-avx512bw-trunc.c: Ditto.
> >
> >
> > +  rtx op = simplify_subreg (V16QImode, operands[0], <pmov_dst_3>mode, 0);
> > +  operands[0] = op ? op : gen_rtx_SUBREG (V16QImode, operands[0], 0);
> >
> > You should use simplify_gen_subreg, without null op fixup:
> >
> > operands[0] = simplify_gen_subreg (V16QImode, operands[0], 
> > <pmov_dst_3>mode, 0);
> >
> Changed.
> > +  "TARGET_MMX_WITH_SSE && TARGET_AVX512VL"
> >
> > Do you really need TARGET_MMX_WITH_SSE?  Narrow modes are active even
> > without this flag.
> >
> Changed.

+(define_expand "truncv8div8qi2"
+  [(set (match_operand:V8QI 0 "register_operand")
+    (truncate:V8QI
+        (match_operand:V8DI 1 "register_operand")))]
+  "TARGET_AVX512F && TARGET_MMX_WITH_SSE"
+{
+  operands[0] = simplify_gen_subreg (V16QImode, operands[0], V8QImode, 0);
+  emit_insn (gen_avx512f_truncatev8div16qi2 (operands[0], operands[1]));
+  DONE;
+})

You left one here.

+/* { dg-final { scan-assembler-times "vpmovqd" 2 { target { ! ia32 } } } } */

Target selector shouldn't be needed here.

The patch is OK with the above changes.

On a related note, it looks that pmov stores are modelled in a wrong
way. For example, this pattern;

(define_insn "*avx512f_<code>v8div16qi2_store"
  [(set (match_operand:V16QI 0 "memory_operand" "=m")
    (vec_concat:V16QI
      (any_truncate:V8QI
        (match_operand:V8DI 1 "register_operand" "v"))
      (vec_select:V8QI
        (match_dup 0)
        (parallel [(const_int 8) (const_int 9)
               (const_int 10) (const_int 11)
               (const_int 12) (const_int 13)
               (const_int 14) (const_int 15)]))))]

models the store in 128bit mode, but according to ISA, it stores in 16bit mode.

Uros.

Reply via email to