On Mon, May 25, 2020 at 8:41 PM Uros Bizjak <[email protected]> wrote:
>
> On Mon, May 25, 2020 at 2:21 PM Hongtao Liu <[email protected]> wrote:
> >
> > According to Intel SDM, VPMOVQB xmm1/m16 {k1}{z}, xmm2 has 16-bit
> > memory_operand instead of 128-bit one which exists in current
> > implementation. Also for other vpmov instructions which have
> > memory_operand narrower than 128bits.
> >
> > Bootstrap is ok, regression test for i386/x86-64 backend is ok.
>
>
> + [(set (match_operand:HI 0 "memory_operand" "=m")
> + (subreg:HI (any_truncate:V2QI
> + (match_operand:V2DI 1 "register_operand" "v")) 0))]
>
> This should store in V2QImode, subregs are not allowed in insn patterns.
>
> You need a pre-reload splitter to split from register_operand to a
> memory_operand, Jakub fixed a bunch of pmov patterns a while ago, so
> perhaps he can give some additional advice here.
>
Like this?
---
(define_insn "*avx512vl_<code>v2div2qi2_store"
[(set (match_operand:V2QI 0 "memory_operand" "=m")
(any_truncate:V2QI
(match_operand:V2DI 1 "register_operand" "v")))]
"TARGET_AVX512VL"
"vpmov<trunsuffix>qb\t{%1, %0|%0, %1}"
[(set_attr "type" "ssemov")
(set_attr "memory" "store")
(set_attr "prefix" "evex")
(set_attr "mode" "TI")])
(define_insn_and_split "*avx512vl_<code>v2div2qi2_store"
[(set (match_operand:HI 0 "memory_operand")
(subreg:HI
(any_truncate:V2QI
(match_operand:V2DI 1 "register_operand")) 0))]
"TARGET_AVX512VL && ix86_pre_reload_split ()"
"#"
"&& 1"
[(set (match_dup 0)
(any_truncate:V2QI (match_dup 1)))]
"operands[0] = adjust_address_nv (operands[0], V2QImode, 0);")
---
> Uros.
>
>
> > gcc/ChangeLog
> >
> > * config/i386/sse.md (*avx512vl_<code>v2div2qi2_store): Refine
> > size of memory_operand according to Intel SDM.
> > (avx512vl_<code>v2div2qi2_mask_store): Ditto.
> > (*avx512vl_<code><mode>v4qi2_store): Ditto.
> > (avx512vl_<code><mode>v4qi2_mask_store): Ditto.
> > (*avx512vl_<code><mode>v8qi2_store): Ditto.
> > (avx512vl_<code><mode>v8qi2_mask_store): Ditto.
> > (*avx512vl_<code><mode>v4hi2_store): Ditto.
> > (avx512vl_<code><mode>v4hi2_mask_store): Ditto.
> > (*avx512vl_<code>v2div2hi2_store): Ditto.
> > (avx512vl_<code>v2div2hi2_mask_store): Ditto.
> > (*avx512vl_<code>v2div2si2_store): Ditto.
> > (avx512vl_<code>v2div2si2_mask_store): Ditto.
> > (*avx512f_<code>v8div16qi2_store): Ditto.
> > (avx512f_<code>v8div16qi2_mask_store): Ditto.
> > * config/i386/i386-builtin-types.def: Adjust builtin type.
> > * config/i386/i386-expand.c: Ditto.
> > * config/i386/i386-builtin.def: Adjust builtin.
> > * config/i386/avx512fintrin.h: Ditto.
> > * config/i386/avx512vlbwintrin.h: Ditto.
> > * config/i386/avx512vlintrin.h: Ditto.
> >
> > I think the code i changed is already covered by existed intrinsics
> > tests, so i didn't add any new tests.
> > --
> > BR,
> > Hongtao
--
BR,
Hongtao