On Fri, Oct 26, 2018 at 9:19 AM H.J. Lu <hjl.to...@gmail.com> wrote:
>
> On 10/25/18, Uros Bizjak <ubiz...@gmail.com> wrote:
> > On Fri, Oct 26, 2018 at 8:07 AM H.J. Lu <hjl.to...@gmail.com> wrote:
> >>
> >> Many x86 pmovzx/pmovsx instructions with memory operands are modeled in
> >> a wrong way.  For example:
> >>
> >> (define_insn "sse4_1_<code>v8qiv8hi2<mask_name>"
> >>   [(set (match_operand:V8HI 0 "register_operand" "=Yr,*x,v")
> >>     (any_extend:V8HI
> >>       (vec_select:V8QI
> >>         (match_operand:V16QI 1 "nonimmediate_operand" "Yrm,*xm,vm")
> >>         (parallel [(const_int 0) (const_int 1)
> >>                (const_int 2) (const_int 3)
> >>                (const_int 4) (const_int 5)
> >>                (const_int 6) (const_int 7)]))))]
> >>
> >> should be defind for memory operands as:
> >>
> >> (define_insn "sse4_1_<code>v8qiv8hi2<mask_name>"
> >>   [(set (match_operand:V8HI 0 "register_operand" "=Yr,*x,v")
> >>     (any_extend:V8HI
> >>       (match_operand:V8QI "memory_operand" "m,m,m")))]
> >>
> >> This set of patches updates them to
> >>
> >> (define_insn "sse4_1_<code>v8qiv8hi2<mask_name>"
> >>   [(set (match_operand:V8HI 0 "register_operand" "=Yr,*x,v")
> >>     (any_extend:V8HI
> >>       (vec_select:V8QI
> >>         (match_operand:V16QI 1 "nonimmediate_operand" "Yr,*x,v")
> >>         (parallel [(const_int 0) (const_int 1)
> >>                (const_int 2) (const_int 3)
> >>                (const_int 4) (const_int 5)
> >>                (const_int 6) (const_int 7)]))))]
> >>
> >> (define_insn "*sse4_1_<code>v8qiv8hi2<mask_name>_1"
> >>   [(set (match_operand:V8HI 0 "register_operand" "=Yr,*x,v")
> >>     (any_extend:V8HI
> >>       (match_operand:V8QI "subreg_memory_operand" "m,m,m")))]
> >>
> >> with a splitter:
> >>
> >> (define_insn_and_split "*sse4_1_<code>v8qiv8hi2<mask_name>_2"
> >>   [(set (match_operand:V8HI 0 "register_operand" "=Yr,*x,v")
> >
> > No constraints needed for pre-reload splitter.
> >
> >>         (any_extend:V8HI
> >>           (vec_select:V8QI
> >>             (subreg:V16QI
> >>               (vec_concat:V2DI
> >>                 (match_operand:DI 1 "memory_operand" "m,*m,m")
> >>                 (const_int 0)) 0)
> >>             (parallel [(const_int 0) (const_int 1)
> >>                        (const_int 2) (const_int 3)
> >>                        (const_int 4) (const_int 5)
> >>                        (const_int 6) (const_int 7)]))))]
> >>   "TARGET_SSE4_1 && <mask_avx512bw_condition> &&
> >> <mask_avx512vl_condition>"
> >>   "#"
> >>   "&& can_create_pseudo_p ()"
> >>   [(set (match_dup 0) (match_dup 1))]
> >
> >  [(set (match_dup 0)
> >       (any_extend:V8HI (match_dup 1)))]
> >
> >> {
> >>   operands[1] = gen_rtx_<CODE> (V8HImode,
> >>                                 gen_rtx_SUBREG (V8QImode,
> >>                                                 operands[1], 0));
> >> })
> >
> > Don't create subregs of memory. Use adjust_address_nv.
>
> Here is the updated patch.

> with a splitter:
>
> (define_insn_and_split "*sse4_1_<code>v8qiv8hi2<mask_name>_2"
>  [(set (match_operand:V8HI 0 "register_operand")
>        (any_extend:V8HI
>          (vec_select:V8QI
>            (subreg:V16QI
>              (vec_concat:V2DI
>                (match_operand:DI 1 "memory_operand")
>                (const_int 0)) 0)
>            (parallel [(const_int 0) (const_int 1)
>                       (const_int 2) (const_int 3)
>                       (const_int 4) (const_int 5)
>                       (const_int 6) (const_int 7)]))))]
>  "TARGET_SSE4_1 && <mask_avx512bw_condition> && <mask_avx512vl_condition>"
>  "#"
>  "&& can_create_pseudo_p ()"

"can_create_pseudo_p ()" should go to the insn constraint and "&& 1"
should be used for split constraint. Both, insn and splitter are valid
only before reload.

>  [(set (match_dup 0)
>        (any_extend:V8HI (match_dup 1)))]
> {
>  operands[1] = adjust_address_nv (operands[1], V8QImode, 0);
> })

Please use double quotes for one-line preparation statement.

> (any_extend:V4SI
>   (match_operand:V4HI 1 "memory_operand" "m,*m,m")))]

Please remove star in front of memory constraint.

OK with the above changes.

Thanks,
Uros.

Reply via email to