On Sat, May 9, 2026 at 10:20 AM H.J. Lu <[email protected]> wrote:
>
> On Sat, May 9, 2026 at 9:10 AM Liu, Hongtao <[email protected]> wrote:
> >
> >
> >
> > > -----Original Message-----
> > > From: H.J. Lu <[email protected]>
> > > Sent: Saturday, May 9, 2026 7:57 AM
> > > To: GCC Patches <[email protected]>; Uros Bizjak
> > > <[email protected]>; Liu, Hongtao <[email protected]>
> > > Subject: [PATCH] x86_cse: Check CONST0_RTX and CONSTM1_RTX for
> > > X86_CSE_VEC_DUP
> > >
> > > Check CONST0_RTX and CONSTM1_RTX when placing
> > >
> > > (insn 32 2 7 2 (set (reg:V2DI 114)
> > >         (const_vector:V2DI [
> > >                 (const_int 0 [0]) repeated x2
> > >             ])) -1
> > >      (nil))
> > >
> > > after
> > >
> > > (note 2 3 32 2 NOTE_INSN_FUNCTION_BEG)
> > >
> > > for X86_CSE_VEC_DUP, not X86_CSE_CONST0_VECTOR or
> > > X86_CSE_CONSTM1_VECTOR, after replacing redundant vector loads:
> > >
> > > (insn 31 15 16 2 (set (reg/v/f:DI 99 [ d ])
> > >         (const_int 0 [0])) "x.c":5:16 -1
> > >      (nil))
> > > ...
> > > (insn 18 17 19 2 (set (reg:V2DI 111 [ _22 ])
> > >         (vec_duplicate:V2DI (reg/v/f:DI 99 [ d ]))) "x.c":5:16 9345 
> > > {*vec_dupv2di}
> > >      (nil))
> > >
> > > ...
> > > (insn 29 12 15 2 (set (reg/v/f:DI 98 [ c ])
> > >         (const_int 0 [0])) "x.c":5:16 -1
> > >      (nil))
> > > ...
> > > (insn 20 19 21 2 (set (reg:V2DI 112 [ _20 ])
> > >         (vec_duplicate:V2DI (reg/v/f:DI 98 [ c ]))) "x.c":5:16 9345 
> > > {*vec_dupv2di}
> > >      (nil))
> > >
> > > with
> > >
> > > (insn 18 17 19 2 (set (reg:V2DI 111 [ _22 ])
> > >         (reg:V2DI 114)) "x.c":5:16 2454 {movv2di_internal}
> > >      (nil))
> > >
> > > and
> > >
> > > (insn 20 19 21 2 (set (reg:V2DI 112 [ _20 ])
> > >         (reg:V2DI 114)) "x.c":5:16 2454 {movv2di_internal}
> > >      (nil))
> > >
> > > gcc/
> > >
> > > PR target/125239
> > > * config/i386/i386-features.cc (ix86_place_single_vector_set):
> > > Check CONST0_RTX and CONSTM1_RTX for X86_CSE_VEC_DUP.
> >
> > Can we detect it in ix86_broadcast_inner, set *kind_p to 
> > X86_CSE_CONST0_VECTOR, instead of handle it in ix86_place_single_vector_set.
>
> Done.  I am testing this patch.

The condition should be

  else if (CONST_VECTOR_P (src))
    {
      /* The only possible CONST_VECTORs of SRC are CONST0_RTX and
         CONSTM1_RTX.  Otherwise,

         rtx set = gen_rtx_SET (dest, src);

         won't be a valid instruction.  */
      machine_mode mode = GET_MODE (dest);
      if (!((src == CONST0_RTX (mode)
             && load->kind == X86_CSE_CONST0_VECTOR)
            || (src == CONSTM1_RTX (mode)
                && load->kind == X86_CSE_CONSTM1_VECTOR)))
        gcc_unreachable ();
    }

> > Also, I wonder why pass_combine(or fwprop) doesn't catch this miss 
> > optimization. Set with CONST0_VECTOR should be cheaper than with 
> > vec_duplicate.
>
> Because of -fno-tree-dse -fno-tree-dce?
>
> > >
> > > gcc/testsuite/
> > >
> > > PR target/125239
> > > * gcc.target/i386/pr125239.c: New test.
> > >
> > >
> > > --
> > > H.J.
>
>
>
> --
> H.J.



-- 
H.J.

Reply via email to