On Sat, May 9, 2026 at 3:58 PM Liu, Hongtao <[email protected]> wrote:
>
>
>
> > -----Original Message-----
> > From: H.J. Lu <[email protected]>
> > Sent: Saturday, May 9, 2026 3:55 PM
> > To: Liu, Hongtao <[email protected]>
> > Cc: Hongtao Liu <[email protected]>; GCC Patches <gcc-
> > [email protected]>; Uros Bizjak <[email protected]>
> > Subject: Re: [PATCH] x86_cse: Check CONST0_RTX and CONSTM1_RTX for
> > X86_CSE_VEC_DUP
> >
> > On Sat, May 9, 2026 at 3:33 PM Liu, Hongtao <[email protected]>
> > wrote:
> > >
> > >
> > >
> > > > -----Original Message-----
> > > > From: H.J. Lu <[email protected]>
> > > > Sent: Saturday, May 9, 2026 3:29 PM
> > > > To: Liu, Hongtao <[email protected]>
> > > > Cc: Hongtao Liu <[email protected]>; GCC Patches <gcc-
> > > > [email protected]>; Uros Bizjak <[email protected]>
> > > > Subject: Re: [PATCH] x86_cse: Check CONST0_RTX and CONSTM1_RTX for
> > > > X86_CSE_VEC_DUP
> > > >
> > > > On Sat, May 9, 2026 at 3:24 PM Liu, Hongtao <[email protected]>
> > > > wrote:
> > > > >
> > > > >
> > > > >
> > > > > > -----Original Message-----
> > > > > > From: H.J. Lu <[email protected]>
> > > > > > Sent: Saturday, May 9, 2026 3:14 PM
> > > > > > To: Hongtao Liu <[email protected]>
> > > > > > Cc: Liu, Hongtao <[email protected]>; GCC Patches <gcc-
> > > > > > [email protected]>; Uros Bizjak <[email protected]>
> > > > > > Subject: Re: [PATCH] x86_cse: Check CONST0_RTX and CONSTM1_RTX
> > > > > > for X86_CSE_VEC_DUP
> > > > > >
> > > > > > On Sat, May 9, 2026 at 3:04 PM Hongtao Liu <[email protected]>
> > > > wrote:
> > > > > > >
> > > > > > > On Sat, May 9, 2026 at 10:28 AM H.J. Lu <[email protected]>
> > wrote:
> > > > > > > >
> > > > > > > > On Sat, May 9, 2026 at 10:20 AM H.J. Lu <[email protected]>
> > wrote:
> > > > > > > > >
> > > > > > > > > On Sat, May 9, 2026 at 9:10 AM Liu, Hongtao
> > > > > > > > > <[email protected]>
> > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > > -----Original Message-----
> > > > > > > > > > > From: H.J. Lu <[email protected]>
> > > > > > > > > > > Sent: Saturday, May 9, 2026 7:57 AM
> > > > > > > > > > > To: GCC Patches <[email protected]>; Uros Bizjak
> > > > > > > > > > > <[email protected]>; Liu, Hongtao
> > > > > > > > > > > <[email protected]>
> > > > > > > > > > > Subject: [PATCH] x86_cse: Check CONST0_RTX and
> > > > > > > > > > > CONSTM1_RTX for X86_CSE_VEC_DUP
> > > > > > > > > > >
> > > > > > > > > > > Check CONST0_RTX and CONSTM1_RTX when placing
> > > > > > > > > > >
> > > > > > > > > > > (insn 32 2 7 2 (set (reg:V2DI 114)
> > > > > > > > > > >         (const_vector:V2DI [
> > > > > > > > > > >                 (const_int 0 [0]) repeated x2
> > > > > > > > > > >             ])) -1
> > > > > > > > > > >      (nil))
> > > > > > > > > > >
> > > > > > > > > > > after
> > > > > > > > > > >
> > > > > > > > > > > (note 2 3 32 2 NOTE_INSN_FUNCTION_BEG)
> > > > > > > > > > >
> > > > > > > > > > > for X86_CSE_VEC_DUP, not X86_CSE_CONST0_VECTOR or
> > > > > > > > > > > X86_CSE_CONSTM1_VECTOR, after replacing redundant
> > > > > > > > > > > vector
> > > > loads:
> > > > > > > > > > >
> > > > > > > > > > > (insn 31 15 16 2 (set (reg/v/f:DI 99 [ d ])
> > > > > > > > > > >         (const_int 0 [0])) "x.c":5:16 -1
> > > > > > > > > > >      (nil))
> > > > > > > > > > > ...
> > > > > > > > > > > (insn 18 17 19 2 (set (reg:V2DI 111 [ _22 ])
> > > > > > > > > > >         (vec_duplicate:V2DI (reg/v/f:DI 99 [ d ])))
> > > > > > > > > > > "x.c":5:16 9345
> > > > > > {*vec_dupv2di}
> > > > > > > > > > >      (nil))
> > > > > > > > > > >
> > > > > > > > > > > ...
> > > > > > > > > > > (insn 29 12 15 2 (set (reg/v/f:DI 98 [ c ])
> > > > > > > > > > >         (const_int 0 [0])) "x.c":5:16 -1
> > > > > > > > > > >      (nil))
> > > > > > > > > > > ...
> > > > > > > > > > > (insn 20 19 21 2 (set (reg:V2DI 112 [ _20 ])
> > > > > > > > > > >         (vec_duplicate:V2DI (reg/v/f:DI 98 [ c ])))
> > > > > > > > > > > "x.c":5:16 9345
> > > > > > {*vec_dupv2di}
> > > > > > > > > > >      (nil))
> > > > > > > > > > >
> > > > > > > > > > > with
> > > > > > > > > > >
> > > > > > > > > > > (insn 18 17 19 2 (set (reg:V2DI 111 [ _22 ])
> > > > > > > > > > >         (reg:V2DI 114)) "x.c":5:16 2454 {movv2di_internal}
> > > > > > > > > > >      (nil))
> > > > > > > > > > >
> > > > > > > > > > > and
> > > > > > > > > > >
> > > > > > > > > > > (insn 20 19 21 2 (set (reg:V2DI 112 [ _20 ])
> > > > > > > > > > >         (reg:V2DI 114)) "x.c":5:16 2454 {movv2di_internal}
> > > > > > > > > > >      (nil))
> > > > > > > > > > >
> > > > > > > > > > > gcc/
> > > > > > > > > > >
> > > > > > > > > > > PR target/125239
> > > > > > > > > > > * config/i386/i386-features.cc 
> > > > > > > > > > > (ix86_place_single_vector_set):
> > > > > > > > > > > Check CONST0_RTX and CONSTM1_RTX for
> > X86_CSE_VEC_DUP.
> > > > > > > > > >
> > > > > > > > > > Can we detect it in ix86_broadcast_inner, set *kind_p to
> > > > > > X86_CSE_CONST0_VECTOR, instead of handle it in
> > > > > > ix86_place_single_vector_set.
> > > > > > > > >
> > > > > > > > > Done.  I am testing this patch.
> > > > > > > >
> > > > > > > > The condition should be
> > > > > > > >
> > > > > > > >   else if (CONST_VECTOR_P (src))
> > > > > > > >     {
> > > > > > > >       /* The only possible CONST_VECTORs of SRC are CONST0_RTX
> > and
> > > > > > > >          CONSTM1_RTX.  Otherwise,
> > > > > > > >
> > > > > > > >          rtx set = gen_rtx_SET (dest, src);
> > > > > > > >
> > > > > > > >          won't be a valid instruction.  */
> > > > > > > >       machine_mode mode = GET_MODE (dest);
> > > > > > > >       if (!((src == CONST0_RTX (mode)
> > > > > > > >              && load->kind == X86_CSE_CONST0_VECTOR)
> > > > > > > >             || (src == CONSTM1_RTX (mode)
> > > > > > > >                 && load->kind == X86_CSE_CONSTM1_VECTOR)))
> > > > > > > >         gcc_unreachable ();
> > > > > > >
> > > > > > > I think for CONST_VECTOR size > UNITS_PER_WORD, we now
> > > > constructed
> > > > > > the
> > > > > > > const_vector(line  and assigned it to broadcast_source)
> > > > > > >
> > > > > > > 4910                else
> > > > > > > 4911                  {
> > > > > > > 4912                    int nunits = GET_MODE_NUNITS (mode);
> > > > > > > 4913                    rtvec v = rtvec_alloc (nunits);
> > > > > > > 4914                    for (int j = 0; j < nunits ; j++)
> > > > > > > 4915                      RTVEC_ELT (v, j) = load->val;
> > > > > > > 4916                    broadcast_source = gen_rtx_CONST_VECTOR 
> > > > > > > (mode,
> > v);
> > > > > > > 4917                  }
> > > > > > >
> > > > > > > And it will be passed to ix86_place_single_vector_set and
> > > > > > > cause invalid insn, it's a bug that needs to be fixed.
> > > > > >
> > > > > > It shouldn't happen since
> > > > > >
> > > > > >           case X86_CSE_VEC_DUP:
> > > > > >             if (CONST_INT_P (load->val)
> > > > > >                 && (load->val == CONST0_RTX (load->mode)
> > > > > >                     || load->size <= UNITS_PER_WORD))
> > > > > >               {
> > > > > >                 /* Generate CONST_VECTOR load.  */
> > > > > >               case X86_CSE_CONST_VECTOR:
> > > > > >                 mode = ix86_get_vector_cse_mode (load->size,
> > > > > >                                                  load->mode);
> > > > > >
> > > > > >                 if (CONST_VECTOR_P (load->val))
> > > > > >                   broadcast_source = load->val;
> > > > > >                 else if (load->val == CONST0_RTX (load->mode))
> > > > > >                   broadcast_source = CONST0_RTX (mode);
> > > > > >                 else if (load->val == CONSTM1_RTX (load->mode))
> > > > > >                   broadcast_source = CONSTM1_RTX (mode);
> > > > > >                 else
> > > > > >                   {
> > > > > >                     int nunits = GET_MODE_NUNITS (mode);
> > > > > >                     rtvec v = rtvec_alloc (nunits);
> > > > > >                     for (int j = 0; j < nunits ; j++)
> > > > > >                       RTVEC_ELT (v, j) = load->val;
> > > > > >                     broadcast_source = gen_rtx_CONST_VECTOR (mode, 
> > > > > > v);
> > > > > >                   }
> > > > > >
> > > > > > For X86_CSE_VEC_DUP, it is either CONST0_RTX
> > > > > > or load->size <= UNITS_PER_WORD.   For X86_CSE_CONST_VECTOR,
> > > > > > only native CONST_VECTOR is allowed.
> > > > >
> > > > >
> > > > > I c.
> > > > > But the load could be NULL, if it's called from
> > > > remove_partial_avx_dependency?
> > > >
> > > > True.  The v3 patch has
> > > >
> > > >       machine_mode mode = GET_MODE (dest);
> > > >       if (!((src == CONST0_RTX (mode)
> > > >              && (!load || load->kind == X86_CSE_CONST0_VECTOR))
> > > >             || (src == CONSTM1_RTX (mode)
> > > >                 && (!load || load->kind ==
> > > > X86_CSE_CONSTM1_VECTOR))))
> > > I think for CONSTM1_RTX load should be always available, please also add
> > some comments for CONST0_RTX, mentioned it can be called from
> > remove_partial_avx_dependency with CONST0_RTX (V4SFmode) and load is
> > NULL.
> >
> > Like this?
> >
> >       /* The only possible CONST_VECTORs of SRC are CONST0_RTX and
> >          CONSTM1_RTX.  Otherwise,
> >
> >          rtx set = gen_rtx_SET (dest, src);
> >
> >          won't be a valid instruction.  When CONST0_RTX comes from
> >          remove_partial_avx_dependency, LOAD == NULL.  */
> >       machine_mode mode = GET_MODE (dest);
> >       if (!((src == CONST0_RTX (mode)
> >              && (!load || load->kind == X86_CSE_CONST0_VECTOR))
> >             || (src == CONSTM1_RTX (mode)
> >                 && load->kind == X86_CSE_CONSTM1_VECTOR)))
> >         gcc_unreachable ();
> >
>
> Yes, patch is preapproved if it passed bootstrapped and regtested.
>

Will do.  Thanks.



--
H.J.

Reply via email to