On Sat, May 9, 2026 at 3:58 PM Liu, Hongtao <[email protected]> wrote: > > > > > -----Original Message----- > > From: H.J. Lu <[email protected]> > > Sent: Saturday, May 9, 2026 3:55 PM > > To: Liu, Hongtao <[email protected]> > > Cc: Hongtao Liu <[email protected]>; GCC Patches <gcc- > > [email protected]>; Uros Bizjak <[email protected]> > > Subject: Re: [PATCH] x86_cse: Check CONST0_RTX and CONSTM1_RTX for > > X86_CSE_VEC_DUP > > > > On Sat, May 9, 2026 at 3:33 PM Liu, Hongtao <[email protected]> > > wrote: > > > > > > > > > > > > > -----Original Message----- > > > > From: H.J. Lu <[email protected]> > > > > Sent: Saturday, May 9, 2026 3:29 PM > > > > To: Liu, Hongtao <[email protected]> > > > > Cc: Hongtao Liu <[email protected]>; GCC Patches <gcc- > > > > [email protected]>; Uros Bizjak <[email protected]> > > > > Subject: Re: [PATCH] x86_cse: Check CONST0_RTX and CONSTM1_RTX for > > > > X86_CSE_VEC_DUP > > > > > > > > On Sat, May 9, 2026 at 3:24 PM Liu, Hongtao <[email protected]> > > > > wrote: > > > > > > > > > > > > > > > > > > > > > -----Original Message----- > > > > > > From: H.J. Lu <[email protected]> > > > > > > Sent: Saturday, May 9, 2026 3:14 PM > > > > > > To: Hongtao Liu <[email protected]> > > > > > > Cc: Liu, Hongtao <[email protected]>; GCC Patches <gcc- > > > > > > [email protected]>; Uros Bizjak <[email protected]> > > > > > > Subject: Re: [PATCH] x86_cse: Check CONST0_RTX and CONSTM1_RTX > > > > > > for X86_CSE_VEC_DUP > > > > > > > > > > > > On Sat, May 9, 2026 at 3:04 PM Hongtao Liu <[email protected]> > > > > wrote: > > > > > > > > > > > > > > On Sat, May 9, 2026 at 10:28 AM H.J. Lu <[email protected]> > > wrote: > > > > > > > > > > > > > > > > On Sat, May 9, 2026 at 10:20 AM H.J. Lu <[email protected]> > > wrote: > > > > > > > > > > > > > > > > > > On Sat, May 9, 2026 at 9:10 AM Liu, Hongtao > > > > > > > > > <[email protected]> > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -----Original Message----- > > > > > > > > > > > From: H.J. Lu <[email protected]> > > > > > > > > > > > Sent: Saturday, May 9, 2026 7:57 AM > > > > > > > > > > > To: GCC Patches <[email protected]>; Uros Bizjak > > > > > > > > > > > <[email protected]>; Liu, Hongtao > > > > > > > > > > > <[email protected]> > > > > > > > > > > > Subject: [PATCH] x86_cse: Check CONST0_RTX and > > > > > > > > > > > CONSTM1_RTX for X86_CSE_VEC_DUP > > > > > > > > > > > > > > > > > > > > > > Check CONST0_RTX and CONSTM1_RTX when placing > > > > > > > > > > > > > > > > > > > > > > (insn 32 2 7 2 (set (reg:V2DI 114) > > > > > > > > > > > (const_vector:V2DI [ > > > > > > > > > > > (const_int 0 [0]) repeated x2 > > > > > > > > > > > ])) -1 > > > > > > > > > > > (nil)) > > > > > > > > > > > > > > > > > > > > > > after > > > > > > > > > > > > > > > > > > > > > > (note 2 3 32 2 NOTE_INSN_FUNCTION_BEG) > > > > > > > > > > > > > > > > > > > > > > for X86_CSE_VEC_DUP, not X86_CSE_CONST0_VECTOR or > > > > > > > > > > > X86_CSE_CONSTM1_VECTOR, after replacing redundant > > > > > > > > > > > vector > > > > loads: > > > > > > > > > > > > > > > > > > > > > > (insn 31 15 16 2 (set (reg/v/f:DI 99 [ d ]) > > > > > > > > > > > (const_int 0 [0])) "x.c":5:16 -1 > > > > > > > > > > > (nil)) > > > > > > > > > > > ... > > > > > > > > > > > (insn 18 17 19 2 (set (reg:V2DI 111 [ _22 ]) > > > > > > > > > > > (vec_duplicate:V2DI (reg/v/f:DI 99 [ d ]))) > > > > > > > > > > > "x.c":5:16 9345 > > > > > > {*vec_dupv2di} > > > > > > > > > > > (nil)) > > > > > > > > > > > > > > > > > > > > > > ... > > > > > > > > > > > (insn 29 12 15 2 (set (reg/v/f:DI 98 [ c ]) > > > > > > > > > > > (const_int 0 [0])) "x.c":5:16 -1 > > > > > > > > > > > (nil)) > > > > > > > > > > > ... > > > > > > > > > > > (insn 20 19 21 2 (set (reg:V2DI 112 [ _20 ]) > > > > > > > > > > > (vec_duplicate:V2DI (reg/v/f:DI 98 [ c ]))) > > > > > > > > > > > "x.c":5:16 9345 > > > > > > {*vec_dupv2di} > > > > > > > > > > > (nil)) > > > > > > > > > > > > > > > > > > > > > > with > > > > > > > > > > > > > > > > > > > > > > (insn 18 17 19 2 (set (reg:V2DI 111 [ _22 ]) > > > > > > > > > > > (reg:V2DI 114)) "x.c":5:16 2454 {movv2di_internal} > > > > > > > > > > > (nil)) > > > > > > > > > > > > > > > > > > > > > > and > > > > > > > > > > > > > > > > > > > > > > (insn 20 19 21 2 (set (reg:V2DI 112 [ _20 ]) > > > > > > > > > > > (reg:V2DI 114)) "x.c":5:16 2454 {movv2di_internal} > > > > > > > > > > > (nil)) > > > > > > > > > > > > > > > > > > > > > > gcc/ > > > > > > > > > > > > > > > > > > > > > > PR target/125239 > > > > > > > > > > > * config/i386/i386-features.cc > > > > > > > > > > > (ix86_place_single_vector_set): > > > > > > > > > > > Check CONST0_RTX and CONSTM1_RTX for > > X86_CSE_VEC_DUP. > > > > > > > > > > > > > > > > > > > > Can we detect it in ix86_broadcast_inner, set *kind_p to > > > > > > X86_CSE_CONST0_VECTOR, instead of handle it in > > > > > > ix86_place_single_vector_set. > > > > > > > > > > > > > > > > > > Done. I am testing this patch. > > > > > > > > > > > > > > > > The condition should be > > > > > > > > > > > > > > > > else if (CONST_VECTOR_P (src)) > > > > > > > > { > > > > > > > > /* The only possible CONST_VECTORs of SRC are CONST0_RTX > > and > > > > > > > > CONSTM1_RTX. Otherwise, > > > > > > > > > > > > > > > > rtx set = gen_rtx_SET (dest, src); > > > > > > > > > > > > > > > > won't be a valid instruction. */ > > > > > > > > machine_mode mode = GET_MODE (dest); > > > > > > > > if (!((src == CONST0_RTX (mode) > > > > > > > > && load->kind == X86_CSE_CONST0_VECTOR) > > > > > > > > || (src == CONSTM1_RTX (mode) > > > > > > > > && load->kind == X86_CSE_CONSTM1_VECTOR))) > > > > > > > > gcc_unreachable (); > > > > > > > > > > > > > > I think for CONST_VECTOR size > UNITS_PER_WORD, we now > > > > constructed > > > > > > the > > > > > > > const_vector(line and assigned it to broadcast_source) > > > > > > > > > > > > > > 4910 else > > > > > > > 4911 { > > > > > > > 4912 int nunits = GET_MODE_NUNITS (mode); > > > > > > > 4913 rtvec v = rtvec_alloc (nunits); > > > > > > > 4914 for (int j = 0; j < nunits ; j++) > > > > > > > 4915 RTVEC_ELT (v, j) = load->val; > > > > > > > 4916 broadcast_source = gen_rtx_CONST_VECTOR > > > > > > > (mode, > > v); > > > > > > > 4917 } > > > > > > > > > > > > > > And it will be passed to ix86_place_single_vector_set and > > > > > > > cause invalid insn, it's a bug that needs to be fixed. > > > > > > > > > > > > It shouldn't happen since > > > > > > > > > > > > case X86_CSE_VEC_DUP: > > > > > > if (CONST_INT_P (load->val) > > > > > > && (load->val == CONST0_RTX (load->mode) > > > > > > || load->size <= UNITS_PER_WORD)) > > > > > > { > > > > > > /* Generate CONST_VECTOR load. */ > > > > > > case X86_CSE_CONST_VECTOR: > > > > > > mode = ix86_get_vector_cse_mode (load->size, > > > > > > load->mode); > > > > > > > > > > > > if (CONST_VECTOR_P (load->val)) > > > > > > broadcast_source = load->val; > > > > > > else if (load->val == CONST0_RTX (load->mode)) > > > > > > broadcast_source = CONST0_RTX (mode); > > > > > > else if (load->val == CONSTM1_RTX (load->mode)) > > > > > > broadcast_source = CONSTM1_RTX (mode); > > > > > > else > > > > > > { > > > > > > int nunits = GET_MODE_NUNITS (mode); > > > > > > rtvec v = rtvec_alloc (nunits); > > > > > > for (int j = 0; j < nunits ; j++) > > > > > > RTVEC_ELT (v, j) = load->val; > > > > > > broadcast_source = gen_rtx_CONST_VECTOR (mode, > > > > > > v); > > > > > > } > > > > > > > > > > > > For X86_CSE_VEC_DUP, it is either CONST0_RTX > > > > > > or load->size <= UNITS_PER_WORD. For X86_CSE_CONST_VECTOR, > > > > > > only native CONST_VECTOR is allowed. > > > > > > > > > > > > > > > I c. > > > > > But the load could be NULL, if it's called from > > > > remove_partial_avx_dependency? > > > > > > > > True. The v3 patch has > > > > > > > > machine_mode mode = GET_MODE (dest); > > > > if (!((src == CONST0_RTX (mode) > > > > && (!load || load->kind == X86_CSE_CONST0_VECTOR)) > > > > || (src == CONSTM1_RTX (mode) > > > > && (!load || load->kind == > > > > X86_CSE_CONSTM1_VECTOR)))) > > > I think for CONSTM1_RTX load should be always available, please also add > > some comments for CONST0_RTX, mentioned it can be called from > > remove_partial_avx_dependency with CONST0_RTX (V4SFmode) and load is > > NULL. > > > > Like this? > > > > /* The only possible CONST_VECTORs of SRC are CONST0_RTX and > > CONSTM1_RTX. Otherwise, > > > > rtx set = gen_rtx_SET (dest, src); > > > > won't be a valid instruction. When CONST0_RTX comes from > > remove_partial_avx_dependency, LOAD == NULL. */ > > machine_mode mode = GET_MODE (dest); > > if (!((src == CONST0_RTX (mode) > > && (!load || load->kind == X86_CSE_CONST0_VECTOR)) > > || (src == CONSTM1_RTX (mode) > > && load->kind == X86_CSE_CONSTM1_VECTOR))) > > gcc_unreachable (); > > > > Yes, patch is preapproved if it passed bootstrapped and regtested. >
Will do. Thanks. -- H.J.
