> -----Original Message-----
> From: H.J. Lu <[email protected]>
> Sent: Saturday, May 9, 2026 7:57 AM
> To: GCC Patches <[email protected]>; Uros Bizjak
> <[email protected]>; Liu, Hongtao <[email protected]>
> Subject: [PATCH] x86_cse: Check CONST0_RTX and CONSTM1_RTX for
> X86_CSE_VEC_DUP
>
> Check CONST0_RTX and CONSTM1_RTX when placing
>
> (insn 32 2 7 2 (set (reg:V2DI 114)
>         (const_vector:V2DI [
>                 (const_int 0 [0]) repeated x2
>             ])) -1
>      (nil))
>
> after
>
> (note 2 3 32 2 NOTE_INSN_FUNCTION_BEG)
>
> for X86_CSE_VEC_DUP, not X86_CSE_CONST0_VECTOR or
> X86_CSE_CONSTM1_VECTOR, after replacing redundant vector loads:
>
> (insn 31 15 16 2 (set (reg/v/f:DI 99 [ d ])
>         (const_int 0 [0])) "x.c":5:16 -1
>      (nil))
> ...
> (insn 18 17 19 2 (set (reg:V2DI 111 [ _22 ])
>         (vec_duplicate:V2DI (reg/v/f:DI 99 [ d ]))) "x.c":5:16 9345
> {*vec_dupv2di}
>      (nil))
>
> ...
> (insn 29 12 15 2 (set (reg/v/f:DI 98 [ c ])
>         (const_int 0 [0])) "x.c":5:16 -1
>      (nil))
> ...
> (insn 20 19 21 2 (set (reg:V2DI 112 [ _20 ])
>         (vec_duplicate:V2DI (reg/v/f:DI 98 [ c ]))) "x.c":5:16 9345
> {*vec_dupv2di}
>      (nil))
>
> with
>
> (insn 18 17 19 2 (set (reg:V2DI 111 [ _22 ])
>         (reg:V2DI 114)) "x.c":5:16 2454 {movv2di_internal}
>      (nil))
>
> and
>
> (insn 20 19 21 2 (set (reg:V2DI 112 [ _20 ])
>         (reg:V2DI 114)) "x.c":5:16 2454 {movv2di_internal}
>      (nil))
>
> gcc/
>
> 	PR target/125239
> 	* config/i386/i386-features.cc (ix86_place_single_vector_set):
> 	Check CONST0_RTX and CONSTM1_RTX for X86_CSE_VEC_DUP.

Can we detect this in ix86_broadcast_inner and set *kind_p to
X86_CSE_CONST0_VECTOR, instead of handling it in
ix86_place_single_vector_set?

Also, I wonder why pass_combine (or fwprop) doesn't catch this missed
optimization. A set from CONST0_VECTOR should be cheaper than one from
vec_duplicate.

>
> gcc/testsuite/
>
> 	PR target/125239
> 	* gcc.target/i386/pr125239.c: New test.
>
>
> --
> H.J.
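
To make the suggestion about classifying the broadcast early more concrete,
here is a rough standalone sketch. It is only a toy model, not GCC code: the
toy_* enum, struct, and function are hypothetical names made up for
illustration, and it assumes an integer element mode where a scalar value of
-1 corresponds to an all-ones vector. The idea is that a broadcast of a known
0 or -1 gets tagged as the constant-vector kind when it is first recognized,
so the later placement code never sees it as a plain vec_duplicate.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the x86_cse_kind values named in the
   thread; these are not the real GCC definitions.  */
enum toy_cse_kind
{
  TOY_CSE_CONST0_VECTOR,
  TOY_CSE_CONSTM1_VECTOR,
  TOY_CSE_VEC_DUP
};

/* Toy model of the scalar being broadcast: either a known integer
   constant or an unknown register value.  */
struct toy_scalar
{
  bool is_const;   /* True if VALUE is a known compile-time constant.  */
  int64_t value;   /* Only meaningful when IS_CONST is true.  */
};

/* Classify a broadcast of SRC.  Known 0 and -1 are mapped to the
   constant-vector kinds; everything else stays a generic duplicate.
   Assumes an integer element mode, where -1 means all bits set.  */
static enum toy_cse_kind
toy_classify_broadcast (struct toy_scalar src)
{
  if (src.is_const && src.value == 0)
    return TOY_CSE_CONST0_VECTOR;
  if (src.is_const && src.value == -1)
    return TOY_CSE_CONSTM1_VECTOR;
  return TOY_CSE_VEC_DUP;
}
```

In the example from the patch description, both broadcast sources are set
from (const_int 0), so a classifier along these lines would report the
constant-zero kind for them rather than the generic duplicate kind.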
