https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121947
            Bug ID: 121947
           Summary: Improve X86_TUNE_DEST_FALSE_DEP_FOR_GLC implementation
           Product: gcc
           Version: 15.2.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: hjl.tools at gmail dot com
                CC: crazylht at gmail dot com
  Target Milestone: ---

Most of X86_TUNE_DEST_FALSE_DEP_FOR_GLC is implemented in the insn output
code, as in:

(define_insn "<avx512>_<complexopname>_<mode><maskc_name><round_name>"
  [(set (match_operand:VHF_AVX512VL 0 "register_operand" "=&v")
        (unspec:VHF_AVX512VL
          [(match_operand:VHF_AVX512VL 1 "<round_nimm_predicate>" "<int_comm>v")
           (match_operand:VHF_AVX512VL 2 "<round_nimm_predicate>" "<round_constraint>")]
           UNSPEC_COMPLEX_F_C_MUL))]
  "TARGET_AVX512FP16 && <round_mode512bit_condition>"
{
  if (TARGET_DEST_FALSE_DEP_FOR_GLC && <maskc_dest_false_dep_for_glc_cond>)
    output_asm_insn ("vxorps\t%x0, %x0, %x0", operands);
  return "v<complexopname><ssemodesuffix>\t{<round_maskc_op3>%2, %1, %0<maskc_operand3>|%0<maskc_operand3>, %1, %2<round_maskc_op3>}";
}

This emits an extra vxorps before every such instruction. These patterns
could instead be implemented as a split before reload, with the x86_cse pass
run afterwards to remove all redundant vxorps instructions.