https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121947

            Bug ID: 121947
           Summary: Improve X86_TUNE_DEST_FALSE_DEP_FOR_GLC implementation
           Product: gcc
           Version: 15.2.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: hjl.tools at gmail dot com
                CC: crazylht at gmail dot com
  Target Milestone: ---

Most of X86_TUNE_DEST_FALSE_DEP_FOR_GLC is implemented in the insn output
code, as in this pattern:

(define_insn "<avx512>_<complexopname>_<mode><maskc_name><round_name>"
  [(set (match_operand:VHF_AVX512VL 0 "register_operand" "=&v")
      (unspec:VHF_AVX512VL
        [(match_operand:VHF_AVX512VL 1 "<round_nimm_predicate>" "<int_comm>v")
         (match_operand:VHF_AVX512VL 2 "<round_nimm_predicate>" "<round_constraint>")]
         UNSPEC_COMPLEX_F_C_MUL))]
  "TARGET_AVX512FP16 && <round_mode512bit_condition>"
{
  if (TARGET_DEST_FALSE_DEP_FOR_GLC
      && <maskc_dest_false_dep_for_glc_cond>)
    output_asm_insn ("vxorps\t%x0, %x0, %x0", operands);
  return "v<complexopname><ssemodesuffix>\t{<round_maskc_op3>%2, %1, %0<maskc_operand3>|%0<maskc_operand3>, %1, %2<round_maskc_op3>}";
}

This emits an extra vxorps before every such instruction.  The vxorps could
instead be generated by a split before reload, with the x86_cse pass run
after the split to remove all redundant vxorps instructions.
