https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114766

            Bug ID: 114766
           Summary: ^ constraint modifier unexpectedly affects register
                    class selection.
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: rtl-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: tnfchris at gcc dot gnu.org
                CC: vmakarov at gcc dot gnu.org
  Target Milestone: ---

The documentation for ^ states:

"This constraint is analogous to ‘?’ but it disparages slightly the alternative
only if the operand with the ‘^’ needs a reload."

From this we gathered that there is only a slight cost, and only when the given
operand requires a reload.
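
To make that reading concrete, here is a toy model of how we understood the
documented behaviour. It is not GCC code: the struct, the function name and the
one-unit weight for '^' are assumptions made for illustration (the one-unit
weight for '?' follows the wording of the '?' documentation). It only counts
disparagement and says nothing about register class selection.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one operand within one alternative (not a GCC type). */
struct operand_model {
    char modifier;      /* '?', '^', or 0 when no modifier is present */
    bool needs_reload;  /* would this operand need a reload here? */
};

/* Disparagement of an alternative under our reading of the documentation:
   every '?' costs one unit unconditionally, while a '^' costs one unit
   only when its own operand needs a reload (unit weight is an assumption). */
static int disparagement(const struct operand_model *ops, size_t n)
{
    assert(ops != NULL || n == 0);

    int cost = 0;
    for (size_t i = 0; i < n; i++) {
        if (ops[i].modifier == '?')
            cost += 1;
        else if (ops[i].modifier == '^' && ops[i].needs_reload)
            cost += 1;
    }
    return cost;
}
```

Under this model an alternative whose '^' operand is already in a suitable
register has a disparagement of 0, which is why we did not expect the modifier
to influence anything when no reload is involved.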

In PR114741 we had a regression when using this modifier because it seems to
unexpectedly also change register class selection during coloring.

The pattern was:

(define_insn "<optab><mode>3"
  [(set (match_operand:GPI 0 "register_operand")
        (LOGICAL:GPI (match_operand:GPI 1 "register_operand")
                     (match_operand:GPI 2 "aarch64_logical_operand")))]
  ""
  {@ [ cons: =0 , 1  , 2        ; attrs: type , arch  ]
     [ r        , %r , r        ; logic_reg   , *     ] <logical>\t%<w>0, %<w>1, %<w>2
     [ rk       , ^r , <lconst> ; logic_imm   , *     ] <logical>\t%<w>0, %<w>1, %2
     [ w        , 0  , <lconst> ; *           , sve   ] <logical>\t%Z0.<s>, %Z0.<s>, #%2
     [ w        , w  , w        ; neon_logic  , simd  ] <logical>\t%0.<Vbtype>, %1.<Vbtype>, %2.<Vbtype>
  }
)

where we wanted to prefer the r->r alternative unless a reload is needed, in
which case we would prefer the w->w one.

But in the simple example of:

    void foo(unsigned v, unsigned *p)
    {
        *p = v & 1;
    }

where the operation can be done with either r->r or w->w, but w->w needs an
extra mov, it does not pick r->r.

This is because, during sched1, the penalty applied to the ^ alternative means
GP regs are no longer considered:

        r106: preferred FP_REGS, alternative NO_REGS, allocno FP_REGS
    ;;        3--> b  0: i   9 r106=r105&0x1
        :cortex_a53_slot_any:GENERAL_REGS+0(-1)FP_REGS+1(1)PR_LO_REGS+0(0)
                             PR_HI_REGS+0(0):model 4

The penalty here seems incorrect, and removing it seems to make the constraint
work as documented.
So the question is: is this a bug, are we using the modifier incorrectly, or is
it a documentation bug?
