Pip Cet writes:

> On Sun, Aug 16, 2020 at 12:50 AM Segher Boessenkool
> <seg...@kernel.crashing.org> wrote:
>> On Sat, Aug 15, 2020 at 10:18:27AM +0000, Pip Cet wrote:
>> > > > What I'm currently doing is this:
>> > > >
>> > > > (define_split
>> > > >   [(set (match_operand 0 "nonimmediate_operand")
>> > > >     (match_operand 1 "nox_general_operand"))]
>> > > >   "reload_completed && mov_clobbers_cc (insn, operands)"
>> > > >   [(parallel [(set (match_dup 0) (match_dup 1))
>> > > >           (clobber (reg:CC REG_CC))])])
>> > > >
>> > > > which, if I understand correctly, introduces the parallel-clobber
>> > > > expression only when necessary, at the same time as post-reload
>> > > > splitters introduce REG_CC references. Is that correct?
>> > >
>> > > Yes.  And this will work correctly if somehow you ensure that REG_CC
>> > > isn't live anywhere you introduce such a clobber.
>> >
>> > IIUC, the idea is that references to REG_CC, except for clobbers, are
>> > only introduced in the post-reload split pass, so it cannot be live
>> > before our define_split runs. Does that make sense?
>>
>> Yes, it does.  It has some huge restrictions (using the reg in inline
>> assembler can never work reliably, for example, so you'll have to
>> disallow that).
>
> Is there any approach that doesn't suffer from that problem? My
> understanding was that we need to allow reload to insert CC-clobbering
> insns on this (and many other) architectures, and that there are so
> many places where reload might choose to do so (including before and
> after inline asm) that using the register prior to reload just isn't
> possible. I'd be glad to be wrong, though :-)
>
> Is it true that reload chooses which constraint alternative is used
> for each insn? Is that information available somehow to post-reload
> splits? The problem is that the "X" constraint matches whatever's
> there already, and doesn't replace it with the (scratch:CC) expression
> that would work, so I can't rewrite those insns not to clobber the CC
> reg.
>
> For example, here's what I currently have:
>
> (define_expand "mov<mode>"
>   [(parallel [(set (match_operand:MOVMODE 0 "nonimmediate_operand" "")
>            (match_operand:MOVMODE 1 "general_operand" ""))
>           (clobber (reg:CC REG_CC))])]
>   ...)
>
> (define_insn "mov<mode>_insn"
>    [(set (match_operand:ALL1 0 "nonimmediate_operand" "=r,r  ,d    ,Qm
>   ,r ,q,r,*r")
>         (match_operand:ALL1 1 "nox_general_operand"   "r,Y00,n Ynn,r
> Y00,Qm,r,q,i"))
>    (clobber (match_scratch:CC 2 "=X,c,X,c,c,X,X,c"))]
>   ...)
>
> That works, but it results in an incorrect CC clobber for, say,
> register-to-register movqi. For that, I'd need something like
>
> (define_split
>   [(parallel [(set (match_operand:ALL1 0 "nonimmediate_operand")
>              (match_operand:ALL1 1 "nox_general_operand"))
>           (clobber (reg:CC REG_CC))])]
>   "reload_completed && REG_P (operands[0]) && REG_P (operands[1])"
>   [(parallel [(set (match_dup 0)
>            (match_dup 1))
>           (clobber (scratch:CC))])])
>
> and so on, for all four constraint alternatives that don't actually
> clobber REG_CC (and also for a fifth which only rarely clobbers
> REG_CC). That's duplicated code, of course.

The (setattr "cc" ...) that is currently present for all
patterns accounts for the constraint alternatives,so  using
get_attr_cc to split to a (clobber) of either cc_reg_rtx or a
gen_rtx_SCRATCH (CCmode) appears to work.

(define_insn_and_split "mov<mode>_insn"
  [(set (match_operand:ALL1 0 "nonimmediate_operand" "=r    ,d    ,Qm   ,r 
,q,r,*r")
        (match_operand:ALL1 1 "nox_general_operand"   "r Y00,n Ynn,r 
Y00,Qm,r,q,i"))]
  "register_operand (operands[0], <MODE>mode)
    || reg_or_0_operand (operands[1], <MODE>mode)"
  "#"
  "&& reload_completed"
  [(parallel [(set (match_dup 0)
                   (match_dup 1))
              (clobber (match_dup 2))])]
  {
    operands[2] = get_cc_reg_clobber_rtx (curr_insn);
  }
  [(set_attr "cc" "ldi,none,clobber,clobber,none,none,clobber")])

(define_insn "*mov<mode>_insn"
  [(set (match_operand:ALL1 0 "nonimmediate_operand" "=r    ,d    ,Qm   ,r 
,q,r,*r")
        (match_operand:ALL1 1 "nox_general_operand"   "r Y00,n Ynn,r 
Y00,Qm,r,q,i"))
   (clobber (match_scratch:CC 2                      "=X    ,X     ,c   ,c 
,X,X,c"))]
  "(register_operand (operands[0], <MODE>mode)
    || reg_or_0_operand (operands[1], <MODE>mode))
   && reload_completed"
  {
    return output_movqi (insn, operands, NULL);
  }
  [(set_attr "length" "1,1,5,5,1,1,4")
   (set_attr "adjust_len" "mov8")])

As you mentioned elsewhere, keeping the clobber with "X" helps keep a
single pattern for both CC reg clobbering and non-clobbering insns.


>
> All this is at least somewhat academic: the code produced isn't
> drastically better after my cc conversion, but it's not usually any
> worse, and I'm still looking at assembler examples that are pessimized
> a little to find out how to fix them...
>
>> And earlier passes (like combine) cannot optimise this,
>
> I'm not sure what "this" refers to: my understanding is that the idea
> is to let combine see CC-free code, or code with CC clobbers only,
> which it can then optimize, and only add "real" CC references after
> reload, so many optimizations should just work.
>
>> But it is a pretty straightforward way to move from CC0 to the
>> modern world!
>
> With the limitations of the reload pass being as they are, I don't
> really see a dramatically better alternative. I suppose we could
> explicitly save and restore CC flags around insns emitted when
> reload_in_progress is true, to simulate non-CC-clobbering add/mov insn
> patterns? That sounds like it would reduce code quality a lot, unless
> great care were taken to make sure all of the save/restore CC flags
> insns were optimized away in later passes.
>
> In any case, thanks for the answers so far!
>
> Pip

Reply via email to