On 3/24/19 11:42 AM, Segher Boessenkool wrote:
> Hi!
> 
> On Fri, Mar 22, 2019 at 12:16:30PM -0600, Jeff Law wrote:
>> So I finally started looking at the fpr-moves regression in this BZ.  No
>> surprise this is further fallout from the combiner changes.
> 
> It *exposed* the problem, yes :-)
Well, it was one of regressions that were a result of that patch.  I'm a
bit disappointed in how many were just punted rather than really
analyzed and either mitigated or explicitly moved out to gcc-10 because
mitigation was too painful with marginal benefits to do at this stage.


> 
>> One might reasonably ask if combine's avoidance of
>> hard regs should be loosened for consecutive insns -- combining
>> consecutive insns isn't going to increase the live range of these hard
>> regs.
> 
> But that is not the only reason we don't want to forward hard registers.
> One of the important reasons is correctness: we could create situations
> that reload/LRA cannot handle (cannot *possibly* handle).  Another reason
> is that combine should not try to do register allocation's job: we on
> average get better code after the combine hard reg change.
No doubt.  My point was that we can and should have been looking for
ways to mitigate the fallout.  Which in my mind includes further
investigation of whether or not we've hit the right balance for what
should and should not be combined.

> 
> (And what are consecutive insns, anyway?  This is before scheduling.)
The insns at combine time looked like:

> (insn 13 8 5 2 (set (reg:TF 196)
>         (reg:TF 44 $f12 [ d ])) "j.c":7:1 -1
>      (expr_list:REG_DEAD (reg:TF 44 $f12 [ d ])
>         (nil)))
> (note 5 13 14 2 NOTE_INSN_DELETED)
> (insn 14 5 6 2 (set (reg:DI 197)
>         (reg:DI 6 $6 [ x ])) "j.c":7:1 -1
>      (expr_list:REG_DEAD (reg:DI 6 $6 [ x ])
>         (nil)))
> (note 6 14 7 2 NOTE_INSN_DELETED)
> (note 7 6 10 2 NOTE_INSN_FUNCTION_BEG)
> (insn 10 7 0 2 (set (mem:TF (reg:DI 197) [1 *x_2(D)+0 S16 A128])
>         (reg:TF 196)) "j.c":8:6 376 {*movtf}
>      (expr_list:REG_DEAD (reg:DI 197)
>         (expr_list:REG_DEAD (reg:TF 196)
>             (nil))))


Without reviewing the thread which lead to the combine.c change I can't
say offhand if there's something we can/should relax in combine to
handle this better.  My point was that it should have been more
thoroughly investigated and mitigations attempted once the regressions
were noted.


> 
>> We end up allocating (reg 196) into a GPR.  It's "cheaper".
> 
> Is this a problem in itself already?  (I don't know MIPS terribly well).
It may be.  While I've done extensive MIPS work in the past, it predates
the TF/TI stuff in there so I don't have a sense of relative costs for
the TF/TI stuff.  It certainly looked fishy when I first looked at the
costs.  But I wasn't up for tackling that right now either :-)  It's an
area Vlad continues to investigate (hard reg costing is a known weak
spot for IRA/LRA).


>> So I'm just punting.  In the MIPS splitter, we can peek ahead one real
>> insn and try to forward propagate the source operand at split time.  We
>> still generate the split insns as well.  So after post-reload splitting
>> we have something like this:
> 
>> Note how we split the second move too and the source operand is now
>> $f12.    insns 17 and 18 will get removed as dead by DCE and we
>> ultimately get the desired code.
> 
> For splitters after reload you have to do a lot of work manually, to get
> reasonable code.  This is a problem everywhere :-(
Yup.  In fact, we've seen multiple BZs this cycle where a DCE pass right
before/after splitting would help.  In both cases we ended up hacking up
the splitters a bit.  It's one of the items I think we'll want to
revisit during gcc-10 stage1 development.

> 
> An obvious fix is to split *before* reload, too, but that doesn't work for
> those few splitters that depend on RA results.
Right.

jeff

Reply via email to