On 29 Jun 2015, at 16:46, Alan Modra <amo...@gmail.com> wrote:

> On Thu, Jun 25, 2015 at 01:28:39PM +0100, Richard Earnshaw wrote:
>> Perhaps the best thing to do is to use the OUTER code to spot the
>> specific case where you've got a SET and return non-zero in that case.
> 
> That's exactly the path I've been following.  It's not as easy as it
> sounds..
> 
> First, some backends call rtx_cost from their targetm.rtx_costs.
> ix86_rtx_costs for instance has this
> 
>    case PLUS:
> ...
>             if (val == 2 || val == 4 || val == 8)
>               {
>                 *total = cost->lea;
>                 *total += rtx_cost (XEXP (XEXP (x, 0), 1),
>                                     outer_code, opno, speed);
>                 *total += rtx_cost (XEXP (XEXP (XEXP (x, 0), 0), 0),
>                                     outer_code, opno, speed);
>                 *total += rtx_cost (XEXP (x, 1), outer_code, opno, speed);
>                 return true;
>               }
> which, when using a non-zero register move cost, results in
> 
> Successfully matched this instruction:
> (set (reg:DI 198 [ D.74663 ])
>    (plus:DI (plus:DI (reg/v/f:DI 172 [ use_entry ])
>            (reg:DI 196 [ D.74662 ]))
>        (const_int -32 [0xffffffffffffffe0])))
> rejecting combination of insns 179 and 180
> original costs 6 + 4 = 10
> replacement cost 15
> 
> So here the x86 backend is calculating the cost of an lea, plus the
> cost of (reg:DI 196), plus the cost of (reg/v/f:DI 172), plus the cost
> of (const_int -32).  outer_code is SET.  That means we add two
> register moves, increasing the overall cost from 7 to 15.
> 
> The second problem I've hit is that fwprop.c:should_replace_address
> has this:
> 
>  /* If the addresses have equivalent cost, prefer the new address
>     if it has the highest `set_src_cost'.  That has the potential of
>     eliminating the most insns without additional costs, and it
>     is the same that cse.c used to do.  */
>  if (gain == 0)
>    gain = (set_src_cost (new_rtx, VOIDmode, speed)
>           - set_src_cost (old_rtx, VOIDmode, speed));
> 
>  return (gain > 0);
> 
> If register moves have the same cost as adding a small constant to a
> register, then this code no longer replaces a pseudo with its value as
> an offset from a base.  I think this particular problem can be fixed
> quite simply by "return gain >= 0;", but really, this code, like the
> x86 code, is expecting the cost of a register move to be zero.
> 
> You'll notice that these example problems are not trying to cost a
> whole instruction.  In both cases they want the cost of just a piece
> of an instruction, but rtx_cost is called in a way that is
> indistinguishable from other code that calls rtx_cost on whole
> register move instructions.
> 
> The real difficulty is in separating out the whole insn cases from the
> partial insn cases.
> 
> Note that we already have insn_rtx_cost, and it returns a minimum cost
> for a SET, so register move insns get a cost of 1 insn.  However,
> despite insn_rtx_cost starting life in combine.c, even combine doesn't
> use it in all whole insn cases.  :-(

Quite often, more complex (combine) insns have to be matched manually in
C/C++ code in order to implement the costs function.  To avoid that, maybe we
could have target-independent insn attributes that carry the costs?  That would
be much easier/faster (at least) for combine to look up, and also easier to
maintain in the backend.

It's also possible to implement that in a target-specific way: in the
costs function, construct a temporary fake insn, recog it, and look up the
attribute.  However, this pointlessly invokes recog twice, because by the time
combine asks for the insn costs it has already invoked recog.

Cheers,
Oleg
