Klaus Pedersen <proje...@gmail.com> writes:
> On Tue, May 29, 2012 at 6:55 AM, Vladimir Makarov <vmaka...@redhat.com> wrote:
>> On 05/28/2012 03:09 PM, Richard Sandiford wrote:
>>>
>>> Klaus Pedersen<proje...@gmail.com>  writes:
>>>>
>>>> The summary goes something like this:
>>>>
>>>> It is possible for the second pass of ira to get confused and decide that
>>>> NO_REGS or a hard float register are better choices for the result of the
>>>> 2 operand mult. First pass already optimally allocated in
>>>> GR_AND_MD1_REGS.
>>>
>>> Yeah.  I'm afraid this is something I've been sitting on for a while now.
>>>
>>> I think the only practical way of calculating accurate costs for things
>>> like GR_AND_MD_REGS really is to count the cost of the constituent classes
>>> and then take their MAX.
>>>
>>> Vlad, what do you think?  Is the above exclude_p code "just" a
>>> compile-time speed-up?
>>
>> Yes, I think so.  Every cost pass is very expensive and practically
>> proportional to the number of classes in consideration.
>>
>> Probably, excluding some classes was a bad solution to speed IRA up.  Or
>> maybe we need improvements to the pressure classes calculation.  As I
>> remember, I tried long ago to calculate IRA cover classes automatically and
>> it did not work.  Pressure classes calculation is analogous to the cover
>> classes calculation but it is less critical for register pressure sensitive
>> insn scheduling.
>
> As a test, I tried to search all: ira-exhausive-search.patch
>
> --- gcc-4.7-20120526-orig/gcc/ira-costs.c     2012-06-03 19:01:00.861129575 +0800
> +++ gcc-4.7-20120526/gcc/ira-costs.c  2012-06-03 19:01:16.854081473 +0800
> @@ -258,7 +258,7 @@ setup_regno_cost_classes_by_aclass (int
>        for (i = 0; i < ira_important_classes_num; i++)
>       {
>         cl = ira_important_classes[i];
> -       if (exclude_p)
> +
>           {
>             /* Exclude no-pressure classes which are subsets of
>                ACLASS.  */
>
> This didn't make any difference to the output (at least not with -mips1 and
> -O2). Probably my patch is not doing the right thing!

Yeah, the change I was talking about was effectively changing
"if (exclude_p)" to "if (0)", whereas the change above does the opposite.

It sounds from Vlad's compile-time measurements (thanks Vlad) that this
case is still important.  I was wondering whether we could record cases
where the best_cost calculated while working out the preferred class
doesn't match the cost actually recorded in the array.  I haven't had
a chance to try it yet though.

Probably we'd want something a bit smarter than that, since subclasses
of GENERAL_REGS that get combined through union would often have the
same cost as the union class.  The divergence would often only come
when merging classes for different register sets (although of course
IRA has no way of telling which those are).
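
To make the simple (not the smarter) version of that check concrete, here
is a minimal sketch written as a toy model rather than against ira-costs.c.
Every name in it (TOY_GR, toy_best_cost_diverges, the bitmask encoding of
classes) is hypothetical, and merging equal-cost classes into their union
is a simplifying assumption made for illustration:

```c
#include <stdbool.h>

/* Toy register classes encoded as bitmasks, so that the union of two
   classes is a bitwise OR.  Hypothetical names, not GCC's reg_class.  */
enum
{
  TOY_GR = 1,
  TOY_MD1 = 2,
  TOY_GR_AND_MD1 = TOY_GR | TOY_MD1,
  TOY_NUM = 4
};

/* RECORDED[c] is the cost stored for toy class C.  Work out the cheapest
   constituent class, merging classes of equal cost into their union
   (an assumed rule), and report whether the cost recorded for the
   resulting class differs from that best cost.  */
bool
toy_best_cost_diverges (const int recorded[TOY_NUM])
{
  int best_cost = recorded[TOY_GR];
  int best = TOY_GR;

  if (recorded[TOY_MD1] < best_cost)
    {
      best_cost = recorded[TOY_MD1];
      best = TOY_MD1;
    }
  else if (recorded[TOY_MD1] == best_cost)
    best |= TOY_MD1;

  return recorded[best] != best_cost;
}
```

In this model the check only fires when the two constituents tie and the
union class carries a different recorded cost of its own.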

> My original fix, which uses a sane cost for the ACC_REGS: gpr_acc_cost_3.patch

Why sane?  Transfers from and (especially) to HI and LO really are
expensive on many processors.  Obviously it'd be nice at some point to
make this legacy code take processor-specific costs into account, but...

> --- gcc-4.7-20120526-orig/gcc/config/mips/mips.c      2012-06-03 19:28:02.137960837 +0800
> +++ gcc-4.7-20120526/gcc/config/mips/mips.c   2012-06-03 19:31:12.587399458 +0800
> @@ -11258,7 +11258,7 @@ mips_move_to_gpr_cost (enum machine_mode
>
>      case ACC_REGS:
>        /* MFLO and MFHI.  */
> -      return 6;
> +      return 3;
>
>      case FP_REGS:
>        /* MFC1, etc.  */
> @@ -11294,7 +11294,7 @@ mips_move_from_gpr_cost (enum machine_mo
>
>      case ACC_REGS:
>        /* MTLO and MTHI.  */
> -      return 6;
> +      return 3;
>
>      case FP_REGS:
>        /* MTC1, etc.  */

...this says that it is better to use LO as scratch space than spilling
to memory -- and better by some margin -- which often isn't the case.

As Vlad says, the behaviour you're seeing with the second pass isn't
deliberate.  The costs calculated during the first pass are generally
what they're supposed to be.  In particular, the cost of MD1_REG is
already an accurate reflection of what I believe the costs above
were supposed to achieve.  The problem is that calculating the cost
of "GENERAL_REGS or MD1_REG" using solely the union class is fundamentally
going to give wrong results, for the reasons I mentioned earlier.
That's a general problem that isn't directly related to the choice
of costs (although of course artificially lowering the costs will
make the problem go away in more cases).
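
As a toy illustration of why the union class alone can mislead (a model
built on assumptions, not what ira-costs.c actually does): suppose that
costing the union class directly charges it the worse of its constituents
at every insn, whereas the alternative totals each constituent class
separately and then takes the MAX.  The per-insn costs and all helper
names below are made up for the example:

```c
/* Toy model only: hypothetical per-insn costs, not values taken from IRA.  */
#define TOY_N_INSNS 2

/* Cost of keeping one pseudo in a GPR, or in LO (MD1_REG), at each insn.
   The 6 mirrors the MFLO/MTLO cost in mips.c; the pattern is invented.  */
const int toy_gr_cost[TOY_N_INSNS] = { 6, 0 };
const int toy_md1_cost[TOY_N_INSNS] = { 0, 6 };

int
toy_max (int a, int b)
{
  return a > b ? a : b;
}

/* Total cost of one class over all insns.  */
int
toy_class_total (const int *cost, int n)
{
  int total = 0;
  for (int i = 0; i < n; i++)
    total += cost[i];
  return total;
}

/* Union costed directly: the worse constituent is charged at every insn
   (an assumption of this model).  */
int
toy_union_direct (const int *a, const int *b, int n)
{
  int total = 0;
  for (int i = 0; i < n; i++)
    total += toy_max (a[i], b[i]);
  return total;
}

/* Union costed from its constituents: total each class, then take MAX.  */
int
toy_union_from_parts (const int *a, const int *b, int n)
{
  return toy_max (toy_class_total (a, n), toy_class_total (b, n));
}
```

With these numbers the direct estimate is 12 and the constituent-based one
is 6, so a hypothetical memory cost of 8 would look cheaper than the union
class only under the direct estimate.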

Richard
