On 04/10/2012 09:35 AM, Richard Sandiford wrote:
Hi Vlad,

Back in December, when we were still very much in stage 3, I sent
an RFC about an alternative implementation of -fsched-pressure.
Just wanted to send a reminder now that we're in the proper stage:

http://gcc.gnu.org/ml/gcc-patches/2011-12/msg01684.html

Ulrich has benchmarked it on ARM, S/390 and Power7 (thanks), and got
reasonable results.  (I mentioned bad Power 7 results in that message,
because of the way the VSX_REGS class is handled.  Ulrich's results
are without -mvsx though.)

The condition I originally set myself was that this patch should only
go in if it becomes the default on at least one architecture,
specifically ARM.  Ulrich tells me that Linaro have now made it
the default for ARM in their GCC 4.7 release, so hopefully Ramana
would be OK with doing the same in upstream 4.8.

I realise the whole thing is probably more complicated and ad-hoc
than you'd like.  Saying it can't go in is a perfectly acceptable
answer IMO.

I have mixed feelings about the patch. I've tried it on SPEC2000 on x86/x86-64 and ARM. The model algorithm generates bigger code, by up to 3.5% (SPECFP on x86), 2% (SPECFP on x86-64), and 0.23% (SPECFP on ARM), in comparison with the current algorithm. The compiler is slower too. Although the difference is quite insignificant on Core i7, the compile-time slowdown reaches 0.4% on SPECFP2000 on ARM. The algorithm also generates slower code on x86 (1.5% on SPECINT and 5% on SPECFP2000) and practically the same code on average on x86-64 and ARM (I've tried only SPECINT on ARM).

On the other hand, I don't think that first insn scheduling will ever be used for x86. And although the SPECFP2000 rate is the same on x86-64, I saw that some SPECFP2000 tests benefit from your algorithm on x86-64 (one amazing difference is a 70% improvement on swim on x86-64, although it might be due to other reasons such as alignment or cache behaviour). So I think the algorithm might work better on processors with more registers.

I guess it is OK to submit this work to the trunk. It may be useful for achieving better performance on specific tests, or it could be made the default for some targets if it is proven to be better.

As for the patch itself, I think you should document the option in doc/invoke.texi; the documentation is missing. Another minor mistake I found is one line of garbage (I guess from -fira-algorithm) in the description of -fsched-pressure-algorithm in common.opt.

Thanks, Richard.

By the way, the code in the scheduler has changed since you made the patch, so you will need to do some merging first.
