On Mon, Jun 19, 2017 at 12:40:53PM -0500, Segher Boessenkool wrote:
> On Mon, Jun 19, 2017 at 05:01:10PM +0100, Richard Earnshaw (lists) wrote:
> > Yeah, and I'm not suggesting we change the logic there (sorry if the
> > description was misleading). Instead I'm proposing that we handle more
> > cases for parallels to not return zero.
>
> Right. My test run is half way through, will have results later --
> your change looks good to me, but it is always surprising whether
> better costs help or not, or even *hurt* good code generation (things
> are just too tightly tuned to the current behaviour, so some things
> may need retuning).

Everything built successfully (31 targets). I configured with
--enable-checking=yes,rtl,tree, so it took a while, sorry.

The targets with any differences (table shows code size):

                 old     patched
arm         11545709    11545797
powerpc      8442762     8442746
x86_64      10627428    10627363
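
For scale, the per-target change can be worked out from the table with a
few lines of Python. This is only a sketch: the numbers are copied from
the table, and the names `sizes` and `size_delta` are made up for
illustration.

```python
# Code sizes copied from the table above: target -> (old, patched).
sizes = {
    "arm": (11545709, 11545797),
    "powerpc": (8442762, 8442746),
    "x86_64": (10627428, 10627363),
}


def size_delta(old, patched):
    """Return (absolute change, relative change in percent)."""
    diff = patched - old
    return diff, 100.0 * diff / old


# Hypothetical helper output: target -> (absolute, percent) change.
deltas = {target: size_delta(old, new) for target, (old, new) in sizes.items()}

for target, (diff, pct) in deltas.items():
    print(f"{target:8s} {diff:+5d} {pct:+.5f}%")
```

That gives +88 for arm, -16 for powerpc and -65 for x86_64, each well
under 0.001% of the total.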

Arm has very many differences, the others do not. For powerpc (this
is 32-bit; 64-bit showed no differences) most of the difference is
scheduling deciding to do things a bit differently, and most of it is
in places where we have not-so-good costs anyway. For arm the effects
often cascade to bb-reorder making different decisions.

Anyway, all differences are small; the patch is not likely to hurt
anything. I support the patch, if that helps -- but I cannot approve it.


Segher