On 03/02/2016 12:29 PM, Bernd Schmidt wrote:
I had a look at the costs issues in this PR. I think I've found a fair
number of bugs, but fixing those alone doesn't solve the issue; one
additional tweak is needed.
As for the bugs, they are primarily in the mechanism of recording cost
updates and restoring them. When adjusting costs for preferences and
copies, we create records of the adjustments we made. At the end of
assign_hard_reg, when we have assigned a hard register, we try to undo
these changes (through restore_costs_from_copies), and then call
update_costs_from_copies again with the real assigned hard register
number so that future allocations take that into account.
The issues I can see are:
1. restore_costs_from_copies passes true for decr_p, which means costs
aren't restored, but doubled instead.
2. update_costs_from_allocno records a cost update not just for the
initial allocno, but for each of the visited ones. I can sort of see
an argument for doing that (let's say if you assign an allocno in the
middle of a copy chain you'd want the tail end of the chain to be
reset), but in practice I don't think the present algorithm can work
at all. In the case of an allocno in the middle of a copy chain the
restore would progress in both directions, and in any case it looks
like this approach can end up double-counting things when restoring
costs.
It is just a heuristic. Richard Sandiford proposed this update
approach. Before it we had only updates of allocnos directly connected
to allocno in question. Richard's approach helped to improve code in
some cases. If something works better we should use. The bechmarking
is the best criterium.
3. As far as I can tell, in update_costs_from_prefs, we adjust costs
for all allocnos connected to the one with the pref, but not for the
initial one.
The patch below corrects these. Issue #2 has no clearly best solution;
in this patch I've just moved the recording of the cost out of the
loop so it's done only for the initial allocno. The code is a little
convoluted so as to prevent skipping restorations if allocnos in a
copy chain have already been allocated. I have an alternative patch
that more directly records actual cost updates caused to other
allocnos for a given one. That variant is a little more clear, but
consumes a bit more memory. I can post that as well if desired.
As for the cost tweak, Vlad mentioned the effect of copies for
commutative operands in the PR. I ended up dividing the frequency of
such copies by two. (Another way to solve the PR is to reduce the
initial divisor for preference updates, but that seems a little more
hackish).
The overall effect of this patch seems to be fairly minimal in
practice, which is slightly disappointing. On the whole, it seems to
be a very slight net positive, with few instances where we generate
additional instructions.
I am just speculating may be the most important thing for performance is
updating costs not their exact values (although the values can be
important in some cases too).
Ok (in full or in parts, now or for stage1)?
The patch is ok for me. But I'd wait for GCC7. People are sensitive to
their code performance degradation. Even in most cases the patch
improves performance, in some case it can worsen code and people might
fill new PRs after such change.
Bernd, thanks for working on it and providing a fresh view of the code.
For me especially valuable when people benchmark their patches.
Sometimes I have to do it by myself.