On 06/02/2016 06:53 PM, James Greenhalgh wrote:
As I iterated through versions of this patch set, I realised that all we
really wanted for ifcvt was a way to estimate the cost of a branch in units
that were comparable to the cost of instructions. The trouble with BRANCH_COST
wasn't that it was returning a magic number, it was just that it was returning
a magic number which had inconsistent meanings in the compiler. Otherwise,
BRANCH_COST was a straightforward, low-complexity target hook.

[...]

Having worked through the patch set, I'd say it is probably a small
improvement over what we currently do, but I'm not very happy with it. I'm
posting it for comment so we can discuss any directions for costs that I
haven't thought about or prototyped. I'm also happy to drop the costs
rewrite if this seems like complexity for no benefit.

Any thoughts?

I think it all looks fairly reasonable, and on the whole lower complexity is likely a better approach. A few comments on individual patches:

+unsigned int
+default_rtx_branch_cost (bool speed_p,
+                        bool predictable_p)

No need to wrap the line.

+noce_estimate_conversion_profitable_p (struct noce_if_info *if_info,
+                                      unsigned int ninsns)
+{
+  return (if_info->rtx_edge_cost >= ninsns * COSTS_N_INSNS (1));

Please drop the parentheses around the return value. There are several examples of this across the series.

NINSNS is the number of simple instructions we're going to add, right? What about the instructions we're going to remove; shouldn't those be counted too? I think that kind of accounting was implicit in the old tests against branch_cost.
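To make the point concrete, here is a self-contained sketch of a profitability test that nets out both sides. All names here are hypothetical stand-ins, not the actual ifcvt code; only the COSTS_N_INSNS convention (4 cost units per simple instruction) is taken from GCC:

```cpp
#include <cassert>

// Mirrors GCC's COSTS_N_INSNS convention of 4 cost units per simple insn.
static int costs_n_insns (int n) { return n * 4; }

// Hypothetical stand-in for the relevant field of struct noce_if_info.
struct if_info_model
{
  unsigned int rtx_edge_cost;  // estimated cost of the branch we remove
};

// Count both the instructions the conversion adds and the ones it lets us
// delete: the conversion pays off when the saved branch cost covers the
// *net* growth in instruction cost.
static bool
conversion_profitable_p (const if_info_model &info,
                         unsigned int ninsns_added,
                         unsigned int ninsns_removed)
{
  int net = (int) ninsns_added - (int) ninsns_removed;
  if (net <= 0)
    return true;  // code shrinks or breaks even; always profitable
  return info.rtx_edge_cost >= (unsigned int) costs_n_insns (net);
}
```

The difference from the posted patch is only that removed instructions offset the added ones before the comparison against the branch cost.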

   if_info.branch_cost = BRANCH_COST (optimize_bb_for_speed_p (test_bb),
                                     predictable_edge_p (then_edge));
+  if_info.rtx_edge_cost
+    = targetm.rtx_branch_cost (optimize_bb_for_speed_p (test_bb),
+                              predictable_edge_p (then_edge));

This is the part I have the most problems with, mostly as a naming issue. Calling it an edge_cost implies that it depends on whether the branch is taken or not, which I believe is not the case. Maybe the interface ought to be able to provide taken/not-taken information, although I can't offhand think of a way to make use of it.

Here, I'd rather call the field branch_cost, but there's already one with that name. Are there still places that use the old one after your patch series?

Hmm, I guess information about whether the branch is likely taken, likely not taken, or unpredictable would be useful for folding the instructions behind it into the cost of the existing code.
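One way such information might feed in is to weight each arm's cost by the edge probability when costing the unconverted code. A purely illustrative sketch, not from the patch; the percent-based probability follows GCC's REG_BR_PROB-style convention:

```cpp
#include <cassert>

// Illustrative only: expected cost of the unconverted if/then/else,
// weighting each arm's instruction cost by how often the branch is
// predicted to take it.  prob_then is in percent (0..100).
static unsigned int
expected_existing_cost (unsigned int branch_cost,
                        unsigned int then_cost,
                        unsigned int else_cost,
                        unsigned int prob_then)
{
  return branch_cost
         + (then_cost * prob_then + else_cost * (100 - prob_then)) / 100;
}
```

Comparing the converted sequence against this expected cost, rather than against the branch cost alone, would let a heavily biased branch make the conversion look correspondingly less attractive.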

+/* Return TRUE if CODE is an RTX comparison operator.  */
+
+static bool
+noce_code_is_comparison_p (rtx_code code)

Isn't there some way to do this based on GET_RTX_CLASS?
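It should be, since GET_RTX_CLASS puts comparison codes in RTX_COMPARE and RTX_COMM_COMPARE. A sketch of the check, modeled here with tiny stand-in enums rather than the real rtl.h definitions so it is self-contained:

```cpp
#include <cassert>

// Stand-ins for the relevant part of GCC's rtx classification; real code
// would use GET_RTX_CLASS and the enums from rtl.h.
enum rtx_class_model { RTX_COMPARE, RTX_COMM_COMPARE, RTX_BIN_ARITH, RTX_UNARY };
enum rtx_code_model { EQ, NE, LT, GTU, PLUS, NEG };

static rtx_class_model
get_rtx_class (rtx_code_model code)
{
  switch (code)
    {
    case EQ: case NE:    // commutative comparisons
      return RTX_COMM_COMPARE;
    case LT: case GTU:   // non-commutative comparisons
      return RTX_COMPARE;
    case PLUS:
      return RTX_BIN_ARITH;
    default:
      return RTX_UNARY;
    }
}

// The suggested replacement for an open-coded list of comparison codes.
static bool
code_is_comparison_p (rtx_code_model code)
{
  rtx_class_model cls = get_rtx_class (code);
  return cls == RTX_COMPARE || cls == RTX_COMM_COMPARE;
}
```

That keeps the predicate in sync automatically if new comparison codes are ever added.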

In the noce_cmove_arith patch, is it possible to just construct the actual sequence we want to use and test its cost (much like the combiner's approach), rather than building up a random one for estimation? Bailing out early based on a cost estimate no longer saves compile time if computing the estimate is as much work as doing the conversion in the first place.

I've bootstrapped and tested the patch set on x86_64 and aarch64, but
they probably need some further polishing if we were to decide this was a
useful direction.

Also, I'd like some information on what this does to code generation on a few different targets.


Bernd
