On Mon, 29 Oct 2012, Jan Hubicka wrote:
> > > ICK ...
> > >
> > > Why not sth as simple as
> > >
> > > return num_ssa_operands (stmt, SSA_OP_USE);
> > >
> > > ? a[1][2] and b[2] really have the same cost; variable-length
> > > objects have extra SSA operands in ARRAY_REF/COMPONENT_REF for
> > > the size. Thus, stmt cost somehow should reflect the number
> > > of dependent stmts.
> > >
> > > So in estimate_num_insns I'd try
> > >
> > > int
> > > estimate_num_insns (gimple stmt, eni_weights *weights)
> > > {
> > > unsigned cost, i;
> > > enum gimple_code code = gimple_code (stmt);
> > > tree lhs;
> > > tree rhs;
> > >
> > > switch (code)
> > > {
> > > case GIMPLE_ASSIGN:
> > > /* Initialize with the number of SSA uses, one is free. */
> > > cost = num_ssa_operands (stmt, SSA_OP_USE);
> > > if (cost > 1)
> > > --cost;
> >
> > Hi,
> > this is the updated patch I am testing after today's discussion. I decided
> > to drop the ipa-inline-analysis changes and do those incrementally. So the
> > patch now trashes tramp3d performance by increasing the need for
> > early-inlining-insns, but that is not unexpected.
> >
> > The patch also fixes the accounting of ADDR_EXPR, which previously got
> > confused with a load.
> >
> > Does this seem sane?
> >
> > * tree-inline.c (estimate_operator_cost): Return 1 for non-trivial
> > ADDR_EXPR operations.
> > (estimate_num_insns): Do not confuse general single rhs with
> > loads; account cost of non-trivial addr_expr for ASSIGN_EXPR,
> > GIMPLE_RETURN and GIMPLE_ASM.
>
> Hi,
> this patch actually does not cause that much tramp3d fuzz, and there are no
> differences in the testsuite due to unrolling changes.
>
> The changes are to the constants used when accounting loads and stores.
> A typical store has two SSA uses (one for the address and one for the
> value to store). Of course we lose the distinction between an array
> offset and a pointer dereference.
>
> A typical load/address expression has one SSA use (for the address).
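> For instance (illustrative GIMPLE, SSA names made up):
>
>   a[i_1] = x_2;    /* store: two SSA uses, i_1 and x_2 */
>   y_3 = b[j_4];    /* load: one SSA use, j_4 */
>
> which is where the -2 and -1 constants in the MAX expressions below
> come from.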
>
> Bootstrapped/regtested x86_64-linux, OK?
ChangeLog?
> Honza
>
> Index: tree-inline.c
> ===================================================================
> --- tree-inline.c (revision 192945)
> +++ tree-inline.c (working copy)
> @@ -3447,6 +3447,19 @@ estimate_operator_cost (enum tree_code c
> if (TREE_CODE (op2) != INTEGER_CST)
> return weights->div_mod_cost;
> return 1;
> + case ADDR_EXPR:
> + {
> + tree addr_base;
> + HOST_WIDE_INT addr_offset;
> +
> + addr_base = get_addr_base_and_unit_offset (TREE_OPERAND (op1, 0),
> + &addr_offset);
> + /* If the offset is variable or with non-zero offset, return 2. */
> + if (!addr_base || addr_offset != 0 || TREE_CODE (addr_base) != MEM_REF
> + || !integer_zerop (TREE_OPERAND (addr_base, 1)))
> + return 1;
The comment doesn't match the code. Why the TREE_CODE (addr_base) !=
MEM_REF check? If it isn't a MEM_REF then it is a plain decl, thus
no dereference. So it's not clear what you want here ...?
It looks like you want to pessimize variable addresses here,
like &a.a[i]? Before this patch, all ADDR_EXPR costs were zero.
I'd say you want simply
if (!addr_base || addr_offset != 0)
return 1;
no? Or even
if (!addr_base
|| (addr_offset != 0 && TREE_CODE (addr_base) == MEM_REF))
return 1;
that keeps &decl + CST as cost 0. Or even
if (!addr_base)
return 1;
that even keeps ptr + CST as cost 0. Both because they are likely
combined with some complex addressing mode later.
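For example (illustrative x86_64 assembly), a later dereference of
ptr + 4 can fold the constant into the addressing mode:

  movl 4(%rdi), %eax

so the ADDR_EXPR itself ends up free.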
> + }
> + return 0;
Inside the } please.
>
> default:
> /* We expect a copy assignment with no operator. */
> @@ -3512,14 +3525,24 @@ estimate_num_insns (gimple stmt, eni_wei
> lhs = gimple_assign_lhs (stmt);
> rhs = gimple_assign_rhs1 (stmt);
>
> - if (is_gimple_reg (lhs))
> - cost = 0;
> - else
> + /* Store. */
> + if (gimple_vdef (stmt))
> cost = estimate_move_cost (TREE_TYPE (lhs));
> + else
> + cost = 0;
That change seems bogus. A !is_gimple_reg lhs is always a store.
>
> - if (!is_gimple_reg (rhs) && !is_gimple_min_invariant (rhs))
> + /* Load. */
> + if (gimple_vuse (stmt))
> cost += estimate_move_cost (TREE_TYPE (rhs));
Likewise. If rhs1 is not a register or a constant this is a load.
All stores also have a VUSE so you are always accounting a store
as aggregate copy this way.
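To illustrate, for a store like (virtual operands spelled out, names
made up)

  # .MEM_3 = VDEF <.MEM_2>
  *p_1 = x_4;

gimple_vdef and gimple_vuse are both non-NULL, so with your patch both
branches trigger and every plain store gets costed like an aggregate
copy.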
>
> + /* Stores, loads and address expressions may have variable array
> + references in them. Account these. */
> + if (gimple_vdef (stmt))
> + cost += MAX (0, (int)num_ssa_operands (stmt, SSA_OP_USE) - 2);
> + else if (gimple_vuse (stmt)
> + || gimple_assign_rhs_code (stmt) == ADDR_EXPR)
> + cost += MAX (0, (int)num_ssa_operands (stmt, SSA_OP_USE) - 1);
> +
Rather than doing this here in this awkward and bogus way (see above),
why not do it in estimate_operator_cost? You added ADDR_EXPR already,
so simply add DECLs and handled-components. Then you can do
better than estimating from num_ssa_operands of the stmt by simply
walking the handled-components looking for SSA uses in the
relevant positions.
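Something along the lines of this completely untested sketch (the
estimate_ref_cost name is made up):

  /* Count the SSA uses in variable positions of the reference REF,
     i.e. the dependent address computations.  */
  static int
  estimate_ref_cost (tree ref)
  {
    int cost = 0;
    while (handled_component_p (ref))
      {
        /* A variable array index needs an extra computation.  */
        if (TREE_CODE (ref) == ARRAY_REF
            && TREE_CODE (TREE_OPERAND (ref, 1)) == SSA_NAME)
          cost++;
        ref = TREE_OPERAND (ref, 0);
      }
    return cost;
  }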
> cost += estimate_operator_cost (gimple_assign_rhs_code (stmt), weights,
> gimple_assign_rhs1 (stmt),
> get_gimple_rhs_class
> (gimple_assign_rhs_code (stmt))
> @@ -3596,6 +3619,7 @@ estimate_num_insns (gimple stmt, eni_wei
>
> case GIMPLE_RETURN:
> return weights->return_cost;
> + + MAX (0, (int)num_ssa_operands (stmt, SSA_OP_USE) - 1);
Well ... why not use estimate_operator_cost?
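I.e. something like (untested):

  case GIMPLE_RETURN:
    {
      tree retval = gimple_return_retval (stmt);
      cost = weights->return_cost;
      if (retval && TREE_CODE (retval) == ADDR_EXPR)
        cost += estimate_operator_cost (ADDR_EXPR, weights,
                                        retval, NULL_TREE);
      return cost;
    }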
> case GIMPLE_GOTO:
> case GIMPLE_LABEL:
> @@ -3606,7 +3630,10 @@ estimate_num_insns (gimple stmt, eni_wei
> return 0;
>
> case GIMPLE_ASM:
> - return asm_str_count (gimple_asm_string (stmt));
> + return (asm_str_count (gimple_asm_string (stmt))
> + /* Account also possible non-trivial addr_exprs
> + in the arguments. */
> + + num_ssa_operands (stmt, SSA_OP_USE));
The comment is misleading; you are counting all simple register
operands as well. If you want to improve the ASM cost, then walk
over the inputs/outputs and use estimate_operator_cost again.
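Roughly (untested, and only handling the inputs for brevity):

  cost = asm_str_count (gimple_asm_string (stmt));
  for (i = 0; i < gimple_asm_ninputs (stmt); i++)
    {
      tree op = TREE_VALUE (gimple_asm_input_op (stmt, i));
      /* Only non-trivial address operands add to the cost.  */
      if (TREE_CODE (op) == ADDR_EXPR)
        cost += estimate_operator_cost (ADDR_EXPR, weights,
                                        op, NULL_TREE);
    }
  return cost;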
Richard.