On Wed, Aug 10, 2011 at 4:49 PM, Ilya Enkovich <enkovich....@gmail.com> wrote:
> Hello,
>
> Here is a new version of the patch. Changes from the previous version
> (http://gcc.gnu.org/ml/gcc-patches/2011-07/msg02240.html):
>  - updated to trunk
>  - TODO_remove_unused_locals flag was removed from todo_flags_finish
> of reassoc pass
>
> Bootstrapped and checked on x86_64-linux.

The fact that you have to adjust gcc.dg/tree-ssa/pr38533.c looks problematic
to me.  Can you investigate the problem report, look at the geneated
code with the atom default of the param and see whether it's still
reasonably "fixed" (maybe you'd already done this)?

+  /* Check if we may use less width and still compute sequence for
+     the same time.  It will allow us to reduce registers usage.  */
+  while (width > 1
+        && get_required_cycles (ops_num, width - 1) == cycles_best)
+    width--;

I suppose get_required_cycles () is monotonic in width?  Would it
make sense to binary search the best width then (I realize the
default is 1 and the only target differing has 2, but ...)?  Maybe at
least add a comment to this effect?  Or not decrement by one
but instead divide by two on each iteration (which is the same for 1 and 2 ...)?
It's also all a mapping that is constant - we should be able to
pre-compute it for all values, eventually "compressing" it to a
much simpler formula width = f (cpu_width, ops_num)?

+static void
+rewrite_expr_tree_parallel (gimple stmt, int width,
+                           VEC(operand_entry_t, heap) * ops)
+{
+  enum tree_code opcode = gimple_assign_rhs_code (stmt);
+  int op_num = VEC_length (operand_entry_t, ops);
+  int stmt_num = op_num - 1;
+  gimple *stmts = XNEWVEC (gimple, stmt_num);

use XALLOCAVEC here and remove the free call.

+      if (ready_stmts_end == 0 &&
+         (i - stmt_index >= width || op_index < 1))

&&s go to the next line, not at the end

+      else
+       {
+         tree var = create_tmp_reg (TREE_TYPE (last_rhs1), "reassoc");
+         add_referenced_var (var);

please re-use a common variable for the whole chain, they will
all necessarily have compatible type.  Creating and maintaining
decls is expensive (also avoid giving them names, thus just
pass NULL instead of "reassoc").

+               {
+                 enum machine_mode mode = TYPE_MODE (TREE_TYPE (lhs));
+                 int width = get_reassociation_width (ops, rhs_code, mode);

as you are passing in the mode here, consider passing the number
of operands in ops as well instead of ops.  This way we might consider
moving this whole function to the target or to generic code.  Similar
move the dump printing in get_reassociation_width

+  if (dump_file && (dump_flags & TDF_DETAILS))
+    fprintf (dump_file,
+            "Width = %d was chosen for reassociation\n", width);

here to the caller.

Sorry for taking so long to review this again.  I'll promise to be quick
once you post an update.

Thanks,
Richard.

> Thanks,
> Ilya
> ---
> gcc/
>
> 2011-08-10  Enkovich Ilya  <ilya.enkov...@intel.com>
>
>        PR middle-end/44382
>        * target.def (reassociation_width): New hook.
>
>        * doc/tm.texi.in (reassociation_width): Likewise.
>
>        * doc/tm.texi (reassociation_width): Likewise.
>
>        * doc/invoke.texi (tree-reassoc-width): New param documented.
>
>        * hooks.h (hook_int_uint_mode_1): New default hook.
>
>        * hooks.c (hook_int_uint_mode_1): Likewise.
>
>        * config/i386/i386.h (ix86_tune_indices): Add
>        X86_TUNE_REASSOC_INT_TO_PARALLEL and
>        X86_TUNE_REASSOC_FP_TO_PARALLEL.
>
>        (TARGET_REASSOC_INT_TO_PARALLEL): New.
>        (TARGET_REASSOC_FP_TO_PARALLEL): Likewise.
>
>        * config/i386/i386.c (initial_ix86_tune_features): Add
>        X86_TUNE_REASSOC_INT_TO_PARALLEL and
>        X86_TUNE_REASSOC_FP_TO_PARALLEL.
>
>        (ix86_reassociation_width) implementation of
>        new hook for i386 target.
>
>        * params.def (PARAM_TREE_REASSOC_WIDTH): New param added.
>
>        * tree-ssa-reassoc.c (get_required_cycles): New function.
>        (get_reassociation_width): Likewise.
>        (swap_ops_for_binary_stmt): Likewise.
>        (rewrite_expr_tree_parallel): Likewise.
>
>        (rewrite_expr_tree): Refactored. Part of code moved into
>        swap_ops_for_binary_stmt.
>
>        (reassociate_bb): Now checks reassociation width to be used
>        and call rewrite_expr_tree_parallel instead of rewrite_expr_tree
>        if needed.
>
> gcc/testsuite/
>
> 2011-08-10  Enkovich Ilya  <ilya.enkov...@intel.com>
>
>        * gcc.dg/tree-ssa/pr38533.c (dg-options): Added option
>        --param tree-reassoc-width=1.
>
>        * gcc.dg/tree-ssa/reassoc-24.c: New test.
>        * gcc.dg/tree-ssa/reassoc-25.c: Likewise.
>

Reply via email to