On Wed, Aug 10, 2011 at 4:49 PM, Ilya Enkovich <enkovich....@gmail.com> wrote: > Hello, > > Here is a new version of the patch. Changes from the previous version > (http://gcc.gnu.org/ml/gcc-patches/2011-07/msg02240.html): > - updated to trunk > - TODO_remove_unused_locals flag was removed from todo_flags_finish > of reassoc pass > > Bootstrapped and checked on x86_64-linux.
The fact that you have to adjust gcc.dg/tree-ssa/pr38533.c looks problematic to me. Can you investigate the problem report, look at the geneated code with the atom default of the param and see whether it's still reasonably "fixed" (maybe you'd already done this)? + /* Check if we may use less width and still compute sequence for + the same time. It will allow us to reduce registers usage. */ + while (width > 1 + && get_required_cycles (ops_num, width - 1) == cycles_best) + width--; I suppose get_required_cycles () is monotonic in width? Would it make sense to binary search the best width then (I realize the default is 1 and the only target differing has 2, but ...)? Maybe at least add a comment to this effect? Or not decrement by one but instead divide by two on each iteration (which is the same for 1 and 2 ...)? It's also all a mapping that is constant - we should be able to pre-compute it for all values, eventually "compressing" it to a much simpler formula width = f (cpu_width, ops_num)? +static void +rewrite_expr_tree_parallel (gimple stmt, int width, + VEC(operand_entry_t, heap) * ops) +{ + enum tree_code opcode = gimple_assign_rhs_code (stmt); + int op_num = VEC_length (operand_entry_t, ops); + int stmt_num = op_num - 1; + gimple *stmts = XNEWVEC (gimple, stmt_num); use XALLOCAVEC here and remove the free call. + if (ready_stmts_end == 0 && + (i - stmt_index >= width || op_index < 1)) &&s go to the next line, not at the end + else + { + tree var = create_tmp_reg (TREE_TYPE (last_rhs1), "reassoc"); + add_referenced_var (var); please re-use a common variable for the whole chain, they will all necessarily have compatible type. Creating and maintaining decls is expensive (also avoid giving them names, thus just pass NULL instead of "reassoc"). + { + enum machine_mode mode = TYPE_MODE (TREE_TYPE (lhs)); + int width = get_reassociation_width (ops, rhs_code, mode); as you are passing in the mode here, consider passing the number of operands in ops as well instead of ops. This way we might consider moving this whole function to the target or to generic code. Similar move the dump printing in get_reassociation_width + if (dump_file && (dump_flags & TDF_DETAILS)) + fprintf (dump_file, + "Width = %d was chosen for reassociation\n", width); here to the caller. Sorry for taking so long to review this again. I'll promise to be quick once you post an update. Thanks, Richard. > Thanks, > Ilya > --- > gcc/ > > 2011-08-10 Enkovich Ilya <ilya.enkov...@intel.com> > > PR middle-end/44382 > * target.def (reassociation_width): New hook. > > * doc/tm.texi.in (reassociation_width): Likewise. > > * doc/tm.texi (reassociation_width): Likewise. > > * doc/invoke.texi (tree-reassoc-width): New param documented. > > * hooks.h (hook_int_uint_mode_1): New default hook. > > * hooks.c (hook_int_uint_mode_1): Likewise. > > * config/i386/i386.h (ix86_tune_indices): Add > X86_TUNE_REASSOC_INT_TO_PARALLEL and > X86_TUNE_REASSOC_FP_TO_PARALLEL. > > (TARGET_REASSOC_INT_TO_PARALLEL): New. > (TARGET_REASSOC_FP_TO_PARALLEL): Likewise. > > * config/i386/i386.c (initial_ix86_tune_features): Add > X86_TUNE_REASSOC_INT_TO_PARALLEL and > X86_TUNE_REASSOC_FP_TO_PARALLEL. > > (ix86_reassociation_width) implementation of > new hook for i386 target. > > * params.def (PARAM_TREE_REASSOC_WIDTH): New param added. > > * tree-ssa-reassoc.c (get_required_cycles): New function. > (get_reassociation_width): Likewise. > (swap_ops_for_binary_stmt): Likewise. > (rewrite_expr_tree_parallel): Likewise. > > (rewrite_expr_tree): Refactored. Part of code moved into > swap_ops_for_binary_stmt. > > (reassociate_bb): Now checks reassociation width to be used > and call rewrite_expr_tree_parallel instead of rewrite_expr_tree > if needed. > > gcc/testsuite/ > > 2011-08-10 Enkovich Ilya <ilya.enkov...@intel.com> > > * gcc.dg/tree-ssa/pr38533.c (dg-options): Added option > --param tree-reassoc-width=1. > > * gcc.dg/tree-ssa/reassoc-24.c: New test. > * gcc.dg/tree-ssa/reassoc-25.c: Likewise. >