The following patch makes the vectorizer cost model more fine-grained by splitting -f[no-]vect-cost-model into -fvect-cost-model=[unlimited|dynamic|cheap], thereby consuming the -ftree-vect-loop-version flag. After this patch the cost model is always enabled (currently -O2 -ftree-vectorize leaves it disabled while -O3 enables it). This opens up the possibility, in patch 3/n, of enabling vectorization by default at -O2 with the "cheap" cost model instead of the current default, "dynamic". This will disable versioning for alias (but not versioning for alignment so far).
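To make the "versioning for alias" case concrete, here is a small sketch of the kind of loop it applies to. It is an illustration only, not part of the patch, and the function names are made up. In the first function dst and src may overlap, so a vector version of the loop has to be guarded by a runtime overlap check; in the second, restrict removes the need for that check.

```c
#include <stddef.h>

/* Hypothetical example, not taken from the patch.  dst and src may
   overlap, so a vectorized version of this loop needs a runtime
   overlap check in front of it (versioning for alias).  */
void
scale_may_alias (float *dst, const float *src, size_t n, float k)
{
  for (size_t i = 0; i < n; i++)
    dst[i] = src[i] * k;
}

/* Same loop, but the caller promises via restrict that the arrays
   do not overlap, so no runtime alias check is required.  */
void
scale_no_alias (float *restrict dst, const float *restrict src,
                size_t n, float k)
{
  for (size_t i = 0; i < n; i++)
    dst[i] = src[i] * k;
}
```

Comparing the vectorizer dump (-fdump-tree-vect-details) for the two functions under -fvect-cost-model=cheap and -fvect-cost-model=dynamic should show where the models differ; I have not verified the exact output against this patch.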
@item -fvect-cost-model=@var{model}
@opindex fvect-cost-model
Alter the cost model used for vectorization.  The @var{model} argument
should be one of @code{unlimited}, @code{dynamic} or @code{cheap}.
With the @code{unlimited} model the vectorized code-path is assumed
to be profitable, while with the @code{dynamic} model a runtime check
guards the vectorized code-path so that it is taken only for iteration
counts that will likely execute faster than the original scalar loop.
The @code{cheap} model disables vectorization of loops where doing so
would be cost prohibitive, for example due to required runtime checks
for data dependence or alignment, but is otherwise equal to the
@code{dynamic} model.
This option is enabled by default; the cost model used depends on
other optimization flags and is either @code{dynamic} or @code{cheap}.

Bootstrap / regtest running on x86_64-unknown-linux-gnu.

Any comments?

Thanks,
Richard.

2013-05-14  Richard Biener  <rguent...@suse.de>

	common/
	* config/i386/i386-common.c (ix86_option_init_struct): Do not
	enable OPT_fvect_cost_model.

	* common.opt (fvect-cost-model=): New option, default to 'default'.
	(vect_cost_model): New enum and values.
	(fvect-cost-model): Alias to -fvect-cost-model=default.
	(fno-vect-cost-model): Alias to -fvect-cost-model=unlimited.
	(ftree-vect-loop-version): Ignore.
	* opts.c (default_options_table): Do not set OPT_fvect_cost_model.
	(common_handle_option): Likewise.
	* flag-types.h (enum vect_cost_model): New enum.
	* doc/invoke.texi (ftree-vect-loop-version): Remove.
	(fvect-cost-model): Adjust documentation.
	* targhooks.c (default_add_stmt_cost): Do not check
	flag_vect_cost_model.
	* tree-vectorizer.h (struct _loop_vec_info): Add cost model field.
	(struct _bb_vec_info): Likewise.
	* tree-vect-data-refs.c (vect_peeling_hash_insert): Check the
	loops cost-model flag.
	(vect_peeling_hash_choose_best_peeling): Likewise.
	(vect_enhance_data_refs_alignment): Likewise.
	Do not check flag_tree_vect_loop_version but check the cost model.
	* tree-vect-loop.c (vect_analyze_loop): Initialize the loops
	cost model flag.
	(vect_estimate_min_profitable_iters): Use the loops cost model flag.
	* tree-vect-slp.c (vect_slp_analyze_bb_1): Initialize and use
	the BBs cost model flag.
	* tree-vectorizer.c (gate_vect_slp): Adjust.

Index: trunk/gcc/common.opt
===================================================================
*** trunk.orig/gcc/common.opt	2013-05-14 14:45:00.000000000 +0200
--- trunk/gcc/common.opt	2013-05-14 15:26:09.640070043 +0200
*************** ftree-slp-vectorize
*** 2270,2282 ****
  Common Report Var(flag_tree_slp_vectorize) Init(2) Optimization
  Enable basic block vectorization (SLP) on trees
  
  fvect-cost-model
! Common Report Var(flag_vect_cost_model) Optimization
! Enable use of cost model in vectorization
  
  ftree-vect-loop-version
! Common Report Var(flag_tree_vect_loop_version) Init(1) Optimization
! Enable loop versioning when doing loop vectorization on trees
  
  ftree-scev-cprop
  Common Report Var(flag_tree_scev_cprop) Init(1) Optimization
--- 2270,2305 ----
  Common Report Var(flag_tree_slp_vectorize) Init(2) Optimization
  Enable basic block vectorization (SLP) on trees
  
+ fvect-cost-model=
+ Common Joined RejectNegative Enum(vect_cost_model) Var(flag_vect_cost_model) Init(VECT_COST_MODEL_DEFAULT)
+ Specifies the cost model for vectorization
+ 
+ Enum
+ Name(vect_cost_model) Type(enum vect_cost_model) UnknownError(unknown vectorizer cost model %qs)
+ 
+ EnumValue
+ Enum(vect_cost_model) String(default) Value(VECT_COST_MODEL_DEFAULT)
+ 
+ EnumValue
+ Enum(vect_cost_model) String(unlimited) Value(VECT_COST_MODEL_UNLIMITED)
+ 
+ EnumValue
+ Enum(vect_cost_model) String(dynamic) Value(VECT_COST_MODEL_DYNAMIC)
+ 
+ EnumValue
+ Enum(vect_cost_model) String(cheap) Value(VECT_COST_MODEL_CHEAP)
+ 
  fvect-cost-model
! Common RejectNegative Alias(fvect-cost-model=,default)
! Enables the default vectorizer cost model.  Preserved for backward compatibility.
! 
! fno-vect-cost-model
! Common RejectNegative Alias(fvect-cost-model=,unlimited)
! Enables the unlimited vectorizer cost model.  Preserved for backward compatibility.
  
  ftree-vect-loop-version
! Common Ignore
! Does nothing.  Preserved for backward compatibility.
  
  ftree-scev-cprop
  Common Report Var(flag_tree_scev_cprop) Init(1) Optimization
Index: trunk/gcc/opts.c
===================================================================
*** trunk.orig/gcc/opts.c	2013-05-14 14:45:00.000000000 +0200
--- trunk/gcc/opts.c	2013-05-14 14:53:11.561805276 +0200
*************** static const struct default_options defa
*** 498,504 ****
      { OPT_LEVELS_3_PLUS, OPT_funswitch_loops, NULL, 1 },
      { OPT_LEVELS_3_PLUS, OPT_fgcse_after_reload, NULL, 1 },
      { OPT_LEVELS_3_PLUS, OPT_ftree_vectorize, NULL, 1 },
-     { OPT_LEVELS_3_PLUS, OPT_fvect_cost_model, NULL, 1 },
      { OPT_LEVELS_3_PLUS, OPT_fipa_cp_clone, NULL, 1 },
      { OPT_LEVELS_3_PLUS, OPT_ftree_partial_pre, NULL, 1 },
  
--- 498,503 ----
*************** common_handle_option (struct gcc_options
*** 1597,1604 ****
        opts->x_flag_gcse_after_reload = value;
        if (!opts_set->x_flag_tree_vectorize)
          opts->x_flag_tree_vectorize = value;
-       if (!opts_set->x_flag_vect_cost_model)
-         opts->x_flag_vect_cost_model = value;
        if (!opts_set->x_flag_tree_loop_distribute_patterns)
          opts->x_flag_tree_loop_distribute_patterns = value;
        break;
--- 1596,1601 ----
Index: trunk/gcc/common/config/i386/i386-common.c
===================================================================
*** trunk.orig/gcc/common/config/i386/i386-common.c	2013-05-14 14:44:59.000000000 +0200
--- trunk/gcc/common/config/i386/i386-common.c	2013-05-14 14:46:13.557098918 +0200
*************** ix86_option_init_struct (struct gcc_opti
*** 729,735 ****
  
    opts->x_flag_pcc_struct_return = 2;
    opts->x_flag_asynchronous_unwind_tables = 2;
-   opts->x_flag_vect_cost_model = 1;
  }
  
  /* On the x86 -fsplit-stack and -fstack-protector both use the same
--- 729,734 ----
Index: trunk/gcc/flag-types.h
===================================================================
*** trunk.orig/gcc/flag-types.h	2013-05-14 14:44:59.000000000 +0200
--- trunk/gcc/flag-types.h	2013-05-14 14:46:13.557098918 +0200
*************** enum fp_contract_mode {
*** 191,194 ****
--- 191,202 ----
    FP_CONTRACT_FAST = 2
  };
  
+ /* Vectorizer cost-model.  */
+ enum vect_cost_model {
+   VECT_COST_MODEL_UNLIMITED = 0,
+   VECT_COST_MODEL_CHEAP = 1,
+   VECT_COST_MODEL_DYNAMIC = 2,
+   VECT_COST_MODEL_DEFAULT = 3
+ };
+ 
  #endif /* ! GCC_FLAG_TYPES_H */
Index: trunk/gcc/targhooks.c
===================================================================
*** trunk.orig/gcc/targhooks.c	2013-05-14 14:44:59.000000000 +0200
--- trunk/gcc/targhooks.c	2013-05-14 14:46:13.558098929 +0200
*************** default_add_stmt_cost (void *data, int c
*** 1050,1070 ****
  {
    unsigned *cost = (unsigned *) data;
    unsigned retval = 0;
  
!   if (flag_vect_cost_model)
!     {
!       tree vectype = stmt_info ? stmt_vectype (stmt_info) : NULL_TREE;
!       int stmt_cost = default_builtin_vectorization_cost (kind, vectype,
!                                                           misalign);
!       /* Statements in an inner loop relative to the loop being
!          vectorized are weighted more heavily.  The value here is
!          arbitrary and could potentially be improved with analysis.  */
!       if (where == vect_body && stmt_info && stmt_in_inner_loop_p (stmt_info))
!         count *= 50;  /* FIXME.  */
! 
!       retval = (unsigned) (count * stmt_cost);
!       cost[where] += retval;
!     }
  
    return retval;
  }
--- 1050,1066 ----
  {
    unsigned *cost = (unsigned *) data;
    unsigned retval = 0;
  
+   tree vectype = stmt_info ? stmt_vectype (stmt_info) : NULL_TREE;
+   int stmt_cost = default_builtin_vectorization_cost (kind, vectype,
+                                                       misalign);
+   /* Statements in an inner loop relative to the loop being
+      vectorized are weighted more heavily.  The value here is
+      arbitrary and could potentially be improved with analysis.  */
+   if (where == vect_body && stmt_info && stmt_in_inner_loop_p (stmt_info))
+     count *= 50;  /* FIXME.  */
!   retval = (unsigned) (count * stmt_cost);
!   cost[where] += retval;
  
    return retval;
  }
Index: trunk/gcc/tree-vect-data-refs.c
===================================================================
*** trunk.orig/gcc/tree-vect-data-refs.c	2013-05-14 14:44:59.000000000 +0200
--- trunk/gcc/tree-vect-data-refs.c	2013-05-14 14:46:13.559098940 +0200
*************** vect_peeling_hash_insert (loop_vec_info
*** 1087,1093 ****
        *new_slot = slot;
      }
  
!   if (!supportable_dr_alignment && !flag_vect_cost_model)
      slot->count += VECT_MAX_COST;
  }
  
--- 1087,1094 ----
        *new_slot = slot;
      }
  
!   if (!supportable_dr_alignment
!       && loop_vinfo->cost_model == VECT_COST_MODEL_UNLIMITED)
      slot->count += VECT_MAX_COST;
  }
  
*************** vect_peeling_hash_choose_best_peeling (l
*** 1197,1203 ****
     res.peel_info.dr = NULL;
     res.body_cost_vec = stmt_vector_for_cost();
  
!    if (flag_vect_cost_model)
       {
         res.inside_cost = INT_MAX;
         res.outside_cost = INT_MAX;
--- 1198,1204 ----
     res.peel_info.dr = NULL;
     res.body_cost_vec = stmt_vector_for_cost();
  
!    if (loop_vinfo->cost_model != VECT_COST_MODEL_UNLIMITED)
       {
         res.inside_cost = INT_MAX;
         res.outside_cost = INT_MAX;
*************** vect_enhance_data_refs_alignment (loop_v
*** 1426,1432 ****
                   vectorization factor.
                   We do this automtically for cost model, since we calculate cost
                   for every peeling option.  */
!               if (!flag_vect_cost_model)
                  possible_npeel_number = vf /nelements;
  
                /* Handle the aligned case. We may decide to align some other
--- 1427,1433 ----
                   vectorization factor.
                   We do this automtically for cost model, since we calculate cost
                   for every peeling option.  */
!               if (loop_vinfo->cost_model == VECT_COST_MODEL_UNLIMITED)
                  possible_npeel_number = vf /nelements;
  
                /* Handle the aligned case. We may decide to align some other
*************** vect_enhance_data_refs_alignment (loop_v
*** 1434,1440 ****
                if (DR_MISALIGNMENT (dr) == 0)
                  {
                    npeel_tmp = 0;
!                   if (!flag_vect_cost_model)
                      possible_npeel_number++;
                  }
  
--- 1435,1441 ----
                if (DR_MISALIGNMENT (dr) == 0)
                  {
                    npeel_tmp = 0;
!                   if (loop_vinfo->cost_model == VECT_COST_MODEL_UNLIMITED)
                      possible_npeel_number++;
                  }
  
*************** vect_enhance_data_refs_alignment (loop_v
*** 1743,1749 ****
    /* (2) Versioning to force alignment.  */
  
    /* Try versioning if:
!      1) flag_tree_vect_loop_version is TRUE
       2) optimize loop for speed
       3) there is at least one unsupported misaligned data ref with an unknown
          misalignment, and
--- 1744,1750 ----
    /* (2) Versioning to force alignment.  */
  
    /* Try versioning if:
!      1) cost model is not VECT_COST_MODEL_CHEAP
       2) optimize loop for speed
       3) there is at least one unsupported misaligned data ref with an unknown
          misalignment, and
*************** vect_enhance_data_refs_alignment (loop_v
*** 1751,1757 ****
       5) the number of runtime alignment checks is within reason.  */
  
    do_versioning =
!         flag_tree_vect_loop_version
          && optimize_loop_nest_for_speed_p (loop)
          && (!loop->inner); /* FORNOW */
  
--- 1752,1758 ----
       5) the number of runtime alignment checks is within reason.  */
  
    do_versioning =
!         loop_vinfo->cost_model != VECT_COST_MODEL_CHEAP
          && optimize_loop_nest_for_speed_p (loop)
          && (!loop->inner); /* FORNOW */
  
Index: trunk/gcc/tree-vect-loop.c
===================================================================
*** trunk.orig/gcc/tree-vect-loop.c	2013-05-14 14:44:59.000000000 +0200
--- trunk/gcc/tree-vect-loop.c	2013-05-14 14:46:13.560098951 +0200
*************** vect_analyze_loop (struct loop *loop)
*** 1761,1766 ****
--- 1761,1772 ----
            return NULL;
          }
  
+       loop_vinfo->cost_model = flag_vect_cost_model;
+       if (loop_vinfo->cost_model == VECT_COST_MODEL_DEFAULT)
+         loop_vinfo->cost_model
+           = ((flag_tree_vectorize == 1 || optimize == 3)
+              ? VECT_COST_MODEL_DYNAMIC : VECT_COST_MODEL_CHEAP);
+ 
        if (vect_analyze_loop_2 (loop_vinfo))
          {
            LOOP_VINFO_VECTORIZABLE_P (loop_vinfo) = 1;
*************** vect_estimate_min_profitable_iters (loop
*** 2634,2640 ****
    void *target_cost_data = LOOP_VINFO_TARGET_COST_DATA (loop_vinfo);
  
    /* Cost model disabled.  */
!   if (!flag_vect_cost_model)
      {
        dump_printf_loc (MSG_NOTE, vect_location, "cost model disabled.");
        *ret_min_profitable_niters = 0;
--- 2640,2646 ----
    void *target_cost_data = LOOP_VINFO_TARGET_COST_DATA (loop_vinfo);
  
    /* Cost model disabled.  */
!   if (loop_vinfo->cost_model == VECT_COST_MODEL_UNLIMITED)
      {
        dump_printf_loc (MSG_NOTE, vect_location, "cost model disabled.");
        *ret_min_profitable_niters = 0;
Index: trunk/gcc/tree-vect-slp.c
===================================================================
*** trunk.orig/gcc/tree-vect-slp.c	2013-05-14 14:44:59.000000000 +0200
--- trunk/gcc/tree-vect-slp.c	2013-05-14 14:46:13.560098951 +0200
*************** vect_slp_analyze_bb_1 (basic_block bb)
*** 1992,1997 ****
--- 1992,2001 ----
    if (!bb_vinfo)
      return NULL;
  
+   bb_vinfo->cost_model = flag_vect_cost_model;
+   if (bb_vinfo->cost_model != VECT_COST_MODEL_UNLIMITED)
+     bb_vinfo->cost_model = VECT_COST_MODEL_CHEAP;
+ 
    if (!vect_analyze_data_refs (NULL, bb_vinfo, &min_vf))
      {
        if (dump_enabled_p ())
*************** vect_slp_analyze_bb_1 (basic_block bb)
*** 2093,2099 ****
      }
  
    /* Cost model: check if the vectorization is worthwhile.  */
!   if (flag_vect_cost_model
        && !vect_bb_vectorization_profitable_p (bb_vinfo))
      {
        if (dump_enabled_p ())
--- 2097,2103 ----
      }
  
    /* Cost model: check if the vectorization is worthwhile.  */
!   if (bb_vinfo->cost_model != VECT_COST_MODEL_UNLIMITED
        && !vect_bb_vectorization_profitable_p (bb_vinfo))
      {
        if (dump_enabled_p ())
Index: trunk/gcc/tree-vectorizer.c
===================================================================
*** trunk.orig/gcc/tree-vectorizer.c	2013-05-14 14:44:59.000000000 +0200
--- trunk/gcc/tree-vectorizer.c	2013-05-14 14:46:13.561098962 +0200
*************** gate_vect_slp (void)
*** 193,199 ****
  {
    /* Apply SLP either if the vectorizer is on and the user didn't specify
       whether to run SLP or not, or if the SLP flag was set by the user.  */
!   return ((flag_tree_vectorize != 0 && flag_tree_slp_vectorize != 0)
            || flag_tree_slp_vectorize == 1);
  }
  
--- 193,199 ----
  {
    /* Apply SLP either if the vectorizer is on and the user didn't specify
       whether to run SLP or not, or if the SLP flag was set by the user.  */
!   return ((flag_tree_vectorize == 1 && flag_tree_slp_vectorize != 0)
            || flag_tree_slp_vectorize == 1);
  }
  
Index: trunk/gcc/tree-vectorizer.h
===================================================================
*** trunk.orig/gcc/tree-vectorizer.h	2013-05-14 14:44:59.000000000 +0200
--- trunk/gcc/tree-vectorizer.h	2013-05-14 14:46:13.561098962 +0200
*************** typedef struct _loop_vec_info {
*** 314,319 ****
--- 314,322 ----
       fix it up.  */
    bool operands_swapped;
  
+   /* The cost model to be used for this loop.  */
+   enum vect_cost_model cost_model;
+ 
  } *loop_vec_info;
  
  /* Access Functions.  */
*************** typedef struct _bb_vec_info {
*** 391,396 ****
--- 394,402 ----
    /* Cost data used by the target cost model.  */
    void *target_cost_data;
  
+   /* The cost model to be used for this BB.  */
+   enum vect_cost_model cost_model;
+ 
  } *bb_vec_info;
  
  #define BB_VINFO_BB(B)               (B)->bb