Re: [PATCH, 1/8] Expand oacc kernels after pass_build_ealias
On 24-11-14 11:56, Tom de Vries wrote:
On 15-11-14 18:19, Tom de Vries wrote:
On 15-11-14 13:14, Tom de Vries wrote:

Hi,

I'm submitting a patch series with initial support for the oacc kernels directive. The patch series uses pass_parallelize_loops to implement parallelization of loops in the oacc kernels region.

The patch series consists of these 8 patches:
...
1 Expand oacc kernels after pass_build_ealias
2 Add pass_oacc_kernels
3 Add pass_ch_oacc_kernels to pass_oacc_kernels
4 Add pass_tree_loop_{init,done} to pass_oacc_kernels
5 Add pass_loop_im to pass_oacc_kernels
6 Add pass_ccp to pass_oacc_kernels
7 Add pass_parloops_oacc_kernels to pass_oacc_kernels
8 Do simple omp lowering for no address taken var
...

This patch moves omp expansion of the oacc kernels directive to after pass_build_ealias.

The rationale is that in order to use pass_parallelize_loops for analysis and transformation of an oacc kernels region, we postpone omp expansion of that region until the earliest point in the pass list where enough information is available to run pass_parallelize_loops, in other words, after pass_build_ealias.

The patch postpones expansion in expand_omp, and ensures expansion by adding pass_expand_omp_ssa:
- after pass_build_ealias, and
- after pass_all_early_optimizations for the case we're not optimizing.

In order to make sure the oacc kernels region arrives at pass_expand_omp_ssa the way it left expand_omp, the patch makes pass_ccp and pass_forwprop aware of lowered omp code, to handle it conservatively.

The patch contains changes in expand_omp_target to deal with ssa-code, similar to what is already present in expand_omp_taskreg.

Furthermore, the patch forces the .omp_data_sizes and .omp_data_kinds to not be static for oacc kernels. It does this to get some references to .omp_data_sizes and .omp_data_kinds in the ssa code. Without these references, the definitions will be removed.
The reference to the variables in GIMPLE_OACC_KERNELS is not enough to keep them from being removed. [ In vries/oacc-kernels, I used a BUILT_IN_USE kludge for this purpose ].

Finally, at the end of pass_expand_omp_ssa we're left with SSA_NAMEs in the original function whose definitions have been removed (that is, moved to the split-off function). TODO_remove_unused_locals takes care of some of them, but not the anonymous ones. So the patch iterates over all SSA_NAMEs to find these dangling SSA_NAMEs and releases them.

Reposting with small update: I've replaced the use of the rather generic gimple_stmt_omp_lowering_p with the more specific gimple_stmt_omp_data_i_init_p.

Bootstrapped and reg-tested in the same way as before.

I've moved pass_expand_omp_ssa one down in the pass list, past pass_fre.

This allows fre to unify references to the same omp variable before entering pass_oacc_kernels, which helps pass_lim in pass_oacc_kernels.

For instance, this reduction fragment:
...
  # VUSE <.MEM_8>
  # PT = { D.2282 }
  _67 = .omp_data_i_59->sumD.2270;
  # VUSE <.MEM_8>
  _68 = *_67;
  _70 = _66 + _68;
  # VUSE <.MEM_8>
  # PT = { D.2282 }
  _69 = .omp_data_i_59->sumD.2270;
  # .MEM_71 = VDEF <.MEM_8>
  *_69 = _70;
...

is transformed by fre into:
...
  # VUSE <.MEM_8>
  # PT = { D.2282 }
  _67 = .omp_data_i_59->sumD.2270;
  # VUSE <.MEM_8>
  _68 = *_67;
  _70 = _66 + _68;
  # .MEM_71 = VDEF <.MEM_8>
  *_67 = _70;
...

In order for pass_fre to respect the kernels region boundaries, I've added a change in tree-ssa-sccvn.c:visit_use to handle the .omp_data_i init conservatively.

Bootstrapped and reg-tested as before.

OK for trunk?

Thanks,
- Tom

[PATCH 1/7] Expand oacc kernels after pass_fre

2014-11-25  Tom de Vries  t...@codesourcery.com

	* function.h (struct function): Add contains_oacc_kernels field.
	* gimplify.c (gimplify_omp_workshare): Set contains_oacc_kernels.
	* omp-low.c: Include gimple-pretty-print.h.
	(release_first_vuse_in_edge_dest): New function.
	(expand_omp_target): Handle ssa-code.
	(expand_omp): Don't expand GIMPLE_OACC_KERNELS when not in ssa.
	(pass_data_expand_omp): Don't set PROP_gimple_eomp unconditionally in
	properties_provided field.
	(pass_expand_omp::execute): Set PROP_gimple_eomp in
	cfun->curr_properties only if cfun does not contain oacc kernels.
	(pass_data_expand_omp_ssa): Add TODO_remove_unused_locals to
	todo_flags_finish field.
	(pass_expand_omp_ssa::execute): Release dangling SSA_NAMEs after calling
	execute_expand_omp.
	(lower_omp_target): Add static_arrays variable, init to 1. Don't use
	static arrays for kernels directive. Use static_arrays variable. Handle
	case that .omp_data_kinds is not static.
	(gimple_stmt_ssa_operand_references_var_p)
	(gimple_stmt_omp_data_i_init_p): New function.
	* omp-low.h (gimple_stmt_omp_data_i_init_p): Declare.
	* passes.def: Add pass_expand_omp_ssa after pass_fre. Add
	pass_expand_omp_ssa after pass_all_early_optimizations.
	* tree-ssa-ccp.c: Include omp-low.h.
	(surely_varying_stmt_p, ccp_visit_stmt): Handle
Re: [PATCH, 1/8] Expand oacc kernels after pass_build_ealias
On 15-11-14 18:19, Tom de Vries wrote:
On 15-11-14 13:14, Tom de Vries wrote:

Hi,

I'm submitting a patch series with initial support for the oacc kernels directive. The patch series uses pass_parallelize_loops to implement parallelization of loops in the oacc kernels region.

The patch series consists of these 8 patches:
...
1 Expand oacc kernels after pass_build_ealias
2 Add pass_oacc_kernels
3 Add pass_ch_oacc_kernels to pass_oacc_kernels
4 Add pass_tree_loop_{init,done} to pass_oacc_kernels
5 Add pass_loop_im to pass_oacc_kernels
6 Add pass_ccp to pass_oacc_kernels
7 Add pass_parloops_oacc_kernels to pass_oacc_kernels
8 Do simple omp lowering for no address taken var
...

This patch moves omp expansion of the oacc kernels directive to after pass_build_ealias.

The rationale is that in order to use pass_parallelize_loops for analysis and transformation of an oacc kernels region, we postpone omp expansion of that region until the earliest point in the pass list where enough information is available to run pass_parallelize_loops, in other words, after pass_build_ealias.

The patch postpones expansion in expand_omp, and ensures expansion by adding pass_expand_omp_ssa:
- after pass_build_ealias, and
- after pass_all_early_optimizations for the case we're not optimizing.

In order to make sure the oacc kernels region arrives at pass_expand_omp_ssa the way it left expand_omp, the patch makes pass_ccp and pass_forwprop aware of lowered omp code, to handle it conservatively.

The patch contains changes in expand_omp_target to deal with ssa-code, similar to what is already present in expand_omp_taskreg.

Furthermore, the patch forces the .omp_data_sizes and .omp_data_kinds to not be static for oacc kernels. It does this to get some references to .omp_data_sizes and .omp_data_kinds in the ssa code. Without these references, the definitions will be removed. The reference to the variables in GIMPLE_OACC_KERNELS is not enough to keep them from being removed.
[ In vries/oacc-kernels, I used a BUILT_IN_USE kludge for this purpose ].

Finally, at the end of pass_expand_omp_ssa we're left with SSA_NAMEs in the original function whose definitions have been removed (that is, moved to the split-off function). TODO_remove_unused_locals takes care of some of them, but not the anonymous ones. So the patch iterates over all SSA_NAMEs to find these dangling SSA_NAMEs and releases them.

Reposting with small update: I've replaced the use of the rather generic gimple_stmt_omp_lowering_p with the more specific gimple_stmt_omp_data_i_init_p.

Bootstrapped and reg-tested in the same way as before.

OK for trunk?

Thanks,
- Tom

2014-11-14  Tom de Vries  t...@codesourcery.com

	* function.h (struct function): Add contains_oacc_kernels field.
	* gimplify.c (gimplify_omp_workshare): Set contains_oacc_kernels.
	* omp-low.c: Include gimple-pretty-print.h.
	(release_first_vuse_in_edge_dest): New function.
	(expand_omp_target): Handle ssa-code.
	(expand_omp): Don't expand GIMPLE_OACC_KERNELS when not in ssa.
	(pass_data_expand_omp): Don't set PROP_gimple_eomp unconditionally in
	properties_provided field.
	(pass_expand_omp::execute): Set PROP_gimple_eomp in
	cfun->curr_properties only if cfun does not contain oacc kernels.
	(pass_data_expand_omp_ssa): Add TODO_remove_unused_locals to
	todo_flags_finish field.
	(pass_expand_omp_ssa::execute): Release dangling SSA_NAMEs after calling
	execute_expand_omp.
	(lower_omp_target): Add static_arrays variable, init to 1. Don't use
	static arrays for kernels directive. Use static_arrays variable. Handle
	case that .omp_data_kinds is not static.
	(gimple_stmt_ssa_operand_references_var_p)
	(gimple_stmt_omp_data_i_init_p): New function.
	* omp-low.h (gimple_stmt_omp_data_i_init_p): Declare.
	* passes.def: Add pass_expand_omp_ssa after pass_build_ealias.
	* tree-ssa-ccp.c: Include omp-low.h.
	(surely_varying_stmt_p, ccp_visit_stmt): Handle omp lowering code
	conservatively.
	* tree-ssa-forwprop.c: Include omp-low.h.
	(pass_forwprop::execute): Handle omp lowering code conservatively.
---
 gcc/function.h          |   3 +
 gcc/gimplify.c          |   1 +
 gcc/omp-low.c           | 196 +---
 gcc/omp-low.h           |   1 +
 gcc/passes.def          |   2 +
 gcc/tree-ssa-ccp.c      |   6 ++
 gcc/tree-ssa-forwprop.c |   4 +-
 7 files changed, 200 insertions(+), 13 deletions(-)

diff --git a/gcc/function.h b/gcc/function.h
index 3a6305c..bb48775 100644
--- a/gcc/function.h
+++ b/gcc/function.h
@@ -667,6 +667,9 @@ struct GTY(()) function {
   /* Set when the tail call has been identified. */
   unsigned int tail_call_marked : 1;
+
+  /* Set when the function contains oacc kernels directives. */
+  unsigned int contains_oacc_kernels : 1;
 };
 
 /* Add the decl D to the local_decls list of FUN. */
diff --git a/gcc/gimplify.c b/gcc/gimplify.c
index ad48d51..c40f20f 100644
--- a/gcc/gimplify.c
+++