Re: [PATCH, 1/8] Expand oacc kernels after pass_build_ealias

2014-11-25 Thread Tom de Vries

On 24-11-14 11:56, Tom de Vries wrote:

On 15-11-14 18:19, Tom de Vries wrote:

On 15-11-14 13:14, Tom de Vries wrote:

Hi,

I'm submitting a patch series with initial support for the oacc kernels
directive.

The patch series uses pass_parallelize_loops to implement parallelization of
loops in the oacc kernels region.

The patch series consists of these 8 patches:
...
 1  Expand oacc kernels after pass_build_ealias
 2  Add pass_oacc_kernels
 3  Add pass_ch_oacc_kernels to pass_oacc_kernels
 4  Add pass_tree_loop_{init,done} to pass_oacc_kernels
 5  Add pass_loop_im to pass_oacc_kernels
 6  Add pass_ccp to pass_oacc_kernels
 7  Add pass_parloops_oacc_kernels to pass_oacc_kernels
 8  Do simple omp lowering for no address taken var
...


This patch moves omp expansion of the oacc kernels directive to after
pass_build_ealias.

The rationale is that, in order to use pass_parallelize_loops for analysis and
transformation of an oacc kernels region, we postpone omp expansion of that
region until the earliest point in the pass list where enough information is
available to run pass_parallelize_loops, in other words, after pass_build_ealias.

The patch postpones expansion in expand_omp, and ensures expansion by adding
pass_expand_omp_ssa:
- after pass_build_ealias, and
- after pass_all_early_optimizations for the case we're not optimizing.

In order to make sure the oacc kernels region arrives at pass_expand_omp_ssa in
the state in which it left expand_omp, the patch makes pass_ccp and
pass_forwprop aware of lowered omp code, so that they handle it conservatively.

The patch contains changes in expand_omp_target to deal with ssa-code, similar
to what is already present in expand_omp_taskreg.

Furthermore, the patch forces .omp_data_sizes and .omp_data_kinds to be
non-static for oacc kernels. It does this to get some references to
.omp_data_sizes and .omp_data_kinds in the ssa code.  Without these references,
the definitions will be removed. The reference to the variables in
GIMPLE_OACC_KERNELS is not enough to keep them from being removed. [ In
vries/oacc-kernels, I used a BUILT_IN_USE kludge for this purpose ].
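
To illustrate why a static array leaves no references in the function body,
here is a minimal standalone C sketch. It is not code from the patch:
consume_sizes, with_static_sizes and with_local_sizes are hypothetical names,
and the arrays merely stand in for .omp_data_sizes. A static array is
initialized in the data section, so the function body contains no stores to it;
a non-static array is filled in by explicit stores, which show up as statements
in the function.

```c
#include <stddef.h>

/* Hypothetical stand-in for a runtime entry point that reads the array.  */
size_t
consume_sizes (const size_t *sizes, size_t n)
{
  size_t total = 0;
  for (size_t i = 0; i < n; i++)
    total += sizes[i];
  return total;
}

/* Static array: initialized in the data section, so the function body
   contains no stores to it.  */
size_t
with_static_sizes (void)
{
  static const size_t sizes[2] = { sizeof (int), sizeof (double) };
  return consume_sizes (sizes, 2);
}

/* Non-static array: the initialization is a pair of explicit stores in the
   function body.  */
size_t
with_local_sizes (void)
{
  size_t sizes[2];
  sizes[0] = sizeof (int);
  sizes[1] = sizeof (double);
  return consume_sizes (sizes, 2);
}
```

Both functions return the same value; only the second one carries the array
initialization as statements inside the function.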

Finally, at the end of pass_expand_omp_ssa we're left with SSA_NAMEs in the
original function whose definition has been removed (that is, moved to the
split-off function). TODO_remove_unused_locals takes care of some of them, but
not the anonymous ones. So the patch iterates over all SSA_NAMEs to find these
dangling SSA_NAMEs and releases them.
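
The release step can be pictured with the following toy model. This is only a
sketch under stated assumptions: struct toy_ssa_name and
release_dangling_names are hypothetical and do not use GCC's tree or SSA APIs,
and skipping default definitions is my assumption about what such a walk
would need to do, not something taken from the patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for an SSA name; not GCC's tree representation.  */
struct toy_ssa_name
{
  int version;
  bool has_def_in_function; /* False once the def moved to the child fn.  */
  bool is_default_def;      /* Assumed: default defs have no def stmt.  */
  bool released;
};

/* Walk all names and release those whose definition is gone.
   Return the number of names released by this call.  */
size_t
release_dangling_names (struct toy_ssa_name *names, size_t n)
{
  size_t count = 0;
  for (size_t i = 0; i < n; i++)
    {
      struct toy_ssa_name *name = &names[i];
      if (name->released
          || name->is_default_def
          || name->has_def_in_function)
        continue;
      name->released = true;
      count++;
    }
  return count;
}
```

Running the walk a second time releases nothing, since every dangling name is
already marked as released.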



Reposting with small update: I've replaced the use of the rather generic
gimple_stmt_omp_lowering_p with the more specific gimple_stmt_omp_data_i_init_p.

Bootstrapped and reg-tested in the same way as before.



I've moved pass_expand_omp_ssa one down in the pass list, past pass_fre.

This allows fre to unify references to the same omp variable before entering 
pass_oacc_kernels, which helps pass_lim in pass_oacc_kernels.


For instance, this reduction fragment:
...
  # VUSE .MEM_8
  # PT = { D.2282 }
  _67 = .omp_data_i_59->sumD.2270;
  # VUSE .MEM_8
  _68 = *_67;

  _70 = _66 + _68;

  # VUSE .MEM_8
  # PT = { D.2282 }
  _69 = .omp_data_i_59->sumD.2270;
  # .MEM_71 = VDEF .MEM_8
  *_69 = _70;
...

is transformed by fre into:
...
  # VUSE .MEM_8
  # PT = { D.2282 }
  _67 = .omp_data_i_59->sumD.2270;
  # VUSE .MEM_8
  _68 = *_67;

  _70 = _66 + _68;

  # .MEM_71 = VDEF .MEM_8
  *_67 = _70;
...

In order for pass_fre to respect the kernels region boundaries, I've added a 
change in tree-ssa-sccvn.c:visit_use to handle the .omp_data_i init conservatively.
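
For context, a reduction fragment like the one above could come from source
along the following lines. This is a hypothetical test case, not one taken from
the patch series; sum_array and its parameters are assumptions made for
illustration. Without OpenACC support enabled, the pragma is ignored and the
loop simply runs serially.

```c
/* Hypothetical example: sum an array inside an oacc kernels region.
   The variable sum is read and written in every iteration, which is the
   load/add/store pattern on the same location seen in the dump.  */
unsigned int
sum_array (const unsigned int *a, int n)
{
  unsigned int sum = 0;

#pragma acc kernels copyin (a[0:n]) copy (sum)
  {
    for (int i = 0; i < n; i++)
      sum += a[i];
  }

  return sum;
}
```

Once the two address computations for sum have been unified, the load and the
store in the loop body go through the same pointer.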


Bootstrapped and reg-tested as before.

OK for trunk?

Thanks,
- Tom

[PATCH 1/7] Expand oacc kernels after pass_fre

2014-11-25  Tom de Vries  t...@codesourcery.com

	* function.h (struct function): Add contains_oacc_kernels field.
	* gimplify.c (gimplify_omp_workshare): Set contains_oacc_kernels.
	* omp-low.c: Include gimple-pretty-print.h.
	(release_first_vuse_in_edge_dest): New function.
	(expand_omp_target): Handle ssa-code.
	(expand_omp): Don't expand GIMPLE_OACC_KERNELS when not in ssa.
	(pass_data_expand_omp): Don't set PROP_gimple_eomp unconditionally in
	properties_provided field.
	(pass_expand_omp::execute): Set PROP_gimple_eomp in
	cfun->curr_properties only if cfun does not contain oacc kernels.
	(pass_data_expand_omp_ssa): Add TODO_remove_unused_locals to
	todo_flags_finish field.
	(pass_expand_omp_ssa::execute): Release dangling SSA_NAMEs after calling
	execute_expand_omp.
	(lower_omp_target): Add static_arrays variable, init to 1.  Don't use
	static arrays for kernels directive.  Use static_arrays variable.
	Handle case that .omp_data_kinds is not static.
	(gimple_stmt_ssa_operand_references_var_p)
	(gimple_stmt_omp_data_i_init_p): New function.
	* omp-low.h (gimple_stmt_omp_data_i_init_p): Declare.
	* passes.def: Add pass_expand_omp_ssa after pass_fre.  Add
	pass_expand_omp_ssa after pass_all_early_optimizations.
	* tree-ssa-ccp.c: Include omp-low.h.
	(surely_varying_stmt_p, ccp_visit_stmt): Handle 

Re: [PATCH, 1/8] Expand oacc kernels after pass_build_ealias

2014-11-24 Thread Tom de Vries

On 15-11-14 18:19, Tom de Vries wrote:

On 15-11-14 13:14, Tom de Vries wrote:

Hi,

I'm submitting a patch series with initial support for the oacc kernels
directive.

The patch series uses pass_parallelize_loops to implement parallelization of
loops in the oacc kernels region.

The patch series consists of these 8 patches:
...
 1  Expand oacc kernels after pass_build_ealias
 2  Add pass_oacc_kernels
 3  Add pass_ch_oacc_kernels to pass_oacc_kernels
 4  Add pass_tree_loop_{init,done} to pass_oacc_kernels
 5  Add pass_loop_im to pass_oacc_kernels
 6  Add pass_ccp to pass_oacc_kernels
 7  Add pass_parloops_oacc_kernels to pass_oacc_kernels
 8  Do simple omp lowering for no address taken var
...


This patch moves omp expansion of the oacc kernels directive to after
pass_build_ealias.

The rationale is that, in order to use pass_parallelize_loops for analysis and
transformation of an oacc kernels region, we postpone omp expansion of that
region until the earliest point in the pass list where enough information is
available to run pass_parallelize_loops, in other words, after pass_build_ealias.

The patch postpones expansion in expand_omp, and ensures expansion by adding
pass_expand_omp_ssa:
- after pass_build_ealias, and
- after pass_all_early_optimizations for the case we're not optimizing.

In order to make sure the oacc kernels region arrives at pass_expand_omp_ssa in
the state in which it left expand_omp, the patch makes pass_ccp and
pass_forwprop aware of lowered omp code, so that they handle it conservatively.

The patch contains changes in expand_omp_target to deal with ssa-code, similar
to what is already present in expand_omp_taskreg.

Furthermore, the patch forces .omp_data_sizes and .omp_data_kinds to be
non-static for oacc kernels. It does this to get some references to
.omp_data_sizes and .omp_data_kinds in the ssa code.  Without these references,
the definitions will be removed. The reference to the variables in
GIMPLE_OACC_KERNELS is not enough to keep them from being removed. [ In
vries/oacc-kernels, I used a BUILT_IN_USE kludge for this purpose ].

Finally, at the end of pass_expand_omp_ssa we're left with SSA_NAMEs in the
original function whose definition has been removed (that is, moved to the
split-off function). TODO_remove_unused_locals takes care of some of them, but
not the anonymous ones. So the patch iterates over all SSA_NAMEs to find these
dangling SSA_NAMEs and releases them.



Reposting with small update: I've replaced the use of the rather generic 
gimple_stmt_omp_lowering_p with the more specific gimple_stmt_omp_data_i_init_p.


Bootstrapped and reg-tested in the same way as before.


OK for trunk?

Thanks,
- Tom



2014-11-14  Tom de Vries  t...@codesourcery.com

	* function.h (struct function): Add contains_oacc_kernels field.
	* gimplify.c (gimplify_omp_workshare): Set contains_oacc_kernels.
	* omp-low.c: Include gimple-pretty-print.h.
	(release_first_vuse_in_edge_dest): New function.
	(expand_omp_target): Handle ssa-code.
	(expand_omp): Don't expand GIMPLE_OACC_KERNELS when not in ssa.
	(pass_data_expand_omp): Don't set PROP_gimple_eomp unconditionally in
	properties_provided field.
	(pass_expand_omp::execute): Set PROP_gimple_eomp in
	cfun->curr_properties only if cfun does not contain oacc kernels.
	(pass_data_expand_omp_ssa): Add TODO_remove_unused_locals to
	todo_flags_finish field.
	(pass_expand_omp_ssa::execute): Release dangling SSA_NAMEs after calling
	execute_expand_omp.
	(lower_omp_target): Add static_arrays variable, init to 1.  Don't use
	static arrays for kernels directive.  Use static_arrays variable.
	Handle case that .omp_data_kinds is not static.
	(gimple_stmt_ssa_operand_references_var_p)
	(gimple_stmt_omp_data_i_init_p): New function.
	* omp-low.h (gimple_stmt_omp_data_i_init_p): Declare.
	* passes.def: Add pass_expand_omp_ssa after pass_build_ealias.
	* tree-ssa-ccp.c: Include omp-low.h.
	(surely_varying_stmt_p, ccp_visit_stmt): Handle omp lowering code
	conservatively.
	* tree-ssa-forwprop.c: Include omp-low.h.
	(pass_forwprop::execute): Handle omp lowering code conservatively.
---
 gcc/function.h  |   3 +
 gcc/gimplify.c  |   1 +
 gcc/omp-low.c   | 196 +---
 gcc/omp-low.h   |   1 +
 gcc/passes.def  |   2 +
 gcc/tree-ssa-ccp.c  |   6 ++
 gcc/tree-ssa-forwprop.c |   4 +-
 7 files changed, 200 insertions(+), 13 deletions(-)

diff --git a/gcc/function.h b/gcc/function.h
index 3a6305c..bb48775 100644
--- a/gcc/function.h
+++ b/gcc/function.h
@@ -667,6 +667,9 @@ struct GTY(()) function {
 
   /* Set when the tail call has been identified.  */
   unsigned int tail_call_marked : 1;
+
+  /* Set when the function contains oacc kernels directives.  */
+  unsigned int contains_oacc_kernels : 1;
 };
 
 /* Add the decl D to the local_decls list of FUN.  */
diff --git a/gcc/gimplify.c b/gcc/gimplify.c
index ad48d51..c40f20f 100644
--- a/gcc/gimplify.c
+++