Hi,

I'm submitting a patch series with initial support for the oacc kernels directive.

The patch series uses pass_parallelize_loops to implement parallelization of loops in the oacc kernels region.
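
For readers less familiar with the directive, here is a minimal sketch of the kind of input a kernels region presents to the compiler. It is not taken from the patches: the function name vec_add and the choice of data clauses are assumptions for illustration only. The loop has no explicit 'acc loop' annotation, so deciding whether it can run in parallel is left to the compiler, which is the kind of analysis the series delegates to pass_parallelize_loops.

```c
#include <stddef.h>

/* Hypothetical example, not part of the patch series: an element-wise
   vector add inside an OpenACC kernels region.  When compiled without
   OpenACC support the pragma is ignored and the loop runs sequentially
   on the host, giving the same result.  */
void
vec_add (const int *a, const int *b, int *c, size_t n)
{
#pragma acc kernels copyin (a[0:n], b[0:n]) copyout (c[0:n])
  {
    /* No 'acc loop' directive here: the compiler has to prove the
       iterations independent before it may parallelize them.  */
    for (size_t i = 0; i < n; i++)
      c[i] = a[i] + b[i];
  }
}
```

Calling vec_add (a, b, c, n) on three non-overlapping arrays of n elements stores a[i] + b[i] in c[i], whether or not the region is offloaded.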

The patch series consists of these 8 patches:
...
    1  Expand oacc kernels after pass_build_ealias
    2  Add pass_oacc_kernels
    3  Add pass_ch_oacc_kernels to pass_oacc_kernels
    4  Add pass_tree_loop_{init,done} to pass_oacc_kernels
    5  Add pass_loop_im to pass_oacc_kernels
    6  Add pass_ccp to pass_oacc_kernels
    7  Add pass_parloops_oacc_kernels to pass_oacc_kernels
    8  Do simple omp lowering for no address taken var
...

The patch series does not yet apply cleanly to trunk, since it depends on the oacc middle-end changes present in the gomp-4_0-branch, which Thomas has already submitted for trunk.

Furthermore, it depends on an assert fix submitted for trunk ('Fix gcc_assert in expand_omp_for_static_chunk' @ https://gcc.gnu.org/ml/gcc-patches/2014-11/msg01149.html ).

The patch series is intended for trunk, but - given the dependency on the oacc middle end changes - has been bootstrapped for x86_64 on top of gomp-4_0-branch.

I'll post the patch series in reply to this email.

Thanks,
- Tom

[ FTR: to get clean libgomp and goacc test results in gomp-4_0-branch, and so have a good basis for testing, I used the following patch set:

 Don't allow flto-partition=balance for fopenacc
   Unsubmitted. This works around a compilation problem for
   libgomp/testsuite/libgomp.oacc-c-c++-common/asyncwait-2.c that I ran into on
   our internal dev branch.  I'll investigate whether I can reproduce with
   gomp-4_0-branch asap.

 Mark fopenacc as LTO option
   @ https://gcc.gnu.org/ml/gcc-patches/2014-11/msg00085.html

 Only use nvidia accelerator if present
   @ https://gcc.gnu.org/ml/gcc-patches/2014-11/msg00247.html

 Set default LIBGOMP_PLUGIN_PATH
   @ https://gcc.gnu.org/ml/gcc-patches/2014-11/msg00242.html
]
