Jakub,
the following patch series implements the reduction handling for OpenACC:

01-trunk-reductions-core-1102.patch  Core execution changes
02-trunk-reductions-ptx-1102.patch   PTX backend bits
03-trunk-reductions-tests-1102.patch Testcases


The reduction mechanism relies on a new internal builtin, IFN_GOACC_REDUCTION, which is used in 4 different places. If you recall, the loop partitioning is managed with FORK and JOIN unique_fn markers. The reductions go around these as follows:

IFN_UNIQUE (HEAD_MARKER ...)
IFN_REDUCTION (SETUP ...)
IFN_UNIQUE (FORK ...)
IFN_REDUCTION (INIT ...)
IFN_UNIQUE (HEAD_MARKER)
<loop here>
IFN_UNIQUE (TAIL_MARKER ...)
IFN_REDUCTION (FINI ...)
IFN_UNIQUE (JOIN ...)
IFN_REDUCTION (TEARDOWN ...)
IFN_UNIQUE (TAIL_MARKER)


There's a quad of functions for each reduction variable of the loop. If a loop is partitioned over multiple dimensions, there are additional quads for each dimension, surrounding the fork/join for that dimension.
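
For concreteness, here is the kind of source loop that would give rise to one such quad. This is only an illustrative sketch: the function name sum_array and the copyin clause are my own choices for the example, not something taken from the patches. Built without -fopenacc the pragma is ignored and the loop runs as ordinary serial code.

```c
#include <stddef.h>

/* Hypothetical example: one reduction variable ('sum') on one
   partitioned loop, so one SETUP/INIT/FINI/TEARDOWN quad per
   partitioned dimension would be expected around the loop.  */
double sum_array(const double *a, int n)
{
    double sum = 0.0;

    /* Ignored when OpenACC is not enabled, leaving a serial loop.  */
#pragma acc parallel loop reduction(+:sum) copyin(a[0:n])
    for (int i = 0; i < n; i++)
        sum += a[i];

    return sum;
}
```

Here sum plays the role of the reduction variable, and + is the reduction operator.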

All the reduction calls have the same form:

V = REDUCTION (KIND, REF_TO_RES, LOCAL_VAR, LEVEL, OP, OFFSET)

REF_TO_RES is a pointer to a receiver object. It is a null pointer constant if there is no such object.
LOCAL_VAR is the executing thread's instance of the reduction variable.
LEVEL is the dimension across which this reduction is partitioned (gang, worker, vector). As with the head/tail markers, this assignment of level is deferred to the target compiler.
OP is the reduction operator.
OFFSET is an offset into a hypothetical buffer allocated for all the reductions of this particular loop. It's a way of identifying which quad of reductions apply to the same logical variable, and happens to be useful in some use cases (I'll expand on that in the PTX fragment).

All these functions return a new value for the local variable.

When everything collapses to a single thread (i.e. on the host), the implementation of these functions is trivial.

SETUP
  - if REF_TO_RES is not a null pointer constant, return *REF_TO_RES, else return LOCAL_VAR (this is a compile-time check)
INIT & FINI
  - return LOCAL_VAR
TEARDOWN
  - if REF_TO_RES is not a null pointer constant, *REF_TO_RES = LOCAL_VAR.
    Always return LOCAL_VAR
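
The single-thread behaviour described above can be modelled in plain C roughly as follows. This is a sketch and not code from the patches: the names red_setup, red_init, red_fini, red_teardown and host_sum are hypothetical, only REF_TO_RES and LOCAL_VAR are modelled (LEVEL, OP and OFFSET are omitted), the type is fixed to int, and the null check is done at run time here, whereas the real expansion decides it at compile time.

```c
#include <stddef.h>

/* SETUP: start from the receiver object if there is one.  */
int red_setup(int *ref_to_res, int local_var)
{
    return ref_to_res ? *ref_to_res : local_var;
}

/* INIT and FINI: nothing to do with a single thread.  */
int red_init(int *ref_to_res, int local_var)
{
    (void)ref_to_res;
    return local_var;
}

int red_fini(int *ref_to_res, int local_var)
{
    (void)ref_to_res;
    return local_var;
}

/* TEARDOWN: write back to the receiver object if there is one,
   and always return the local value.  */
int red_teardown(int *ref_to_res, int local_var)
{
    if (ref_to_res)
        *ref_to_res = local_var;
    return local_var;
}

/* A '+' reduction over an array, wrapped in the quad in the same
   order as the marker sequence: SETUP, INIT, loop, FINI, TEARDOWN.  */
int host_sum(int *ref_to_res, int local_var, const int *a, int n)
{
    int v = red_setup(ref_to_res, local_var);
    v = red_init(ref_to_res, v);
    for (int i = 0; i < n; i++)
        v += a[i];
    v = red_fini(ref_to_res, v);
    v = red_teardown(ref_to_res, v);
    return v;
}
```

With a receiver object holding 10 and the array {1, 2, 3}, host_sum returns 16 and leaves 16 in the receiver. With a null REF_TO_RES and LOCAL_VAR of 5 it returns 11 and writes nothing back.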
