On 03/02/2018 06:51 PM, Cesar Philippidis wrote:
This patch teaches the nvptx BE how to process vector reductions with
large vector lengths.
As with the "[nvptx] Generalize state propagation and synchronization"
patch:
- added use of MAX and ROUND_UP
- added missing initialization of
On 03/02/2018 06:51 PM, Cesar Philippidis wrote:
This patch teaches the nvptx BE how to process vector reductions with
large vector lengths.
Committed test-case exercising large vector length with reductions.
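
For reference, a minimal sketch of the kind of loop such a test-case
exercises. This is not the contents of vector-length-128-10.c; the
function name sum_vector and the choice of a plain integer sum are
assumptions made for illustration. With vector_length (128) the vector
spans four 32-thread warps, so the reduction result has to be combined
across warps. Compiled without -fopenacc the pragmas are ignored and the
loop simply runs serially, which gives the same sum.

```c
/* Hypothetical helper, not taken from the committed test-case: sums
   a[0..n-1] using an OpenACC vector reduction with a vector length
   larger than one warp.  */
static int
sum_vector (const int *a, int n)
{
  int sum = 0;

#pragma acc parallel vector_length (128) copyin (a[0:n]) copy (sum)
  {
#pragma acc loop vector reduction (+:sum)
    for (int i = 0; i < n; i++)
      sum += a[i];
  }

  return sum;
}
```

A caller would compare the returned value against a serially computed
sum of the same array and abort on mismatch.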
Thanks,
- Tom
[openacc] Add vector-length-128-10.c
2018-04-05 Tom de Vries
This patch teaches the nvptx BE how to process vector reductions with
large vector lengths. The original vector reduction finalizer won't work
because it uses warp shuffle operations, which only exchange values
within a single warp. Now that vectors may contain multiple warps, they
need to store the partial reductions in shared memory like