================
@@ -61,6 +65,78 @@ static uint32_t gpu_irregular_simd_reduce(void *reduce_data,
   return (logical_lane_id == 0);
 }
 
+// Reduction within a block on the GPU.
+//
+// Template parameters:
+// - checkLiveness: Whether to check the liveness of the lanes. This is only
+//                  useful if gpu_block_reduce is called in a context where
+//                  L2 parallel regions are possible.
----------------
jdoerfert wrote:

L2 parallel regions are sequentialized, no? That should be the trivial case of `num_threads == 1` handled in `nvptx_parallel_reduce_nowait`. Am I missing something?

https://github.com/llvm/llvm-project/pull/195102
_______________________________________________
cfe-commits mailing list
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
