================
@@ -61,6 +65,78 @@ static uint32_t gpu_irregular_simd_reduce(void *reduce_data,
   return (logical_lane_id == 0);
 }
 
+// Reduction within a block on the GPU.
+//
+// Template parameters:
+// - checkLiveness: Whether to check the liveness of the lanes. This is only
+//                  useful if gpu_block_reduce is called in a context where
+//                  L2 parallel regions are possible.
----------------
ro-i wrote:

They can have dispersed lanes, afaiu, so num_threads > 1 but the lanes are
not contiguous. But checkLiveness is also used for contiguous partial warps
(see the previous parallel_reduce code). I fixed that in the commits I'm about
to push.
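
To make the distinction concrete, here is a minimal host-side sketch of the three lane layouts discussed above. It is not DeviceRTL code: `LaneLayout` and `classifyLaneMask` are hypothetical names, and the sketch assumes a 32-lane warp, a non-zero mask, and that a contiguous partial warp starts at lane 0.

```cpp
#include <cstdint>

// Hypothetical illustration only; these names do not exist in the DeviceRTL.
enum class LaneLayout { FullWarp, ContiguousPartial, Dispersed };

// Classify a 32-bit active-lane mask, where bit i set means lane i is live.
// Assumption: mask != 0, and contiguous layouts start at lane 0.
inline LaneLayout classifyLaneMask(uint32_t mask) {
  if (mask == 0xFFFFFFFFu)
    return LaneLayout::FullWarp;
  // A mask of the form 0...01...1 has no bits in common with mask + 1.
  if ((mask & (mask + 1u)) == 0)
    return LaneLayout::ContiguousPartial;
  // Anything else has gaps between live lanes.
  return LaneLayout::Dispersed;
}
```

Under these assumptions, both `ContiguousPartial` and `Dispersed` are cases where a liveness check would matter, which is why the comment should not mention only L2 parallel regions.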

https://github.com/llvm/llvm-project/pull/195102
_______________________________________________
cfe-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
