[GitHub] [incubator-tvm] roastduck commented on a change in pull request #5498: [Optimization] Warp level reduction support for CUDA

GitBox Sat, 02 May 2020 23:31:16 -0700


roastduck commented on a change in pull request #5498:
URL: https://github.com/apache/incubator-tvm/pull/5498#discussion_r419055679




##########
File path: src/tir/transforms/lower_warp_memory.cc
##########
@@ -265,10 +265,11 @@ class WarpAccessRewriter : protected StmtExprMutator {
           << op->index << " local_index=" << local_index;
       PrimExpr load_value = LoadNode::make(
           op->dtype, op->buffer_var, local_index, op->predicate);
+      PrimExpr mask = IntImm(DataType::UInt(32), 0xFFFFFFFF);

Review comment:
       Setting the mask to `0xFFFFFFFF` here might not work, because this call site 
is probably inside a branch, for example the boundary check `if (threadIdx.x < 
n)`. It would be great if we could generate an accurate mask here. Otherwise we 
should probably keep the old `__shfl` call for compatibility, maybe by adding 
an "unknown" option to the mask. To be more explicit: downgrading 
`__shfl_sync` to `__shfl` is straightforward, but upgrading `__shfl` to 
`__shfl_sync` is not.
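
To make the concern concrete, here is a minimal host-side sketch in C++ of what an "accurate mask" would look like for the boundary check `if (threadIdx.x < n)`. The helper `ActiveLaneMask` and its parameters are hypothetical names used only for illustration and are not part of TVM or of this PR. The sketch assumes a 32-lane warp and a one-dimensional thread block.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of TVM): returns the bitmask of lanes in one
// 32-lane warp that pass the boundary check `threadIdx.x < n`.
// `warp_base` is the threadIdx.x value of lane 0 of that warp.
uint32_t ActiveLaneMask(int warp_base, int n) {
  assert(warp_base >= 0 && warp_base % 32 == 0);
  int active = n - warp_base;             // number of lanes below the bound
  if (active <= 0) return 0u;             // the whole warp fails the check
  if (active >= 32) return 0xFFFFFFFFu;   // the whole warp passes the check
  return (1u << active) - 1u;             // only the low `active` bits are set
}
```

With `n = 40`, the warp starting at thread 32 has only 8 lanes inside the branch, so the mask is `0x000000FF`. A hard-coded `0xFFFFFFFF` would name 24 lanes that never reach the shuffle, which is the situation the comment above warns about.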




