[GitHub] [incubator-tvm] wpan11nv commented on pull request #5382: [TE] Fix MakeLoopNest for warp memory
wpan11nv commented on pull request #5382:
URL: https://github.com/apache/incubator-tvm/pull/5382#issuecomment-620140371

> Do you mean requiring the users to tag the iter_var or we do it in InferBound? If the former, maybe we can merge this PR first and then start an RFC for the API change. If the latter, can you point out where to tag the iter_var? I am not familiar with InferBound.

OK. I do not have a good idea for this. We could probably just use this convention. I think there is no issue for CUDA. For OpenCL, AFAIK a warp or subgroup is not necessarily dispatched along threadIdx.x. Perhaps we care less about the OpenCL part?

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [incubator-tvm] wpan11nv commented on pull request #5382: [TE] Fix MakeLoopNest for warp memory
wpan11nv commented on pull request #5382:
URL: https://github.com/apache/incubator-tvm/pull/5382#issuecomment-619201349

Can we make it explicit, e.g. by tagging the binding iter_var with a "warp" scope? The existing "warp" scope attribute on the stage looks odd to me. Should it be an attribute on the iter_vars instead? For CUDA, when the block configuration is (1, 32), threadIdx.y may "define" the warp scope.
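
To illustrate the (1, 32) case above, here is a minimal sketch in plain Python (not TVM code) of how CUDA linearizes thread indices within a block, with x varying fastest, and groups consecutive linear IDs into warps. The helper name `warp_and_lane` and the constant `WARP_SIZE = 32` are assumptions made for this illustration; they are not part of the PR or of any TVM API.

```python
WARP_SIZE = 32  # assumed warp width (current NVIDIA GPUs)


def warp_and_lane(thread_idx, block_dim):
    """Return (warp_id, lane_id) for a thread within its block.

    Hypothetical helper: it mirrors CUDA's linear thread ID,
    x + y * Dx + z * Dx * Dy, and splits it into warps of WARP_SIZE.
    """
    tx, ty, tz = thread_idx
    dx, dy, _dz = block_dim
    linear = tx + ty * dx + tz * dx * dy
    return divmod(linear, WARP_SIZE)


# Block (32, 1, 1): the lane follows threadIdx.x.
print(warp_and_lane((5, 0, 0), (32, 1, 1)))

# Block (1, 32, 1): threadIdx.x is always 0, so the lane follows threadIdx.y.
print(warp_and_lane((0, 5, 0), (1, 32, 1)))
```

In both configurations the thread lands in warp 0, lane 5, but a different index supplies the lane, which is why a fixed "threadIdx.x is the warp index" convention does not cover every block shape.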