[GitHub] [incubator-tvm] t-vi commented on pull request #5600: [TOPI] Improve CUDA softmax scheduling

2020-06-04 Thread GitBox


t-vi commented on pull request #5600:
URL: https://github.com/apache/incubator-tvm/pull/5600#issuecomment-638823562


   @wpan11nv  Thanks for your offer to help. I submitted the clean-up in #5726, and 
in #5727 I add the ROCm warp reductions. One of the things I did there was to avoid 
assuming a fixed warp size of 32 in the TIR transformations that run before codegen.
   Thank you for improving softmax, by the way - it was something that looked odd 
with the four kernels before.
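
   To make the warp-size point concrete, a minimal sketch (not the actual patch) of 
reading the warp size from the target instead of hardcoding 32 would look like this; 
treating `thread_warp_size` as the attribute that carries it is my assumption:

```python
# Sketch only: query the warp size from the target instead of assuming 32.
# On most AMD GPUs the wavefront size is 64, so a hardcoded 32 breaks ROCm.
import tvm

target = tvm.target.rocm()
warp_size = target.thread_warp_size  # assumed target attribute; 64 on ROCm here
print(warp_size)
```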



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [incubator-tvm] t-vi commented on pull request #5600: [TOPI] Improve CUDA softmax scheduling

2020-06-03 Thread GitBox


t-vi commented on pull request #5600:
URL: https://github.com/apache/incubator-tvm/pull/5600#issuecomment-638622419


   I'm adding shfl intrinsics to the rocm bits (using 
`tvm.intrin.rule.rocm.tvm_warp_shuffle` / `-up` / `-down` definitions).
   I'm currently seeing an odd effect where `lower_thread_allreduce`'s 
`MakeAllreduce` gets a `tvm_thread_allreduce` call with null arguments. Eventually 
I hope to get to the codegen, where I'll probably run into the nvptx bits in the 
LLVM codegen. Is there a reason not to use the intrin.rule mechanism for nvptx?
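
   For reference, the registration follows roughly this pattern - a sketch only; the 
extern name and the 1:1 argument forwarding are placeholders, since the real rule 
has to unpack the mask/value/lane/width arguments for the ROCm builtins:

```python
# Sketch of hooking a lowering rule into the tvm.intrin.rule.rocm.* namespace.
# "__shfl" and the straight argument pass-through are illustrative assumptions.
import tvm
from tvm.target import register_intrin_rule

def _warp_shuffle(op):
    # op is the tvm_warp_shuffle call node; forward it as an extern call
    return tvm.tir.call_pure_extern(op.dtype, "__shfl", *op.args)

register_intrin_rule("rocm", "tvm_warp_shuffle", _warp_shuffle, override=True)
```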
   I'm not sure that running `gpu_imagenet_bench.py` (which I'm using as a first 
check of whether anything works) with the nvptx target works for me (though I do 
get to the codegen for it), and I would not know whether it worked before...
   







[GitHub] [incubator-tvm] t-vi commented on pull request #5600: [TOPI] Improve CUDA softmax scheduling

2020-06-03 Thread GitBox


t-vi commented on pull request #5600:
URL: https://github.com/apache/incubator-tvm/pull/5600#issuecomment-638329923


   I'll just work on a fix.







[GitHub] [incubator-tvm] t-vi commented on pull request #5600: [TOPI] Improve CUDA softmax scheduling

2020-06-03 Thread GitBox


t-vi commented on pull request #5600:
URL: https://github.com/apache/incubator-tvm/pull/5600#issuecomment-638275567


   So ROCm uses the CUDA schedule, but warp reductions don't currently seem to work 
on ROCm (so arguably it is ROCm that wants improving). Still, before this PR one 
could run resnet18 with the rocm backend, and now one cannot.
   The same problem shows up earlier, when running the warp reduction tests on ROCm.
   I've looked a bit into fixing it, but I haven't fully understood which of the 
three related patches it stems from.
   (Incidentally, it also triggered a corner case for me on cuda where nvrtc would 
accidentally use CUDA 8.0 instead of the 10.1 that the libnvrtc belonged to.)
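
   A minimal reproduction sketch would look roughly like this (assuming a 
ROCm-enabled TVM build and an AMD GPU; the workload helper is from `relay.testing`):

```python
# Rough reproduction sketch: compile ResNet-18 through Relay for the rocm target,
# which goes through the CUDA softmax schedule that this PR changed.
import tvm
from tvm import relay
from tvm.relay import testing

mod, params = testing.resnet.get_workload(num_layers=18, batch_size=1)
with tvm.transform.PassContext(opt_level=3):
    lib = relay.build(mod, target="rocm", params=params)
```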







[GitHub] [incubator-tvm] t-vi commented on pull request #5600: [TOPI] Improve CUDA softmax scheduling

2020-06-03 Thread GitBox


t-vi commented on pull request #5600:
URL: https://github.com/apache/incubator-tvm/pull/5600#issuecomment-638068589


   This broke the ROCm backend.


