[GitHub] [tvm] masahi commented on pull request #8636: [OpenCL] Add vectorization to cuda conv2d_nhwc schedule

GitBox Tue, 24 Aug 2021 04:00:23 -0700


masahi commented on pull request #8636:
URL: https://github.com/apache/tvm/pull/8636#issuecomment-904538888



   Thanks @echuraev, this is really interesting. With the new commit, I was 
able to generate a packed float16x2 instruction for AMD Vega or newer GPUs, 
which is supposed to be 2x faster than normal fp32 or fp16 instructions: 
https://gist.github.com/masahi/2de1a7dc87e2068ffb50ba6135273f95#file-conv2d_nhwc_float16x2-s-L495-L496
I'm going to experiment further.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

