[GitHub] [tvm] Hzfengsy opened a new pull request, #13534: [TIR] Introduce Pass UnifyKernelLaunch and speedup split op

GitBox Thu, 01 Dec 2022 17:55:45 -0800


Hzfengsy opened a new pull request, #13534:
URL: https://github.com/apache/tvm/pull/13534


   This PR introduces a new pass, UnifyKernelLaunch, which fuses adjacent kernels 
that share the same thread configuration. This reduces the kernel launch overhead 
of the `split` operator.
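
As a rough illustration of the idea only (this is not the pass's implementation, which operates on TIR), the sketch below groups consecutive kernels whose thread configurations are identical. The `Kernel` class, the `unify_adjacent` helper, and the example thread extents are all hypothetical, as is the assumption that each `split` output would otherwise get its own launch.

```python
from dataclasses import dataclass
from itertools import groupby
from typing import List, Tuple


@dataclass(frozen=True)
class Kernel:
    """Hypothetical stand-in for one kernel launch (not a TVM class)."""
    name: str
    # Thread config as (thread tag, extent) pairs, e.g. ("threadIdx.x", 75).
    thread_config: Tuple[Tuple[str, int], ...]


def unify_adjacent(kernels: List[Kernel]) -> List[List[Kernel]]:
    """Group consecutive kernels that share an identical thread config.

    Each returned group models one fused launch. Kernels with the same
    config that are not adjacent stay in separate groups.
    """
    return [list(group) for _, group in groupby(kernels, key=lambda k: k.thread_config)]


# Assumed scenario: one kernel per split output, all with the same config.
config = (("blockIdx.x", 1), ("threadIdx.x", 75))
kernels = [Kernel(f"split_{i}", config) for i in range(182)]

fused = unify_adjacent(kernels)
print(len(kernels), "launches before,", len(fused), "after")
```

Under these assumptions the 182 launches collapse into a single group, which is the effect the pass is aiming for on `split`.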
   
   Tested with the following setup:
   
   ```python
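   from tvm import te, topi

   # Split a (75, 182) int64 tensor into 182 sections along axis 1.
   A = te.placeholder((75, 182), "int64")
   B = topi.split(A, 182, axis=1)

   # Build one PrimFunc with A as the input and every section as an output.
   func = te.create_prim_func([A, *B])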
   ```
   
   The pass reduces the latency from 347.03 us to 5.22 us on my machine.
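
For scale, the two reported latencies can be turned into a speedup factor. The snippet below is plain arithmetic on the numbers quoted above; the variable names are only illustrative.

```python
# Latencies reported in this PR, in microseconds.
before_us = 347.03
after_us = 5.22

speedup = before_us / after_us   # ratio of old latency to new latency
saved_us = before_us - after_us  # absolute time saved per call

print(f"speedup: {speedup:.1f}x, saved: {saved_us:.2f} us")
```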
   
   cc @MasterJH5574 @vinx13 @junrushao 

