masahi commented on pull request #9482:
URL: https://github.com/apache/tvm/pull/9482#issuecomment-964681835


   Hi @FranckQC, I've wanted TIR-level CSE for a long time, so I'm very excited 
to see this!
   
   What I want to do is eliminate common expressions that span the host and 
the GPU. For example, in the GPU `sort` kernel, I need to make `log2(N)` GPU 
kernel calls from the host to sort the input bottom up. In principle, `log2(N)` 
needs to be computed only once by the host and passed to the GPU kernel, but 
since we cannot CSE a `log2(N)` expression that appears in both the host and 
the GPU kernel, right now the GPU sort kernel is littered with `log2(N)` 
computations like this (note the many calls to `call_spirv_pure_glsl450`, which 
would be totally unnecessary if we had TIR-level CSE): 
https://gist.github.com/masahi/7a755ef67009e1a836e3212c53cf496f
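
   To make the idea concrete, here is a rough sketch in plain Python, not TVM 
or TIR code. All names (`log2_n`, `merge_pass`, `sort_recompute`, 
`sort_hoisted`, `counter`) are made up for illustration, and the built-in 
`sorted` stands in for a real merge step. It only contrasts re-evaluating 
`log2(N)` around every launch with computing it once on the host.

```python
import math

# Hypothetical counter: how many times log2(N) gets evaluated.
counter = {"log2": 0}

def log2_n(n):
    """Number of bottom-up passes needed for n elements (0 if n < 2)."""
    counter["log2"] += 1
    return math.ceil(math.log2(n)) if n >= 2 else 0

def merge_pass(data, width):
    """Stand-in for one kernel launch: merge adjacent sorted runs of `width`."""
    out = []
    for lo in range(0, len(data), 2 * width):
        # Placeholder for a real merge of two sorted runs.
        out.extend(sorted(data[lo:lo + 2 * width]))
    return out

def sort_recompute(data):
    """Mimics the situation without CSE: log2(N) is evaluated repeatedly."""
    n = len(data)
    p = 0
    while p < log2_n(n):      # evaluated by the host on every iteration
        log2_n(n)             # evaluated again, as if inside the kernel
        data = merge_pass(data, 1 << p)
        p += 1
    return data

def sort_hoisted(data):
    """What CSE across host and kernel would give: log2(N) evaluated once."""
    num_passes = log2_n(len(data))
    for p in range(num_passes):
        data = merge_pass(data, 1 << p)
    return data
```

   Both functions return the same sorted list; only the value of 
`counter["log2"]` differs between them.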
   
   Is this PR going to solve my problem?
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

