masahi commented on PR #15878:
URL: https://github.com/apache/tvm/pull/15878#issuecomment-1748329331

   > it might be desirable to have a way to have packed calls inside dataflow blocks
   
   Very strong +1 for this need. This week I integrated a KV cache update
kernel and ran into the problem of how to call this op when the module being
defined is already under a dataflow scope. To work around this, I ended up
[returning the in-place updated arguments from the kernel](https://github.com/masahi/tvm/blob/contrib-vllm/src/runtime/contrib/vllm/cache_kernels.cu#L88)
to make it look like a pure operation from the outside, and
[calling it](https://github.com/masahi/mlc-llm/blob/llama-batched-vllm/mlc_llm/relax_model/llama_batched.py#L163-L174)
via `call_pure_packed`. Although it gets the job done, it feels very awkward
to return in-place updated arguments. I just wish I could simply call
`_ = call_packed(...)` inside a dataflow block and have transform passes not
remove this dummy binding.
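
To make the issue concrete, here is a toy model in plain Python. It is not TVM code and does not use any TVM API: the binding triples, `eliminate_dead_bindings`, and the two `update_cache_*` functions are hypothetical names made up for illustration. It only sketches, under those assumptions, why a dead-binding cleanup drops a side-effecting `_ = ...` call and why returning the updated argument keeps the call alive.

```python
# Toy model only (assumption: a dataflow block is a list of
# (var, func, arg_names) bindings and every call is treated as pure).

def eliminate_dead_bindings(bindings, outputs):
    """Keep only bindings whose result is transitively needed by `outputs`."""
    live = set(outputs)
    kept = []
    for var, func, args in reversed(bindings):
        if var in live:
            kept.append((var, func, args))
            live.update(args)
    return list(reversed(kept))

def run(bindings, env):
    """Evaluate the bindings in order, storing each result under its var."""
    for var, func, args in bindings:
        env[var] = func(*(env[a] for a in args))
    return env

def update_cache_inplace(cache, new_kv):
    cache.extend(new_kv)  # side effect only, nothing useful returned
    return None

def update_cache_returning(cache, new_kv):
    cache.extend(new_kv)  # same side effect...
    return cache          # ...but the updated argument is also returned

def attend(cache):
    return sum(cache)

# Variant 1: dummy binding. Nothing uses `_`, so the cleanup removes the call.
dummy = [("_", update_cache_inplace, ("cache", "new_kv")),
         ("out", attend, ("cache",))]
dummy_kept = eliminate_dead_bindings(dummy, ["out"])
stale = run(dummy_kept, {"cache": [1, 2], "new_kv": [3]})["out"]

# Variant 2: the workaround. `out` depends on the returned cache, so the
# update survives the cleanup.
workaround = [("cache1", update_cache_returning, ("cache", "new_kv")),
              ("out", attend, ("cache1",))]
workaround_kept = eliminate_dead_bindings(workaround, ["out"])
fresh = run(workaround_kept, {"cache": [1, 2], "new_kv": [3]})["out"]

print(len(dummy_kept), stale)       # update was dropped, result is stale
print(len(workaround_kept), fresh)  # update was kept
```

In the first variant the cache update silently disappears, which is the behavior I would like to be able to opt out of for an impure call.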


