masahi commented on PR #15878: URL: https://github.com/apache/tvm/pull/15878#issuecomment-1748329331
> it might be desirable to have a way to have packed calls inside dataflow blocks

Very strong +1 for this need. This week I integrated a KV cache update kernel and ran into the problem of how to call this op when the module being defined is already under a dataflow scope. To work around this, I ended up [returning the in-place updated arguments from the kernel](https://github.com/masahi/tvm/blob/contrib-vllm/src/runtime/contrib/vllm/cache_kernels.cu#L88) to make it look like a pure operation from the outside, and [calling it](https://github.com/masahi/mlc-llm/blob/llama-batched-vllm/mlc_llm/relax_model/llama_batched.py#L163-L174) via `call_pure_packed`. Although this gets the job done, returning in-place updated arguments feels very awkward. I wish I could simply write `_ = call_packed(...)` inside a dataflow block and have transform passes not remove this dummy binding.
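
To make the trade-off concrete, here is a toy model in plain Python. It is only a sketch and does not use TVM's API or reproduce its actual passes: `Binding`, `eliminate_dead_bindings`, and the op names `update_cache` and `attention` are all hypothetical. It assumes a pass that drops any binding whose result is never used, which is why an ignored `_ = ...` call disappears while a call whose return value feeds a later op survives.

```python
from dataclasses import dataclass


@dataclass
class Binding:
    var: str     # name bound to the call result
    func: str    # name of the callee (hypothetical op names)
    args: tuple  # names of the variables passed as arguments


def eliminate_dead_bindings(bindings, outputs):
    """Toy pass: keep only bindings whose result is transitively needed by `outputs`."""
    live = set(outputs)
    kept = []
    # Walk backwards so that uses are seen before definitions.
    for b in reversed(bindings):
        if b.var in live:
            kept.append(b)
            live.update(b.args)
    kept.reverse()
    return kept


# Desired form: the kernel updates `cache` in place and its result is ignored.
ignored_result = [
    Binding("_", "update_cache", ("cache", "new_kv")),
    Binding("out", "attention", ("q", "cache")),
]

# Workaround: the kernel returns the updated cache and later ops consume it.
returned_cache = [
    Binding("cache1", "update_cache", ("cache", "new_kv")),
    Binding("out", "attention", ("q", "cache1")),
]

kept_ignored = [b.func for b in eliminate_dead_bindings(ignored_result, ["out"])]
kept_returned = [b.func for b in eliminate_dead_bindings(returned_cache, ["out"])]

print(kept_ignored)   # ['attention']
print(kept_returned)  # ['update_cache', 'attention']
```

In this model the side-effecting call only survives when its result is threaded through the rest of the block, which mirrors the awkwardness of having the kernel return arguments it already updated in place.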
