tqchen commented on PR #14802:
URL: https://github.com/apache/tvm/pull/14802#issuecomment-1540498665
Thanks @masahi! Some high-level notes:
- We don't need to use zeros; we can use Relax's builtin alloc_tensor instead
Some ideas on making it even more composable:
- We can have the CUTLASS BYOC generate TIR functions that contain both the
allocation and the call_extern into CUTLASS (the T.allocate here can depend on
dynamic shapes)
- We can then have a pass that lifts the workspace allocation into Relax
- Finally, Relax memory planning will tie the lifted workspaces together so
that reusable memory is shared
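
To make the three steps concrete, here is a toy Python model of the idea. It is not TVM code: `Kernel`, `Alloc`, `Call`, `lift_workspaces`, and `plan_memory` are hypothetical names used only for illustration, and the planner assumes each workspace is live only during its own call.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple, Union


@dataclass
class Kernel:
    """Toy stand-in for a generated function that allocates its own workspace."""
    name: str
    # Workspace size in bytes as a function of a dynamic dimension n.
    workspace_bytes: Callable[[int], int]


@dataclass
class Alloc:
    """Caller-level allocation produced by the lifting step."""
    buffer: str
    nbytes: int


@dataclass
class Call:
    """Kernel call that receives its workspace as an explicit argument."""
    kernel: str
    workspace: str


def lift_workspaces(kernels: List[Kernel], n: int) -> List[Union[Alloc, Call]]:
    """Hoist each kernel's internal allocation into the caller's program."""
    program: List[Union[Alloc, Call]] = []
    for i, kernel in enumerate(kernels):
        buf = f"ws{i}"
        program.append(Alloc(buf, kernel.workspace_bytes(n)))
        program.append(Call(kernel.name, buf))
    return program


def plan_memory(program: List[Union[Alloc, Call]]) -> Tuple[int, Dict[str, str]]:
    """Map all workspaces onto one storage sized to the largest request.

    Assumption: lifetimes never overlap, because each workspace is used by
    exactly one call and the calls run one after another.
    """
    allocs = [op for op in program if isinstance(op, Alloc)]
    storage_bytes = max((a.nbytes for a in allocs), default=0)
    assignment = {a.buffer: "storage0" for a in allocs}
    return storage_bytes, assignment


# Two kernels whose workspace sizes depend on the dynamic dimension n.
kernels = [Kernel("gemm", lambda n: 4 * n), Kernel("conv", lambda n: 16 * n)]
program = lift_workspaces(kernels, n=128)
storage_bytes, assignment = plan_memory(program)
```

Once the allocations are visible at the caller level, the planner can see both of them and back them with a single storage, which is the composability benefit the steps above are after.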