Lunderberg commented on PR #15839: URL: https://github.com/apache/tvm/pull/15839#issuecomment-1740113779
I'm afraid I don't quite understand. Either `R.memory.alloc_storage` or `R.builtin.alloc_tensor` could be an independent allocation that is returned as output. (e.g. A static memory plan allocates a buffer using `R.memory.alloc_storage`, reuses that buffer several times internally, and then returns it with the final result.)

Oh, I think I see. That would mean that it's relying on the current behavior of `StaticBlockPlanMemory`, and assumes that the user-provided parameters are identical across each call. Effectively, the last kernel launch would also be serving as a memcpy from the memory used by the captured cudagraph to another memory space. Is that understanding correct?
