wrongtest commented on PR #80:
URL: https://github.com/apache/tvm-rfcs/pull/80#issuecomment-1159769509
Many thanks~ This setting also seems to greatly benefit DMA synchronization
handling in NPU workloads. For example, there could be "input DMA" -
"computation" - "output DMA" pipelines, where each pipeline stage may have its
own IQ, so explicit synchronization instructions must be inserted correctly,
like "input DMA waits for the last (i-1 or i-2) output DMA".
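
To make the kind of synchronization I mean concrete, here is a minimal sketch in plain Python (not TVM code). It only enumerates which stage has to wait for which in such a three-stage pipeline. The function name `sync_schedule`, the stage labels, and the `num_buffers` parameter are all hypothetical, and it assumes each iteration uses one buffer slot that is reused every `num_buffers` iterations, so that `num_buffers=1` gives the "i-1" case and `num_buffers=2` gives the "i-2" case.

```python
from typing import List, Tuple

# (waiting stage, waiting iteration, awaited stage, awaited iteration)
Dep = Tuple[str, int, str, int]


def sync_schedule(num_iters: int, num_buffers: int = 2) -> List[Dep]:
    """List the waits needed for an input DMA -> compute -> output DMA pipeline.

    Assumption (illustrative only): iteration i uses buffer slot i % num_buffers,
    so the input DMA of iteration i may only start once the output DMA of
    iteration i - num_buffers has drained that slot.
    """
    deps: List[Dep] = []
    for i in range(num_iters):
        # Buffer reuse: wait for the output DMA that last used this slot.
        if i - num_buffers >= 0:
            deps.append(("input_dma", i, "output_dma", i - num_buffers))
        # Within one iteration the stages run in order.
        deps.append(("compute", i, "input_dma", i))
        deps.append(("output_dma", i, "compute", i))
    return deps


if __name__ == "__main__":
    for waiter, wi, awaited, ai in sync_schedule(4, num_buffers=2):
        print(f"{waiter}[{wi}] waits for {awaited}[{ai}]")
```

On a real NPU each of these tuples would presumably become an explicit synchronization instruction between the corresponding queues; the sketch only shows where the waits would have to go, not how they are emitted.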
Here are two questions, just out of curiosity :)
- What is the main purpose of the `async_scope` annotation? Is it an explicit
semantic representation, a useful hint for lowering analysis, or something that
affects final codegen in CUDA?
If we only had data dependencies instead of explicit control-flow
dependency annotations, could we still arrive at the same proper
synchronization?
- For `async_commit_stage`/`async_wait_stage`, is it right to understand them as
the standard TIR intrinsics for staged pipeline settings, so that the only thing
vendors need to care about is how to lower / codegen them?