wrongtest commented on PR #80:
URL: https://github.com/apache/tvm-rfcs/pull/80#issuecomment-1159769509

   Many thanks~ These settings also seem to greatly benefit DMA synchronization 
handling in NPU workloads. For example, there could be "input DMA" - 
"computation" - "output DMA" pipelines, where each pipeline stage may have its 
own IQ, so explicit synchronization instructions must be inserted correctly, 
like "input DMA waits for the last (i-1 or i-2) output DMA". 
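
   To make that dependency concrete, here is a minimal sketch in plain Python. 
It is not TVM or vendor API: `Instr`, `build_pipeline`, and `finish_times` are 
hypothetical names, and it assumes each stage has one in-order queue and that 
the input DMA reuses buffer slot `i % num_buffers`, so it has to wait for the 
output DMA of iteration `i - num_buffers` (i-1 with one buffer, i-2 with two).

```python
from collections import namedtuple

# Hypothetical instruction record: which queue runs it, for which iteration,
# and which (queue, iteration) it must explicitly wait on (or None).
Instr = namedtuple("Instr", ["queue", "op", "iteration", "waits_on"])


def build_pipeline(num_iters, num_buffers):
    """Emit instructions for three in-order queues with explicit waits."""
    program = []
    for i in range(num_iters):
        # The input DMA overwrites slot i % num_buffers, so it waits for the
        # output DMA that last drained that slot (iteration i - num_buffers).
        reuse = i - num_buffers
        in_wait = ("output_dma", reuse) if reuse >= 0 else None
        program.append(Instr("input_dma", "load", i, in_wait))
        # Compute consumes what the input DMA of the same iteration loaded.
        program.append(Instr("compute", "exec", i, ("input_dma", i)))
        # The output DMA stores what compute of the same iteration produced.
        program.append(Instr("output_dma", "store", i, ("compute", i)))
    return program


def finish_times(program, latency):
    """Simulate the queues: an instruction starts once its own queue is free
    and the instruction it waits on has finished."""
    done = {}
    queue_free = {}
    for ins in program:
        start = queue_free.get(ins.queue, 0)
        if ins.waits_on is not None:
            start = max(start, done[ins.waits_on])
        end = start + latency[ins.queue]
        done[(ins.queue, ins.iteration)] = end
        queue_free[ins.queue] = end
    return done


if __name__ == "__main__":
    unit = {"input_dma": 1, "compute": 1, "output_dma": 1}
    for n in (1, 2):
        times = finish_times(build_pipeline(4, n), unit)
        print(n, "buffer(s): last store finishes at", times[("output_dma", 3)])
```

   With a single buffer the stages fully serialize; with two buffers the input 
DMA of iteration i can overlap with the compute and output DMA of earlier 
iterations, which is why the wait distance matters.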
   
   Here are my two questions, just out of curiosity :) 
   - What is the main purpose of the `async_scope` annotation? Is it for explicit 
semantic representation, a useful hint for lowering analysis, or does it affect 
final codegen in CUDA? 
     If we only had data dependencies instead of explicit control-flow 
dependency annotations, could we still arrive at the same proper 
synchronization?
   
   - For `async_commit_stage`/`async_wait_stage`, may I understand that they 
are standard TIR intrinsics in stage pipeline settings, and the only thing 
vendors need to care about is how to lower / codegen them? 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.