csullivan opened a new pull request #7518:
URL: https://github.com/apache/tvm/pull/7518


   This is an in-progress removal of the use of the compile engine. The 
motivation is to bring TIR compilation into the main flow of the compiler 
rather than producing and compiling it via a callback into the compile engine. 
By replacing Relay primitive function calls with TIR prim function calls that 
contain the lowered TIR, we enable:
   - An intermediate stage in the lowering process where Relay and TIR coexist. 
   - The ability to add passes at this intermediate stage. 
       - For example, memory planning, which can infer user-provided 
information from TE and the resulting TIR. 
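
   As a rough illustration of why the intermediate stage helps, here is a toy 
Python sketch of a pass that reads output buffer sizes straight from the 
lowered functions attached to each call. `LoweredFn`, `Call`, and 
`output_sizes` are hypothetical stand-ins invented for this example, not TVM 
classes or passes, and the shape and element-size fields are assumptions about 
what such a pass might want to read.

```python
from dataclasses import dataclass
from math import prod
from typing import Dict, List, Tuple


# Hypothetical stand-in for a lowered TIR function (NOT a TVM class).
@dataclass
class LoweredFn:
    name: str
    out_shape: Tuple[int, ...]  # shape of the single output buffer
    elem_bytes: int             # bytes per element of the output buffer


# Hypothetical stand-in for a Relay-level call that carries its lowered body.
@dataclass
class Call:
    fn: LoweredFn
    args: List[str]  # names of input tensors
    out: str         # name of the output tensor


def output_sizes(calls: List[Call]) -> Dict[str, int]:
    """Return the size in bytes of each call's output buffer.

    The sizes come from the lowered function attached to the call, which is
    only possible once high-level calls and lowered functions live in the
    same program.
    """
    return {c.out: prod(c.fn.out_shape) * c.fn.elem_bytes for c in calls}
```

   A memory planner written against this kind of mixed program could consult 
the lowered function directly instead of calling back into a separate 
compilation step.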
   
   We are starting with a proof of concept by refactoring the 
GraphRuntimeCodegen to use a newly introduced TE/TIR compiler instead of the 
compile engine directly. In the new flow: 
   - The TE/TIR compiler lowers TE in the LowerTensorExpr pass 
   - It replaces relay.Function(attr:primitive) with a PrimFnCall that 
contains the lowered TIR
   - GraphPlanMemory runs
   - Finally, GraphRuntimeCodegen::VisitExpr runs to lower the result to graph 
JSON
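
   The steps above can be sketched with a toy Python model. Everything here is 
a hypothetical stand-in written for illustration: the classes, the function 
names `lower_tensor_expr`, `plan_memory`, and `codegen_graph_json`, the 
placeholder TIR string, and the JSON layout are assumptions, not the TVM API 
or the real graph JSON format. The memory plan is deliberately naive (one 
storage id per call, no reuse).

```python
import json
from dataclasses import dataclass
from typing import List, Optional, Union


@dataclass
class Var:
    name: str


@dataclass
class PrimitiveCall:
    """Toy call to a Relay function marked primitive (not yet lowered)."""
    op_name: str
    args: List["Expr"]


@dataclass
class PrimFnCall:
    """Toy call that carries the lowered TIR of the function it replaces."""
    prim_fn_name: str
    lowered_tir: str
    args: List["Expr"]
    storage_id: Optional[int] = None  # filled in by plan_memory


Expr = Union[Var, PrimitiveCall, PrimFnCall]


def lower_tensor_expr(expr: Expr) -> Expr:
    """Steps 1 and 2: rewrite every PrimitiveCall into a PrimFnCall."""
    if isinstance(expr, Var):
        return expr
    new_args = [lower_tensor_expr(a) for a in expr.args]
    if isinstance(expr, PrimitiveCall):
        name = f"fused_{expr.op_name}"
        tir = f"primfn {name}(...)"  # placeholder text, not real TIR
        return PrimFnCall(name, tir, new_args)
    return PrimFnCall(expr.prim_fn_name, expr.lowered_tir, new_args)


def plan_memory(expr: Expr, next_id: int = 0) -> int:
    """Step 3: give each PrimFnCall output a storage id, in post-order."""
    if isinstance(expr, PrimFnCall):
        for arg in expr.args:
            next_id = plan_memory(arg, next_id)
        expr.storage_id = next_id
        next_id += 1
    return next_id


def codegen_graph_json(expr: Expr) -> str:
    """Step 4: emit a toy JSON graph; inputs refer to earlier node indices."""
    nodes = []

    def visit(e: Expr) -> int:
        if isinstance(e, Var):
            nodes.append({"kind": "input", "name": e.name})
        else:
            inputs = [visit(a) for a in e.args]
            nodes.append({"kind": "call", "name": e.prim_fn_name,
                          "inputs": inputs, "storage_id": e.storage_id})
        return len(nodes) - 1

    visit(expr)
    return json.dumps({"nodes": nodes})
```

   Because the lowered function travels with the call, `plan_memory` runs on a 
program that already contains the lowered bodies, which is the ordering the 
new flow describes.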
   
   We plan to post an RFC on the discuss forum soon, but we welcome 
discussion, comments, and concerns here as we push on this refactor.



