[GitHub] [tvm] manupa-arm commented on issue #9022: [Bug] BuiltinLower does not use alloca for storage on kDLCPU target devices

GitBox Fri, 17 Sep 2021 09:41:22 -0700


manupa-arm commented on issue #9022:
URL: https://github.com/apache/tvm/issues/9022#issuecomment-921934672



   Thanks @tqchen for explaining the two viewpoints from which we could see the 
constraints.
   
   I do agree that we should put more constraints progressively to guide the 
compiler in the lowering. 
   
   It is just that treating TVMBAW (TVMBackendAllocWorkspace) as a peer to heap 
and stack did not seem right; in fact, one could use TVMBAW to perform heap 
allocation. My concern is that we are taking a decision far down in the 
compilation flow when it could have been taken transparently in the IR itself, 
a bit higher up.
   
   I think we are moving there with scopes, but it would have been nicer to 
stick to TVMBAW for now, as it is the more general lowering for 'global', and I 
would not necessarily see it as an orthogonal choice to the list you have 
mentioned. It just boils down to the fact that we want these allocations on the 
stack for mem2reg optimizations. I don't disagree with the logic of the 
argument, but I wish it were more explicit higher up in the compilation flow. 
If it were not for mem2reg optimizations, one could simply provide an Arena 
that supplies the required workspace much as the stack would; thus it seemed to 
me like a better interim solution until we specialize the scope in the lowering.
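
To make the Arena idea concrete, here is a minimal sketch of a bump allocator that hands out workspace from one fixed buffer, much as a stack would. It is a model only, not TVM code: the `Arena` class, its methods, and the 16-byte alignment are hypothetical, and it assumes workspaces are released in reverse order of allocation.

```python
class Arena:
    """Hypothetical bump allocator over one fixed buffer (not a TVM API)."""

    def __init__(self, capacity, alignment=16):
        self.buffer = bytearray(capacity)  # backing store; could live anywhere
        self.alignment = alignment         # assumed alignment, for illustration
        self.offset = 0                    # first free byte
        self._live = []                    # (start, previous offset), LIFO

    def alloc(self, nbytes):
        """Return the start offset of a new workspace, or None if it does not fit."""
        start = -(-self.offset // self.alignment) * self.alignment  # round up
        end = start + nbytes
        if end > len(self.buffer):
            return None
        self._live.append((start, self.offset))
        self.offset = end
        return start

    def free(self, start):
        """Release the most recent workspace; out-of-order frees are rejected."""
        if not self._live or self._live[-1][0] != start:
            raise ValueError("workspaces must be freed in reverse allocation order")
        _, previous_offset = self._live.pop()
        self.offset = previous_offset
```

An application-provided implementation behind TVMBAW could follow the same pattern, with the buffer placed wherever the runtime chooses.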
   
   Yes, as you suggested, implementing the target-dependent query for the max 
alloca size is not particularly challenging; what we were worried about is the 
API that we provide to the user. This is especially important because "micro" 
is not really a target in TVM, so the short-term solution seems to be that we 
would need one of the following TargetKind attributes for the C and LLVM 
backends:
   
   A1 :  --max-stack-alloca-size
   A2 :  --max-stack-size
   A3 :  --use-backend-alloc-workspace
   
   So ideally, from a UX point of view, it would be better to give the user 
the A2 argument rather than A1: constraining the size of a single allocation 
seems like a proxy for controlling the behaviour of mem2reg transformations.
   
   A3, on the other hand, would set kMaxStackAllocaSize to zero, forcing all 
allocations to be serviced by TVMBAW, which could come from the heap or from an 
Arena placed anywhere, as controlled by the runtime/application.
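
The sketch below shows one way the three options could steer the lowering decision. It is an illustration under stated assumptions, not the BuiltinLower implementation: every function name is hypothetical, the 1024-byte limit in the usage note is only an example value, and reading A2 as a cumulative budget is one possible interpretation.

```python
from typing import List, Optional

STACK = "stack"                        # lowered to an alloca
TVMBAW = "TVMBackendAllocWorkspace"    # serviced by the runtime/application


def effective_alloca_limit(max_stack_alloca_size: int,
                           use_backend_alloc_workspace: bool) -> int:
    """A3 is modelled as forcing the per-allocation limit (A1) to zero."""
    return 0 if use_backend_alloc_workspace else max_stack_alloca_size


def lower_allocate(nbytes: Optional[int], alloca_limit: int) -> str:
    """A1: a constant-size allocation within the limit stays on the stack.

    nbytes is None when the size is not a compile-time constant.
    """
    if nbytes is not None and 0 < nbytes <= alloca_limit:
        return STACK
    return TVMBAW


def plan_with_stack_budget(sizes: List[Optional[int]],
                           max_stack_size: int) -> List[str]:
    """A2 (assumed meaning): place allocations on the stack until a total
    budget is used up. Liveness is ignored, so this is conservative."""
    remaining = max_stack_size
    plan = []
    for nbytes in sizes:
        if nbytes is not None and 0 < nbytes <= remaining:
            plan.append(STACK)
            remaining -= nbytes
        else:
            plan.append(TVMBAW)
    return plan
```

For example, `lower_allocate(512, effective_alloca_limit(1024, False))` selects the stack, while the same call with the A3 flag set to `True` falls back to TVMBAW.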


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

