apeskov commented on PR #11228:
URL: https://github.com/apache/tvm/pull/11228#issuecomment-1124959258

   @manupa-arm 
   
   > Do you have a good reason why we need to compound this behaviour?
   
   In short, that's because of BYOC. @masahi answered this quite correctly in the previous discussion.
   
   I will try to explain in a bit more detail.
   
   In my particular case I have to know whether a tensor is constant or not before applying the "partition_for_xxx" pass. Imagine a device that can process the conv2d primitive only when its weights are constant. "Constant" in that case means that the weight data are available at device initialisation, so the device can apply some HW-specific transformations and copy the weights into the proper HW-specific memory. Moreover, we do not know the type of weight transformation during TVM compilation, because it depends on the particular type of HW and the device state.
   
   So we have to partition the graph taking these requirements into account. The patterns may look like this:
   ```
   from tvm.relay.dataflow_pattern import is_op, wildcard, is_constant

   pat_1 = is_op("qnn.conv2d")(wildcard(), is_constant())  # Good. Strong requirement of constant weights
   pat_2 = is_op("qnn.conv2d")(wildcard(), wildcard())     # Bad. No restrictions. Matches anywhere, with and without const
   ```
   
   Pattern 'pat_2' is not suitable for our case because it treats the second argument as a regular var regardless of whether it is constant. The weight tensor would then be passed to the BYOC function as a regular argument of Run(), not of Init(). So we would like to use 'pat_1'.
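   To make the difference concrete, here is a small sketch of the two patterns in action (using plain `nn.conv2d` instead of `qnn.conv2d` to keep it self-contained; the shapes and names are made up for illustration):
   
   ```python
   import numpy as np
   import tvm
   from tvm import relay
   from tvm.relay.dataflow_pattern import is_op, wildcard, is_constant

   data = relay.var("data", shape=(1, 8, 32, 32), dtype="float32")
   const_w = relay.const(np.zeros((16, 8, 3, 3), dtype="float32"))
   var_w = relay.var("weight", shape=(16, 8, 3, 3), dtype="float32")

   pat_1 = is_op("nn.conv2d")(wildcard(), is_constant())
   pat_2 = is_op("nn.conv2d")(wildcard(), wildcard())

   # pat_1 matches only when the weight argument is a relay.Constant
   assert pat_1.match(relay.nn.conv2d(data, const_w))
   assert not pat_1.match(relay.nn.conv2d(data, var_w))
   # pat_2 matches in both cases, so the constness information is lost
   assert pat_2.match(relay.nn.conv2d(data, var_w))
   ```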
   
   To support 'pat_1' we have to fold all constant subgraphs (like 'qnn.quantize(const_weight_fp32)') into real constants before applying the partitioner pass, otherwise the pattern will not match. Applying the legalization pass before constant folding does not help either: it decomposes 'qnn.conv2d' as well, so pattern 'pat_1' will not match anyway. In short, using legalization + constant folding before partitioning doesn't work.
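   For reference, the ordering that does not work can be written as a pass sequence (a sketch, assuming the standard Relay passes):
   
   ```python
   import tvm
   from tvm import relay

   # Legalize first decomposes qnn.conv2d into lower-level ops, so a
   # later pattern match on "qnn.conv2d" can never succeed, even though
   # FoldConstant has turned the weight subgraph into a constant.
   failing_seq = tvm.transform.Sequential([
       relay.transform.Legalize(),
       relay.transform.FoldConstant(),
   ])
   ```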
   
   The shortest way I found is to conditionally decompose qnn primitives only for constant subgraphs. That is equivalent to adding qnn primitives to the constant folding pass, and I think it's the right direction.
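   The intended behaviour can then be sketched like this (assuming the `fold_qnn` option on `FoldConstant`; scale/zero-point values are made up for illustration):
   
   ```python
   import numpy as np
   import tvm
   from tvm import relay

   # Constant fp32 weights wrapped in qnn.quantize: a "constant subgraph".
   w_fp32 = relay.const(np.random.rand(16, 8, 3, 3).astype("float32"))
   w_q = relay.qnn.op.quantize(
       w_fp32, relay.const(0.1), relay.const(0), out_dtype="int8"
   )

   mod = tvm.IRModule.from_expr(w_q)
   mod = relay.transform.InferType()(mod)
   mod = relay.transform.FoldConstant(fold_qnn=True)(mod)

   # The whole quantize(const) subgraph folds into a single constant,
   # so an is_constant() pattern can now match on it.
   assert isinstance(mod["main"].body, relay.Constant)
   ```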
   
   An alternative is to introduce one more pattern helper like `is_constant_subgraph()` and implement lazy initialisation on the BYOC side. But that looks slightly unnatural.
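   For completeness, such a helper (hypothetical, not part of TVM) could be approximated by folding the candidate sub-expression and checking whether it collapses to a single constant:
   
   ```python
   import tvm
   from tvm import relay

   def is_constant_subgraph(expr):
       """Hypothetical check: does `expr` fold down to a single constant?"""
       mod = tvm.IRModule.from_expr(expr)
       mod = relay.transform.InferType()(mod)
       mod = relay.transform.FoldConstant()(mod)
       return isinstance(mod["main"].body, relay.Constant)

   # add(const, const) folds to a constant; a free var does not.
   assert is_constant_subgraph(relay.const(1.0) + relay.const(2.0))
   assert not is_constant_subgraph(relay.var("x", shape=(1,), dtype="float32"))
   ```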


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
