ibsidorenko opened a new pull request, #14298:
URL: https://github.com/apache/tvm/pull/14298

   This is an enhancement of [PR#13327](https://github.com/apache/tvm/pull/13327).
   
   **Motivation**:
   While experimenting with MetaScheduler for the Hexagon target, we found that `avg_pool2d` 
performs rather poorly because its generated code is not vectorized. 
The `IndexDataTypeNormalizer` pass converts all indices to "int64", and 
`NarrowDataTypeRewriter` should do the opposite (convert them back to "int32"). When that 
narrowing fails, average pooling is left with a lot of int64 arithmetic that cannot be 
vectorized.
   
   **What was done:**
   Added support for binary ops ("div", "max", "min", "+", etc.) in 
`NarrowDataTypeRewriter`. When the operands of a binary operation have different 
bit widths, the pass now downcasts instead of upcasting (as it did before).
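
   The change in casting direction can be illustrated with a small standalone sketch. This is not TVM code: `result_bits`, `cast_expr`, and `rewrite_binary` are hypothetical helpers, expressions are plain strings, and the sketch assumes the narrowing has already been judged safe for the values involved.

```python
def result_bits(lhs_bits: int, rhs_bits: int, narrow: bool) -> int:
    """Pick the common bit width for a binary op on two integer operands.

    narrow=False models the old behaviour (upcast to the wider operand),
    narrow=True models the new behaviour (downcast to the narrower operand).
    """
    return min(lhs_bits, rhs_bits) if narrow else max(lhs_bits, rhs_bits)


def cast_expr(name: str, from_bits: int, to_bits: int) -> str:
    """Wrap an operand in a cast only when its width actually changes."""
    return name if from_bits == to_bits else f"cast(int{to_bits}, {name})"


def rewrite_binary(op: str, lhs: tuple, rhs: tuple, narrow: bool) -> str:
    """Rewrite op(lhs, rhs) so that both operands share one bit width.

    Each operand is a hypothetical (name, bit_width) pair.
    """
    (lhs_name, lhs_bits), (rhs_name, rhs_bits) = lhs, rhs
    bits = result_bits(lhs_bits, rhs_bits, narrow)
    return f"{op}({cast_expr(lhs_name, lhs_bits, bits)}, {cast_expr(rhs_name, rhs_bits, bits)})"


# An int32 index combined with an int64 extent.
upcast = rewrite_binary("max", ("i", 32), ("n", 64), narrow=False)
downcast = rewrite_binary("max", ("i", 32), ("n", 64), narrow=True)
print(upcast)
print(downcast)
```

   In the upcast form the whole expression stays in int64, which is the case described in the motivation; in the downcast form it stays in int32.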
   
   **Performance impact:**
   `avg_pool2d` from quantized InceptionV3 with the shape [1, 8, 35, 35, 32] 
(NCHW32c layout) tuned with MetaScheduler on Snapdragon 8gen1:
   
   | op, dtype         | Before fix (ms) | After fix (ms) | Speedup |
   |-------------------|-----------------|----------------|---------|
   | avg_pool2d, int32 | 6.67            | 4.41           | +34%    |
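
   For reference, the speedup column appears to be the relative reduction in latency; that reading is an assumption, checked here against the two measured timings from the table:

```python
before_ms = 6.67  # measured latency before the fix, from the table
after_ms = 4.41   # measured latency after the fix, from the table

# Relative reduction in latency (the assumed meaning of the speedup column).
latency_reduction = (before_ms - after_ms) / before_ms

# The same result expressed as a ratio of the old time to the new time.
speedup_factor = before_ms / after_ms

print(f"latency reduction: {latency_reduction:.0%}")
print(f"speedup factor: {speedup_factor:.2f}x")
```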
   

