masahi opened a new pull request, #12865:
URL: https://github.com/apache/tvm/pull/12865

   The logic in 
https://github.com/apache/tvm/blob/52d6b59a39f503fe382b4d7cbac4b02f9e44aae0/python/tvm/topi/generic/conv2d.py#L480-L499
 is supposed to legalize the input dtype to be able to apply target-specific 
intrinsics that only support one of int8 or uint8. For example, the x86 VNNI 
instruction only supports (uint8, int8) activation and weight pair. 
   
   But the logic is incorrect (two cases are flipped) and leads to incorrect 
result in the following case:
   
   * The input activation is int8, and we want to use the x86 VNNI intrinsic 
which only supports uint8 activations.
   * The input activation is uint8, and we want to use the ARM `sdot` intrinsic 
which only supports int8 activations.
   
   The first case also applies to the Hexagon `vrmpy` intrinsic. I found this 
bug while testing `vrmpy` conv2d on int8 input.
   
   To test this on CI, we need to be running on a cascadelake or ARM v8.2 (with 
dot product support) instance. I cannot find a way to detect such cpu feature 
from a python script. `try / catch` doesn't work because the error is raised 
from LLVM (`LLVM ERROR: Do not know how to split the result of this operator`) 
that I don't know how to catch. So for now the test is skipped. Any suggestion 
on this? @areusch @driazati 
   
   cc @tkonolige @mbrookhart    


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to