apeskov commented on PR #11642:
URL: https://github.com/apache/tvm/pull/11642#issuecomment-1157461403

   Hi @yangulei,
   
   As I remember, zero-copy input/output of tensors should already work in TVM 
main. If you define relay layouts that match DNNL expectations, the external 
buffer will be used as is, without any copying or processing. That was one of 
the goals of PR https://github.com/apache/tvm/pull/11345. Could you please be 
more specific about the scenarios you would like to optimise?
   
   Regarding `post-op sum`: you are absolutely right, a non-in-place op before 
`add` breaks correctness, so a memory copy is inevitable. With post-op sum the 
input data has to be placed into the DST tensor of the DNNL primitive, and 
execution of the primitive rewrites that data. In contrast, `post-op binary 
add` reads the data from a separate input tensor. Currently `binary add` has 
limited support across primitives, which leads to a `ref:any` implementation. 
It also has slightly worse performance, because it adds one more memory access 
pass.
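
   To make that data-flow difference concrete, here is a minimal sketch in 
plain C++ (no real DNNL calls: the "conv" is just a per-element scale, and all 
names are hypothetical):

   ```cpp
   #include <cstddef>
   #include <vector>

   // Post-op sum: the primitive accumulates into DST, so DST must already
   // hold the other addend before execution, and that data is rewritten.
   void conv_with_post_op_sum(const std::vector<float>& src,
                              std::vector<float>& dst, float weight) {
       for (std::size_t i = 0; i < src.size(); ++i)
           dst[i] = src[i] * weight + dst[i];  // reads and overwrites dst in place
   }

   // Post-op binary add: the addend lives in its own input tensor. No
   // in-place requirement on DST, but one extra memory access pass.
   void conv_with_binary_add(const std::vector<float>& src,
                             const std::vector<float>& addend,
                             std::vector<float>& dst, float weight) {
       for (std::size_t i = 0; i < src.size(); ++i)
           dst[i] = src[i] * weight + addend[i];
   }
   ```

   Both produce the same values when DST is pre-loaded with the addend; they 
differ only in where the addend is read from and whether DST is rewritten.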
   
   In case of missing layouts, the DNNL BYOC runtime automatically injects the 
required reorder primitives. It will look like this:
   
   ``` 
                          bias --
                                 \
      in1 -> REORDER_1 -> tmp_1 -> CONV -> tmp_3 -> REORDER_3 -> out
                                        /
                     in2 -> REORDER_2 --
   ```
   The problem is in tensor tmp_3. There are 2 primitives which produce the 
data of tmp_3, and that breaks the concept of a data flow graph: `REORDER_2` 
must be executed strictly before the `CONV` primitive. If you take a look at 
the code related to in-place simulation in this patch ([link to 
it](https://github.com/apeskov/tvm/blob/054901196b5c562f70208b0d9394d16e305e6269/src/runtime/contrib/dnnl/dnnl_json_runtime.cc#L771-L788))
 you will see exactly what I said. Essentially, it just copies the input data 
to the DST tensor **exactly** before the convolution primitive.
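
   In plain C++ the in-place simulation amounts to something like the sketch 
below (a toy per-element "conv" stands in for the primitive; the real code in 
the link operates on DNNL memory objects, so all names here are hypothetical):

   ```cpp
   #include <cstddef>
   #include <cstring>
   #include <vector>

   // Sketch of the in-place simulation: if the `add` input (in2) did not end
   // up in the conv DST buffer, copy it there right before the primitive
   // runs, so the post-op sum accumulates onto the correct data and then
   // rewrites it.
   void run_conv_with_inplace_sum(const std::vector<float>& in1,
                                  const std::vector<float>& in2,
                                  std::vector<float>& dst, float weight) {
       // The copy happens *exactly* before the convolution primitive executes.
       std::memcpy(dst.data(), in2.data(), in2.size() * sizeof(float));
       // Toy conv stand-in with post-op sum semantics: dst = conv(in1) + dst.
       for (std::size_t i = 0; i < in1.size(); ++i)
           dst[i] = in1[i] * weight + dst[i];
   }
   ```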
   
   `Post-op sum` is very tricky and has a lot of requirements. It works only 
if proper layouts for `conv` and `add` were selected, and it requires 
validating that the input tensor memory may be rewritten (for ResNet50 this 
holds, but in the arbitrary case it has to be checked).
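
   One possible shape for that validation (a hypothetical helper, not the PR's 
actual code): rewriting the `add` input in place is only safe when the fused 
conv+sum is the sole consumer of that tensor in the graph.

   ```cpp
   #include <map>
   #include <string>
   #include <vector>

   // Hypothetical eligibility check: rewriting the `add` input in place is
   // only safe when no other primitive reads that tensor, i.e. the fused
   // conv+sum is its single consumer.
   bool can_rewrite_in_place(
       const std::map<std::string, std::vector<std::string>>& consumers,
       const std::string& tensor) {
       auto it = consumers.find(tensor);
       return it != consumers.end() && it->second.size() == 1;
   }
   ```

   The ResNet50 skip connection passes such a check; a tensor that also feeds 
another branch would not.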

