apeskov opened a new pull request #9618:
URL: https://github.com/apache/tvm/pull/9618
The main value of this change is enabling the qnn.conv2d and qnn.dense
primitives for the DNNL-based JSON runtime.
Some of these changes are useful for all types of workloads, not only
int8-specific ones.
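
For readers unfamiliar with the quantized primitives, the arithmetic behind qnn.dense can be sketched in plain Python. This is only an illustrative reference under stated assumptions, not the DNNL or TVM implementation: the function name `qnn_dense_reference` is hypothetical, scalar (per-tensor) zero points are assumed, and the weight is assumed to be laid out as one row per output unit.

```python
# Illustrative reference arithmetic only; not the code from this PR.
def qnn_dense_reference(data, weight, input_zero_point, kernel_zero_point):
    """Quantized dense with int32-style accumulation.

    data:   [batch][in_features] integers in the int8/uint8 range
    weight: [units][in_features] integers in the int8 range
    Zero points are assumed to be per-tensor scalars.
    """
    out = []
    for row in data:
        out_row = []
        for w_row in weight:
            acc = 0
            for x, w in zip(row, w_row):
                # Subtract the zero points before multiplying.
                acc += (x - input_zero_point) * (w - kernel_zero_point)
            out_row.append(acc)
        out.append(out_row)
    return out


result = qnn_dense_reference(
    data=[[1, 2, 3]],
    weight=[[1, 0, -1], [2, 2, 2]],
    input_zero_point=1,
    kernel_zero_point=0,
)
print(result)
```

The accumulator stays in wide integers; scaling the result back to int8 is a separate requantization step that this sketch leaves out.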
Alongside that, some of the internal infrastructure of the DNNL plugin
was refactored. The main changes unrelated to int8 are:
* Improved thread safety. The DNNL runtime can now be used in
multi-instance mode.
* Zero copy input/output handling
* Scratchpad specification
* Use the DNNL query API to determine proper layouts (an additional data
copy may occur)
* Indirect addressing of memory objects. An internal tensor registry allows
cloning temp/const tensors depending on the particular thread id.
* Relative positioning of input arguments. The index of each optional
argument is specified via attributes.
* Introduced the ability to compute constant subgraphs at the DNNL code
generator stage
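
The indirect-addressing item can be illustrated with a small sketch. It is written in Python for brevity and is not the C++ code from this PR: the class `TensorRegistry` and its methods `register` and `resolve` are hypothetical names, and buffers are modelled as plain lists. The idea shown is that operations hold a tensor id rather than a buffer, and the registry hands each thread its own clone on first use.

```python
import threading


class TensorRegistry:
    """Hypothetical registry: callers refer to tensors by id, not by buffer."""

    def __init__(self):
        self._templates = {}   # tensor id -> template contents (temp/const)
        self._clones = {}      # (thread id, tensor id) -> per-thread clone
        self._lock = threading.Lock()

    def register(self, contents):
        """Store a template buffer and return its id."""
        with self._lock:
            tensor_id = len(self._templates)
            self._templates[tensor_id] = list(contents)
            return tensor_id

    def resolve(self, tensor_id):
        """Return the calling thread's private copy, cloning it on first use."""
        key = (threading.get_ident(), tensor_id)
        with self._lock:
            if key not in self._clones:
                self._clones[key] = list(self._templates[tensor_id])
            return self._clones[key]


registry = TensorRegistry()
const_id = registry.register([1, 2, 3])

seen = {}
worker = threading.Thread(
    target=lambda: seen.setdefault("buf", registry.resolve(const_id))
)
worker.start()
worker.join()

main_buf = registry.resolve(const_id)
main_buf[0] = 99  # does not affect the worker thread's clone
```

Because every thread resolves the id to its own buffer, two runtime instances running concurrently do not write into each other's temporaries, which is one way to read the multi-instance claim in the list above.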