apeskov opened a new pull request #9618:
URL: https://github.com/apache/tvm/pull/9618


   The main value of this change is enabling the qnn.conv2d and qnn.dense 
primitives for the DNNL-based JSON runtime.
   Some of these changes are useful for all types of workloads, not only 
int8-specific ones. 
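
For readers unfamiliar with the QNN ops, the integer arithmetic behind qnn.dense can be sketched in plain Python. This is only a reference for the semantics, not the DNNL implementation from this PR; the function names `qnn_dense_reference` and `dequantize` are hypothetical, and scalar (per-tensor) zero points and scales are assumed.

```python
def qnn_dense_reference(data, weight, input_zero_point, kernel_zero_point):
    """Integer accumulation of a quantized dense layer (illustrative only).

    data:   [batch][in_features] quantized inputs
    weight: [units][in_features] quantized weights
    Zero points are assumed to be scalars (per-tensor quantization).
    """
    out = []
    for row in data:
        out_row = []
        for w_row in weight:
            acc = 0
            for x, w in zip(row, w_row):
                # Subtract zero points before multiplying, then accumulate.
                acc += (x - input_zero_point) * (w - kernel_zero_point)
            out_row.append(acc)
        out.append(out_row)
    return out


def dequantize(acc, input_scale, kernel_scale):
    """Map integer accumulators back to real values (assumed scalar scales)."""
    return [[v * input_scale * kernel_scale for v in row] for row in acc]
```

For example, `qnn_dense_reference([[130, 126]], [[1, -2], [3, 4]], 128, 0)` accumulates `(2, -2)` against each weight row.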
   
   Along with that, some refactoring of the internal infrastructure of the 
DNNL plugin was performed. The main changes unrelated to int8 are:
    * Improved thread safety. The DNNL runtime can now be used in 
multi-instance mode.
    * Zero-copy input/output handling.
    * Scratchpad specification.
    * Use of the DNNL query API to define proper layouts (an additional data 
copy is possible).
    * Indirect addressing of memory objects. The internal tensor registry 
allows cloning temp/const tensors depending on the particular thread id.
    * Relative positioning of input arguments. The index of each optional 
argument is specified via attributes.
    * Introduced the ability to compute constant subgraphs at the DNNL code 
generator stage.
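
The indirect-addressing idea from the list above can be illustrated with a minimal Python sketch. It is not the code from this PR (the actual runtime is C++); the class `TensorRegistry` and its methods `register` and `resolve` are hypothetical names, and plain lists stand in for tensor buffers. The assumption is that callers hold a tensor name rather than a buffer, and the registry hands each thread its own clone on first use.

```python
import threading


class TensorRegistry:
    """Hypothetical registry: tensors are addressed by name, not by buffer."""

    def __init__(self):
        self._lock = threading.Lock()
        self._templates = {}   # name -> original buffer
        self._per_thread = {}  # (thread id, name) -> private clone

    def register(self, name, buffer):
        # Store the original; threads never write to it directly.
        with self._lock:
            self._templates[name] = list(buffer)

    def resolve(self, name):
        # Look up (or lazily create) the clone owned by the calling thread.
        key = (threading.get_ident(), name)
        with self._lock:
            if key not in self._per_thread:
                self._per_thread[key] = list(self._templates[name])
            return self._per_thread[key]
```

Because every thread resolves the same name to a different buffer, two runtime instances executing concurrently would not overwrite each other's temporaries in this sketch.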
   


