apeskov commented on PR #11513:
URL: https://github.com/apache/tvm/pull/11513#issuecomment-1150836941

   One important note about performance.
   
   In this patch you rely on the automatic layout detection mechanism inside 
dnnl_json_runtime. It works correctly, and the dense primitive will use the 
optimal layout. However, the weights will be reordered on every inference 
call. This reordering significantly hurts performance (still better than 
before, but below what is achievable).
   
   To avoid this, the weight reordering should be done once during `Init`. For 
that you need to change the dense weight pattern from `wildcard` to 
`is_constant`.
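
   To illustrate the point, here is a toy pure-Python model. It is not TVM or 
oneDNN code: `ToyDenseRuntime`, `init`, `run` and `count_reorders` are 
hypothetical names, and a transpose stands in for the real layout reorder. It 
only shows why a weight known to be constant lets the reorder happen once at 
init time instead of on every call. In the Relay pattern table the equivalent 
change would presumably be passing `is_constant()` instead of `wildcard()` as 
the weight argument of the `nn.dense` pattern.

```python
class ToyDenseRuntime:
    """Toy stand-in for a runtime that owns a single dense primitive."""

    def __init__(self, weight, weight_is_constant):
        self.weight = weight                    # shape [out][in]
        self.weight_is_constant = weight_is_constant
        self.cached_weight = None               # reordered copy, if any
        self.reorder_count = 0                  # how many reorders were run

    def _reorder(self, weight):
        # Stand-in for a layout reorder: a plain transpose to [in][out].
        self.reorder_count += 1
        return [list(col) for col in zip(*weight)]

    def init(self):
        # A constant weight is available at init time, so reorder it once.
        if self.weight_is_constant:
            self.cached_weight = self._reorder(self.weight)

    def run(self, x):
        # A non-constant weight is only known per call, so reorder every time.
        if self.cached_weight is not None:
            w_t = self.cached_weight
        else:
            w_t = self._reorder(self.weight)
        # dense: y[j] = sum_k x[k] * W[j][k], with w_t[k][j] == W[j][k]
        return [
            sum(x[k] * w_t[k][j] for k in range(len(x)))
            for j in range(len(w_t[0]))
        ]


def count_reorders(weight_is_constant, calls=3):
    """Run `calls` inferences and report how many reorders were executed."""
    rt = ToyDenseRuntime([[1, 2], [3, 4], [5, 6]], weight_is_constant)
    rt.init()
    for _ in range(calls):
        rt.run([1, 1])
    return rt.reorder_count
```

   With the weight treated as an arbitrary input the reorder count grows with 
the number of calls; with a constant weight it stays at one regardless of how 
many inferences are run.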
    

