apeskov commented on PR #11513:
URL: https://github.com/apache/tvm/pull/11513#issuecomment-1150836941
One important comment about performance, just to point it out.
In this patch you rely on the automatic layout detection inside
dnnl_json_runtime. It works correctly, and the dense primitive will use the
optimal layout. But it executes a weight reorder on every inference call. This
reordering significantly hurts performance (still better than before, but
worse than what is possible).
To avoid the repeated weight reordering, it should be done once, during `Init`.
For that you need to change the dense weight pattern from `wildcard` to
`is_constant`.
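
To illustrate why the constant case matters, here is a minimal pure-Python sketch of the idea. It is not the dnnl_json_runtime code: `ToyDenseRuntime`, `reorder`, and `reorder_count` are hypothetical names, and the "reorder" is just a transpose standing in for a real layout conversion. The only point is that a weight known at init time can be reordered once, while a weight that arrives as a regular input has to be reordered on every run.

```python
from dataclasses import dataclass, field
from typing import List, Optional

Matrix = List[List[float]]


def reorder(weight: Matrix) -> Matrix:
    """Stand-in for a layout reorder (hypothetical): a plain transpose."""
    return [list(col) for col in zip(*weight)]


@dataclass
class ToyDenseRuntime:
    """Toy model of a runtime with an init step and a per-inference run step."""

    # Set only when the weight is a compile-time constant.
    constant_weight: Optional[Matrix] = None
    reorder_count: int = 0
    _packed: Optional[Matrix] = field(default=None, init=False)

    def _reorder(self, weight: Matrix) -> Matrix:
        self.reorder_count += 1
        return reorder(weight)

    def init(self) -> None:
        # A constant weight can be reordered once, up front.
        if self.constant_weight is not None:
            self._packed = self._reorder(self.constant_weight)

    def run(self, data: Matrix, weight: Optional[Matrix] = None) -> Matrix:
        if self._packed is not None:
            packed = self._packed
        elif weight is not None:
            # Weight is a runtime input: reorder on every call.
            packed = self._reorder(weight)
        else:
            raise ValueError("no constant weight and no weight input given")
        # Dense: out[i][j] = sum_k data[i][k] * packed[k][j]
        return [
            [sum(row[k] * packed[k][j] for k in range(len(row)))
             for j in range(len(packed[0]))]
            for row in data
        ]


weight = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]  # 3 units, 2 input features
data = [[1.0, 1.0]]

dynamic = ToyDenseRuntime()  # weight passed as an ordinary input
dynamic.init()
for _ in range(3):
    out_dynamic = dynamic.run(data, weight)

constant = ToyDenseRuntime(constant_weight=weight)  # weight known at init
constant.init()
for _ in range(3):
    out_constant = constant.run(data)

print(dynamic.reorder_count, constant.reorder_count)
```

Both variants compute the same result; in this toy model only the number of reorders differs, which is the cost the comment above is about.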