billishyahao opened a new pull request, #11513: URL: https://github.com/apache/tvm/pull/11513
This patch enhances the performance of DNNL BYOC dense operators by 1) introducing GELU fusion and 2) altering the dense weight layout.

**Why introduce GELU fusion:** The BERT model family uses the GELU (Gaussian Error Linear Unit) activation heavily, so fusing GELU into the dense operator yields a noticeable performance boost for these models.

**Why introduce automatically packed dense and its altered weight layout:** Format `tag::ab` (a.k.a. `tag::NC`) is not the best format for the DNNL `inner_product` primitive; relying on it is a drawback of the current DNNL BYOC module.

**Which models does this fit:** Dense-intensive models such as the BERT family.

With this patch, I benchmarked the inference performance of PCPVT, a vision transformer (https://arxiv.org/abs/2104.13840), on ICX-8352Y. Here is some boost data:

| 32 cores | Latency (std. dev.) |
|--|--|
| stock BYOC | 46.37 ms (0.45 ms) |
| BYOC w/ patch | 38.68 ms (0.35 ms) |
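For reference, the GELU activation being fused can be sketched in plain Python. This is only an illustration of the math, not the patch's implementation; the exact form (erf-based vs. the tanh approximation) depends on how the frontend lowered the model:

```python
import math

def gelu(x):
    # Exact GELU: 0.5 * x * (1 + erf(x / sqrt(2)))
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))

def gelu_tanh(x):
    # Common tanh approximation, often emitted by frontends instead of erf
    return 0.5 * x * (1.0 + math.tanh(
        math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)))
```

Fusing this elementwise epilogue into the dense primitive avoids materializing the intermediate dense output, which is where the speedup for GELU-heavy models like BERT comes from.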

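The weight-layout change can also be sketched conceptually. The actual blocked format is chosen by DNNL at runtime and is hardware-dependent; the toy `pack_weight` below (block size and layout are illustrative assumptions, not the patch's code) just shows the idea of reordering a plain `[N][K]` (`tag::ab`) weight into a blocked layout so the output channels consumed together by a vectorized kernel are contiguous:

```python
def pack_weight(weight, block_n=4):
    # Reorder a dense weight from plain [N][K] (tag::ab) into a blocked
    # [N // block_n][K][block_n] layout. Assumes N % block_n == 0 for
    # simplicity; real DNNL formats also handle padding and other blockings.
    n, k = len(weight), len(weight[0])
    assert n % block_n == 0
    return [[[weight[nb * block_n + ni][ki] for ni in range(block_n)]
             for ki in range(k)]
            for nb in range(n // block_n)]
```

Doing this reorder once at compile time (by altering the weight layout) is cheaper than letting the primitive reorder from `tag::ab` on every inference call.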