anko-intel opened a new pull request #20821:
URL: https://github.com/apache/incubator-mxnet/pull/20821


   ## Description ##
   This change fuses the FullyConnected operator with an elemwise_add operator that follows it, whenever possible. It is done for both the float and the quantized paths.
   
   The change significantly speeds up computation on quantized graphs in full quantization mode. Below are the measured results of the following benchmark:
   benchmark/python/dnnl/run.sh  benchmark/python/dnnl/fc_add.py
   run before and after this PR. Measurements were done on an AWS EC2 c6i.16xlarge instance (Xeon(R) Platinum 8375C CPU).
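   
   As a reference only, here is a minimal sketch of the pattern the fusion targets (not code from this PR); it assumes the master-branch Symbol API and that the oneDNN subgraph backend is registered under the name 'ONEDNN':
   
   ```python
   import mxnet as mx
   
   # FullyConnected followed by an element-wise add of a residual input -- the
   # pattern this change fuses into a single oneDNN primitive when possible.
   data = mx.sym.Variable('data')
   residual = mx.sym.Variable('residual')
   fc = mx.sym.FullyConnected(data=data, num_hidden=512, name='fc')
   out = mx.sym.elemwise_add(fc, residual, name='sum')
   
   # Partition the graph with the oneDNN subgraph backend; with this change the
   # FC + elemwise_add pair should appear as one fused node in the partitioned graph.
   fused = out.optimize_for('ONEDNN')
   print(fused.tojson())
   ```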
   
   
   elemwise_add, float  
   |    Shape    | Hidden | Before [ms] | After [ms] | Improvement |
   |------------:|-------:|------------:|------------:|-----------:|
   | (   1, 224) |    512 |     1.010 |     1.016 | -1% |
   | (   1, 224) |   4096 |     1.053 |     1.011 |  4% |
   | (  16,1024) |   1024 |     1.191 |     1.155 |  3% |
   | (  32,4096) |   1024 |     1.814 |     1.793 |  1% |
   | (  32,4096) |   4096 |     3.764 |     3.650 |  3% |
   | ( 512, 512) |   4096 |     3.753 |     3.054 | 19% |
   
   Quantized:
   elemwise_add, mode = smart, granularity = tensor-wise  
   |    Shape    | Hidden | Before [ms] | After [ms] | Improvement |
   |------------:|-------:|------------:|------------:|-----------:|
   | (   1, 224) |    512 |     1.091 |     1.268 | -16% |
   | (   1, 224) |   4096 |     1.263 |     1.263 |   0% |
   | (  16,1024) |   1024 |     1.280 |     1.459 | -14% |
   | (  32,4096) |   1024 |     1.508 |     1.635 |  -8% |
   | (  32,4096) |   4096 |     1.857 |     1.998 |  -8% |
   | ( 512, 512) |   4096 |     3.003 |     2.563 |  15% |
   
   
   elemwise_add, mode = smart, granularity = channel-wise  
   
   |    Shape    | Hidden | Before [ms] | After [ms] | Improvement |
   |------------:|-------:|------------:|------------:|-----------:|
   | (   1, 224) |    512 |     1.154 |     1.192 |   -3% |
   | (   1, 224) |   4096 |     1.079 |     1.239 |  -15% |
   | (  16,1024) |   1024 |     1.173 |     1.424 |  -21% |
   | (  32,4096) |   1024 |     1.449 |     1.575 |   -9% |
   | (  32,4096) |   4096 |     1.806 |     1.921 |   -6% |
   | ( 512, 512) |   4096 |     2.969 |     2.507 |   16% |
   
   
   elemwise_add, mode = full, granularity = tensor-wise  
   
   |    Shape    | Hidden | Before* [ms] | After [ms] | Improvement |
   |------------:|-------:|------------:|------------:|-----------:|
   | (   1, 224) |    512 |     1.185 |     0.969 |  18% |
   | (   1, 224) |   4096 |     1.183 |     0.985 |  17% |
   | (  16,1024) |   1024 |     1.214 |     1.043 |  14% |
   | (  32,4096) |   1024 |     1.546 |     1.242 |  20% |
   | (  32,4096) |   4096 |     1.825 |     1.611 |  12% |
   | ( 512, 512) |   4096 |     2.311 |     1.745 |  24% |
   
   
   elemwise_add, mode = full, granularity = channel-wise  
   
   |    Shape    | Hidden | Before* [ms] | After [ms] | Improvement |
   |------------:|-------:|------------:|------------:|-----------:|
   | (   1, 224) |    512 |     1.090 |     0.915 |  16% |
   | (   1, 224) |   4096 |     1.082 |     0.878 |  19% |
   | (  16,1024) |   1024 |     1.150 |     0.934 |  19% |
   | (  32,4096) |   1024 |     1.490 |     1.149 |  23% |
   | (  32,4096) |   4096 |     1.784 |     1.553 |  13% |
   | ( 512, 512) |   4096 |     2.265 |     1.729 |  24% |
   
   \* - before this PR, fusing FC with add in full quantization mode was broken, so the "Before" results are taken from the first commit of this PR, which fixes the issue.
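   
   The snippet below is a hedged sketch (not from this PR) of how the quantized path above can be exercised; it assumes mx.contrib.quantization.quantize_net accepts the quantize_mode and quantize_granularity keyword arguments used by benchmark/python/dnnl/fc_add.py, and the layer sizes and calibration data are illustrative only:
   
   ```python
   import mxnet as mx
   from mxnet.gluon import nn
   from mxnet.contrib.quantization import quantize_net
   
   # Stand-in for the benchmarked FC layer; in a real model it would be followed
   # by an element-wise add that the fused operator absorbs.
   net = nn.Dense(512)
   net.initialize()
   net.hybridize()
   
   # Naive calibration data; the input shape is illustrative.
   calib_data = mx.gluon.data.DataLoader(
       mx.gluon.data.ArrayDataset(mx.nd.random.uniform(shape=(32, 224))),
       batch_size=16)
   
   qnet = quantize_net(net,
                       quantized_dtype='auto',
                       quantize_mode='full',              # the mode fixed by the first commit of this PR
                       quantize_granularity='channel-wise',
                       calib_mode='naive',
                       calib_data=calib_data)
   ```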
   
   ## Checklist ##
   ### Essentials ###
   - [ ] PR's title starts with a category (e.g. [BUGFIX], [MODEL], [TUTORIAL], 
[FEATURE], [DOC], etc)
   - [ ] Changes are complete (i.e. I finished coding on this PR)
   - [ ] All changes have test coverage
   - [ ] Code is well-documented
   
   

