apeforest commented on issue #14496: performance degradation from 1.3.1 to 1.4.0
URL: 
https://github.com/apache/incubator-mxnet/issues/14496#issuecomment-477378018
 
 
   @sun-dev is correct. The computation in this operator is not on the elements 
of the tensor but on the shape indices of the tensor. The transpose operator 
involves addition, multiplication, and division, as seen 
[here](https://github.com/dmlc/mshadow/blob/757a91c3ca4f5ebf4879739c0871d2d5534465ac/mshadow/extension/transpose.h#L74).
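
To make the point about index arithmetic concrete, here is a minimal sketch of a naive 2-D transpose. It is not mshadow's implementation; `TransposeNaive` and its parameters are hypothetical names chosen for illustration. It only shows that each element copy costs one division, one modulo, one multiplication, and one addition on the index type, so widening `IndexT` from 32 to 64 bits changes the cost of every element even though the data stays `float`.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not mshadow's API): transpose a row-major
// rows x cols matrix, doing all index arithmetic in IndexT.
template <typename IndexT>
std::vector<float> TransposeNaive(const std::vector<float>& src,
                                  IndexT rows, IndexT cols) {
  std::vector<float> dst(src.size());
  const IndexT total = rows * cols;
  for (IndexT i = 0; i < total; ++i) {
    // Decompose the linear index of the cols x rows output into
    // (r, c): one division and one modulo per element.
    const IndexT r = i / rows;  // row in the transposed matrix
    const IndexT c = i % rows;  // column in the transposed matrix
    // Recompose the linear index into the source: one multiply, one add.
    const IndexT j = c * cols + r;
    dst[static_cast<std::size_t>(i)] = src[static_cast<std::size_t>(j)];
  }
  return dst;
}
```

Calling `TransposeNaive<std::int32_t>(src, 2, 3)` and `TransposeNaive<std::int64_t>(src, 2, 3)` gives the same output; only the width of the index arithmetic differs.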
   
   I compared the performance of different arithmetic operations on 32-bit 
and 64-bit integers on CPU. There are noticeable differences, shown below. FYI, 
you can use 
[this](https://github.com/apeforest/doraemon/blob/master/perf32vs64.cc) code to 
reproduce. 
   
   ```
   result = 49995000
   Add 32 time in clocks 24869
   Add 32 time in ms 1359
   result = 49995000
   Add 64 time in clocks 6070
   Add 64 time in ms 1971
   result = 349965000
   Add Mul 32 time in clocks 3601
   Add Mul 32 time in ms 1196
   result = 349965000
   Add Mul 64 time in clocks 9967
   Add Mul 64 time in ms 3477
   result = 7137858
   Add Div 32 time in clocks 8273
   Add Div 32 time in ms 2878
   result = 7137858
   Add Div 64 time in clocks 24016
   Add Div 64 time in ms 8499
   ```
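
For readers who cannot open the linked file, the following is a rough sketch of what such a comparison could look like. It is not the code from `perf32vs64.cc`. The function names, the loop bound of 10000, and the constant 7 are assumptions, chosen only because they reproduce the `result` values printed above (49995000, 349965000, 7137858). Timings depend heavily on compiler flags: an optimizing compiler may replace division by a constant with cheaper instructions or fold the loops entirely.

```cpp
#include <chrono>
#include <cstdint>

// Sum of i for i in [0, n): addition only.
template <typename IntT>
IntT SumAdd(IntT n) {
  IntT sum = 0;
  for (IntT i = 0; i < n; ++i) sum += i;
  return sum;
}

// Sum of i * k for i in [0, n): one multiply and one add per step.
template <typename IntT>
IntT SumAddMul(IntT n, IntT k) {
  IntT sum = 0;
  for (IntT i = 0; i < n; ++i) sum += i * k;
  return sum;
}

// Sum of i / k for i in [0, n): one divide and one add per step.
template <typename IntT>
IntT SumAddDiv(IntT n, IntT k) {
  IntT sum = 0;
  for (IntT i = 0; i < n; ++i) sum += i / k;
  return sum;
}

// Runs fn() `repeats` times and returns the elapsed wall time in ms.
// The volatile sink keeps the calls from being discarded as unused.
template <typename Fn>
double TimeMs(Fn fn, int repeats) {
  volatile std::int64_t sink = 0;
  const auto start = std::chrono::steady_clock::now();
  for (int r = 0; r < repeats; ++r) sink = sink + fn();
  const auto stop = std::chrono::steady_clock::now();
  return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

A comparison would then time, for example, `TimeMs([] { return SumAddDiv<std::int32_t>(10000, 7); }, 100000)` against the same call with `std::int64_t`.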
