[GitHub] [incubator-mxnet] roywei opened a new issue #15429: Operator Performance regression on CPU

GitBox Mon, 01 Jul 2019 23:37:06 -0700

roywei opened a new issue #15429: Operator Performance regression on CPU
URL: https://github.com/apache/incubator-mxnet/issues/15429
 
 
   Follow up on dev list discussion:
   
   
https://lists.apache.org/thread.html/154ef1e4010671e7375c7a7cbedb413d5a4a3677321488440fb32a3a@%3Cdev.mxnet.apache.org%3E
   
   Using the operator benchmark module, we found performance regressions in several operators. The module is here:
   https://github.com/apache/incubator-mxnet/tree/master/benchmark/opperf
   
   @sandeep-krishnamurthy helped run the benchmark, and these are the results:
   
https://gist.github.com/sandeep-krishnamurthy/e0a2be893c8c4d484390c9c8813bdf50
   
   The results above were collected in training mode (`autograd.record()`) and include both forward and backward time.
   
   To further investigate the impact on inference, I ran the scripts in inference mode (see the notes under Scripts).
   
   Please find the results here: 
   
https://docs.google.com/spreadsheets/d/1_eezNWbrBAm3s3i6G1m0Rd3YYdTEnmKlYtn4klqdyN0/edit?usp=sharing
   
   I calculated the regression percentage for each operator and sorted the results by it. Thanks to @aaronmarkham for providing the first version.
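
A minimal sketch of one common way to compute and rank such regression percentages, assuming each result set maps an operator name to a runtime in milliseconds. The function names, operator names, and timings below are hypothetical placeholders, not values from the spreadsheet, and the spreadsheet may define the percentage differently.

```python
def regression_percent(baseline_ms, current_ms):
    """Percentage change in runtime; positive means the current build is slower."""
    return (current_ms - baseline_ms) / baseline_ms * 100.0


def rank_regressions(baseline, current):
    """Return (operator, percent) pairs, worst regression first.

    Only operators present in both result sets are compared.
    """
    common = baseline.keys() & current.keys()
    rows = [(op, regression_percent(baseline[op], current[op])) for op in common]
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Placeholder timings in milliseconds, invented for illustration only.
baseline = {"op_a": 0.10, "op_b": 0.20, "op_c": 0.50}
current = {"op_a": 0.15, "op_b": 0.18, "op_c": 0.50}

for op, pct in rank_regressions(baseline, current):
    print(f"{op}: {pct:+.1f}%")
```

Because the numbers vary between runs, it may help to apply this to the average of several runs rather than to a single run.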
   
   Although the performance numbers vary between runs, the following commonly used operators are consistently slower.
   
   We need to fix them:
   
   - [ ] BatchNorm
   - [ ] Dropout
   - [ ] relu
   - [ ] LeakyReLU
   - [ ] dot
   - [ ] element-wise ops (mul, div, sub)
   - [ ] broadcast ops (mul, sub)
   
   Some operator regressions seem to occur only on the mxnet-mkl version (see the 4th sheet of the Google spreadsheet).
   
   Environment:
   
   AWS C5.18xLarge
   Deep Learning Base AMI (Ubuntu) Version 18.1
   Python 3.6
   
   Scripts:
   https://github.com/apache/incubator-mxnet/tree/master/benchmark/opperf
   Notes: to run operators in inference mode, you need to change the value to `False` at this line:
   
https://github.com/apache/incubator-mxnet/blob/master/benchmark/opperf/utils/op_registry_utils.py#L73
   
   and change `run_backward` to `False` in all files under: 
   
https://github.com/apache/incubator-mxnet/tree/master/benchmark/opperf/nd_operations
   for example:
   
https://github.com/apache/incubator-mxnet/blob/master/benchmark/opperf/nd_operations/gemm_operators.py#L59
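
Editing every file by hand is tedious, so the change could be scripted. The sketch below assumes the flag appears in those files literally as `run_backward=True` (optionally with spaces around `=`); the helper name `disable_backward` is hypothetical and not part of opperf. Review the resulting diff before running the benchmarks.

```python
import re
from pathlib import Path

# Assumption: the flag is written literally as `run_backward=True`,
# possibly with whitespace around the equals sign.
PATTERN = re.compile(r"\brun_backward(\s*)=(\s*)True\b")


def disable_backward(directory):
    """Rewrite run_backward=True to run_backward=False in .py files under directory.

    Returns the list of files that were modified.
    """
    changed = []
    for path in sorted(Path(directory).rglob("*.py")):
        text = path.read_text(encoding="utf-8")
        # Keep the original spacing around `=` and swap only the value.
        new_text, count = PATTERN.subn(r"run_backward\1=\2False", text)
        if count:
            path.write_text(new_text, encoding="utf-8")
            changed.append(path)
    return changed
```

From the root of a local checkout, calling `disable_backward("benchmark/opperf/nd_operations")` would apply the change to that directory, and `git diff` shows exactly what was rewritten.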


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
