sandeep-krishnamurthy commented on issue #14977: Add an utility for operator 
benchmarks
URL: https://github.com/apache/incubator-mxnet/pull/14977#issuecomment-493724973
 
 
   > Thanks for initiating this. Besides the questions below, I have some 
high-level questions and comments:
   > 
   > * why is it necessary to implement all the calls by hand? this approach 
seems rather inefficient. is there any way to implement this more concisely?
   Good point. I did think about it, and it was also discussed by other community members on the proposal doc. Below is my thought process.
   
   The code per operator looks something like this:
   ```python
   # Assumes `import mxnet.ndarray as nd` and that ctx, dtype, warmup and runs
   # are defined by the surrounding benchmark script.
   add_res = run_performance_test(nd.add, run_backward=True, dtype=dtype, ctx=ctx,
                                  inputs=[{"lhs": (1024, 1024),
                                           "rhs": (1024, 1024)},
                                          {"lhs": (10000, 10),
                                           "rhs": (10000, 10)},
                                          {"lhs": (10000, 1),
                                           "rhs": (10000, 100)}],
                                  warmup=warmup, runs=runs)
   ```
   The run_performance_test function provides all the necessary tooling to run a benchmark and collect results. The user is expected to specify two things: the operator and the inputs for that operator. Choosing inputs based on what needs to be tested is a crucial part, and it is deliberately made explicit. Each operator can have different criteria that need to be covered in performance tests, e.g. broadcasting shapes for arithmetic operators, small vs. large tensors, etc. A different operator therefore declares different input keys, as in the sketch below.
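   For instance, a unary operator needs only a single input tensor rather than an lhs/rhs pair. This is a minimal sketch, not one of the 4 operators covered in this PR, and the parameter name "data" for nd.relu is my assumption here:
   ```python
   # Hypothetical benchmark entry for a unary operator; "data" as the input
   # parameter name for nd.relu is an assumption, not something from this PR.
   relu_res = run_performance_test(nd.relu, run_backward=True, dtype=dtype, ctx=ctx,
                                   inputs=[{"data": (1024, 1024)},
                                           {"data": (10000, 1)}],
                                   warmup=warmup, runs=runs)
   ```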
   
   Hence, if we fully automate this, for example with a solution that automatically fetches all registered operators, fetches their required inputs, and understands the input semantics (e.g. that the lhs shape must be equal or broadcastable to the rhs shape), such a concise approach may mean less code, but it may also hide too many details, making the tool hard to use in general and hard to integrate with systems such as PR/nightly benchmark dashboards.
   
   Having said that, there is certainly room for improvement. For example, binary operators such as add, sub, and mul have similar expectations and could be expressed more concisely (see the sketch below). But I felt that might lead to over-engineering what is meant to be a simple utility for easily running benchmark tests on an operator.
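   To illustrate the kind of consolidation I mean, here is a rough sketch (not something this PR implements) where one shared input list drives several binary broadcast operators:
   ```python
   # Sketch only: these binary operators share the same lhs/rhs input
   # expectations, so a single input list could drive all of them in a loop.
   binary_broadcast_inputs = [{"lhs": (1024, 1024), "rhs": (1024, 1024)},
                              {"lhs": (10000, 1), "rhs": (10000, 100)}]

   binary_results = {}
   for op in [nd.add, nd.subtract, nd.multiply]:
       binary_results[op.__name__] = run_performance_test(
           op, run_backward=True, dtype=dtype, ctx=ctx,
           inputs=binary_broadcast_inputs, warmup=warmup, runs=runs)
   ```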
   
   > * what happens when people implement new operators? must they implement 
profiling logics here too?
   Yes. They do not need to implement the profiling logic, but they do need to add something like the snippet below, specifying the operator and the inputs to use.
   
   ```python
   add_res = run_performance_test(nd.add, run_backward=True, dtype=dtype, ctx=ctx,
                                  inputs=[{"lhs": (1024, 1024),
                                           "rhs": (1024, 1024)}],
                                  warmup=warmup, runs=runs)
   ```
   > * I see the PR is marked as complete despite the many TODOs in the code. I 
don't think the code can be checked in in this state.
   As stated in the description, the motivation of this PR is to set up the base infrastructure for a tool that makes operator benchmarks easy to run. It adds benchmarks for 4 operators and sets up everything required for community members to help cover the more than 200 operators we have in MXNet. The TODOs are there deliberately: they lay out the overall skeleton and clearly state what we want to cover next, so the roadmap is clearly set.
