Zha0q1 commented on pull request #18976:
URL: https://github.com/apache/incubator-mxnet/pull/18976#issuecomment-678471037


   Update:
   
   I used this script to time normal and uniform with the profiler:
   ```
   from mxnet import profiler, np, npx
   
   profiler.set_config(profile_all=True,
                       aggregate_stats=True,
                       continuous_dump=True,
                       filename='profile_output.json')
   
   profiler.set_state('run')
   
   for i in range(1000):  # use range(5) for the large-shape run
       A = np.random.normal(0, 1, (2, 2, 2, 2, 2))   # large-shape run: (2**30)
       B = np.random.uniform(0, 1, (2, 2, 2, 2, 2))  # large-shape run: (2**30)
       npx.waitall()
   
   profiler.set_state('stop')
   print(profiler.dumps())
   ```
   
   With shape = (2, 2, 2, 2, 2), 1000 runs, results as follows:
   
   master:
   ```
   operator
   =================
   Name                          Total Count        Time (ms)    Min Time (ms)    Max Time (ms)    Avg Time (ms)
   ----                          -----------        ---------    -------------    -------------    -------------
   _npi_normal                          1000         136.3660           0.0820           1.2510           0.1364
   _npi_uniform                         1000         117.8620           0.0710           0.3080           0.1179
   DeleteVariable                       2998          85.2090           0.0210           0.1160           0.0284
   ResourceParallelRandomSetSeed           1          27.9670          27.9670          27.9670          27.9670
   ```
   
   This PR:
   ``` 
   operator
   =================
   Name                          Total Count        Time (ms)    Min Time (ms)    Max Time (ms)    Avg Time (ms)
   ----                          -----------        ---------    -------------    -------------    -------------
   _npi_normal                          1000         136.0820           0.0830           1.2160           0.1361
   _npi_uniform                         1000         116.5880           0.0710           0.2300           0.1166
   DeleteVariable                       2998          87.5840           0.0210           0.0890           0.0292
   ResourceParallelRandomSetSeed           1          28.0330          28.0330          28.0330          28.0330
   ```
   
   
   
   With shape = (2**30), 5 runs, results as follows:
   
   master:
   ```
   operator
   =================
   Name                          Total Count        Time (ms)    Min Time (ms)    Max Time (ms)    Avg Time (ms)
   ----                          -----------        ---------    -------------    -------------    -------------
   _npi_normal                             5      508480.5938      101081.6875      102168.1719      101696.1172
   _npi_uniform                            5      329225.9375       65534.1406       67001.8438       65845.1875
   ResourceParallelRandomSetSeed           1          28.0110          28.0110          28.0110          28.0110
   DeleteVariable                         13           0.6330           0.0350           0.0900           0.0487
   ```
   
   This PR:
   ```
   operator
   =================
   Name                          Total Count        Time (ms)    Min Time (ms)    Max Time (ms)    Avg Time (ms)
   ----                          -----------        ---------    -------------    -------------    -------------
   _npi_normal                             5      507331.0312      100847.3359      101944.7578      101466.2109
   _npi_uniform                            5      322833.4062       64292.7656       65644.5703       64566.6797
   ResourceParallelRandomSetSeed           1          28.0410          28.0410          28.0410          28.0410
   DeleteVariable                         13           0.5990           0.0370           0.0660           0.0461
   ```
   
   
   I think the difference is insignificant. In theory, the ratio of indexing
   work to actual computation should be very small; random ops, at least,
   should not be indexing-heavy.
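
To put a number on "insignificant", here is a minimal sketch that computes the relative change in average time between master and this PR. The averages are copied from the "Avg Time (ms)" column of the tables above; the names `avg_ms` and `relative_change_pct` are made up for this illustration and are not part of MXNet.

```python
# Average times (ms) copied from the profiler tables above: (master, this PR).
avg_ms = {
    ("(2, 2, 2, 2, 2)", "_npi_normal"): (0.1364, 0.1361),
    ("(2, 2, 2, 2, 2)", "_npi_uniform"): (0.1179, 0.1166),
    ("(2**30)", "_npi_normal"): (101696.1172, 101466.2109),
    ("(2**30)", "_npi_uniform"): (65845.1875, 64566.6797),
}


def relative_change_pct(master, pr):
    """Percent change of the PR average relative to master (negative = PR is faster)."""
    return (pr - master) / master * 100.0


# Hypothetical helper output: one percent-change value per (shape, operator) pair.
changes = {key: relative_change_pct(*pair) for key, pair in avg_ms.items()}

for (shape, op), pct in changes.items():
    print(f"shape={shape:<16} {op:<13} {pct:+.2f}%")
```

On these numbers every average moves by less than 2%. Each configuration was profiled only once, so a gap that small may well be run-to-run noise rather than an effect of the change.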

