Zha0q1 edited a comment on pull request #18976:
URL: https://github.com/apache/incubator-mxnet/pull/18976#issuecomment-678471037
Update:
I used this script to time normal and uniform with the profiler:
```
import mxnet
from mxnet import profiler, np, npx, nd

profiler.set_config(profile_all=True,
                    aggregate_stats=True,
                    continuous_dump=True,
                    filename='profile_output.json')
profiler.set_state('run')

for i in range(1000):  # use 5 for the large-shape run
    A = np.random.normal(0, 1, (2, 2, 2, 2, 2))   # or (2**30)
    B = np.random.uniform(0, 1, (2, 2, 2, 2, 2))  # or (2**30)
npx.waitall()  # block until all asynchronous ops have finished

profiler.set_state('stop')
print(profiler.dumps())
```
With shape = (2, 2, 2, 2, 2), 1000 runs, results as follows:
master:
```
operator
=================
Name                           Total Count     Time (ms)  Min Time (ms)  Max Time (ms)  Avg Time (ms)
----                           -----------     ---------  -------------  -------------  -------------
_npi_normal                           1000      136.3660         0.0820         1.2510         0.1364
_npi_uniform                          1000      117.8620         0.0710         0.3080         0.1179
DeleteVariable                        2998       85.2090         0.0210         0.1160         0.0284
ResourceParallelRandomSetSeed            1       27.9670        27.9670        27.9670        27.9670
```
This PR:
```
operator
=================
Name                           Total Count     Time (ms)  Min Time (ms)  Max Time (ms)  Avg Time (ms)
----                           -----------     ---------  -------------  -------------  -------------
_npi_normal                           1000      136.0820         0.0830         1.2160         0.1361
_npi_uniform                          1000      116.5880         0.0710         0.2300         0.1166
DeleteVariable                        2998       87.5840         0.0210         0.0890         0.0292
ResourceParallelRandomSetSeed            1       28.0330        28.0330        28.0330        28.0330
```
With shape = (2**30), 5 runs, results as follows:
master:
```
operator
=================
Name                           Total Count     Time (ms)  Min Time (ms)  Max Time (ms)  Avg Time (ms)
----                           -----------     ---------  -------------  -------------  -------------
_npi_normal                              5   508480.5938    101081.6875    102168.1719    101696.1172
_npi_uniform                             5   329225.9375     65534.1406     67001.8438     65845.1875
ResourceParallelRandomSetSeed            1       28.0110        28.0110        28.0110        28.0110
DeleteVariable                          13        0.6330         0.0350         0.0900         0.0487
```
This PR:
```
operator
=================
Name                           Total Count     Time (ms)  Min Time (ms)  Max Time (ms)  Avg Time (ms)
----                           -----------     ---------  -------------  -------------  -------------
_npi_normal                              5   507331.0312    100847.3359    101944.7578    101466.2109
_npi_uniform                             5   322833.4062     64292.7656     65644.5703     64566.6797
ResourceParallelRandomSetSeed            1       28.0410        28.0410        28.0410        28.0410
DeleteVariable                          13        0.5990         0.0370         0.0660         0.0461
```
I think the difference is insignificant: for both shapes, the average times
on master and on this PR are within about 2% of each other. Theoretically
speaking, the (indexing : actual computation) ratio should be really small.
At least random ops are indexing-heavy.
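
To make the comparison concrete, the average times from the tables above can be turned into relative differences. This is only a rough sketch: the numbers are copied by hand from the profiler output, and the helper name `relative_change` is my own, not part of MXNet.

```python
# Average times (ms) copied from the profiler tables above,
# stored as (master, this PR) pairs.
avg_ms = {
    "(2, 2, 2, 2, 2), 1000 runs": {
        "_npi_normal": (0.1364, 0.1361),
        "_npi_uniform": (0.1179, 0.1166),
    },
    "(2**30), 5 runs": {
        "_npi_normal": (101696.1172, 101466.2109),
        "_npi_uniform": (65845.1875, 64566.6797),
    },
}


def relative_change(master, pr):
    """Return the PR's change relative to master in percent (negative = faster)."""
    return (pr - master) / master * 100.0


for shape, ops in avg_ms.items():
    for op, (master, pr) in ops.items():
        print(f"{shape:<28} {op:<13} {relative_change(master, pr):+.2f}%")
```

These are single runs per configuration, so differences of this size may well be run-to-run noise rather than an effect of the change.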
@sandeep-krishnamurthy @access2rohit For your reference