ChaiBapchya edited a comment on issue #17487: [OpPerf] Consolidate array manipulation related operators
URL: https://github.com/apache/incubator-mxnet/pull/17487#issuecomment-582123433
 
 
   All 5 categories 
   ```
   >>> from benchmark.opperf.nd_operations.array_manipulation_operators import (
   ...     run_rearrange_operators_benchmarks, run_shape_operators_benchmarks,
   ...     run_expanding_operators_benchmarks, run_rounding_operators_benchmarks,
   ...     run_join_split_operators_benchmarks)
   ```
   Results
   ```
   >>> run_expanding_operators_benchmarks()
   INFO:root:Begin Benchmark - broadcast_axes
   INFO:root:Complete Benchmark - broadcast_axes
   INFO:root:Begin Benchmark - broadcast_axis
   INFO:root:Complete Benchmark - broadcast_axis
   INFO:root:Begin Benchmark - broadcast_like
   INFO:root:Complete Benchmark - broadcast_like
   INFO:root:Begin Benchmark - broadcast_to
   INFO:root:Complete Benchmark - broadcast_to
   INFO:root:Begin Benchmark - expand_dims
   INFO:root:Complete Benchmark - expand_dims
   INFO:root:Begin Benchmark - pad
   INFO:root:Complete Benchmark - pad
   INFO:root:Begin Benchmark - repeat
   INFO:root:Complete Benchmark - repeat
   INFO:root:Begin Benchmark - tile
   INFO:root:Complete Benchmark - tile
   {'broadcast_axis': [{'avg_time_forward_broadcast_axis': 0.0342, 
'max_storage_mem_alloc_cpu/0': 4.096, 'inputs': {'data': (1, 1024), 'axis': 0, 
'size': 2}}, {'avg_time_forward_broadcast_axis': 0.0302, 
'max_storage_mem_alloc_cpu/0': 0.008, 'inputs': {'data': (1, 1), 'axis': 0, 
'size': 2}}, {'avg_time_forward_broadcast_axis': 0.024, 
'max_storage_mem_alloc_cpu/0': 0.8, 'inputs': {'data': (1, 100), 'axis': 0, 
'size': 2}}], 'broadcast_like': [{'avg_time_forward_broadcast_like': 1.5138, 
'max_storage_mem_alloc_cpu/0': 4194.3042, 'inputs': {'lhs': (1024, 1024), 'rhs': 
(1024, 1024)}}, {'avg_time_forward_broadcast_like': 0.1705, 
'max_storage_mem_alloc_cpu/0': 400.0, 'inputs': {'lhs': (10000, 10), 'rhs': 
(10000, 10)}}, {'avg_time_forward_broadcast_like': 0.0446, 
'max_storage_mem_alloc_cpu/0': 20.0, 'inputs': {'lhs': (10000, 1), 'rhs': 
(10000, 1)}}], 'pad': [{'max_storage_mem_alloc_cpu/0': 0.192, 'inputs': 
{'data': (1, 4, 2, 4), 'mode': 'constant', 'pad_width': (0, 0, 0, 0, 1, 1, 1, 1)}}, 
{'max_storage_mem_alloc_cpu/0': 612.0, 'inputs': {'data': (10, 
25, 10, 100), 'mode': 'constant', 'pad_width': (0, 0, 0, 0, 1, 1, 1, 1)}}], 
'repeat': [{'avg_time_forward_repeat': 7.5347, 'avg_time_backward_repeat': 
10.3592, 'max_storage_mem_alloc_cpu/0': 4194.3042, 'inputs': {'data': (1024, 
1024), 'repeats': 2, 'axis': 0}}, {'avg_time_forward_repeat': 0.0664, 
'avg_time_backward_repeat': 0.0716, 'max_storage_mem_alloc_cpu/0': 40.0, 
'inputs': {'data': (10000, 1), 'repeats': 2, 'axis': 0}}, 
{'avg_time_forward_repeat': 6.0047, 'avg_time_backward_repeat': 
8.3208, 'max_storage_mem_alloc_cpu/0': 4000.0, 'inputs': {'data': (10000, 100), 
'repeats': 2, 'axis': 0}}], 'tile': [{'avg_time_backward_tile': 7.2161, 
'max_storage_mem_alloc_cpu/0': 4194.3042, 'avg_time_forward_tile': 5.2652, 
'inputs': {'data': (1024, 1024), 'reps': 2}}, {'avg_time_backward_tile': 
0.0631, 'max_storage_mem_alloc_cpu/0': 40.0, 'avg_time_forward_tile': 0.1274, 
'inputs': {'data': (10000, 1), 'reps': 2}}, 
{'avg_time_backward_tile': 6.7835, 'max_storage_mem_alloc_cpu/0': 
4000.0, 'avg_time_forward_tile': 4.8181, 'inputs': {'data': (10000, 100), 
'reps': 2}}], 'broadcast_to': [{'max_storage_mem_alloc_cpu/0': 2097.1521, 
'avg_time_forward_broadcast_to': 1.4573, 'inputs': {'data': (1, 1024), 'shape': 
(1024, 1024)}}, {'max_storage_mem_alloc_cpu/0': 40.0, 
'avg_time_forward_broadcast_to': 0.0741, 'inputs': {'data': (1, 1), 'shape': 
(10000, 1)}}, {'max_storage_mem_alloc_cpu/0': 2000.0, 
'avg_time_forward_broadcast_to': 1.5039, 'inputs': {'data': (1, 100), 'shape': 
(10000, 100)}}], 'expand_dims': [{'avg_time_forward_expand_dims': 0.15, 
'max_storage_mem_alloc_cpu/0': 2097.1521, 'inputs': {'data': (1024, 1024), 
'axis': 0}}, {'avg_time_forward_expand_dims': 0.029, 
'max_storage_mem_alloc_cpu/0': 20.0, 'inputs': {'data': (10000, 1), 'axis': 
0}}, {'avg_time_forward_expand_dims': 0.0524, 'max_storage_mem_alloc_cpu/0': 
2000.0, 'inputs': {'data': (10000, 100), 'axis': 0}}], 
'broadcast_axes': [{'avg_time_forward_broadcast_axes': 0.0416, 
'max_storage_mem_alloc_cpu/0': 4.096, 'inputs': {'data': (1, 1024), 'axis': 0, 
'size': 2}}, {'avg_time_forward_broadcast_axes': 0.0341, 
'max_storage_mem_alloc_cpu/0': 0.004, 'inputs': {'data': (1, 1), 'axis': 0, 
'size': 2}}, {'avg_time_forward_broadcast_axes': 0.0354, 
'max_storage_mem_alloc_cpu/0': 0.4, 'inputs': {'data': (1, 100), 'axis': 0, 
'size': 2}}]}
   ```
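
The result dictionaries printed by these runs are hard to scan by eye. As a rough sketch (not part of opperf), the helper below ranks runs by forward time. It assumes only the shape visible in the output above: each operator name maps to a list of runs, and the forward time is stored under `avg_time_forward_<operator>`. The name `forward_times` and the trimmed `sample` dictionary are hypothetical; the values in `sample` are copied from the expanding-operator output.

```python
def forward_times(results):
    """Return (operator, inputs, avg forward time) rows, slowest first."""
    rows = []
    for op_name, runs in results.items():
        key = "avg_time_forward_" + op_name
        for run in runs:
            # Some runs (for example 'pad' above) carry no timing key; skip them.
            if key in run:
                rows.append((op_name, run["inputs"], run[key]))
    return sorted(rows, key=lambda row: row[2], reverse=True)


# Hypothetical sample: three runs copied from the output above.
sample = {
    "repeat": [{"avg_time_forward_repeat": 7.5347,
                "avg_time_backward_repeat": 10.3592,
                "max_storage_mem_alloc_cpu/0": 4194.3042,
                "inputs": {"data": (1024, 1024), "repeats": 2, "axis": 0}}],
    "pad": [{"max_storage_mem_alloc_cpu/0": 0.192,
             "inputs": {"data": (1, 4, 2, 4), "mode": "constant",
                        "pad_width": (0, 0, 0, 0, 1, 1, 1, 1)}}],
    "expand_dims": [{"avg_time_forward_expand_dims": 0.15,
                     "max_storage_mem_alloc_cpu/0": 2097.1521,
                     "inputs": {"data": (1024, 1024), "axis": 0}}],
}

for op_name, inputs, avg_time in forward_times(sample):
    print(op_name, inputs, avg_time)
```

In a real session the dictionary returned by `run_expanding_operators_benchmarks()` could presumably be passed in place of `sample`, provided it has the same shape as the printed output.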
   ```
   >>> run_rearrange_operators_benchmarks()
   INFO:root:Begin Benchmark - SwapAxis
   INFO:root:Complete Benchmark - SwapAxis
   INFO:root:Begin Benchmark - depth_to_space
   INFO:root:Complete Benchmark - depth_to_space
   INFO:root:Begin Benchmark - flip
   INFO:root:Complete Benchmark - flip
   INFO:root:Begin Benchmark - reverse
   INFO:root:Complete Benchmark - reverse
   INFO:root:Begin Benchmark - space_to_depth
   INFO:root:Complete Benchmark - space_to_depth
   INFO:root:Begin Benchmark - swapaxes
   INFO:root:Complete Benchmark - swapaxes
   INFO:root:Begin Benchmark - transpose
   INFO:root:Complete Benchmark - transpose
   {'transpose': [{'max_storage_mem_alloc_cpu/0': 4194.3042, 
'avg_time_forward_transpose': 0.2103, 'inputs': {'data': (1024, 1024)}}, 
{'max_storage_mem_alloc_cpu/0': 40.0, 'avg_time_forward_transpose': 0.0465, 
'inputs': {'data': (10000, 1)}}, {'max_storage_mem_alloc_cpu/0': 4000.0, 
'avg_time_forward_transpose': 0.266, 'inputs': {'data': (10000, 100)}}], 
'depth_to_space': [{'max_storage_mem_alloc_cpu/0': 0.128, 
'avg_time_forward_depth_to_space': 0.2052, 'inputs': {'data': (1, 4, 2, 4), 
'block_size': 2}}, {'max_storage_mem_alloc_cpu/0': 1000.0, 
'avg_time_forward_depth_to_space': 1.2413, 'inputs': {'data': (10, 25, 10, 
100), 'block_size': 5}}], 'SwapAxis': [{'max_storage_mem_alloc_cpu/0': 
4194.3042, 'avg_time_forward_SwapAxis': 3.3261, 'avg_time_backward_SwapAxis': 
3.2804, 'inputs': {'data': (1024, 1024), 'dim1': 0, 'dim2': 1}}, 
{'max_storage_mem_alloc_cpu/0': 40.0, 'avg_time_forward_SwapAxis': 0.0658, 
'avg_time_backward_SwapAxis': 0.0532, 'inputs': {'data': (10000, 1), 'dim1': 0, 
'dim2': 1}}, {'max_storage_mem_alloc_cpu/0': 4000.0, 
'avg_time_forward_SwapAxis': 2.7984, 'avg_time_backward_SwapAxis': 3.0327, 
'inputs': {'data': (10000, 100), 'dim1': 0, 'dim2': 1}}], 'reverse': 
[{'max_storage_mem_alloc_cpu/0': 2097.1521, 'avg_time_forward_reverse': 0.8368, 
'avg_time_backward_reverse': 0.774, 'inputs': {'data': (1024, 1024), 'axis': 
0}}, {'max_storage_mem_alloc_cpu/0': 20.0, 'avg_time_forward_reverse': 0.0405, 
'avg_time_backward_reverse': 0.0369, 'inputs': {'data': (10000, 1), 'axis': 
0}}, {'max_storage_mem_alloc_cpu/0': 2000.0, 'avg_time_forward_reverse': 
0.9559, 'avg_time_backward_reverse': 0.9159, 'inputs': {'data': (10000, 100), 
'axis': 0}}], 'flip': [{'max_storage_mem_alloc_cpu/0': 2097.1521, 
'avg_time_forward_flip': 0.7658, 'inputs': {'data': (1024, 1024), 'axis': 0}}, 
{'max_storage_mem_alloc_cpu/0': 20.0, 'avg_time_forward_flip': 0.0343, 
'inputs': {'data': (10000, 1), 'axis': 0}}, {'max_storage_mem_alloc_cpu/0': 
2000.0, 'avg_time_forward_flip': 0.7287, 'inputs': {'data': (10000, 100), 
'axis': 0}}], 'space_to_depth': [{'max_storage_mem_alloc_cpu/0': 0.064, 
'avg_time_forward_space_to_depth': 0.0444, 'inputs': {'data': (1, 4, 2, 4), 
'block_size': 2}}, {'max_storage_mem_alloc_cpu/0': 500.0, 
'avg_time_forward_space_to_depth': 1.11, 'inputs': {'data': (10, 25, 10, 100), 
'block_size': 5}}], 'swapaxes': [{'max_storage_mem_alloc_cpu/0': 2097.1521, 
'avg_time_forward_swapaxes': 2.4288, 'inputs': {'data': (1024, 1024), 'dim1': 
0, 'dim2': 1}}, {'max_storage_mem_alloc_cpu/0': 20.0, 
'avg_time_forward_swapaxes': 0.0526, 'inputs': {'data': (10000, 1), 'dim1': 0, 
'dim2': 1}}, {'max_storage_mem_alloc_cpu/0': 2000.0, 
'avg_time_forward_swapaxes': 2.5499, 'inputs': {'data': (10000, 100), 'dim1': 
0, 'dim2': 1}}]}
   ```
   ```
   >>> run_shape_operators_benchmarks()
   INFO:root:Begin Benchmark - diag
   INFO:root:Complete Benchmark - diag
   INFO:root:Begin Benchmark - reshape
   INFO:root:Complete Benchmark - reshape
   INFO:root:Begin Benchmark - reshape_like
   INFO:root:Complete Benchmark - reshape_like
   INFO:root:Begin Benchmark - shape_array
   INFO:root:Complete Benchmark - shape_array
   INFO:root:Begin Benchmark - size_array
   INFO:root:Complete Benchmark - size_array
   INFO:root:Begin Benchmark - split
   INFO:root:Complete Benchmark - split
   {'reshape_like': [{'max_storage_mem_alloc_cpu/0': 2097.1521, 
'avg_time_forward_reshape_like': 0.4931, 'inputs': {'lhs': (1024, 1024), 'rhs': 
(1024, 1024)}}, {'max_storage_mem_alloc_cpu/0': 200.0, 
'avg_time_forward_reshape_like': 0.2905, 'inputs': {'lhs': (10000, 10), 'rhs': 
(10000, 10)}}, {'max_storage_mem_alloc_cpu/0': 40.0, 
'avg_time_forward_reshape_like': 0.0685, 'inputs': {'lhs': (10000, 1), 'rhs': 
(10000, 1)}}], 'shape_array': [{'max_storage_mem_alloc_cpu/0': 0.016, 
'avg_time_forward_shape_array': 0.014, 'inputs': {'data': (1024, 1024)}}, 
{'max_storage_mem_alloc_cpu/0': 0.016, 'avg_time_forward_shape_array': 0.0138, 
'inputs': {'data': (10000, 1)}}, {'max_storage_mem_alloc_cpu/0': 0.016, 
'avg_time_forward_shape_array': 0.0133, 'inputs': {'data': (10000, 100)}}], 
'size_array': [{'avg_time_forward_size_array': 0.0138, 
'max_storage_mem_alloc_cpu/0': 0.008, 'inputs': {'data': (1024, 1024)}}, 
{'avg_time_forward_size_array': 0.014, 'max_storage_mem_alloc_cpu/0': 0.008, 
'inputs': {'data': (10000, 1)}}, {'avg_time_forward_size_array': 0.0138, 
'max_storage_mem_alloc_cpu/0': 0.008, 'inputs': {'data': (10000, 100)}}], 
'reshape': [{'max_storage_mem_alloc_cpu/0': 2097.1521, 
'avg_time_forward_reshape': 0.1507, 'inputs': {'data': (1024, 1024), 'shape': 
(1024, 1024)}}, {'max_storage_mem_alloc_cpu/0': 20.0, 
'avg_time_forward_reshape': 0.0371, 'inputs': {'data': (10000, 1), 'shape': 
(10000, 1)}}, {'max_storage_mem_alloc_cpu/0': 2000.0, 
'avg_time_forward_reshape': 0.1779, 'inputs': {'data': (10000, 100), 'shape': (10000, 100)}}], 
'split': [{'max_storage_mem_alloc_cpu/0': 4194.3042, 'inputs': {'data': (1024, 
1024), 'num_outputs': 1, 'axis': 0}}, {'max_storage_mem_alloc_cpu/0': 40.0, 
'inputs': {'data': (10000, 1), 'num_outputs': 1, 'axis': 0}}, 
{'max_storage_mem_alloc_cpu/0': 4000.0, 'inputs': {'data': (10000, 100), 
'num_outputs': 1, 'axis': 0}}], 'diag': [{'avg_time_forward_diag': 0.0346, 
'max_storage_mem_alloc_cpu/0': 2.046, 'avg_time_backward_diag': 0.4403, 
'inputs': {'data': (1024, 1024), 'k': 1}}, 
{'avg_time_forward_diag': 0.0311, 'avg_time_backward_diag': 0.0445, 'inputs': 
{'data': (10000, 1), 'k': 1}}, {'avg_time_forward_diag': 0.0317, 
'max_storage_mem_alloc_cpu/0': 0.198, 'avg_time_backward_diag': 0.4408, 
'inputs': {'data': (10000, 100), 'k': 1}}]}
   ```
   ```
   >>> run_rounding_operators_benchmarks()
   INFO:root:Begin Benchmark - ceil
   INFO:root:Complete Benchmark - ceil
   INFO:root:Begin Benchmark - fix
   INFO:root:Complete Benchmark - fix
   INFO:root:Begin Benchmark - floor
   INFO:root:Complete Benchmark - floor
   INFO:root:Begin Benchmark - rint
   INFO:root:Complete Benchmark - rint
   INFO:root:Begin Benchmark - round
   INFO:root:Complete Benchmark - round
   INFO:root:Begin Benchmark - trunc
   INFO:root:Complete Benchmark - trunc
   {'floor': [{'max_storage_mem_alloc_cpu/0': 2097.1521, 
'avg_time_forward_floor': 0.1889, 'inputs': {'data': (1024, 1024)}}, 
{'max_storage_mem_alloc_cpu/0': 20.0, 'avg_time_forward_floor': 0.0483, 
'inputs': {'data': (10000, 1)}}, {'max_storage_mem_alloc_cpu/0': 2000.0, 
'avg_time_forward_floor': 0.1466, 'inputs': {'data': (10000, 100)}}], 'round': 
[{'avg_time_forward_round': 0.2401, 'max_storage_mem_alloc_cpu/0': 2097.1521, 
'inputs': {'data': (1024, 1024)}}, {'avg_time_forward_round': 0.0343, 
'max_storage_mem_alloc_cpu/0': 20.0, 'inputs': {'data': (10000, 1)}}, 
{'avg_time_forward_round': 0.2264, 'max_storage_mem_alloc_cpu/0': 2000.0, 
'inputs': {'data': (10000, 100)}}], 'trunc': [{'avg_time_forward_trunc': 
0.2686, 'max_storage_mem_alloc_cpu/0': 4194.3042, 'inputs': {'data': (1024, 
1024)}}, {'avg_time_forward_trunc': 0.0877, 'max_storage_mem_alloc_cpu/0': 
20.0, 'inputs': {'data': (10000, 1)}}, {'avg_time_forward_trunc': 0.2895, 
'max_storage_mem_alloc_cpu/0': 2000.0, 'inputs': {'data': (10000, 100)}}], 
'fix': [{'avg_time_forward_fix': 0.4471, 
'max_storage_mem_alloc_cpu/0': 2097.1521, 'inputs': {'data': (1024, 1024)}}, 
{'avg_time_forward_fix': 0.0372, 'max_storage_mem_alloc_cpu/0': 20.0, 'inputs': 
{'data': (10000, 1)}}, {'avg_time_forward_fix': 0.3923, 
'max_storage_mem_alloc_cpu/0': 2000.0, 'inputs': {'data': (10000, 100)}}], 
'rint': [{'avg_time_forward_rint': 0.2299, 'max_storage_mem_alloc_cpu/0': 
2097.1521, 'inputs': {'data': (1024, 1024)}}, 
{'avg_time_forward_rint': 0.0354, 'max_storage_mem_alloc_cpu/0': 40.0, 'inputs': 
{'data': (10000, 1)}}, {'avg_time_forward_rint': 0.2015, 
'max_storage_mem_alloc_cpu/0': 2000.0, 'inputs': {'data': (10000, 100)}}], 
'ceil': [{'max_storage_mem_alloc_cpu/0': 4194.3042, 'avg_time_forward_ceil': 
0.3486, 'inputs': {'data': (1024, 1024)}}, {'max_storage_mem_alloc_cpu/0': 
20.0, 'avg_time_forward_ceil': 0.0395, 'inputs': {'data': (10000, 1)}}, 
{'max_storage_mem_alloc_cpu/0': 2000.0, 'avg_time_forward_ceil': 0.4362, 
'inputs': {'data': (10000, 100)}}]}
   ```
   
   ```
   >>> run_join_split_operators_benchmarks()
   INFO:root:Begin Benchmark - concat
   INFO:root:Complete Benchmark - concat
   INFO:root:Begin Benchmark - split
   INFO:root:Complete Benchmark - split
   INFO:root:Begin Benchmark - stack
   INFO:root:Complete Benchmark - stack
   {'concat': [{'inputs': {'args0': '<NDArray 100x100 @cpu(0)>', 'args1': 
'<NDArray 100x100 @cpu(0)>', 'args2': '<NDArray 100x100 @cpu(0)>'}, 
'max_storage_mem_alloc_cpu/0': 120.0}], 'split': [{'inputs': {'data': (1024, 
1024), 'num_outputs': 2}, 'max_storage_mem_alloc_cpu/0': 4194.3042}, {'inputs': 
{'data': (10000, 1), 'num_outputs': 1}, 'max_storage_mem_alloc_cpu/0': 20.0}, 
{'inputs': {'data': (10000, 100), 'num_outputs': 10}, 
'max_storage_mem_alloc_cpu/0': 3800.0}], 'stack': [{'inputs': {'args0': 
'<NDArray 100x100 @cpu(0)>', 'args1': '<NDArray 100x100 @cpu(0)>', 'args2': 
'<NDArray 100x100 @cpu(0)>'}, 'max_storage_mem_alloc_cpu/0': 60.0, 
'avg_time_forward_stack': 0.0653}]}
   ```
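
Several runs in the output above (for example `pad`, `concat`, and `split`) report memory but no `avg_time_*` value. The sketch below is one way to list such runs so they can be checked. It assumes the same dictionary shape as the printed results; the helper name `runs_without_timing` and the trimmed `sample` are hypothetical, with values copied from the join/split output.

```python
def runs_without_timing(results):
    """Return (operator, inputs) for every run that has no avg_time_* key."""
    missing = []
    for op_name, runs in results.items():
        for run in runs:
            if not any(key.startswith("avg_time_") for key in run):
                missing.append((op_name, run["inputs"]))
    return missing


# Hypothetical sample: two runs copied from the join/split output above.
sample = {
    "split": [{"inputs": {"data": (10000, 1), "num_outputs": 1},
               "max_storage_mem_alloc_cpu/0": 20.0}],
    "stack": [{"inputs": {"args0": "<NDArray 100x100 @cpu(0)>",
                          "args1": "<NDArray 100x100 @cpu(0)>",
                          "args2": "<NDArray 100x100 @cpu(0)>"},
               "max_storage_mem_alloc_cpu/0": 60.0,
               "avg_time_forward_stack": 0.0653}],
}

for op_name, inputs in runs_without_timing(sample):
    print("no timing reported:", op_name, inputs)
```

Only the `split` run is listed here, since the `stack` run carries `avg_time_forward_stack`.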
