ChaiBapchya edited a comment on issue #17487: [OpPerf] Consolidate array manipulation related operators
URL: https://github.com/apache/incubator-mxnet/pull/17487#issuecomment-582123433

All 5 categories
```
>>> from benchmark.opperf.nd_operations.array_manipulation_operators import run_rearrange_operators_benchmarks, run_shape_operators_benchmarks, run_expanding_operators_benchmarks, run_rounding_operators_benchmarks
```
Results
```
run_expanding_operators_benchmarks()
INFO:root:Begin Benchmark - broadcast_axes
INFO:root:Complete Benchmark - broadcast_axes
INFO:root:Begin Benchmark - broadcast_axis
INFO:root:Complete Benchmark - broadcast_axis
INFO:root:Begin Benchmark - broadcast_like
INFO:root:Complete Benchmark - broadcast_like
INFO:root:Begin Benchmark - broadcast_to
INFO:root:Complete Benchmark - broadcast_to
INFO:root:Begin Benchmark - expand_dims
INFO:root:Complete Benchmark - expand_dims
INFO:root:Begin Benchmark - pad
INFO:root:Complete Benchmark - pad
INFO:root:Begin Benchmark - repeat
INFO:root:Complete Benchmark - repeat
INFO:root:Begin Benchmark - tile
INFO:root:Complete Benchmark - tile
{'broadcast_axis': [{'avg_time_forward_broadcast_axis': 0.0342, 'max_storage_mem_alloc_cpu/0': 4.096, 'inputs': {'data': (1, 1024), 'axis': 0, 'size': 2}}, {'avg_time_forward_broadcast_axis': 0.0302, 'max_storage_mem_alloc_cpu/0': 0.008, 'inputs': {'data': (1, 1), 'axis': 0, 'size': 2}}, {'avg_time_forward_broadcast_axis': 0.024, 'max_storage_mem_alloc_cpu/0': 0.8, 'inputs': {'data': (1, 100), 'axis': 0, 'size': 2}}], 'broadcast_like': [{'avg_time_forward_broadcast_like': 1.5138, 'max_storage_mem_alloc_cpu/0': 4194.3042, 'inputs': {'lhs': (1024, 1024), 'rhs': (1024, 1024)}}, {'avg_time_forward_broadcast_like': 0.1705, 'max_storage_mem_alloc_cpu/0': 400.0, 'inputs': {'lhs': (10000, 10), 'rhs': (10000, 10)}}, {'avg_time_forward_broadcast_like': 0.0446, 'max_storage_mem_alloc_cpu/0': 20.0, 'inputs': {'lhs': (10000, 1), 'rhs': (10000, 1)}}], 'pad': [{'max_storage_mem_alloc_cpu/0': 0.192, 'inputs': {'data': (1, 4, 2, 4), 'mode': 'constant', 'pad_width': (0, 0, 0, 0, 1, 1, 1, 1)}}, {'max_storage_mem_alloc_cpu/0': 612.0, 'inputs': {'data': (10, 25, 10, 100), 'mode': 'constant', 'pad_width': (0, 0, 0, 0, 1, 1, 1, 1)}}], 'repeat': [{'avg_time_forward_repeat': 7.5347, 'avg_time_backward_repeat': 10.3592, 'max_storage_mem_alloc_cpu/0': 4194.3042, 'inputs': {'data': (1024, 1024), 'repeats': 2, 'axis': 0}}, {'avg_time_forward_repeat': 0.0664, 'avg_time_backward_repeat': 0.0716, 'max_storage_mem_alloc_cpu/0': 40.0, 'inputs': {'data': (10000, 1), 'repeats': 2, 'axis': 0}}, {'avg_time_forward_repeat': 6.0047, 'avg_time_backward_repeat': 8.3208, 'max_storage_mem_alloc_cpu/0': 4000.0, 'inputs': {'data': (10000, 100), 'repeats': 2, 'axis': 0}}], 'tile': [{'avg_time_backward_tile': 7.2161, 'max_storage_mem_alloc_cpu/0': 4194.3042, 'avg_time_forward_tile': 5.2652, 'inputs': {'data': (1024, 1024), 'reps': 2}}, {'avg_time_backward_tile': 0.0631, 'max_storage_mem_alloc_cpu/0': 40.0, 'avg_time_forward_tile': 0.1274, 'inputs': {'data': (10000, 1), 'reps': 2}}, {'avg_time_backward_tile': 6.7835, 'max_storage_mem_alloc_cpu/0': 4000.0, 'avg_time_forward_tile': 4.8181, 'inputs': {'data': (10000, 100), 'reps': 2}}], 'broadcast_to': [{'max_storage_mem_alloc_cpu/0': 2097.1521, 'avg_time_forward_broadcast_to': 1.4573, 'inputs': {'data': (1, 1024), 'shape': (1024, 1024)}}, {'max_storage_mem_alloc_cpu/0': 40.0, 'avg_time_forward_broadcast_to': 0.0741, 'inputs': {'data': (1, 1), 'shape': (10000, 1)}}, {'max_storage_mem_alloc_cpu/0': 2000.0, 'avg_time_forward_broadcast_to': 1.5039, 'inputs': {'data': (1, 100), 'shape': (10000, 100)}}], 'expand_dims': [{'avg_time_forward_expand_dims': 0.15, 'max_storage_mem_alloc_cpu/0': 2097.1521, 'inputs': {'data': (1024, 1024), 'axis': 0}}, {'avg_time_forward_expand_dims': 0.029, 'max_storage_mem_alloc_cpu/0': 20.0, 'inputs': {'data': (10000, 1), 'axis': 0}}, {'avg_time_forward_expand_dims': 0.0524, 'max_storage_mem_alloc_cpu/0': 2000.0, 'inputs': {'data': (10000, 100), 'axis': 0}}], 'broadcast_axes': [{'avg_time_forward_broadcast_axes': 0.0416, 'max_storage_mem_alloc_cpu/0': 4.096, 'inputs': {'data': (1, 1024), 'axis': 0, 'size': 2}}, {'avg_time_forward_broadcast_axes': 0.0341, 'max_storage_mem_alloc_cpu/0': 0.004, 'inputs': {'data': (1, 1), 'axis': 0, 'size': 2}}, {'avg_time_forward_broadcast_axes': 0.0354, 'max_storage_mem_alloc_cpu/0': 0.4, 'inputs': {'data': (1, 100), 'axis': 0, 'size': 2}}]}
```
```
run_rearrange_operators_benchmarks()
INFO:root:Begin Benchmark - SwapAxis
INFO:root:Complete Benchmark - SwapAxis
INFO:root:Begin Benchmark - depth_to_space
INFO:root:Complete Benchmark - depth_to_space
INFO:root:Begin Benchmark - flip
INFO:root:Complete Benchmark - flip
INFO:root:Begin Benchmark - reverse
INFO:root:Complete Benchmark - reverse
INFO:root:Begin Benchmark - space_to_depth
INFO:root:Complete Benchmark - space_to_depth
INFO:root:Begin Benchmark - swapaxes
INFO:root:Complete Benchmark - swapaxes
INFO:root:Begin Benchmark - transpose
INFO:root:Complete Benchmark - transpose
{'transpose': [{'max_storage_mem_alloc_cpu/0': 4194.3042, 'avg_time_forward_transpose': 0.2103, 'inputs': {'data': (1024, 1024)}}, {'max_storage_mem_alloc_cpu/0': 40.0, 'avg_time_forward_transpose': 0.0465, 'inputs': {'data': (10000, 1)}}, {'max_storage_mem_alloc_cpu/0': 4000.0, 'avg_time_forward_transpose': 0.266, 'inputs': {'data': (10000, 100)}}], 'depth_to_space': [{'max_storage_mem_alloc_cpu/0': 0.128, 'avg_time_forward_depth_to_space': 0.2052, 'inputs': {'data': (1, 4, 2, 4), 'block_size': 2}}, {'max_storage_mem_alloc_cpu/0': 1000.0, 'avg_time_forward_depth_to_space': 1.2413, 'inputs': {'data': (10, 25, 10, 100), 'block_size': 5}}], 'SwapAxis': [{'max_storage_mem_alloc_cpu/0': 4194.3042, 'avg_time_forward_SwapAxis': 3.3261, 'avg_time_backward_SwapAxis': 3.2804, 'inputs': {'data': (1024, 1024), 'dim1': 0, 'dim2': 1}}, {'max_storage_mem_alloc_cpu/0': 40.0, 'avg_time_forward_SwapAxis': 0.0658, 'avg_time_backward_SwapAxis': 0.0532, 'inputs': {'data': (10000, 1), 'dim1': 0, 'dim2': 1}}, {'max_storage_mem_alloc_cpu/0': 4000.0, 'avg_time_forward_SwapAxis': 2.7984, 'avg_time_backward_SwapAxis': 3.0327, 'inputs': {'data': (10000, 100), 'dim1': 0, 'dim2': 1}}], 'reverse': [{'max_storage_mem_alloc_cpu/0': 2097.1521, 'avg_time_forward_reverse': 0.8368, 'avg_time_backward_reverse': 0.774, 'inputs': {'data': (1024, 1024), 'axis': 0}}, {'max_storage_mem_alloc_cpu/0': 20.0, 'avg_time_forward_reverse': 0.0405, 'avg_time_backward_reverse': 0.0369, 'inputs': {'data': (10000, 1), 'axis': 0}}, {'max_storage_mem_alloc_cpu/0': 2000.0, 'avg_time_forward_reverse': 0.9559, 'avg_time_backward_reverse': 0.9159, 'inputs': {'data': (10000, 100), 'axis': 0}}], 'flip': [{'max_storage_mem_alloc_cpu/0': 2097.1521, 'avg_time_forward_flip': 0.7658, 'inputs': {'data': (1024, 1024), 'axis': 0}}, {'max_storage_mem_alloc_cpu/0': 20.0, 'avg_time_forward_flip': 0.0343, 'inputs': {'data': (10000, 1), 'axis': 0}}, {'max_storage_mem_alloc_cpu/0': 2000.0, 'avg_time_forward_flip': 0.7287, 'inputs': {'data': (10000, 100), 'axis': 0}}], 'space_to_depth': [{'max_storage_mem_alloc_cpu/0': 0.064, 'avg_time_forward_space_to_depth': 0.0444, 'inputs': {'data': (1, 4, 2, 4), 'block_size': 2}}, {'max_storage_mem_alloc_cpu/0': 500.0, 'avg_time_forward_space_to_depth': 1.11, 'inputs': {'data': (10, 25, 10, 100), 'block_size': 5}}], 'swapaxes': [{'max_storage_mem_alloc_cpu/0': 2097.1521, 'avg_time_forward_swapaxes': 2.4288, 'inputs': {'data': (1024, 1024), 'dim1': 0, 'dim2': 1}}, {'max_storage_mem_alloc_cpu/0': 20.0, 'avg_time_forward_swapaxes': 0.0526, 'inputs': {'data': (10000, 1), 'dim1': 0, 'dim2': 1}}, {'max_storage_mem_alloc_cpu/0': 2000.0, 'avg_time_forward_swapaxes': 2.5499, 'inputs': {'data': (10000, 100), 'dim1': 0, 'dim2': 1}}]}
>>>
```
```
>>> run_shape_operators_benchmarks()
INFO:root:Begin Benchmark - diag
INFO:root:Complete Benchmark - diag
INFO:root:Begin Benchmark - reshape
INFO:root:Complete Benchmark - reshape
INFO:root:Begin Benchmark - reshape_like
INFO:root:Complete Benchmark - reshape_like
INFO:root:Begin Benchmark - shape_array
INFO:root:Complete Benchmark - shape_array
INFO:root:Begin Benchmark - size_array
INFO:root:Complete Benchmark - size_array
INFO:root:Begin Benchmark - split
INFO:root:Complete Benchmark - split
{'reshape_like': [{'max_storage_mem_alloc_cpu/0': 2097.1521, 'avg_time_forward_reshape_like': 0.4931, 'inputs': {'lhs': (1024, 1024), 'rhs': (1024, 1024)}}, {'max_storage_mem_alloc_cpu/0': 200.0, 'avg_time_forward_reshape_like': 0.2905, 'inputs': {'lhs': (10000, 10), 'rhs': (10000, 10)}}, {'max_storage_mem_alloc_cpu/0': 40.0, 'avg_time_forward_reshape_like': 0.0685, 'inputs': {'lhs': (10000, 1), 'rhs': (10000, 1)}}], 'shape_array': [{'max_storage_mem_alloc_cpu/0': 0.016, 'avg_time_forward_shape_array': 0.014, 'inputs': {'data': (1024, 1024)}}, {'max_storage_mem_alloc_cpu/0': 0.016, 'avg_time_forward_shape_array': 0.0138, 'inputs': {'data': (10000, 1)}}, {'max_storage_mem_alloc_cpu/0': 0.016, 'avg_time_forward_shape_array': 0.0133, 'inputs': {'data': (10000, 100)}}], 'size_array': [{'avg_time_forward_size_array': 0.0138, 'max_storage_mem_alloc_cpu/0': 0.008, 'inputs': {'data': (1024, 1024)}}, {'avg_time_forward_size_array': 0.014, 'max_storage_mem_alloc_cpu/0': 0.008, 'inputs': {'data': (10000, 1)}}, {'avg_time_forward_size_array': 0.0138, 'max_storage_mem_alloc_cpu/0': 0.008, 'inputs': {'data': (10000, 100)}}], 'reshape': [{'max_storage_mem_alloc_cpu/0': 2097.1521, 'avg_time_forward_reshape': 0.1507, 'inputs': {'data': (1024, 1024), 'shape': (1024, 1024)}}, {'max_storage_mem_alloc_cpu/0': 20.0, 'avg_time_forward_reshape': 0.0371, 'inputs': {'data': (10000, 1), 'shape': (10000, 1)}}, {'max_storage_mem_alloc_cpu/0': 2000.0, 'avg_time_forward_reshape': 0.1779, 'inputs': {'data': (10000, 100), 'shape': (10000, 100)}}], 'split': [{'max_storage_mem_alloc_cpu/0': 4194.3042, 'inputs': {'data': (1024, 1024), 'num_outputs': 1, 'axis': 0}}, {'max_storage_mem_alloc_cpu/0': 40.0, 'inputs': {'data': (10000, 1), 'num_outputs': 1, 'axis': 0}}, {'max_storage_mem_alloc_cpu/0': 4000.0, 'inputs': {'data': (10000, 100), 'num_outputs': 1, 'axis': 0}}], 'diag': [{'avg_time_forward_diag': 0.0346, 'max_storage_mem_alloc_cpu/0': 2.046, 'avg_time_backward_diag': 0.4403, 'inputs': {'data': (1024, 1024), 'k': 1}}, {'avg_time_forward_diag': 0.0311, 'avg_time_backward_diag': 0.0445, 'inputs': {'data': (10000, 1), 'k': 1}}, {'avg_time_forward_diag': 0.0317, 'max_storage_mem_alloc_cpu/0': 0.198, 'avg_time_backward_diag': 0.4408, 'inputs': {'data': (10000, 100), 'k': 1}}]}
```
```
>>> run_rounding_operators_benchmarks()
INFO:root:Begin Benchmark - ceil
INFO:root:Complete Benchmark - ceil
INFO:root:Begin Benchmark - fix
INFO:root:Complete Benchmark - fix
INFO:root:Begin Benchmark - floor
INFO:root:Complete Benchmark - floor
INFO:root:Begin Benchmark - rint
INFO:root:Complete Benchmark - rint
INFO:root:Begin Benchmark - round
INFO:root:Complete Benchmark - round
INFO:root:Begin Benchmark - trunc
INFO:root:Complete Benchmark - trunc
{'floor': [{'max_storage_mem_alloc_cpu/0': 2097.1521, 'avg_time_forward_floor': 0.1889, 'inputs': {'data': (1024, 1024)}}, {'max_storage_mem_alloc_cpu/0': 20.0, 'avg_time_forward_floor': 0.0483, 'inputs': {'data': (10000, 1)}}, {'max_storage_mem_alloc_cpu/0': 2000.0, 'avg_time_forward_floor': 0.1466, 'inputs': {'data': (10000, 100)}}], 'round': [{'avg_time_forward_round': 0.2401, 'max_storage_mem_alloc_cpu/0': 2097.1521, 'inputs': {'data': (1024, 1024)}}, {'avg_time_forward_round': 0.0343, 'max_storage_mem_alloc_cpu/0': 20.0, 'inputs': {'data': (10000, 1)}}, {'avg_time_forward_round': 0.2264, 'max_storage_mem_alloc_cpu/0': 2000.0, 'inputs': {'data': (10000, 100)}}], 'trunc': [{'avg_time_forward_trunc': 0.2686, 'max_storage_mem_alloc_cpu/0': 4194.3042, 'inputs': {'data': (1024, 1024)}}, {'avg_time_forward_trunc': 0.0877, 'max_storage_mem_alloc_cpu/0': 20.0, 'inputs': {'data': (10000, 1)}}, {'avg_time_forward_trunc': 0.2895, 'max_storage_mem_alloc_cpu/0': 2000.0, 'inputs': {'data': (10000, 100)}}], 'fix': [{'avg_time_forward_fix': 0.4471, 'max_storage_mem_alloc_cpu/0': 2097.1521, 'inputs': {'data': (1024, 1024)}}, {'avg_time_forward_fix': 0.0372, 'max_storage_mem_alloc_cpu/0': 20.0, 'inputs': {'data': (10000, 1)}}, {'avg_time_forward_fix': 0.3923, 'max_storage_mem_alloc_cpu/0': 2000.0, 'inputs': {'data': (10000, 100)}}], 'rint': [{'avg_time_forward_rint': 0.2299, 'max_storage_mem_alloc_cpu/0': 2097.1521, 'inputs': {'data': (1024, 1024)}}, {'avg_time_forward_rint': 0.0354, 'max_storage_mem_alloc_cpu/0': 40.0, 'inputs': {'data': (10000, 1)}}, {'avg_time_forward_rint': 0.2015, 'max_storage_mem_alloc_cpu/0': 2000.0, 'inputs': {'data': (10000, 100)}}], 'ceil': [{'max_storage_mem_alloc_cpu/0': 4194.3042, 'avg_time_forward_ceil': 0.3486, 'inputs': {'data': (1024, 1024)}}, {'max_storage_mem_alloc_cpu/0': 20.0, 'avg_time_forward_ceil': 0.0395, 'inputs': {'data': (10000, 1)}}, {'max_storage_mem_alloc_cpu/0': 2000.0, 'avg_time_forward_ceil': 0.4362, 'inputs': {'data': (10000, 100)}}]}
```
```
>>> run_join_split_operators_benchmarks()
INFO:root:Begin Benchmark - concat
INFO:root:Complete Benchmark - concat
INFO:root:Begin Benchmark - split
INFO:root:Complete Benchmark - split
INFO:root:Begin Benchmark - stack
INFO:root:Complete Benchmark - stack
{'concat': [{'inputs': {'args0': '<NDArray 100x100 @cpu(0)>', 'args1': '<NDArray 100x100 @cpu(0)>', 'args2': '<NDArray 100x100 @cpu(0)>'}, 'max_storage_mem_alloc_cpu/0': 120.0}], 'split': [{'inputs': {'data': (1024, 1024), 'num_outputs': 2}, 'max_storage_mem_alloc_cpu/0': 4194.3042}, {'inputs': {'data': (10000, 1), 'num_outputs': 1}, 'max_storage_mem_alloc_cpu/0': 20.0}, {'inputs': {'data': (10000, 100), 'num_outputs': 10}, 'max_storage_mem_alloc_cpu/0': 3800.0}], 'stack': [{'inputs': {'args0': '<NDArray 100x100 @cpu(0)>', 'args1': '<NDArray 100x100 @cpu(0)>', 'args2': '<NDArray 100x100 @cpu(0)>'}, 'max_storage_mem_alloc_cpu/0': 60.0, 'avg_time_forward_stack': 0.0653}]}
```
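
Some entries in the results (for example `pad`, `split` and `concat`) carry only `max_storage_mem_alloc_cpu/0` and no `avg_time_forward_*` key, which is easy to miss in the raw dicts. A minimal sketch, assuming the result layout shown in this comment, that flattens a result dict into rows so missing timings show up as `None`. `summarize` is a hypothetical helper and not part of `benchmark.opperf`; the sample values are copied from the `run_join_split_operators_benchmarks()` output.

```python
def summarize(results):
    """Flatten {op_name: [run_dict, ...]} into one row per benchmark run.

    Hypothetical helper; assumes timing keys are named
    'avg_time_forward_<op>' and 'avg_time_backward_<op>' as in the output here.
    """
    rows = []
    for op, runs in sorted(results.items()):
        for run in runs:
            rows.append({
                "op": op,
                "inputs": run.get("inputs"),
                # .get() returns None when a run reported no timing for this pass
                "forward": run.get("avg_time_forward_" + op),
                "backward": run.get("avg_time_backward_" + op),
                "mem": run.get("max_storage_mem_alloc_cpu/0"),
            })
    return rows


# Two entries copied from the join/split results in this comment.
nd = "<NDArray 100x100 @cpu(0)>"
sample = {
    "stack": [{"inputs": {"args0": nd, "args1": nd, "args2": nd},
               "max_storage_mem_alloc_cpu/0": 60.0,
               "avg_time_forward_stack": 0.0653}],
    "concat": [{"inputs": {"args0": nd, "args1": nd, "args2": nd},
                "max_storage_mem_alloc_cpu/0": 120.0}],
}

for row in summarize(sample):
    print(row["op"], row["forward"], row["backward"], row["mem"])
```

The same call should work on any of the dicts returned by the `run_*_operators_benchmarks()` functions, as long as they keep this key naming.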
