safrooze opened a new issue #11913: Unexpectedly poor copy implementation
URL: https://github.com/apache/incubator-mxnet/issues/11913

## Description
Calling `copy()` on an NDArray is ~6x slower than simply adding `0.0` to it.

## Environment info (Required)

```
----------Python Info----------
Version      : 3.4.5
Compiler     : GCC 4.4.7 20120313 (Red Hat 4.4.7-1)
Build        : ('default', 'Jul 2 2016 17:47:47')
Arch         : ('64bit', 'ELF')
------------Pip Info-----------
Version      : 18.0
Directory    : /home/ec2-user/anaconda3/envs/mxnet_p34/lib/python3.4/site-packages/pip
----------MXNet Info-----------
Version      : 1.3.0
Directory    : /home/ec2-user/anaconda3/envs/mxnet_p34/lib/python3.4/site-packages/mxnet
Commit Hash  : f5b95b090815e879b57dca233604dcb3f1df967a
----------System Info----------
Platform     : Linux-4.9.93-41.60.amzn1.x86_64-x86_64-with-glibc2.2.5
system       : Linux
node         : ip-172-31-73-235
release      : 4.9.93-41.60.amzn1.x86_64
version      : #1 SMP Fri Apr 13 21:58:27 UTC 2018
----------Hardware Info----------
machine      : x86_64
processor    : x86_64
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                8
On-line CPU(s) list:   0-7
Thread(s) per core:    2
Core(s) per socket:    4
Socket(s):             1
NUMA node(s):          1
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 79
Model name:            Intel(R) Xeon(R) CPU E5-2686 v4 @ 2.30GHz
Stepping:              1
CPU MHz:               2700.366
BogoMIPS:              4600.11
Hypervisor vendor:     Xen
Virtualization type:   full
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              46080K
NUMA node0 CPU(s):     0-7
----------Network Test----------
Setting timeout: 10
Timing for Gluon Tutorial(cn): https://zh.gluon.ai, DNS: 2.7765 sec, LOAD: 0.6225 sec.
Timing for MXNet: https://github.com/apache/incubator-mxnet, DNS: 0.0012 sec, LOAD: 0.4171 sec.
Timing for Gluon Tutorial(en): http://gluon.mxnet.io, DNS: 0.1178 sec, LOAD: 0.3382 sec.
Timing for Conda: https://repo.continuum.io/pkgs/free/, DNS: 0.0672 sec, LOAD: 0.0239 sec.
Timing for FashionMNIST: https://apache-mxnet.s3-accelerate.dualstack.amazonaws.com/gluon/dataset/fashion-mnist/train-labels-idx1-ubyte.gz, DNS: 0.0113 sec, LOAD: 0.8263 sec.
Timing for PYPI: https://pypi.python.org/pypi/pip, DNS: 0.0019 sec, LOAD: 0.1053 sec.
```

I'm using Python.

## Minimum reproducible example
```python
from time import time

from mxnet import nd

a = nd.empty((1, 512, 120*120))

# Time 2000 explicit copies; waitall() blocks until the async engine finishes.
start = time()
for _ in range(2000):
    b = a.copy()
nd.waitall()
print('\tcopy: elapsed: {:.2f}'.format(time() - start))

# Time 2000 "copies" made by adding 0.0 instead.
start = time()
for _ in range(2000):
    b = a + 0.0
nd.waitall()
print('\tcopy via add: elapsed: {:.2f}'.format(time() - start))
```

Output is:
```
copy: elapsed: 6.78
copy via add: elapsed: 1.05
```

## Steps to reproduce
Run the above script.
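
To make the comparison above easier to repeat, here is a minimal sketch of a small timing helper. The names `time_op`, `sync`, `repeats`, and `warmup` are hypothetical and not part of MXNet; the helper only assumes that the operation is a zero-argument callable and that an optional blocking call (such as `nd.waitall`) drains any asynchronous work. It reports the average seconds per call and does not by itself explain the gap.

```python
from time import perf_counter


def time_op(op, sync=lambda: None, repeats=2000, warmup=10):
    """Return the average wall-clock seconds per call of `op`.

    `op` is a zero-argument callable. `sync` is a hypothetical hook that
    should block until queued asynchronous work has finished; it defaults
    to a no-op for purely synchronous code.
    """
    # Warm-up calls keep one-time setup costs out of the measurement.
    for _ in range(warmup):
        op()
    sync()

    start = perf_counter()
    for _ in range(repeats):
        op()
    # Wait for pending work before stopping the clock.
    sync()
    return (perf_counter() - start) / repeats


# Possible use with the arrays from the report (assumes MXNet is installed):
#   from mxnet import nd
#   a = nd.empty((1, 512, 120 * 120))
#   print('copy:         {:.6f} s/call'.format(time_op(lambda: a.copy(), sync=nd.waitall)))
#   print('copy via add: {:.6f} s/call'.format(time_op(lambda: a + 0.0, sync=nd.waitall)))
```

Passing `nd.waitall` as `sync` mirrors the original script, where the clock stops only after the engine has finished all queued operations.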
