safrooze opened a new issue #11913: Unexpectedly poor copy implementation
URL: https://github.com/apache/incubator-mxnet/issues/11913
 
 
   ## Description
   Calling `copy()` on an NDArray is ~6x slower than simply adding `0.0` to it.
   
   ## Environment info (Required)
   
   ```
   ----------Python Info----------
   Version      : 3.4.5
   Compiler     : GCC 4.4.7 20120313 (Red Hat 4.4.7-1)
   Build        : ('default', 'Jul  2 2016 17:47:47')
   Arch         : ('64bit', 'ELF')
   ------------Pip Info-----------
   Version      : 18.0
   Directory    : /home/ec2-user/anaconda3/envs/mxnet_p34/lib/python3.4/site-packages/pip
   ----------MXNet Info-----------
   Version      : 1.3.0
   Directory    : /home/ec2-user/anaconda3/envs/mxnet_p34/lib/python3.4/site-packages/mxnet
   Commit Hash   : f5b95b090815e879b57dca233604dcb3f1df967a
   ----------System Info----------
   Platform     : Linux-4.9.93-41.60.amzn1.x86_64-x86_64-with-glibc2.2.5
   system       : Linux
   node         : ip-172-31-73-235
   release      : 4.9.93-41.60.amzn1.x86_64
   version      : #1 SMP Fri Apr 13 21:58:27 UTC 2018
   ----------Hardware Info----------
   machine      : x86_64
   processor    : x86_64
   Architecture:          x86_64
   CPU op-mode(s):        32-bit, 64-bit
   Byte Order:            Little Endian
   CPU(s):                8
   On-line CPU(s) list:   0-7
   Thread(s) per core:    2
   Core(s) per socket:    4
   Socket(s):             1
   NUMA node(s):          1
   Vendor ID:             GenuineIntel
   CPU family:            6
   Model:                 79
   Model name:            Intel(R) Xeon(R) CPU E5-2686 v4 @ 2.30GHz
   Stepping:              1
   CPU MHz:               2700.366
   BogoMIPS:              4600.11
   Hypervisor vendor:     Xen
   Virtualization type:   full
   L1d cache:             32K
   L1i cache:             32K
   L2 cache:              256K
   L3 cache:              46080K
   NUMA node0 CPU(s):     0-7
   ----------Network Test----------
   Setting timeout: 10
   Timing for Gluon Tutorial(cn): https://zh.gluon.ai, DNS: 2.7765 sec, LOAD: 0.6225 sec.
   Timing for MXNet: https://github.com/apache/incubator-mxnet, DNS: 0.0012 sec, LOAD: 0.4171 sec.
   Timing for Gluon Tutorial(en): http://gluon.mxnet.io, DNS: 0.1178 sec, LOAD: 0.3382 sec.
   Timing for Conda: https://repo.continuum.io/pkgs/free/, DNS: 0.0672 sec, LOAD: 0.0239 sec.
   Timing for FashionMNIST: https://apache-mxnet.s3-accelerate.dualstack.amazonaws.com/gluon/dataset/fashion-mnist/train-labels-idx1-ubyte.gz, DNS: 0.0113 sec, LOAD: 0.8263 sec.
   Timing for PYPI: https://pypi.python.org/pypi/pip, DNS: 0.0019 sec, LOAD: 0.1053 sec.
   ```
   
   I'm using Python.
   
   ## Minimum reproducible example
   ```python
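   from time import time

   from mxnet import nd

   # Source array; its contents are uninitialized, only the size matters here.
   a = nd.empty((1, 512, 120*120))

   # Time 2000 explicit copies. waitall() blocks until all queued work is done.
   start = time()
   for _ in range(2000):
       b = a.copy()
   nd.waitall()
   print('\tcopy: elapsed: {:.2f}'.format(time() - start))

   # Time 2000 "copies" made by adding a scalar zero.
   start = time()
   for _ in range(2000):
       b = a + 0.0
   nd.waitall()
   print('\tcopy via add: elapsed: {:.2f}'.format(time() - start))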
   ```
   Output is:
   ```
        copy: elapsed: 6.78
        copy via add: elapsed: 1.05
   ```
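
To put the two totals on a per-call footing, here is a small back-of-the-envelope calculation. It only uses the shape, iteration count, and timings from the report; the 4-byte element size is an assumption (MXNet's default `float32` dtype), and `per_call_stats` is a hypothetical helper name.

```python
# Figures copied from the report above.
shape = (1, 512, 120 * 120)
iterations = 2000
elapsed = {"copy": 6.78, "copy via add": 1.05}

# Assumption: the array uses the default float32 dtype (4 bytes per element).
bytes_per_element = 4

elements = 1
for dim in shape:
    elements *= dim
array_bytes = elements * bytes_per_element


def per_call_stats(total_seconds):
    """Return (milliseconds per call, GB of output produced per second)."""
    seconds_per_call = total_seconds / iterations
    return seconds_per_call * 1e3, array_bytes / seconds_per_call / 1e9


print("array size: {:.1f} MB".format(array_bytes / 1e6))
for name, seconds in elapsed.items():
    ms, gb_per_s = per_call_stats(seconds)
    print("{}: {:.2f} ms/call, {:.1f} GB/s written".format(name, ms, gb_per_s))
print("ratio: {:.2f}x".format(elapsed["copy"] / elapsed["copy via add"]))
```

Under that assumption each call produces 29,491,200 bytes, so `copy()` takes about 3.39 ms per call and the ratio between the two loops is roughly 6.5x.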
   
   ## Steps to reproduce
   Run the above script.
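
For anyone re-running the comparison, a warm-up pass and a monotonic clock make the numbers a bit less noisy. The following is only a sketch of such a harness, not part of the original script: `bench` is a hypothetical helper, and the `bytearray` workload is a stand-in so the snippet runs without MXNet installed.

```python
import time


def bench(op, sync, iterations=2000, warmup=10):
    """Return the mean seconds per call of op().

    sync() is called once after the warm-up and once after the timed loop,
    so asynchronously queued work is finished before the clock is read.
    """
    for _ in range(warmup):
        op()
    sync()  # drain warm-up work before starting the clock
    start = time.perf_counter()
    for _ in range(iterations):
        op()
    sync()  # wait for all timed work before stopping the clock
    return (time.perf_counter() - start) / iterations


# Stand-in workload (assumption): copying a 1 MiB bytearray, nothing to sync.
buf = bytearray(1 << 20)
mean_s = bench(lambda: bytes(buf), sync=lambda: None, iterations=200)
print("stand-in copy: {:.3f} ms/call".format(mean_s * 1e3))
```

With MXNet available, the two cases from the script would be `bench(lambda: a.copy(), nd.waitall)` and `bench(lambda: a + 0.0, nd.waitall)`.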
   
