apeforest commented on a change in pull request #12926: parallelize NDArray::Copy<cpu, cpu> when data size is large
URL: https://github.com/apache/incubator-mxnet/pull/12926#discussion_r236431370
 
 

 ##########
 File path: docs/faq/env_var.md
 ##########
 @@ -202,6 +202,12 @@ When USE_PROFILER is enabled in Makefile or CMake, the following environments ca
   If no such algorithm exists given other constraints, MXNet will error out. This variable affects the choice
   of CUDNN convolution algorithms. Please see [CUDNN developer guide](https://docs.nvidia.com/deeplearning/sdk/cudnn-developer-guide/index.html) for more details.
 
+* MXNET_CPU_PARALLEL_COPY_SIZE
+  - Values: Int ```(default=200000)```
+  - The minimum array size required to trigger a parallel copy via OpenMP in CPU-to-CPU copy mode.
+  - When the array size is larger than this threshold, NDArray::Copy(from, to) is implemented by OpenMP with the Recommended OMP Thread Count.
+  - When the array size is smaller than this threshold, NDArray::Copy(from, to) is implemented by mshadow::Copy(to, from) in a single thread.
 
 Review comment:
   This is no longer accurate, right, since it is just implemented using `memcpy`?
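   To make the dispatch being discussed concrete, here is a minimal sketch of a threshold-gated CPU-to-CPU copy. This is not MXNet's actual code: `kParallelCopyThreshold`, `ThresholdCopy`, and the fixed `num_chunks` are illustrative stand-ins for the `MXNET_CPU_PARALLEL_COPY_SIZE` value and the recommended OMP thread count. Below the threshold, a single `memcpy` runs; above it, the buffer is split into chunks and each chunk is copied by `memcpy` inside an OpenMP parallel loop (the pragma is a no-op when compiled without `-fopenmp`, so the loop simply runs serially).

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Illustrative stand-in for MXNET_CPU_PARALLEL_COPY_SIZE (default 200000);
// in MXNet this value would come from the environment variable.
constexpr std::size_t kParallelCopyThreshold = 200000;

// Threshold-dispatched copy: small buffers use one memcpy, large buffers
// are split into chunks that are memcpy-ed in an OpenMP parallel loop.
inline void ThresholdCopy(float* dst, const float* src, std::size_t n) {
  if (n < kParallelCopyThreshold) {
    std::memcpy(dst, src, n * sizeof(float));  // single-threaded path
    return;
  }
  const int num_chunks = 8;  // stand-in for the recommended OMP thread count
  const std::size_t chunk = (n + num_chunks - 1) / num_chunks;
  #pragma omp parallel for
  for (int c = 0; c < num_chunks; ++c) {
    const std::size_t begin = static_cast<std::size_t>(c) * chunk;
    if (begin >= n) continue;  // guard the final, possibly short, chunk
    const std::size_t len = (begin + chunk > n) ? (n - begin) : chunk;
    std::memcpy(dst + begin, src + begin, len * sizeof(float));
  }
}
```

Chunked `memcpy` (rather than an element-wise parallel loop) keeps the fast library copy on each thread's contiguous slice, which matches the reviewer's point that the copy itself is ultimately a `memcpy`.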

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
