apeforest commented on a change in pull request #12926: parallelize NDArray::Copy<cpu, cpu> when data size is large URL: https://github.com/apache/incubator-mxnet/pull/12926#discussion_r236431370
########## File path: docs/faq/env_var.md ##########

```diff
@@ -202,6 +202,12 @@ When USE_PROFILER is enabled in Makefile or CMake, the following environments ca
    If no such algorithm exists given other constraints, MXNet will error out.
    This variable affects the choice of CUDNN convolution algorithms.
    Please see [CUDNN developer guide](https://docs.nvidia.com/deeplearning/sdk/cudnn-developer-guide/index.html) for more details.
+* MXNET_CPU_PARALLEL_COPY_SIZE
+  - Values: Int ```(default=200000)```
+  - The minimum size to call parallel copy by OpenMP in CPU2CPU mode.
+  - When the array size is bigger than this threshold, NDArray::Copy(from, to) is implemented by OpenMP with the Recommended OMP Thread Count.
+  - When the array size is less than this threshold, NDArray::Copy(from, to) is implemented by mshadow::Copy(to, from) in a single thread.
```

Review comment:

This is no longer accurate, right, since it is just implemented using `memcpy`?
