samskalicky edited a comment on issue #12255: Pretty high cpu load when import mxnet
URL: https://github.com/apache/incubator-mxnet/issues/12255#issuecomment-443279316

I took the script from above and added another loop to try from 1 to 36 processes, testing a local build from source (not the pip package):

```
import multiprocessing
import time


def mxnet_worker():
    # Importing mxnet is the only work each process does.
    import mxnet


times = []

# Launch 1 to 36 processes in turn and time how long each batch takes to finish.
for n in range(1, 37):
    t1 = time.time()
    read_process = [multiprocessing.Process(target=mxnet_worker) for _ in range(n)]
    for p in read_process:
        p.daemon = True
        p.start()
    for p in read_process:
        p.join()
    t2 = time.time()
    times.append(t2 - t1)

for t in times:
    print(t)
```

Here are the results (number of processes: elapsed seconds) when compiling with the following cmake flags:

```
cmake -DUSE_CUDA=OFF -DUSE_CUDNN=OFF -DUSE_MKLDNN=OFF -DBLAS=Open -DCMAKE_BUILD_TYPE=Debug ..
```

1: 5.77136611938
2: 7.65716195107
3: 13.9892320633
4: 16.6815569401
5: 22.9886288643
6: 27.6006569862
7: 30.7331540585
8: 33.8466141224
9: 34.18151021
10: 37.1062369347
11: 43.6272640228
12: 44.1143600941
13: 45.8406460285
14: 46.6692020893
15: 47.8332960606
16: 52.4621579647
17: 56.1070458889
18: 56.8046569824
19: 54.1124491692
20: 65.2930281162
21: 62.0744900703
22: 60.4670469761
23: 69.6229948997
24: 71.4172370434
25: 70.9572968483
26: 74.8509230614
27: 77.0419559479
28: 78.2489409447
29: 80.1934709549
30: 74.9342000484
31: 84.4639661312
32: 83.6565339565
33: 91.3137798309
34: 88.20520401
35: 96.2017951012
36: 96.4477438927

Then I tested against the pip wheel:

1: 1.86075401306
2: 2.52445602417
3: 27.84821105
4: 272.775645971
5: 532.317739964
6: 785.189717054

I killed it after 6 processes. I think we get the picture.

Here's another set of results when compiling without OpenMP:

```
cmake -DUSE_CUDA=OFF -DUSE_CUDNN=OFF -DUSE_MKLDNN=OFF -DBLAS=Open -DCMAKE_BUILD_TYPE=Debug -DUSE_OPENMP=OFF ..
```

1: 0.827432
2: 0.859651
3: 0.858839
4: 0.833471
5: 0.884956
6: 0.883090
7: 0.862174
8: 0.888009
9: 0.891180
10: 0.917642
11: 0.894244
12: 0.947771
13: 0.944967
14: 0.956380
15: 0.932657
16: 0.991420
17: 0.956935
18: 0.924413
19: 0.935913
20: 0.944736
21: 0.996702
22: 0.934430
23: 0.966333
24: 1.022540
25: 1.038306
26: 1.175906
27: 1.056674
28: 1.022513
29: 1.083556
30: 1.151226
31: 1.078056
32: 1.046550
33: 1.220279
34: 1.256747
35: 1.334894
36: 1.377328

Clearly there's a problem with OpenMP, seeing as the load times are very reasonable when OpenMP is not used.

@szha, can you take a look at this? There's a huge discrepancy between building from source and the pip wheel. Is there something different that is done when building the pip wheel related to OpenMP?
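A possible follow-up check, sketched here under assumptions and not run against either build: repeat the timing with the OpenMP thread count capped, to see whether thread oversubscription accounts for the slowdown. `OMP_NUM_THREADS` is the standard OpenMP environment variable; whether it changes the import-time behaviour of a given MXNet build is an assumption to verify. The helper name `time_parallel_imports` is hypothetical. It launches fresh interpreters with `subprocess` instead of `multiprocessing`, so each measurement includes interpreter startup and is not directly comparable to the numbers above.

```python
import os
import subprocess
import sys
import time


def time_parallel_imports(module, n_procs, omp_threads=None):
    """Start n_procs fresh interpreters that each import `module`.

    Returns the wall-clock seconds until all of them have exited.
    """
    env = dict(os.environ)
    if omp_threads is not None:
        # Cap the OpenMP thread count in the child processes only.
        env["OMP_NUM_THREADS"] = str(omp_threads)
    start = time.time()
    procs = [
        subprocess.Popen([sys.executable, "-c", "import " + module], env=env)
        for _ in range(n_procs)
    ]
    for p in procs:
        p.wait()
    return time.time() - start


if __name__ == "__main__":
    # Compare the default environment against a single OpenMP thread.
    for n in (1, 2, 4):
        default = time_parallel_imports("mxnet", n)
        capped = time_parallel_imports("mxnet", n, omp_threads=1)
        print("%d: default=%.3f capped=%.3f" % (n, default, capped))
```

If the capped times stay flat as the process count grows while the default times blow up, that would point at OpenMP thread startup rather than something else in the import path.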