samskalicky edited a comment on issue #12255: Pretty high cpu load when import mxnet
URL: https://github.com/apache/incubator-mxnet/issues/12255#issuecomment-443279316
 
 
   I took the script from above and added an outer loop that runs it with 1 to 36 processes, testing a local build from source (not the pip package):
   
   ```
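   import multiprocessing
   import time
   
   def mxnet_worker():
       # The import itself is the workload being timed.
       import mxnet
   
   if __name__ == "__main__":
       times = []
       # Launch 1..36 concurrent processes and time each batch from start to join.
       for n in range(1, 37):
           t1 = time.time()
           read_process = [multiprocessing.Process(target=mxnet_worker)
                           for _ in range(n)]
           for p in read_process:
               p.daemon = True
               p.start()
   
           for p in read_process:
               p.join()
           t2 = time.time()
           times.append(t2 - t1)
   
       # Print "process count: elapsed seconds" to match the result listings.
       for n, elapsed in enumerate(times, start=1):
           print("%d: %s" % (n, elapsed))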
   ```
   
   Here are the results (number of processes: total elapsed seconds) when compiling with the following cmake flags:
   
   ```
   cmake -DUSE_CUDA=OFF -DUSE_CUDNN=OFF -DUSE_MKLDNN=OFF -DBLAS=Open -DCMAKE_BUILD_TYPE=Debug ..
   ```
   
   1: 5.77136611938
   2: 7.65716195107
   3: 13.9892320633
   4: 16.6815569401
   5: 22.9886288643
   6: 27.6006569862
   7: 30.7331540585
   8: 33.8466141224
   9: 34.18151021
   10: 37.1062369347
   11: 43.6272640228
   12: 44.1143600941
   13: 45.8406460285
   14: 46.6692020893
   15: 47.8332960606
   16: 52.4621579647
   17: 56.1070458889
   18: 56.8046569824
   19: 54.1124491692
   20: 65.2930281162
   21: 62.0744900703
   22: 60.4670469761
   23: 69.6229948997
   24: 71.4172370434
   25: 70.9572968483
   26: 74.8509230614
   27: 77.0419559479
   28: 78.2489409447
   29: 80.1934709549
   30: 74.9342000484
   31: 84.4639661312
   32: 83.6565339565
   33: 91.3137798309
   34: 88.20520401
   35: 96.2017951012
   36: 96.4477438927
   
   Then I tested against the pip wheel:
   
   1: 1.86075401306
   2: 2.52445602417
   3: 27.84821105
   4: 272.775645971
   5: 532.317739964
   6: 785.189717054
   
   I killed it after 6 processes. I think we get the picture.
   
   Here's another set of results when compiling without OpenMP:
   
   ```
   cmake -DUSE_CUDA=OFF -DUSE_CUDNN=OFF -DUSE_MKLDNN=OFF -DBLAS=Open -DCMAKE_BUILD_TYPE=Debug -DUSE_OPENMP=OFF ..
   ```
   
   
   1: 0.827432
   2: 0.859651
   3: 0.858839
   4: 0.833471
   5: 0.884956
   6: 0.883090
   7: 0.862174
   8: 0.888009
   9: 0.891180
   10: 0.917642
   11: 0.894244
   12: 0.947771
   13: 0.944967
   14: 0.956380
   15: 0.932657
   16: 0.991420
   17: 0.956935
   18: 0.924413
   19: 0.935913
   20: 0.944736
   21: 0.996702
   22: 0.934430
   23: 0.966333
   24: 1.022540
   25: 1.038306
   26: 1.175906
   27: 1.056674
   28: 1.022513
   29: 1.083556
   30: 1.151226
   31: 1.078056
   32: 1.046550
   33: 1.220279
   34: 1.256747
   35: 1.334894
   36: 1.377328
   
   Clearly there's a problem with OpenMP, since the import times are very reasonable when OpenMP is not used.
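
One way to probe the OpenMP hypothesis without rebuilding might be to cap the OpenMP thread count in each worker before the import. The sketch below is a variant of the script above under stated assumptions: `OMP_NUM_THREADS` is the standard OpenMP environment variable, but whether capping it changes the MXNet import time has not been measured here, and `import_worker` and `time_parallel_imports` are hypothetical helper names.

```python
import importlib
import multiprocessing
import os
import time


def import_worker(module_name, omp_threads):
    # Set the cap before the import so the OpenMP runtime sees it at load time.
    if omp_threads is not None:
        os.environ["OMP_NUM_THREADS"] = str(omp_threads)
    importlib.import_module(module_name)


def time_parallel_imports(module_name, num_procs, omp_threads=None):
    """Return wall-clock seconds for num_procs processes to each import module_name."""
    procs = [
        multiprocessing.Process(target=import_worker,
                                args=(module_name, omp_threads))
        for _ in range(num_procs)
    ]
    start = time.time()
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    return time.time() - start


if __name__ == "__main__":
    # Compare the default thread count against a cap of one thread per process.
    for n in (1, 2, 4):
        default = time_parallel_imports("mxnet", n)
        capped = time_parallel_imports("mxnet", n, omp_threads=1)
        print("%d: default=%.3f capped=%.3f" % (n, default, capped))
```

If the capped runs stay flat while the default runs grow with the process count, that would point toward thread oversubscription at import time; if both grow, the cause is probably elsewhere.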
   
   @szha, can you take a look at this? There's a huge discrepancy between building from source and the pip wheel. Is there something different that is done when building the pip wheel related to OpenMP?
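
To compare the two builds from the Python side, it may help to list which OpenMP runtime each one actually loads. The following is a minimal, Linux-only sketch that scans `/proc/self/maps` after the import. The name prefixes (`libgomp`, `libomp`, `libiomp`) are the common GNU, LLVM, and Intel runtime file names and are an assumption about what either build links against; `openmp_libs_in_maps` is a hypothetical helper.

```python
import os

# Assumed file-name prefixes of common OpenMP runtimes (GNU, LLVM, Intel).
OPENMP_RUNTIME_HINTS = ("libgomp", "libomp", "libiomp")


def openmp_libs_in_maps(maps_text):
    """Return sorted paths of mapped files whose name looks like an OpenMP runtime."""
    found = set()
    for line in maps_text.splitlines():
        # Format: address perms offset dev inode [path]; the path may contain spaces.
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # anonymous mapping with no backing file
        path = fields[5].strip()
        name = os.path.basename(path)
        if any(name.startswith(hint) for hint in OPENMP_RUNTIME_HINTS):
            found.add(path)
    return sorted(found)


if __name__ == "__main__":
    try:
        import mxnet  # noqa: F401
    except ImportError:
        print("mxnet is not installed; showing libraries loaded so far")
    if os.path.exists("/proc/self/maps"):
        with open("/proc/self/maps") as f:
            for path in openmp_libs_in_maps(f.read()):
                print(path)
```

Running this once against the source build and once against the pip wheel would show whether they load different runtimes, or more than one.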

