YutingZhang edited a comment on issue #13593: Low CPU usage of MXNet in 
subprocesses
URL: 
https://github.com/apache/incubator-mxnet/issues/13593#issuecomment-450949556
 
 
   @pengzhao-intel @TaoLv @anirudh2290 @zhreshold Thank you all for your 
help, and happy new year! This problem seems more complicated than it first 
appeared (there may have been several problems from the beginning). 
@zhreshold's fix solved the problem in most cases. 
   However, I found that if we call `asnumpy` in each worker, the processes 
interfere with one another. This does not seem to be a problem for the GPU 
version of MXNet running on a GPU machine. It seems to happen only on a 
**CPU-only machine (I tested on c5.18xlarge with `mxnet-mkl`)**.
   
   Code (one-line difference):
   ```
   import argparse
   import sys
   from concurrent import futures
   import time
   import numpy as np
   mx=None
   
   
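   def run(need_import):
       # `global` must come before any binding of `mx` in this function;
       # declaring it after the import is a SyntaxError on Python 3.6+.
       global mx
       if need_import:
           import mxnet as mx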
       A = mx.nd.random.uniform(low=0, high=1, shape=(5000, 5000))
       while True:
           A = mx.nd.dot(A, A)
           A.asnumpy()    # ******** only difference ***********
   
   def parse_args():
       parser = argparse.ArgumentParser("benchmark mxnet cpu")
       parser.add_argument('--num-workers', '-j', dest='num_workers', type=int,
                           default=0)
       parser.add_argument('--late-import', action='store_true')
       return parser.parse_args()
   
   def main(args):
   
       if args.num_workers == 0:
           print("Main process")
           try:
               run(need_import=args.late_import)
           except KeyboardInterrupt:
               pass
       else:
           print("Subprocesses")
           ex = futures.ProcessPoolExecutor(args.num_workers)
   
           for _ in range(args.num_workers):
               ex.submit(run, need_import=args.late_import)
           while True:
               try:
                   time.sleep(10000)
               except KeyboardInterrupt:
                   ex.shutdown(wait=False)
                   break
       print("Stopped")
   
   
   if __name__ == "__main__":
       args = parse_args()
       if not args.late_import:
           import mxnet as mx
       main(args)
   ```
   
   Launch 10 workers (`python3 mxnet_cpu_test.py --num-workers=10`). 
`MXNET_MP_WORKER_NTHREADS` does not affect the results.
   
![image](https://user-images.githubusercontent.com/7865903/50606321-3042e480-0e7a-11e9-892a-2066a6030caf.png)
   
   But running it only in the main process is fine:
   
![image](https://user-images.githubusercontent.com/7865903/50606810-e1964a00-0e7b-11e9-94cb-b0f61dbbea36.png)
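
The screenshots show machine-wide load. To compare the subprocess and main-process runs per worker, a small timing helper could report iterations per second and the average number of busy cores. This is only a minimal sketch and not part of the script above: `measure_core_usage` and its parameters are hypothetical names. It relies on `time.process_time()` returning CPU time for the whole calling process, so time spent in native threads started by the library is counted too.

```python
import time


def measure_core_usage(step, iterations=10):
    """Call step() `iterations` times.

    Returns (iterations_per_second, average_busy_cores), where the second
    value is process CPU time divided by wall-clock time.
    """
    wall_start = time.perf_counter()
    cpu_start = time.process_time()  # user + system CPU time of this process
    for _ in range(iterations):
        step()
    # Guard against a zero interval for extremely short steps.
    wall = max(time.perf_counter() - wall_start, 1e-9)
    cpu = time.process_time() - cpu_start
    return iterations / wall, cpu / wall
```

Inside `run`, `step` would be a function that performs one `mx.nd.dot` followed by `asnumpy()`. A busy-core figure close to 1 in every worker, against a much larger figure in the main-process run, would be consistent with what the screenshots show.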
   
   
   By the way, I found another issue with `mxnet` (the CPU, non-MKL build): 
when MXNet runs in a subprocess, it interferes with many unrelated, non-MXNet 
functions (e.g., `cv2.cvtColor`), and the subprocess gets stuck in those calls. 
This did not happen with `mxnet==1.3.1`; it started with one of the nightly 
builds. We should probably open a new ticket for this.
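
For a new ticket about the hang, it may help to record where each stuck subprocess is waiting. The sketch below uses the standard-library `faulthandler` module to dump the Python tracebacks of all threads when a block runs longer than a timeout. `hang_watchdog` and its parameters are hypothetical names, and the dump shows Python frames only, so it identifies the Python call that is stuck (for example the line calling `cv2.cvtColor`) but not the native frames underneath it.

```python
import contextlib
import faulthandler
import sys


@contextlib.contextmanager
def hang_watchdog(timeout_s=30.0, file=None):
    """Dump all threads' Python tracebacks if the body exceeds timeout_s seconds."""
    target = file if file is not None else sys.stderr
    # repeat=True dumps again every timeout_s seconds while the body is stuck.
    faulthandler.dump_traceback_later(timeout_s, repeat=True, file=target)
    try:
        yield
    finally:
        # Disarm the watchdog once the body has finished (or raised).
        faulthandler.cancel_dump_traceback_later()
```

In a worker, the suspect call would go inside `with hang_watchdog(30):`. If the call returns in time, nothing is printed.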
   
    
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
