Hello, I am trying to parallelize execution of my code across multiple cores of my CPU. When I set MXNET_CPU_WORKER_NTHREADS, the effective number of cores in use actually decreases, even though I set the variable to a value lower than the total number of cores I have. It almost looks as if setting this variable turns off the parallelization inside BLAS, OpenMP, and similar libraries.
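For reference, here is a minimal sketch of the kind of setup I mean. The thread counts are placeholder values, not my real configuration, and setting OMP_NUM_THREADS alongside the MXNet variable is an assumption on my part about what might matter. As far as I understand, these variables have to be in the environment before MXNet is imported.

```python
import os

# Placeholder values for illustration only, not recommendations.
# Assumption: the variables must be set before MXNet is imported.
os.environ["MXNET_CPU_WORKER_NTHREADS"] = "4"  # MXNet engine worker threads for CPU operators
os.environ["OMP_NUM_THREADS"] = "4"            # standard OpenMP default thread count

total_cores = os.cpu_count()  # logical cores visible to Python (may be None)
print("logical cores:", total_cores)
print("MXNET_CPU_WORKER_NTHREADS:", os.environ["MXNET_CPU_WORKER_NTHREADS"])
print("OMP_NUM_THREADS:", os.environ["OMP_NUM_THREADS"])

try:
    import mxnet as mx  # imported only after the environment is configured
    print("mxnet version:", mx.__version__)
except ImportError:
    mx = None
    print("mxnet is not installed in this environment")
```

With this in place I watch per-core utilization in a system monitor while the network runs.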
I was wondering 1) whether I need to set another variable to let MXNet run its operators across cores, 2) whether calling asnumpy(), or some other blocking call inside the network, blocks the entire network (as opposed to blocking only the part of the network that the call depends on), and 3) whether there are other obvious places to check when trying to understand why MXNet is using neither the memory nor the CPU cores I have available.

Thanks!
Paul
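For question 2, this is roughly the kind of check I have in mind. It is only a sketch: the helper name time_blocking_call and the matrix size are made up for illustration, and the timings it prints are not a definitive answer about how the engine schedules work. It queues two independent matrix products, then times how long asnumpy() on the first and wait_to_read() on the second each take to return.

```python
import time

try:
    import mxnet as mx
except ImportError:  # lets the sketch load on machines without MXNet
    mx = None


def time_blocking_call(n=512):
    """Hypothetical helper: time queuing two independent ops, then waiting on each."""
    if mx is None:
        return None
    a = mx.nd.ones((n, n))
    b = mx.nd.ones((n, n))

    start = time.perf_counter()
    c = mx.nd.dot(a, a)  # queued asynchronously
    d = mx.nd.dot(b, b)  # does not depend on c
    queued = time.perf_counter() - start

    start = time.perf_counter()
    c.asnumpy()          # blocks this Python thread until c is ready
    waited_c = time.perf_counter() - start

    start = time.perf_counter()
    d.wait_to_read()     # blocks until d is ready
    waited_d = time.perf_counter() - start

    mx.nd.waitall()      # make sure nothing is still pending
    return {"queued": queued, "waited_c": waited_c, "waited_d": waited_d}


result = time_blocking_call()
print(result)
```

If waiting on d is close to zero after asnumpy() on c returns, that would suggest d kept running while the Python thread was blocked, but I am not sure that reading is correct.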
