I finally got back to this after a while. I no longer observe the 
libomp-related error mentioned in the original issue, but I am observing 
process deadlocks with the following numpy/mxnet configuration in Sockeye:
```
conda list | grep numpy
numpy                     1.15.1           py36h6a91979_0
numpy-base                1.15.1           py36h8a80b8c_0
conda list | grep mkl
blas                      1.0                         mkl
mkl                       2019.0                      118
mkl_fft                   1.0.4            py36h5d10147_1
mkl_random                1.0.1            py36h5d10147_1
mxnet-mkl                 1.3.0.post0               <pip>
```
With MKL-optimized numpy installed via Anaconda (as shown above) and 
mxnet-mkl==1.3.0.post0 on a Mac laptop, the main Sockeye process 
deterministically hangs at a checkpoint, when it tries to spawn the subprocess 
that decodes the validation data set. From debugging, it appears that the 
subprocess itself is never spawned.
However, with either mxnet==1.3.0.post0 (no MKL) or pip-installed numpy 
(no MKL), everything works just fine.
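
For anyone who wants to check their own environment, here is a rough sketch of how the hang could be probed outside of Sockeye. It is not Sockeye's actual checkpoint code: `probe` and `REPRO` are hypothetical names, and the snippet in `REPRO` is only my assumption of a minimal equivalent (touch mxnet and numpy in the parent, then fork a child that calls into numpy). Running it in a fresh interpreter with a timeout means a deadlock shows up as `"hang"` instead of blocking the terminal.

```python
import subprocess
import sys


def probe(snippet, timeout=60.0):
    """Run `snippet` in a fresh interpreter and report 'ok', 'error', or 'hang'."""
    try:
        result = subprocess.run(
            [sys.executable, "-c", snippet],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        # The interpreter did not exit in time; treat it as a deadlock.
        return "hang"
    return "ok" if result.returncode == 0 else "error"


# Hypothetical reproduction (an assumption, not taken from Sockeye):
# use mxnet and numpy in the parent, then fork a child that uses numpy.
REPRO = """
import multiprocessing as mp
import numpy as np
import mxnet as mx

def child():
    np.dot(np.ones((64, 64)), np.ones((64, 64)))

if __name__ == "__main__":
    mx.nd.ones((64, 64)).asnumpy()
    ctx = mp.get_context("fork")  # fork is the default start method on py36/macOS
    p = ctx.Process(target=child)
    p.start()
    p.join()
    raise SystemExit(p.exitcode)
"""

if __name__ == "__main__":
    # 'error' is also what you get if numpy or mxnet is not installed.
    print(probe(REPRO, timeout=60.0))
```

If the sketch matches the real failure, I would expect `hang` with the conda MKL numpy plus mxnet-mkl combination and `ok` with the other two setups, but I have not verified that this reduced snippet triggers the same deadlock.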

[ Full content available at: 
https://github.com/apache/incubator-mxnet/issues/8532 ]