I finally got back to this after a while. I no longer observe the libomp-related error mentioned in the original issue, but I am observing process deadlocks with the following numpy/mxnet configuration in Sockeye:

```
conda list | grep numpy
numpy                     1.15.1           py36h6a91979_0
numpy-base                1.15.1           py36h8a80b8c_0
conda list | grep mkl
blas                      1.0                         mkl
mkl                       2019.0                      118
mkl_fft                   1.0.4            py36h5d10147_1
mkl_random                1.0.1            py36h5d10147_1
mxnet-mkl                 1.3.0.post0               <pip>
```

With MKL-optimized numpy installed via Anaconda (as shown above) together with mxnet-mkl==1.3.0.post0 on a Mac laptop, the subprocess that Sockeye starts at a checkpoint (to decode the validation data set) never comes up, and the main process deterministically hangs. From debugging, it seems that the failure happens while spawning the subprocess. However, when using either mxnet==1.3.0.post0 (no MKL) or pip-installed numpy (no MKL), everything works just fine.
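For anyone who wants to check this outside Sockeye, here is a rough sketch of a standalone probe. It is not Sockeye's code: the names `child_starts` and `_child` are hypothetical, and I am assuming the checkpoint decoder is started through Python's `multiprocessing` module. It imports the suspect libraries in the parent (if they are installed), starts a child process, and waits a bounded time for the child to report back. If the hang occurs inside `proc.start()` itself, the timeout will not help and the script will hang just like the real run.

```python
import importlib
import multiprocessing as mp
import queue as queue_module


def _child(result_queue):
    # Runs in the child process; reports back so the parent knows it started.
    result_queue.put("started")


def child_starts(timeout=30.0, start_method="spawn", preload=("numpy", "mxnet")):
    """Return True if a child process starts and reports back within `timeout` seconds.

    Hypothetical helper for illustration; `start_method` and `preload` are
    assumptions about what is worth varying, not settings taken from Sockeye.
    """
    # Load the libraries under suspicion in the parent first, skipping any
    # that are not installed in the current environment.
    for name in preload:
        try:
            importlib.import_module(name)
        except ImportError:
            pass

    ctx = mp.get_context(start_method)
    result_queue = ctx.Queue()
    proc = ctx.Process(target=_child, args=(result_queue,))
    proc.start()
    try:
        result_queue.get(timeout=timeout)
        started = True
    except queue_module.Empty:
        # The child did not report back in time.
        started = False
    finally:
        if not started and proc.is_alive():
            proc.terminate()
        proc.join()
    return started
```

Saving this to a file and calling `child_starts()` under an `if __name__ == "__main__":` guard (needed for the `spawn` start method) in each of the environments above should show whether the behaviour depends on the numpy/mxnet combination or on something specific to Sockeye.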
[ Full content available at: https://github.com/apache/incubator-mxnet/issues/8532 ]
