TaoLv commented on issue #12953: Update MKL-DNN dependency
URL: https://github.com/apache/incubator-mxnet/pull/12953#issuecomment-434207518
 
 
   I tried to reproduce it on a g3.8xlarge instance and got the same error when the test binary is linked against an old version of mkldnn. Here are the steps:
   
   1. launch a g3.8xlarge instance and choose the first environment from the AMI list:
   
   ``` 
   =============================================================================
          __|  __|_  )
          _|  (     /   Deep Learning AMI (Ubuntu) Version 16.0
         ___|\___|___|
    
   =============================================================================
    
   Welcome to Ubuntu 16.04.5 LTS (GNU/Linux 4.4.0-1069-aws x86_64v)
    
   Please use one of the following commands to start the required environment with the framework of your choice:
   for MXNet(+Keras2) with Python3 (CUDA 9.0 and Intel MKL-DNN) _______________________________ source activate mxnet_p36
   for MXNet(+Keras2) with Python2 (CUDA 9.0 and Intel MKL-DNN) _______________________________ source activate mxnet_p27
   for TensorFlow(+Keras2) with Python3 (CUDA 9.0 and Intel MKL-DNN) _____________________ source activate tensorflow_p36
   for TensorFlow(+Keras2) with Python2 (CUDA 9.0 and Intel MKL-DNN) _____________________ source activate tensorflow_p27
   for Theano(+Keras2) with Python3 (CUDA 9.0) _______________________________________________ source activate theano_p36
   for Theano(+Keras2) with Python2 (CUDA 9.0) _______________________________________________ source activate theano_p27
    ```
   
   2. remove the pre-installed mxnet
   3. download and build the branch from my PR:
   `make -j8  USE_BLAS=openblas USE_MKLDNN=1 USE_PROFILER=1 test`
   4. copy the cpp test binary to the mxnet root folder and execute:
   `./mxnet_unit_tests --gtest_filter=MKLDNN*`
   which fails with an “illegal instruction” error:
   ``` 
   (mxnet_p36) ubuntu@ip-172-31-63-232:~/incubator-mxnet$ ./mxnet_unit_tests --gtest_filter=MKLDNN*
    
   Note: Google Test filter = MKLDNN*
   [==========] Running 8 tests from 3 test cases.
   [----------] Global test environment set-up.
   [----------] 2 tests from MKLDNN_UTIL_FUNC
   [ RUN      ] MKLDNN_UTIL_FUNC.AlignMem
   [       OK ] MKLDNN_UTIL_FUNC.AlignMem (1 ms)
   [ RUN      ] MKLDNN_UTIL_FUNC.MemFormat
   [       OK ] MKLDNN_UTIL_FUNC.MemFormat (0 ms)
   [----------] 2 tests from MKLDNN_UTIL_FUNC (1 ms total)
    
   [----------] 4 tests from MKLDNN_NDArray
   [ RUN      ] MKLDNN_NDArray.GetDataReorder
   Illegal instruction (core dumped)
    ```
   
   5. run `ldd mxnet_unit_tests` and find that it links to /usr/local/lib/libmkldnn.so.0, which is version 0.14 and is distributed with this AMI environment.
   
   6. after updating /usr/local/lib/libmkldnn.so.0 to the latest version, the cpp test failure disappears.
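
The check in step 5 can also be scripted. Below is a minimal Python sketch, assuming the usual `name => path (address)` layout of `ldd` output. The helper names (`parse_ldd`, `find_library`, `ldd`) and the `SAMPLE` text are illustrative only; `SAMPLE` is a hand-written stand-in, not output captured from the instance above.

```python
import re
import subprocess

# Matches lines such as:
#   "\tlibfoo.so.0 => /usr/local/lib/libfoo.so.0 (0x00007f...)"
# Lines without "=> path (address)" (e.g. linux-vdso, "not found") are skipped.
LDD_LINE = re.compile(r"\s*(\S+)\s+=>\s+(\S+)\s+\(0x[0-9a-fA-F]+\)")


def parse_ldd(output):
    """Map each shared-library name in ldd output to its resolved path."""
    resolved = {}
    for line in output.splitlines():
        match = LDD_LINE.match(line)
        if match:
            resolved[match.group(1)] = match.group(2)
    return resolved


def find_library(output, prefix):
    """Return only the entries whose library name starts with `prefix`."""
    return {name: path for name, path in parse_ldd(output).items()
            if name.startswith(prefix)}


def ldd(binary):
    """Run ldd on a binary and return its text output (requires ldd on PATH)."""
    result = subprocess.run(["ldd", binary], capture_output=True,
                            text=True, check=True)
    return result.stdout


# Hypothetical sample in the usual ldd layout, for illustration only.
SAMPLE = (
    "\tlinux-vdso.so.1 (0x00007ffd1a9f2000)\n"
    "\tlibmkldnn.so.0 => /usr/local/lib/libmkldnn.so.0 (0x00007f3c2a400000)\n"
    "\tlibc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f3c29e00000)\n"
)

if __name__ == "__main__":
    # On a real machine, replace SAMPLE with ldd("./mxnet_unit_tests").
    for name, path in find_library(SAMPLE, "libmkldnn").items():
        print(name, "->", path)
```

If the reported path points at a system-wide copy such as /usr/local/lib rather than the library built with the branch, the test binary is picking up the older mkldnn described in step 5.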
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
