ChaiBapchya edited a comment on issue #17980: URL: https://github.com/apache/incubator-mxnet/issues/17980#issuecomment-623803913
> In case somebody finds this issue and wants their optimized build, here is a different workaround that removes the need for `LD_PRELOAD`. Just do this before running cmake the first time:
>
> ```shell
> export CXXFLAGS="${CXXFLAGS} -DUSE_MKL -I/opt/intel/mkl/include"
> ```
>
> Then `cmake` can be run normally:
>
> ```shell
> cmake -GNinja -DUSE_CUDA=OFF -DCMAKE_BUILD_TYPE=Release ..
> ```
>
> and the compiled MXNet can be run normally without any special environment variables.

@kpuatamazon Hi, I was trying to benchmark MKL [default] vs. the workaround using OpPerf. Despite ensuring MKL is installed and running the `export CXXFLAGS` line followed by the usual cmake command, the build failed with

```
gemm.cpp:(.text+0xb45): undefined reference to `cblas_gemm_s8u8s32'
```

I tried the undocumented abominable kludge option you mentioned and that worked smoothly.

```
export LD_PRELOAD=/opt/intel/mkl/lib/intel64/libmkl_rt.so
rm -rf build/
mkdir -p build && cd build
cmake -GNinja -DUSE_CUDA=OFF -DCMAKE_BUILD_TYPE=Release -D_DNNL_USE_MKL=FULL -DMKLINC=/opt/intel/mkl/include ..
cmake --build . --parallel 1024
```

Script for OpPerf: https://gist.github.com/ChaiBapchya/5f2342f75ddeb1e21f14acac665c76ad

Results

| Operator  | LHS            | RHS            | MKL Default | MKL Workaround |
|-----------|----------------|----------------|-------------|----------------|
| Dot       | (4, 512, 512)  | (4, 512, 512)  | 15.1122     | 4.1254         |
|           | (5, 512, 512)  | (5, 512, 512)  | 38.1678     | 7.5323         |
|           | (5, 512, 1536) | (5, 512, 1536) | 21.6601     | 19.2503        |
|           | (5, 512, 2048) | (5, 512, 2048) | 29.0369     | 23.7432        |
|           | (5, 2048, 512) | (5, 2048, 512) | 167.5528    | 129.9957       |
|           |                |                |             |                |
| Batch_dot | (4, 512, 512)  | (4, 512, 512)  | 1.7898      | 1.5445         |
|           | (5, 512, 512)  | (5, 512, 512)  | 2.2457      | 1.9361         |
|           | (5, 512, 1536) | (5, 512, 1536) | 6.1453      | 5.4034         |
|           | (5, 512, 2048) | (5, 512, 2048) | 8.246       | 8.0442         |
|           | (5, 2048, 512) | (5, 2048, 512) | 160.6243    | 29.0772        |
|           |                |                |             |                |
|           | **Data**       | **Weight**     |             |                |
| FC        | (4, 512)       | (512, 512)     | 0.0609      | 0.068          |
|           | (5, 512)       | (512, 512)     | 0.0633      | 0.0731         |
|           | (5, 512)       | (1536, 512)    | 0.0916      | 0.0996         |
|           | (5, 512)       | (2048, 512)    | 0.1081      | 0.1084         |

However, @kpuatamazon, when I try to go back to the default build [i.e. default -> workaround -> default] by unsetting the environment variable `LD_PRELOAD`, the default build fails with `` gemm.cpp:(.text+0xe6b): undefined reference to `cblas_gemm_s8u8s32' ``

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org