Someone is doing something unhealthy when they fork, which is causing an
assertion in the OpenMP library. It is the same assertion that would fire in
MKL, which links against libiomp5 (the exact same OMP library). This is new
behavior and is most likely due to an error or a suboptimal approach in the
forking logic in MXNet.

In order to circumvent the assert, the CI team is proposing to remove the
library completely, which is equivalent to cutting off your leg to make the
pain from stubbing your toe go away.

We get a lot of performance gain from this OMP library. It has about 1/3 less
overhead for entering OMP regions, and it also supports OMP regions after a
fork, which libgomp does not.
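
For reference, the pattern at issue looks roughly like the following (a
minimal sketch, not MXNet code): the parent process initializes the OpenMP
runtime, forks, and the child then enters another OMP region. Whether the
child's region behaves depends on which runtime happens to be linked in.

    // fork_omp.cpp (illustrative name) -- build with: g++ -fopenmp fork_omp.cpp
    #include <omp.h>
    #include <unistd.h>
    #include <sys/wait.h>
    #include <cstdio>

    int main() {
        // Parent touches the OpenMP runtime before forking.
        #pragma omp parallel
        {
            #pragma omp single
            std::printf("parent threads: %d\n", omp_get_num_threads());
        }

        pid_t pid = fork();
        if (pid == 0) {
            // Child enters an OMP region again after the fork; how well this
            // is handled depends on the runtime (libomp/libiomp5 vs libgomp).
            #pragma omp parallel
            {
                #pragma omp single
                std::printf("child threads: %d\n", omp_get_num_threads());
            }
            _exit(0);
        }
        waitpid(pid, nullptr, 0);
        return 0;
    }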

In many months, no investigation has occurred as to WHY the assertion is
failing.

The PR is vetoed until such time as the actual root cause of the problem is
known.


Thanks,

-Chris.




On Thu, Nov 22, 2018 at 4:36 AM Anton Chernov <mecher...@gmail.com> wrote:

> Dear MXNet community,
>
> I would like to draw attention to an important issue that is present in
> the MXNet CMake build: the usage of the bundled llvm OpenMP library.
>
> I have opened a PR to remove it:
> https://github.com/apache/incubator-mxnet/pull/12160
>
> It was closed, but I am firm in my opinion that it's the right
> thing to do.
>
> *Background*
> If you want to use OpenMP pragmas in your code for parallelization, you
> would supply a special flag to the compiler:
>
> - Clang / -fopenmp
> https://openmp.llvm.org/
>
> - GCC / -fopenmp
> https://gcc.gnu.org/onlinedocs/libgomp/Enabling-OpenMP.html
>
> - Intel / [Q]openmp
>
> https://software.intel.com/en-us/node/522689#6E24682E-F411-4AE3-A04D-ECD81C7008D1
>
> - Visual Studio: /openmp (Enable OpenMP 2.0 Support)
> https://msdn.microsoft.com/en-us/library/tt15eb9t.aspx
>
> Each of these compilers enables the '#pragma omp' directive during C/C++
> compilation and arranges for automatic linking of the OpenMP runtime library
> supplied by that compiler.
>
> Thus, to use the advantages of an OpenMP implementation one has to compile
> the code with the corresponding compiler.
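>
> For illustration, a trivial translation unit such as the one below (the
> file name omp_example.cpp is hypothetical) would be built and linked
> against a different runtime by each toolchain:
>
>     // omp_example.cpp -- each compiler links its own OpenMP runtime:
>     //   g++     -fopenmp  omp_example.cpp   (libgomp)
>     //   clang++ -fopenmp  omp_example.cpp   (libomp)
>     //   icc     -qopenmp  omp_example.cpp   (libiomp5)
>     #include <omp.h>
>     #include <cstdio>
>
>     int main() {
>         #pragma omp parallel for
>         for (int i = 0; i < 4; ++i) {
>             std::printf("iteration %d on thread %d\n",
>                         i, omp_get_thread_num());
>         }
>         return 0;
>     }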
>
> Currently, in the MXNet CMake build scripts, a bundled version of llvm
> OpenMP is used ([1] and [2]) to replace the OpenMP library supplied by the
> compiler.
>
> I will quote here the README from the MKL-DNN (Intel(R) Math Kernel Library
> for Deep Neural Networks):
>
> "Intel MKL-DNN uses OpenMP* for parallelism and requires an OpenMP runtime
> library to work. As different OpenMP runtimes may not be binary compatible
> it's important to ensure that only one OpenMP runtime is used throughout
> the application. Having more than one OpenMP runtime initialized may lead
> to undefined behavior resulting in incorrect results or crashes." [3]
>
> And:
>
> "Using GNU compiler with -fopenmp and -liomp5 options will link the
> application with both Intel and GNU OpenMP runtime libraries. This will
> lead to undefined behavior of the application." [4]
>
> As can be seen in the ldd output for an MXNet build, both runtimes end up
> linked:
>
> $ ldd build/tests/mxnet_unit_tests | grep omp
>     libomp.so => /.../mxnet/build/3rdparty/openmp/runtime/src/libomp.so (0x00007f697bc55000)
>     libgomp.so.1 => /usr/lib/x86_64-linux-gnu/libgomp.so.1 (0x00007f69660cd000)
>
> *Performance*
>
> The only performance data related to OpenMP in MXNet that I was able to
> find is here:
>
> https://github.com/apache/incubator-mxnet/issues/9744#issuecomment-367711172
>
> Which, in my understanding, is testing the impact of different environment
> variables on the same setup (using the same bundled OpenMP library).
>
> The libraries may differ in implementation, and the Thread Affinity
> Interface [5] may have a significant impact on performance.
>
> All compilers support it (a small probe illustrating the effect follows
> the list below):
>
> - Clang / KMP_AFFINITY
>
> https://github.com/clang-ykt/openmp/blob/master/runtime/src/kmp_affinity.cpp
>
> - GCC / GOMP_CPU_AFFINITY
>
> https://gcc.gnu.org/onlinedocs/gcc-4.7.1/libgomp/GOMP_005fCPU_005fAFFINITY.html
>
> - Intel / KMP_AFFINITY
>
> https://software.intel.com/en-us/node/522689#6E24682E-F411-4AE3-A04D-ECD81C7008D1
>
> - Visual Studio / SetThreadAffinityMask
>
> https://docs.microsoft.com/en-us/windows/desktop/api/winbase/nf-winbase-setthreadaffinitymask
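>
> As a rough sketch of why affinity matters, the following probe (the file
> name affinity_probe.cpp and the environment settings in the comments are
> only examples) prints which CPU each OpenMP thread lands on; rerunning it
> under different KMP_AFFINITY / GOMP_CPU_AFFINITY settings changes where
> the threads are pinned:
>
>     // affinity_probe.cpp -- report the CPU each OpenMP thread runs on.
>     // Example runs (Linux):
>     //   KMP_AFFINITY=granularity=fine,compact ./a.out   (libomp / libiomp5)
>     //   GOMP_CPU_AFFINITY=0-3 ./a.out                   (libgomp)
>     #ifndef _GNU_SOURCE
>     #define _GNU_SOURCE
>     #endif
>     #include <omp.h>
>     #include <sched.h>   // sched_getcpu (glibc)
>     #include <cstdio>
>
>     int main() {
>         #pragma omp parallel
>         {
>             std::printf("thread %d of %d on cpu %d\n",
>                         omp_get_thread_num(), omp_get_num_threads(),
>                         sched_getcpu());
>         }
>         return 0;
>     }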
>
> *Issues*
>
> Failed OpenMP assertion when loading MXNet compiled with DEBUG=1
> https://github.com/apache/incubator-mxnet/issues/10856
>
> libomp.so dependency (need REAL fix)
> https://github.com/apache/incubator-mxnet/issues/11417
>
> mxnet-mkl (v0.12.0) crash when using (conda-installed) numpy with MKL
> https://github.com/apache/incubator-mxnet/issues/8532
>
> Performance regression when OMP_NUM_THREADS environment variable is not set
> https://github.com/apache/incubator-mxnet/issues/9744
>
> Poor concat CPU performance on CUDA builds
> https://github.com/apache/incubator-mxnet/issues/11905
>
> I would appreciate hearing your thoughts.
>
>
> Best
> Anton
>
> [1]
>
> https://github.com/apache/incubator-mxnet/blob/master/CMakeLists.txt#L400-L405
> [2] https://github.com/apache/incubator-mxnet/tree/master/3rdparty
> [3] https://github.com/intel/mkl-dnn/blame/master/README.md#L261-L265
> [4] https://github.com/intel/mkl-dnn/blame/master/README.md#L278-L280
> [5] https://software.intel.com/en-us/node/522691
>
