Re: [Discussion] Remove bundled llvm OpenMP

Anton Chernov Sun, 09 Dec 2018 07:29:01 -0800

Hi Chris,

Following up on the issue, are all things resolved in the discussion?


If yes, I kindly ask you to reopen this PR and remove ‘requesting changes’
status:
https://github.com/apache/incubator-mxnet/pull/12160

Thank you.


Best
Anton


вт, 27 нояб. 2018 г. в 17:15, Anton Chernov <[email protected]>:

> Another thing to take into consideration:
>
> All python artefacts that are created (PyPi) are built with make and are
> not using the bundled OpenMP library.
>
> One step for the switch to CMake to happen is the approval and merging of
> the mentioned PR:
>
> https://github.com/apache/incubator-mxnet/pull/12160
>
> If there are no other objections I kindly ask Chris Olivier to remove his
> 'requesting changes' veto on it to unblock the CMake overhaul work.
>
> Thank you.
>
> Best
> Anton
>
> чт, 22 нояб. 2018 г. в 17:11, Anton Chernov <[email protected]>:
>
>>
>> Thank you for you answer, Chris.
>>
>> > The whole “mixing omp libraries” is something that occurs in production
>> every day and certainly in everything that uses mkl.
>>
>> I'm afraid this statement is wrong. Intel MKL-DNN strictly ensures that
>> this mixture is not happening:
>>
>> "Intel MKL-DNN uses OpenMP* for parallelism and requires an OpenMP
>> runtime library to work. As different OpenMP runtimes may not be binary
>> compatible it's important to ensure that only one OpenMP runtime is used
>> throughout the application. Having more than one OpenMP runtime initialized
>> may lead to undefined behavior resulting in incorrect results or crashes."
>> [1]
>>
>> That is why 2 different MKLML libraries are provided:
>>
>> lib/libmklml_gnu.so  | Intel MKL small library for GNU* OpenMP runtime
>> lib/libmklml_intel.so | Intel MKL small library for Intel(R) OpenMP
>> runtime
>>
>> > is the suggestion that libiomp be removed from mkl?
>>
>> That is certainly not my suggestion.
>>
>> > have you spoken with intel? have you consulted Intel at all?
>>
>> Yes, I have asked for comments on the issue.
>>
>> > “hard to debug random crash”. you’re seeing an assertion which is
>> probably ...
>>
>> I'm seeing the result of undefined behaviour. And I want to put emphasis
>> on the following statement:
>>
>> I disregards of whether there is a particular reason for the assert - it
>> is a result of behaviour that should not happen. There are valid ways how
>> to use llvm OpenMP in MXNet and the current way is not one of them.
>>
>> > The lack of root-causing the problem and knee-jerk solution here makes
>> me
>> uncomfortable.
>>
>> I hope that my efforts highlighting the problems reach you to mitigate
>> your uncomfort.
>>
>> > if you want to see performance differences there’s an environment
>> variable
>> you can set in the mxnet omp tuning code that will print overhead and
>> execution times for the current omp library.
>>
>> I don't want to see performance differences in the current OpenMP
>> library. I want to remove the current OpenMP library and use the one
>> provided by the compiler.
>>
>>
>>
>> Best
>> Anton
>>
>> [1] https://github.com/intel/mkl-dnn/blame/master/README.md#L261-L265
>>
>> чт, 22 нояб. 2018 г. в 16:50, Chris Olivier <[email protected]>:
>>
>>> Do you not work on CI mostly? My apologies for thinking that was some
>>> sort
>>> of team effort between you and a few others that were passionate about CI
>>> keeping the CI system running smoothly.
>>>
>>> You have source code, you have the line the assertion is on. If you can’t
>>> describe what’s going wrong that causes the assertion, then I don’t
>>> really
>>> have anything more to add to this conversation beyond what’s below:
>>>
>>> The whole “mixing omp libraries” is something that occurs in production
>>> every day and certainly in everything that uses mkl.  It may occasionally
>>> cause problems for some edge cases when there is super-complex linking
>>> strategies and dynamic loading.  But this is not one of those edge cases.
>>> Mostly blaming this is a red herring for other thread-related problems
>>> and
>>> people switch omp library and the timing of their code changes and they
>>> stop seeing the problem. I’ve spent my entire career doing heavily
>>> multiphreaded c++ development and i’ve seen that a million times.  is the
>>> suggestion that libiomp be removed from mkl? have you spoken with intel?
>>> have you consulted Intel at all?
>>>
>>> and what you are seeing isn’t some “hard to debug random crash”. you’re
>>> seeing an assertion which is probably related to omp trying to create a
>>> thread pool after a fork and something was done in the mxnet code to make
>>> that sketchy to do. I’d suggest filing an issue with the llvm openmp just
>>> like you’d file with any other not-well-understood behavior in mxnet.
>>>
>>> The lack of root-causing the problem and knee-jerk solution here makes me
>>> uncomfortable.
>>>
>>> if you want to see performance differences there’s an environment
>>> variable
>>> you can set in the mxnet omp tuning code that will print overhead and
>>> execution times for the current omp library.
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> On Thu, Nov 22, 2018 at 7:12 AM Anton Chernov <[email protected]>
>>> wrote:
>>>
>>> > Hi Chris,
>>> >
>>> > Thank you for your answer. If you have noticed the initial email comes
>>> from
>>> > me, Anton Chernov (@lebeg on Github) and thus the proposal is not from
>>> any
>>> > 'Ci' team that you've mentioned, but from me personally.
>>> >
>>> > You are writing:
>>> >
>>> > > someone is doing something unhealthy when they fork ...
>>> >
>>> > I'm missing any context to understand what you mean.
>>> >
>>> > > we get a lot of performance gain from OMP ...
>>> >
>>> > There is no data that would prove this statement and therefore it is a
>>> > random guess.
>>> >
>>> > > in many months, no investigation has occurred as to WHY the
>>> assertion is
>>> > failing.
>>> >
>>> > The investigation has concluded that this is happening due to undefined
>>> > behaviour which is, in my opinion, a suffient answer that does not
>>> require
>>> > to go any deeper.
>>> >
>>> > > the pr is vetoed until such a time that the actual root cause of the
>>> > problem is known.
>>> >
>>> > And considering the statements above there is no valid reason to veto
>>> the
>>> > PR.
>>> >
>>> >
>>> > Best
>>> > Anton
>>> >
>>> > чт, 22 нояб. 2018 г. в 15:38, Chris Olivier <[email protected]>:
>>> >
>>> > > 3x less overhead*
>>> > >
>>> > > On Thu, Nov 22, 2018 at 6:25 AM Chris Olivier <[email protected]
>>> >
>>> > > wrote:
>>> > >
>>> > > > someone is doing something unhealthy when they fork, which is
>>> causing
>>> > an
>>> > > > assertion in the openmp library. the same assertion that would
>>> fire in
>>> > > mkl,
>>> > > > which is linked to libiomp5 (exact same omp library). this is new
>>> > > behavior
>>> > > > and most likely due to an error or suboptimal approach in the
>>> forking
>>> > > logic
>>> > > > in mxnet.
>>> > > >
>>> > > > in order to circumvent the assert, the Ci team is proposing to
>>> remove
>>> > the
>>> > > > library completely which is equivalent to cutting off your leg to
>>> make
>>> > > the
>>> > > > pain from stubbing your toe go away.
>>> > > >
>>> > > > we get a lot of performance gain from OMP. is has about a 1/3 less
>>> > > > overhead for entering omp regions and also supports omp regions
>>> after a
>>> > > > fork, which libgomp does not.
>>> > > >
>>> > > > in many months, no investigation has occurred as to WHY the
>>> assertion
>>> > is
>>> > > > failing.
>>> > > >
>>> > > > the pr is vetoed until such a time that the actual root cause of
>>> the
>>> > > > problem is known.
>>> > > >
>>> > > >
>>> > > > thanks,
>>> > > >
>>> > > > -Chris.
>>> > > >
>>> > > >
>>> > > >
>>> > > >
>>> > > > On Thu, Nov 22, 2018 at 4:36 AM Anton Chernov <[email protected]
>>> >
>>> > > wrote:
>>> > > >
>>> > > >> Dear MXNet community,
>>> > > >>
>>> > > >> I would like to drive attention to an important issue that is
>>> present
>>> > in
>>> > > >> the MXNet CMake build: usage of bundled llvm OpenMP library.
>>> > > >>
>>> > > >> I have opened a PR to remove it:
>>> > > >> https://github.com/apache/incubator-mxnet/pull/12160
>>> > > >>
>>> > > >> The issue was closed, but I am strong in my oppinion that it's the
>>> > right
>>> > > >> thing to do.
>>> > > >>
>>> > > >> *Background*
>>> > > >> If you want to use OpenMP pragmas in your code for
>>> parallelization you
>>> > > >> would supply a special flag to the compiler:
>>> > > >>
>>> > > >> - Clang / -fopenmp
>>> > > >> https://openmp.llvm.org/
>>> > > >>
>>> > > >> - GCC / -fopenmp
>>> > > >> https://gcc.gnu.org/onlinedocs/libgomp/Enabling-OpenMP.html
>>> > > >>
>>> > > >> - Intel / [Q]openmp
>>> > > >>
>>> > > >>
>>> > >
>>> >
>>> https://software.intel.com/en-us/node/522689#6E24682E-F411-4AE3-A04D-ECD81C7008D1
>>> > > >>
>>> > > >> - Visual Studio: /openmp (Enable OpenMP 2.0 Support)
>>> > > >> https://msdn.microsoft.com/en-us/library/tt15eb9t.aspx
>>> > > >>
>>> > > >> Each of the compilers would enable the '#pragma omp' directive
>>> during
>>> > > >> C/C++
>>> > > >> compilation and arrange for automatic linking of the OpenMP
>>> runtime
>>> > > >> library
>>> > > >> supplied by each complier separately.
>>> > > >>
>>> > > >> Thus, to use the advantages of an OpenMP implementation one has to
>>> > > compile
>>> > > >> the code with the corresponding compiler.
>>> > > >>
>>> > > >> Currently, in MXNet CMake build scripts a bundled version of llvm
>>> > OpenMP
>>> > > >> is
>>> > > >> used ([1] and [2]) to replace the OpenMP library supplied by the
>>> > > compiler.
>>> > > >>
>>> > > >> I will quote here the README from the MKL-DNN (Intel(R) Math
>>> Kernel
>>> > > >> Library
>>> > > >> for Deep Neural Networks):
>>> > > >>
>>> > > >> "Intel MKL-DNN uses OpenMP* for parallelism and requires an OpenMP
>>> > > runtime
>>> > > >> library to work. As different OpenMP runtimes may not be binary
>>> > > compatible
>>> > > >> it's important to ensure that only one OpenMP runtime is used
>>> > throughout
>>> > > >> the application. Having more than one OpenMP runtime initialized
>>> may
>>> > > lead
>>> > > >> to undefined behavior resulting in incorrect results or crashes."
>>> [3]
>>> > > >>
>>> > > >> And:
>>> > > >>
>>> > > >> "Using GNU compiler with -fopenmp and -liomp5 options will link
>>> the
>>> > > >> application with both Intel and GNU OpenMP runtime libraries. This
>>> > will
>>> > > >> lead to undefined behavior of the application." [4]
>>> > > >>
>>> > > >> As can be seen from ldd for MXNet:
>>> > > >>
>>> > > >> $ ldd build/tests/mxnet_unit_tests | grep omp
>>> > > >>     libomp.so =>
>>> > /.../mxnet/build/3rdparty/openmp/runtime/src/libomp.so
>>> > > >> (0x00007f697bc55000)
>>> > > >>     libgomp.so.1 => /usr/lib/x86_64-linux-gnu/libgomp.so.1
>>> > > >> (0x00007f69660cd000)
>>> > > >>
>>> > > >> *Performance*
>>> > > >>
>>> > > >> The only performance data related to OpenMP in MXNet I was able to
>>> > find
>>> > > is
>>> > > >> here:
>>> > > >>
>>> > > >>
>>> > >
>>> >
>>> https://github.com/apache/incubator-mxnet/issues/9744#issuecomment-367711172
>>> > > >>
>>> > > >> Which in my understanding is testing imact of different
>>> environment
>>> > > >> variables for the same setup (using same bundled OpenMP library).
>>> > > >>
>>> > > >> The libraries may differ in implementation and the Thread Affinity
>>> > > >> Interface [5] may have significant impact on performance.
>>> > > >>
>>> > > >> All compliers support it:
>>> > > >>
>>> > > >> - Clang / KMP_AFFINITY
>>> > > >>
>>> > > >>
>>> > >
>>> >
>>> https://github.com/clang-ykt/openmp/blob/master/runtime/src/kmp_affinity.cpp
>>> > > >>
>>> > > >> - GCC / GOMP_CPU_AFFINITY
>>> > > >>
>>> > > >>
>>> > >
>>> >
>>> https://gcc.gnu.org/onlinedocs/gcc-4.7.1/libgomp/GOMP_005fCPU_005fAFFINITY.html
>>> > > >>
>>> > > >> - Intel / KMP_AFFINITY
>>> > > >>
>>> > > >>
>>> > >
>>> >
>>> https://software.intel.com/en-us/node/522689#6E24682E-F411-4AE3-A04D-ECD81C7008D1
>>> > > >>
>>> > > >> - Visual Studio / SetThreadAffinityMask
>>> > > >>
>>> > > >>
>>> > >
>>> >
>>> https://docs.microsoft.com/en-us/windows/desktop/api/winbase/nf-winbase-setthreadaffinitymask
>>> > > >>
>>> > > >> *Issues*
>>> > > >>
>>> > > >> Failed OpenMP assertion when loading MXNet compiled with DEBUG=1
>>> > > >> https://github.com/apache/incubator-mxnet/issues/10856
>>> > > >>
>>> > > >> libomp.so dependency (need REAL fix)
>>> > > >> https://github.com/apache/incubator-mxnet/issues/11417
>>> > > >>
>>> > > >> mxnet-mkl (v0.12.0) crash when using (conda-installed) numpy with
>>> MKL
>>> > > >> https://github.com/apache/incubator-mxnet/issues/8532
>>> > > >>
>>> > > >> Performance regression when OMP_NUM_THREADS environment variable
>>> is
>>> > not
>>> > > >> set
>>> > > >> https://github.com/apache/incubator-mxnet/issues/9744
>>> > > >>
>>> > > >> Poor concat CPU performance on CUDA builds
>>> > > >> https://github.com/apache/incubator-mxnet/issues/11905
>>> > > >>
>>> > > >> I would appreciate hearing your thoughts.
>>> > > >>
>>> > > >>
>>> > > >> Best
>>> > > >> Anton
>>> > > >>
>>> > > >> [1]
>>> > > >>
>>> > > >>
>>> > >
>>> >
>>> https://github.com/apache/incubator-mxnet/blob/master/CMakeLists.txt#L400-L405
>>> > > >> [2]
>>> https://github.com/apache/incubator-mxnet/tree/master/3rdparty
>>> > > >> [3]
>>> https://github.com/intel/mkl-dnn/blame/master/README.md#L261-L265
>>> > > >> [4]
>>> https://github.com/intel/mkl-dnn/blame/master/README.md#L278-L280
>>> > > >> [5] https://software.intel.com/en-us/node/522691
>>> > > >>
>>> > > >
>>> > >
>>> >
>>>
>>

Re: [Discussion] Remove bundled llvm OpenMP

Reply via email to