Re: Performance regression from removing libiomp5.so

2019-12-12 Thread Lausen, Leonard
That error should be fixed by Chris's work at 
https://github.com/apache/incubator-mxnet/pull/17039

It is currently expected that libmxnet.so transitively requires both libomp.so
and libgomp.so. If this is an issue, we need to build OpenBLAS from source as
part of our build scripts, because the prebuilt OpenBLAS is what introduces the
libgomp.so requirement.
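
As a quick way to check what a given binary actually pulls in, here is a
minimal Python sketch; the libmxnet.so path is an assumption to adjust for
your install, and it shells out to the Linux ldd tool:

    import subprocess

    # Path to the installed libmxnet.so is an assumption; adjust for your setup.
    LIBMXNET = "/usr/local/lib/libmxnet.so"

    # ldd lists the shared objects libmxnet.so resolves at load time (Linux only).
    out = subprocess.run(["ldd", LIBMXNET], capture_output=True,
                         text=True, check=True).stdout

    # Seeing more than one of these together means two OpenMP runtimes are
    # being pulled in (e.g. LLVM libomp plus libgomp via a prebuilt OpenBLAS).
    omp_lines = [line.strip() for line in out.splitlines()
                 if any(rt in line for rt in ("libomp", "libgomp", "libiomp5"))]
    print("\n".join(omp_lines) or "no OpenMP runtime found")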

Are there any other known issues currently?

@Sam, do we have any performance comparison with the CMake build of 1.6
(+ #17039 fix backported)?

Chris also mentioned he still has a follow-up PR; a rough sketch of the pattern
he describes follows the quote:

> I actually also have a change (removed from this PR in order to keep it
> simple) that stops forcing the creation of the engine during static init, and
> of the fork handlers, if that is desired. This has a couple of effects,
> including reducing the overhead of resources and threads created simply by
> the library being loaded, which then, in turn, causes a lot of thread churn
> (Stop(), Start()) when forking. Even if they never use mxnet, we spin up a
> lot of stuff unnecessarily. It actually gets a lot faster on startup when
> this is done.
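
For illustration only, a rough Python analogue of that pattern (hypothetical
names, not MXNet's actual C++ engine code) could look like this, deferring
engine creation and the at-fork handlers to first use instead of static init:

    import os

    class Engine:
        """Hypothetical stand-in for an execution engine that owns worker threads."""
        def stop(self):
            print("Stop(): tearing down worker threads before a fork")
        def start(self):
            print("Start(): recreating worker threads after a fork")

    _engine = None

    def get_engine():
        # The engine and its fork handlers only come into existence on first
        # use, so a process that merely loads the library never spins up
        # threads and never pays the Stop()/Start() churn when it forks.
        global _engine
        if _engine is None:
            _engine = Engine()
            os.register_at_fork(before=_engine.stop,
                                after_in_parent=_engine.start,
                                after_in_child=_engine.start)
        return _engine

Here os.register_at_fork merely stands in for the pthread_atfork-style handlers
of the real engine; the point is that a process which never touches the engine
never registers them.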

Best regards
Leonard

On Fri, 2019-12-13 at 13:34 +0800, Tao Lv wrote:
> Hi Chris,
> 
> From the licensing standpoint, llvm omp is indeed a choice. But previously
> we noticed that building mxnet with cmake and the llvm omp under the 3rdparty
> folder will cause two runtimes to be linked into libmxnet.so [1]. Do you think
> that's still a problem?
> 
> Also, with the current two build systems, linking llvm omp means we need to
> move the binary release process from make to cmake, which I think needs more
> discussion in the community. It's not likely we can finish that within the
> schedule of the 1.6.0 release.
> 
> [1]
> https://github.com/apache/incubator-mxnet/issues/11417#issuecomment-555413002
> 
> On Thu, Dec 12, 2019 at 10:28 PM Chris Olivier 
> wrote:
> 
> > Hi Patric,
> > 
> > The llvm openmp we compile (originally from the same Intel source, as we
> > all know) seems to be Apache 2.0 licensed. Could we use that instead from
> > a licensing standpoint?
> > 
> > On Wed, Dec 11, 2019 at 10:36 PM Zhao, Patric 
> > wrote:
> > 
> > > Thanks, Sam.
> > > 
> > > The root cause is the different OpenMP libraries. Intel OpenMP will
> > > provide better performance, as your data shows.
> > > 
> > > Regarding the release, because of the license issue [1], we can't ship
> > > Intel OpenMP in the binary, but most of the performance boost from MKLDNN
> > > is still available.
> > > I think it should be acceptable to release 1.6 with MKLDNN + GNU OpenMP,
> > > accepting the suboptimal performance.
> > > 
> > > To achieve the best performance, users should build from source to enable
> > > more advanced features like Intel MKL, Intel OpenMP, and AVX512.
> > > 
> > > Thanks,
> > > 
> > > --Patric
> > > 
> > > [1] https://www.apache.org/legal/resolved.html#category-x
> > > 
> > > 
> > > 
> > > > -Original Message-
> > > > From: Skalicky, Sam 
> > > > Sent: Wednesday, December 11, 2019 1:36 PM
> > > > To: dev@mxnet.incubator.apache.org
> > > > Cc: Keshavan, Arjuna ; Harish, Nihal
> > > > 
> > > > Subject: Performance regression from removing libiomp5.so
> > > > 
> > > > Hi MXNet community,
> > > > 
> > > > I would like to bring to your attention the performance regression that
> > > > was found [1] between 1.5.1 and 1.6.0 after removing the libiomp5.so
> > > > library for licensing reasons. This change was made because the library
> > > > has a Category X license [2] that is not compatible with the MXNet
> > > > Apache license/distribution.
> > > > 
> > > > We found that using OpenBLAS instead of MKL BLAS caused a drop from
> > > > 1500 samples/sec to 1300 samples/sec, a 13.3% regression in training
> > > > speed, for a resnet18 training benchmark on a C5.18xlarge EC2 instance
> > > > (with 72 cores). Rebuilding with MKL BLAS increased performance to
> > > > 1600 samples/sec on the 1.6.0 branch.
> > > > 
> > > > Please provide your feedback on the licensing issue (are there any
> > > > workarounds?) and on the performance tradeoff (is the benefit worth
> > > > trying to include the library back into MXNet builds?).
> > > > 
> > > > Thanks to the efforts of the following folks for working on this issue
> > > > (in no particular order):
> > > > Patric Zhao
> > > > Amol Lele
> > > > Tao Lv A
> > > > Pedro Larroy
> > > > Nihal Harish
> > > > Chai Bapat
> > > > Arjuna Keshavan
> > > > Rong Zhang
> > > > 
> > > > Thanks!
> > > > Sam
> > > > 
> > > > [1] https://github.com/apache/incubator-mxnet/issues/16891
> > > > [2] https://www.apache.org/legal/resolved.html#category-x


Re: Performance regression from removing libiomp5.so

2019-12-12 Thread Tao Lv
Hi Chris,

From the licensing standpoint, llvm omp is indeed a choice. But previously
we noticed that building mxnet with cmake and the llvm omp under the 3rdparty
folder will cause two runtimes to be linked into libmxnet.so [1]. Do you think
that's still a problem?

Also, with the current two build systems, linking llvm omp means we need to
move the binary release process from make to cmake, which I think needs more
discussion in the community. It's not likely we can finish that within the
schedule of the 1.6.0 release.

[1]
https://github.com/apache/incubator-mxnet/issues/11417#issuecomment-555413002

On Thu, Dec 12, 2019 at 10:28 PM Chris Olivier 
wrote:

> Hi Patric,
>
> The llvm openmp we compile (originally from the same Intel source, as we all
> know) seems to be Apache 2.0 licensed. Could we use that instead from a
> licensing standpoint?
>
> On Wed, Dec 11, 2019 at 10:36 PM Zhao, Patric 
> wrote:
>
> > Thanks, Sam.
> >
> > The root cause is the different OpenMP libraries. Intel OpenMP will
> > provide better performance, as your data shows.
> >
> > Regarding the release, because of the license issue [1], we can't ship
> > Intel OpenMP in the binary, but most of the performance boost from MKLDNN
> > is still available.
> > I think it should be acceptable to release 1.6 with MKLDNN + GNU OpenMP,
> > accepting the suboptimal performance.
> >
> > To achieve the best performance, users should build from source to enable
> > more advanced features like Intel MKL, Intel OpenMP, and AVX512.
> >
> > Thanks,
> >
> > --Patric
> >
> > [1] https://www.apache.org/legal/resolved.html#category-x
> >
> >
> >
> > > -Original Message-
> > > From: Skalicky, Sam 
> > > Sent: Wednesday, December 11, 2019 1:36 PM
> > > To: dev@mxnet.incubator.apache.org
> > > Cc: Keshavan, Arjuna ; Harish, Nihal
> > > 
> > > Subject: Performance regression from removing libiomp5.so
> > >
> > > Hi MXNet community,
> > >
> > > I would like to bring to your attention the performance regression that
> > > was found [1] between 1.5.1 and 1.6.0 after removing the libiomp5.so
> > > library for licensing reasons. This change was made because the library
> > > has a Category X license [2] that is not compatible with the MXNet
> > > Apache license/distribution.
> > >
> > > We found that using OpenBLAS instead of MKL BLAS caused a drop from
> > > 1500 samples/sec to 1300 samples/sec, a 13.3% regression in training
> > > speed, for a resnet18 training benchmark on a C5.18xlarge EC2 instance
> > > (with 72 cores). Rebuilding with MKL BLAS increased performance to
> > > 1600 samples/sec on the 1.6.0 branch.
> > >
> > > Please provide your feedback on the licensing issue (are there any
> > > workarounds?) and on the performance tradeoff (is the benefit worth
> > > trying to include the library back into MXNet builds?).
> > >
> > > Thanks to the efforts of the following folks for working on this issue
> > > (in no particular order):
> > > Patric Zhao
> > > Amol Lele
> > > Tao Lv A
> > > Pedro Larroy
> > > Nihal Harish
> > > Chai Bapat
> > > Arjuna Keshavan
> > > Rong Zhang
> > >
> > > Thanks!
> > > Sam
> > >
> > > [1] https://github.com/apache/incubator-mxnet/issues/16891
> > > [2] https://www.apache.org/legal/resolved.html#category-x
> >
>


Re: Performance regression from removing libiomp5.so

2019-12-12 Thread Chris Olivier
Hi Patric,

The llvm openmp we compile (originally from the same Intel source, as we all
know) seems to be Apache 2.0 licensed. Could we use that instead from a
licensing standpoint?

On Wed, Dec 11, 2019 at 10:36 PM Zhao, Patric  wrote:

> Thanks, Sam.
>
> The root cause is the different OpenMP libraries. Intel OpenMP will provide
> better performance, as your data shows.
>
> Regarding the release, because of the license issue [1], we can't ship Intel
> OpenMP in the binary, but most of the performance boost from MKLDNN is still
> available.
> I think it should be acceptable to release 1.6 with MKLDNN + GNU OpenMP,
> accepting the suboptimal performance.
>
> To achieve the best performance, users should build from source to enable
> more advanced features like Intel MKL, Intel OpenMP, and AVX512.
>
> Thanks,
>
> --Patric
>
> [1] https://www.apache.org/legal/resolved.html#category-x
>
>
>
> > -Original Message-
> > From: Skalicky, Sam 
> > Sent: Wednesday, December 11, 2019 1:36 PM
> > To: dev@mxnet.incubator.apache.org
> > Cc: Keshavan, Arjuna ; Harish, Nihal
> > 
> > Subject: Performance regression from removing libiomp5.so
> >
> > Hi MXNet community,
> >
> > I would like to bring to your attention the performance regression that
> > was found [1] between 1.5.1 and 1.6.0 after removing the libiomp5.so
> > library for licensing reasons. This change was made because the library
> > has a Category X license [2] that is not compatible with the MXNet Apache
> > license/distribution.
> >
> > We found that using OpenBLAS instead of MKL BLAS caused a drop from
> > 1500 samples/sec to 1300 samples/sec, a 13.3% regression in training
> > speed, for a resnet18 training benchmark on a C5.18xlarge EC2 instance
> > (with 72 cores). Rebuilding with MKL BLAS increased performance to
> > 1600 samples/sec on the 1.6.0 branch.
> >
> > Please provide your feedback on the licensing issue (are there any
> > workarounds?) and on the performance tradeoff (is the benefit worth trying
> > to include the library back into MXNet builds?).
> >
> > Thanks to the efforts of the following folks for working on this issue
> > (in no particular order):
> > Patric Zhao
> > Amol Lele
> > Tao Lv A
> > Pedro Larroy
> > Nihal Harish
> > Chai Bapat
> > Arjuna Keshavan
> > Rong Zhang
> >
> > Thanks!
> > Sam
> >
> > [1] https://github.com/apache/incubator-mxnet/issues/16891
> > [2] https://www.apache.org/legal/resolved.html#category-x
>