Re: Dependency Update

2019-05-22 Thread Jake Lee
Thanks Aaron, that's a great suggestion.
The reason I put it under tools/dependencies is that the doc is
intended for developers who want to contribute to updating the
dependencies of our PyPI package.
Regarding the CI, I'm also working on upgrading the CUDA/cuDNN versions
that CI uses - PRs [1][2].

Thanks,
Jake

[1] https://github.com/apache/incubator-mxnet/pull/14986
[2] https://github.com/apache/incubator-mxnet/pull/14950
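
As the thread below notes, the same version pins can end up in several
places (website install instructions, /tools, /ci). One way to limit drift
is to keep them in a single structure and generate the other formats from
it - a minimal sketch, where the package names and ranges are illustrative,
not MXNet's actual pins:

```python
# Hedged sketch: a single source of truth for dependency pins, so the
# website docs, /tools and /ci folders don't drift apart.
# (Names and version ranges below are illustrative assumptions.)
PINS = {
    "graphviz": ">=0.8.1,<0.9.0",
    "opencv-python": "<4.2",
    "numpy": ">1.16.0,<2.0.0",
}

def as_requirements(pins):
    """Render the pins as requirements.txt-style lines, sorted by name."""
    return ["%s%s" % (name, spec) for name, spec in sorted(pins.items())]

for line in as_requirements(PINS):
    print(line)
```

The same dict could equally feed a docs template or a CI script, so the
numbers only ever change in one place.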

On Wed, May 22, 2019 at 3:15 PM Aaron Markham 
wrote:

> Thanks for taking a thorough look at the version ranges. I have this PR
> [1] waiting for review that tries to pin graphviz and opencv, and it
> updates CI as well as the docs that go on the website.
> I think your updates would be beneficial in the docs that go on the
> website and should also update CI. Is there a benefit to having them
> as a readme in /tools? Doesn't this create extra maintenance with
> these version numbers being in three places (website install
> instructions, /tools folder, /ci folder)?
>
> [1] https://github.com/apache/incubator-mxnet/pull/14987
>
> On Wed, May 22, 2019 at 2:31 PM Qing Lan  wrote:
> >
> >
> > Great work Jake! The content on the CPU/GPU build instructions is really
> helpful.
> >
> > Thanks,
> > Qing
> >
> > 
> > From: Jake Lee 
> > Sent: Wednesday, May 22, 2019 17:26
> > To: dev@mxnet.incubator.apache.org
> > Subject: Dependency Update
> >
> > Dear Community,
> >
> > I have been working on dependency updates for MXNet. The goal is to
> > upgrade the dependencies that have known security vulnerabilities and to
> > let MXNet benefit from the latest CUDA, cuDNN and NCCL software. I
> > documented the process on PR<
> > https://github.com/apache/incubator-mxnet/pull/15045>. Big thanks to
> Sheng
> > Zha(szha), Dick Carter(DickJC123), Anirudh Subramanian(anirudh2290), Qing
> > Lan(lanking520), Per Goncalves da Silva(perdasilva) for supporting me!
> >
> > Thanks,
> > Jake
>


Re: Dependency Update

2019-05-22 Thread Aaron Markham
Thanks for taking a thorough look at the version ranges. I have this PR
[1] waiting for review that tries to pin graphviz and opencv, and it
updates CI as well as the docs that go on the website.
I think your updates would be beneficial in the docs that go on the
website and should also update CI. Is there a benefit to having them
as a readme in /tools? Doesn't this create extra maintenance with
these version numbers being in three places (website install
instructions, /tools folder, /ci folder)?

[1] https://github.com/apache/incubator-mxnet/pull/14987

On Wed, May 22, 2019 at 2:31 PM Qing Lan  wrote:
>
>
> Great work Jake! The content on the CPU/GPU build instructions is really helpful.
>
> Thanks,
> Qing
>
> 
> From: Jake Lee 
> Sent: Wednesday, May 22, 2019 17:26
> To: dev@mxnet.incubator.apache.org
> Subject: Dependency Update
>
> Dear Community,
>
> I have been working on dependency updates for MXNet. The goal is to upgrade
> the dependencies that have known security vulnerabilities and to let MXNet
> benefit from the latest CUDA, cuDNN and NCCL software. I
> documented the process on PR<
> https://github.com/apache/incubator-mxnet/pull/15045>. Big thanks to Sheng
> Zha(szha), Dick Carter(DickJC123), Anirudh Subramanian(anirudh2290), Qing
> Lan(lanking520), Per Goncalves da Silva(perdasilva) for supporting me!
>
> Thanks,
> Jake


Re: [DISCUSS] 1.5.0 Release Plan

2019-05-22 Thread Lai Wei
Hi @dev,

Thanks for working hard on the 1.5 release. Since there have been several
release blockers (mostly fixed), we are extending the code freeze to Friday
05/22/2019. Right now we are tracking the following 5 open
PRs [1][2][3][4][5] and 1 issue [6]. Please let us know if you need more
time.

I would like to encourage all downstream projects to test with the latest
MXNet to avoid any incompatibility in the coming 1.5.0 release. If you have
any issues that may block the release, please let us know.
Thank you very much.

[1] https://github.com/apache/incubator-mxnet/pull/14713
[2] https://github.com/apache/incubator-mxnet/pull/14893
[3] https://github.com/apache/incubator-mxnet/pull/15031
[4] https://github.com/apache/incubator-mxnet/pull/15039
[5] https://github.com/apache/incubator-mxnet/pull/15041
[6] https://github.com/apache/incubator-mxnet/issues/15034


Best Regards

Lai


On Wed, May 15, 2019 at 9:05 PM Junru Shao  wrote:

> Hi folks,
>
> Here I may have a release blocker for 1.5.0 concerning the implementation
> of the dynamic shape mechanism, which somehow conflicts with Gluon's
> deferred initialization [1].
>
> [1] https://github.com/dmlc/gluon-nlp/issues/706
>
> On Wed, May 15, 2019 at 12:09 PM Anirudh Subramanian <
> anirudh2...@gmail.com>
> wrote:
>
> > Hi Lai,
> >
> > From the discussion I had with Nvidia offline, they are targeting
> pushing
> > the required changes today.
> > Since this is an important feature for the release, if it gets delayed
> > and cannot be merged by 05/17/2019, the code freeze date may need to be
> > changed.
> >
> > Anirudh
> >
> > On Wed, May 15, 2019 at 1:23 AM Lv, Tao A  wrote:
> >
> > > Hi dev,
> > >
> > > We see there are several GitHub issues [1][2][3][4] about the MXNet
> > > Windows build experience. The team is working intensively [5][6][7] to
> > > fix some problems with the MKL-DNN build on Windows. We hope these
> > > fixes can make the code freeze and land in the 1.5.0 release.
> > >
> > > The PR against mshadow (#374) was already merged and MXNet PR #14877 is
> > > under review - great thanks to the CI team for helping with the MKL
> > > installation request. PR #14952 is a documentation change reflecting
> > > the build logic changes in PR #14877, so I think these two PRs should
> > > be merged simultaneously. Currently #14877 is experiencing a CI
> > > response problem.
> > >
> > > Please take your time to have a look at these two PRs. Your comments
> and
> > > suggestions are highly appreciated.
> > >
> > > Thanks,
> > > -tao
> > >
> > > [1] https://github.com/apache/incubator-mxnet/issues/14670
> > > [2] https://github.com/apache/incubator-mxnet/issues/14335
> > > [3] https://github.com/apache/incubator-mxnet/issues/14203
> > > [4] https://github.com/apache/incubator-mxnet/issues/14085
> > > [5] https://github.com/apache/incubator-mxnet/pull/14877
> > > [6] https://github.com/dmlc/mshadow/pull/374
> > > [7] https://github.com/apache/incubator-mxnet/pull/14952
> > >
> > > -Original Message-
> > > From: Lai Wei [mailto:roywei...@gmail.com]
> > > Sent: Wednesday, May 15, 2019 2:57 PM
> > > To: dev@mxnet.incubator.apache.org
> > > Subject: Re: [DISCUSS] 1.5.0 Release Plan
> > >
> > > Hi Anirudh,
> > >
> > > I see there was an offline discussion
> > > <
> > >
> >
> https://github.com/apache/incubator-mxnet/pull/14173#pullrequestreview-235846341
> > > >
> > > and I have updated the AMP feature and your project on the release
> > tracker
> > > <
> > >
> >
> https://cwiki.apache.org/confluence/display/MXNET/1.5.0+Release+Plan+and+Status
> > > >
> > > ,
> > > Please let me know if you have any updates.
> > >
> > > Hi @dev,
> > > This is a gentle reminder that the code freeze for the 1.5.0 release is
> > > on 05/17/2019; please let us know if you have any WIP pull requests
> > > aiming for 1.5.0 that need attention.
> > > Please understand we already have around 650 commits in master that
> > > need to be released in time. We know the TensorRT test in CI is
> > > failing and are trying to fix it. Meanwhile please update the tracker
> > > if there is any change:
> > >
> > >
> >
> https://cwiki.apache.org/confluence/display/MXNET/1.5.0+Release+Plan+and+Status
> > >
> > > Thanks!
> > >
> > > Lai
> > >
> > >
> > > On Wed, May 8, 2019 at 11:58 AM Anirudh Subramanian <
> > anirudh2...@gmail.com
> > > >
> > > wrote:
> > >
> > > > Hi Sheng,
> > > >
> > > > I had a discussion with Nvidia folks offline today (@ptrendx et al.).
> > > > I strongly feel that the AMP feature should be included as part of
> > > > the release: https://github.com/apache/incubator-mxnet/pull/14173 .
> > > > The PR is aimed at completion next week, but reviews and RFC
> > > > discussions may take some time. I would request extending the release
> > > > code freeze by 2 weeks.
> > > > Also, I would like to include
> > > >
> > > >
> https://cwiki.apache.org/confluence/display/MXNET/Conversion+from+FP32
> > > > +to+Mixed+Precision+Models
> > > > which
> > > > depends on the AMP PR.
> 

Re: Dependency Update

2019-05-22 Thread Qing Lan

Great work Jake! The content on the CPU/GPU build instructions is really helpful.

Thanks,
Qing


From: Jake Lee 
Sent: Wednesday, May 22, 2019 17:26
To: dev@mxnet.incubator.apache.org
Subject: Dependency Update

Dear Community,

I have been working on dependency updates for MXNet. The goal is to upgrade
the dependencies that have known security vulnerabilities and to let MXNet
benefit from the latest CUDA, cuDNN and NCCL software. I
documented the process on PR<
https://github.com/apache/incubator-mxnet/pull/15045>. Big thanks to Sheng
Zha(szha), Dick Carter(DickJC123), Anirudh Subramanian(anirudh2290), Qing
Lan(lanking520), Per Goncalves da Silva(perdasilva) for supporting me!

Thanks,
Jake


Dependency Update

2019-05-22 Thread Jake Lee
Dear Community,

I have been working on dependency updates for MXNet. The goal is to upgrade
the dependencies that have known security vulnerabilities and to let MXNet
benefit from the latest CUDA, cuDNN and NCCL software. I
documented the process on PR<
https://github.com/apache/incubator-mxnet/pull/15045>. Big thanks to Sheng
Zha(szha), Dick Carter(DickJC123), Anirudh Subramanian(anirudh2290), Qing
Lan(lanking520), Per Goncalves da Silva(perdasilva) for supporting me!

Thanks,
Jake


Re: warnings as errors

2019-05-22 Thread Pedro Larroy
I was not able to fix the warnings about unused local typedefs in the
mshadow type switch; that's one example of a warning I would disable. I
couldn't find a way to solve that one, and the ramifications of an
unused typedef are unlikely to cause bugs in the code - it's more of a
pedantic warning.

https://github.com/apache/incubator-mxnet/pull/13424

I think turning them on one by one is going to pollute the compiler
invocation unnecessarily and may even run into command-line length
limits. I think it is best to enable all warnings as errors and
cherry-pick the ones we can't fix or won't fix on purpose.
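
One way to decide which warning categories to cherry-pick for `-Wno-`
exceptions before flipping on `-Werror` is to tally them from an existing
build log. A hedged sketch - the diagnostic lines below are illustrative
GCC/Clang-style output, not a real MXNet build log:

```python
# Hedged sketch: count occurrences of each [-W<category>] tag in a
# compiler log to decide which categories to whitelist with -Wno-<flag>.
import re
from collections import Counter

def tally_warnings(log_lines):
    """Count occurrences of each [-W<category>] tag in diagnostics."""
    counts = Counter()
    for line in log_lines:
        m = re.search(r'\[-W([a-z0-9-]+)\]', line)
        if m:
            counts[m.group(1)] += 1
    return counts

log = [
    "tensor.cc:42:5: warning: unused local typedef 'DType' [-Wunused-local-typedefs]",
    "op.cc:10:1: warning: control reaches end of non-void function [-Wreturn-type]",
    "tensor.cc:99:5: warning: unused local typedef 'DType' [-Wunused-local-typedefs]",
]
print(tally_warnings(log).most_common(1))
```

Frequent, low-risk categories (like the unused-local-typedefs noise from
the mshadow type switch) become `-Wno-` candidates; everything else stays
a hard error.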

In this other case, I managed to tighten the warnings, but ASAN is
causing some problems:

https://github.com/apache/incubator-mxnet/pull/14850

I think having warning fixes reviewed and merged faster, without
triggering additional refactorings, would make this process easier;
help and contributions in this area would also be greatly
appreciated.

Pedro.

On Tue, May 21, 2019 at 3:49 PM Sheng Zha  wrote:
>
> It would be great to enforce the check for warnings and treat them as errors.
> Some questions I have:
> - what are the warnings that you think should be ignored?
> - for the rest of the warning types, can we turn them on one by one?
>
> -sz
>
> On 2019/05/21 22:33:51, Pedro Larroy  wrote:
> > Hi dev@
> >
> > I try to fix any warning that I see during compilation of MXNet on my
> > platform and with the build toggles that I care about. These seemingly
> > trivial and thankless efforts nonetheless take energy on the
> > contributor's side.
> >
> > I think overall I have myself submitted more than a dozen PRs fixing
> > warnings, and I would like to call for additional help and
> > contributions in this area.
> >
> > There was a question from Lin about discussing this on the mailing
> > list. I have the feeling that everybody agrees on moving towards zero
> > warnings and warnings as errors. I think there are unavoidable
> > warnings that can be disabled specifically, such as the one triggered
> > by the mshadow type switch.
> >
> > Some important warnings we currently miss, such as the warning on
> > missing return values (i.e. forgetting to return from a function
> > returning non-void), cause bugs, danger and additional time spent
> > bug-fixing - time that could be better spent elsewhere.
> >
> > Is there a process we can figure out, such as more expedited merging
> > of PRs fixing warnings, or a specific label?
> >
> > Some simple PRs that fix a warning can take long to merge, and
> > sometimes trigger so much discussion that the process becomes
> > unfriendly to contributors.
> >
> > Any help or constructive ideas on this topic would be appreciated.
> >
> > Pedro.
> >


Re: [Discussion] Remove bundled llvm OpenMP

2019-05-22 Thread Anton Chernov
We are now waiting for a committer's review and merge.

ср, 22 мая 2019 г. в 22:14, Pedro Larroy :

> Thanks Aaron and Anton! Can we rebase to update the PR? Let me know
> how I can help further if you run into problems.
>
> On Wed, May 22, 2019 at 6:49 AM Aaron Markham 
> wrote:
> >
> > I reopened it for you.
> >
> > On Wed, May 22, 2019, 05:25 Anton Chernov  wrote:
> >
> > > I don't have the necessary rights to reopen this PR.
> > >
> > > пн, 20 мая 2019 г. в 08:00, Pedro Larroy  >:
> > >
> > > > Hi Anton, Stas.
> > > >
> > > > Can we reopen this PR and get it merged as per the data collected by
> > > Stas?
> > > >
> > > > https://github.com/apache/incubator-mxnet/pull/12160
> > > >
> > > >
> > > >
> > >
> https://cwiki.apache.org/confluence/display/MXNET/Benchmarking+MXNet+with+different+OpenMP+implementations
> > > >
> > > > There are multiple issues that will be fixed by solving this problem.
> > > >
> > > >
> > > > Pedro
> > > >
> > > > On Tue, Feb 12, 2019 at 4:54 AM Anton Chernov 
> > > wrote:
> > > > >
> > > > > I would like to propose a possible alternative solution for
> > > > consideration.
> > > > >
> > > > > If keeping llvm OpenMP as a submodule is inevitable, one could make
> > > > > the following adjustments:
> > > > >
> > > > > Since compilers try to find their own OpenMP library implicitly,
> MXNet
> > > > > needs to ensure that only the bundled version is found. Therefore
> > > during
> > > > > the build and also during deployment this library has to provide
> > > symlinks
> > > > > for each possible compiler that would link to the built artifact,
> i.e.
> > > > >
> > > > > libiomp.so -> libgomp.so -> libomp.so
> > > > >
> > > > > The MKLML iomp would need to be hidden and removed as well.
> > > > >
> > > > > On Windows it would be a different story, but as can be seen [1]
> > > bundled
> > > > > OpenMP was not included in the Windows build anyway.
> > > > >
> > > > > Alternatively: always use iomp (with same symlinking trick though)
> > > > provided
> > > > > by MKLML distribution [2]. This potentially could work on Windows
> as
> > > > well.
> > > > >
> > > > > Best
> > > > > Anton
> > > > >
> > > > > [1]
> > > > >
> > > >
> > >
> https://github.com/apache/incubator-mxnet/blob/8a63bdecf2d9f12d34fe5874957ae4c867eb5f5b/CMakeLists.txt#L408-L410
> > > > > [2] https://github.com/intel/mkl-dnn/releases
> > > > >
> > > > > вт, 12 февр. 2019 г. в 11:22, Anton Chernov :
> > > > >
> > > > > > Recent benchmarking results have been published here [1].
> Experiments
> > > > > > compare different OpenMP implementations as well as binaries
> compiled
> > > > with
> > > > > > different compilers including GCC, Clang and ICC.
> > > > > >
> > > > > > During experimentation another issue with mixing up libraries
> was
> > > > > > identified and described here [2].
> > > > > >
> > > > > > Best
> > > > > > Anton
> > > > > >
> > > > > > [1] https://cwiki.apache.org/confluence/x/2wclBg
> > > > > > [2]
> > > > > >
> > > >
> > >
> https://github.com/apache/incubator-mxnet/issues/14087#issuecomment-461734041
> > > > > >
> > > > > >
> > > > > > вс, 9 дек. 2018 г. в 16:28, Anton Chernov :
> > > > > >
> > > > > >> Hi Chris,
> > > > > >>
> > > > > >> Following up on the issue, are all things resolved in the
> > > discussion?
> > > > > >>
> > > > > >> If yes, I kindly ask you to reopen this PR and remove
> ‘requesting
> > > > > >> changes’ status:
> > > > > >> https://github.com/apache/incubator-mxnet/pull/12160
> > > > > >>
> > > > > >> Thank you.
> > > > > >>
> > > > > >>
> > > > > >> Best
> > > > > >> Anton
> > > > > >>
> > > > > >>
> > > > > >> вт, 27 нояб. 2018 г. в 17:15, Anton Chernov <
> mecher...@gmail.com>:
> > > > > >>
> > > > > >>> Another thing to take into consideration:
> > > > > >>>
> > > > > >>> All python artifacts that are created (PyPI) are built with
> make
> > > and
> > > > are
> > > > > >>> not using the bundled OpenMP library.
> > > > > >>>
> > > > > >>> One step for the switch to CMake to happen is the approval and
> > > > merging
> > > > > >>> of the mentioned PR:
> > > > > >>>
> > > > > >>> https://github.com/apache/incubator-mxnet/pull/12160
> > > > > >>>
> > > > > >>> If there are no other objections I kindly ask Chris Olivier to
> > > remove
> > > > > >>> his 'requesting changes' veto on it to unblock the CMake
> overhaul
> > > > work.
> > > > > >>>
> > > > > >>> Thank you.
> > > > > >>>
> > > > > >>> Best
> > > > > >>> Anton
> > > > > >>>
> > > > > >>> чт, 22 нояб. 2018 г. в 17:11, Anton Chernov <
> mecher...@gmail.com>:
> > > > > >>>
> > > > > 
> > > > >  Thank you for your answer, Chris.
> > > > > 
> > > > >  > The whole “mixing omp libraries” is something that occurs in
> > > > >  production
> > > > >  every day and certainly in everything that uses mkl.
> > > > > 
> > > > >  I'm afraid this statement is wrong. Intel MKL-DNN strictly
> ensures
> > > > that
> > > > >  this mixture is not happening:
> > > > > 
> > > > >  "Intel MKL-DNN uses 

Re: [Discussion] Remove bundled llvm OpenMP

2019-05-22 Thread Anton Chernov
Great! Thank you, Aaron. I have rebased it.

ср, 22 мая 2019 г. в 15:49, Aaron Markham :

> I reopened it for you.
>
> On Wed, May 22, 2019, 05:25 Anton Chernov  wrote:
>
> > I don't have the necessary rights to reopen this PR.
> >
> > пн, 20 мая 2019 г. в 08:00, Pedro Larroy :
> >
> > > Hi Anton, Stas.
> > >
> > > Can we reopen this PR and get it merged as per the data collected by
> > Stas?
> > >
> > > https://github.com/apache/incubator-mxnet/pull/12160
> > >
> > >
> > >
> >
> https://cwiki.apache.org/confluence/display/MXNET/Benchmarking+MXNet+with+different+OpenMP+implementations
> > >
> > > There are multiple issues that will be fixed by solving this problem.
> > >
> > >
> > > Pedro
> > >
> > > On Tue, Feb 12, 2019 at 4:54 AM Anton Chernov 
> > wrote:
> > > >
> > > > I would like to propose a possible alternative solution for
> > > consideration.
> > > >
> > > > If keeping llvm OpenMP as a submodule is inevitable, one could make
> > > > the following adjustments:
> > > >
> > > > Since compilers try to find their own OpenMP library implicitly,
> MXNet
> > > > needs to ensure that only the bundled version is found. Therefore
> > during
> > > > the build and also during deployment this library has to provide
> > symlinks
> > > > for each possible compiler that would link to the built artifact, i.e.
> > > >
> > > > libiomp.so -> libgomp.so -> libomp.so
> > > >
> > > > The MKLML iomp would need to be hidden and removed as well.
> > > >
> > > > On Windows it would be a different story, but as can be seen [1]
> > bundled
> > > > OpenMP was not included in the Windows build anyway.
> > > >
> > > > Alternatively: always use iomp (with same symlinking trick though)
> > > provided
> > > > by MKLML distribution [2]. This potentially could work on Windows as
> > > well.
> > > >
> > > > Best
> > > > Anton
> > > >
> > > > [1]
> > > >
> > >
> >
> https://github.com/apache/incubator-mxnet/blob/8a63bdecf2d9f12d34fe5874957ae4c867eb5f5b/CMakeLists.txt#L408-L410
> > > > [2] https://github.com/intel/mkl-dnn/releases
> > > >
> > > > вт, 12 февр. 2019 г. в 11:22, Anton Chernov :
> > > >
> > > > > Recent benchmarking results have been published here [1].
> Experiments
> > > > > compare different OpenMP implementations as well as binaries
> compiled
> > > with
> > > > > different compilers including GCC, Clang and ICC.
> > > > >
> > > > > During experimentation another issue with mixing up libraries was
> > > > > identified and described here [2].
> > > > >
> > > > > Best
> > > > > Anton
> > > > >
> > > > > [1] https://cwiki.apache.org/confluence/x/2wclBg
> > > > > [2]
> > > > >
> > >
> >
> https://github.com/apache/incubator-mxnet/issues/14087#issuecomment-461734041
> > > > >
> > > > >
> > > > > вс, 9 дек. 2018 г. в 16:28, Anton Chernov :
> > > > >
> > > > >> Hi Chris,
> > > > >>
> > > > >> Following up on the issue, are all things resolved in the
> > discussion?
> > > > >>
> > > > >> If yes, I kindly ask you to reopen this PR and remove ‘requesting
> > > > >> changes’ status:
> > > > >> https://github.com/apache/incubator-mxnet/pull/12160
> > > > >>
> > > > >> Thank you.
> > > > >>
> > > > >>
> > > > >> Best
> > > > >> Anton
> > > > >>
> > > > >>
> > > > >> вт, 27 нояб. 2018 г. в 17:15, Anton Chernov  >:
> > > > >>
> > > > >>> Another thing to take into consideration:
> > > > >>>
> > > > >>> All python artifacts that are created (PyPI) are built with make
> > and
> > > are
> > > > >>> not using the bundled OpenMP library.
> > > > >>>
> > > > >>> One step for the switch to CMake to happen is the approval and
> > > merging
> > > > >>> of the mentioned PR:
> > > > >>>
> > > > >>> https://github.com/apache/incubator-mxnet/pull/12160
> > > > >>>
> > > > >>> If there are no other objections I kindly ask Chris Olivier to
> > remove
> > > > >>> his 'requesting changes' veto on it to unblock the CMake overhaul
> > > work.
> > > > >>>
> > > > >>> Thank you.
> > > > >>>
> > > > >>> Best
> > > > >>> Anton
> > > > >>>
> > > > >>> чт, 22 нояб. 2018 г. в 17:11, Anton Chernov  >:
> > > > >>>
> > > > 
> > > >  Thank you for your answer, Chris.
> > > > 
> > > >  > The whole “mixing omp libraries” is something that occurs in
> > > >  production
> > > >  every day and certainly in everything that uses mkl.
> > > > 
> > > >  I'm afraid this statement is wrong. Intel MKL-DNN strictly
> ensures
> > > that
> > > >  this mixture is not happening:
> > > > 
> > > >  "Intel MKL-DNN uses OpenMP* for parallelism and requires an
> OpenMP
> > > >  runtime library to work. As different OpenMP runtimes may not be
> > > binary
> > > >  compatible it's important to ensure that only one OpenMP runtime
> > is
> > > used
> > > >  throughout the application. Having more than one OpenMP runtime
> > > initialized
> > > >  may lead to undefined behavior resulting in incorrect results or
> > > crashes."
> > > >  [1]
> > > > 
> > > >  That is why 2 different MKLML libraries are provided:
> > > > 
> > > 
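
The MKL-DNN guidance quoted above - that only one OpenMP runtime should be
loaded throughout the application - can be spot-checked on Linux by
scanning the process's memory map for runtime libraries. A hedged sketch;
the sample maps text is fabricated, and in practice you would read
/proc/self/maps:

```python
# Hedged sketch: detect whether more than one OpenMP runtime (libgomp,
# libiomp5, libomp) appears in a /proc/<pid>/maps listing (Linux-only).
import re

def loaded_omp_runtimes(maps_text):
    """Return the set of distinct OpenMP runtime names found in maps_text."""
    names = set()
    for line in maps_text.splitlines():
        m = re.search(r'(lib(?:gomp|iomp5?|omp)[^/\s]*\.so[\w.]*)', line)
        if m:
            names.add(m.group(1).split('.so')[0])
    return names

# Fabricated sample showing the problematic case: two runtimes at once.
sample = """7f10 r-xp  /usr/lib/x86_64-linux-gnu/libgomp.so.1
7f20 r-xp  /opt/intel/lib/libiomp5.so"""
print(loaded_omp_runtimes(sample))  # more than one -> undefined-behavior risk
```

Running the same scan on `open('/proc/self/maps').read()` inside a Python
process that imports MXNet would show which runtimes actually got pulled in.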

Re: [Discussion] Remove bundled llvm OpenMP

2019-05-22 Thread Pedro Larroy
Thanks Aaron and Anton! Can we rebase to update the PR? Let me know
how I can help further if you run into problems.

On Wed, May 22, 2019 at 6:49 AM Aaron Markham  wrote:
>
> I reopened it for you.
>
> On Wed, May 22, 2019, 05:25 Anton Chernov  wrote:
>
> > I don't have the necessary rights to reopen this PR.
> >
> > пн, 20 мая 2019 г. в 08:00, Pedro Larroy :
> >
> > > Hi Anton, Stas.
> > >
> > > Can we reopen this PR and get it merged as per the data collected by
> > Stas?
> > >
> > > https://github.com/apache/incubator-mxnet/pull/12160
> > >
> > >
> > >
> > https://cwiki.apache.org/confluence/display/MXNET/Benchmarking+MXNet+with+different+OpenMP+implementations
> > >
> > > There are multiple issues that will be fixed by solving this problem.
> > >
> > >
> > > Pedro
> > >
> > > On Tue, Feb 12, 2019 at 4:54 AM Anton Chernov 
> > wrote:
> > > >
> > > > I would like to propose a possible alternative solution for
> > > consideration.
> > > >
> > > > If keeping llvm OpenMP as a submodule is inevitable, one could make
> > > > the following adjustments:
> > > >
> > > > Since compilers try to find their own OpenMP library implicitly, MXNet
> > > > needs to ensure that only the bundled version is found. Therefore
> > during
> > > > the build and also during deployment this library has to provide
> > symlinks
> > > > for each possible compiler that would link to the built artifact, i.e.
> > > >
> > > > libiomp.so -> libgomp.so -> libomp.so
> > > >
> > > > The MKLML iomp would need to be hidden and removed as well.
> > > >
> > > > On Windows it would be a different story, but as can be seen [1]
> > bundled
> > > > OpenMP was not included in the Windows build anyway.
> > > >
> > > > Alternatively: always use iomp (with same symlinking trick though)
> > > provided
> > > > by MKLML distribution [2]. This potentially could work on Windows as
> > > well.
> > > >
> > > > Best
> > > > Anton
> > > >
> > > > [1]
> > > >
> > >
> > https://github.com/apache/incubator-mxnet/blob/8a63bdecf2d9f12d34fe5874957ae4c867eb5f5b/CMakeLists.txt#L408-L410
> > > > [2] https://github.com/intel/mkl-dnn/releases
> > > >
> > > > вт, 12 февр. 2019 г. в 11:22, Anton Chernov :
> > > >
> > > > > Recent benchmarking results have been published here [1]. Experiments
> > > > > compare different OpenMP implementations as well as binaries compiled
> > > with
> > > > > different compilers including GCC, Clang and ICC.
> > > > >
> > > > > During experimentation another issue with mixing up libraries was
> > > > > identified and described here [2].
> > > > >
> > > > > Best
> > > > > Anton
> > > > >
> > > > > [1] https://cwiki.apache.org/confluence/x/2wclBg
> > > > > [2]
> > > > >
> > >
> > https://github.com/apache/incubator-mxnet/issues/14087#issuecomment-461734041
> > > > >
> > > > >
> > > > > вс, 9 дек. 2018 г. в 16:28, Anton Chernov :
> > > > >
> > > > >> Hi Chris,
> > > > >>
> > > > >> Following up on the issue, are all things resolved in the
> > discussion?
> > > > >>
> > > > >> If yes, I kindly ask you to reopen this PR and remove ‘requesting
> > > > >> changes’ status:
> > > > >> https://github.com/apache/incubator-mxnet/pull/12160
> > > > >>
> > > > >> Thank you.
> > > > >>
> > > > >>
> > > > >> Best
> > > > >> Anton
> > > > >>
> > > > >>
> > > > >> вт, 27 нояб. 2018 г. в 17:15, Anton Chernov :
> > > > >>
> > > > >>> Another thing to take into consideration:
> > > > >>>
> > > > >>> All python artifacts that are created (PyPI) are built with make
> > and
> > > are
> > > > >>> not using the bundled OpenMP library.
> > > > >>>
> > > > >>> One step for the switch to CMake to happen is the approval and
> > > merging
> > > > >>> of the mentioned PR:
> > > > >>>
> > > > >>> https://github.com/apache/incubator-mxnet/pull/12160
> > > > >>>
> > > > >>> If there are no other objections I kindly ask Chris Olivier to
> > remove
> > > > >>> his 'requesting changes' veto on it to unblock the CMake overhaul
> > > work.
> > > > >>>
> > > > >>> Thank you.
> > > > >>>
> > > > >>> Best
> > > > >>> Anton
> > > > >>>
> > > > >>> чт, 22 нояб. 2018 г. в 17:11, Anton Chernov :
> > > > >>>
> > > > 
> > > >  Thank you for your answer, Chris.
> > > > 
> > > >  > The whole “mixing omp libraries” is something that occurs in
> > > >  production
> > > >  every day and certainly in everything that uses mkl.
> > > > 
> > > >  I'm afraid this statement is wrong. Intel MKL-DNN strictly ensures
> > > that
> > > >  this mixture is not happening:
> > > > 
> > > >  "Intel MKL-DNN uses OpenMP* for parallelism and requires an OpenMP
> > > >  runtime library to work. As different OpenMP runtimes may not be
> > > binary
> > > >  compatible it's important to ensure that only one OpenMP runtime
> > is
> > > used
> > > >  throughout the application. Having more than one OpenMP runtime
> > > initialized
> > > >  may lead to undefined behavior resulting in incorrect results or
> > > crashes."
> > > >  [1]
> > > > 
> > > >  That 

Re: Report of MXNet NumPy Project Status

2019-05-22 Thread Junru Shao
Nice progress, Jun!

On Wed, May 22, 2019 at 12:12 AM Jun Wu  wrote:

> Dear Community,
>
> A few months ago, we submitted this RFC
>  proposing
> the introduction of a NumPy-compatible coding experience into MXNet. As it
> has been some time since the proposal, we would like to share the progress
> with the community and listen to feedback and suggestions to enhance the
> technical implementation as well as the way the project is operated.
>
> We set our first milestone by tackling the problem of MXNet not supporting
> scalar and zero-size tensors. Last month, we submitted the PR
>  providing the
> infrastructure to support those two types of tensors in MXNet. This work
> has affected almost every file and all language bindings in the MXNet
> codebase. It would have been impossible to provide a complete solution
> without the contributions of many MXNet developers across different
> organizations.
>
> With the infrastructure of supporting scalar and zero-size tensors, we are
> currently working on implementing NumPy operators in MXNet. We created a
> list of operators 
> to be implemented from the D2L book , and hope that we
> will be able to provide full NumPy operator coverage for the book by the
> end of next month.
>
> In the future, we plan to provide NumPy operator support for GluonCV
>  and GluonNLP
> . We also intend to explore the
> opportunities to extend our work to libraries that heavily depend on
> NumPy, not only in the deep learning world but also in the broader data
> science community, where techniques employed by deep learning - such as
> auto differentiation, symbolic programming and GPU computing - can be
> beneficial.
>
> Thank you very much for taking the time to read this email and for caring
> about our efforts to make MXNet a super user-friendly deep learning
> framework. We
> look forward to your comments, suggestions and contributions for this
> project.
>
> Best,
> Developers of MXNet NumPy Project
>
> References
> [1] Development branch:
> https://github.com/apache/incubator-mxnet/tree/numpy
> [2] PR for supporting scalar and zero-size tensors:
> https://github.com/apache/incubator-mxnet/pull/14661
> [3] First batch of NumPy operators to be implemented:
> https://github.com/apache/incubator-mxnet/issues/14327
> [4] The D2L book: https://github.com/d2l-ai/d2l-en
> [5] GluonCV: https://github.com/dmlc/gluon-cv
> [6] GluonNLP: https://github.com/dmlc/gluon-nlp
>
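
For readers unfamiliar with the two tensor kinds the first milestone above
targets, plain NumPy illustrates the intended semantics:

```python
# Plain-NumPy illustration of the two tensor kinds the infrastructure
# work targets: 0-dim (scalar) tensors and zero-size tensors.
import numpy as np

a = np.array(3.0)        # 0-dim tensor: no axes, but a real array
b = np.ones((0, 4))      # zero-size tensor: one axis of length 0

print(a.ndim, a.shape)   # 0 ()
print(b.size, b.shape)   # 0 (0, 4)
# Arithmetic and reductions behave sensibly on both:
print(a + 1)             # 4.0
print(b.sum())           # 0.0
```

Supporting these cases uniformly is what the infrastructure PR [2] enables
across the MXNet codebase and language bindings.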


Re: [Discussion] Remove bundled llvm OpenMP

2019-05-22 Thread Anton Chernov
I don't have the necessary rights to reopen this PR.

пн, 20 мая 2019 г. в 08:00, Pedro Larroy :

> Hi Anton, Stas.
>
> Can we reopen this PR and get it merged as per the data collected by Stas?
>
> https://github.com/apache/incubator-mxnet/pull/12160
>
>
> https://cwiki.apache.org/confluence/display/MXNET/Benchmarking+MXNet+with+different+OpenMP+implementations
>
> There are multiple issues that will be fixed by solving this problem.
>
>
> Pedro
>
> On Tue, Feb 12, 2019 at 4:54 AM Anton Chernov  wrote:
> >
> > I would like to propose a possible alternative solution for
> consideration.
> >
> > If keeping llvm OpenMP as a submodule is inevitable, one could make
> > the following adjustments:
> >
> > Since compilers try to find their own OpenMP library implicitly, MXNet
> > needs to ensure that only the bundled version is found. Therefore during
> > the build and also during deployment this library has to provide symlinks
> > for each possible compiler that would link to the built artifact, i.e.
> >
> > libiomp.so -> libgomp.so -> libomp.so
> >
> > The MKLML iomp would need to be hidden and removed as well.
> >
> > On Windows it would be a different story, but as can be seen [1] bundled
> > OpenMP was not included in the Windows build anyway.
> >
> > Alternatively: always use iomp (with same symlinking trick though)
> provided
> > by MKLML distribution [2]. This potentially could work on Windows as
> well.
> >
> > Best
> > Anton
> >
> > [1]
> >
> https://github.com/apache/incubator-mxnet/blob/8a63bdecf2d9f12d34fe5874957ae4c867eb5f5b/CMakeLists.txt#L408-L410
> > [2] https://github.com/intel/mkl-dnn/releases
> >
> > вт, 12 февр. 2019 г. в 11:22, Anton Chernov :
> >
> > > Recent benchmarking results have been published here [1]. Experiments
> > > compare different OpenMP implementations as well as binaries compiled
> with
> > > different compilers including GCC, Clang and ICC.
> > >
> > > During experimentation, another issue with mixing up libraries was
> > > identified and described here [2].
> > >
> > > Best
> > > Anton
> > >
> > > [1] https://cwiki.apache.org/confluence/x/2wclBg
> > > [2]
> > >
> https://github.com/apache/incubator-mxnet/issues/14087#issuecomment-461734041
> > >
> > >
> > > Sun, 9 Dec 2018 at 16:28, Anton Chernov :
> > >
> > >> Hi Chris,
> > >>
> > >> Following up on the issue, are all things resolved in the discussion?
> > >>
> > >> If yes, I kindly ask you to reopen this PR and remove ‘requesting
> > >> changes’ status:
> > >> https://github.com/apache/incubator-mxnet/pull/12160
> > >>
> > >> Thank you.
> > >>
> > >>
> > >> Best
> > >> Anton
> > >>
> > >>
> > >> Tue, 27 Nov 2018 at 17:15, Anton Chernov :
> > >>
> > >>> Another thing to take into consideration:
> > >>>
> > >>> All Python artifacts that are created (PyPI) are built with make and
> > >>> are not using the bundled OpenMP library.
> > >>>
> > >>> One step for the switch to CMake to happen is the approval and
> merging
> > >>> of the mentioned PR:
> > >>>
> > >>> https://github.com/apache/incubator-mxnet/pull/12160
> > >>>
> > >>> If there are no other objections I kindly ask Chris Olivier to remove
> > >>> his 'requesting changes' veto on it to unblock the CMake overhaul
> work.
> > >>>
> > >>> Thank you.
> > >>>
> > >>> Best
> > >>> Anton
> > >>>
> > >>> Thu, 22 Nov 2018 at 17:11, Anton Chernov :
> > >>>
> > 
> >  Thank you for your answer, Chris.
> > 
> >  > The whole “mixing omp libraries” is something that occurs in
> >  production
> >  every day and certainly in everything that uses mkl.
> > 
> >  I'm afraid this statement is wrong. Intel MKL-DNN strictly ensures
> that
> >  this mixture is not happening:
> > 
> >  "Intel MKL-DNN uses OpenMP* for parallelism and requires an OpenMP
> >  runtime library to work. As different OpenMP runtimes may not be
> binary
> >  compatible it's important to ensure that only one OpenMP runtime is
> used
> >  throughout the application. Having more than one OpenMP runtime
> initialized
> >  may lead to undefined behavior resulting in incorrect results or
> crashes."
> >  [1]
> > 
> >  That is why 2 different MKLML libraries are provided:
> > 
> >  lib/libmklml_gnu.so   | Intel MKL small library for GNU* OpenMP runtime
> >  lib/libmklml_intel.so | Intel MKL small library for Intel(R) OpenMP runtime
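The constraint in the MKL-DNN note (only one OpenMP runtime initialized per process) can be checked programmatically. Below is a rough Linux-oriented sketch that scans a `/proc/<pid>/maps`-style listing for the common runtime sonames; the sample excerpt is hypothetical, and the soname list is an assumption covering the GNU, LLVM, and Intel runtimes:

```python
# Sketch: detect whether more than one OpenMP runtime would be loaded into
# a process, the undefined-behavior scenario described in the quote above.
import re

OMP_RUNTIMES = ("libgomp", "libomp", "libiomp5")

def loaded_openmp_runtimes(maps_text):
    """Return the set of distinct OpenMP runtime names found in a
    /proc/<pid>/maps listing."""
    found = set()
    for line in maps_text.splitlines():
        for name in OMP_RUNTIMES:
            if re.search(rf"{name}\.so", line):
                found.add(name)
    return found

# Hypothetical maps excerpt showing the problematic mixed situation:
sample = """\
7f0a00000000-7f0a00100000 r-xp 0 08:01 1 /usr/lib/libgomp.so.1
7f0a00200000-7f0a00300000 r-xp 0 08:01 2 /opt/intel/lib/libiomp5.so
"""
runtimes = loaded_openmp_runtimes(sample)
print(sorted(runtimes))   # ['libgomp', 'libiomp5']
print("multiple runtimes!" if len(runtimes) > 1 else "ok")
```

On a live process one would read `/proc/self/maps` instead of the sample string; two or more hits indicate the mixed-runtime condition the MKL-DNN documentation warns about.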
> > 
> >  > is the suggestion that libiomp be removed from mkl?
> > 
> >  That is certainly not my suggestion.
> > 
> >  > have you spoken with intel? have you consulted Intel at all?
> > 
> >  Yes, I have asked for comments on the issue.
> > 
> >  > “hard to debug random crash”. you’re seeing an assertion which is
> >  probably ...
> > 
> >  I'm seeing the result of undefined behaviour. And I want to put
> >  emphasis on the following statement:
> > 
> >  Regardless of whether there is a particular reason for the assert -
> >  it is 

Re: Report of MXNet NumPy Project Status

2019-05-22 Thread Pedro Larroy
Thanks, that's a nice summary. Great job and good to know the
progress. I think we can do some exciting stuff in terms of parsing
the Python AST and converting to a computational graph. Maybe we could
brainstorm on that further on the linked ticket.
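Pedro's AST idea can be sketched in miniature: walk the AST of a Python arithmetic expression and flatten it into `(op, inputs, output)` triples, a toy stand-in for a computational graph. This is purely illustrative and not how MXNet's actual tracing or hybridization works:

```python
# Toy sketch: flatten a Python arithmetic expression into a list of
# (op, inputs, output) triples by walking its AST.
import ast

def expr_to_graph(src):
    """Return (op, inputs, output) triples for the binary ops in `src`."""
    graph, counter = [], [0]

    def visit(node):
        if isinstance(node, ast.BinOp):
            lhs, rhs = visit(node.left), visit(node.right)
            out = f"t{counter[0]}"
            counter[0] += 1
            graph.append((type(node.op).__name__, (lhs, rhs), out))
            return out
        if isinstance(node, ast.Name):
            return node.id
        if isinstance(node, ast.Constant):
            return repr(node.value)
        raise NotImplementedError(ast.dump(node))

    visit(ast.parse(src, mode="eval").body)
    return graph

print(expr_to_graph("x * y + 2"))
# [('Mult', ('x', 'y'), 't0'), ('Add', ('t0', '2'), 't1')]
```

A real implementation would have to handle control flow, function calls, and attribute access, which is where parsing the AST of full functions (rather than single expressions) gets interesting.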

On Wed, May 22, 2019 at 12:12 AM Jun Wu  wrote:
>
> Dear Community,
>
> A few months ago, we submitted this RFC
>  proposing
> to introduce a NumPy-compatible coding experience into MXNet. As it has been
> some time since the proposal, we would like to share the progress with the
> community and listen to feedback and suggestions to enhance the technical
> implementation as well as the way the project is operated.
>
> We set our first milestone by tackling the problem of MXNet not supporting
> scalar and zero-size tensors. Last month, we submitted the PR
>  providing the
> infrastructure to support those two types of tensors in MXNet. This work
> has affected almost every file and all language bindings in the MXNet
> codebase. It would have been impossible to provide a complete solution
> without contributions from many MXNet developers across different
> organizations.
>
> With the infrastructure for supporting scalar and zero-size tensors in place,
> we are currently working on implementing NumPy operators in MXNet. We created a
> list of operators 
> to be implemented from the D2L book , and hope that we
> will be able to provide full NumPy operator coverage for the book by the
> end of next month.
>
> In the future, we plan to provide NumPy operator support for GluonCV
>  and GluonNLP
> . We also intend to explore
> opportunities to extend our work to support libraries that heavily depend
> on NumPy, not only in the deep learning world, but also in the broader data
> science community, where techniques employed by deep learning, such as
> automatic differentiation, symbolic programming, GPU computing, and so
> forth, can be beneficial.
>
> Thank you very much for taking the time to read this email and for caring
> about our efforts to make MXNet a super user-friendly deep learning
> framework. We
> look forward to your comments, suggestions and contributions for this
> project.
>
> Best,
> Developers of MXNet NumPy Project
>
> References
> [1] Development branch: https://github.com/apache/incubator-mxnet/tree/numpy
> [2] PR for supporting scalar and zero-size tensors:
> https://github.com/apache/incubator-mxnet/pull/14661
> [3] First batch of NumPy operators to be implemented:
> https://github.com/apache/incubator-mxnet/issues/14327
> [4] The D2L book: https://github.com/d2l-ai/d2l-en
> [5] GluonCV: https://github.com/dmlc/gluon-cv
> [6] GluonNLP: https://github.com/dmlc/gluon-nlp


Report of MXNet NumPy Project Status

2019-05-22 Thread Jun Wu
Dear Community,

A few months ago, we submitted this RFC
 proposing
to introduce a NumPy-compatible coding experience into MXNet. As it has been
some time since the proposal, we would like to share the progress with the
community and listen to feedback and suggestions to enhance the technical
implementation as well as the way the project is operated.

We set our first milestone by tackling the problem of MXNet not supporting
scalar and zero-size tensors. Last month, we submitted the PR
 providing the
infrastructure to support those two types of tensors in MXNet. This work
has affected almost every file and all language bindings in the MXNet
codebase. It would have been impossible to provide a complete solution
without contributions from many MXNet developers across different
organizations.
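For readers unfamiliar with the terminology, the two tensor kinds can be illustrated with NumPy itself, which defines the semantics the MXNet work aims to match:

```python
# Scalar (0-d) and zero-size tensors, shown with NumPy's ndarray.
import numpy as np

s = np.array(3.14)        # scalar (0-d) tensor: empty shape, zero dimensions
print(s.shape, s.ndim)    # () 0

z = np.zeros((0, 3))      # zero-size tensor: empty along the first axis
print(z.shape, z.size)    # (0, 3) 0

# Operations on zero-size tensors are well defined and preserve shape:
print((z + 1).shape)      # (0, 3)
```

Before the PR referenced above, MXNet's legacy NDArray could not represent either of these shapes, which is why supporting them touched so much of the codebase.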

With the infrastructure for supporting scalar and zero-size tensors in place,
we are currently working on implementing NumPy operators in MXNet. We created a
list of operators 
to be implemented from the D2L book , and hope that we
will be able to provide full NumPy operator coverage for the book by the
end of next month.

In the future, we plan to provide NumPy operator support for GluonCV
 and GluonNLP
. We also intend to explore
opportunities to extend our work to support libraries that heavily depend on
NumPy, not only in the deep learning world, but also in the broader data
science community, where techniques employed by deep learning, such as
automatic differentiation, symbolic programming, GPU computing, and so forth,
can be beneficial.

Thank you very much for taking the time to read this email and for caring
about our efforts to make MXNet a super user-friendly deep learning
framework. We
look forward to your comments, suggestions and contributions for this
project.

Best,
Developers of MXNet NumPy Project

References
[1] Development branch: https://github.com/apache/incubator-mxnet/tree/numpy
[2] PR for supporting scalar and zero-size tensors:
https://github.com/apache/incubator-mxnet/pull/14661
[3] First batch of NumPy operators to be implemented:
https://github.com/apache/incubator-mxnet/issues/14327
[4] The D2L book: https://github.com/d2l-ai/d2l-en
[5] GluonCV: https://github.com/dmlc/gluon-cv
[6] GluonNLP: https://github.com/dmlc/gluon-nlp