+1 (binding)

I ran the unit tests successfully on three different GPU architectures with CUDA 10.2:

Tesla P100-SXM2 (Pascal, arch = 60)
Tesla V100-SXM2 (Volta, arch = 70)
Tesla T4 (Turing, arch = 75)

Compilation was via ./tests/jenkins/run_test_ubuntu.sh, modified with these
additional lines:

echo "DEV=0" >> config.mk
echo "USE_MKLDNN=0" >> config.mk
echo "USE_LAPACK_PATH=/usr/lib/x86_64-linux-gnu" >> config.mk
echo "USE_BLAS=openblas" >> config.mk
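
For anyone repeating this setup, the same four overrides can be appended in a single step. This is only a sketch: the heredoc form is an illustration, not what the Jenkins script itself does, and it assumes config.mk lives in the current directory.

```shell
# Append the four build overrides to config.mk in one heredoc.
# Equivalent to the four echo lines above; assumes the current
# directory is where the build expects config.mk.
cat >> config.mk <<'EOF'
DEV=0
USE_MKLDNN=0
USE_LAPACK_PATH=/usr/lib/x86_64-linux-gnu
USE_BLAS=openblas
EOF
```

Quoting the heredoc delimiter ('EOF') keeps the shell from expanding anything inside the block, so the lines land in config.mk verbatim.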

The tests run were:

nosetests --verbose tests/python/unittest || exit 1
nosetests --verbose tests/python/gpu/ || exit 1
nosetests --verbose tests/python/train || exit 1

All tests on all 3 configurations passed.
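
The three runs above can also be written as a loop that stops at the first failing suite. This is a minimal sketch: the helper name run_suites is hypothetical, and the runner is passed in as arguments so any command (nosetests --verbose here) can be used.

```shell
# Run each test suite in order and stop at the first failure.
# run_suites is a hypothetical helper; the runner command is passed as
# arguments, and each suite path is appended as its last argument.
run_suites() {
    for suite in tests/python/unittest tests/python/gpu/ tests/python/train; do
        "$@" "$suite" || return 1
    done
}
```

Calling `run_suites nosetests --verbose || exit 1` from the MXNet source root would then behave like the three separate commands above.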

    -Dick

On 2020/02/05 19:24:35, "Lausen, Leonard" <lau...@amazon.com.INVALID> wrote: 
> Hi Markus,
> 
> you point out a critical flaw of the current MXNet website. We don't have any
> versioning and the website is always built from master branch.
> 
> Thus while recent improvements to the build system are backwards compatible 
> (ie.
> old instructions continue to work), there is no way to find the old 
> instructions
> to build "old" releases.
> 
> https://github.com/apache/incubator-mxnet/issues/17497 tracks the issue.
> 
> Including the package build instructions with the source release makes sense.
> To make sure they don't get out of date, including the html pages built from 
> the
> source release is another option.
> 
> Best regards
> Leonard
> 
> On Wed, 2020-02-05 at 11:06 -0800, Markus Weimer wrote:
> > Hi,
> > 
> > I was trying to follow the build instructions[0] on Ubuntu 18.04.
> > However, I am stumped at step 2:
> > 
> > `cp config/config.cmake config.cmake`
> > 
> > The file `cmake.conf` does not seem to exist in the tarball on the
> > dist site. `find . -name "cmake.conf" -print` finds nothing. In fact,
> > the `config` folder doesn't seem to exist in the tarball either.
> > However, the file and folder do exist on GitHub[1]. Are the build
> > instructions for a release different from the build from the
> > repository?
> > 
> > On a related note: It might make sense to package build instructions
> > with the source release. Websites get updated to reflect current use,
> > and it might be difficult for future users of this version of mxnet to
> > piece together the build instructions.
> > 
> > Thanks,
> > 
> > Markus
> > 
> > 
> > [0]: https://mxnet.apache.org/get_started/ubuntu_setup
> > [1]: https://github.com/apache/incubator-mxnet/tree/master/config
> > 
> > On Tue, Feb 4, 2020 at 3:05 PM Lausen, Leonard
> > <lau...@amazon.com.invalid> wrote:
> > > Using latest upstream jemalloc
> > > https://github.com/leezu/mxnet/commit/fd4c78a635087f6164344da53a55ba2b67da2fd2
> > > fixes the issue.
> > > 
> > > However, there were concerns that this commit relies on unreleased
> > > development
> > > features of jemalloc (jemalloc cmake build system support) and we'll not
> > > merge
> > > this commit until upstream releases cmake build system support in a 
> > > release.
> > > 
> > > In the meantime anyone is welcome to work on an equivalent patch based on
> > > the
> > > custom build system in latest stable jemalloc.
> > > 
> > > On Tue, 2020-02-04 at 22:46 +0000, Lausen, Leonard wrote:
> > > > Bisect identifies
> > > > https://github.com/apache/incubator-mxnet/commit/425319cb59904573bd3fe1b6fe0a7381eceb9bbd
> > > > 
> > > > Thus this is an issue with jemalloc + LLVM OpenMP.
> > > > 
> > > > The correct reproducer for latest master branch is
> > > > 
> > > > 
> > > >   git clone --recursive https://github.com/apache/incubator-mxnet/ mxnet
> > > >   cd mxnet
> > > >   # workaround for https://github.com/apache/incubator-mxnet/issues/17514
> > > >   git checkout a726c406964b9cd17efa826738a662e09d973972
> > > >   mkdir build; cd build;
> > > >   cmake -DUSE_CPP_PACKAGE=1 -DCMAKE_BUILD_TYPE=RelWithDebInfo -GNinja -DUSE_CUDA=OFF -DUSE_JEMALLOC=ON ..
> > > >   ninja
> > > >   ./cpp-package/example/test_regress_label  # run 2-3 times to reproduce
> > > > 
> > > > Let's move the discussion about fixing the jemalloc/OpenMP
> > > > incompatibility to
> > > > https://github.com/apache/incubator-mxnet/issues/17043
> > > > 
> > > > 
> > > > 
> > > > @Chris, could you look into this issue as it only happens with LLVM
> > > > OpenMP?
> > > > 
> > > > 
> > > > 
> > > > @Przemek: For the 1.6.0 release notes I suggest including a recommendation
> > > > to set USE_JEMALLOC=OFF when compiling from source.
> > > > 
> > > > This note should probably be added in any case, as building with
> > > > USE_JEMALLOC=ON is broken on Ubuntu 18.10 and higher, as well as Debian
> > > > Stable.
> > > > 
> > > > Given these release notes, +1 for the release.
> > > > 
> > > > 
> > > > Best regards
> > > > Leonard
> > > > 
> > > > On Tue, 2020-02-04 at 22:26 +0000, Lausen, Leonard wrote:
> > > > > Actually below reproducer is wrong. The issue was apparently fixed on
> > > > > master
> > > > > recently. I'm running an automated bisect and will report the result
> > > > > later.
> > > > > 
> > > > > On Tue, 2020-02-04 at 21:44 +0000, Lausen, Leonard wrote:
> > > > > > Hi Chris,
> > > > > > 
> > > > > > you previously found and fixed an OMP race condition during fork at
> > > > > > https://github.com/apache/incubator-mxnet/pull/17039
> > > > > > 
> > > > > > This time no forks are involved. Could you run the following
> > > > > > reproducer on
> > > > > > master branch:
> > > > > > 
> > > > > >   git clone --recursive https://github.com/apache/incubator-mxnet/ mxnet
> > > > > >   cd mxnet
> > > > > >   # workaround for https://github.com/apache/incubator-mxnet/issues/17514
> > > > > >   git checkout a726c406964b9cd17efa826738a662e09d973972
> > > > > >   mkdir build; cd build;
> > > > > >   cmake -DUSE_CPP_PACKAGE=1 -DCMAKE_BUILD_TYPE=RelWithDebInfo -GNinja -DUSE_CUDA=OFF ..
> > > > > >   ninja
> > > > > >   ./cpp-package/example/test_regress_label  # run 2-3 times to reproduce
> > > > > > 
> > > > > > 
> > > > > > As you are an OpenMP expert, you may be able to identify the root cause
> > > > > > with relative ease.
> > > > > > 
> > > > > > Thank you,
> > > > > > 
> > > > > > Leonard
> > > > > > 
> > > > > > On Tue, 2020-02-04 at 11:06 -0800, Chris Olivier wrote:
> > > > > > > When "fixing", please "fix" through actual root-cause analysis (use gdb,
> > > > > > > for instance) and not simply by guesswork and cutting out things which
> > > > > > > probably aren't actually at fault (blaming an OMP library that's in
> > > > > > > worldwide distribution in the billions should be treated with great
> > > > > > > skepticism).
> > > > > > > 
> > > > > > > On Tue, Feb 4, 2020 at 10:44 AM Lin Yuan <apefor...@gmail.com>
> > > > > > > wrote:
> > > > > > > 
> > > > > > > > Pedro,
> > > > > > > > 
> > > > > > > > While I agree with you we need to fix this usability issue, I don't
> > > > > > > > think this is a release blocker as Przemek mentioned above. Could we
> > > > > > > > fix this in the next minor release?
> > > > > > > > 
> > > > > > > > Thanks,
> > > > > > > > 
> > > > > > > > Lin
> > > > > > > > 
> > > > > > > > On Tue, Feb 4, 2020 at 10:38 AM Pedro Larroy <
> > > > > > > > pedro.larroy.li...@gmail.com
> > > > > > > > wrote:
> > > > > > > > 
> > > > > > > > > Right. Would it be possible to have the CMake build also use libgomp
> > > > > > > > > for consistency with the releases until these issues are resolved?
> > > > > > > > > This can affect anyone compiling the distribution with CMake and also
> > > > > > > > > happens randomly in CI, worsening the contributor experience due to
> > > > > > > > > CI failures.
> > > > > > > > > 
> > > > > > > > > On Tue, Feb 4, 2020 at 9:33 AM Przemysław Trędak <
> > > > > > > > > ptre...@apache.org
> > > > > > > > > wrote:
> > > > > > > > > 
> > > > > > > > > > Hi Pedro,
> > > > > > > > > > 
> > > > > > > > > > From the issue that you linked it seems that you are using the LLVM
> > > > > > > > > > OpenMP, whereas I believe the actual release uses libgomp (at least
> > > > > > > > > > that's what seems to be the conclusion from this issue:
> > > > > > > > > > https://github.com/apache/incubator-mxnet/issues/16891)?
> > > > > > > > > > 
> > > > > > > > > > Przemek
> > > > > > > > > > 
> > > > > > > > > > On 2020/02/04 03:42:30, Pedro Larroy <
> > > > > > > > > > pedro.larroy.li...@gmail.com
> > > > > > > > > > wrote:
> > > > > > > > > > > -1
> > > > > > > > > > > 
> > > > > > > > > > > Unit tests passed in CPU build.
> > > > > > > > > > > 
> > > > > > > > > > > I observe crashes related to openmp using cpp unit tests:
> > > > > > > > > > > 
> > > > > > > > > > > https://github.com/apache/incubator-mxnet/issues/17043
> > > > > > > > > > > 
> > > > > > > > > > > Pedro.
> > > > > > > > > > > 
> > > > > > > > > > > On Mon, Feb 3, 2020 at 6:44 PM Chaitanya Bapat <
> > > > > > > > > > > chai.ba...@gmail.com
> > > > > > > > > > wrote:
> > > > > > > > > > > > +1
> > > > > > > > > > > > Successfully built MXNet 1.6.0rc2 on Linux
> > > > > > > > > > > > Tested for OpPerf utility
> > > > > > > > > > > > For CPU -
> > > > > > > > > > > > 
> > > > > > > > https://gist.github.com/ChaiBapchya/d5ecc3e971c5a3c558d672477b4b6b9c
> > > > > > > > > > > > Works well!
> > > > > > > > > > > > 
> > > > > > > > > > > > 
> > > > > > > > > > > > 
> > > > > > > > > > > > On Mon, 3 Feb 2020 at 15:43, Lin Yuan 
> > > > > > > > > > > > <apefor...@gmail.com
> > > > > > > > > > > > >
> > > > > > > > > > > > wrote:
> > > > > > > > > > > > 
> > > > > > > > > > > > > +1
> > > > > > > > > > > > > 
> > > > > > > > > > > > > Tested Horovod with mnist example. My compiler flags 
> > > > > > > > > > > > > are
> > > > > > > > > > > > > below:
> > > > > > > > > > > > > 
> > > > > > > > > > > > > [✔ CUDA, ✔ CUDNN, ✔ NCCL, ✔ CUDA_RTC, ✖ TENSORRT, ✔ CPU_SSE, ✔ CPU_SSE2,
> > > > > > > > > > > > > ✔ CPU_SSE3, ✔ CPU_SSE4_1, ✔ CPU_SSE4_2, ✖ CPU_SSE4A, ✔ CPU_AVX, ✖ CPU_AVX2,
> > > > > > > > > > > > > ✔ OPENMP, ✖ SSE, ✔ F16C, ✖ JEMALLOC, ✔ BLAS_OPEN, ✖ BLAS_ATLAS, ✖ BLAS_MKL,
> > > > > > > > > > > > > ✖ BLAS_APPLE, ✔ LAPACK, ✖ MKLDNN, ✔ OPENCV, ✖ CAFFE, ✖ PROFILER,
> > > > > > > > > > > > > ✔ DIST_KVSTORE, ✖ CXX14, ✖ INT64_TENSOR_SIZE, ✖ SIGNAL_HANDLER, ✖ DEBUG,
> > > > > > > > > > > > > ✖ TVM_OP]
> > > > > > > > > > > > > 
> > > > > > > > > > > > > Lin
> > > > > > > > > > > > > 
> > > > > > > > > > > > > On Sat, Feb 1, 2020 at 9:55 PM Tao Lv 
> > > > > > > > > > > > > <ta...@apache.org>
> > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > 
> > > > > > > > > > > > > > +1
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > I tested below items:
> > > > > > > > > > > > > > 1. download artifacts from Apache dist repo;
> > > > > > > > > > > > > > 2. the signature looks good;
> > > > > > > > > > > > > > 3. build from source code with MKL-DNN and MKL on CentOS;
> > > > > > > > > > > > > > 4. run fp32 and int8 inference of ResNet50 under /example/quantization/.
> > > > > > > > > > > > > > thanks,
> > > > > > > > > > > > > > -tao
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > On Sun, Feb 2, 2020 at 11:00 AM Tao Lv <
> > > > > > > > > > > > > > ta...@apache.org>
> > > > > > > > wrote:
> > > > > > > > > > > > > > > I see. I was looking at this page:
> > > > > > > > > > > > > > > 
> > > > > > > > > https://github.com/apache/incubator-mxnet/releases/tag/1.6.0.rc2
> > > > > > > > > > > > > > > On Sun, Feb 2, 2020 at 4:54 AM Przemysław Trędak <
> > > > > > > > > > ptre...@apache.org
> > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > Hi Tao,
> > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > Could you tell me where you looked for it and did not find it?
> > > > > > > > > > > > > > > > I just checked, and both
> > > > > > > > > > > > > > > > https://dist.apache.org/repos/dist/dev/incubator/mxnet/1.6.0.rc2/
> > > > > > > > > > > > > > > > and the draft of the release on GitHub have them.
> > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > Thank you
> > > > > > > > > > > > > > > > Przemek
> > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > On 2020/02/01 14:23:11, Tao Lv 
> > > > > > > > > > > > > > > > <ta...@apache.org>
> > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > > It seems the src tar and signature are missing from the tag.
> > > > > > > > > > > > > > > > > On Fri, Jan 31, 2020 at 11:09 AM Przemysław
> > > > > > > > > > > > > > > > > Trędak <
> > > > > > > > > > > > > > ptre...@apache.org>
> > > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > > > Dear MXNet community,
> > > > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > > > This is the vote to release Apache MXNet (incubating) version 1.6.0.
> > > > > > > > > > > > > > > > > > Voting starts today and will close on Monday 2/3/2020 23:59 PST.
> > > > > > > > > > > > > > > > > > Link to release notes:
> > > > > > > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/MXNET/1.6.0+Release+notes
> > > > > > > > > > > > > > > > > > Link to release candidate:
> > > > > > > > > > > > > > > > > > https://github.com/apache/incubator-mxnet/releases/tag/1.6.0.rc2
> > > > > > > > > > > > > > > > > > Link to source and signatures on apache dist server:
> > > > > > > > > > > > > > > > > > https://dist.apache.org/repos/dist/dev/incubator/mxnet/1.6.0.rc2/
> > > > > > > > > > > > > > > > > > The differences compared to the previous release candidate 1.6.0.rc1:
> > > > > > > > > > > > > > > > > >  * Fixes for license issues (#17361, #17375, #17370, #17460)
> > > > > > > > > > > > > > > > > >  * Bugfix for saving LSTM layer parameter (#17288)
> > > > > > > > > > > > > > > > > >  * Bugfix for downloading the model from model zoo from multiple
> > > > > > > > > > > > > > > > > >    processes (#17372)
> > > > > > > > > > > > > > > > > >  * Fixed a symbol.py in AMP for GluonNLP (#17408)
> > > > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > > > Please remember to TEST first before voting
> > > > > > > > > > > > > > > > > > accordingly:
> > > > > > > > > > > > > > > > > > +1 = approve
> > > > > > > > > > > > > > > > > > +0 = no opinion
> > > > > > > > > > > > > > > > > > -1 = disapprove (provide reason)
> > > > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > > > Best regards,
> > > > > > > > > > > > > > > > > > Przemyslaw Tredak
> > > > > > > > > > > > > > > > > > 
> > > > > > > > > > > > 
> > > > > > > > > > > > --
> > > > > > > > > > > > *Chaitanya Prakash Bapat*
> > > > > > > > > > > > *+1 (973) 953-6299*
> > > > > > > > > > > > 
> > > > > > > > > > > > 
> 
