Re: [VOTE] Release Apache MXNet (incubating) version 1.6.0.rc2

2020-02-06 Thread Dick Carter
+1 (binding) I ran the unit tests successfully on 3 different GPU architectures on CUDA 10.2: Tesla P100-SXM2 (Pascal, arch = 60), Tesla V100-SXM2 (Volta, arch = 70), and Tesla T4 (Turing, arch = 75). Compilation was via ./tests/jenkins/run_test_ubuntu.sh modified with the additional lines: echo

Re: [VOTE] Release Apache MXNet (incubating) version 1.6.0.rc2

2020-02-05 Thread Sheng Zha
+1 as the disclaimer-WIP is now used which will allow us more time to fix license issues. On 2020/02/04 02:43:42, Chaitanya Bapat wrote: > +1 > Successfully built MXNet 1.6.0rc2 on Linux > Tested for OpPerf utility > For CPU - >

Re: [VOTE] Release Apache MXNet (incubating) version 1.6.0.rc2

2020-02-05 Thread Lausen, Leonard
Hi Markus, you point out a critical flaw of the current MXNet website. We don't have any versioning and the website is always built from the master branch. Thus while recent improvements to the build system are backwards compatible (i.e., old instructions continue to work), there is no way to find the

Re: [VOTE] Release Apache MXNet (incubating) version 1.6.0.rc2

2020-02-05 Thread Markus Weimer
Hi, I was trying to follow the build instructions[0] on Ubuntu 18.04. However, I am stumped at step 2: `cp config/config.cmake config.cmake` The file `cmake.conf` does not seem to exist in the tarball on the dist site. `find . -name "cmake.conf" -print` finds nothing. In fact, the `config` folder

Re: [VOTE] Release Apache MXNet (incubating) version 1.6.0.rc2

2020-02-04 Thread Lausen, Leonard
Using latest upstream jemalloc https://github.com/leezu/mxnet/commit/fd4c78a635087f6164344da53a55ba2b67da2fd2 fixes the issue. However, there were concerns that this commit relies on unreleased development features of jemalloc (jemalloc cmake build system support) and we'll not merge this

Re: [VOTE] Release Apache MXNet (incubating) version 1.6.0.rc2

2020-02-04 Thread Lausen, Leonard
Bisect identifies https://github.com/apache/incubator-mxnet/commit/425319cb59904573bd3fe1b6fe0a7381eceb9bbd Thus this is an issue with jemalloc + llvm libopenmp. The correct reproducer for the latest master branch is git clone --recursive https://github.com/apache/incubator-mxnet/ mxnet cd

Re: [VOTE] Release Apache MXNet (incubating) version 1.6.0.rc2

2020-02-04 Thread Lausen, Leonard
Actually, the reproducer below is wrong. The issue was apparently fixed on master recently. I'm running an automated bisect and will report the result later. On Tue, 2020-02-04 at 21:44 +, Lausen, Leonard wrote: > Hi Chris, > > you previously found and fixed an OMP race condition during fork at >

Re: [VOTE] Release Apache MXNet (incubating) version 1.6.0.rc2

2020-02-04 Thread Lausen, Leonard
Hi Chris, you previously found and fixed an OMP race condition during fork at https://github.com/apache/incubator-mxnet/pull/17039 This time no forks are involved. Could you run the following reproducer on the master branch: git clone --recursive https://github.com/apache/incubator-mxnet/ mxnet

Re: [VOTE] Release Apache MXNet (incubating) version 1.6.0.rc2

2020-02-04 Thread Pedro Larroy
Hi Przemek I'm fine if we add it to the release notes and try to fix it for the next release. Changing my vote to +1 Pedro. On Mon, Feb 3, 2020 at 7:42 PM Pedro Larroy wrote: > > -1 > > Unit tests passed in CPU build. > > I observe crashes related to openmp using cpp unit tests: > >

Re: [VOTE] Release Apache MXNet (incubating) version 1.6.0.rc2

2020-02-04 Thread Pedro Larroy
@Chris: If you actually go and read the issue that I linked above, you can see that I was using gdb. Maybe you can have a look into the issue if you have an idea for a fix. The backtrace points to a segfault in the omp library. While the cause could be somewhere else which is causing undefined

Re: [VOTE] Release Apache MXNet (incubating) version 1.6.0.rc2

2020-02-04 Thread Chris Olivier
When "fixing", please "fix" through actual root-cause analysis (use gdb, for instance) and not simply by guesswork and cutting out things which probably aren't actually at fault (blaming an OMP library that's in worldwide distribution in the billions should be treated with great skepticism). On

Re: [VOTE] Release Apache MXNet (incubating) version 1.6.0.rc2

2020-02-04 Thread Lin Yuan
Pedro, While I agree with you that we need to fix this usability issue, I don't think this is a release blocker, as Przemek mentioned above. Could we fix this in the next minor release? Thanks, Lin On Tue, Feb 4, 2020 at 10:38 AM Pedro Larroy wrote: > Right. Would it be possible to have the CMake

Re: [VOTE] Release Apache MXNet (incubating) version 1.6.0.rc2

2020-02-04 Thread Pedro Larroy
Right. Would it be possible to have the CMake build also use libgomp for consistency with the releases until these issues are resolved? This can affect anyone compiling the distribution with CMake and also happens randomly in CI, worsening the contributor experience due to CI failures. On Tue,

Re: [VOTE] Release Apache MXNet (incubating) version 1.6.0.rc2

2020-02-04 Thread Przemysław Trędak
Hi Pedro, From the issue that you linked it seems that you are using the LLVM OpenMP, whereas I believe the actual release uses libgomp (at least that's what seems to be the conclusion from this issue: https://github.com/apache/incubator-mxnet/issues/16891)? Przemek On 2020/02/04

Re: [VOTE] Release Apache MXNet (incubating) version 1.6.0.rc2

2020-02-03 Thread Pedro Larroy
-1 Unit tests passed in CPU build. I observe crashes related to openmp using cpp unit tests: https://github.com/apache/incubator-mxnet/issues/17043 Pedro. On Mon, Feb 3, 2020 at 6:44 PM Chaitanya Bapat wrote: > +1 > Successfully built MXNet 1.6.0rc2 on Linux > Tested for OpPerf utility >

Re: [VOTE] Release Apache MXNet (incubating) version 1.6.0.rc2

2020-02-03 Thread Chaitanya Bapat
+1 Successfully built MXNet 1.6.0rc2 on Linux Tested for OpPerf utility For CPU - https://gist.github.com/ChaiBapchya/d5ecc3e971c5a3c558d672477b4b6b9c Works well! On Mon, 3 Feb 2020 at 15:43, Lin Yuan wrote: > +1 > > Tested Horovod with mnist example. My compiler flags are below: > > [✔

Re: [VOTE] Release Apache MXNet (incubating) version 1.6.0.rc2

2020-02-03 Thread Lin Yuan
+1 Tested Horovod with mnist example. My compiler flags are below: [✔ CUDA, ✔ CUDNN, ✔ NCCL, ✔ CUDA_RTC, ✖ TENSORRT, ✔ CPU_SSE, ✔ CPU_SSE2, ✔ CPU_SSE3, ✔ CPU_SSE4_1, ✔ CPU_SSE4_2, ✖ CPU_SSE4A, ✔ CPU_AVX, ✖ CPU_AVX2, ✔ OPENMP, ✖ SSE, ✔ F16C, ✖ JEMALLOC, ✔ BLAS_OPEN, ✖ BLAS_ATLAS, ✖ BLAS_MKL, ✖

Re: [VOTE] Release Apache MXNet (incubating) version 1.6.0.rc2

2020-02-01 Thread Tao Lv
+1 I tested the items below: 1. download artifacts from Apache dist repo; 2. the signature looks good; 3. build from source code with MKL-DNN and MKL on CentOS; 4. run fp32 and int8 inference of ResNet50 under /example/quantization/. thanks, -tao On Sun, Feb 2, 2020 at 11:00 AM Tao Lv wrote: > I

Re: [VOTE] Release Apache MXNet (incubating) version 1.6.0.rc2

2020-02-01 Thread Tao Lv
I see. I was looking at this page: https://github.com/apache/incubator-mxnet/releases/tag/1.6.0.rc2 On Sun, Feb 2, 2020 at 4:54 AM Przemysław Trędak wrote: > Hi Tao, > > Could you tell me where did you look for it and did not find it? I just > checked and both >

Re: [VOTE] Release Apache MXNet (incubating) version 1.6.0.rc2

2020-02-01 Thread Przemysław Trędak
Hi Tao, Could you tell me where you looked for it and did not find it? I just checked and both https://dist.apache.org/repos/dist/dev/incubator/mxnet/1.6.0.rc2/ and the draft of the release on GitHub have them. Thank you Przemek On 2020/02/01 14:23:11, Tao Lv wrote: > It seems the src tar and

Re: [VOTE] Release Apache MXNet (incubating) version 1.6.0.rc2

2020-02-01 Thread Tao Lv
It seems the src tar and signature are missing from the tag. On Fri, Jan 31, 2020 at 11:09 AM Przemysław Trędak wrote: > Dear MXNet community, > > This is the vote to release Apache MXNet (incubating) version 1.6.0. > Voting starts today and will close on Monday 2/3/2020 23:59 PST. > > Link to

[VOTE] Release Apache MXNet (incubating) version 1.6.0.rc2

2020-01-30 Thread Przemysław Trędak
Dear MXNet community, This is the vote to release Apache MXNet (incubating) version 1.6.0. Voting starts today and will close on Monday 2/3/2020 23:59 PST. Link to release notes: https://cwiki.apache.org/confluence/display/MXNET/1.6.0+Release+notes Link to release candidate: