Actually, the reproducer below is wrong. The issue was apparently fixed on master recently. I'm running an automated bisect and will report the result later.
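For anyone who wants to follow along, here is a rough sketch of the kind of helper such a bisect could use. It is an illustration only, not the script actually being run: the function names, the default of 5 attempts, and the choice to treat any non-zero exit status as "crash reproduced" are all assumptions. Because the crash is intermittent, the command is run several times per commit. Because the goal is to find the commit that fixed the crash, exit code 0 is used for "still crashes" (the old state) and 1 for "crash not seen" (the new state).

```python
import subprocess
import sys


def crash_reproduces(cmd, attempts=5):
    """Return True if any of `attempts` runs of `cmd` exits with a non-zero status.

    Assumption: a crash shows up as a non-zero (or signal) exit status.
    """
    for _ in range(attempts):
        result = subprocess.run(
            cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
        )
        if result.returncode != 0:
            return True
    return False


def bisect_exit_code(cmd, attempts=5):
    """Map the outcome to an exit code for `git bisect run`.

    0 marks the commit as "old" (here: still crashing),
    1 marks it as "new" (here: crash no longer observed).
    """
    return 0 if crash_reproduces(cmd, attempts) else 1


# Only act when a command to test was passed on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    sys.exit(bisect_exit_code(sys.argv[1:]))
```

Saved under a hypothetical name such as check_crash.py, it could be driven with `git bisect start --term-old=broken --term-new=fixed`, then `git bisect broken <crashing commit>` and `git bisect fixed <commit without the crash>`, then `git bisect run python3 check_crash.py build/cpp-package/example/test_regress_label`. The sketch leaves out the rebuild that is needed at every step; a wrapper that rebuilds first could exit with 125 to make git skip commits that fail to build. A crash that is rare enough can still be missed in 5 attempts, so the result should be confirmed by hand.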
On Tue, 2020-02-04 at 21:44 +0000, Lausen, Leonard wrote:
> Hi Chris,
>
> you previously found and fixed an OMP race condition during fork at
> https://github.com/apache/incubator-mxnet/pull/17039
>
> This time no forks are involved. Could you run the following reproducer on
> master branch:
>
> git clone --recursive https://github.com/apache/incubator-mxnet/ mxnet
> cd mxnet
> git checkout a726c406964b9cd17efa826738a662e09d973972 # workaround https://github.com/apache/incubator-mxnet/issues/17514
> mkdir build; cd build;
> cmake -DUSE_CPP_PACKAGE=1 -DCMAKE_BUILD_TYPE=RelWithDebInfo -GNinja -DUSE_CUDA=OFF ..
> ninja
> ./cpp-package/example/test_regress_label # run 2-3 times to reproduce
>
> As you are an OpenMP expert, you may be able to identify the root cause with
> relative ease.
>
> Thank you,
> Leonard
>
> On Tue, 2020-02-04 at 11:06 -0800, Chris Olivier wrote:
> > When "fixing", please "fix" through actual root-cause analysis (use gdb,
> > for instance) and not simply by guesswork and cutting out things which
> > probably aren't actually at fault (blaming an OMP library that's in
> > worldwide distribution in the billions should be treated with great
> > skepticism).
> >
> > On Tue, Feb 4, 2020 at 10:44 AM Lin Yuan <apefor...@gmail.com> wrote:
> > > Pedro,
> > >
> > > While I agree with you we need to fix this usability issue, I don't think
> > > this is a release blocker as Przemek mentioned above. Could we fix this in
> > > the next minor release?
> > >
> > > Thanks,
> > > Lin
> > >
> > > On Tue, Feb 4, 2020 at 10:38 AM Pedro Larroy <pedro.larroy.li...@gmail.com> wrote:
> > > > Right. Would it be possible to have the CMake build also use libgomp for
> > > > consistency with the releases until these issues are resolved?
> > > > This can affect anyone compiling the distribution with CMake and also
> > > > happens randomly in CI, worsening the contributor experience due to CI
> > > > failures.
> > > >
> > > > On Tue, Feb 4, 2020 at 9:33 AM Przemysław Trędak <ptre...@apache.org> wrote:
> > > > > Hi Pedro,
> > > > >
> > > > > From the issue that you linked it seems that you are using the LLVM
> > > > > OpenMP, whereas I believe the actual release uses libgomp (at least
> > > > > that's what seems to be the conclusion from this issue:
> > > > > https://github.com/apache/incubator-mxnet/issues/16891)?
> > > > >
> > > > > Przemek
> > > > >
> > > > > On 2020/02/04 03:42:30, Pedro Larroy <pedro.larroy.li...@gmail.com> wrote:
> > > > > > -1
> > > > > >
> > > > > > Unit tests passed in CPU build.
> > > > > >
> > > > > > I observe crashes related to openmp using cpp unit tests:
> > > > > >
> > > > > > https://github.com/apache/incubator-mxnet/issues/17043
> > > > > >
> > > > > > Pedro.
> > > > > >
> > > > > > On Mon, Feb 3, 2020 at 6:44 PM Chaitanya Bapat <chai.ba...@gmail.com> wrote:
> > > > > > > +1
> > > > > > > Successfully built MXNet 1.6.0rc2 on Linux
> > > > > > > Tested for OpPerf utility
> > > > > > > For CPU -
> > > > > > > https://gist.github.com/ChaiBapchya/d5ecc3e971c5a3c558d672477b4b6b9c
> > > > > > > Works well!
> > > > > > >
> > > > > > > On Mon, 3 Feb 2020 at 15:43, Lin Yuan <apefor...@gmail.com> wrote:
> > > > > > > > +1
> > > > > > > >
> > > > > > > > Tested Horovod with mnist example. My compiler flags are below:
> > > > > > > >
> > > > > > > > [✔ CUDA, ✔ CUDNN, ✔ NCCL, ✔ CUDA_RTC, ✖ TENSORRT, ✔ CPU_SSE,
> > > > > > > > ✔ CPU_SSE2, ✔ CPU_SSE3, ✔ CPU_SSE4_1, ✔ CPU_SSE4_2, ✖ CPU_SSE4A,
> > > > > > > > ✔ CPU_AVX, ✖ CPU_AVX2, ✔ OPENMP, ✖ SSE, ✔ F16C, ✖ JEMALLOC,
> > > > > > > > ✔ BLAS_OPEN, ✖ BLAS_ATLAS, ✖ BLAS_MKL, ✖ BLAS_APPLE, ✔ LAPACK,
> > > > > > > > ✖ MKLDNN, ✔ OPENCV, ✖ CAFFE, ✖ PROFILER, ✔ DIST_KVSTORE, ✖ CXX14,
> > > > > > > > ✖ INT64_TENSOR_SIZE, ✖ SIGNAL_HANDLER, ✖ DEBUG, ✖ TVM_OP]
> > > > > > > >
> > > > > > > > Lin
> > > > > > > >
> > > > > > > > On Sat, Feb 1, 2020 at 9:55 PM Tao Lv <ta...@apache.org> wrote:
> > > > > > > > > +1
> > > > > > > > >
> > > > > > > > > I tested below items:
> > > > > > > > > 1. download artifacts from Apache dist repo;
> > > > > > > > > 2. the signature looks good;
> > > > > > > > > 3. build from source code with MKL-DNN and MKL on centos;
> > > > > > > > > 4. run fp32 and int8 inference of ResNet50 under /example/quantization/.
> > > > > > > > >
> > > > > > > > > thanks,
> > > > > > > > > -tao
> > > > > > > > >
> > > > > > > > > On Sun, Feb 2, 2020 at 11:00 AM Tao Lv <ta...@apache.org> wrote:
> > > > > > > > > > I see. I was looking at this page:
> > > > > > > > > > https://github.com/apache/incubator-mxnet/releases/tag/1.6.0.rc2
> > > > > > > > > >
> > > > > > > > > > On Sun, Feb 2, 2020 at 4:54 AM Przemysław Trędak <ptre...@apache.org> wrote:
> > > > > > > > > > > Hi Tao,
> > > > > > > > > > >
> > > > > > > > > > > Could you tell me where did you look for it and did not find it? I
> > > > > > > > > > > just checked and both
> > > > > > > > > > > https://dist.apache.org/repos/dist/dev/incubator/mxnet/1.6.0.rc2/
> > > > > > > > > > > and draft of the release on GitHub have them.
> > > > > > > > > > >
> > > > > > > > > > > Thank you
> > > > > > > > > > > Przemek
> > > > > > > > > > >
> > > > > > > > > > > On 2020/02/01 14:23:11, Tao Lv <ta...@apache.org> wrote:
> > > > > > > > > > > > It seems the src tar and signature are missing from the tag.
> > > > > > > > > > > >
> > > > > > > > > > > > On Fri, Jan 31, 2020 at 11:09 AM Przemysław Trędak <ptre...@apache.org> wrote:
> > > > > > > > > > > > > Dear MXNet community,
> > > > > > > > > > > > >
> > > > > > > > > > > > > This is the vote to release Apache MXNet (incubating) version 1.6.0.
> > > > > > > > > > > > > Voting starts today and will close on Monday 2/3/2020 23:59 PST.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Link to release notes:
> > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/MXNET/1.6.0+Release+notes
> > > > > > > > > > > > >
> > > > > > > > > > > > > Link to release candidate:
> > > > > > > > > > > > > https://github.com/apache/incubator-mxnet/releases/tag/1.6.0.rc2
> > > > > > > > > > > > >
> > > > > > > > > > > > > Link to source and signatures on apache dist server:
> > > > > > > > > > > > > https://dist.apache.org/repos/dist/dev/incubator/mxnet/1.6.0.rc2/
> > > > > > > > > > > > >
> > > > > > > > > > > > > The differences compared to previous release candidate 1.6.0.rc1:
> > > > > > > > > > > > > * Fixes for license issues (#17361, #17375, #17370, #17460)
> > > > > > > > > > > > > * Bugfix for saving LSTM layer parameter (#17288)
> > > > > > > > > > > > > * Bugfix for downloading the model from model zoo from multiple processes (#17372)
> > > > > > > > > > > > > * Fixed a symbol.py in AMP for GluonNLP (#17408)
> > > > > > > > > > > > >
> > > > > > > > > > > > > Please remember to TEST first before voting accordingly:
> > > > > > > > > > > > > +1 = approve
> > > > > > > > > > > > > +0 = no opinion
> > > > > > > > > > > > > -1 = disapprove (provide reason)
> > > > > > > > > > > > >
> > > > > > > > > > > > > Best regards,
> > > > > > > > > > > > > Przemyslaw Tredak
> > > > > > >
> > > > > > > --
> > > > > > > *Chaitanya Prakash Bapat*