Thanks, Pedro. Good to know you think it is important as well. I hope the community can review a proposal on the CWiki soon; that would be great.

On Mon, Jul 30, 2018 at 4:26 AM Pedro Larroy <pedro.larroy.li...@gmail.com> wrote:

> Hi Hagay
>
> We are aware of this and we are working in this direction which, as you
> point out, is more desirable.
> There's a huge amount of non-trivial work that has gone into building these
> distribution packages from Sheng which needs to be adapted for our CI
> system, and taken into consideration.
>
> Pedro.
>
> On Mon, Jul 30, 2018 at 9:07 AM Hagay Lupesko <lupe...@gmail.com> wrote:
>
> > Thanks Tong for root-causing the issue!
> > Thanks Sheng for following up with an updated PyPi package.
> >
> > What worries me is that we seem to build MXNet PyPi distribution packages
> > with a build config different than the CI where all of the tests are
> > running.
> > Looking here [1] it seems that MXNet CI Ubuntu build uses
> > libopenblas-dev v0.2.18, while PyPi build for MXNet 1.2.1 used v0.3.2
> > (I would imagine PyPi distribution?)
> >
> > Needless to say that if we don't make sure PyPi distribution is aligned
> > with the CI build, similar issues can happen again with other dependencies.
> > I'd think we want the build configs to be the same, or better yet have the
> > PyPi package be built from the output produced by the CI.
> > Thoughts?
> >
> > [1] https://github.com/apache/incubator-mxnet/blob/master/ci/docker/install/ubuntu_core.sh
> >
> > On Fri, Jul 27, 2018 at 11:31 AM Sheng Zha <szha....@gmail.com> wrote:
> >
> > > Tong,
> > >
> > > That's great news. I'm glad that OpenBLAS people are responding so quickly.
> > > In that case it's probably a better idea to use that version instead. The
> > > latest OpenBLAS version brings many optimizations for all kinds of hardware.
> > >
> > > -sz
> > >
> > > On Fri, Jul 27, 2018 at 11:10 AM, Tong He <hetong...@gmail.com> wrote:
> > >
> > > > Hi Sheng,
> > > >
> > > > I also opened an issue on OpenBLAS repo:
> > > > https://github.com/xianyi/OpenBLAS/issues/1700 .
> > > >
> > > > As informed that "0.3.2 should be released this weekend", I tested their
> > > > develop branch as well, and it seems the new version has fixed the bug.
> > > >
> > > > Since OpenBLAS 0.3.2 could also have performance improvement, I
> > > > propose to wait for OpenBLAS 0.3.2 for our pip post release.
> > > >
> > > > Best regards,
> > > >
> > > > Tong He
> > > >
> > > > 2018-07-27 10:54 GMT-07:00 Sheng Zha <szha....@gmail.com>:
> > > >
> > > > > Forgot to mention, the post release version is a pip package version.
> > > > >
> > > > > -sz
> > > > >
> > > > > > On Jul 27, 2018, at 10:42 AM, Sheng Zha <szha....@gmail.com> wrote:
> > > > > >
> > > > > > In this case we can regard it as a release problem, which is usually
> > > > > > what post release versions are for. It’s still the same release with
> > > > > > different dependency, so there is no code change needed.
> > > > > >
> > > > > > -sz
> > > > > >
> > > > > >> On Jul 27, 2018, at 8:31 AM, Steffen Rochel <steffenroc...@gmail.com> wrote:
> > > > > >>
> > > > > >> Hi Tong - thanks for root causing the problem.
> > > > > >> Sheng - what is 1.2.1.post0? Shouldn't a patch with fix be released as
> > > > > >> 1.2.2?
> > > > > >> Steffen
> > > > > >>
> > > > > >>> On Thu, Jul 26, 2018 at 5:33 PM Sheng Zha <szha....@gmail.com> wrote:
> > > > > >>>
> > > > > >>> Dear users and developers of Apache MXNet (Incubating),
> > > > > >>>
> > > > > >>> Thanks to Tong's dedication, the root cause for this issue was identified
> > > > > >>> to be instability in OpenBLAS's latest stable version 0.3.1. For details,
> > > > > >>> see Tong's comment
> > > > > >>> <https://github.com/apache/incubator-mxnet/issues/11853#issuecomment-408272772>.
> > > > > >>>
> > > > > >>> Since both the nightly build and the 1.2.1 wheels are affected, we
> > > > > >>> recommend that we stay on OpenBLAS last known stable version 0.2.20 that
> > > > > >>> we've been using. I will assume lazy consensus and prepare the fix
> > > > > >>> (1.2.1.post0).
> > > > > >>>
> > > > > >>> -sz
> > > > > >>>
> > > > > >>>> On Tue, Jul 24, 2018 at 3:35 PM, Tong He <t...@apache.org> wrote:
> > > > > >>>>
> > > > > >>>> Recently there's an issue regarding the inconsistent result from gluon
> > > > > >>>> forward:
> > > > > >>>>
> > > > > >>>> https://github.com/apache/incubator-mxnet/issues/11853
> > > > > >>>>
> > > > > >>>> Given a constant input image and loaded pretrained parameters, we expect a
> > > > > >>>> deterministic output from arbitrary repeats of forwards. However from the
> > > > > >>>> issue I see that the forwarded result is non-deterministic. It is harmful as
> > > > > >>>> it makes the results from experiments/benchmarks/inference meaningless.
> > > > > >>>>
> > > > > >>>> Therefore I propose to block the 1.3 release before it gets resolved.