I agree that we should have a repeatable process for generating artifacts. It would be useful for Apache release reviewers to be able to double-check the results we get in CI, and it would help give users a consistent experience.
I'm a little uncomfortable with the idea of generating the actual artifacts from the CI account. The CI account is designed to run arbitrary code from the internet. Generating a native binary that gets distributed to a bunch of computers from this account seems like an unnecessary security risk. We could very simply run artifact builds in a different environment for which the entire internet does not have execute permissions. I'm not strongly against the idea of releasing from the current CI account, but I think we should be careful about how tightly we want to couple these processes.

On Tue, Jul 31, 2018 at 3:35 AM Hagay Lupesko <lupe...@gmail.com> wrote:

> Thanks Pedro.
> Good to know you think it is important as well. I hope the community can
> review a proposal on the CWiki soon? That would be great...
>
> On Mon, Jul 30, 2018 at 4:26 AM Pedro Larroy <pedro.larroy.li...@gmail.com>
> wrote:
>
> > Hi Hagay
> >
> > We are aware of this and we are working in this direction which, as you
> > point out, is more desirable.
> > There's a huge amount of non-trivial work that has gone into building these
> > distribution packages from Sheng which needs to be adapted for our CI
> > system, and taken into consideration.
> >
> > Pedro.
> >
> > On Mon, Jul 30, 2018 at 9:07 AM Hagay Lupesko <lupe...@gmail.com> wrote:
> >
> > > Thanks Tong for root-causing the issue!
> > > Thanks Sheng for following up with an updated PyPi package.
> > >
> > > What worries me is that we seem to build MXNet PyPi distribution packages
> > > with a build config different than the CI where all of the tests are
> > > running.
> > > Looking here [1] it seems that MXNet CI Ubuntu build uses
> > > libopenblas-dev v0.2.18, while PyPi build for MXNet 1.2.1 used v0.3.2
> > > (I would imagine PyPi distribution?)
> > >
> > > Needless to say that if we don't make sure PyPi distribution is aligned
> > > with the CI build, similar issues can happen again with other dependencies.
> > > I'd think we want the build configs to be the same, or better yet have the
> > > PyPi package be built from the output produced by the CI.
> > > Thoughts?
> > >
> > > [1]
> > > https://github.com/apache/incubator-mxnet/blob/master/ci/docker/install/ubuntu_core.sh
> > >
> > > On Fri, Jul 27, 2018 at 11:31 AM Sheng Zha <szha....@gmail.com> wrote:
> > >
> > > > Tong,
> > > >
> > > > That's great news. I'm glad that OpenBLAS people are responding so
> > > > quickly.
> > > > In that case it's probably a better idea to use that version instead. The
> > > > latest OpenBLAS version brings many optimizations for all kinds of
> > > > hardware.
> > > >
> > > > -sz
> > > >
> > > > On Fri, Jul 27, 2018 at 11:10 AM, Tong He <hetong...@gmail.com> wrote:
> > > >
> > > > > Hi Sheng,
> > > > >
> > > > > I also opened an issue on OpenBLAS repo:
> > > > > https://github.com/xianyi/OpenBLAS/issues/1700 .
> > > > >
> > > > > As informed that "0.3.2 should be released this weekend", I tested their
> > > > > develop branch as well, and it seems the new version has fixed the bug.
> > > > >
> > > > > Since OpenBLAS 0.3.2 could also have performance improvement, I
> > > > > propose to wait for OpenBLAS 0.3.2 for our pip post release.
> > > > >
> > > > > Best regards,
> > > > >
> > > > > Tong He
> > > > >
> > > > > 2018-07-27 10:54 GMT-07:00 Sheng Zha <szha....@gmail.com>:
> > > > >
> > > > > > Forgot to mention, the post release version is a pip package version.
> > > > > >
> > > > > > -sz
> > > > > >
> > > > > > > On Jul 27, 2018, at 10:42 AM, Sheng Zha <szha....@gmail.com> wrote:
> > > > > > >
> > > > > > > In this case we can regard it as a release problem, which is usually
> > > > > > > what post release versions are for. It's still the same release with
> > > > > > > different dependency, so there is no code change needed.
> > > > > > >
> > > > > > > -sz
> > > > > > >
> > > > > > >> On Jul 27, 2018, at 8:31 AM, Steffen Rochel <steffenroc...@gmail.com>
> > > > > > >> wrote:
> > > > > > >>
> > > > > > >> Hi Tong - thanks for root causing the problem.
> > > > > > >> Sheng - what is 1.2.1.post0? Shouldn't a patch with fix be released as
> > > > > > >> 1.2.2?
> > > > > > >> Steffen
> > > > > > >>
> > > > > > >>> On Thu, Jul 26, 2018 at 5:33 PM Sheng Zha <szha....@gmail.com> wrote:
> > > > > > >>>
> > > > > > >>> Dear users and developers of Apache MXNet (Incubating),
> > > > > > >>>
> > > > > > >>> Thanks to Tong's dedication, the root cause for this issue was
> > > > > > >>> identified to be instability in OpenBLAS's latest stable version
> > > > > > >>> 0.3.1. For details, see Tong's comment
> > > > > > >>> <https://github.com/apache/incubator-mxnet/issues/11853#issuecomment-408272772>.
> > > > > > >>>
> > > > > > >>> Since both the nightly build and the 1.2.1 wheels are affected, we
> > > > > > >>> recommend that we stay on OpenBLAS last known stable version 0.2.20
> > > > > > >>> that we've been using. I will assume lazy consensus and prepare the
> > > > > > >>> fix (1.2.1.post0).
> > > > > > >>>
> > > > > > >>> -sz
> > > > > > >>>
> > > > > > >>>> On Tue, Jul 24, 2018 at 3:35 PM, Tong He <t...@apache.org> wrote:
> > > > > > >>>>
> > > > > > >>>> Recently there's an issue regarding the inconsistent result from
> > > > > > >>>> gluon forward:
> > > > > > >>>>
> > > > > > >>>> https://github.com/apache/incubator-mxnet/issues/11853
> > > > > > >>>>
> > > > > > >>>> Given a constant input image and loaded pretrained parameters, we
> > > > > > >>>> expect a deterministic output from arbitrary repeats of forwards.
> > > > > > >>>> However from the issue I see that the forwarded result is
> > > > > > >>>> non-deterministic. It is harmful as it makes the results from
> > > > > > >>>> experiments/benchmarks/inference meaningless.
> > > > > > >>>>
> > > > > > >>>> Therefore I propose to block the 1.3 release before it gets
> > > > > > >>>> resolved.