Thanks Leonard for picking up this work. Are you planning to open another PR 
that commits these PRs into v1.x too so this doesn’t happen again (if we ever 
release a 1.9 version)? 

Other than these 2 PRs are there any others that are required for the v1.8.0 
release?

https://github.com/apache/incubator-mxnet/pull/19251
https://github.com/apache/incubator-mxnet/pull/19262

Sam

On 9/30/20, 9:03 PM, "Leonard Lausen" <lau...@apache.org> wrote:

    CAUTION: This email originated from outside of the organization. Do not
    click links or open attachments unless you can confirm the sender and know
    the content is safe.



    Thank you Sam for driving the release!

    I took a quick look at the missing commits from v1.7.x in v1.8.x via the git
    cherry mode and applied them to v1.8.x. Please see
    https://github.com/apache/incubator-mxnet/pull/19262

    The missing code in v1.8.x is substantial (+675 −518) and I thus change my
    vote for the rc0 release to -1.

    I hope we can include checking for missing commits via git cherry mode in
    the release manager process going forward. It takes just a few minutes. If
    we want to streamline the process, we can do so by not squashing commits
    when porting from one branch to another, which reduces false positives in
    git cherry mode (commits detected as missing that were actually ported).
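
The check described above could be scripted roughly as follows. This is only a minimal sketch, not part of the MXNet release tooling: the helper names `parse_git_cherry` and `missing_commits` are hypothetical, and it assumes `git` is on the PATH and that both branches exist in the local clone. It relies on `git cherry -v <upstream> <head>`, which prefixes commits on `<head>` with `+` when no patch-equivalent commit exists on `<upstream>` and with `-` when one does.

```python
import subprocess
from typing import List, Tuple


def parse_git_cherry(output: str) -> List[Tuple[str, str]]:
    """Return (sha, subject) for every '+' line of `git cherry -v` output.

    '+' marks a commit on <head> with no patch-equivalent on <upstream>;
    '-' marks a commit that already has an equivalent and is skipped.
    """
    missing = []
    for line in output.splitlines():
        if not line.startswith("+ "):
            continue
        parts = line[2:].split(" ", 1)
        sha = parts[0]
        subject = parts[1] if len(parts) > 1 else ""
        missing.append((sha, subject))
    return missing


def missing_commits(upstream: str, head: str, repo: str = ".") -> List[Tuple[str, str]]:
    """Run `git cherry -v upstream head` in `repo` and list unported commits."""
    result = subprocess.run(
        ["git", "cherry", "-v", upstream, head],
        cwd=repo,
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_git_cherry(result.stdout)
```

For example, `missing_commits("v1.8.x", "v1.7.x")` would list commits on v1.7.x that have no patch-equivalent on v1.8.x (the branch names may need a remote prefix such as `origin/`). Because git cherry compares patch IDs, a commit that was squashed or modified while being ported has a different diff and is still reported with `+`, which is the false-positive case mentioned above.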

    Best regards
    Leonard

    On Wed, 2020-09-30 at 21:52 +0000, Skalicky, Sam wrote:
    > Hi MXNet Community,
    >
    > Quick summary on the status of the vote:
    >
    > 2  +1
    > 1 -0.9
    >
    > I spoke with Leonard offline, and the problem only impacts the specific
    > instance when running MKLDNN/oneDNN immediately after intgemm. We don’t
    > expect users to fall into this specific edge case, and so far the problem
    > hasn’t been reproduced on 1.8.x (even though it contains the same oneDNN
    > and intgemm components that are in the master branch). He proposed to not
    > postpone the release for this issue, but if other issues arise we should
    > fix this one at the same time.
    >
    > There are also still missing PRs that were in v1.7.x but were never
    > committed to the v1.x branch, so when the v1.8.x branch was created from
    > v1.x these PRs were not included. Unfortunately no one has volunteered to
    > port these to the v1.x and v1.8.x branches.
    >
    > I propose extending the vote until Friday October 2, 23:59:59 PDT to
    > conclude the discussion and get the remaining votes necessary.
    >
    > Thanks!
    > Sam
    >
    > On 9/29/20, 12:41 PM, "Skalicky, Sam" <sska...@amazon.com.INVALID> wrote:
    >
    >     There was no response from the community on the discussion thread
    >     [1]. So the current state is the same.
    >
    >     [1] https://lists.apache.org/thread.html/r31d491150029734c6041c1ae21929cd667eed27f590262c3f501c6b7%40%3Cdev.mxnet.apache.org%3E
    >
    >     On 9/29/20, 11:36 AM, "Xingjian SHI" <xsh...@connect.ust.hk> wrote:
    >
    >
    >         Just one question regarding the 1.8.0.rc0. Are all PRs that are in
    >         1.7.0 included in 1.8.0? For example,
    >         https://github.com/apache/incubator-mxnet/pull/18653
    >
    >         Thanks,
    >         Xingjian
    >
    >         On 9/29/20, 10:20 AM, "Leonard Lausen" <lau...@apache.org> wrote:
    >
    >             Thank you Aaron for trying the build and pointing out the issues.
    >
    >             On Mon, 2020-09-28 at 18:30 -0700, Aaron Markham wrote:
    >             > 2) Tried just doing a make. This fails because none of the
    >             > submodules are there. [...]
    >
    >             I downloaded the rc from the link shared by Sam [1] and it does
    >             include the submodules. Could you provide more details on your
    >             issue?
    >
    >             > Downloaded the tar.gz for the release and looked at the build
    >             > from source directions on the website, but these have you use
    >             > cmake and don't really tell you what to do...
    >
    >             The docs refer users to version-controlled files because the
    >             build-from-source guide on the website is shared among all
    >             versions, whereas the actual build steps differ between versions.
    >             I think the best way to improve this is to provide
    >             version-specific build-from-source instructions via the "version
    >             selector" feature on the get started page. Contributions towards
    >             this goal or other improvements would be great [2].
    >
    >             Thanks
    >             Leonard
    >
    >             [1]: https://dist.apache.org/repos/dist/dev/incubator/mxnet/1.8.0.rc0/
    >             [2]: https://github.com/apache/incubator-mxnet/issues/18666
    >
    >
    > On 9/29/20, 10:09 AM, "Leonard Lausen" <lau...@apache.org> wrote:
    >
    >
    >     Vote -0.9.
    >
    >     Piotr has clarified that onednn 1.6.3 (included in MXNet 1.8 rc0)
    >     wrongly handles zmm registers. Together with the MXNet intgemm feature
    >     (also included in 1.8 rc0), this can yield NaN results if onednn gemm is
    >     executed some time after intgemm. [1]
    >
    >     Thanks
    >     Leonard
    >
    >     [1]: https://github.com/apache/incubator-mxnet/pull/19185#issuecomment-700603056
    >             >
    >             >
    >             > On Mon, Sep 28, 2020 at 11:36 AM Skalicky, Sam
    >             > <sska...@amazon.com.invalid> wrote:
    >             >
    >             > > Thanks for pointing this out Leonard. Has anyone been able to
    >             > > reproduce the problem on 1.8.0.rc0?
    >             > >
    >             > > Either way, I would propose that we continue validating the
    >             > > release as-is and see if we can find any other issues.
    >             > >
    >             > > Sam
    >             > >
    >             > > On 9/28/20, 10:22 AM, "Leonard Lausen" <lau...@apache.org> wrote:
    >             > >
    >             > >
    >             > >     Thank you Sam for driving the 1.8 release!
    >             > >
    >             > >     As the included oneDNN package is known to produce NaN
    >             > >     results on the master branch [1] and is pending an upstream
    >             > >     fix by Intel, I'd suggest extending the vote until we have
    >             > >     clarity on whether the bug also affects the 1.8 release,
    >             > >     given that oneDNN is enabled in the default configuration [2].
    >             > >
    >             > >     [1]: https://github.com/apache/incubator-mxnet/pull/19185#issuecomment-698274033
    >             > >     [2]: https://lists.apache.org/thread.html/1a22dbd79098adab6d02d16e8d607bae2acc908c0bb1b085d28a51ba@%3Cdev.mxnet.apache.org%3E
    >             > >
    >             > >     On Sun, 2020-09-27 at 18:28 -0700, sandeep krishnamurthy wrote:
    >             > >     > Sam,
    >             > >     >
    >             > >     > Thank you for driving the v1.8.0 release of MXNet. This is
    >             > >     > exciting given it is coming with CUDA11 and cuDNN8!!
    >             > >     >
    >             > >     > Fixing the release candidate link:
    >             > >     > https://github.com/apache/incubator-mxnet/releases/tag/1.8.0.rc0
    >             > >     >
    >             > >     > Best,
    >             > >     > Sandeep
    >             > >     >
    >             > >     >
    >             > >     > On Fri, Sep 25, 2020 at 11:24 PM Skalicky, Sam
    >             > >     > <sska...@amazon.com.invalid> wrote:
    >             > >     >
    >             > >     > > Dear MXNet community,
    >             > >     > >
    >             > >     > > This is the vote to release Apache MXNet (incubating)
    >             > >     > > version 1.8.0. Voting will start September 26, 23:59:59 PDT
    >             > >     > > and close on September 29, 23:59:59 PDT.
    >             > >     > >
    >             > >     > > Link to release notes:
    >             > >     > > https://cwiki.apache.org/confluence/display/MXNET/1.8.0+Release+Notes
    >             > >     > >
    >             > >     > > Link to release candidate:
    >             > >     > > https://github.com/apache/incubator-mxnet/releases/tag/1.7.0.rc0
    >             > >     > >
    >             > >     > > Link to source and signatures on apache dist server:
    >             > >     > > https://dist.apache.org/repos/dist/dev/incubator/mxnet/1.8.0.rc0/
    >             > >     > >
    >             > >     > > Please remember to TEST first before voting accordingly:
    >             > >     > > +1 = approve
    >             > >     > > +0 = no opinion
    >             > >     > > -1 = disapprove (provide reason)
    >             > >     > >
    >             > >     > > Best regards,
    >             > >     > > Sam Skalicky
    >             > >     > >
    >             > >     > >
    >             > >     >
    >             > >     > --
    >             > >     > Sandeep Krishnamurthy
    >             > >
    >             > >
    >             > >
    >
    >
    >
    >

