Hi MXNet Community,

Quick update on the progress of the fixes for the release:

[1] - Thanks to Leonard for finding the missing PRs that were in v1.7.x but 
missing in v1.x, and backporting them to v1.8.x
[2] - I've backported the combined commit with the missing PRs to v1.x to make 
sure this doesn’t happen in a future release
[3] - Fix for the oneDNN/intgemm problem
[4] - Fix for split_and_save 
[5] - Fix for setting attributes in reviewSubgraph

Tentatively we'll shoot for restarting the vote early next week once the 
remaining PRs with fixes are merged (we're making sure these fixes are 
backports from v1.x :-D).

Please reply if there are other PRs with fixes that need to be included in the 
v1.8.0 release.

Thanks and have a great weekend!
Sam

[1] https://github.com/apache/incubator-mxnet/pull/19262
[2] https://github.com/apache/incubator-mxnet/pull/19281 
[3] https://github.com/apache/incubator-mxnet/pull/19251
[4] https://github.com/apache/incubator-mxnet/pull/19267 
[5] https://github.com/apache/incubator-mxnet/pull/19278 

On 9/30/20, 9:53 PM, "Skalicky, Sam" <sska...@amazon.com> wrote:

    Thanks Leonard for picking up this work. Are you planning to open another
    PR that commits these PRs into v1.x too so this doesn’t happen again (if we
    ever release a 1.9 version)?

    Other than these 2 PRs are there any others that are required for the
    v1.8.0 release?

    https://github.com/apache/incubator-mxnet/pull/19251
    https://github.com/apache/incubator-mxnet/pull/19262

    Sam

    On 9/30/20, 9:03 PM, "Leonard Lausen" <lau...@apache.org> wrote:

        Thank you Sam for driving the release!

        I took a quick look at the missing commits from v1.7.x in v1.8.x via
        git cherry and applied them to v1.8.x. Please see
        https://github.com/apache/incubator-mxnet/pull/19262

        The missing code in v1.8.x is substantial (+675 −518) and I thus change
        my vote for the rc0 release to -1.

        I hope we can include checking for missing commits via git cherry in the
        release manager process going forward. It just takes a few minutes. If
        we want to streamline the process, we can do so by not squashing commits
        when porting from one branch to another, which reduces false positives
        in git cherry (commits detected as missing that were actually ported).

        Best regards
        Leonard
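
The missing-commit check described above could be scripted roughly as follows. This is a minimal sketch, not part of the original thread or of any MXNet release tooling: the helper names `parse_git_cherry` and `missing_commits` are hypothetical, and the `origin/` remote prefix on the branch names is an assumption. It relies on `git cherry -v <upstream> <head>` marking commits of `<head>` that have no equivalent patch in `<upstream>` with a leading `+`.

```python
import subprocess


def parse_git_cherry(output):
    """Return (sha, subject) pairs for the '+' lines of `git cherry -v` output.

    '+' marks a commit on <head> with no equivalent patch on <upstream>;
    '-' marks a commit whose patch is already present there.
    """
    missing = []
    for line in output.splitlines():
        parts = line.split(" ", 2)
        if len(parts) >= 2 and parts[0] == "+":
            subject = parts[2] if len(parts) == 3 else ""
            missing.append((parts[1], subject))
    return missing


def missing_commits(upstream, head, repo="."):
    """Run `git cherry -v upstream head` in `repo` (hypothetical helper)."""
    result = subprocess.run(
        ["git", "cherry", "-v", upstream, head],
        cwd=repo, check=True, capture_output=True, text=True,
    )
    return parse_git_cherry(result.stdout)


# Example call (assumes both branches have been fetched from a remote
# named "origin"):
#   for sha, subject in missing_commits("origin/v1.8.x", "origin/v1.7.x"):
#       print(sha[:10], subject)
```

Because git cherry compares patch contents, a commit that was squashed or modified while being ported is still reported with `+`, which is the false-positive case mentioned above.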

        On Wed, 2020-09-30 at 21:52 +0000, Skalicky, Sam wrote:
        > Hi MXNet Community,
        >
        > Quick summary on the status of the vote:
        >
        > 2  +1
        > 1 -0.9
        >
        > I spoke with Leonard offline, and the problem only impacts the 
specific
        > instance when running MKLDNN/oneDNN immediately after intgemm. We 
don’t expect
        > users to fall into this specific edge case, and so far the problem 
hasn’t been
        > reproduced on 1.8.x (even though it contains the same oneDNN and 
intgemm
        > components that are in the master branch). He proposed to not 
postpone the
        > release for this issue, but if other issues arise we should fix this 
one at
        > the same time.
        >
        > There are also still missing PRs that were in v1.7.x that were never 
committed
        > to v1.x branch. And so when branching from v1.x to create the v1.8.x 
branch
        > these PRs do not exist. Unfortunately no one has volunteered to port 
these to
        > v1.x and v1.8.x branches.
        >
        > I propose extending the vote until Friday October 2, 23:59:59 PDT to 
conclude
        > the discussion and get the remaining votes necessary.
        >
        > Thanks!
        > Sam
        >
        > On 9/29/20, 12:41 PM, "Skalicky, Sam" <sska...@amazon.com.INVALID> 
wrote:
        >
        >     There was no response from the community on the discussion thread 
[1]. So
        > the current state is the same.
        >
        >     [1]
        > 
https://lists.apache.org/thread.html/r31d491150029734c6041c1ae21929cd667eed27f590262c3f501c6b7%40%3Cdev.mxnet.apache.org%3E
        >
        >     On 9/29/20, 11:36 AM, "Xingjian SHI" <xsh...@connect.ust.hk> 
wrote:
        >
        >         Just one question regarding the 1.8.0.rc0. Are all PRs that 
are in
        > 1.7.0 included in 1.8.0? For example,
        > https://github.com/apache/incubator-mxnet/pull/18653
        >
        >         Thanks,
        >         Xingjian
        >
        >         On 9/29/20, 10:20 AM, "Leonard Lausen" <lau...@apache.org> 
wrote:
        >
        >             Thank you Aaron for trying the build and pointing out the 
issues.
        >
        >             On Mon, 2020-09-28 at 18:30 -0700, Aaron Markham wrote:
        >             > 2) Tried just doing a make. This fails because none of 
the
        > submodules are
        >             > there. [...]
        >
        >             I downloaded the rc from the link shared by Sam [1] and 
it does
        > include the
        >             submodules. Could you provide more details on your issue?
        >
        >             > Downloaded the tar.gz for the release and looked at the 
build
        > from
        >             source directions on the website, but these have you use 
cmake and
        > don't
        >             really tell you what to do...
        >
        >             The docs refer users to version-controlled files, as the 
build-
        > from-source guide
        >             on the website is shared among all versions, however the 
actual
        > build steps
        >             differ between versions. I think the best way to 
improve it
        > is to provide
        >             version-specific build from source instructions via the 
"version
        > selector"
        >             feature on the get started page. Contributions towards 
this goal
        > or other
        >             improvements would be great [2].
        >
        >             Thanks
        >             Leonard
        >
        >             [1]:
        > https://dist.apache.org/repos/dist/dev/incubator/mxnet/1.8.0.rc0/
        >             [2]: 
https://github.com/apache/incubator-mxnet/issues/18666
        >
        >
        > On 9/29/20, 10:09 AM, "Leonard Lausen" <lau...@apache.org> wrote:
        >
        >     Vote -0.9.
        >
        >     Piotr has clarified that onednn 1.6.3 (included in MXNet 1.8 rc0) 
wrongly
        >     handles zmm registers. Together with MXNet intgemm feature (also 
included
        > in 1.8
        >     rc0) this can yield NaN results if onednn gemm is executed some 
time after
        >     intgemm. [1]
        >
        >     Thanks
        >     Leonard
        >
        >     [1]:
        > 
https://github.com/apache/incubator-mxnet/pull/19185#issuecomment-700603056
        >             >
        >             >
        >             > On Mon, Sep 28, 2020 at 11:36 AM Skalicky, Sam <
        > sska...@amazon.com.invalid>
        >             > wrote:
        >             >
        >             > > Thanks for pointing this out Leonard. Has anyone been 
able to
        > reproduce
        >             > > the problem on 1.8.0.rc0?
        >             > >
        >             > > Either way, I would propose that we continue 
validating the
        > release as-is
        >             > > and see if we can find any other issues.
        >             > >
        >             > > Sam
        >             > >
        >             > > On 9/28/20, 10:22 AM, "Leonard Lausen" 
<lau...@apache.org>
        > wrote:
        >             > >
        >             > >     Thank you Sam for driving the 1.8 release!
        >             > >
        >             > >     As the included oneDNN package is known to produce NaN
        >             > >     results on the master branch [1] and is pending an upstream
        >             > >     fix by Intel, I'd suggest extending the vote until we have
        >             > >     clarity on whether the bug also affects the 1.8 release, given
        >             > >     that oneDNN is enabled in the default configuration [2].
        >             > >
        >             > >     [1]:
        >             > >
        > 
https://github.com/apache/incubator-mxnet/pull/19185#issuecomment-698274033
        >             > >     [2]:
        >             > >
        >             > >
        > 
https://lists.apache.org/thread.html/1a22dbd79098adab6d02d16e8d607bae2acc908c0bb1b085d28a51ba@%3Cdev.mxnet.apache.org%3E
        >             > >
        >             > >     On Sun, 2020-09-27 at 18:28 -0700, sandeep 
krishnamurthy
        > wrote:
        >             > >     > Sam,
        >             > >     >
        >             > >     > Thank you for driving the v1.8.0 release of 
MXNet. This
        > is exciting
        >             > > given
        >             > >     > it is coming with CUDA11 and cuDNN8!!
        >             > >     >
        >             > >     > Fixing the release candidate link:
        >             > >     >
        > https://github.com/apache/incubator-mxnet/releases/tag/1.8.0.rc0
        >             > >     >
        >             > >     > Best,
        >             > >     > Sandeep
        >             > >     >
        >             > >     >
        >             > >     > On Fri, Sep 25, 2020 at 11:24 PM Skalicky, Sam
        >             > > <sska...@amazon.com.invalid>
        >             > >     > wrote:
        >             > >     >
        >             > >     > > Dear MXNet community,
        >             > >     > >
        >             > >     > > This is the vote to release Apache MXNet 
(incubating)
        > version
        >             > > 1.8.0.
        >             > >     > > Voting will start September 26, 23:59:59 PDT 
and close
        > on
        >             > > September 29,
        >             > >     > > 23:59:59 PDT.
        >             > >     > >
        >             > >     > > Link to release notes:
        >             > >     > >
        >             > >
        > https://cwiki.apache.org/confluence/display/MXNET/1.8.0+Release+Notes
        >             > >     > >
        >             > >     > > Link to release candidate:
        >             > >     > >
        > https://github.com/apache/incubator-mxnet/releases/tag/1.7.0.rc0
        >             > >     > >
        >             > >     > > Link to source and signatures on apache dist 
server:
        >             > >     > >
        > https://dist.apache.org/repos/dist/dev/incubator/mxnet/1.8.0.rc0/
        >             > >     > >
        >             > >     > > Please remember to TEST first before voting
        > accordingly:
        >             > >     > > +1 = approve
        >             > >     > > +0 = no opinion
        >             > >     > > -1 = disapprove (provide reason)
        >             > >     > >
        >             > >     > > Best regards,
        >             > >     > > Sam Skalicky
        >             > >     > >
        >             > >     > >
        >             > >     >
        >             > >     > --
        >             > >     > Sandeep Krishnamurthy
        >             > >
        >             > >
        >             > >
        >
        >
        >
        >


