Re: MXNet Podling Report - October

2018-10-02 Thread Michael Wall
Hi Haibin,

A couple of things I thought of when reviewing the report
1 - No answer for the question "Any issues that the Incubator PMC (IPMC) or
ASF Board wish/need to be aware of?"
2 - What are the links to the Medium blog, the YouTube channel and the
Twitter account?
3 - Where did the 62 tutorials come from?  Mostly my own interest to see
them.
4 - How were the GitHub stats calculated?  For Sep 2018 issues created, the
report reads 87, but I come up with 112 (the sum of open and closed from
https://github.com/apache/incubator-mxnet/issues?utf8=%E2%9C%93&q=is%3Aissue+created%3A2018-09-01..2018-09-30+).
For Sep 2018 issues closed, the report reads 124, whereas I get 110 (
https://github.com/apache/incubator-mxnet/issues?utf8=%E2%9C%93&q=is%3Aissue+closed%3A2018-09-01..2018-09-30+).
But my numbers cover only the incubator-mxnet repo; they don't include
incubator-mxnet-site or incubator-mxnet-test.  The links could be included
in the report as well, which might help to generate the numbers next time.
5 - You mention new mentors were added, but not why.  If the board asks, it
will be a follow-up task.
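
A minimal sketch of how the counts in item 4 could be regenerated for all three repos, assuming the GitHub search API (`/search/issues`, whose response carries a `total_count`) is queried with the same `is:issue created:...` / `closed:...` qualifiers as the links above. The helper names are hypothetical, unauthenticated requests are rate-limited, and this is not necessarily how the report's numbers were produced.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Repositories named in this thread.
REPOS = [
    "apache/incubator-mxnet",
    "apache/incubator-mxnet-site",
    "apache/incubator-mxnet-test",
]


def search_url(repo, field, start, end):
    """Build a search API URL for issues created/closed in a date range.

    `field` is "created" or "closed"; dates are YYYY-MM-DD strings.
    """
    if field not in ("created", "closed"):
        raise ValueError("field must be 'created' or 'closed'")
    query = f"repo:{repo} is:issue {field}:{start}..{end}"
    # per_page=1 keeps the response small; only total_count is needed.
    return "https://api.github.com/search/issues?" + urlencode(
        {"q": query, "per_page": 1}
    )


def count_issues(repo, field, start, end):
    """Return the number of matching issues (requires network access)."""
    with urlopen(search_url(repo, field, start, end)) as resp:
        return json.load(resp)["total_count"]


def monthly_totals(start, end, repos=REPOS):
    """Sum created and closed issue counts across all listed repos."""
    return {
        field: sum(count_issues(repo, field, start, end) for repo in repos)
        for field in ("created", "closed")
    }
```

Calling `monthly_totals("2018-09-01", "2018-09-30")` would return a dict with the summed `created` and `closed` counts; the generated URLs could also be pasted into the report alongside the numbers.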

Mike

On Mon, Oct 1, 2018 at 8:33 PM Haibin Lin  wrote:

> Hi MXNet community,
>
> The podling report for MXNet is due on October 3rd. The report covers
> MXNet's progress on community development and project development (the
> previous one can be found here).
> You can search "MXNet" at https://wiki.apache.org/incubator/October2018
> for MXNet's draft report for October. Please help review and contribute
> to the report before it's due.
>
> If you have any suggestions on improving the report, please let me know and
> I'm happy to update the report based on the feedback. Thanks!
>
> Best regards,
> Haibin
>


Re: Which merge option to use on the Import Julia binding PR?

2018-10-02 Thread Carin Meier
Marco - Thanks for the "dry run" idea. It will give everyone a clear idea
of the process and what the expected results will look like.

- I took my fork of the repo and synced my master branch.
- @iblis17 made a copy of the branch of the Julia import PR and submitted
it to my repo
- I merged it with the "Merge" option through the web interface.

Here is a gif of the process of merging: http://g.recordit.co/DzBcFtnjmV.gif
Here is the result of the repo: https://github.com/gigasquid/incubator-mxnet
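
For anyone comparing the merge options, here is a toy Python model of a commit graph that illustrates the difference between a merge commit and a squash. It is only a sketch, not git itself: the `Commit` class and the commit names (`m1`, `j1`, ...) are made up, and it assumes the Julia branch has its own root, as discussed earlier in the thread.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Commit:
    name: str
    parents: tuple = ()


def reachable(tip):
    """Return every commit reachable from `tip` by following parents."""
    seen, stack = {}, [tip]
    while stack:
        commit = stack.pop()
        if commit.name not in seen:
            seen[commit.name] = commit
            stack.extend(commit.parents)
    return list(seen.values())


def is_linear(tip):
    """True if no reachable commit has more than one parent."""
    return all(len(c.parents) <= 1 for c in reachable(tip))


# Two independent histories: the main repo and the imported Julia branch.
m1 = Commit("m1")
m2 = Commit("m2", (m1,))
j1 = Commit("j1")
j2 = Commit("j2", (j1,))

# Merge commit: two parents, so both histories stay reachable.
merge = Commit("merge", (m2, j2))

# Squash: one new commit on top of master; the Julia commits are not kept.
squash = Commit("squash", (m2,))
```

In this model `reachable(merge)` includes `j1` and `j2` while `is_linear(merge)` is false, which is the "two tails" shape; `squash` stays linear but loses the individual Julia commits.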

Please everyone take a look and validate that this looks ok.

If there are no objections, Marco - could you please take the lead in
requesting the actions from INFRA?

It will be great to *finally* get this PR in  :)

Thanks,
Carin





On Sat, Sep 29, 2018 at 10:02 PM Chiyuan Zhang  wrote:

> Sorry, here is the image: https://imgur.com/V5wd2XB
>
> And here is the github document on the 3 different merge options for the
> web UI button: https://help.github.com/articles/about-pull-request-merges/
>
> On Sat, Sep 29, 2018 at 6:48 PM Marco de Abreu
>  wrote:
>
> > Could you upload the picture somewhere please? HTML is being stripped out
> > on email lists.
> >
> > Chiyuan Zhang  schrieb am So., 30. Sep. 2018, 03:44:
> >
> > > There is an option in the repo settings menu to disable or enable
> > > merge-commit for PR, see a screenshot below (from a different github
> > > project):
> > >
> > > [image: image.png]
> > >
> > > > My guess is that this is disabled to avoid creating a non-linear
> > > > history for standard PRs (as opposed to because of a technical
> > > > problem). But this is only my guess; it would be great if someone
> > > > could confirm.
> > >
> > > Best,
> > > Chiyuan
> > >
> > > On Sat, Sep 29, 2018 at 3:50 AM Carin Meier 
> > wrote:
> > >
> > >> I believe so, but if someone wants to confirm it would be great.
> > >> Unfortunately, I just came down with a cold/flu so I will be out of
> > >> communication for a bit
> > >>
> > >> On Fri, Sep 28, 2018 at 9:51 PM Marco de Abreu
> > >>  wrote:
> > >>
> > >> > Are we sure that this is due to lacking permissions and not because
> > >> > of some technical limitation? If we are certain, we can ask our
> > >> > mentors to create a ticket with Apache Infra to make that switch.
> > >> >
> > >> > -Marco
> > >> >
> > >> > Carin Meier  schrieb am Sa., 29. Sep. 2018,
> > >> 01:17:
> > >> >
> > >> > > I made a test regular merge commit into a copy of master. It seemed
> > >> > > to go fine. Here is a listing of what it will look like for everyone:
> > >> > >
> > >> > > https://github.com/apache/incubator-mxnet/commits/test-merge-julia-import
> > >> > >
> > >> > > Although I would be happy to push the merge button, I think the most
> > >> > > important thing is to get the PR merged, so whatever way is the best
> > >> > > to make that happen, let's do it.
> > >> > >
> > >> > > So - Does the regular merge seem like a good option?
> > >> > > If so, what is the best way to make that happen?
> > >> > >
> > >> > >
> > >> > > On Fri, Sep 28, 2018 at 6:05 PM Chiyuan Zhang 
> > >> wrote:
> > >> > >
> > >> > > > Agreed with Pedro. Maybe the merge-commit option in the GitHub
> > >> > > > interface was disabled for a reason. But as Pedro said, maybe it is
> > >> > > > good to temporarily enable it for this PR and merge using that.
> > >> > > >
> > >> > > > - It should be technically easier than rebasing, given the
> > >> > > >   git-subtree-import issue we are currently having.
> > >> > > > - It also avoids stacking a huge commit history on *top* of the
> > >> > > >   current history.
> > >> > > > - The downside is probably that the history of the project is no
> > >> > > >   longer linear, but I think this is actually what we would like to
> > >> > > >   have in this particular case, because the contents of the main
> > >> > > >   repo and the julia branch do not actually overlap. So it makes
> > >> > > >   sense to have two tails, each with its own history.
> > >> > > >
> > >> > > > Carin: I guess if someone with admin permission on GitHub could
> > >> > > > temporarily enable the merge-commit option, then pushing the button
> > >> > > > on the web might simply work.
> > >> > > >
> > >> > > > Best,
> > >> > > > Chiyuan
> > >> > > >
> > >> > > > On Fri, Sep 28, 2018 at 2:53 PM Carin Meier <carinme...@gmail.com> wrote:
> > >> > > >
> > >> > > > > Pedro - Maybe a merge commit is a better answer in this case. I
> > >> > > > > originally ruled it out since it wasn't an option in the GitHub web
> > >> > > > > interface, but since this looks like it is going to have to be done
> > >> > > > > outside it because of the subtrees anyway, it might be a better

Re: CUDNN algorithm selection failure

2018-10-02 Thread Marco de Abreu
I have created an issue at
https://github.com/apache/incubator-mxnet/issues/12715 and a PR to disable
the test at https://github.com/apache/incubator-mxnet/pull/12716.

This test is pretty new and was submitted with a number of other
problematic (and disabled) tests:
https://github.com/apache/incubator-mxnet/issues/11164
It could be that the test is simply not stable enough. The PR that introduced that test
is https://github.com/apache/incubator-mxnet/pull/10921 - it was merged two
days ago.

Best regards,
Marco

On Tue, Oct 2, 2018 at 8:43 AM Pedro Larroy 
wrote:

> Thanks for checking Lin. If it happens again we will have to dig deeper. We
> have just one executor in GPU so I wonder what could be the root cause of
> this.
>
> On Mon, Oct 1, 2018 at 10:57 PM Lin Yuan  wrote:
>
> > I could not reproduce the error on an EC2 g3x8 instance, making it hard to
> > debug. I also suspect it was due to a resource usage limit on the CI
> > instance.
> >
> > On Mon, Oct 1, 2018 at 10:40 PM Pedro Larroy <pedro.larroy.li...@gmail.com> wrote:
> >
> > > It doesn't look like flakiness to me at first sight. I think it might be
> > > related to resource usage / allocation / leak in the worst case.
> > >
> > > It could be that there was not enough GPU memory at the time of test
> > > execution. But I'm just speculating, hence my original question.
> > >
> > > Pedro.
> > >
> > > On Mon, Oct 1, 2018 at 8:16 PM Lin Yuan  wrote:
> > >
> > > > Hi Pedro,
> > > >
> > > > I also got this failure in my PR:
> > > >
> > > > http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/incubator-mxnet/detail/PR-11742/27/pipeline
> > > >
> > > > I was not able to identify the root cause of it from the changelist.
> > > > Are you suggesting there is some flakiness in the master branch too?
> > > >
> > > > Thanks,
> > > >
> > > > Lin
> > > >
> > > > On Mon, Oct 1, 2018 at 4:55 PM Pedro Larroy <pedro.larroy.li...@gmail.com> wrote:
> > > >
> > > > > Hi
> > > > >
> > > > > I saw this failure on CI:
> > > > >
> > > > > http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/incubator-mxnet/detail/master/1697/pipeline
> > > > >
> > > > > Have you seen other cases where we fail to select the best CUDNN
> > > > > algorithm? In which circumstances could this happen, and do you think
> > > > > it is a good idea to have one selected by default as a last resort?
> > > > >
> > > > >
> > > > > Pedro.
> > > > >
> > > >
> > >
> >
>
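
The "default as a last resort" idea raised at the start of this thread can be sketched generically in Python. This is an illustration under stated assumptions, not MXNet's or cuDNN's actual selection code: `select_algorithm`, `benchmark`, and the algorithm names are hypothetical, and a failed trial is assumed to surface as a `RuntimeError` (for example, when there is not enough free GPU memory for an algorithm's workspace).

```python
def select_algorithm(candidates, benchmark, default):
    """Pick the fastest candidate that runs; fall back to `default`.

    `benchmark(algo)` is assumed to return a runtime in seconds, or to
    raise RuntimeError if the algorithm cannot run right now.
    """
    timings = {}
    for algo in candidates:
        try:
            timings[algo] = benchmark(algo)
        except RuntimeError:
            # Skip algorithms that fail to run instead of aborting selection.
            continue
    if not timings:
        # Nothing could be benchmarked: use the default as a last resort.
        return default
    return min(timings, key=timings.get)


def fake_benchmark(algo):
    """Stand-in benchmark for the example: 'fft' always fails."""
    runtimes = {"gemm": 2.0, "winograd": 1.5}
    if algo not in runtimes:
        raise RuntimeError(f"{algo}: not enough workspace memory")
    return runtimes[algo]


def always_fail(algo):
    """Stand-in benchmark where every trial fails."""
    raise RuntimeError("out of memory")
```

With these stand-ins, `select_algorithm(["gemm", "winograd", "fft"], fake_benchmark, "gemm")` picks the fastest working candidate, while `select_algorithm(["gemm", "fft"], always_fail, "gemm")` returns the default. Whether a silent fallback is desirable is a separate question, since it could hide the kind of resource problem suspected above.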