Re: [apache/incubator-mxnet] [RFC] MXNet 2.0 JVM Language development (#17783)

2020-03-16 Thread Sheng Zha
+1 for option 1 and 2

-- 
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/apache/incubator-mxnet/issues/17783#issuecomment-599800518

Re: MXNet Bot Demo

2020-03-16 Thread Marco de Abreu
Well that's generally a problem with a deferred CI approach (CI is run at
commit and not at merge time). This can either be solved through the other
proposal that's currently on dev@, by having a bot which does merges by
having a global lock and a merge queue or by accepting the issue. Reality
right now is that we're running that model where two PRs which are merged
in parallel might break one another. One thing to consider though is that
this breakage would have to be introduced in two separate parts since
otherwise there'd be merge conflicts. I think we had that situation twice
so far and the result was a quick revert, so I'd say that it's a problem
that can happily be accepted. All other solutions basically require some
form of single-threaded and globally locked solution which limits us in
scalability. I'd recommend to just accept that risk and revert a PR in case
it actually had a conflict.

-Marco

On Mon, Mar 16, 2020 at 6:29 PM Skalicky, Sam 
wrote:

> We probably need some way to track which CI runs ran for which commit too,
> that way we can ensure that all CI runs ran on the commit that will be
> merged.  Maybe the bot can comment with the commit hash when users command
> it to do something. Although since users can trigger individual CI runs its
> possible to have some commits run some CI runs but not others. We need some
> way to easily verify that the CI has executed all runs on the commit that
> will be merged.
>
> Sam
>
> > On Mar 13, 2020, at 8:28 PM, Przemysław Trędak 
> wrote:
> >
> > CAUTION: This email originated from outside of the organization. Do not
> click links or open attachments unless you can confirm the sender and know
> the content is safe.
> >
> >
> >
> > I personally like the idea of opt-in more than opt-out:
> > - ultimately PR author wants the PR to be merged so they (or committer
> reviewing the PR) will trigger the CI
> > - if it is easy to trigger the PR via the bot command then the amount of
> work per PR should be less than with opt-out (since most of the commits
> should then be marked as [skip ci] or something similar
> >
> > +1 to the bot making a comment on each new PR with its commands (and
> also explaining, or at least giving links to the general PR process so new
> PR authors are not lost). Maybe we could make the bot recognize if the PR
> author is new or existing contributor and offer advice based on that?
> >
> > Thanks
> > Przemek
> >
> > On 2020/03/13 22:06:58, Marco de Abreu  wrote:
> >> Hi,
> >>
> >> since it's no longer necessary to push a new commit to trigger CI, it
> will
> >> already reduce the costs. But to me, requiring an action to enable CI
> after
> >> a PR has been created initially, is a no go. User can opt out of CI, but
> >> the default has to be CI being triggered automatically for every commit
> >> unless specifically disabled by a participant. I'm also fine with
> >> triggering certain additional jobs (think about running a nightly job
> upon
> >> request for a PR) to require a manual step, but the PR validation
> pipelines
> >> have to run automatically. Every check that is marked as "Required" in
> >> GitHub has to be automatically kicked off.
> >>
> >> -Marco
> >>
> >> On Fri, Mar 13, 2020 at 9:50 PM Chaitanya Bapat 
> >> wrote:
> >>
> >>> Firstly,
> >>> Sorry I missed out on attaching the mail thread that was sent on 12th
> >>> February for notifying the community of the upcoming changes to the
> MXNet
> >>> CI
> >>> For reference :
> >>>
> >>>
> https://lists.apache.org/thread.html/r09a6ab2803a996fc80e00fe39ed312fa4865e8805e08df847f1addad%40%3Cdev.mxnet.apache.org%3E
> >>>
> >>> Now to the questions,
>  Is it possible for re-triggering a single job to be abused?
> >>> @Tao In the case when a user re-triggers a single job multiple times,
> that
> >>> will be visible in the PR conversation thread. A committer, even after
> he
> >>> has approved the PR before, generally takes a look at the final state
> of
> >>> the PR before merging. Would it be fair to assume the committer could
> take
> >>> the multiple re-trigger of a single job into account before merging?
> The
> >>> committer then has the option to invoke `@mxnet-bot run ci [all] ` to
> >>> trigger the entire build pipeline one last to counter the abuse. This
> is
> >>> aligned with what @Leonard said.
> >>>
> >>> @Sandeep Thanks a lot for collecting and sharing valuable data. I'd
> concur
> >>> with the opinion that given the existing things committers & PR Authors
> >>> take care of, invoking CI shouldn't be that big of an additional
> burden.
> >>>
> >>> @Marco With the opt-out, the onus remains on the PR Author. It doesn't
> help
> >>> reduce the resource usage. Hence, it was suggested to switch to
> >>> opt-in. @Leo's suggestion for proactive commenting on the part of bot
> makes
> >>> sense and is doable.
> >>>
> >>> Default : opt-out and User initiated opt-in (with addressing Leo's fix
> for
> >>> the usability issue you correctly pointed out )
> >>> @Marco How doe

Re: Workflow proposal

2020-03-16 Thread Marco de Abreu
Considering how unstable our PR as well as our nightly jobs have been so
far, is that an assumption we can rightfully make? Also, who'd be
responsible for fixing that branch in case a PR actually breaks a nightly
test?

-Marco

On Mon, Mar 16, 2020 at 7:41 PM Pedro Larroy 
wrote:

> The original idea is that the promotion to the other branch is automated by
> nightly CI, so it shouldn't have those problems that are mentioned, so
> there shouldn't be any manual merging on that branch.
>
> On Wed, Mar 11, 2020 at 7:43 PM Chris Olivier 
> wrote:
>
> > My $0.02
> >
> > We had this model dual-branch when I was at GE and it was problematic.
> > Among other things, the two branches would tend to diverge to a large
> > degree and you ended up just cherry-picking in stuff here and there,
> which
> > caused even more problems, as well as the model allows the secondary
> branch
> > to get pretty buggy -- human nature being what it is -- to the point
> where
> > it's difficult to merge it into master without freezing them both and
> > stabilizing, merging into master, then stabilizing again (small things
> > almost certainly went into master in the meantime -- hotfixes, critical
> > features, etc, while everything was on hold stabilizing the secondary
> > branch).  It just double the work in the end, is what I experienced.
> >
> > On Wed, Mar 11, 2020 at 5:47 PM Yuan Tang 
> wrote:
> >
> > > Second to not introduce a dev branch. We should try to improve our
> > release
> > > process instead and avoid another branch that may introduce confusion
> > > around the source of truth.
> > >
> > > On Wed, Mar 11, 2020 at 8:39 PM Tianqi Chen 
> > > wrote:
> > >
> > > > While the idea of staging seems to be reasonable.
> > > > Most OSS projects choose not to do so because having a complicated
> > > staging
> > > > will likely confuse the contributors, and increase the change of
> > > > divergence(between dev and master).
> > > >
> > > > Given that we have a release model, so in some sense the release
> itself
> > > > serves as a staging pt.
> > > > A good approach would simply setup the nightly if necessary strive to
> > fix
> > > > regressions and make sure the formal release addresses the issues.
> > > >
> > > > TQ
> > > >
> > > > On Wed, Mar 11, 2020 at 5:32 PM Pedro Larroy <
> > > pedro.larroy.li...@gmail.com
> > > > >
> > > > wrote:
> > > >
> > > > > Hi
> > > > >
> > > > > I talk to some people about this and they thought it would be a
> good
> > > > idea,
> > > > > so sharing it here:
> > > > >
> > > > > I would propose to use a staging or "dev" branch into which
> nightly &
> > > > > performance tests are done periodically and then this branch is
> > merged
> > > to
> > > > > master. The goal of this workflow would be to avoid having
> > regressions
> > > > and
> > > > > nightly failures creeping into master. PRs would get merged into
> dev
> > > and
> > > > > dev promoted periodically / nightly into master.
> > > > >
> > > > > The names can be swapped as well, between dev and master, so PRS
> get
> > > > merged
> > > > > into master and it doesn't change the workflow, and staging is the
> > > branch
> > > > > where nightly changes are merged to.
> > > > >
> > > > > Have this been considered?
> > > > >
> > > > > Pedro.
> > > > >
> > > >
> > >
> > >
> > > --
> > > Yuan Tang
> > > https://terrytangyuan.github.io/about/ <
> http://twitter.com/TerryTangYuan
> > >
> > > 
> > >
> >
>


Re: Workflow proposal

2020-03-16 Thread Pedro Larroy
The original idea is that the promotion to the other branch is automated by
nightly CI, so it shouldn't have those problems that are mentioned, so
there shouldn't be any manual merging on that branch.

On Wed, Mar 11, 2020 at 7:43 PM Chris Olivier  wrote:

> My $0.02
>
> We had this model dual-branch when I was at GE and it was problematic.
> Among other things, the two branches would tend to diverge to a large
> degree and you ended up just cherry-picking in stuff here and there, which
> caused even more problems, as well as the model allows the secondary branch
> to get pretty buggy -- human nature being what it is -- to the point where
> it's difficult to merge it into master without freezing them both and
> stabilizing, merging into master, then stabilizing again (small things
> almost certainly went into master in the meantime -- hotfixes, critical
> features, etc, while everything was on hold stabilizing the secondary
> branch).  It just double the work in the end, is what I experienced.
>
> On Wed, Mar 11, 2020 at 5:47 PM Yuan Tang  wrote:
>
> > Second to not introduce a dev branch. We should try to improve our
> release
> > process instead and avoid another branch that may introduce confusion
> > around the source of truth.
> >
> > On Wed, Mar 11, 2020 at 8:39 PM Tianqi Chen 
> > wrote:
> >
> > > While the idea of staging seems to be reasonable.
> > > Most OSS projects choose not to do so because having a complicated
> > staging
> > > will likely confuse the contributors, and increase the change of
> > > divergence(between dev and master).
> > >
> > > Given that we have a release model, so in some sense the release itself
> > > serves as a staging pt.
> > > A good approach would simply setup the nightly if necessary strive to
> fix
> > > regressions and make sure the formal release addresses the issues.
> > >
> > > TQ
> > >
> > > On Wed, Mar 11, 2020 at 5:32 PM Pedro Larroy <
> > pedro.larroy.li...@gmail.com
> > > >
> > > wrote:
> > >
> > > > Hi
> > > >
> > > > I talk to some people about this and they thought it would be a good
> > > idea,
> > > > so sharing it here:
> > > >
> > > > I would propose to use a staging or "dev" branch into which nightly &
> > > > performance tests are done periodically and then this branch is
> merged
> > to
> > > > master. The goal of this workflow would be to avoid having
> regressions
> > > and
> > > > nightly failures creeping into master. PRs would get merged into dev
> > and
> > > > dev promoted periodically / nightly into master.
> > > >
> > > > The names can be swapped as well, between dev and master, so PRS get
> > > merged
> > > > into master and it doesn't change the workflow, and staging is the
> > branch
> > > > where nightly changes are merged to.
> > > >
> > > > Have this been considered?
> > > >
> > > > Pedro.
> > > >
> > >
> >
> >
> > --
> > Yuan Tang
> > https://terrytangyuan.github.io/about/  >
> > 
> >
>


Re: MXNet Bot Demo

2020-03-16 Thread Skalicky, Sam
We probably need some way to track which CI runs ran for which commit too, that 
way we can ensure that all CI runs ran on the commit that will be merged.  
Maybe the bot can comment with the commit hash when users command it to do 
something. Although since users can trigger individual CI runs its possible to 
have some commits run some CI runs but not others. We need some way to easily 
verify that the CI has executed all runs on the commit that will be merged.

Sam

> On Mar 13, 2020, at 8:28 PM, Przemysław Trędak  wrote:
> 
> CAUTION: This email originated from outside of the organization. Do not click 
> links or open attachments unless you can confirm the sender and know the 
> content is safe.
> 
> 
> 
> I personally like the idea of opt-in more than opt-out:
> - ultimately PR author wants the PR to be merged so they (or committer 
> reviewing the PR) will trigger the CI
> - if it is easy to trigger the PR via the bot command then the amount of work 
> per PR should be less than with opt-out (since most of the commits should 
> then be marked as [skip ci] or something similar
> 
> +1 to the bot making a comment on each new PR with its commands (and also 
> explaining, or at least giving links to the general PR process so new PR 
> authors are not lost). Maybe we could make the bot recognize if the PR author 
> is new or existing contributor and offer advice based on that?
> 
> Thanks
> Przemek
> 
> On 2020/03/13 22:06:58, Marco de Abreu  wrote:
>> Hi,
>> 
>> since it's no longer necessary to push a new commit to trigger CI, it will
>> already reduce the costs. But to me, requiring an action to enable CI after
>> a PR has been created initially, is a no go. User can opt out of CI, but
>> the default has to be CI being triggered automatically for every commit
>> unless specifically disabled by a participant. I'm also fine with
>> triggering certain additional jobs (think about running a nightly job upon
>> request for a PR) to require a manual step, but the PR validation pipelines
>> have to run automatically. Every check that is marked as "Required" in
>> GitHub has to be automatically kicked off.
>> 
>> -Marco
>> 
>> On Fri, Mar 13, 2020 at 9:50 PM Chaitanya Bapat 
>> wrote:
>> 
>>> Firstly,
>>> Sorry I missed out on attaching the mail thread that was sent on 12th
>>> February for notifying the community of the upcoming changes to the MXNet
>>> CI
>>> For reference :
>>> 
>>> https://lists.apache.org/thread.html/r09a6ab2803a996fc80e00fe39ed312fa4865e8805e08df847f1addad%40%3Cdev.mxnet.apache.org%3E
>>> 
>>> Now to the questions,
 Is it possible for re-triggering a single job to be abused?
>>> @Tao In the case when a user re-triggers a single job multiple times, that
>>> will be visible in the PR conversation thread. A committer, even after he
>>> has approved the PR before, generally takes a look at the final state of
>>> the PR before merging. Would it be fair to assume the committer could take
>>> the multiple re-trigger of a single job into account before merging? The
>>> committer then has the option to invoke `@mxnet-bot run ci [all] ` to
>>> trigger the entire build pipeline one last to counter the abuse. This is
>>> aligned with what @Leonard said.
>>> 
>>> @Sandeep Thanks a lot for collecting and sharing valuable data. I'd concur
>>> with the opinion that given the existing things committers & PR Authors
>>> take care of, invoking CI shouldn't be that big of an additional burden.
>>> 
>>> @Marco With the opt-out, the onus remains on the PR Author. It doesn't help
>>> reduce the resource usage. Hence, it was suggested to switch to
>>> opt-in. @Leo's suggestion for proactive commenting on the part of bot makes
>>> sense and is doable.
>>> 
>>> Default : opt-out and User initiated opt-in (with addressing Leo's fix for
>>> the usability issue you correctly pointed out )
>>> @Marco How does this sound to you?
>>> 
>>> Again, thank you all for chiming in and voicing your opinion. Appreciate
>>> it.
>>> We can take ahead these discussions in today's demo meeting. [Design Doc
>>> ] [Demo
>>> Video ]
>>> 
>>> Thanks,
>>> Chai
>>> 
>>> On Fri, 13 Mar 2020 at 12:34, Marco de Abreu 
>>> wrote:
>>> 
 I'd recommend that the bot makes an initial comment when a PR gets opened
 and informs the users of its commands. It then tells the user the commend
 to opt out of CI.
 
 -Marco
 
 Lausen, Leonard  schrieb am Fr., 13. März
>>> 2020,
 20:27:
 
> On opt-out: People may be unaware of opt-out would not use it. There is
 no
> incentive to use opt-out, as the PR author doesn't pay any money for CI
> run.
> 
> I agree with Marco though that opt-in alone may cause usability issues,
 as
> contributors may not be aware of how to trigger the CI.
> One solution is that the bot proactively comments on the PR and reminds
 the
> autho