Re: Protected master needs to be turned off

2017-12-04 Thread Pedro Larroy
Agree on both Mu and Tianqi’s points.


I think this would align well with the “nightly tests” narrative. We
can disable CI for trivial changes with a comment like (“skip ci”)
comment on the ghprb Jenkins plugin, and trust the nightly to catch
any problems introduced by trivial patches in an aggregated manner.

Pedro.

On Fri, Dec 1, 2017 at 6:49 PM, Tianqi Chen  wrote:
> I think we are still using CI to catch bugs. And we should take caution
> when merging something that CI did not catch up due to the response time.
> This do not contradict what comitter can do with their best judgement. For
> example, I would normally switch CI off and merge simple typo fixes. If the
> fix merged causes problem, you still get CI to report it maybe a few hours
> later.
>
> The bottom line is that comitter get these information as feedbacks and
> they use their best judgement when do so. Most of the time when unsure, I
> simply wait for CI to finish
>
> Tianqi
>
>
> On Fri, Dec 1, 2017 at 9:41 AM Mu Li  wrote:
>
>> Totally agree that CI is useful. Actually, I wrote the jenkinsfile and
>> setup the jenkins server before moving to apache server. I just mention
>> that we cannot rely on the CI test. It currently covers operator unittests
>> and regression test on several cnns. But the code coverage isn't great. If
>> a PR touches the core system, the best practice today is still code
>> reviewing. Otherwise, such as a PR is mainly about examples, the CI often
>> doesn't help so we just waste machine times.
>>
>> I think checking the exact code coverage is on the roadmap, but I don't
>> know if we have any progress on it.
>>
>> On Fri, Dec 1, 2017 at 6:19 AM, Pedro Larroy > >
>> wrote:
>>
>> > CI catches problems all the time. I don't think many of us can afford
>> > to build all the flavors and architectures in their laptops or
>> > workstations, so we have to rely on CI to catch all kinds of errors
>> > from compilation errors to bugs plus regressions, specially in a
>> > project which has so many build flavors.
>> >
>> > I have had this experience in big projects several times and I can
>> > tell you it's always the same.
>> >
>> > So from extensive software development experience I write that we will
>> > be able to develop and merge much faster once we have a reliable CI
>> > running in short cycles, any other approach or shortcuts is just
>> > accumulating technical debt for the future that somebody will have to
>> > cleanup and will slow down development. Is better to have a CI with a
>> > reduced scope working reliably than bypassing CI.
>> >
>> > This is irrespective of using dev to merge or unprotected master.
>> >
>> > We can't afford to have increased warnings, bugs creeping into the
>> > codebase going unnoticed, build system problems, performance
>> > regressions, etc. And we have to rely on a solid CI for this. If we
>> > are not ready for this, we should halt feature development or at least
>> > merging new features until we have a stable codebase and build system.
>> >
>>


Re: Protected master needs to be turned off

2017-12-01 Thread Marco de Abreu
We plan to add the possibility to trigger or stop a build by adding a
comment in the pull request - this will be added after we switched to the
new CI.

Am 01.12.2017 7:25 nachm. schrieb "Mu Li" :

> That's how it works before, committers can stop a CI test if it's a simple
> fix, such as typo. But with apache's jenkins server,  it's nontrivial to
> request the permission to stop/start a CI test.
>
> I had a thought about to let the CI be smart enough to only run the tests
> related to the code changes. But I felt it's easier to have committers to
> do it manually. Given the current situation, however, it's probably worth
> spending times to improve the CI instead of requesting changes to the
> repo/CI server.
>
> On Fri, Dec 1, 2017 at 9:49 AM, Tianqi Chen 
> wrote:
>
> > I think we are still using CI to catch bugs. And we should take caution
> > when merging something that CI did not catch up due to the response time.
> > This do not contradict what comitter can do with their best judgement.
> For
> > example, I would normally switch CI off and merge simple typo fixes. If
> the
> > fix merged causes problem, you still get CI to report it maybe a few
> hours
> > later.
> >
> > The bottom line is that comitter get these information as feedbacks and
> > they use their best judgement when do so. Most of the time when unsure, I
> > simply wait for CI to finish
> >
> > Tianqi
> >
> >
> > On Fri, Dec 1, 2017 at 9:41 AM Mu Li  wrote:
> >
> > > Totally agree that CI is useful. Actually, I wrote the jenkinsfile and
> > > setup the jenkins server before moving to apache server. I just mention
> > > that we cannot rely on the CI test. It currently covers operator
> > unittests
> > > and regression test on several cnns. But the code coverage isn't great.
> > If
> > > a PR touches the core system, the best practice today is still code
> > > reviewing. Otherwise, such as a PR is mainly about examples, the CI
> often
> > > doesn't help so we just waste machine times.
> > >
> > > I think checking the exact code coverage is on the roadmap, but I don't
> > > know if we have any progress on it.
> > >
> > > On Fri, Dec 1, 2017 at 6:19 AM, Pedro Larroy <
> > pedro.larroy.li...@gmail.com
> > > >
> > > wrote:
> > >
> > > > CI catches problems all the time. I don't think many of us can afford
> > > > to build all the flavors and architectures in their laptops or
> > > > workstations, so we have to rely on CI to catch all kinds of errors
> > > > from compilation errors to bugs plus regressions, specially in a
> > > > project which has so many build flavors.
> > > >
> > > > I have had this experience in big projects several times and I can
> > > > tell you it's always the same.
> > > >
> > > > So from extensive software development experience I write that we
> will
> > > > be able to develop and merge much faster once we have a reliable CI
> > > > running in short cycles, any other approach or shortcuts is just
> > > > accumulating technical debt for the future that somebody will have to
> > > > cleanup and will slow down development. Is better to have a CI with a
> > > > reduced scope working reliably than bypassing CI.
> > > >
> > > > This is irrespective of using dev to merge or unprotected master.
> > > >
> > > > We can't afford to have increased warnings, bugs creeping into the
> > > > codebase going unnoticed, build system problems, performance
> > > > regressions, etc. And we have to rely on a solid CI for this. If we
> > > > are not ready for this, we should halt feature development or at
> least
> > > > merging new features until we have a stable codebase and build
> system.
> > > >
> > >
> >
>


Re: Protected master needs to be turned off

2017-12-01 Thread Mu Li
That's how it works before, committers can stop a CI test if it's a simple
fix, such as typo. But with apache's jenkins server,  it's nontrivial to
request the permission to stop/start a CI test.

I had a thought about to let the CI be smart enough to only run the tests
related to the code changes. But I felt it's easier to have committers to
do it manually. Given the current situation, however, it's probably worth
spending times to improve the CI instead of requesting changes to the
repo/CI server.

On Fri, Dec 1, 2017 at 9:49 AM, Tianqi Chen 
wrote:

> I think we are still using CI to catch bugs. And we should take caution
> when merging something that CI did not catch up due to the response time.
> This do not contradict what comitter can do with their best judgement. For
> example, I would normally switch CI off and merge simple typo fixes. If the
> fix merged causes problem, you still get CI to report it maybe a few hours
> later.
>
> The bottom line is that comitter get these information as feedbacks and
> they use their best judgement when do so. Most of the time when unsure, I
> simply wait for CI to finish
>
> Tianqi
>
>
> On Fri, Dec 1, 2017 at 9:41 AM Mu Li  wrote:
>
> > Totally agree that CI is useful. Actually, I wrote the jenkinsfile and
> > setup the jenkins server before moving to apache server. I just mention
> > that we cannot rely on the CI test. It currently covers operator
> unittests
> > and regression test on several cnns. But the code coverage isn't great.
> If
> > a PR touches the core system, the best practice today is still code
> > reviewing. Otherwise, such as a PR is mainly about examples, the CI often
> > doesn't help so we just waste machine times.
> >
> > I think checking the exact code coverage is on the roadmap, but I don't
> > know if we have any progress on it.
> >
> > On Fri, Dec 1, 2017 at 6:19 AM, Pedro Larroy <
> pedro.larroy.li...@gmail.com
> > >
> > wrote:
> >
> > > CI catches problems all the time. I don't think many of us can afford
> > > to build all the flavors and architectures in their laptops or
> > > workstations, so we have to rely on CI to catch all kinds of errors
> > > from compilation errors to bugs plus regressions, specially in a
> > > project which has so many build flavors.
> > >
> > > I have had this experience in big projects several times and I can
> > > tell you it's always the same.
> > >
> > > So from extensive software development experience I write that we will
> > > be able to develop and merge much faster once we have a reliable CI
> > > running in short cycles, any other approach or shortcuts is just
> > > accumulating technical debt for the future that somebody will have to
> > > cleanup and will slow down development. Is better to have a CI with a
> > > reduced scope working reliably than bypassing CI.
> > >
> > > This is irrespective of using dev to merge or unprotected master.
> > >
> > > We can't afford to have increased warnings, bugs creeping into the
> > > codebase going unnoticed, build system problems, performance
> > > regressions, etc. And we have to rely on a solid CI for this. If we
> > > are not ready for this, we should halt feature development or at least
> > > merging new features until we have a stable codebase and build system.
> > >
> >
>


Re: Protected master needs to be turned off

2017-12-01 Thread Tianqi Chen
I think we are still using CI to catch bugs. And we should take caution
when merging something that CI did not catch up due to the response time.
This do not contradict what comitter can do with their best judgement. For
example, I would normally switch CI off and merge simple typo fixes. If the
fix merged causes problem, you still get CI to report it maybe a few hours
later.

The bottom line is that comitter get these information as feedbacks and
they use their best judgement when do so. Most of the time when unsure, I
simply wait for CI to finish

Tianqi


On Fri, Dec 1, 2017 at 9:41 AM Mu Li  wrote:

> Totally agree that CI is useful. Actually, I wrote the jenkinsfile and
> setup the jenkins server before moving to apache server. I just mention
> that we cannot rely on the CI test. It currently covers operator unittests
> and regression test on several cnns. But the code coverage isn't great. If
> a PR touches the core system, the best practice today is still code
> reviewing. Otherwise, such as a PR is mainly about examples, the CI often
> doesn't help so we just waste machine times.
>
> I think checking the exact code coverage is on the roadmap, but I don't
> know if we have any progress on it.
>
> On Fri, Dec 1, 2017 at 6:19 AM, Pedro Larroy  >
> wrote:
>
> > CI catches problems all the time. I don't think many of us can afford
> > to build all the flavors and architectures in their laptops or
> > workstations, so we have to rely on CI to catch all kinds of errors
> > from compilation errors to bugs plus regressions, specially in a
> > project which has so many build flavors.
> >
> > I have had this experience in big projects several times and I can
> > tell you it's always the same.
> >
> > So from extensive software development experience I write that we will
> > be able to develop and merge much faster once we have a reliable CI
> > running in short cycles, any other approach or shortcuts is just
> > accumulating technical debt for the future that somebody will have to
> > cleanup and will slow down development. Is better to have a CI with a
> > reduced scope working reliably than bypassing CI.
> >
> > This is irrespective of using dev to merge or unprotected master.
> >
> > We can't afford to have increased warnings, bugs creeping into the
> > codebase going unnoticed, build system problems, performance
> > regressions, etc. And we have to rely on a solid CI for this. If we
> > are not ready for this, we should halt feature development or at least
> > merging new features until we have a stable codebase and build system.
> >
>


Re: Protected master needs to be turned off

2017-12-01 Thread Mu Li
Totally agree that CI is useful. Actually, I wrote the jenkinsfile and
setup the jenkins server before moving to apache server. I just mention
that we cannot rely on the CI test. It currently covers operator unittests
and regression test on several cnns. But the code coverage isn't great. If
a PR touches the core system, the best practice today is still code
reviewing. Otherwise, such as a PR is mainly about examples, the CI often
doesn't help so we just waste machine times.

I think checking the exact code coverage is on the roadmap, but I don't
know if we have any progress on it.

On Fri, Dec 1, 2017 at 6:19 AM, Pedro Larroy 
wrote:

> CI catches problems all the time. I don't think many of us can afford
> to build all the flavors and architectures in their laptops or
> workstations, so we have to rely on CI to catch all kinds of errors
> from compilation errors to bugs plus regressions, specially in a
> project which has so many build flavors.
>
> I have had this experience in big projects several times and I can
> tell you it's always the same.
>
> So from extensive software development experience I write that we will
> be able to develop and merge much faster once we have a reliable CI
> running in short cycles, any other approach or shortcuts is just
> accumulating technical debt for the future that somebody will have to
> cleanup and will slow down development. Is better to have a CI with a
> reduced scope working reliably than bypassing CI.
>
> This is irrespective of using dev to merge or unprotected master.
>
> We can't afford to have increased warnings, bugs creeping into the
> codebase going unnoticed, build system problems, performance
> regressions, etc. And we have to rely on a solid CI for this. If we
> are not ready for this, we should halt feature development or at least
> merging new features until we have a stable codebase and build system.
>


Re: Protected master needs to be turned off

2017-12-01 Thread Pedro Larroy
CI catches problems all the time. I don't think many of us can afford
to build all the flavors and architectures in their laptops or
workstations, so we have to rely on CI to catch all kinds of errors
from compilation errors to bugs plus regressions, specially in a
project which has so many build flavors.

I have had this experience in big projects several times and I can
tell you it's always the same.

So from extensive software development experience I write that we will
be able to develop and merge much faster once we have a reliable CI
running in short cycles, any other approach or shortcuts is just
accumulating technical debt for the future that somebody will have to
cleanup and will slow down development. Is better to have a CI with a
reduced scope working reliably than bypassing CI.

This is irrespective of using dev to merge or unprotected master.

We can't afford to have increased warnings, bugs creeping into the
codebase going unnoticed, build system problems, performance
regressions, etc. And we have to rely on a solid CI for this. If we
are not ready for this, we should halt feature development or at least
merging new features until we have a stable codebase and build system.


Re: Protected master needs to be turned off

2017-11-30 Thread Marco de Abreu
Sounds like a good way to me.

By the way, the setup for the new CI is ready and we’re already verifying
the master branch. We still need to do some clean up and await the
Amazon-AppSec review.

On Thu, Nov 30, 2017 at 8:55 PM, Mu Li  wrote:

> I barely remember CI caught bugs except for lint, but it definitely slows
> down the code merge.
>
> I understand the general concerns for removing master protection. So I
> propose to use the dev branch to merge changes until the CI is stable. And
> make the nightly build build the dev branch instead of master.
>
> Best
> Mu
>
> > On Nov 30, 2017, at 11:34 AM, Gautam  wrote:
> >
> > I believe few of the committers voted -1 and those who favored they have
> > put  pre-condition.
> > As mentioned before and mentioning again without protected master it will
> > be hard to debug the build failure.
> > And I am sure everyone here is aware of the challenges which CI faces
> every
> > day, not having protected master makes it more difficult.
> >
> > -Gautam
> >
> >
> >
> >
> >
> >
> >
> >
> >> On Thu, Nov 30, 2017 at 11:24 AM, Eric Xie  wrote:
> >>
> >> Since committers voted for +1. We consider this vote passed.
> >>
> >> Thanks,
> >> Eric
> >>
> >>
> >>
> >>> On 2017-11-19 12:51, "Eric Xie" wrote:
> >>> Hi all,
> >>> I'm starting this thread to vote on turning off protected master. The
> >> reasons are:
> >>>
> >>> 1. Since we turned on protected master pending PRs has grown from 40 to
> >> 80. It is severely slowing down development.
> >>>
> >>> 2. Committers, not CI, are ultimately responsible for the code they
> >> merge. You should only override the CI when you are very confident that
> CI
> >> is the problem, not your code. If it turns out you are wrong, you should
> >> fix it ASAP. This is the bare minimum requirement for all committers: BE
> >> RESPONSIBLE.
> >>>
> >>> I'm aware of the argument for using protected master: It make sure that
> >> master is stable.
> >>>
> >>> Well, master will be most stable if we stop adding any commits to it.
> >> But that's not what we want is it?
> >>>
> >>> Protected master hardly adds any stability. The faulty tests that
> breaks
> >> master at random got merged into master because they happened to succeed
> >> once.
> >>>
> >>> Thanks,
> >>> Junyuan Xie
> >>>
> >>
> >
> >
> >
> > --
> > Best Regards,
> > Gautam Kumar
>


Re: Protected master needs to be turned off

2017-11-30 Thread Mu Li
I barely remember CI caught bugs except for lint, but it definitely slows down 
the code merge. 

I understand the general concerns for removing master protection. So I propose 
to use the dev branch to merge changes until the CI is stable. And make the 
nightly build build the dev branch instead of master. 

Best
Mu

> On Nov 30, 2017, at 11:34 AM, Gautam  wrote:
> 
> I believe few of the committers voted -1 and those who favored they have
> put  pre-condition.
> As mentioned before and mentioning again without protected master it will
> be hard to debug the build failure.
> And I am sure everyone here is aware of the challenges which CI faces every
> day, not having protected master makes it more difficult.
> 
> -Gautam
> 
> 
> 
> 
> 
> 
> 
> 
>> On Thu, Nov 30, 2017 at 11:24 AM, Eric Xie  wrote:
>> 
>> Since committers voted for +1. We consider this vote passed.
>> 
>> Thanks,
>> Eric
>> 
>> 
>> 
>>> On 2017-11-19 12:51, "Eric Xie" wrote:
>>> Hi all,
>>> I'm starting this thread to vote on turning off protected master. The
>> reasons are:
>>> 
>>> 1. Since we turned on protected master pending PRs has grown from 40 to
>> 80. It is severely slowing down development.
>>> 
>>> 2. Committers, not CI, are ultimately responsible for the code they
>> merge. You should only override the CI when you are very confident that CI
>> is the problem, not your code. If it turns out you are wrong, you should
>> fix it ASAP. This is the bare minimum requirement for all committers: BE
>> RESPONSIBLE.
>>> 
>>> I'm aware of the argument for using protected master: It make sure that
>> master is stable.
>>> 
>>> Well, master will be most stable if we stop adding any commits to it.
>> But that's not what we want is it?
>>> 
>>> Protected master hardly adds any stability. The faulty tests that breaks
>> master at random got merged into master because they happened to succeed
>> once.
>>> 
>>> Thanks,
>>> Junyuan Xie
>>> 
>> 
> 
> 
> 
> -- 
> Best Regards,
> Gautam Kumar


Re: Protected master needs to be turned off

2017-11-30 Thread Eric Xie
Since committers voted for +1. We consider this vote passed.

Thanks,
Eric



On 2017-11-19 12:51, "Eric Xie" wrote: 
> Hi all,
> I'm starting this thread to vote on turning off protected master. The reasons 
> are:
> 
> 1. Since we turned on protected master pending PRs has grown from 40 to 80. 
> It is severely slowing down development.
> 
> 2. Committers, not CI, are ultimately responsible for the code they merge. 
> You should only override the CI when you are very confident that CI is the 
> problem, not your code. If it turns out you are wrong, you should fix it 
> ASAP. This is the bare minimum requirement for all committers: BE RESPONSIBLE.
> 
> I'm aware of the argument for using protected master: It make sure that 
> master is stable.
> 
> Well, master will be most stable if we stop adding any commits to it. But 
> that's not what we want is it?
> 
> Protected master hardly adds any stability. The faulty tests that breaks 
> master at random got merged into master because they happened to succeed once.
> 
> Thanks,
> Junyuan Xie
> 


Re: Protected master needs to be turned off

2017-11-20 Thread kellen sunderland
-0.5 (non-binding).   I would propose that if there are pieces of CI that
are slowing down development (and I think we can all agree that there are)
that we should strip out these problematic CI pieces and open issues for
them.  We can then assign issues to people and evaluate what should be done
to fix or mitigate the root cause of the problems (be it slow runtime,
flakiness, etc.).  This way we will get to a state where the CI is stable,
and we'll have a backlog of issues to fix.

-Kellen

On Mon, Nov 20, 2017 at 2:36 PM, Marco de Abreu <
marco.g.ab...@googlemail.com> wrote:

> Small update regarding the status of the new CI: We've been able to get
> ubuntu builds and tests up and running.
>
> I’ve received requests to provide exact times for all important stages of
> the CI. As the first runs have been executed successfully, I’ll provide you
> with some data. This will give you a small guidance where our bottlenecks
> are located and in which field we can have improvements. All jobs of the
> same stage are being executed in parallel.
>
>
>
> In the following you’ll see the durations of all Ubuntu-Builds and -Tests.
> They are all (CPU- and GPU-tests) executed on a G3.8xlarge with 32 vCPUs.
> These times only cover the actual core execution time of the task
> *without* stashing,
> cleaning etc.
>
>
>
> Build:
>
>- Amalgamation: 3m40s
>- Amalgamation MIN: 3m49s
>- *CPU: Openblas: 8m38s*
>- *GPU CUDA7.5+cuDNN5: 11m55s*
>- *GPU: MKLML: 9m42s*
>
>
>
>
>
> Unit Test:
>
>- Perl CPU: 17m36s
>- Perl GPU: 8m31s
>- *Python2: CPU: 1h10m10s*
>- Python2: GPU: 14m10s
>- Python2: MKLML-CPU: 10m25s + 1m43s = 12m08s
>- Python2: MKLML-GPU: 14m19s
>- *Python3: CPU: 1h8m26s*
>- Python3: GPU: 14m58s
>- Python3: MKLML-CPU: 10m31s (seems like this job is not running the
>Python2-Train-Tests)
>- Python3: MKLML-GPU: 13m37s
>- *R: CPU: 20m6s + 9s + 9m58s = 30m13s*
>- *R: GPU: 21m50s + 10s + 1m17s = 23m17s*
>- Scala: CPU: 2m18s + 3m57s = 6m15s
>
>
>
> Integration Test:
>
>- Caffe GPU: 7m42s
>- Python GPU: 1m47s
>- Cpp-package GPU: 59s
>
>
> The next step is to get Windows up and running and improve execution time
> on CPU-tasks.
>
> -Marco
>
> On Mon, Nov 20, 2017 at 9:18 PM, YiZhi Liu  wrote:
>
> > +1
> >
> > 2017-11-20 11:47 GMT-08:00 Tianqi Chen :
> > > +1 until new CI is implemented.
> > >
> > > Tianqi
> > >
> > > On Mon, Nov 20, 2017 at 11:11 AM, Eric Xie  wrote:
> > >
> > >> A lot of people seems to be confused, so let's clarify the separation
> of
> > >> roles/responsibilities:
> > >>
> > >> 1. The committers that merge code are responsible for code quality and
> > >> tests passing on master.
> > >>
> > >> 2. The CI / infra maintainers are responsible for keeping CI running
> > >> properly and honestly reporting bugs.
> > >> If test fails because Jenkins is faulty or slow, you need to fix
> > Jenkins.
> > >> If test fails because there is a bug in the code, then you did a good
> > job
> > >> catching bugs. It is not your responsibility to fix the bug. You
> merely
> > >> should report it to committers.
> > >>
> > >> Thanks,
> > >> Junyuan Xie
> > >>
> > >>
> > >> On 2017-11-19 16:07, Gautam  wrote:
> > >> > -1
> > >> >
> > >> > Please see inline.
> > >> >
> > >> > On Nov 19, 2017 12:51 PM, "Eric Xie"  wrote:
> > >> >
> > >> > Hi all,
> > >> > I'm starting this thread to vote on turning off protected master.
> The
> > >> > reasons are:
> > >> >
> > >> > 1. Since we turned on protected master pending PRs has grown from 40
> > to
> > >> 80.
> > >> > It is severely slowing down development.
> > >> >
> > >> >
> > >> >  Turning protection off, will give you ability to merge the code
> > >> which
> > >> > has build failure. How and more importantly who will be best judge
> to
> > >> > figure out what's is wrong ? If CI is a problem then we have people
> > from
> > >> > infra team, who are sort of 'on call" day and night trying to fix.
> > >> > Including me trying to fix the slave at 2 am in night. I have seen
> > >> > commiters sending PRs and if it fails without even looking at the
> > >> > reason,which could be just a temporary error, they reach out to
> infra
> > >> team
> > >> > immediately. So I don't agree that we should turn off the protected
> > >> branch
> > >> > just for the sake of speed. We should care more on quality than
> speed.
> > >> >
> > >> >
> > >> > 2. Committers, not CI, are ultimately responsible for the code they
> > >> merge.
> > >> > You should only override the CI when you are very confident that CI
> is
> > >> the
> > >> > problem, not your code. If it turns out you are wrong, you should
> fix
> > it
> > >> > ASAP. This is the bare minimum requirement for all committers: BE
> > >> > RESPONSIBLE.
> > >> >
> > >> > How do we make sure that this in deed happens?
> > >> >
> > >> > I'm aware of the argument for 

Re: Protected master needs to be turned off

2017-11-20 Thread Marco de Abreu
Small update regarding the status of the new CI: We've been able to get
ubuntu builds and tests up and running.

I’ve received requests to provide exact times for all important stages of
the CI. As the first runs have been executed successfully, I’ll provide you
with some data. This will give you a small guidance where our bottlenecks
are located and in which field we can have improvements. All jobs of the
same stage are being executed in parallel.



In the following you’ll see the durations of all Ubuntu-Builds and -Tests.
They are all (CPU- and GPU-tests) executed on a G3.8xlarge with 32 vCPUs.
These times only cover the actual core execution time of the task
*without* stashing,
cleaning etc.



Build:

   - Amalgamation: 3m40s
   - Amalgamation MIN: 3m49s
   - *CPU: Openblas: 8m38s*
   - *GPU CUDA7.5+cuDNN5: 11m55s*
   - *GPU: MKLML: 9m42s*





Unit Test:

   - Perl CPU: 17m36s
   - Perl GPU: 8m31s
   - *Python2: CPU: 1h10m10s*
   - Python2: GPU: 14m10s
   - Python2: MKLML-CPU: 10m25s + 1m43s = 12m08s
   - Python2: MKLML-GPU: 14m19s
   - *Python3: CPU: 1h8m26s*
   - Python3: GPU: 14m58s
   - Python3: MKLML-CPU: 10m31s (seems like this job is not running the
   Python2-Train-Tests)
   - Python3: MKLML-GPU: 13m37s
   - *R: CPU: 20m6s + 9s + 9m58s = 30m13s*
   - *R: GPU: 21m50s + 10s + 1m17s = 23m17s*
   - Scala: CPU: 2m18s + 3m57s = 6m15s



Integration Test:

   - Caffe GPU: 7m42s
   - Python GPU: 1m47s
   - Cpp-package GPU: 59s


The next step is to get Windows up and running and improve execution time
on CPU-tasks.

-Marco

On Mon, Nov 20, 2017 at 9:18 PM, YiZhi Liu  wrote:

> +1
>
> 2017-11-20 11:47 GMT-08:00 Tianqi Chen :
> > +1 until new CI is implemented.
> >
> > Tianqi
> >
> > On Mon, Nov 20, 2017 at 11:11 AM, Eric Xie  wrote:
> >
> >> A lot of people seems to be confused, so let's clarify the separation of
> >> roles/responsibilities:
> >>
> >> 1. The committers that merge code are responsible for code quality and
> >> tests passing on master.
> >>
> >> 2. The CI / infra maintainers are responsible for keeping CI running
> >> properly and honestly reporting bugs.
> >> If test fails because Jenkins is faulty or slow, you need to fix
> Jenkins.
> >> If test fails because there is a bug in the code, then you did a good
> job
> >> catching bugs. It is not your responsibility to fix the bug. You merely
> >> should report it to committers.
> >>
> >> Thanks,
> >> Junyuan Xie
> >>
> >>
> >> On 2017-11-19 16:07, Gautam  wrote:
> >> > -1
> >> >
> >> > Please see inline.
> >> >
> >> > On Nov 19, 2017 12:51 PM, "Eric Xie"  wrote:
> >> >
> >> > Hi all,
> >> > I'm starting this thread to vote on turning off protected master. The
> >> > reasons are:
> >> >
> >> > 1. Since we turned on protected master pending PRs has grown from 40
> to
> >> 80.
> >> > It is severely slowing down development.
> >> >
> >> >
> >> >  Turning protection off, will give you ability to merge the code
> >> which
> >> > has build failure. How and more importantly who will be best judge to
> >> > figure out what's is wrong ? If CI is a problem then we have people
> from
> >> > infra team, who are sort of 'on call" day and night trying to fix.
> >> > Including me trying to fix the slave at 2 am in night. I have seen
> >> > commiters sending PRs and if it fails without even looking at the
> >> > reason,which could be just a temporary error, they reach out to infra
> >> team
> >> > immediately. So I don't agree that we should turn off the protected
> >> branch
> >> > just for the sake of speed. We should care more on quality than speed.
> >> >
> >> >
> >> > 2. Committers, not CI, are ultimately responsible for the code they
> >> merge.
> >> > You should only override the CI when you are very confident that CI is
> >> the
> >> > problem, not your code. If it turns out you are wrong, you should fix
> it
> >> > ASAP. This is the bare minimum requirement for all committers: BE
> >> > RESPONSIBLE.
> >> >
> >> > How do we make sure that this in deed happens?
> >> >
> >> > I'm aware of the argument for using protected master: It make sure
> that
> >>
> >>
> >>
> >> > master is stable.
> >> >
> >> > Well, master will be most stable if we stop adding any commits to it.
> But
> >> > that's not what we want is it?
> >> >
> >> > No we are not saying don't add anything in master, we are just saying
> >> > please don't add bad code to the master. And yes I have seen bad code
> has
> >> > been merged to the master when protected branch was not enabled.
> >> >
> >> > Protected master hardly adds any stability. The faulty tests that
> breaks
> >> > master at random got merged into master because they happened to
> succeed
> >> > once.
> >> >
> >> > That's not true, it filter out one of the important aspect that at
> least
> >> > when code was merged it completed the whole cycle of build and test.
> Sure
> >> > flaky test we can track down.
> >> >
> 

Re: Protected master needs to be turned off

2017-11-20 Thread Tianqi Chen
+1 until new CI is implemented.

Tianqi

On Mon, Nov 20, 2017 at 11:11 AM, Eric Xie  wrote:

> A lot of people seems to be confused, so let's clarify the separation of
> roles/responsibilities:
>
> 1. The committers that merge code are responsible for code quality and
> tests passing on master.
>
> 2. The CI / infra maintainers are responsible for keeping CI running
> properly and honestly reporting bugs.
> If test fails because Jenkins is faulty or slow, you need to fix Jenkins.
> If test fails because there is a bug in the code, then you did a good job
> catching bugs. It is not your responsibility to fix the bug. You merely
> should report it to committers.
>
> Thanks,
> Junyuan Xie
>
>
> On 2017-11-19 16:07, Gautam  wrote:
> > -1
> >
> > Please see inline.
> >
> > On Nov 19, 2017 12:51 PM, "Eric Xie"  wrote:
> >
> > Hi all,
> > I'm starting this thread to vote on turning off protected master. The
> > reasons are:
> >
> > 1. Since we turned on protected master pending PRs has grown from 40 to
> 80.
> > It is severely slowing down development.
> >
> >
> >  Turning protection off, will give you ability to merge the code
> which
> > has build failure. How and more importantly who will be best judge to
> > figure out what's is wrong ? If CI is a problem then we have people from
> > infra team, who are sort of 'on call" day and night trying to fix.
> > Including me trying to fix the slave at 2 am in night. I have seen
> > commiters sending PRs and if it fails without even looking at the
> > reason,which could be just a temporary error, they reach out to infra
> team
> > immediately. So I don't agree that we should turn off the protected
> branch
> > just for the sake of speed. We should care more on quality than speed.
> >
> >
> > 2. Committers, not CI, are ultimately responsible for the code they
> merge.
> > You should only override the CI when you are very confident that CI is
> the
> > problem, not your code. If it turns out you are wrong, you should fix it
> > ASAP. This is the bare minimum requirement for all committers: BE
> > RESPONSIBLE.
> >
> > How do we make sure that this in deed happens?
> >
> > I'm aware of the argument for using protected master: It make sure that
>
>
>
> > master is stable.
> >
> > Well, master will be most stable if we stop adding any commits to it. But
> > that's not what we want is it?
> >
> > No we are not saying don't add anything in master, we are just saying
> > please don't add bad code to the master. And yes I have seen bad code has
> > been merged to the master when protected branch was not enabled.
> >
> > Protected master hardly adds any stability. The faulty tests that breaks
> > master at random got merged into master because they happened to succeed
> > once.
> >
> > That's not true, it filter out one of the important aspect that at least
> > when code was merged it completed the whole cycle of build and test. Sure
> > flaky test we can track down.
> >
> > Thanks,
> > Junyuan Xie
> >
>


Re: Protected master needs to be turned off

2017-11-20 Thread Meghna Baijal
-1 (non binding)
While I agree that the protected master is slowing development but it is
also helping identify problematic tests. If we don't do this now, the same
issues will exist on the new CI.

On Sun, Nov 19, 2017 at 1:51 PM, Marco de Abreu <
marco.g.ab...@googlemail.com> wrote:

> Hello,
>
> -1 (non binding)
>
> Who is going to be responsible for changes breaking tests and having other
> side effects after they have been merged? I'm afraid that this will harm
> further development. At the moment I'm the responsible person for setting
> up the new CI and so far have my results shown that not the CI itself is
> the problem but also the stability of our code as well as the tests
> themselves. At the moment we are having big issues to get a stable CI
> because MXNet seems to be relying on so specific architectures,
> dependencies and other factors which I'm not even able to track down that
> this causes everything to be unstable.
>
> Just to point it out: If we encounter so many problems while setting up a
> CI system, doesn't that mean that our users and customers are also going to
> face those issues as soon as things are getting more complicated? This is a
> red flag in my opinion and I'm really looking forward to the usability
> Sprint, but at the moment I'm afraid that an unprotected master will make
> the situation even worse. It's already enough work to isolate and fix the
> current issues, but if new untested changes get merged, this is going to be
> like fighting a wildfire with a bottle of water.
>
> So please revise your thoughts. If anybody is blocked by the protected
> master, I would really appreciate it if they could approach me personally
> in order to help stabilising the current situation. Just feeding in more
> and more changes on one end while we're fixing issues on the other end
> won't get us anywhere.
>
> Best regards,
> Marco
>
> Am 19.11.2017 10:08 nachm. schrieb "Chris Olivier"  >:
>
> > Revised:
> >
> >
> > +1 at least until new CI is implemented. Then reevaluate.
> >
> > On Sun, Nov 19, 2017 at 1:07 PM Chris Olivier 
> > wrote:
> >
> > > +1
> > >
> > >
> > > On Sun, Nov 19, 2017 at 12:52 PM Zha, Sheng 
> wrote:
> > >
> > >> +1
> > >>
> > >> Best regards,
> > >> -sz
> > >>
> > >> On 11/19/17, 12:51 PM, "Eric Xie"  wrote:
> > >>
> > >> Hi all,
> > >> I'm starting this thread to vote on turning off protected master.
> > The
> > >> reasons are:
> > >>
> > >> 1. Since we turned on protected master pending PRs has grown from
> 40
> > >> to 80. It is severely slowing down development.
> > >>
> > >> 2. Committers, not CI, are ultimately responsible for the code
> they
> > >> merge. You should only override the CI when you are very confident
> that
> > CI
> > >> is the problem, not your code. If it turns out you are wrong, you
> should
> > >> fix it ASAP. This is the bare minimum requirement for all committers:
> BE
> > >> RESPONSIBLE.
> > >>
> > >> I'm aware of the argument for using protected master: It make sure
> > >> that master is stable.
> > >>
> > >> Well, master will be most stable if we stop adding any commits to
> > it.
> > >> But that's not what we want is it?
> > >>
> > >> Protected master hardly adds any stability. The faulty tests that
> > >> breaks master at random got merged into master because they happened
> to
> > >> succeed once.
> > >>
> > >> Thanks,
> > >> Junyuan Xie
> > >>
> > >>
> > >>
> > >>
> >
>


Re: Protected master needs to be turned off

2017-11-20 Thread Naveen Swamy
-1
for the reasons, Gautam and Marco have pointed out.

On Sun, Nov 19, 2017 at 4:07 PM, Gautam  wrote:

> -1
>
> Please see inline.
>
> On Nov 19, 2017 12:51 PM, "Eric Xie"  wrote:
>
> Hi all,
> I'm starting this thread to vote on turning off protected master. The
> reasons are:
>
> 1. Since we turned on protected master pending PRs has grown from 40 to 80.
> It is severely slowing down development.
>
>
>  Turning protection off, will give you ability to merge the code which
> has build failure. How and more importantly who will be best judge to
> figure out what's is wrong ? If CI is a problem then we have people from
> infra team, who are sort of 'on call" day and night trying to fix.
> Including me trying to fix the slave at 2 am in night. I have seen
> commiters sending PRs and if it fails without even looking at the
> reason,which could be just a temporary error, they reach out to infra team
> immediately. So I don't agree that we should turn off the protected branch
> just for the sake of speed. We should care more on quality than speed.
>
>
> 2. Committers, not CI, are ultimately responsible for the code they merge.
> You should only override the CI when you are very confident that CI is the
> problem, not your code. If it turns out you are wrong, you should fix it
> ASAP. This is the bare minimum requirement for all committers: BE
> RESPONSIBLE.
>
> How do we make sure that this in deed happens?
>
> I'm aware of the argument for using protected master: It make sure that
> master is stable.
>
> Well, master will be most stable if we stop adding any commits to it. But
> that's not what we want is it?
>
> No we are not saying don't add anything in master, we are just saying
> please don't add bad code to the master. And yes I have seen bad code has
> been merged to the master when protected branch was not enabled.
>
> Protected master hardly adds any stability. The faulty tests that breaks
> master at random got merged into master because they happened to succeed
> once.
>
> That's not true, it filter out one of the important aspect that at least
> when code was merged it completed the whole cycle of build and test. Sure
> flaky test we can track down.
>
> Thanks,
> Junyuan Xie
>


Re: Protected master needs to be turned off

2017-11-19 Thread Gautam
-1

Please see inline.

On Nov 19, 2017 12:51 PM, "Eric Xie"  wrote:

Hi all,
I'm starting this thread to vote on turning off protected master. The
reasons are:

1. Since we turned on protected master pending PRs has grown from 40 to 80.
It is severely slowing down development.


 Turning protection off, will give you ability to merge the code which
has build failure. How and more importantly who will be best judge to
figure out what's is wrong ? If CI is a problem then we have people from
infra team, who are sort of 'on call" day and night trying to fix.
Including me trying to fix the slave at 2 am in night. I have seen
commiters sending PRs and if it fails without even looking at the
reason,which could be just a temporary error, they reach out to infra team
immediately. So I don't agree that we should turn off the protected branch
just for the sake of speed. We should care more on quality than speed.


2. Committers, not CI, are ultimately responsible for the code they merge.
You should only override the CI when you are very confident that CI is the
problem, not your code. If it turns out you are wrong, you should fix it
ASAP. This is the bare minimum requirement for all committers: BE
RESPONSIBLE.

How do we make sure that this in deed happens?

I'm aware of the argument for using protected master: It make sure that
master is stable.

Well, master will be most stable if we stop adding any commits to it. But
that's not what we want is it?

No we are not saying don't add anything in master, we are just saying
please don't add bad code to the master. And yes I have seen bad code has
been merged to the master when protected branch was not enabled.

Protected master hardly adds any stability. The faulty tests that breaks
master at random got merged into master because they happened to succeed
once.

That's not true, it filter out one of the important aspect that at least
when code was merged it completed the whole cycle of build and test. Sure
flaky test we can track down.

Thanks,
Junyuan Xie


Re: Protected master needs to be turned off

2017-11-19 Thread Eric Xie
Committer right is a privilege. If someone keeps merging untested code and 
refuses to fix the problems they have caused then he or she shouldn't be a 
committer.

Using protected master doesn't guarantee that code is well tested. You can 
disable the test and merge code instead of actually fixing the bug. Using 
protected master actually encourages this behavior, since people are focused on 
getting tests to pass. It doesn't do much good towards improving code quality.

As far as I know, no other team managing large code bases uses protected master 
without the ability to override.

Thanks,
Junyuan Xie


On 2017-11-19 13:51, Marco de Abreu  wrote: 
> Hello,
> 
> -1 (non binding)
> 
> Who is going to be responsible for changes breaking tests and having other
> side effects after they have been merged? I'm afraid that this will harm
> further development. At the moment I'm the responsible person for setting
> up the new CI and so far have my results shown that not the CI itself is
> the problem but also the stability of our code as well as the tests
> themselves. At the moment we are having big issues to get a stable CI
> because MXNet seems to be relying on so specific architectures,
> dependencies and other factors which I'm not even able to track down that
> this causes everything to be unstable.
> 
> Just to point it out: If we encounter so many problems while setting up a
> CI system, doesn't that mean that our users and customers are also going to
> face those issues as soon as things are getting more complicated? This is a
> red flag in my opinion and I'm really looking forward to the usability
> Sprint, but at the moment I'm afraid that an unprotected master will make
> the situation even worse. It's already enough work to isolate and fix the
> current issues, but if new untested changes get merged, this is going to be
> like fighting a wildfire with a bottle of water.
> 
> So please revise your thoughts. If anybody is blocked by the protected
> master, I would really appreciate it if they could approach me personally
> in order to help stabilising the current situation. Just feeding in more
> and more changes on one end while we're fixing issues on the other end
> won't get us anywhere.
> 
> Best regards,
> Marco
> 
> Am 19.11.2017 10:08 nachm. schrieb "Chris Olivier" :
> 
> > Revised:
> >
> >
> > +1 at least until new CI is implemented. Then reevaluate.
> >
> > On Sun, Nov 19, 2017 at 1:07 PM Chris Olivier 
> > wrote:
> >
> > > +1
> > >
> > >
> > > On Sun, Nov 19, 2017 at 12:52 PM Zha, Sheng  wrote:
> > >
> > >> +1
> > >>
> > >> Best regards,
> > >> -sz
> > >>
> > >> On 11/19/17, 12:51 PM, "Eric Xie"  wrote:
> > >>
> > >> Hi all,
> > >> I'm starting this thread to vote on turning off protected master.
> > The
> > >> reasons are:
> > >>
> > >> 1. Since we turned on protected master pending PRs has grown from 40
> > >> to 80. It is severely slowing down development.
> > >>
> > >> 2. Committers, not CI, are ultimately responsible for the code they
> > >> merge. You should only override the CI when you are very confident that
> > CI
> > >> is the problem, not your code. If it turns out you are wrong, you should
> > >> fix it ASAP. This is the bare minimum requirement for all committers: BE
> > >> RESPONSIBLE.
> > >>
> > >> I'm aware of the argument for using protected master: It make sure
> > >> that master is stable.
> > >>
> > >> Well, master will be most stable if we stop adding any commits to
> > it.
> > >> But that's not what we want is it?
> > >>
> > >> Protected master hardly adds any stability. The faulty tests that
> > >> breaks master at random got merged into master because they happened to
> > >> succeed once.
> > >>
> > >> Thanks,
> > >> Junyuan Xie
> > >>
> > >>
> > >>
> > >>
> >
> 


Re: Protected master needs to be turned off

2017-11-19 Thread Chris Olivier
A possible compromise could be that we turn off protection for master
branch for the Apache infra only with plans to turn protection back on when
new CI is up.

Furthermore, if something is merged in prematurely that breaks master for
some number of time (ie 3), then we turn protection back on immediately and
without a vote (in order to enforce great care by those who would vote for
protection to stay off).  These failures could be broadcast on dev and why
they shouldn’t count as one of the three can be disputed there.  To turn
protection off again would require another vote.


On Sun, Nov 19, 2017 at 2:37 PM Zha, Sheng  wrote:

> My +1 vote stands. The vote is about what we should do right now, not
> where we should be ideally in 3 months. I don’t think we can move forward
> without disabling the branch protection, because current CI is not in any
> state to base the merge-decisions on. For example, here’s why:
> 1. Master branch protection is currently on. A change that breaks build
> was still merged in on the 16th despite the protection. I wasn’t able to
> merge the fix in yesterday for 8 hours because the CI tests fail.
> 2. False negative rate is currently too high (see the red crosses in
> https://github.com/apache/incubator-mxnet/pulls and
> https://github.com/apache/incubator-mxnet/commits/master)
>
> People who are working on test infrastructure might say that it’s “enough
> work to isolate and fix the current issues”, and I can certainly relate to
> that. On the other hand, you too can probably empathize with the developers
> who has “enough work to develop new features and write tests” without
> having to deal with the broken CI. (note that my argument is on the CI
> system, and flaky test cases are a separate issue).
>
> Regarding “doesn't that mean that our users and customers are also going
> to face those issues”, I honestly don’t think the argument stands. Release
> cycles and distribution channels, as well as the safety measures are there
> exactly to isolate the problems to the development branch and protect the
> users. If anything, turning on branch protection on release branches should
> suffice.
>
> Finally, master branch protection being off doesn’t mean PRs can be merged
> without being tested. Contributors own the code quality and are responsible
> for the changes. Committers and reviewers are there to ensure that merged
> changes are OK.
>
> Best regards,
> -sz
>
> On 11/19/17, 1:51 PM, "Marco de Abreu" 
> wrote:
>
> Hello,
>
> -1 (non binding)
>
> Who is going to be responsible for changes breaking tests and having
> other
> side effects after they have been merged? I'm afraid that this will
> harm
> further development. At the moment I'm the responsible person for
> setting
> up the new CI and so far have my results shown that not the CI itself
> is
> the problem but also the stability of our code as well as the tests
> themselves. At the moment we are having big issues to get a stable CI
> because MXNet seems to be relying on so specific architectures,
> dependencies and other factors which I'm not even able to track down
> that
> this causes everything to be unstable.
>
> Just to point it out: If we encounter so many problems while setting
> up a
> CI system, doesn't that mean that our users and customers are also
> going to
> face those issues as soon as things are getting more complicated? This
> is a
> red flag in my opinion and I'm really looking forward to the usability
> Sprint, but at the moment I'm afraid that an unprotected master will
> make
> the situation even worse. It's already enough work to isolate and fix
> the
> current issues, but if new untested changes get merged, this is going
> to be
> like fighting a wildfire with a bottle of water.
>
> So please revise your thoughts. If anybody is blocked by the protected
> master, I would really appreciate it if they could approach me
> personally
> in order to help stabilising the current situation. Just feeding in
> more
> and more changes on one end while we're fixing issues on the other end
> won't get us anywhere.
>
> Best regards,
> Marco
>
> Am 19.11.2017 10:08 nachm. schrieb "Chris Olivier" <
> cjolivie...@gmail.com>:
>
> > Revised:
> >
> >
> > +1 at least until new CI is implemented. Then reevaluate.
> >
> > On Sun, Nov 19, 2017 at 1:07 PM Chris Olivier  >
> > wrote:
> >
> > > +1
> > >
> > >
> > > On Sun, Nov 19, 2017 at 12:52 PM Zha, Sheng 
> wrote:
> > >
> > >> +1
> > >>
> > >> Best regards,
> > >> -sz
> > >>
> > >> On 11/19/17, 12:51 PM, "Eric Xie"  wrote:
> > >>
> > >> Hi all,
> > >> I'm starting this thread to vote on turning off protected
> master.
> > The
> > >> reasons are:
> 

Re: Protected master needs to be turned off

2017-11-19 Thread Zha, Sheng
My +1 vote stands. The vote is about what we should do right now, not where we 
should be ideally in 3 months. I don’t think we can move forward without 
disabling the branch protection, because current CI is not in any state to base 
the merge-decisions on. For example, here’s why:
1. Master branch protection is currently on. A change that breaks build was 
still merged in on the 16th despite the protection. I wasn’t able to merge the 
fix in yesterday for 8 hours because the CI tests fail.
2. False negative rate is currently too high (see the red crosses in 
https://github.com/apache/incubator-mxnet/pulls and 
https://github.com/apache/incubator-mxnet/commits/master)

People who are working on test infrastructure might say that it’s “enough work 
to isolate and fix the current issues”, and I can certainly relate to that. On 
the other hand, you too can probably empathize with the developers who has 
“enough work to develop new features and write tests” without having to deal 
with the broken CI. (note that my argument is on the CI system, and flaky test 
cases are a separate issue).

Regarding “doesn't that mean that our users and customers are also going to 
face those issues”, I honestly don’t think the argument stands. Release cycles 
and distribution channels, as well as the safety measures are there exactly to 
isolate the problems to the development branch and protect the users. If 
anything, turning on branch protection on release branches should suffice.

Finally, master branch protection being off doesn’t mean PRs can be merged 
without being tested. Contributors own the code quality and are responsible for 
the changes. Committers and reviewers are there to ensure that merged changes 
are OK.

Best regards,
-sz

On 11/19/17, 1:51 PM, "Marco de Abreu"  wrote:

Hello,

-1 (non binding)

Who is going to be responsible for changes breaking tests and having other
side effects after they have been merged? I'm afraid that this will harm
further development. At the moment I'm the responsible person for setting
up the new CI and so far have my results shown that not the CI itself is
the problem but also the stability of our code as well as the tests
themselves. At the moment we are having big issues to get a stable CI
because MXNet seems to be relying on so specific architectures,
dependencies and other factors which I'm not even able to track down that
this causes everything to be unstable.

Just to point it out: If we encounter so many problems while setting up a
CI system, doesn't that mean that our users and customers are also going to
face those issues as soon as things are getting more complicated? This is a
red flag in my opinion and I'm really looking forward to the usability
Sprint, but at the moment I'm afraid that an unprotected master will make
the situation even worse. It's already enough work to isolate and fix the
current issues, but if new untested changes get merged, this is going to be
like fighting a wildfire with a bottle of water.

So please revise your thoughts. If anybody is blocked by the protected
master, I would really appreciate it if they could approach me personally
in order to help stabilising the current situation. Just feeding in more
and more changes on one end while we're fixing issues on the other end
won't get us anywhere.

Best regards,
Marco

Am 19.11.2017 10:08 nachm. schrieb "Chris Olivier" :

> Revised:
>
>
> +1 at least until new CI is implemented. Then reevaluate.
>
> On Sun, Nov 19, 2017 at 1:07 PM Chris Olivier 
> wrote:
>
> > +1
> >
> >
> > On Sun, Nov 19, 2017 at 12:52 PM Zha, Sheng  wrote:
> >
> >> +1
> >>
> >> Best regards,
> >> -sz
> >>
> >> On 11/19/17, 12:51 PM, "Eric Xie"  wrote:
> >>
> >> Hi all,
> >> I'm starting this thread to vote on turning off protected master.
> The
> >> reasons are:
> >>
> >> 1. Since we turned on protected master pending PRs has grown from 
40
> >> to 80. It is severely slowing down development.
> >>
> >> 2. Committers, not CI, are ultimately responsible for the code they
> >> merge. You should only override the CI when you are very confident that
> CI
> >> is the problem, not your code. If it turns out you are wrong, you 
should
> >> fix it ASAP. This is the bare minimum requirement for all committers: 
BE
> >> RESPONSIBLE.
> >>
> >> I'm aware of the argument for using protected master: It make sure
> >> that master is stable.
> >>
> >> Well, master will be most stable if we stop adding any commits to
> it.
> >> But that's not what we want is it?
> >>
> >> Protected master hardly 

Re: Protected master needs to be turned off

2017-11-19 Thread Marco de Abreu
Hello,

-1 (non binding)

Who is going to be responsible for changes breaking tests and having other
side effects after they have been merged? I'm afraid that this will harm
further development. At the moment I'm the responsible person for setting
up the new CI and so far have my results shown that not the CI itself is
the problem but also the stability of our code as well as the tests
themselves. At the moment we are having big issues to get a stable CI
because MXNet seems to be relying on so specific architectures,
dependencies and other factors which I'm not even able to track down that
this causes everything to be unstable.

Just to point it out: If we encounter so many problems while setting up a
CI system, doesn't that mean that our users and customers are also going to
face those issues as soon as things are getting more complicated? This is a
red flag in my opinion and I'm really looking forward to the usability
Sprint, but at the moment I'm afraid that an unprotected master will make
the situation even worse. It's already enough work to isolate and fix the
current issues, but if new untested changes get merged, this is going to be
like fighting a wildfire with a bottle of water.

So please revise your thoughts. If anybody is blocked by the protected
master, I would really appreciate it if they could approach me personally
in order to help stabilising the current situation. Just feeding in more
and more changes on one end while we're fixing issues on the other end
won't get us anywhere.

Best regards,
Marco

Am 19.11.2017 10:08 nachm. schrieb "Chris Olivier" :

> Revised:
>
>
> +1 at least until new CI is implemented. Then reevaluate.
>
> On Sun, Nov 19, 2017 at 1:07 PM Chris Olivier 
> wrote:
>
> > +1
> >
> >
> > On Sun, Nov 19, 2017 at 12:52 PM Zha, Sheng  wrote:
> >
> >> +1
> >>
> >> Best regards,
> >> -sz
> >>
> >> On 11/19/17, 12:51 PM, "Eric Xie"  wrote:
> >>
> >> Hi all,
> >> I'm starting this thread to vote on turning off protected master.
> The
> >> reasons are:
> >>
> >> 1. Since we turned on protected master pending PRs has grown from 40
> >> to 80. It is severely slowing down development.
> >>
> >> 2. Committers, not CI, are ultimately responsible for the code they
> >> merge. You should only override the CI when you are very confident that
> CI
> >> is the problem, not your code. If it turns out you are wrong, you should
> >> fix it ASAP. This is the bare minimum requirement for all committers: BE
> >> RESPONSIBLE.
> >>
> >> I'm aware of the argument for using protected master: It make sure
> >> that master is stable.
> >>
> >> Well, master will be most stable if we stop adding any commits to
> it.
> >> But that's not what we want is it?
> >>
> >> Protected master hardly adds any stability. The faulty tests that
> >> breaks master at random got merged into master because they happened to
> >> succeed once.
> >>
> >> Thanks,
> >> Junyuan Xie
> >>
> >>
> >>
> >>
>


Re: Protected master needs to be turned off

2017-11-19 Thread Chris Olivier
+1


On Sun, Nov 19, 2017 at 12:52 PM Zha, Sheng  wrote:

> +1
>
> Best regards,
> -sz
>
> On 11/19/17, 12:51 PM, "Eric Xie"  wrote:
>
> Hi all,
> I'm starting this thread to vote on turning off protected master. The
> reasons are:
>
> 1. Since we turned on protected master pending PRs has grown from 40
> to 80. It is severely slowing down development.
>
> 2. Committers, not CI, are ultimately responsible for the code they
> merge. You should only override the CI when you are very confident that CI
> is the problem, not your code. If it turns out you are wrong, you should
> fix it ASAP. This is the bare minimum requirement for all committers: BE
> RESPONSIBLE.
>
> I'm aware of the argument for using protected master: It make sure
> that master is stable.
>
> Well, master will be most stable if we stop adding any commits to it.
> But that's not what we want is it?
>
> Protected master hardly adds any stability. The faulty tests that
> breaks master at random got merged into master because they happened to
> succeed once.
>
> Thanks,
> Junyuan Xie
>
>
>
>