Re: [DISCUSS] CI improvements

2017-11-02 Thread Sean Goller
precheckin is literally './gradlew build :geode-assembly:acceptanceTest
integrationTest distributedTest flakyTest' :)

-S.
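
For anyone weighing a split, here is a minimal Python sketch of what breaking the command up could look like. The task names are copied from the command above; running one Gradle invocation per task is only an assumption for illustration (separate invocations may repeat compilation work), and the helper names are hypothetical.

```python
# Task names are taken from the precheckin command quoted in this message.
# Splitting them one-per-invocation is a hypothetical layout, not an agreed plan.
PRECHECKIN_TASKS = [
    "build",
    ":geode-assembly:acceptanceTest",
    "integrationTest",
    "distributedTest",
    "flakyTest",
]


def precheckin_command(gradlew="./gradlew"):
    """Return the single-invocation precheckin command as an argv list."""
    return [gradlew] + PRECHECKIN_TASKS


def split_commands(gradlew="./gradlew"):
    """Return one argv list per task so each could run as its own CI job."""
    return [[gradlew, task] for task in PRECHECKIN_TASKS]


def covers_everything(commands):
    """Check that a proposed split runs every precheckin task exactly once."""
    ran = [task for cmd in commands for task in cmd[1:]]
    return sorted(ran) == sorted(PRECHECKIN_TASKS)


if __name__ == "__main__":
    for cmd in split_commands():
        print(" ".join(cmd))
    print("nothing missed:", covers_everything(split_commands()))
```

The `covers_everything` check is there so that any regrouping can be verified against the original task list before it replaces the single command.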

On Thu, Nov 2, 2017 at 1:10 PM, Dan Smith wrote:

> On Thu, Nov 2, 2017 at 11:58 AM, Sean Goller  wrote:
>
> > Given the length of time precheckin seems to run, would it make sense to
> > break it up?
> >
> > -Sean.
> >
>
> Sure, as long as we don't miss anything :)
>
> -Dan
>


Re: [DISCUSS] CI improvements

2017-11-02 Thread Dan Smith
On Thu, Nov 2, 2017 at 11:58 AM, Sean Goller wrote:

> Given the length of time precheckin seems to run, would it make sense to
> break it up?
>
> -Sean.
>

Sure, as long as we don't miss anything :)

-Dan


Re: [DISCUSS] CI improvements

2017-11-02 Thread Sean Goller
Given the length of time precheckin seems to run, would it make sense to
break it up?

-Sean.

On Thu, Nov 2, 2017 at 11:49 AM, Dan Smith wrote:

> Looks good. Should we go ahead and change this to run precheckin instead of
> build?
>
> -Dan
>
> On Thu, Nov 2, 2017 at 9:53 AM, Anthony Baker  wrote:
>
> > If you’d like to check this out, here’s the PR containing the pipeline
> and
> > job scripts:
> > https://github.com/apache/geode/pull/1006
> >
> > And the pipeline itself:
> > https://concourse.apachegeode-ci.info
> >
> > There are three pipelines defined:
> >
> > - develop:  runs `gradle build`.  Can be extended to include other
> > precheckin tests based on feedback.
> > - docker-images: builds the container used for the develop pipeline.
> > - meta: watches for changes to the pipeline files and automatically
> > updates the runtime pipelines.
> >
> > Authentication is integrated with GitHub.  If you want the ability to
> > manually stop/start jobs please request on the dev@g.a.o mailing list
> > (same as for Jenkins) and include your GitHub id.
> >
> > What do you think?
> >
> > Anthony
> >
> > > On Oct 6, 2017, at 7:08 AM, Anthony Baker  wrote:
> > >
> > > Hi all,
> > >
> > > I’d like to propose the following that we switch our continuous
> > > integration (CI) system from Jenkins [1] to Concourse [2].  I suggest
> > > this because we continue to experience a significant number of
> > > environmental-related test failures.
> > >
> > > These issues include CPU interference from other Jenkins jobs on the
> > > same host, running out of disk space, port conflicts, and other
> > > gremlins.  The net effect is that we are only getting 1-2 successful
> > > builds per month.  Certainly not all test failures can be traced back
> > > to environmental issues.  However, internal testing on isolated VM’s
> > > shows a combined success rate of about 3X higher compared to ASF
> > > Jenkins for the same tests.  This is still definitely NotAwesome, but
> > > removing environmental factors will let us focus on stabilizing flaky
> > > tests.
> > >
> > > Concourse is an Apache-licensed open source CI system based on
> > > pipelines.  The pipelines are defined in a YML file containing job
> > > definitions—inputs, outputs, resources, and tasks.  A task is simply a
> > > bash script that returns 0/1 for success/failure.  A web UI displays
> > > build status.  Importantly, each job runs inside an isolated
> > > container.  The containers are load-balanced across a pool of workers.
> > > For an example of a build pipeline, see [3] for the pipeline used to
> > > build concourse itself.
> > >
> > > A Concourse environment is deployed and managed in cloud environments
> > > through bosh [4].  Pivotal has agreed to donate AWS and/or GCP compute
> > > and storage resources as well as manage the infrastructure.  These
> > > project resources would be available for use by all committers and
> > > community members regardless of corporate affiliations.  Note that
> > > AFAIK there is no explicit requirement to host CI on ASF
> > > infrastructure—unlike for critical project resources such as source
> > > code, mailing lists, and issue tracking.
> > >
> > > The source for the pipeline and job scripts would reside within the
> > > geode-* repos.  Geode committers would be able to modify those, same
> > > as with our .travis.yml scripts.  All test results and build artifacts
> > > would be publicly viewable just like with our Jenkins build output
> > > today.  Requests for admin assistance would go through the dev@geode
> > > mailing list.
> > >
> > > Thoughts?  As a first step we could run both CI systems side-by-side
> > > and see how the Concourse approach works for our project.
> > >
> > > Thanks,
> > > Anthony
> > >
> > >
> > > [1] https://builds.apache.org/job/Geode-nightly/
> > > [2] https://concourse.ci
> > > [3] https://ci.concourse.ci
> > > [4] https://bosh.io
> >
> >
>


Re: [DISCUSS] CI improvements

2017-11-02 Thread Dan Smith
Looks good. Should we go ahead and change this to run precheckin instead of
build?

-Dan

On Thu, Nov 2, 2017 at 9:53 AM, Anthony Baker wrote:

> If you’d like to check this out, here’s the PR containing the pipeline and
> job scripts:
> https://github.com/apache/geode/pull/1006
>
> And the pipeline itself:
> https://concourse.apachegeode-ci.info
>
> There are three pipelines defined:
>
> - develop:  runs `gradle build`.  Can be extended to include other
> precheckin tests based on feedback.
> - docker-images: builds the container used for the develop pipeline.
> - meta: watches for changes to the pipeline files and automatically
> updates the runtime pipelines.
>
> Authentication is integrated with GitHub.  If you want the ability to
> manually stop/start jobs please request on the dev@g.a.o mailing list
> (same as for Jenkins) and include your GitHub id.
>
> What do you think?
>
> Anthony
>
> > On Oct 6, 2017, at 7:08 AM, Anthony Baker  wrote:
> >
> > Hi all,
> >
> > I’d like to propose the following that we switch our continuous
> > integration (CI) system from Jenkins [1] to Concourse [2].  I suggest
> > this because we continue to experience a significant number of
> > environmental-related test failures.
> >
> > These issues include CPU interference from other Jenkins jobs on the
> > same host, running out of disk space, port conflicts, and other
> > gremlins.  The net effect is that we are only getting 1-2 successful
> > builds per month.  Certainly not all test failures can be traced back
> > to environmental issues.  However, internal testing on isolated VM’s
> > shows a combined success rate of about 3X higher compared to ASF
> > Jenkins for the same tests.  This is still definitely NotAwesome, but
> > removing environmental factors will let us focus on stabilizing flaky
> > tests.
> >
> > Concourse is an Apache-licensed open source CI system based on
> > pipelines.  The pipelines are defined in a YML file containing job
> > definitions—inputs, outputs, resources, and tasks.  A task is simply a
> > bash script that returns 0/1 for success/failure.  A web UI displays
> > build status.  Importantly, each job runs inside an isolated
> > container.  The containers are load-balanced across a pool of workers.
> > For an example of a build pipeline, see [3] for the pipeline used to
> > build concourse itself.
> >
> > A Concourse environment is deployed and managed in cloud environments
> > through bosh [4].  Pivotal has agreed to donate AWS and/or GCP compute
> > and storage resources as well as manage the infrastructure.  These
> > project resources would be available for use by all committers and
> > community members regardless of corporate affiliations.  Note that
> > AFAIK there is no explicit requirement to host CI on ASF
> > infrastructure—unlike for critical project resources such as source
> > code, mailing lists, and issue tracking.
> >
> > The source for the pipeline and job scripts would reside within the
> > geode-* repos.  Geode committers would be able to modify those, same
> > as with our .travis.yml scripts.  All test results and build artifacts
> > would be publicly viewable just like with our Jenkins build output
> > today.  Requests for admin assistance would go through the dev@geode
> > mailing list.
> >
> > Thoughts?  As a first step we could run both CI systems side-by-side
> > and see how the Concourse approach works for our project.
> >
> > Thanks,
> > Anthony
> >
> >
> > [1] https://builds.apache.org/job/Geode-nightly/
> > [2] https://concourse.ci
> > [3] https://ci.concourse.ci
> > [4] https://bosh.io
>
>


Re: [DISCUSS] CI improvements

2017-11-02 Thread Anthony Baker
If you’d like to check this out, here’s the PR containing the pipeline and job 
scripts:
https://github.com/apache/geode/pull/1006

And the pipeline itself:
https://concourse.apachegeode-ci.info

There are three pipelines defined:

- develop:  runs `gradle build`.  Can be extended to include other precheckin 
tests based on feedback.
- docker-images: builds the container used for the develop pipeline.
- meta: watches for changes to the pipeline files and automatically updates the 
runtime pipelines.
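
To illustrate the idea behind the meta pipeline, here is a minimal Python sketch of "update a pipeline when its file changes". It is not how the pipeline in the PR is implemented (that presumably relies on Concourse's own resources); the target name `geode-ci`, the file path, and the helper names are hypothetical. The only real interface assumed is the Concourse CLI command `fly set-pipeline`, and the sketch builds the command without running it.

```python
import hashlib


def file_digest(content: bytes) -> str:
    """Fingerprint a pipeline file so that changes can be detected."""
    return hashlib.sha256(content).hexdigest()


def set_pipeline_command(target, pipeline, config_path):
    """Build the fly invocation that (re)configures one pipeline.

    -n makes fly apply the change without an interactive prompt.
    """
    return ["fly", "-t", target, "set-pipeline",
            "-p", pipeline, "-c", config_path, "-n"]


def commands_for_changes(seen_digests, configs, target="geode-ci"):
    """Return fly commands for every pipeline whose file content changed.

    seen_digests: pipeline name -> digest recorded at the last update
    configs:      pipeline name -> (config path, current file content)
    """
    commands = []
    for name, (path, content) in sorted(configs.items()):
        if seen_digests.get(name) != file_digest(content):
            commands.append(set_pipeline_command(target, name, path))
    return commands


if __name__ == "__main__":
    # Hypothetical path and content, for illustration only.
    configs = {"develop": ("ci/develop.yml", b"jobs: []")}
    for cmd in commands_for_changes({}, configs):
        print(" ".join(cmd))
```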

Authentication is integrated with GitHub.  If you want the ability to manually 
stop/start jobs, please request it on the dev@g.a.o mailing list (same as for 
Jenkins) and include your GitHub ID.

What do you think?

Anthony

> On Oct 6, 2017, at 7:08 AM, Anthony Baker wrote:
> 
> Hi all,
> 
> I’d like to propose the following that we switch our continuous
> integration (CI) system from Jenkins [1] to Concourse [2].  I suggest
> this because we continue to experience a significant number of
> environmental-related test failures.
> 
> These issues include CPU interference from other Jenkins jobs on the
> same host, running out of disk space, port conflicts, and other
> gremlins.  The net effect is that we are only getting 1-2 successful
> builds per month.  Certainly not all test failures can be traced back
> to environmental issues.  However, internal testing on isolated VM’s
> shows a combined success rate of about 3X higher compared to ASF
> Jenkins for the same tests.  This is still definitely NotAwesome, but
> removing environmental factors will let us focus on stabilizing flaky
> tests.
> 
> Concourse is an Apache-licensed open source CI system based on
> pipelines.  The pipelines are defined in a YML file containing job
> definitions—inputs, outputs, resources, and tasks.  A task is simply a
> bash script that returns 0/1 for success/failure.  A web UI displays
> build status.  Importantly, each job runs inside an isolated
> container.  The containers are load-balanced across a pool of workers.
> For an example of a build pipeline, see [3] for the pipeline used to
> build concourse itself.
> 
> A Concourse environment is deployed and managed in cloud environments
> through bosh [4].  Pivotal has agreed to donate AWS and/or GCP compute
> and storage resources as well as manage the infrastructure.  These
> project resources would be available for use by all committers and
> community members regardless of corporate affiliations.  Note that
> AFAIK there is no explicit requirement to host CI on ASF
> infrastructure—unlike for critical project resources such as source
> code, mailing lists, and issue tracking.
> 
> The source for the pipeline and job scripts would reside within the
> geode-* repos.  Geode committers would be able to modify those, same
> as with our .travis.yml scripts.  All test results and build artifacts
> would be publicly viewable just like with our Jenkins build output
> today.  Requests for admin assistance would go through the dev@geode
> mailing list.
> 
> Thoughts?  As a first step we could run both CI systems side-by-side
> and see how the Concourse approach works for our project.
> 
> Thanks,
> Anthony
> 
> 
> [1] https://builds.apache.org/job/Geode-nightly/
> [2] https://concourse.ci
> [3] https://ci.concourse.ci
> [4] https://bosh.io
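
The proposal quoted above describes a task as a script that returns 0/1 for success/failure. A minimal Python sketch of that exit-status convention follows. It is not Concourse code; `run_task` is a hypothetical helper, and the two demo commands simply exit with a chosen status.

```python
import subprocess
import sys


def run_task(argv):
    """Run a command and treat exit status 0 as success, anything else as failure."""
    completed = subprocess.run(argv)
    return completed.returncode == 0


if __name__ == "__main__":
    # Two stand-in "tasks": one exits with 0, the other with 1.
    passing = [sys.executable, "-c", "raise SystemExit(0)"]
    failing = [sys.executable, "-c", "raise SystemExit(1)"]
    print("passing task succeeded:", run_task(passing))
    print("failing task succeeded:", run_task(failing))
```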



Re: [DISCUSS] CI improvements

2017-10-06 Thread Anthony Baker
Comments inline

> On Oct 6, 2017, at 11:51 AM, Mark Bretl wrote:
> 
> Correct, there is no requirement from ASF about where to run CI or even to
> run CI. I am all for the best tools and stable (and repeatable)
> environments. I would be open to seeing how Concourse could work for Geode.
> 
> Once the Concourse environment is set up, it would probably be best to update
> the Jenkins job to a pipeline job type as well.
> 
> There are a few questions I have, which can be discussed as we go:
> - What does 'donate' mean for Pivotal? Is there a, for lack of a better
> term, contract or is this a 'good faith' effort? I am hesitant to go to any
> corporately controlled infrastructure, especially since the entity could
> decide to stop funding.

Pivotal is offering to fund a cloud account with resources for the Geode 
project.  I can’t predict the future, but I wouldn’t be surprised to see more 
ASF projects receiving this kind of support, given the increasing popularity 
of cloud computing (not to mention the complexity of donating capital 
assets).  If at some point Pivotal does stop funding this effort and we can’t 
find another donor, we can always fall back to ASF Jenkins :-)

> - What types of activities will require admin assistance?

Admin support should be fairly minimal and would include activities like 
initial deployment and configuration, upgrades to bosh / concourse, adding more 
compute resources, or troubleshooting occasional cloud hiccups.

> - Who will be watching the dev list for requests?

Assuming we reach consensus to move forward with this proposal, expect to see 
some introductions on the dev list.  Does that help?

> 
> --Mark



Re: [DISCUSS] CI improvements

2017-10-06 Thread Mark Bretl
Correct, there is no requirement from ASF about where to run CI or even to
run CI. I am all for the best tools and stable (and repeatable)
environments. I would be open to seeing how Concourse could work for Geode.

Once the Concourse environment is set up, it would probably be best to update
the Jenkins job to a pipeline job type as well.

There are a few questions I have, which can be discussed as we go:
- What does 'donate' mean for Pivotal? Is there a, for lack of a better
term, contract or is this a 'good faith' effort? I am hesitant to go to any
corporately controlled infrastructure, especially since the entity could
decide to stop funding.
- What types of activities will require admin assistance?
- Who will be watching the dev list for requests?

--Mark

On Fri, Oct 6, 2017 at 10:05 AM, Kenneth Howe wrote:

> +1 - Reduce the noise level for analyzing CI results
>
> > On Oct 6, 2017, at 9:49 AM, Udo Kohlmeyer  wrote:
> >
> > +1 Switch... parallel runs would be safest
> >
> > On Fri, Oct 6, 2017 at 9:42 AM, Jianxia Chen  wrote:
> >
> >> +1 to switch to Concourse
> >> +1 As a first step we could run both CI systems side-by-side and see how
> >> the Concourse approach works for our project.
> >>
> >> On Fri, Oct 6, 2017 at 7:08 AM, Anthony Baker 
> wrote:
> >>
> >>> Hi all,
> >>>
> >>> I’d like to propose the following that we switch our continuous
> >>> integration (CI) system from Jenkins [1] to Concourse [2].  I suggest
> >>> this because we continue to experience a significant number of
> >>> environmental-related test failures.
> >>>
> >>> These issues include CPU interference from other Jenkins jobs on the
> >>> same host, running out of disk space, port conflicts, and other
> >>> gremlins.  The net effect is that we are only getting 1-2 successful
> >>> builds per month.  Certainly not all test failures can be traced back
> >>> to environmental issues.  However, internal testing on isolated VM’s
> >>> shows a combined success rate of about 3X higher compared to ASF
> >>> Jenkins for the same tests.  This is still definitely NotAwesome, but
> >>> removing environmental factors will let us focus on stabilizing flaky
> >>> tests.
> >>>
> >>> Concourse is an Apache-licensed open source CI system based on
> >>> pipelines.  The pipelines are defined in a YML file containing job
> >>> definitions—inputs, outputs, resources, and tasks.  A task is simply a
> >>> bash script that returns 0/1 for success/failure.  A web UI displays
> >>> build status.  Importantly, each job runs inside an isolated
> >>> container.  The containers are load-balanced across a pool of workers.
> >>> For an example of a build pipeline, see [3] for the pipeline used to
> >>> build concourse itself.
> >>>
> >>> A Concourse environment is deployed and managed in cloud environments
> >>> through bosh [4].  Pivotal has agreed to donate AWS and/or GCP compute
> >>> and storage resources as well as manage the infrastructure.  These
> >>> project resources would be available for use by all committers and
> >>> community members regardless of corporate affiliations.  Note that
> >>> AFAIK there is no explicit requirement to host CI on ASF
> >>> infrastructure—unlike for critical project resources such as source
> >>> code, mailing lists, and issue tracking.
> >>>
> >>> The source for the pipeline and job scripts would reside within the
> >>> geode-* repos.  Geode committers would be able to modify those, same
> >>> as with our .travis.yml scripts.  All test results and build artifacts
> >>> would be publicly viewable just like with our Jenkins build output
> >>> today.  Requests for admin assistance would go through the dev@geode
> >>> mailing list.
> >>>
> >>> Thoughts?  As a first step we could run both CI systems side-by-side
> >>> and see how the Concourse approach works for our project.
> >>>
> >>> Thanks,
> >>> Anthony
> >>>
> >>>
> >>> [1] https://builds.apache.org/job/Geode-nightly/
> >>> [2] https://concourse.ci
> >>> [3] https://ci.concourse.ci
> >>> [4] https://bosh.io
> >>>
> >>
> >
> >
> >
> > --
> > Kindest Regards
> > -
> > *Udo Kohlmeyer* | *Snr Solutions Architect* |*Pivotal*
> > *Mobile:* +61 409-279-160 | ukohlme...@pivotal.io
> > 
> > www.pivotal.io
>
>


Re: [DISCUSS] CI improvements

2017-10-06 Thread Kenneth Howe
+1 - Reduce the noise level for analyzing CI results

> On Oct 6, 2017, at 9:49 AM, Udo Kohlmeyer wrote:
> 
> +1 Switch... parallel runs would be safest
> 
> On Fri, Oct 6, 2017 at 9:42 AM, Jianxia Chen  wrote:
> 
>> +1 to switch to Concourse
>> +1 As a first step we could run both CI systems side-by-side and see how
>> the Concourse approach works for our project.
>> 
>> On Fri, Oct 6, 2017 at 7:08 AM, Anthony Baker  wrote:
>> 
>>> Hi all,
>>> 
>>> I’d like to propose the following that we switch our continuous
>>> integration (CI) system from Jenkins [1] to Concourse [2].  I suggest
>>> this because we continue to experience a significant number of
>>> environmental-related test failures.
>>> 
>>> These issues include CPU interference from other Jenkins jobs on the
>>> same host, running out of disk space, port conflicts, and other
>>> gremlins.  The net effect is that we are only getting 1-2 successful
>>> builds per month.  Certainly not all test failures can be traced back
>>> to environmental issues.  However, internal testing on isolated VM’s
>>> shows a combined success rate of about 3X higher compared to ASF
>>> Jenkins for the same tests.  This is still definitely NotAwesome, but
>>> removing environmental factors will let us focus on stabilizing flaky
>>> tests.
>>> 
>>> Concourse is an Apache-licensed open source CI system based on
>>> pipelines.  The pipelines are defined in a YML file containing job
>>> definitions—inputs, outputs, resources, and tasks.  A task is simply a
>>> bash script that returns 0/1 for success/failure.  A web UI displays
>>> build status.  Importantly, each job runs inside an isolated
>>> container.  The containers are load-balanced across a pool of workers.
>>> For an example of a build pipeline, see [3] for the pipeline used to
>>> build concourse itself.
>>> 
>>> A Concourse environment is deployed and managed in cloud environments
>>> through bosh [4].  Pivotal has agreed to donate AWS and/or GCP compute
>>> and storage resources as well as manage the infrastructure.  These
>>> project resources would be available for use by all committers and
>>> community members regardless of corporate affiliations.  Note that
>>> AFAIK there is no explicit requirement to host CI on ASF
>>> infrastructure—unlike for critical project resources such as source
>>> code, mailing lists, and issue tracking.
>>> 
>>> The source for the pipeline and job scripts would reside within the
>>> geode-* repos.  Geode committers would be able to modify those, same
>>> as with our .travis.yml scripts.  All test results and build artifacts
>>> would be publicly viewable just like with our Jenkins build output
>>> today.  Requests for admin assistance would go through the dev@geode
>>> mailing list.
>>> 
>>> Thoughts?  As a first step we could run both CI systems side-by-side
>>> and see how the Concourse approach works for our project.
>>> 
>>> Thanks,
>>> Anthony
>>> 
>>> 
>>> [1] https://builds.apache.org/job/Geode-nightly/
>>> [2] https://concourse.ci
>>> [3] https://ci.concourse.ci
>>> [4] https://bosh.io
>>> 
>> 
> 
> 
> 
> -- 
> Kindest Regards
> -
> *Udo Kohlmeyer* | *Snr Solutions Architect* |*Pivotal*
> *Mobile:* +61 409-279-160 | ukohlme...@pivotal.io
> 
> www.pivotal.io



Re: [DISCUSS] CI improvements

2017-10-06 Thread Udo Kohlmeyer
+1 Switch... parallel runs would be safest

On Fri, Oct 6, 2017 at 9:42 AM, Jianxia Chen wrote:

> +1 to switch to Concourse
> +1 As a first step we could run both CI systems side-by-side and see how
> the Concourse approach works for our project.
>
> On Fri, Oct 6, 2017 at 7:08 AM, Anthony Baker  wrote:
>
> > Hi all,
> >
> > I’d like to propose the following that we switch our continuous
> > integration (CI) system from Jenkins [1] to Concourse [2].  I suggest
> > this because we continue to experience a significant number of
> > environmental-related test failures.
> >
> > These issues include CPU interference from other Jenkins jobs on the
> > same host, running out of disk space, port conflicts, and other
> > gremlins.  The net effect is that we are only getting 1-2 successful
> > builds per month.  Certainly not all test failures can be traced back
> > to environmental issues.  However, internal testing on isolated VM’s
> > shows a combined success rate of about 3X higher compared to ASF
> > Jenkins for the same tests.  This is still definitely NotAwesome, but
> > removing environmental factors will let us focus on stabilizing flaky
> > tests.
> >
> > Concourse is an Apache-licensed open source CI system based on
> > pipelines.  The pipelines are defined in a YML file containing job
> > definitions—inputs, outputs, resources, and tasks.  A task is simply a
> > bash script that returns 0/1 for success/failure.  A web UI displays
> > build status.  Importantly, each job runs inside an isolated
> > container.  The containers are load-balanced across a pool of workers.
> > For an example of a build pipeline, see [3] for the pipeline used to
> > build concourse itself.
> >
> > A Concourse environment is deployed and managed in cloud environments
> > through bosh [4].  Pivotal has agreed to donate AWS and/or GCP compute
> > and storage resources as well as manage the infrastructure.  These
> > project resources would be available for use by all committers and
> > community members regardless of corporate affiliations.  Note that
> > AFAIK there is no explicit requirement to host CI on ASF
> > infrastructure—unlike for critical project resources such as source
> > code, mailing lists, and issue tracking.
> >
> > The source for the pipeline and job scripts would reside within the
> > geode-* repos.  Geode committers would be able to modify those, same
> > as with our .travis.yml scripts.  All test results and build artifacts
> > would be publicly viewable just like with our Jenkins build output
> > today.  Requests for admin assistance would go through the dev@geode
> > mailing list.
> >
> > Thoughts?  As a first step we could run both CI systems side-by-side
> > and see how the Concourse approach works for our project.
> >
> > Thanks,
> > Anthony
> >
> >
> > [1] https://builds.apache.org/job/Geode-nightly/
> > [2] https://concourse.ci
> > [3] https://ci.concourse.ci
> > [4] https://bosh.io
> >
>



-- 
Kindest Regards
-
*Udo Kohlmeyer* | *Snr Solutions Architect* |*Pivotal*
*Mobile:* +61 409-279-160 | ukohlme...@pivotal.io

www.pivotal.io


Re: [DISCUSS] CI improvements

2017-10-06 Thread Jianxia Chen
+1 to switch to Concourse
+1 As a first step we could run both CI systems side-by-side and see how
the Concourse approach works for our project.

On Fri, Oct 6, 2017 at 7:08 AM, Anthony Baker wrote:

> Hi all,
>
> I’d like to propose the following that we switch our continuous
> integration (CI) system from Jenkins [1] to Concourse [2].  I suggest
> this because we continue to experience a significant number of
> environmental-related test failures.
>
> These issues include CPU interference from other Jenkins jobs on the
> same host, running out of disk space, port conflicts, and other
> gremlins.  The net effect is that we are only getting 1-2 successful
> builds per month.  Certainly not all test failures can be traced back
> to environmental issues.  However, internal testing on isolated VM’s
> shows a combined success rate of about 3X higher compared to ASF
> Jenkins for the same tests.  This is still definitely NotAwesome, but
> removing environmental factors will let us focus on stabilizing flaky
> tests.
>
> Concourse is an Apache-licensed open source CI system based on
> pipelines.  The pipelines are defined in a YML file containing job
> definitions—inputs, outputs, resources, and tasks.  A task is simply a
> bash script that returns 0/1 for success/failure.  A web UI displays
> build status.  Importantly, each job runs inside an isolated
> container.  The containers are load-balanced across a pool of workers.
> For an example of a build pipeline, see [3] for the pipeline used to
> build concourse itself.
>
> A Concourse environment is deployed and managed in cloud environments
> through bosh [4].  Pivotal has agreed to donate AWS and/or GCP compute
> and storage resources as well as manage the infrastructure.  These
> project resources would be available for use by all committers and
> community members regardless of corporate affiliations.  Note that
> AFAIK there is no explicit requirement to host CI on ASF
> infrastructure—unlike for critical project resources such as source
> code, mailing lists, and issue tracking.
>
> The source for the pipeline and job scripts would reside within the
> geode-* repos.  Geode committers would be able to modify those, same
> as with our .travis.yml scripts.  All test results and build artifacts
> would be publicly viewable just like with our Jenkins build output
> today.  Requests for admin assistance would go through the dev@geode
> mailing list.
>
> Thoughts?  As a first step we could run both CI systems side-by-side
> and see how the Concourse approach works for our project.
>
> Thanks,
> Anthony
>
>
> [1] https://builds.apache.org/job/Geode-nightly/
> [2] https://concourse.ci
> [3] https://ci.concourse.ci
> [4] https://bosh.io
>


Re: [DISCUSS] CI improvements

2017-10-06 Thread Jared Stewart
+1 I think this will be a huge improvement to the reliability of our test 
infrastructure.

- Jared

> On Oct 6, 2017, at 9:26 AM, Kirk Lund wrote:
> 
> +1 no thoughts other than make it so!
> 
> On Fri, Oct 6, 2017 at 7:08 AM, Anthony Baker  wrote:
> 
>> Hi all,
>> 
>> I’d like to propose the following that we switch our continuous
>> integration (CI) system from Jenkins [1] to Concourse [2].  I suggest
>> this because we continue to experience a significant number of
>> environmental-related test failures.
>> 
>> These issues include CPU interference from other Jenkins jobs on the
>> same host, running out of disk space, port conflicts, and other
>> gremlins.  The net effect is that we are only getting 1-2 successful
>> builds per month.  Certainly not all test failures can be traced back
>> to environmental issues.  However, internal testing on isolated VM’s
>> shows a combined success rate of about 3X higher compared to ASF
>> Jenkins for the same tests.  This is still definitely NotAwesome, but
>> removing environmental factors will let us focus on stabilizing flaky
>> tests.
>> 
>> Concourse is an Apache-licensed open source CI system based on
>> pipelines.  The pipelines are defined in a YML file containing job
>> definitions—inputs, outputs, resources, and tasks.  A task is simply a
>> bash script that returns 0/1 for success/failure.  A web UI displays
>> build status.  Importantly, each job runs inside an isolated
>> container.  The containers are load-balanced across a pool of workers.
>> For an example of a build pipeline, see [3] for the pipeline used to
>> build concourse itself.
>> 
>> A Concourse environment is deployed and managed in cloud environments
>> through bosh [4].  Pivotal has agreed to donate AWS and/or GCP compute
>> and storage resources as well as manage the infrastructure.  These
>> project resources would be available for use by all committers and
>> community members regardless of corporate affiliations.  Note that
>> AFAIK there is no explicit requirement to host CI on ASF
>> infrastructure—unlike for critical project resources such as source
>> code, mailing lists, and issue tracking.
>> 
>> The source for the pipeline and job scripts would reside within the
>> geode-* repos.  Geode committers would be able to modify those, same
>> as with our .travis.yml scripts.  All test results and build artifacts
>> would be publicly viewable just like with our Jenkins build output
>> today.  Requests for admin assistance would go through the dev@geode
>> mailing list.
>> 
>> Thoughts?  As a first step we could run both CI systems side-by-side
>> and see how the Concourse approach works for our project.
>> 
>> Thanks,
>> Anthony
>> 
>> 
>> [1] https://builds.apache.org/job/Geode-nightly/
>> [2] https://concourse.ci
>> [3] https://ci.concourse.ci
>> [4] https://bosh.io
>> 



Re: [DISCUSS] CI improvements

2017-10-06 Thread Kirk Lund
+1 no thoughts other than make it so!

On Fri, Oct 6, 2017 at 7:08 AM, Anthony Baker wrote:

> Hi all,
>
> I’d like to propose the following that we switch our continuous
> integration (CI) system from Jenkins [1] to Concourse [2].  I suggest
> this because we continue to experience a significant number of
> environmental-related test failures.
>
> These issues include CPU interference from other Jenkins jobs on the
> same host, running out of disk space, port conflicts, and other
> gremlins.  The net effect is that we are only getting 1-2 successful
> builds per month.  Certainly not all test failures can be traced back
> to environmental issues.  However, internal testing on isolated VM’s
> shows a combined success rate of about 3X higher compared to ASF
> Jenkins for the same tests.  This is still definitely NotAwesome, but
> removing environmental factors will let us focus on stabilizing flaky
> tests.
>
> Concourse is an Apache-licensed open source CI system based on
> pipelines.  The pipelines are defined in a YML file containing job
> definitions—inputs, outputs, resources, and tasks.  A task is simply a
> bash script that returns 0/1 for success/failure.  A web UI displays
> build status.  Importantly, each job runs inside an isolated
> container.  The containers are load-balanced across a pool of workers.
> For an example of a build pipeline, see [3] for the pipeline used to
> build concourse itself.
>
> A Concourse environment is deployed and managed in cloud environments
> through bosh [4].  Pivotal has agreed to donate AWS and/or GCP compute
> and storage resources as well as manage the infrastructure.  These
> project resources would be available for use by all committers and
> community members regardless of corporate affiliations.  Note that
> AFAIK there is no explicit requirement to host CI on ASF
> infrastructure—unlike for critical project resources such as source
> code, mailing lists, and issue tracking.
>
> The source for the pipeline and job scripts would reside within the
> geode-* repos.  Geode committers would be able to modify those, same
> as with our .travis.yml scripts.  All test results and build artifacts
> would be publicly viewable just like with our Jenkins build output
> today.  Requests for admin assistance would go through the dev@geode
> mailing list.
>
> Thoughts?  As a first step we could run both CI systems side-by-side
> and see how the Concourse approach works for our project.
>
> Thanks,
> Anthony
>
>
> [1] https://builds.apache.org/job/Geode-nightly/
> [2] https://concourse.ci
> [3] https://ci.concourse.ci
> [4] https://bosh.io
>


Re: [DISCUSS] CI improvements

2017-10-06 Thread Jinmei Liao
+1 for switching to Concourse, and for running it side by side first to
see how it works.

On Fri, Oct 6, 2017 at 7:57 AM, Gregory Chase  wrote:

> I recall hearing that Apache HAWQ already uses Concourse in a similar
> way.
>
> I also know of at least three other open source projects that are
> running Concourse:
>
> 1. Concourse.ci itself
> 2. Greenplum Database
> 3. RabbitMQ
>
> On Fri, Oct 6, 2017 at 2:08 PM, Anthony Baker  wrote:
>
> > Hi all,
> >
> > I’d like to propose that we switch our continuous
> > integration (CI) system from Jenkins [1] to Concourse [2].  I suggest
> > this because we continue to experience a significant number of
> > environment-related test failures.
> >
> > These issues include CPU interference from other Jenkins jobs on the
> > same host, running out of disk space, port conflicts, and other
> > gremlins.  The net effect is that we are only getting 1-2 successful
> > builds per month.  Certainly not all test failures can be traced back
> > to environmental issues.  However, internal testing on isolated VMs
> > shows a combined success rate about 3X higher than on ASF
> > Jenkins for the same tests.  This is still definitely NotAwesome, but
> > removing environmental factors will let us focus on stabilizing flaky
> > tests.
> >
> > Concourse is an Apache-licensed open source CI system based on
> > pipelines.  The pipelines are defined in a YML file containing job
> > definitions—inputs, outputs, resources, and tasks.  A task is simply a
> > bash script that returns 0/1 for success/failure.  A web UI displays
> > build status.  Importantly, each job runs inside an isolated
> > container.  The containers are load-balanced across a pool of workers.
> > For an example of a build pipeline, see [3], the pipeline used to
> > build Concourse itself.
> >
> > A Concourse environment is deployed and managed in cloud environments
> > through bosh [4].  Pivotal has agreed to donate AWS and/or GCP compute
> > and storage resources as well as manage the infrastructure.  These
> > project resources would be available for use by all committers and
> > community members regardless of corporate affiliations.  Note that
> > AFAIK there is no explicit requirement to host CI on ASF
> > infrastructure—unlike for critical project resources such as source
> > code, mailing lists, and issue tracking.
> >
> > The source for the pipeline and job scripts would reside within the
> > geode-* repos.  Geode committers would be able to modify those, same
> > as with our .travis.yml scripts.  All test results and build artifacts
> > would be publicly viewable just like with our Jenkins build output
> > today.  Requests for admin assistance would go through the dev@geode
> > mailing list.
> >
> > Thoughts?  As a first step we could run both CI systems side-by-side
> > and see how the Concourse approach works for our project.
> >
> > Thanks,
> > Anthony
> >
> >
> > [1] https://builds.apache.org/job/Geode-nightly/
> > [2] https://concourse.ci
> > [3] https://ci.concourse.ci
> > [4] https://bosh.io
> >
>
>
>
> --
> Greg Chase
>
> Product team, Pivotal Cloud Foundry Services
> https://pivotal.io/platform/services
>
> Pivotal Software
> http://www.pivotal.io/
>
> 650-215-0477
> @GregChase
>



-- 
Cheers

Jinmei


Re: [DISCUSS] CI improvements

2017-10-06 Thread Gregory Chase
I recall hearing that Apache HAWQ already uses Concourse in a similar
way.

I also know of at least three other open source projects that are
running Concourse:

1. Concourse.ci itself
2. Greenplum Database
3. RabbitMQ

On Fri, Oct 6, 2017 at 2:08 PM, Anthony Baker  wrote:

> Hi all,
>
> I’d like to propose that we switch our continuous
> integration (CI) system from Jenkins [1] to Concourse [2].  I suggest
> this because we continue to experience a significant number of
> environment-related test failures.
>
> These issues include CPU interference from other Jenkins jobs on the
> same host, running out of disk space, port conflicts, and other
> gremlins.  The net effect is that we are only getting 1-2 successful
> builds per month.  Certainly not all test failures can be traced back
> to environmental issues.  However, internal testing on isolated VMs
> shows a combined success rate about 3X higher than on ASF
> Jenkins for the same tests.  This is still definitely NotAwesome, but
> removing environmental factors will let us focus on stabilizing flaky
> tests.
>
> Concourse is an Apache-licensed open source CI system based on
> pipelines.  The pipelines are defined in a YML file containing job
> definitions—inputs, outputs, resources, and tasks.  A task is simply a
> bash script that returns 0/1 for success/failure.  A web UI displays
> build status.  Importantly, each job runs inside an isolated
> container.  The containers are load-balanced across a pool of workers.
> For an example of a build pipeline, see [3], the pipeline used to
> build Concourse itself.
>
> A Concourse environment is deployed and managed in cloud environments
> through bosh [4].  Pivotal has agreed to donate AWS and/or GCP compute
> and storage resources as well as manage the infrastructure.  These
> project resources would be available for use by all committers and
> community members regardless of corporate affiliations.  Note that
> AFAIK there is no explicit requirement to host CI on ASF
> infrastructure—unlike for critical project resources such as source
> code, mailing lists, and issue tracking.
>
> The source for the pipeline and job scripts would reside within the
> geode-* repos.  Geode committers would be able to modify those, same
> as with our .travis.yml scripts.  All test results and build artifacts
> would be publicly viewable just like with our Jenkins build output
> today.  Requests for admin assistance would go through the dev@geode
> mailing list.
>
> Thoughts?  As a first step we could run both CI systems side-by-side
> and see how the Concourse approach works for our project.
>
> Thanks,
> Anthony
>
>
> [1] https://builds.apache.org/job/Geode-nightly/
> [2] https://concourse.ci
> [3] https://ci.concourse.ci
> [4] https://bosh.io
>



-- 
Greg Chase

Product team, Pivotal Cloud Foundry Services
https://pivotal.io/platform/services

Pivotal Software
http://www.pivotal.io/

650-215-0477
@GregChase


[DISCUSS] CI improvements

2017-10-06 Thread Anthony Baker
Hi all,

I’d like to propose that we switch our continuous
integration (CI) system from Jenkins [1] to Concourse [2].  I suggest
this because we continue to experience a significant number of
environment-related test failures.

These issues include CPU interference from other Jenkins jobs on the
same host, running out of disk space, port conflicts, and other
gremlins.  The net effect is that we are only getting 1-2 successful
builds per month.  Certainly not all test failures can be traced back
to environmental issues.  However, internal testing on isolated VMs
shows a combined success rate about 3X higher than on ASF
Jenkins for the same tests.  This is still definitely NotAwesome, but
removing environmental factors will let us focus on stabilizing flaky
tests.

Concourse is an Apache-licensed open source CI system based on
pipelines.  The pipelines are defined in a YML file containing job
definitions—inputs, outputs, resources, and tasks.  A task is simply a
bash script that returns 0/1 for success/failure.  A web UI displays
build status.  Importantly, each job runs inside an isolated
container.  The containers are load-balanced across a pool of workers.
For an example of a build pipeline, see [3], the pipeline used to
build Concourse itself.
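
To make the description above concrete, here is a minimal sketch of what
such a YML file might look like.  It is illustrative only and is not the
pipeline proposed for Geode: the resource name "geode", the job and task
names, the "develop" branch, and the "openjdk" container image are all
assumptions chosen for the example.

```yaml
# Hypothetical single-job pipeline; all names below are illustrative.
resources:
- name: geode                # assumed resource name
  type: git
  source:
    uri: https://github.com/apache/geode.git
    branch: develop          # assumed branch

jobs:
- name: build                # assumed job name
  plan:
  # Fetch the source; start the job whenever a new commit appears.
  - get: geode
    trigger: true
  # The task runs in its own container built from the image below.
  - task: gradle-build       # assumed task name
    config:
      platform: linux
      image_resource:
        type: docker-image
        source:
          repository: openjdk   # assumed image
          tag: "8"
      inputs:
      - name: geode
      run:
        # A task is just a script; a non-zero exit code marks it failed.
        path: bash
        args:
        - -c
        - |
          cd geode
          ./gradlew build
```

The job's inputs come from the "get" step, and the task's exit code alone
decides whether the build is reported as passed or failed in the web UI.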

A Concourse environment is deployed and managed in cloud environments
through bosh [4].  Pivotal has agreed to donate AWS and/or GCP compute
and storage resources as well as manage the infrastructure.  These
project resources would be available for use by all committers and
community members regardless of corporate affiliations.  Note that
AFAIK there is no explicit requirement to host CI on ASF
infrastructure—unlike for critical project resources such as source
code, mailing lists, and issue tracking.

The source for the pipeline and job scripts would reside within the
geode-* repos.  Geode committers would be able to modify those, same
as with our .travis.yml scripts.  All test results and build artifacts
would be publicly viewable just like with our Jenkins build output
today.  Requests for admin assistance would go through the dev@geode
mailing list.

Thoughts?  As a first step we could run both CI systems side-by-side
and see how the Concourse approach works for our project.

Thanks,
Anthony


[1] https://builds.apache.org/job/Geode-nightly/
[2] https://concourse.ci
[3] https://ci.concourse.ci
[4] https://bosh.io