+1

Till Rohrmann <trohrm...@apache.org> wrote on Wed, Dec 4, 2019 at 10:43 PM:

> +1 for moving to Azure pipelines as it promises better scalability and
> tooling. Looking forward to having faster builds and hence shorter feedback
> cycles :-)
>
> Cheers,
> Till
>
> On Wed, Dec 4, 2019 at 1:24 PM Chesnay Schepler <ches...@apache.org>
> wrote:
>
> > @robert Can you expand on how the Azure setup interacts with CiBot? Do
> > we have to continue mirroring builds into flink-ci? How will the cronjob
> > configuration work? We should have a general idea of how to implement
> > this before proceeding.
> > Additionally, moving /all/ jobs into flink-ci requires setting up the
> > environment variables we have; can we set these up via files or will we
> > have to give all committers permissions for flink-ci/flink?
> >
> > On 04/12/2019 12:55, Chesnay Schepler wrote:
> > > From what I've seen so far Azure will provide us a better experience,
> > > so I'd say +1 for the transition as a whole.
> > >
> > > I'd delay the merge at least until the feature branch is cut.
> > > Given the parental leave, it may even make sense to only start merging
> > > in January afterwards, to reduce the total time taken for the
> > > transition.
> > >
> > > Reviews could maybe be made earlier, but I'm wondering whether anyone
> > > would even have the time at the moment to do so.
> > >
> > > On 04/12/2019 12:35, Kurt Young wrote:
> > >> Thanks Robert for driving this. There is another big pain point with
> > >> the current Travis setup: its cache mechanism fails from time to
> > >> time. Around 50% of the build failures are caused by cache problems.
> > >> I reported this issue to Travis but have not received a response yet.
> > >> So big +1 from my side.
> > >>
> > >> Just one comment: we are close to the 1.10 feature freeze and will
> > >> spend some time making the tests stable before the release. I would
> > >> like this replacement to happen after the 1.10 release; otherwise it
> > >> will be an unstable factor during release testing.
> > >>
> > >> Best,
> > >> Kurt
> > >>
> > >>
> > >> On Wed, Dec 4, 2019 at 7:16 PM Zhu Zhu <reed...@gmail.com> wrote:
> > >>
> > >>> Thanks Robert for the updates! And thanks a lot for all the efforts
> > >>> to investigate, experiment with and tune Azure Pipelines for
> > >>> building Flink.
> > >>> Big +1 for it.
> > >>>
> > >>> It would be great if the community build capacity could be extended
> > >>> with custom machines, so that tests are not queued for long as the
> > >>> number of PRs grows daily.
> > >>>
> > >>> The increased timeout would also be very helpful.
> > >>> The 50 minute timeout for free Travis accounts is currently a pain,
> > >>> especially when we'd like to run e2e tests in our own Travis. I had
> > >>> to manually split the jobs to make them pass.
> > >>>
> > >>> Thanks,
> > >>> Zhu Zhu
> > >>>
> > >>> Robert Metzger <rmetz...@apache.org> wrote on Wed, Dec 4, 2019 at 6:36 PM:
> > >>>
> > >>>> Hi all,
> > >>>>
> > >>>> as a follow-up to our discussion on reducing the build time [1], I
> > >>>> would like to propose migrating our build infrastructure to Azure
> > >>>> Pipelines (away from Travis).
> > >>>>
> > >>>> I believe that we have reached the limits of what Travis can
> > >>>> provide the
> > >>>> Flink community, and I don't want the build system to limit or
> > >>>> influence
> > >>>> the project's growth.
> > >>>>
> > >>>> *Benefits:*
> > >>>> 1. Free Travis accounts are limited to 5 parallel builds, with a
> > >>>> timeout of 50 minutes. Azure offers *10 parallel builds with 300
> > >>>> minute timeouts* for free for open source projects.
> > >>>> 2. Azure Pipelines allows us to *add custom build machines* to the
> > >>>> pool of 10 free parallel builders.
> > >>>> This will allow the Flink community to scale the available build
> > >>>> capacity as the project grows. We are dependent on donations from
> > >>>> supporting companies, but I believe that it is easier for companies
> > >>>> to donate machines than money.
> > >>>> Alibaba is willing to provide 10 machines, with 32 cores each to the
> > >>>> Flink project for this purpose.
> > >>>> In addition, Xiyuan, who's working on adding ARM support for Flink
> > >>>> provided me with 2 ARM machines (16 cores each).
> > >>>> I want to use the custom, more efficient build machines for building
> > >>>> Flink's pull requests and master-pushes.
> > >>>> 3. *Azure Pipelines is a more feature-rich tool*, allowing us, for
> > >>>> example, to transfer intermediate build artifacts between pipeline
> > >>>> stages. This will allow us to make the build more reliable (we are
> > >>>> currently abusing the caching mechanism in Travis for this).
> > >>>> It also has some basic analytics on test results / flaky tests etc.
> > >>>>
> > >>>> *Known problems:*
> > >>>> - Initially, we might see different build instabilities than before
> > >>>> - There's a higher maintenance overhead for the custom build
> > >>>> machines (keeping them up to date etc.)
> > >>>> - We cannot use the build status integration of AZP, because it
> > >>>> requires write access to the repository's source. The foundation
> > >>>> does not allow that [2].
> > >>>> I propose to extend flinkbot / the flink-ci repository instead.
> > >>>>
> > >>>> *Current Status:*
> > >>>> - I'm able [3] to execute [4] the current custom build scripts on
> > >>>> Azure Pipelines: This means that we will have one compile stage, and
> > >>>> N testing jobs in the 2nd stage. Currently, we have N=10 testing
> > >>>> jobs.
> > >>>> The time from the start of a build until all tests have completed is
> > >>>> 1 hour 22 minutes.
> > >>>> - I'm working on getting the nightly end to end tests to run on the
> > >>>> new
> > >>>> infrastructure.
> > >>>> - I'm working on getting the build to work on our pool of custom
> > >>>> machines as well
> > >>>> - I'm working on setting up the full matrix of builds (different
> > >>>> Scala, Hadoop etc. versions) for the nightlies
> > >>>>
> > >>>> *Next Steps:*
> > >>>> - I propose to document the entire build system in the Flink Wiki
> > >>>> - Once Azure can cover the same pull request tests as Travis, I
> > >>>> would set it up to run in parallel (including Flinkbot posting links
> > >>>> to Azure). I hope that this phase lasts for 1-2 weeks only, so that
> > >>>> we do not have to maintain things concurrently. I will monitor the
> > >>>> build stability closely, but would expect some support with
> > >>>> debugging potential issues from the contributors.
> > >>>> - Once there are no problems with the new setup, we remove the
> > >>>> Travis setup.
> > >>>> - Independently, I will work on triggering builds from master /
> > >>>> release-branch pushes, as well as cron builds from the master
> > >>>> branch ... all this will be described in the Wiki.
> > >>>>
> > >>>>
> > >>>> *Timeline:*
> > >>>> - Once I have the feeling that people are supportive of the idea, I
> > >>>> will start documenting in the Wiki. The first pull requests should
> > >>>> show up after a few more days.
> > >>>> I will take one month of parental leave starting some time later in
> > >>>> December, which will probably delay things a bit. I hope to have
> > >>>> everything finished by the end of January.
> > >>>>
> > >>>> I'm happy to hear your thoughts on this work.
> > >>>> If nobody objects, I will start documenting the system and prepare
> > >>>> everything for the migration.
> > >>>>
> > >>>> Best,
> > >>>> Robert
> > >>>>
> > >>>>
> > >>>>
> > >>>> [1]
> > >>>> https://lists.apache.org/thread.html/b90aa518fcabce94f8e1de4132f46120fae613db6e95a2705f1bd1ea@%3Cdev.flink.apache.org%3E
> > >>>> [2] https://issues.apache.org/jira/browse/INFRA-17030
> > >>>> [3] https://github.com/rmetzger/flink/tree/azure_playground
> > >>>> [4]
> > >>>> https://dev.azure.com/rmetzger/Flink/_build?definitionId=4&_a=summary
> > >
> > >
> > >
> >
> >
>
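
A side note on the free-tier numbers in the proposal quoted above (Travis:
5 parallel builds, 50 minute timeout; Azure: 10 parallel builds, 300 minute
timeouts). The sketch below is only a rough illustration of what those
limits mean for a build with N=10 testing jobs. The names FreeTier,
waves_needed and fits_timeout are made up for this example, and the 60
minute job duration is a hypothetical value, not a measurement from the
thread.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class FreeTier:
    """Free-plan limits as quoted in the proposal (illustrative model only)."""
    name: str
    parallel_builds: int   # jobs that can run at the same time
    timeout_minutes: int   # maximum duration of a single job


# Figures taken from the quoted proposal.
TRAVIS = FreeTier("Travis", parallel_builds=5, timeout_minutes=50)
AZURE = FreeTier("Azure Pipelines", parallel_builds=10, timeout_minutes=300)


def waves_needed(tier: FreeTier, jobs: int) -> int:
    """Sequential rounds needed to run `jobs` jobs if one build has the
    whole pool to itself (a simplifying assumption)."""
    return math.ceil(jobs / tier.parallel_builds)


def fits_timeout(tier: FreeTier, job_minutes: int) -> bool:
    """True if a single job of `job_minutes` stays within the timeout."""
    return job_minutes <= tier.timeout_minutes


if __name__ == "__main__":
    test_jobs = 10            # N=10 testing jobs, as in the proposal
    hypothetical_job = 60     # hypothetical job duration in minutes
    for tier in (TRAVIS, AZURE):
        print(
            f"{tier.name}: {waves_needed(tier, test_jobs)} wave(s), "
            f"60 min job fits timeout: {fits_timeout(tier, hypothetical_job)}"
        )
```

Under these assumptions the 10 testing jobs need two rounds on the Travis
free tier but fit into a single round on Azure, and a 60 minute job would
only fit within the Azure timeout. Queueing between concurrent pull
requests and the custom build machines are deliberately left out.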


-- 
Best Regards

Jeff Zhang
