Hi Nico,

I am hopeful this will improve the developer experience quite a bit, in
particular for first-time contributors. +1

Cheers,

Konstantin

On Thu, Dec 16, 2021 at 5:04 PM Till Rohrmann <trohrm...@apache.org> wrote:

> Thanks for drafting this proposal Nico.
>
> I hope that we can improve our development processes and build system
> stability in the long run with the move to GHA. Hence +1 for this proposal
> and the timeline. The plan looks well thought out.
>
> Cheers,
> Till
>
> On Thu, Dec 16, 2021 at 4:29 PM Chesnay Schepler <ches...@apache.org>
> wrote:
>
> > We will not use Apache resources, but install self-hosted runners on our
> > current CI machines, similar to what we have done with Azure.
> >
> > On 16/12/2021 16:07, Fabian Paul wrote:
> > > Hi Nico,
> > >
> > > Thanks a lot for drafting the proposal. I really like the
> > > fully-fledged phasing model. All in all, I am +1 to move away from
> > > Azure and can only second all the points you have mentioned.
> > >
> > > I only want to clarify one point. So far my understanding was that the
> > > GHA resources are managed on a GitHub organizational level in contrast
> > > to Azure pipelines where projects have certain resources. What happens
> > > if more and more projects inside the Apache GitHub organization
> > > migrate to GHA? Will this affect the build queue time?
> > >
> > > Best,
> > > Fabian
> > >
> > > On Thu, Dec 16, 2021 at 3:59 PM Nicolaus Weidner
> > > <nicolaus.weid...@ververica.com> wrote:
> > >> Hi all,
> > >>
> > >> as several people know by now, we are planning to move from Azure CI to
> > >> GitHub Actions.
> > >> This is motivated by (not an exhaustive list):
> > >> - Not needing to mirror the repo anymore for CI
> > >> - Improving the contributor experience, especially for new contributors
> > >> - GHA development being more active than Azure CI development
> > >>
> > >> In case someone wants to check out the current version of the planned GHA
> > >> workflow, you can find it here:
> > >> https://github.com/ververica/flink/blob/master/.github/workflows/hadoop-2.8.3-scala-2.12-workflow.yml
> > >> Past runs can be seen here: https://github.com/ververica/flink/actions
> > >> (lots of red, but this is almost always not due to the workflow)
> > >>
> > >> I want to put a draft for the migration roadmap up for discussion. It's
> > >> divided into several phases:
> > >>
> > >> *Phase 1: *GHA activated on master (but not required)
> > >> - A single CI machine is converted to run GHA runners (instead of Azure
> > >>   runners) and runs the workflow on pushes to master
> > >> - Azure CI remains unchanged and is still the source of truth
> > >> - We can compare runtimes and behavior/failures
> > >> - Timeframe: 2 weeks
> > >>
> > >> *Phase 2: *Additional features
> > >> - Any additional functionality that we want to add to GHA is added (e.g.
> > >>   not running the workflow if workflow files were modified)
> > >> - Functionality from FlinkCIBot that we want to keep is ported over
> > >>   (syncing with the mirror repo can be dropped, but there are some
> > >>   automated checks that we want to keep)
> > >> - We can monitor whether performance is impacted by any change
> > >> - Timeframe: 2 weeks
> > >>
> > >> *Phase 3: *Cron jobs and (some) PR triggers run on GHA
> > >> - GHA cron builds activated (for master and release branches)
> > >>    - Note: Includes some backports to all affected branches, else the
> > >>      workflows won’t run:
> > >>      https://stackoverflow.com/questions/61989951/github-action-workflow-not-running/61992817#61992817
> > >> - GHA builds run for PRs of select committers (the idea is to try out
> > >>   builds for all the intended trigger conditions)
> > >> - Timeframe: 1 week
> > >>
> > >> *Up to this point, the existing CI pipeline is mostly unaffected - we only
> > >> took away one CI machine.*
> > >>
> > >> *Phase 4: *Full switch to GHA
> > >> - Set up GHA runners on all machines
> > >> - GHA builds are activated for all PRs
> > >> - Either Azure or GHA build is required
> > >> - GHA runners are activated, Azure runners are deactivated (but not yet
> > >>   removed) apart from 1 machine (for stragglers)
> > >> - Azure cron jobs are disabled, but kept around in case we need to revert
> > >> - Timeframe: 1-2 weeks
> > >>
> > >> *Phase 5: *Removal of Azure CI leftovers
> > >> - Only after we are satisfied that GHA is stable (at least 1 month after
> > >>   the switch, can be longer)
> > >> - Green GHA build is required from now on
> > >> - Stale PRs that don't have a GHA run will have to trigger a new one (but
> > >>   they would most likely have to rebase anyway...)
> > >> - (old) FlinkCIBot is disabled
> > >> - Azure yamls are deleted
> > >> - Azure runners are removed from machines
> > >>
> > >>
> > >> Timing-wise, the full switch to GHA should happen during a quiet time, far
> > >> away from a release. The remaining phases shouldn't have much impact, but
> > >> right before a release is not a good moment, of course.
> > >> Please give us your thoughts and point out anything we missed or that
> > >> doesn't seem to make sense!
> > >>
> > >> Best,
> > >> Nico
> > >
> >
>

--
Konstantin Knauf
https://twitter.com/snntrable
https://github.com/knaufk