This is great news. Thanks, Lari, Matteo, and the Pulsar community.

On Fri, Mar 12, 2021, 2:04 AM Lari Hotari <lari.hot...@sagire.fi> wrote:

> Dear Pulsar community members,
>
> The work on "Changes to GitHub Actions based Pulsar CI" has gone forward
> based on your feedback. Here are some updates about the work.
>
> The draft PIP proposal document is here:
>
> https://docs.google.com/document/d/1FNEWD3COdnNGMiryO9qBUW_83qtzAhqjDI5wwmPD-YE/edit#heading=h.f53rkcu20sry
> There's a *detailed status update in the document about a prototype for the
> refactored Pulsar CI GitHub Actions based workflow*.
>
> Thanks for all the suggestions and feedback so far. A lot of improvements
> have been made by the Pulsar contributors to overcome the technical
> obstacles.
> Special thanks go to Matteo for reducing the sizes of the Docker images. A
> lot of small improvements have been made to the Pulsar Maven build to enable
> the new refactored GitHub Actions workflow. Thank you for all PR reviews
> and feedback.
>
> The main goal of the "Changes to GitHub Actions based Pulsar CI" work has
> been to *reduce the resource consumption of the Pulsar CI build and to
> speed up Pulsar development by improving developer productivity*, since
> less time is wasted waiting for Pulsar CI build feedback. The prototype
> demonstrates these improvements.
>
> As you can see from the Jan 28 email below, *the resource consumption
> was 19 hrs 36 minutes* for a single pull request, as observed when the
> work began.
> Now, with the prototype of the refactored Pulsar CI build, the resource
> consumption is *7 hrs 9 minutes.*
> *This is about a 60% reduction in resource consumption.* The whole pipeline
> completes in 75-100 minutes.
>
> Here's a breakdown of the duration (resource consumption) of each build job
> in the refactored workflow:
>
> Workflow   Job                                         Seconds  h:mm:ss
> Pulsar CI  Changed files check                               4  0:00:04
> Pulsar CI  Go 1.11 Functions                               155  0:02:35
> Pulsar CI  Go 1.12 Functions                               166  0:02:46
> Pulsar CI  Go 1.13 Functions                               113  0:01:53
> Pulsar CI  Go 1.14 Functions                                96  0:01:36
> Pulsar CI  Build on MacOS                                 1017  0:16:57
> Pulsar CI  Build and License check                         346  0:05:46
> Pulsar CI  Build Pulsar CPP and Python clients             683  0:11:23
> Pulsar CI  Build Pulsar java-test-image docker image       405  0:06:45
> Pulsar CI  CI - Unit - Other                              1580  0:26:20
> Pulsar CI  CI - Unit - Brokers - Broker Group 1            968  0:16:08
> Pulsar CI  CI - Unit - Brokers - Broker Group 2           2223  0:37:03
> Pulsar CI  CI - Unit - Brokers - Client Api               1652  0:27:32
> Pulsar CI  CI - Unit - Brokers - Client Impl               916  0:15:16
> Pulsar CI  CI - Unit - Brokers - Other                     522  0:08:42
> Pulsar CI  CI - Unit - Proxy                               331  0:05:31
> Pulsar CI  Build Pulsar docker image                      2343  0:39:03
> Pulsar CI  CI - Integration - Shade                        414  0:06:54
> Pulsar CI  CI - Integration - Backwards Compatibility      849  0:14:09
> Pulsar CI  CI - Integration - Cli                         1490  0:24:50
> Pulsar CI  CI - Integration - Messaging                    857  0:14:17
> Pulsar CI  CI - Integration - Schema                       468  0:07:48
> Pulsar CI  CI - Integration - Standalone                   286  0:04:46
> Pulsar CI  CI - Integration - Transaction                  362  0:06:02
> Pulsar CI  CI - System - Function State                    699  0:11:39
> Pulsar CI  CI - System - Tiered FileSystem                 779  0:12:59
> Pulsar CI  CI - System - Tiered JCloud                     529  0:08:49
> Pulsar CI  CI - System - Pulsar Connectors - Thread       1795  0:29:55
> Pulsar CI  CI - System - Pulsar Connectors - Process      2312  0:38:32
> Pulsar CI  CI - System - Sql                              1377  0:22:57
> *Total resource consumption*                                    7:08:57
>
>
> GitHub Actions doesn't support restarting a single job
> (https://github.community/t/ability-to-rerun-just-a-single-job-in-a-workflow/17234).
> However, this is not a showstopper since there are ways to address the
> issues that cause flakiness.
> There is a separate PIP for changing the way to handle flaky tests. You can
> find the link to that in the "Changes to GitHub Actions based Pulsar CI"
> document's header.
>
> *Some requests for the Pulsar community:*
>
> 1) *Please take a look at the updated PIP document*:
>
> https://docs.google.com/document/d/1FNEWD3COdnNGMiryO9qBUW_83qtzAhqjDI5wwmPD-YE/edit#heading=h.f53rkcu20sry
>
> *It also contains more details of the prototype that has been
> successfully completed.*
>
> 2) *Please share your feedback and suggest a way forward.*
>
> *Thank you for your help!*
>
> BR, Lari
>
> On Thu, Jan 28, 2021 at 7:13 PM Lari Hotari <lari.hot...@sagire.fi> wrote:
>
> > Dear Pulsar community members,
> >
> > Currently, the Pulsar GitHub Actions workflows are consuming the majority
> > of the shared pool of resources allocated for github.com/apache projects.
> > Other Apache projects have been impacted and there is a demand to improve
> > the Pulsar CI
> > <https://github.com/apache/pulsar/pull/9159#issuecomment-766915396> asap.
> >
> > In GitHub Actions Runners, the unit of resources is the time that a
> > Runner is occupied. I observed the workflow runs for handling a single
> > Pull Request (in my personal fork) and these were the running durations:
> >
> > Workflow name                              Duration
> > CI - Build - MacOS                          0:17:23
> > CI - Go Functions style check               0:02:38
> > CI - Unit - Brokers - Other                 0:15:40
> > CI - Unit - Brokers - Client Impl           0:16:28
> > CI - Misc                                   0:16:51
> > CI - Unit - Proxy                           0:14:23
> > CI - Go Functions Tests                     0:22:08
> > CI - CPP, Python Tests                      0:23:30
> > CI - Unit                                   0:42:11
> > CI - Integration - Sql                      1:00:13
> > CI - Integration - Tiered JCloud            1:00:18
> > CI - Integration - Tiered FileSystem        1:00:13
> > CI - Integration - Function State           1:00:12
> > CI - Integration - Cli                      1:10:22
> > CI - Integration - Transaction              1:16:34
> > CI - Integration - Process                  1:11:23
> > CI - Shade - Test                           1:15:45
> > CI - Unit - Brokers - Client Api            0:26:13
> > CI - Unit - Brokers - Broker Group 2        0:35:05
> > CI - Integration - Standalone               0:45:29
> > CI - Integration - Messaging                1:00:23
> > CI - Integration - Thread                   1:00:19
> > CI - Integration - Backwards Compatibility  1:00:19
> > CI - Integration - Schema                   1:00:19
> > CI - Unit - Brokers - Broker Group 1        2:02:31
> > TOTAL                                      19:36:50
> >
> > *In this case, the total resource consumption of GitHub Actions Runners
> > is 19 hours 36 minutes 50 seconds for a single pull request to
> > apache/pulsar.*
> >
> > Since GitHub Actions Runner resource pool utilization is very high, the
> > build queue grows and takes a long time to process.
> >
> > I have been looking for ways to improve the Pulsar CI for the last 3
> > months. During this period I worked on a few experiments. The learnings
> > from the past experiments are documented at a high level in the following
> > draft PIP document.
> >
> > *The draft PIP "Changes to GitHub Actions based Pulsar CI" document is a
> > Google doc:*
> >
> >
> > https://docs.google.com/document/d/1FNEWD3COdnNGMiryO9qBUW_83qtzAhqjDI5wwmPD-YE/edit?usp=sharing
> >
> > *Please participate* so that we get the plan adjusted based on the
> > feedback asap. If there's already a similar effort ongoing, I hope we can
> > join efforts.
> >
> > *Let's fix Pulsar CI!*
> >
> > BR, Lari
> >
>
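
For reference, the "about 60%" figure can be checked from the two totals quoted in this thread (19:36:50 on Jan 28 and 7:08:57 for the prototype). This is only a minimal Python sketch of that arithmetic; the helper name to_seconds is hypothetical and not part of any Pulsar tooling.

```python
def to_seconds(hms: str) -> int:
    """Convert an h:mm:ss duration string to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


# Totals as quoted in the two emails of this thread.
before = to_seconds("19:36:50")  # Jan 28 observation, single pull request
after = to_seconds("7:08:57")    # refactored workflow prototype

# Fraction of runner time saved relative to the original total.
reduction = 1 - after / before

print(f"before={before}s after={after}s reduction={reduction:.1%}")
```

With these inputs the reduction comes out a bit over 63%, consistent with the "about 60%" stated in the update.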
