I am not sure that merging all the workflows into one workflow is a good idea. As far as I know, GitHub Actions doesn't allow rerunning a single job in a workflow. That means that if there is any failure in the workflow, we need to rerun all of its steps/stages. In the worst case, the rerun fails in a different test, and it would take even more time to pass the CI.
---
Yong

On Fri, 29 Jan 2021 at 01:14, Lari Hotari <lari.hot...@sagire.fi> wrote:

> Dear Pulsar community members,
>
> Currently, the Pulsar GitHub Actions workflows are consuming the majority
> of the shared pool of resources allocated for github.com/apache projects.
> Other Apache projects have been impacted and there is a demand to improve
> the Pulsar CI
> <https://github.com/apache/pulsar/pull/9159#issuecomment-766915396> asap.
>
> In GitHub Actions Runners, the unit of resources is the time that a Runner
> is occupied. I observed the workflow runs for handling a single Pull
> Request (in my personal fork) and these were the running durations:
>
> Workflow name                                 Duration
> CI - Build - MacOS                            0:17:23
> CI - Go Functions style check                 0:02:38
> CI - Unit - Brokers - Other                   0:15:40
> CI - Unit - Brokers - Client Impl             0:16:28
> CI - Misc                                     0:16:51
> CI - Unit - Proxy                             0:14:23
> CI - Go Functions Tests                       0:22:08
> CI - CPP, Python Tests                        0:23:30
> CI - Unit                                     0:42:11
> CI - Integration - Sql                        1:00:13
> CI - Integration - Tiered JCloud              1:00:18
> CI - Integration - Tiered FileSystem          1:00:13
> CI - Integration - Function State             1:00:12
> CI - Integration - Cli                        1:10:22
> CI - Integration - Transaction                1:16:34
> CI - Integration - Process                    1:11:23
> CI - Shade - Test                             1:15:45
> CI - Unit - Brokers - Client Api              0:26:13
> CI - Unit - Brokers - Broker Group 2          0:35:05
> CI - Integration - Standalone                 0:45:29
> CI - Integration - Messaging                  1:00:23
> CI - Integration - Thread                     1:00:19
> CI - Integration - Backwards Compatibility    1:00:19
> CI - Integration - Schema                     1:00:19
> CI - Unit - Brokers - Broker Group 1          2:02:31
> TOTAL                                         19:36:50
>
> *In this case, the total resource consumption of GitHub Actions Runners is
> 19 hours 36 minutes 50 seconds for a single pull request to apache/pulsar.*
>
> Since GitHub Actions Runner resource pool utilization is very high, this
> leads to the build queue to grow and take a long time to process.
>
> I have been looking for ways to improve the Pulsar CI for the last 3
> months. During this period I worked on a few experiments. The learnings
> from the past experiments are documented at a high level in the following
> draft PIP document.
>
> *The draft PIP "Changes to GitHub Actions based Pulsar CI" document is a
> Google doc:*
>
> https://docs.google.com/document/d/1FNEWD3COdnNGMiryO9qBUW_83qtzAhqjDI5wwmPD-YE/edit?usp=sharing
>
> *Please participate* so that we get the plan adjusted based on the feedback
> asap. If there's already a similar effort ongoing, I hope we can join
> efforts.
>
> *Let's fix Pulsar CI!*
>
> BR, Lari
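
As a quick check of the arithmetic in the quoted table, here is a minimal Python sketch that sums the 25 listed durations. The helper names (parse_duration, format_duration) are illustrative only and are not part of any Pulsar tooling; the values are copied from the table as quoted.

```python
from datetime import timedelta

# Durations copied from the quoted table (H:MM:SS), in the order listed.
DURATIONS = [
    "0:17:23", "0:02:38", "0:15:40", "0:16:28", "0:16:51",
    "0:14:23", "0:22:08", "0:23:30", "0:42:11", "1:00:13",
    "1:00:18", "1:00:13", "1:00:12", "1:10:22", "1:16:34",
    "1:11:23", "1:15:45", "0:26:13", "0:35:05", "0:45:29",
    "1:00:23", "1:00:19", "1:00:19", "1:00:19", "2:02:31",
]


def parse_duration(text: str) -> timedelta:
    """Parse an H:MM:SS string into a timedelta."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


def format_duration(delta: timedelta) -> str:
    """Format a timedelta as H:MM:SS without rolling hours into days."""
    total_seconds = int(delta.total_seconds())
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours}:{minutes:02d}:{seconds:02d}"


# Sum all runner-occupied time for the single pull request.
total = sum((parse_duration(d) for d in DURATIONS), timedelta())
print(format_duration(total))  # prints 19:36:50
```

The sum matches the TOTAL row, i.e. roughly 19.6 runner-hours for a single pull request.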