Re: [DISCUSS] solve unstable build capacity problem on TravisCI

Chesnay Schepler Tue, 25 Jun 2019 00:50:57 -0700


On 24/06/2019 23:48, Bowen Li wrote:

- Has anyone else experienced the same problem or have similar observation
on TravisCI? (I suspect it has things to do with time zone)

In Europe we have the same problem.


- What pricing plan of TravisCI is Flink currently using? Is it the free
plan for open source projects? What are the guaranteed build capacity of
the current plan?

Flink is using the Travis resources provided by the ASF, which afaik the ASF is paying for.

Note that in the past the Flink project was already approached by INFRA since we were using too many Travis resources, so this is _not_ as simple as asking for more.


- If the current pricing plan (either free or paid) can't provide stable
build capacity, can we upgrade to a higher priced plan with larger and more
stable build capacity?

We are currently investigating whether companies could donate/sponsor Travis CI resources to the ASF for increasing the build capacity; currently waiting for an answer from INFRA.


BTW, another factor that contribute to the productivity problem is that our
build is slow - we run full build for every PR and a successful full build
takes ~5h. We definitely have more options to solve it, for instance,
modularize the build graphs and reuse artifacts from the previous build.
But I think that can be a big effort which is much harder to accomplish in
a short period of time and may deserve its own separate discussion.

We already are doing that. The vast majority of the build time is simply due to tests taking way too long, not compilation. The tests for the Kafka connector alone exceed a single profile, as do those for the Table API. Unless people start caring about test times before adding them, this issue cannot be solved.

Of course, this discussion isn't new; I already raised it the last 2 times we approached the Travis limits, with little to no effect to be seen.

At this point I'm sure someone out there is thinking "well, just don't run kafka tests for every PR. Like, check the diff or something", and yes, sure, that _might_ work. But to this day, despite numerous people suggesting it, I still haven't seen a single person actually try implementing it.

The problem with these kinds of approaches is that they tend to be brittle as hell, result in subtle behaviors if they have bugs, and overall make the CI significantly more complicated by introducing various edge cases.
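
To make the "check the diff" idea concrete, here is a minimal sketch of what such a selection step might look like. Everything in it is an assumption for illustration only: the path prefixes, the profile names and the `select_profiles` helper are hypothetical and do not reflect the actual Flink build setup.

```python
from typing import Iterable, Optional, Set

# Hypothetical mapping from path prefixes to test profiles.
# These prefixes and profile names are illustrative assumptions,
# not the real module layout or Travis profiles of any project.
PROFILE_BY_PREFIX: dict[str, Optional[str]] = {
    "connectors/kafka/": "kafka",
    "table/": "table",
    "docs/": None,  # documentation-only changes need no test profile
}

# Hypothetical full set of profiles run by a complete build.
ALL_PROFILES: Set[str] = {"core", "kafka", "table"}


def select_profiles(changed_files: Iterable[str]) -> Set[str]:
    """Return the test profiles to run for a set of changed file paths."""
    selected: Set[str] = set()
    for path in changed_files:
        for prefix, profile in PROFILE_BY_PREFIX.items():
            if path.startswith(prefix):
                if profile is not None:
                    selected.add(profile)
                break
        else:
            # Unknown path: we cannot tell what it affects,
            # so fall back to running every profile.
            return set(ALL_PROFILES)
    return selected


if __name__ == "__main__":
    print(select_profiles(["connectors/kafka/src/Foo.java"]))
    print(select_profiles(["pom.xml"]))
```

Even in this toy form the edge cases show up quickly: any path that is not listed forces the full build, and a wrong or outdated mapping entry silently skips tests that should have run.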

Our current CI is, relatively speaking, straightforward and consistent. And as it stands we can't afford elaborate schemes because I just don't have the time capacity for maintaining that.
