In case you would like to use the new "workflow_run" feature and
implement duplicate-run cancelling for builds from forks - our changes in
Airflow have been merged. I also released a fully-featured "Cancel Workflow
Runs" action that uses the "workflow_run" feature: v2 is available in
the marketplace
<https://github.com/marketplace/actions/cancel-workflow-runs> with full
support for "workflow_run" events. The documentation is now merged and
available at https://github.com/apache/airflow/blob/master/CI.rst
(including sequence diagrams and an explanation of the motivation and approach).
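
For reference, a cancelling workflow of this kind can be sketched roughly
like this. The "workflow_run" trigger syntax is real; the watched workflow
name and the action's input names are illustrative - check the action's
README for the exact inputs:

```yaml
# Sketch: a separate workflow that watches the main CI workflow and
# cancels duplicate runs, including runs triggered from forks.
name: Cancel duplicate workflow runs
on:
  workflow_run:
    # Name of the workflow to watch - replace with your own (assumption)
    workflows: ["CI Build"]
    types: ['requested']
jobs:
  cancel-duplicates:
    runs-on: ubuntu-latest
    steps:
      # Action inputs are illustrative - see the marketplace page above
      - uses: potiuk/cancel-workflow-runs@v2
        with:
          token: ${{ secrets.GITHUB_TOKEN }}
          cancelMode: duplicates
```

The key design point: a "workflow_run"-triggered workflow executes in the
context of the target repository, so it gets a write-scoped token even for
fork PRs - that is what makes cancelling fork builds possible at all.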

I hope it will be useful and that you can draw on our experience in
Apache Airflow. Happy to help in case you have any questions or problems.

BTW. This new workflow allowed us to make some significant optimisations:

Some stats for average runs (we see far bigger gains in situations where
Python releases a new patch-level version):

* Prepare image job: 5 minutes 30 seconds -> 1 minute 7 seconds (~80%
improvement)
* Longest job time: 34 minutes -> 29 minutes 30 seconds (~15% improvement
in the longest job)
* On average our builds are ~30% faster (a rough approximation,
depending on a number of circumstances).
* Machine build time saved per build: 27 jobs * 4.5 minutes ~ 2h of
machine time saved for each build (!)
* In the longest build scenarios we save > 40% of both build time and
elapsed time.
* 10x less chance of a build breaking due to external dependency problems.

Thanks to this optimization, we plan to decrease the average
contributor's wait for CI feedback from ~40 minutes to less than 10
minutes (and I think this is achievable).

J.

On Tue, Aug 18, 2020 at 2:54 PM Jarek Potiuk <[email protected]> wrote:

> Hey Chris,
>
> As mentioned before - I'd love to help bring my Apache Airflow
> experience with GitHub. As you commented - I would also love to see a PR
> with the code (even a draft). I have some comments that might be best when
> they refer to the actual code, and I'm happy to add more :).
>
> Thanks for the notes Chandan, it makes it really easy to follow. There are
>> 3 things that I can have a prod at:
>>
>> - "GitHub Actions do not support YAML variables and/or YAML anchors, which
>>    leads to some repetition." Yeh I think that this is a problem. The
>> way buildbarn approaches this is to use jsonnet to generate the GitHub
>> Actions workflows, which I think is an effective route:
>> https://github.com/buildbarn/bb-storage/commit/41a7e29dfebb625729a0c4f0e8285a254781e698#diff-02d9c370a663741451423342d5869b21
>> I'm happy to experiment with such an approach and report back.
>>
>
> As of last week, GitHub Actions supports composite run steps:
> https://github.blog/changelog/2020-08-07-github-actions-composite-run-steps/
> - that might do what you want regarding code reuse.
>
> Comment: I also missed anchors initially, but over time I have come to
> think composite run steps are a much better solution if you have quite a
> number of jobs to write. The problem with anchors and YAML variables is
> that they are designed to make the YAML more DRY. But I think it is much
> more important that the CI code is easy to read and that you know what's
> going on when you look at a step in isolation, rather than having to refer
> to something defined in another part of the YAML (readability trumps
> orthogonality). The way anchors are merged and their syntax is rather
> weird IMHO, and I think CI YAML files should be (similarly to tests) much
> more DAMP than DRY (
> https://stackoverflow.com/questions/6453235/what-does-damp-not-dry-mean-when-talking-about-unit-tests
> )
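>
> For anyone unfamiliar with the feature, a composite run steps action is
> just a small `action.yml` checked into the repository. The path and
> commands below are made up for illustration:
>
> ```yaml
> # .github/actions/shared-setup/action.yml (hypothetical path)
> name: 'Shared setup'
> description: 'Common setup steps reused across jobs'
> runs:
>   using: "composite"
>   steps:
>     # As of the August 2020 release only "run" steps are supported,
>     # and each one must declare its shell explicitly.
>     - run: echo "Preparing environment"
>       shell: bash
>     - run: pip install -r requirements.txt
>       shell: bash
> ```
>
> Each job then reuses it with a single step:
> `- uses: ./.github/actions/shared-setup`.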
>
>
>> - "persistent artifact cache" So I know that there have been efforts in
>> this area via
>> https://gitlab.com/BuildStream/buildstream/-/merge_requests/1997 (now
>> merged) and more recently with the asset cache at
>> https://gitlab.com/BuildStream/infrastructure/infrastructure/-/merge_requests/2.
>> It's probably worth coordinating with the authors of both MRs to smooth the
>> transition.
>
>
> Not sure what the "cache" refers to here. But GA has excellent support for
> storing artifacts, including an API to query and retrieve them. We use it
> extensively in Airflow - we publish generated packages, logs,
> documentation, and much more this way:
> https://github.com/apache/airflow/actions/runs/213556233
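>
> A minimal example of publishing such an artifact from a job (the artifact
> name and path are illustrative):
>
> ```yaml
>     steps:
>       # Upload built documentation so it can be downloaded from the run
>       # page or retrieved later through the artifacts API.
>       - name: Upload documentation
>         uses: actions/upload-artifact@v2
>         with:
>           name: airflow-documentation
>           path: docs/_build/html/
> ```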
>
>
>> On a broader point, is there anywhere we can track and review each
>> other's work? I'm happy to work off
>> https://github.com/cs-shadow/buildstream but other locations work for me.
>>
>>
>
> I also looked up some earlier discussions in the thread and here are my
> comments:
>
> Since we will have a public repository, we qualify for "free" and
>> "unlimited" actions. I am sure both those terms come with associated
>> fine print. So far, it seems like Actions can cope with ~4-5
>> pipelines in parallel (each with 7-8 jobs). I am not sure whether the
>> free tier will be able to keep up with our load once we start getting
>> real patches on GitHub. If not, we may need to add some custom runners
>> (similar to what we have on GitLab).
>
>
> Apache has an Enterprise level of support. I once saw a message about it,
> but I cannot find it any more - however, when you raise a ticket and choose
> "Apache" as the organisation, you can see it is "Enterprise level". BTW,
> they are usually very helpful and respond in under 2 hours.
>
> Out of the trenches:
>
> We run a lot of huge builds in Apache Airflow. I think when we ran on
> Travis CI we were the 3rd or 4th highest-use Apache project - right after
> Apache Kafka, I believe - and our usage has only grown 2-3 times since
> then. We have had ~10,000 builds over the last ~4 months and counting, and
> at least 1/3 of those builds are of this kind (
> https://github.com/apache/airflow/actions/runs/213556233) - 30 jobs, each
> using 10-30 minutes of machine time. At the very least that is
> 10x30x10000/3/4/60 ~ 4,200 hours of build time per month. That's a lot. We
> are still optimising it because we want to be good citizens, and I am
> finishing a change to decrease it by 20-30%.
>
> And indeed - you can set up private runners on Azure/Amazon/GCP if needed.
> It's super easy, and you can even set up auto-scaling self-hosted runners
> if you need them. An AWS example is here:
> https://040code.github.io/2020/05/25/scaling-selfhosted-action-runners
> and there are different options to deploy on GCP:
> https://github.blog/2020-08-04-github-actions-self-hosted-runners-on-google-cloud/
> (auto-scaling with App Engine, Kubernetes, persistent runners, even
> multi-cloud with Anthos if you want).
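>
> Once such runners are registered, routing jobs to them is just a matter
> of labels. The labels below are the standard defaults; the job name and
> command are illustrative:
>
> ```yaml
> jobs:
>   heavy-build:
>     # Runs on any registered self-hosted runner matching all the labels
>     runs-on: [self-hosted, linux, x64]
>     steps:
>       - uses: actions/checkout@v2
>       - run: ./scripts/run_heavy_tests.sh   # illustrative command
> ```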
>
> Many thanks,
>>
>> Chris
>>
>
>
> --
> +48 660 796 129
>


-- 
+48 660 796 129
