Hi,

I agree.

> I'll submit two requirements though:
> - the configuration for CI builds must be kept in the Arrow repository
>    (as they are currently in .github, etc.)
> - CI builds must be runnable from PRs
>

I'll submit three more:
- The result of the build (pass / did not pass) must be shown on GitHub
PRs
- The logs must be public and "clickable" from GitHub
- We must not allow privileged arbitrary code execution from arbitrary users

I built a proof of concept with Buildkite in January for Rust builds. See ARROW-11140
<https://issues.apache.org/jira/browse/ARROW-11140?jql=project%20%3D%20ARROW%20AND%20text%20~%20%22buildkite%22%20ORDER%20BY%20priority%20DESC%2C%20updated%20DESC>
and the corresponding PR https://github.com/apache/arrow/pull/9111. It
fulfilled the above requirements for Docker runs.

The runner ran rootless Docker for all PRs and branches, and the setup
allowed people to register runners on their own repos if they wished.

Limitations:
1. no macOS or Windows support (no easy way to secure the runner against
arbitrary code execution)
2. jobs cannot use sudo or other privileged operations (we would need a
separate queue for these, or e.g. a user whitelist like Krisztián mentioned)
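
To make limitation 2 concrete, here is a minimal sketch of how jobs might be
routed between a rootless queue and a separate privileged queue guarded by a
user whitelist. It is not taken from the proof of concept: the queue names,
the `Job` type, the `select_queue` helper and the user names are all
hypothetical and only illustrate the idea.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional

# Hypothetical whitelist and queue names, for illustration only.
TRUSTED_USERS: FrozenSet[str] = frozenset({"committer-a", "committer-b"})
ROOTLESS_QUEUE = "rootless-docker"
PRIVILEGED_QUEUE = "privileged"


@dataclass(frozen=True)
class Job:
    author: str             # user who opened the PR or pushed the branch
    needs_privileges: bool  # True if the job needs sudo or similar


def select_queue(job: Job,
                 trusted_users: FrozenSet[str] = TRUSTED_USERS) -> Optional[str]:
    """Return the queue a job may run on, or None if it must be rejected."""
    if not job.needs_privileges:
        # Unprivileged jobs from anyone can run on the rootless runners.
        return ROOTLESS_QUEUE
    if job.author in trusted_users:
        # Privileged jobs run only for whitelisted users, on their own queue.
        return PRIVILEGED_QUEUE
    # Privileged job from an untrusted user: do not schedule it.
    return None
```

With this routing, a PR from an arbitrary user still gets its unprivileged
jobs built, while anything needing sudo is either sent to the separate queue
or not scheduled at all.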

Best,
Jorge


On Thu, Apr 15, 2021 at 12:28 AM Antoine Pitrou <anto...@python.org> wrote:

>
> Hi Krisztian,
>
> Thanks for bringing this up.  This is definitely becoming a
> high-priority topic for Arrow development.
>
> I don't believe there is much opportunity for reducing the number of
> builds or their runtime.  We simply have a lot of development going on,
> and the number of different CI jobs we have is simply because we need to
> support many different configurations (and past experience has shown
> that they quickly stop working if we don't monitor them on a regular
> basis).
>
> So I think the only path forward is to build up (== buy, probably) our
> own execution resources for CI.  Whether that entails using Github
> self-hosted runners, Buildkite, or yet another system, I have no idea.
>
> I'll submit two requirements though:
> - the configuration for CI builds must be kept in the Arrow repository
>    (as they are currently in .github, etc.)
> - CI builds must be runnable from PRs
>
> Regards
>
> Antoine.
>
>
> On 15/04/2021 at 00:14, Krisztián Szűcs wrote:
> > Hi,
> >
> > The Apache Github Actions agent pool seems to be oversubscribed as
> > more Apache projects migrate their CI setup to GHA. We experienced
> > pretty solid feedback times (~20-30m) when we originally moved to GHA
> > but now we are roughly 5hrs behind [1].
> >
> > Based on other projects' complaints and discussions [2][3] (I don't
> > have all the links at hand) we can't expect a short term solution from
> > infra. I think we *need* to figure out something on the project level
> > instead to maintain the overall project health and to improve the
> > development velocity.
> >
> > I don't have a concrete proposal at the moment, but we should start to
> > collect the available options. Ideas?
> >
> > Thanks, Krisztian
> >
> > [1]: https://github.com/apache/arrow/actions?query=is%3Ain_progress
> > [2]: https://github.com/apache/pulsar/issues/9154
> > [3]: https://issues.apache.org/jira/browse/SPARK-34053
> >
>
