Aaron Fabbri created HADOOP-19896:
-------------------------------------
Summary: ci: improve performance of toolchain image building
Key: HADOOP-19896
URL: https://issues.apache.org/jira/browse/HADOOP-19896
Project: Hadoop Common
Issue Type: Sub-task
Reporter: Aaron Fabbri
Our Hadoop CI first builds a "toolchain" container image, based on the
`dev_support/docker` files. It then uses this container to build our code, and
also to run some tests.
Our initial github workflows (inspired by Apache Spark) always build a
container image on each workflow trigger, relying on container build caching to
reduce runtime.
I propose we try a more efficient approach, and see whether or not it is an
improvement:
1. Build and publish toolchain images whenever we push to trunk and the pushed
changes affect any definitions that influence the toolchain image (i.e. the
image needs a refresh).
2. (extra credit) Also trigger this toolchain build workflow on a schedule, just
to ensure that we are still updating the toolchain images if for some reason no
changes are made to trunk for some "max age" time (e.g. weekly).
3. PRs that don't make toolchain changes just download the latest trunk
toolchain build. PRs that do make toolchain changes can A. fall back to current
behavior automatically, or B. change a workflow variable to force this
behavior. (A seems preferable).
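
To make the three rules above concrete, here is a minimal sketch of the decision they describe, written in Python. It is only an illustration, not a proposed implementation: the function names, the action labels, and the event names are hypothetical, the path list contains only the one directory named in this issue, and the weekly limit is the example value from item 2. It assumes option A for PRs.

```python
from datetime import datetime, timedelta, timezone

# Assumption: paths whose changes require a toolchain refresh. Only
# dev_support/docker is named in this issue; a real list may be longer.
TOOLCHAIN_PATHS = ("dev_support/docker/",)

# Assumption: weekly "max age", taken from the example in item 2.
MAX_IMAGE_AGE = timedelta(days=7)


def touches_toolchain(changed_files):
    """Return True if any changed file lives under a toolchain path."""
    return any(f.startswith(TOOLCHAIN_PATHS) for f in changed_files)


def choose_action(event, changed_files, image_built_at, now):
    """Pick what a workflow run should do about the toolchain image.

    event is one of the hypothetical labels "push" (to trunk),
    "schedule", or "pull_request".
    """
    changed = touches_toolchain(changed_files)
    if event == "push":
        # Item 1: refresh only when the toolchain definitions changed.
        return "build-and-publish" if changed else "reuse-published"
    if event == "schedule":
        # Item 2: refresh when the published image exceeds the max age.
        stale = now - image_built_at >= MAX_IMAGE_AGE
        return "build-and-publish" if stale else "reuse-published"
    # Item 3, option A: PRs that change the toolchain fall back to
    # building the image themselves; all others download the latest one.
    return "build-locally" if changed else "reuse-published"


if __name__ == "__main__":
    now = datetime(2025, 1, 8, tzinfo=timezone.utc)
    built = now - timedelta(days=1)
    print(choose_action("pull_request", ["dev_support/docker/Dockerfile"], built, now))
```

In an actual workflow the changed-file list and the image timestamp would have to come from the CI system and the image registry; how to obtain them is outside the scope of this sketch.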