I agree with the thrust of these comments. Perhaps rather than adding metadata, have a utility that each CI job runs to update a database at setup and/or exit.

This would also automate updating descriptions and the like, as well as registering new jobs, without a separate process.

It could give you actual runtimes, failure counts, etc. The mechanics might be a bit involved, but not difficult - the utility would probably have to send its updates to a server, since most CI environments don't provide a persistent store - and you'd want data from all environments in one place anyhow.
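As a rough illustration only, a minimal sketch of such a reporting utility could look like the following. The server URL and the CI_SYSTEM/CI_JOB_NAME environment variable names are placeholders I've made up, not anything curl's CI currently defines:

#!/usr/bin/env python3
# Hypothetical sketch: report CI job metadata and runtime to a central server.
# The endpoint and environment variable names below are assumptions for illustration.
import json
import os
import sys
import time
import urllib.request

REPORT_URL = "https://ci-stats.example/report"  # hypothetical collection endpoint

def report(phase, start=None, status=None):
    payload = {
        "ci": os.environ.get("CI_SYSTEM", "unknown"),     # e.g. "github-actions", "azure"
        "job": os.environ.get("CI_JOB_NAME", "unknown"),  # job identifier, if the CI exposes one
        "phase": phase,                                   # "setup" or "exit"
        "timestamp": time.time(),
    }
    if start is not None:
        payload["runtime_seconds"] = time.time() - start
    if status is not None:
        payload["exit_status"] = status
    req = urllib.request.Request(
        REPORT_URL,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    try:
        urllib.request.urlopen(req, timeout=10)
    except OSError:
        pass  # never let reporting failures break the build

if __name__ == "__main__":
    # Usage: ci-report setup | ci-report exit <start-epoch> <exit-status>
    if sys.argv[1] == "setup":
        report("setup")
        print(int(time.time()))  # caller saves this for the exit call
    else:
        report("exit", start=float(sys.argv[2]), status=int(sys.argv[3]))

Each job would invoke it once at the start and once from an always-run cleanup step, so runtimes and failure counts from all the CI environments end up in the same place.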
Timothe Litt
ACM Distinguished Engineer
--------------------------
This communication may not represent the ACM or my employer's views, if any, on the matters discussed.

On 07-Feb-22 18:07, Dan Fandrich via curl-library wrote:
On Mon, Feb 07, 2022 at 11:10:39PM +0100, Daniel Stenberg via curl-library wrote:
> In order to get better overview and control of the jobs we run, I'm proposing that we create and maintain a single file that lists all the jobs we run. This "database" of jobs could then be used to run checks against, and maybe to generate some tables or charts and whatnot, to help us make sure our CI jobs really cover as many build combinations as possible. Perhaps it can also help us reduce duplicate or too-similar builds.

I suspect we will be able to count the time in hours before such a list diverges from the actual CI jobs being run, because somebody forgot to update the master list properly. Such a list would also be pure duplication of information already found in the CI configuration files. I would rather treat the CI files as the sources of truth and derive a dashboard by parsing those instead, to show the jobs that are *actually* being run. The downside, of course, is that you'd need to write code to parse 6 different CI configuration file formats, but the significant benefit is that you could always trust the dashboard.

Another approach would be to add metadata to the different CI configuration files that the dashboard could read from each file in a consistent format, such as a specially-formatted comment, a structured job title, and/or a special environment variable definition. That makes parsing easier, but it means that people would need to remember to update the metadata when they update or add a job. The metadata could still fall out of date for that reason, but that's less likely to happen than with a separate, central job registry, because the metadata will always be found alongside the job configuration. It should also be relatively easy to at least count the number of jobs defined in each CI configuration file and flag those without a special metadata line (catching new, uncategorised jobs).

Maybe a hybrid approach is best: read and parse as much job data as practical from the job name and "env" section of each CI configuration file (which should be pretty simple and stable to retrieve), and supplement that with additional data from a structured comment (or a magic "env" variable) where necessary.

The only way I'd advocate for a new central job description file is if it could be used to mechanically generate the CI job files. That would mean there would be only one source of truth, but this approach would also be pretty impractical, given the complexity of many job configurations and the need to write 6 different configuration file formats.

Dan
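For what it's worth, a sketch of that hybrid idea for one of the formats might look like the following: it pulls job names and "env" sections out of GitHub Actions workflows (requires PyYAML) and checks for a hypothetical CURL_CI_TAGS environment variable as the extra metadata. The variable name, tag convention, and path glob are assumptions for illustration, not anything curl uses today, and the other five CI formats would each need a similar small parser:

#!/usr/bin/env python3
# Sketch of the "hybrid" inventory: derive job data from the CI files themselves,
# plus an optional structured metadata env variable. GitHub Actions only here.
# CURL_CI_TAGS is a hypothetical metadata variable, not something curl defines today.
import glob
import yaml  # PyYAML

def inventory(path):
    with open(path) as f:
        workflow = yaml.safe_load(f) or {}
    jobs = []
    for job_id, job in (workflow.get("jobs") or {}).items():
        env = job.get("env") or {}
        # Matrix jobs expand into several builds; record the raw matrix too.
        matrix = (job.get("strategy") or {}).get("matrix")
        jobs.append({
            "file": path,
            "job": job.get("name", job_id),
            "env": env,
            "matrix": matrix,
            "tags": env.get("CURL_CI_TAGS"),  # hypothetical structured metadata
        })
    return jobs

if __name__ == "__main__":
    all_jobs = []
    for path in glob.glob(".github/workflows/*.yml"):
        all_jobs.extend(inventory(path))
    untagged = [j for j in all_jobs if not j["tags"]]
    print(f"{len(all_jobs)} jobs found, {len(untagged)} without CURL_CI_TAGS")
    for j in untagged:
        print(f"  {j['file']}: {j['job']}")

The untagged list is what would flag new, uncategorised jobs as they are added.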