I agree with the thrust of these comments. Perhaps rather than adding metadata, have a utility that each CI job runs to update a database at setup and/or exit.

This would also automate updating descriptions and the like, as well as registering new jobs, without a separate process.

It could give you actual runtimes, failure counts, etc. The mechanics might be a bit involved, but not difficult - the utility would probably have to send its updates to a server, since most CI environments don't provide a persistent store - and you'd want data from all environments in one place anyhow.
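As a rough illustration only, a minimal sketch of such a reporting utility could look like the following. The server URL and the CI_SYSTEM/CI_JOB_NAME environment variable names are placeholders I've made up, not anything curl's CI currently defines:

#!/usr/bin/env python3
# Hypothetical sketch: report CI job metadata and runtime to a central server.
# The endpoint and environment variable names below are assumptions for illustration.
import json
import os
import sys
import time
import urllib.request

REPORT_URL = "https://ci-stats.example/report"  # hypothetical collection endpoint

def report(phase, start=None, status=None):
    payload = {
        "ci": os.environ.get("CI_SYSTEM", "unknown"),     # e.g. "github-actions", "azure"
        "job": os.environ.get("CI_JOB_NAME", "unknown"),  # job identifier, if the CI exposes one
        "phase": phase,                                   # "setup" or "exit"
        "timestamp": time.time(),
    }
    if start is not None:
        payload["runtime_seconds"] = time.time() - start
    if status is not None:
        payload["exit_status"] = status
    req = urllib.request.Request(
        REPORT_URL,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    try:
        urllib.request.urlopen(req, timeout=10)
    except OSError:
        pass  # never let reporting failures break the build

if __name__ == "__main__":
    # Usage: ci-report setup | ci-report exit <start-epoch> <exit-status>
    if sys.argv[1] == "setup":
        report("setup")
        print(int(time.time()))  # caller saves this for the exit call
    else:
        report("exit", start=float(sys.argv[2]), status=int(sys.argv[3]))

Each job would invoke it once at the start and once from an always-run cleanup step, so runtimes and failure counts from all the CI environments end up in the same place.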
Timothe Litt
ACM Distinguished Engineer
--------------------------
This communication may not represent the ACM or my employer's views, if any, on the matters discussed.

On 07-Feb-22 18:07, Dan Fandrich via curl-library wrote:
On Mon, Feb 07, 2022 at 11:10:39PM +0100, Daniel Stenberg via curl-library wrote:
> In order to get better overview and control of the jobs we run, I'm proposing that we create and maintain a single file that lists all the jobs we run. This "database" of jobs could then be used to run checks against, and maybe to generate some tables or charts and whatnot, to help us make sure our CI jobs really cover as many build combinations as possible. Perhaps it can also help us reduce duplicate or too-similar builds.

I suspect we will be able to count the time in hours before such a list diverges from the actual CI jobs being run, because somebody forgot to update the master list properly. Such a list would also be pure duplication of information already found in the CI configuration files. I would rather treat the CI files as the sources of truth and derive a dashboard by parsing those instead, to show the jobs that are *actually* being run. The downside, of course, is that you'd need to write code to parse 6 different CI configuration file formats, but the significant benefit is that you could always trust the dashboard.

Another approach would be to add metadata to the different CI configuration files that the dashboard could read from each file in a consistent format, such as a specially-formatted comment, a structured job title, and/or a special environment variable definition. That makes parsing easier, but it means that people would need to remember to update the metadata when they update or add a job. The metadata could still fall out of date for that reason, but that's less likely to happen than with a separate, central job registry, because the metadata will always be found alongside the job configuration. It should also be relatively easy to at least count the number of jobs defined in each CI configuration file and flag those without a special metadata line (catching new, uncategorised jobs).

Maybe a hybrid approach is best: read and parse as much job data as practical from the job name and "env" section of each CI configuration file (which should be pretty simple and stable to retrieve), and supplement that with additional data from a structured comment (or a magic "env" variable) where necessary.

The only way I'd advocate for a new central job description file is if it could be used to mechanically generate the CI job files. That would mean there would be only one source of truth, but this approach would also be pretty impractical, given the complexity of many job configurations and the need to write 6 different configuration file formats.

Dan
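For what it's worth, a sketch of that hybrid idea for one of the formats might look like the following: it pulls job names and "env" sections out of GitHub Actions workflows (requires PyYAML) and checks for a hypothetical CURL_CI_TAGS environment variable as the extra metadata. The variable name, tag convention, and path glob are assumptions for illustration, not anything curl uses today, and the other five CI formats would each need a similar small parser:

#!/usr/bin/env python3
# Sketch of the "hybrid" inventory: derive job data from the CI files themselves,
# plus an optional structured metadata env variable. GitHub Actions only here.
# CURL_CI_TAGS is a hypothetical metadata variable, not something curl defines today.
import glob
import yaml  # PyYAML

def inventory(path):
    with open(path) as f:
        workflow = yaml.safe_load(f) or {}
    jobs = []
    for job_id, job in (workflow.get("jobs") or {}).items():
        env = job.get("env") or {}
        # Matrix jobs expand into several builds; record the raw matrix too.
        matrix = (job.get("strategy") or {}).get("matrix")
        jobs.append({
            "file": path,
            "job": job.get("name", job_id),
            "env": env,
            "matrix": matrix,
            "tags": env.get("CURL_CI_TAGS"),  # hypothetical structured metadata
        })
    return jobs

if __name__ == "__main__":
    all_jobs = []
    for path in glob.glob(".github/workflows/*.yml"):
        all_jobs.extend(inventory(path))
    untagged = [j for j in all_jobs if not j["tags"]]
    print(f"{len(all_jobs)} jobs found, {len(untagged)} without CURL_CI_TAGS")
    for j in untagged:
        print(f"  {j['file']}: {j['job']}")

The untagged list is what would flag new, uncategorised jobs as they are added.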