On 18.11.2020 at 21:25, Jonas Hahnfeld wrote:
> Hi all,
Hi Jonas,
> I'd like to present a first workable version of 'make check' for use in our CI pipelines. I've pushed the necessary commits to my personal fork and created two merge requests to demonstrate the results:
Great!
> 1. Run 'make check' for merge requests (no difference)
>    URL: https://gitlab.com/hahnjo/lilypond/-/merge_requests/5
>    Job: https://gitlab.com/hahnjo/lilypond/-/jobs/858498690
>    test-results: https://hahnjo.gitlab.io/-/lilypond/-/jobs/858498690/artifacts/test-results/index.html
I stumbled across the git error in https://gitlab.com/hahnjo/lilypond/-/jobs/858498690#L85, but that comes from print-gittxt.sh, right?
> 2. Introduce difference in bar-lines.ly
>    URL: https://gitlab.com/hahnjo/lilypond/-/merge_requests/6
>    Job: https://gitlab.com/hahnjo/lilypond/-/jobs/852618720
>    test-results: https://hahnjo.gitlab.io/-/lilypond/-/jobs/852618720/artifacts/test-results/index.html
>
> This first workable implementation contains the minimum functionality: It runs 'make test-baseline' for every push to the master branch and replaces 'make test' in the pipelines for MRs with 'make check'. The test-results are uploaded as artifacts and can either be downloaded as a zip archive or viewed directly (see above).
>
> There are a few known issues that I'm aware of:
> - I needed to delete input/regression/option-help.ly because it logs the options currently in use by lilypond, including the job count, which varies when using our own runners with more than one core.
Sounds like this should be fixed in get_help_string(), but I'm fine with removing the regtest, too. I cannot imagine how this 'test' would be useful.
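For context, the setup you describe could look roughly like this in .gitlab-ci.yml. This is only my sketch: the job names, rules and artifact paths are guesses, not necessarily what you implemented:

```yaml
# Hypothetical sketch only; job names and artifact paths are assumptions.
test-baseline:
  rules:
    - if: '$CI_COMMIT_BRANCH == "master"'
  script:
    - make test-baseline
  artifacts:
    paths:
      - build/          # baseline consumed by 'make check' in MR pipelines

make-check:
  rules:
    - if: '$CI_PIPELINE_SOURCE == "merge_request_event"'
  script:
    - make check        # currently compares against the latest master baseline
  artifacts:
    when: always        # keep test-results even when the comparison differs
    paths:
      - build/out/test-results/
```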
> - The test-results are pretty hard to find: Even if you know where to look, it takes at least three clicks to download the zip archive, plus extracting and opening index.html; viewing the results directly takes another three clicks from the job page (via 'Browse'). To ease the second option, I've included a link in the latest jobs of !5 above. The link is clickable thanks to a new GitLab feature that I asked to be enabled for my repo, so it won't work directly for lilypond/lilypond.
> - The gittree information does not contain the right merge base. This is related to issue #6061; I've submitted a change to GitLab to expose the needed information in CI, but it hasn't been merged yet.
> - The modified job for MRs always downloads the latest test-baseline from master, not the one corresponding to the merge base. Same reason as the previous point; I hope we can get the needed information soon-ish. Until then, we need to rebase to adapt the MR commits to the available test-baseline.
But this is also the way James tests currently, isn't it? He runs `make test-baseline` against latest master and `make check` on the feature branch, which may be lagging behind `master`... right? But I agree that for reproducibility it would be nice to take the baseline from the merge base. When would you run `make test-baseline`, then? As part of the pipeline on the feature branch? Or would we keep a repository of, say, the last 20 baselines on `master` and pick the right one according to the merge-base sha?
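To illustrate the second option: the selection logic itself would be tiny. This is a sketch under the assumption that we keep a mapping from master commit sha to the job that produced the corresponding test-baseline artifact; the function names and the mapping are hypothetical, not existing code:

```python
import subprocess

def merge_base(branch="origin/master"):
    """Return the sha at which the current feature branch diverged from master.
    (Plain 'git merge-base'; requires the master history to be fetched.)"""
    return subprocess.run(
        ["git", "merge-base", branch, "HEAD"],
        capture_output=True, text=True, check=True,
    ).stdout.strip()

def pick_baseline(baselines, base_sha):
    """baselines: hypothetical mapping from master commit sha to the job id
    (or artifact URL) of the matching test-baseline.  Returns None when the
    merge base is older than the baselines we keep around -- in that case
    the MR would still need a rebase."""
    return baselines.get(base_sha)
```

The open question remains where `baselines` comes from: either each kept master pipeline's artifacts, queried by sha, or a small index maintained alongside them.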
> - Sometimes the test-results contain spurious diffs, for example here: https://hahnjo.gitlab.io/-/lilypond/-/jobs/858441670/artifacts/test-results/index.html
>   I can reproduce this locally with --enable-checking, but haven't investigated further yet (there were a couple of other problems that I needed to solve in order to get things working...). If somebody has an idea for a fix, that would be great, but I think these can safely be ignored for now.
>
> There are a few more elaborate things that I'd like to work on in the future. For example, GitLab can show a list of 'failing' tests, which would tell us at first glance whether we need to look into the test-results. I've prototyped this integration in the second MR, but it's very misleading because the file extensions are missing and GitLab prints "0 failed out of null" when there are no tests. The obvious solution is to mark all existing tests as successes, but this requires a bit more thought to integrate into output-distance.py (or somewhere else).
>
> Despite these shortcomings, I think it would make sense to enable this first implementation in lilypond/lilypond. What do you think?
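On marking all tests as successes: GitLab's test summary is fed by a JUnit XML artifact, so one option (just a sketch of the idea, not how output-distance.py works today; the function and parameter names are mine) would be to emit a report that lists every regtest and attaches a failure element only to those with a diff, so the totals become real numbers:

```python
import xml.etree.ElementTree as ET

def write_junit_report(all_tests, changed, path="regression.junit.xml"):
    """Write a minimal JUnit report suitable for GitLab's test integration.
    all_tests: names of every regression test that ran.
    changed:   mapping from test name to a short description of the diff."""
    suite = ET.Element("testsuite", name="regression",
                       tests=str(len(all_tests)),
                       failures=str(len(changed)))
    for name in sorted(all_tests):
        case = ET.SubElement(suite, "testcase",
                             classname="regression", name=name)
        if name in changed:
            failure = ET.SubElement(case, "failure", message="output differs")
            failure.text = changed[name]
    ET.ElementTree(suite).write(path, encoding="utf-8", xml_declaration=True)
```

With something like this, an MR page would show e.g. "1 failed out of 1234" instead of the misleading "0 failed out of null".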
It definitely makes sense; however, I would value James' opinion on that, too.

Thanks for your work!
Michael
