On Wed, Nov 18, 2020 at 9:25 PM Jonas Hahnfeld <hah...@hahnjo.de> wrote:
> Hi all,
>
> I'd like to present a first workable version of 'make check' for use in
> our CI pipelines. I've pushed the necessary commits to my personal fork
> and created two merge requests to demonstrate the results:
>
> 1. Run 'make check' for merge requests (no difference)
>    URL: https://gitlab.com/hahnjo/lilypond/-/merge_requests/5
>    Job: https://gitlab.com/hahnjo/lilypond/-/jobs/858498690
>    test-results:
>    https://hahnjo.gitlab.io/-/lilypond/-/jobs/858498690/artifacts/test-results/index.html
>
> 2. Introduce difference in bar-lines.ly
>    URL: https://gitlab.com/hahnjo/lilypond/-/merge_requests/6
>    Job: https://gitlab.com/hahnjo/lilypond/-/jobs/852618720
>    test-results:
>    https://hahnjo.gitlab.io/-/lilypond/-/jobs/852618720/artifacts/test-results/index.html
>
> This first workable implementation contains the minimum functionality:
> It runs 'make test-baseline' for every push to the master branch and
> replaces 'make test' in the pipelines for MRs with 'make check'. The
> test-results are uploaded as artifacts and can be either downloaded as
> zip archive or viewed directly (see above).
>
> There are a few known issues that I'm aware of:
>  - I needed to delete input/regression/option-help.ly because it logs
> the options currently in use by lilypond, including the job-count which
> varies when using our own runners with more than one core.

We could overwrite the job-count option in the option help, as it is
irrelevant by the time you get to the file processing.

>  - Sometimes the test-results contain spurious diffs, for example here:
> https://hahnjo.gitlab.io/-/lilypond/-/jobs/858441670/artifacts/test-results/index.html
> I can reproduce this locally with --enable-checking, but haven't
> investigated further yet (there were a couple of other problems that I
> needed to solve in order to get things working...). If somebody has an
> idea for a fix, that would be great but I think these can be safely
> ignored for now.

This could happen because of false positives in the conservative garbage
collection. It's not super likely, but at the same time, it can't be
ruled out. Does it always happen with the same files? I have never seen
this, and the files are not doing anything out of the ordinary.

IIRC, the dead-object detection can't be made to work anyway with GUILE
2.x, so this might be a good moment to scrap it.

> There are a few more elaborate things that I'd like to work on in the
> future. For example, GitLab can show a list of 'failing' tests which
> can tell us at first glance if we need to look into the test-results.
> I've prototyped this integration in the second MR, but it's very
> misleading because the file extensions are missing and GitLab prints "0
> failed out of null" when there are no tests. The obvious solution is to
> mark all existing tests as success, but this requires a bit more
> thought to integrate into output-distance.py (or somewhere else).

I was going to suggest using JUnit XML files, but it looks like you
already found this. Fantastic!

> Despite these shortcomings, I think it would make sense to enable this
> first implementation in lilypond/lilypond. What do you think?

Yes, +1. I am especially happy that this will likely have no extra
overhead.

--
Han-Wen Nienhuys - hanw...@gmail.com - http://www.xs4all.nl/~hanwen
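
PS: on the JUnit report, here is a rough sketch of the "mark all existing
tests as success" idea. It assumes the comparison step can hand over a
mapping from test file name to "differs from baseline". The function
name, the suite name and the example data are hypothetical and are not
taken from output-distance.py. Every unchanged test is written as a
passing testcase, so the report is never empty, and the file names keep
their extensions.

```python
import xml.etree.ElementTree as ET


def build_junit_report(results, suite_name="regression"):
    """Build a JUnit-style XML tree from {test file name: differs_from_baseline}.

    Hypothetical helper for illustration; not part of output-distance.py.
    """
    failures = sum(1 for changed in results.values() if changed)
    suite = ET.Element(
        "testsuite",
        name=suite_name,
        tests=str(len(results)),
        failures=str(failures),
    )
    for name in sorted(results):
        # Use the full file name, extension included, as the test name.
        case = ET.SubElement(suite, "testcase", classname=suite_name, name=name)
        if results[name]:
            failure = ET.SubElement(
                case, "failure", message="output differs from baseline"
            )
            failure.text = "See test-results/index.html for the visual diff."
        # A <testcase> without a <failure> child is reported as passed.
    return ET.ElementTree(suite)


if __name__ == "__main__":
    # Made-up example data; real results would come from the comparison step.
    example = {"bar-lines.ly": True, "accidental.ly": False}
    build_junit_report(example).write(
        "junit.xml", encoding="utf-8", xml_declaration=True
    )
```

The resulting file would then be declared as a JUnit report artifact of
the job, so that the merge request page can pick it up.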