On 18.11.2020 at 21:25, Jonas Hahnfeld wrote:
> Hi all,
Hi Jonas,
> I'd like to present a first workable version of 'make check' for use in our CI pipelines. I've pushed the necessary commits to my personal fork and created two merge requests to demonstrate the results:
Great!
> 1. Run 'make check' for merge requests (no difference)
>    URL: https://gitlab.com/hahnjo/lilypond/-/merge_requests/5
>    Job: https://gitlab.com/hahnjo/lilypond/-/jobs/858498690
>    test-results: https://hahnjo.gitlab.io/-/lilypond/-/jobs/858498690/artifacts/test-results/index.html
I stumbled across the git error in https://gitlab.com/hahnjo/lilypond/-/jobs/858498690#L85, but that comes from print-gittxt.sh, right?
> 2. Introduce difference in bar-lines.ly
>    URL: https://gitlab.com/hahnjo/lilypond/-/merge_requests/6
>    Job: https://gitlab.com/hahnjo/lilypond/-/jobs/852618720
>    test-results: https://hahnjo.gitlab.io/-/lilypond/-/jobs/852618720/artifacts/test-results/index.html
>
> This first workable implementation contains the minimum functionality: It runs 'make test-baseline' for every push to the master branch and replaces 'make test' in the pipelines for MRs with 'make check'. The test-results are uploaded as artifacts and can either be downloaded as a zip archive or viewed directly (see above).
>
> There are a few known issues that I'm aware of:
> - I needed to delete input/regression/option-help.ly because it logs the options currently in use by lilypond, including the job count, which varies when using our own runners with more than one core.
Sounds like this should be fixed in get_help_string(), but I'm fine with removing the regtest, too. I cannot imagine how this 'test' would be useful.
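For context, the setup you describe could look roughly like this in .gitlab-ci.yml. This is only my sketch: the job names, rules and artifact paths are guesses, not necessarily what you implemented:

```yaml
# Hypothetical sketch only; job names and artifact paths are assumptions.
test-baseline:
  rules:
    - if: '$CI_COMMIT_BRANCH == "master"'
  script:
    - make test-baseline
  artifacts:
    paths:
      - build/          # baseline consumed by 'make check' in MR pipelines

make-check:
  rules:
    - if: '$CI_PIPELINE_SOURCE == "merge_request_event"'
  script:
    - make check        # currently compares against the latest master baseline
  artifacts:
    when: always        # keep test-results even when the comparison differs
    paths:
      - build/out/test-results/
```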
> - The test-results are pretty hard to find: Even if you know where to look, it takes at least three clicks to download the zip archive, plus extracting and opening index.html; viewing the results directly takes another three clicks from the job page (via 'Browse'). To ease the second option, I've included a link in the latest jobs of !5 above. The link is clickable thanks to a new GitLab feature that I asked to be enabled for my repo, so it won't work directly for lilypond/lilypond.
> - The gittree information does not contain the right merge base. This is related to issue #6061; I've submitted a change to GitLab to expose the needed information in CI, but it hasn't been merged yet.
> - The modified job for MRs always downloads the latest test-baseline from master, not the one corresponding to the merge base. Same reason as the previous point; I hope we can get the needed information soon-ish. Until then, we need to rebase to adapt the MR commits to the available test-baseline.
But this is also the way James tests currently, isn't it? He runs `make test-baseline` against latest master and `make check` on the feature branch, which may be lagging behind `master`... right? But I agree that for reproducibility it would be nice to take the baseline from the merge base. When would you run `make test-baseline`, then? As part of the pipeline on the feature branch? Or would we keep a repository of, say, the last 20 baselines on `master` and pick the right one according to the merge-base sha?
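To illustrate the second option: the selection logic itself would be tiny. This is a sketch under the assumption that we keep a mapping from master commit sha to the job that produced the corresponding test-baseline artifact; the function names and the mapping are hypothetical, not existing code:

```python
import subprocess

def merge_base(branch="origin/master"):
    """Return the sha at which the current feature branch diverged from master.
    (Plain 'git merge-base'; requires the master history to be fetched.)"""
    return subprocess.run(
        ["git", "merge-base", branch, "HEAD"],
        capture_output=True, text=True, check=True,
    ).stdout.strip()

def pick_baseline(baselines, base_sha):
    """baselines: hypothetical mapping from master commit sha to the job id
    (or artifact URL) of the matching test-baseline.  Returns None when the
    merge base is older than the baselines we keep around -- in that case
    the MR would still need a rebase."""
    return baselines.get(base_sha)
```

The open question remains where `baselines` comes from: either each kept master pipeline's artifacts, queried by sha, or a small index maintained alongside them.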
> - Sometimes the test-results contain spurious diffs, for example here: https://hahnjo.gitlab.io/-/lilypond/-/jobs/858441670/artifacts/test-results/index.html
>   I can reproduce this locally with --enable-checking, but haven't investigated further yet (there were a couple of other problems that I needed to solve in order to get things working...). If somebody has an idea for a fix, that would be great, but I think these can safely be ignored for now.
>
> There are a few more elaborate things that I'd like to work on in the future. For example, GitLab can show a list of 'failing' tests, which would tell us at first glance whether we need to look into the test-results. I've prototyped this integration in the second MR, but it's very misleading because the file extensions are missing and GitLab prints "0 failed out of null" when there are no tests. The obvious solution is to mark all existing tests as successes, but this requires a bit more thought to integrate into output-distance.py (or somewhere else).
>
> Despite these shortcomings, I think it would make sense to enable this first implementation in lilypond/lilypond. What do you think?
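On marking all tests as successes: GitLab's test summary is fed by a JUnit XML artifact, so one option (just a sketch of the idea, not how output-distance.py works today; the function and parameter names are mine) would be to emit a report that lists every regtest and attaches a failure element only to those with a diff, so the totals become real numbers:

```python
import xml.etree.ElementTree as ET

def write_junit_report(all_tests, changed, path="regression.junit.xml"):
    """Write a minimal JUnit report suitable for GitLab's test integration.
    all_tests: names of every regression test that ran.
    changed:   mapping from test name to a short description of the diff."""
    suite = ET.Element("testsuite", name="regression",
                       tests=str(len(all_tests)),
                       failures=str(len(changed)))
    for name in sorted(all_tests):
        case = ET.SubElement(suite, "testcase",
                             classname="regression", name=name)
        if name in changed:
            failure = ET.SubElement(case, "failure", message="output differs")
            failure.text = changed[name]
    ET.ElementTree(suite).write(path, encoding="utf-8", xml_declaration=True)
```

With something like this, an MR page would show e.g. "1 failed out of 1234" instead of the misleading "0 failed out of null".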
It definitely makes sense; however, I would value James' opinion on that, too.

Thanks for your work!
Michael
