Re: [RFC] Automatic 'make check' in CI
On Nov 23, 2020, at 18:24, Dan Eble wrote: > > On Nov 23, 2020, at 18:01, Han-Wen Nienhuys wrote: >> >> We'd have to make unsmob return a Stencil const *. Maybe that is >> appropriate for more simple smobs? >> > > or for a more explicit interface, delete unsmob so that unsmob Stencil> is required. I'm working on this. — Dan
Re: [RFC] Automatic 'make check' in CI
On Nov 23, 2020, at 18:01, Han-Wen Nienhuys wrote: > > We'd have to make unsmob return a Stencil const *. Maybe that is > appropriate for more simple smobs? > or for a more explicit interface, delete unsmob so that unsmob is required. — Dan
Re: [RFC] Automatic 'make check' in CI
On Mon, Nov 23, 2020 at 9:11 PM Jonas Hahnfeld wrote: > Am Sonntag, den 22.11.2020, 22:49 +0100 schrieb David Kastrup: > > Jonas Hahnfeld writes: > > > > > Nope, that was a red herring: The reason is that the footnote creation > > > process in footnote-volta-spanner.ly messes with (point-stencil), > which > > > is used by skyline-point-extent.ly and also the BendSpanner. As far as > > > I understand the Stencil class, its objects are not immutable... > > > > > > One of the following two changes fixes the problem in my local test > > > scenario: > > > > > > [...] > > > > > > My gut feeling is that we should fix null-markup because nothing should > > > ever translate / rotate / modify the global point-stencil, thoughts? > > > > To me that seems crazy. Instead \footnote should not modify the stencil > > generated from its argument in place. If it needs something with > > different dimensions, it needs to create a new stencil, copying the > > stencil expression and changing the dimensions. > > Yes indeed, this seems to be the contract for stencils: Make a copy > whenever a Stencil is passed in from Scheme. I managed to find the code > path for Balloon grobs that violate this rule, here's a fix: > https://gitlab.com/lilypond/lilypond/-/merge_requests/522 > (Note that the test case shows that this was already broken before this > unstable series, at least 2.20.0 but probably older versions too). > thanks for tracking this down! point-stencil is from 2004, and the balloon code from thereabouts too. It's surprising that this took so long to turn up. > If possible, I'd like to merge this rather sooner than later so I can > finally post the MR to enable 'make check'.. > fine to skip the countdown for me. > > ly:stencil-set-extent! has been removed in 2.5.17. Extents are not > > intended to be mutable as far as I can discern, so if our C++ code does > > violate that assumption, it is that code we should fix instead of giving > > up on behavior guaranteed from the Scheme side of things. > > In fact, the C++ code does and many methods of the Stencil class modify > the object, but this is ok as long as we stick to the rule above. I'd > be happy to assert in some way that stencil representations coming from > Scheme are not altered, but I can't think of a good way to do this... > We'd have to make unsmob return a Stencil const *. Maybe that is appropriate for more simple smobs? -- Han-Wen Nienhuys - hanw...@gmail.com - http://www.xs4all.nl/~hanwen
Re: [RFC] Automatic 'make check' in CI
Jonas Hahnfeld writes: > Am Sonntag, den 22.11.2020, 22:49 +0100 schrieb David Kastrup: >> >> To me that seems crazy. Instead \footnote should not modify the >> stencil generated from its argument in place. If it needs something >> with different dimensions, it needs to create a new stencil, copying >> the stencil expression and changing the dimensions. > > Yes indeed, this seems to be the contract for stencils: Make a copy > whenever a Stencil is passed in from Scheme. And/or is not being generated from scratch. > I managed to find the code path for Balloon grobs that violate this > rule, here's a fix: > https://gitlab.com/lilypond/lilypond/-/merge_requests/522 (Note that > the test case shows that this was already broken before this unstable > series, at least 2.20.0 but probably older versions too). > > If possible, I'd like to merge this rather sooner than later so I can > finally post the MR to enable 'make check'... > >> ly:stencil-set-extent! has been removed in 2.5.17. Extents are not >> intended to be mutable as far as I can discern, so if our C++ code >> does violate that assumption, it is that code we should fix instead >> of giving up on behavior guaranteed from the Scheme side of things. > > In fact, the C++ code does and many methods of the Stencil class modify > the object, but this is ok as long as we stick to the rule above. Well, basically the rule should be that as soon as a Stencil is exposed to the Scheme layer, it is immutable. Stencils are of type Simple_smob, so one could stipulate that the result of any call to smobbed_copy () is off-limits for further modification. It's not the absolute point after which things come crashing down, but it's close. In particular, anything that we access via unsmob should be considered off-limits for modification. I have no idea whether there is a sane C++ way of ensuring that automagically. > I'd be happy to assert in some way that stencil representations coming > from Scheme are not altered, but I can't think of a good way to do > this... > -- David Kastrup
Re: [RFC] Automatic 'make check' in CI
Am Sonntag, den 22.11.2020, 22:49 +0100 schrieb David Kastrup: > Jonas Hahnfeld writes: > > > Nope, that was a red herring: The reason is that the footnote creation > > process in footnote-volta-spanner.ly messes with (point-stencil), which > > is used by skyline-point-extent.ly and also the BendSpanner. As far as > > I understand the Stencil class, its objects are not immutable... > > > > One of the following two changes fixes the problem in my local test > > scenario: > > > > [...] > > > > My gut feeling is that we should fix null-markup because nothing should > > ever translate / rotate / modify the global point-stencil, thoughts? > > To me that seems crazy. Instead \footnote should not modify the stencil > generated from its argument in place. If it needs something with > different dimensions, it needs to create a new stencil, copying the > stencil expression and changing the dimensions. Yes indeed, this seems to be the contract for stencils: Make a copy whenever a Stencil is passed in from Scheme. I managed to find the code path for Balloon grobs that violate this rule, here's a fix: https://gitlab.com/lilypond/lilypond/-/merge_requests/522 (Note that the test case shows that this was already broken before this unstable series, at least 2.20.0 but probably older versions too). If possible, I'd like to merge this rather sooner than later so I can finally post the MR to enable 'make check'... > ly:stencil-set-extent! has been removed in 2.5.17. Extents are not > intended to be mutable as far as I can discern, so if our C++ code does > violate that assumption, it is that code we should fix instead of giving > up on behavior guaranteed from the Scheme side of things. In fact, the C++ code does and many methods of the Stencil class modify the object, but this is ok as long as we stick to the rule above. I'd be happy to assert in some way that stencil representations coming from Scheme are not altered, but I can't think of a good way to do this... signature.asc Description: This is a digitally signed message part
Re: [RFC] Automatic 'make check' in CI
Jonas Hahnfeld writes: > Nope, that was a red herring: The reason is that the footnote creation > process in footnote-volta-spanner.ly messes with (point-stencil), which > is used by skyline-point-extent.ly and also the BendSpanner. As far as > I understand the Stencil class, its objects are not immutable... > > One of the following two changes fixes the problem in my local test > scenario: > > --- > diff --git a/ly/music-functions-init.ly b/ly/music-functions-init.ly > index 13f5c37811..17f14b1f07 100644 > --- a/ly/music-functions-init.ly > +++ b/ly/music-functions-init.ly > @@ -538,7 +538,7 @@ to the preceding note or rest as a post-event with > @code{-}.") > 'X-offset (car offset) > 'Y-offset (cdr offset) > 'automatically-numbered (not mark) > - 'text (or mark (make-null-markup)) > + 'text (or mark (ly:make-stencil "" '(0 . 0) '(0 . 0))) > 'footnote-text footnote))) > (once (propertyTweak 'footnote-music mus item > > diff --git a/scm/define-markup-commands.scm b/scm/define-markup-commands.scm > index f7280f7a58..1ab6eaafe3 100644 > --- a/scm/define-markup-commands.scm > +++ b/scm/define-markup-commands.scm > @@ -1379,7 +1379,8 @@ An empty markup with extents of a single point. >\\null > } > @end lilypond" > - point-stencil) > + ;; Create a new point-stencil every time, it might get modified... > + (ly:make-stencil "" '(0 . 0) '(0 . 0))) > > > ;; basic formatting. > --- > > My gut feeling is that we should fix null-markup because nothing should > ever translate / rotate / modify the global point-stencil, thoughts? To me that seems crazy. Instead \footnote should not modify the stencil generated from its argument in place. If it needs something with different dimensions, it needs to create a new stencil, copying the stencil expression and changing the dimensions. ly:stencil-set-extent! has been removed in 2.5.17. Extents are not intended to be mutable as far as I can discern, so if our C++ code does violate that assumption, it is that code we should fix instead of giving up on behavior guaranteed from the Scheme side of things. -- David Kastrup
Re: [RFC] Automatic 'make check' in CI
Am Sonntag, den 22.11.2020, 21:12 +0100 schrieb Han-Wen Nienhuys: > On Sun, Nov 22, 2020 at 1:55 PM Jonas Hahnfeld > wrote: > > footnote-volta-spanner.ly was added by > > commit c1f44faf958aafe7802f432a5ae0f4c74a8c0146 > > Author: Han-Wen Nienhuys > > Date: Tue Jun 16 08:59:35 2020 +0200 > > > > For footnotes on spanners, copy bounds from underlying spanner. > > > > This is likely more correct for spanners that run from non-musical > > columns (eg. volta brackets) or are created retroactively > > (eg. automatic beams). > > > > Demonstrate this by adding a regression test that puts a footnote on a > > VoltaBracket. > > > > Without adjustments of Footnote_engraver, the test produces the > > following: > > > > programming error: Spanner `FootnoteSpanner' is not fully contained > > in parent spanner. Ignoring orphaned part > > continuing, cross fingers > > programming error: bounds of this piece aren't breakable. > > continuing, cross fingers > > > > (https://gitlab.com/lilypond/lilypond/-/merge_requests/138). During > > review, David had some concerns about Spanners that were dismissed on > > the basis that there was no test case that showed wrong results (is > > that really our standard for claiming that a change works correctly?!). > > > > I tried reverting that commit and all other differences also went away. > > Does this mean the change is indeed flawed? Nope, that was a red herring: The reason is that the footnote creation process in footnote-volta-spanner.ly messes with (point-stencil), which is used by skyline-point-extent.ly and also the BendSpanner. As far as I understand the Stencil class, its objects are not immutable... One of the following two changes fixes the problem in my local test scenario: --- diff --git a/ly/music-functions-init.ly b/ly/music-functions-init.ly index 13f5c37811..17f14b1f07 100644 --- a/ly/music-functions-init.ly +++ b/ly/music-functions-init.ly @@ -538,7 +538,7 @@ to the preceding note or rest as a post-event with @code{-}.") 'X-offset (car offset) 'Y-offset (cdr offset) 'automatically-numbered (not mark) - 'text (or mark (make-null-markup)) + 'text (or mark (ly:make-stencil "" '(0 . 0) '(0 . 0))) 'footnote-text footnote))) (once (propertyTweak 'footnote-music mus item diff --git a/scm/define-markup-commands.scm b/scm/define-markup-commands.scm index f7280f7a58..1ab6eaafe3 100644 --- a/scm/define-markup-commands.scm +++ b/scm/define-markup-commands.scm @@ -1379,7 +1379,8 @@ An empty markup with extents of a single point. \\null } @end lilypond" - point-stencil) + ;; Create a new point-stencil every time, it might get modified... + (ly:make-stencil "" '(0 . 0) '(0 . 0))) ;; basic formatting. --- My gut feeling is that we should fix null-markup because nothing should ever translate / rotate / modify the global point-stencil, thoughts? I'll try to have a look this week (likely Friday). It may be prudent to revert the change if 2.22 is imminent. Actually CG says that whatever code compiled with a beta release must work with later releases of that series. I think that makes sense and we should stick to it... Jonas signature.asc Description: This is a digitally signed message part
Re: [RFC] Automatic 'make check' in CI
On Sun, Nov 22, 2020 at 1:55 PM Jonas Hahnfeld wrote: > Am Samstag, den 21.11.2020, 18:50 +0100 schrieb Jonas Hahnfeld: > > Am Samstag, den 21.11.2020, 09:46 +0100 schrieb Jonas Hahnfeld: > > > Am Freitag, den 20.11.2020, 17:55 -0500 schrieb Dan Eble: > > > > On Nov 20, 2020, at 12:22, Jonas Hahnfeld wrote: > > > > > > > Despite these shortcomings, I think it would make sense to > enable > > > > > > > this first implementation in lilypond/lilypond. What do you > think? > > > > > > > > > > > > > > > > > > > yes, +1 . I am especially happy that this will likely have no > extra > > > > > > overhead. > > > > > > > > > > No extra overhead per pipeline in MRs, but you get an extra > pipeline > > > > > for every merge to master. We'll see if this poses a serious > problem… > > > > > > > > Given the positive feedback this change has already received, can we > > > > accelerate its progress? I'm eager to have it in place because I've > > > > got a branch with new some regression tests ready. > > > > > > Sure, I can merge !517 later today and then work on the second MR to > > > actually enable 'make check' in the pipelines. > > > > Something's not working correctly here: > > > https://hahnjo.gitlab.io/-/lilypond/-/jobs/864846942/artifacts/test-results/index.html > > The test-baseline was generated by frog (using parallelism), check > > executed on a shared runner. I thought I had tested using a test- > > baseline with different number of cores... > > What I probably checked is various values for CPU_COUNT, including just > one core, and all works correctly AFAICT. What gives differences is not > setting CPU_COUNT / job-count at all. > > I successfully traced this back to compiling, for example, footnote- > volta-spanner.ly and skyline-point-extent.ly (in that order!) with an > added '\include "lilypond-book-preamble.ly"' (attached). If I run > $ ./out/bin/lilypond footnote-volta-spanner.ly skyline-point-extent.ly > the skyline-point-extent.pdf has some added whitespace that vanishes > when passing -djob-count=1 or 2 (starting with 3, it won't fork because > there are less input files than jobs). > > footnote-volta-spanner.ly was added by > commit c1f44faf958aafe7802f432a5ae0f4c74a8c0146 > Author: Han-Wen Nienhuys > Date: Tue Jun 16 08:59:35 2020 +0200 > > For footnotes on spanners, copy bounds from underlying spanner. > > This is likely more correct for spanners that run from non-musical > columns (eg. volta brackets) or are created retroactively > (eg. automatic beams). > > Demonstrate this by adding a regression test that puts a footnote on a > VoltaBracket. > > Without adjustments of Footnote_engraver, the test produces the > following: > > programming error: Spanner `FootnoteSpanner' is not fully contained > in parent spanner. Ignoring orphaned part > continuing, cross fingers > programming error: bounds of this piece aren't breakable. > continuing, cross fingers > > (https://gitlab.com/lilypond/lilypond/-/merge_requests/138). During > review, David had some concerns about Spanners that were dismissed on > the basis that there was no test case that showed wrong results (is > that really our standard for claiming that a change works correctly?!). > > I tried reverting that commit and all other differences also went away. > Does this mean the change is indeed flawed? > I'll try to have a look this week (likely Friday). It may be prudent to revert the change if 2.22 is imminent. Any ideas, comments? > Let's not reverse the arg list after fork, to avoid -djob-count confusing things more. I'd like to get rid of the fork call -that would also solve the problem with the options regtest. Let me try to have another look at !204. I am not too optimistic, but it seems worth trying another time. -- Han-Wen Nienhuys - hanw...@gmail.com - http://www.xs4all.nl/~hanwen
Re: [RFC] Automatic 'make check' in CI
Am Sonntag, den 22.11.2020, 09:56 -0500 schrieb Dan Eble: > On Nov 22, 2020, at 07:55, Jonas Hahnfeld wrote: > > > > > > I successfully traced this back to compiling, for example, footnote- > > volta-spanner.ly and skyline-point-extent.ly (in that order!) with an > > added '\include "lilypond-book-preamble.ly"' (attached). If I run > > $ ./out/bin/lilypond footnote-volta-spanner.ly skyline-point-extent.ly > > the skyline-point-extent.pdf has some added whitespace that vanishes > > when passing -djob-count=1 or 2 (starting with 3, it won't fork because > > there are less input files than jobs). > > I have to run right now, but I would look for things related to > garbage collection: does forcing GC at some point make the behavior > reproducible? Are there missing mark calls in the new code? After hours of looking at the wrong code, I realized that the list of input files is reversed when job-count is set (due to how split-list works). So when swapping the two arguments to make footnote-volta- spanner.ly compiled first, I reproducibly get the added whitespace in skyline-point-extent.ly. I guess this rules out garbage collection as a suspect. Jonas signature.asc Description: This is a digitally signed message part
Re: [RFC] Automatic 'make check' in CI
On Nov 22, 2020, at 07:55, Jonas Hahnfeld wrote: > > > I successfully traced this back to compiling, for example, footnote- > volta-spanner.ly and skyline-point-extent.ly (in that order!) with an > added '\include "lilypond-book-preamble.ly"' (attached). If I run > $ ./out/bin/lilypond footnote-volta-spanner.ly skyline-point-extent.ly > the skyline-point-extent.pdf has some added whitespace that vanishes > when passing -djob-count=1 or 2 (starting with 3, it won't fork because > there are less input files than jobs). I have to run right now, but I would look for things related to garbage collection: does forcing GC at some point make the behavior reproducible? Are there missing mark calls in the new code? — Dan
Re: [RFC] Automatic 'make check' in CI
On Nov 22, 2020, at 07:55, Jonas Hahnfeld wrote: > > . . . had some concerns about Spanners that were dismissed on > the basis that there was no test case that showed wrong results (is > that really our standard for claiming that a change works correctly?!). To be fair, there are things in this program that are complicated enough that it's unclear how to reach a specific state. On the other hand, there are areas where existing tests cover the basics but give less attention to boundary cases and interactions between features. Demanding that a reviewer produce a test case is a faux pas where I come from (midwest U.S. and Canada). A productive response would be something like this: I spent half an hour trying to create a test case to address your concern. The closest I got was . . . . That doesn't quite cover it. Do you have any suggestions? If not, I'd like to move forward with this patch. Regards, — Dan
Re: [RFC] Automatic 'make check' in CI
Am Samstag, den 21.11.2020, 18:50 +0100 schrieb Jonas Hahnfeld: > Am Samstag, den 21.11.2020, 09:46 +0100 schrieb Jonas Hahnfeld: > > Am Freitag, den 20.11.2020, 17:55 -0500 schrieb Dan Eble: > > > On Nov 20, 2020, at 12:22, Jonas Hahnfeld wrote: > > > > > > Despite these shortcomings, I think it would make sense to enable > > > > > > this first implementation in lilypond/lilypond. What do you think? > > > > > > > > > > > > > > > > yes, +1 . I am especially happy that this will likely have no extra > > > > > overhead. > > > > > > > > No extra overhead per pipeline in MRs, but you get an extra pipeline > > > > for every merge to master. We'll see if this poses a serious problem… > > > > > > Given the positive feedback this change has already received, can we > > > accelerate its progress? I'm eager to have it in place because I've > > > got a branch with new some regression tests ready. > > > > Sure, I can merge !517 later today and then work on the second MR to > > actually enable 'make check' in the pipelines. > > Something's not working correctly here: > https://hahnjo.gitlab.io/-/lilypond/-/jobs/864846942/artifacts/test-results/index.html > The test-baseline was generated by frog (using parallelism), check > executed on a shared runner. I thought I had tested using a test- > baseline with different number of cores... What I probably checked is various values for CPU_COUNT, including just one core, and all works correctly AFAICT. What gives differences is not setting CPU_COUNT / job-count at all. I successfully traced this back to compiling, for example, footnote- volta-spanner.ly and skyline-point-extent.ly (in that order!) with an added '\include "lilypond-book-preamble.ly"' (attached). If I run $ ./out/bin/lilypond footnote-volta-spanner.ly skyline-point-extent.ly the skyline-point-extent.pdf has some added whitespace that vanishes when passing -djob-count=1 or 2 (starting with 3, it won't fork because there are less input files than jobs). footnote-volta-spanner.ly was added by commit c1f44faf958aafe7802f432a5ae0f4c74a8c0146 Author: Han-Wen Nienhuys Date: Tue Jun 16 08:59:35 2020 +0200 For footnotes on spanners, copy bounds from underlying spanner. This is likely more correct for spanners that run from non-musical columns (eg. volta brackets) or are created retroactively (eg. automatic beams). Demonstrate this by adding a regression test that puts a footnote on a VoltaBracket. Without adjustments of Footnote_engraver, the test produces the following: programming error: Spanner `FootnoteSpanner' is not fully contained in parent spanner. Ignoring orphaned part continuing, cross fingers programming error: bounds of this piece aren't breakable. continuing, cross fingers (https://gitlab.com/lilypond/lilypond/-/merge_requests/138). During review, David had some concerns about Spanners that were dismissed on the basis that there was no test case that showed wrong results (is that really our standard for claiming that a change works correctly?!). I tried reverting that commit and all other differences also went away. Does this mean the change is indeed flawed? Any ideas, comments? Jonas \include "lilypond-book-preamble.ly" \header { texidoc = "Footnotes on volta brackets also work" } \version "2.21.1" \relative c'' \repeat volta 2 { c1 \footnote #'(1 . 1) "volta" Score.VoltaBracket } \alternative { { d } { e } } \include "lilypond-book-preamble.ly" \version "2.19.12" \header { texidoc = "The @code{Script} grobs should follow the descending melody line, even though the @code{NoteHead} stencils are point stencils. The @code{Stem_engraver} is removed so that the only @code{side-support-element} is the @code{NoteHead}. " } \layout { \context { \Voice \remove "Stem_engraver" } } { \override Script.direction = #DOWN \override NoteHead.stencil = #point-stencil c'2.-> b8-- a-- g1-> } signature.asc Description: This is a digitally signed message part
Re: [RFC] Automatic 'make check' in CI
On Nov 21, 2020, at 12:50, Jonas Hahnfeld wrote: > > Probably won't figure this out today since I want to listen to some > great film music by Howard Shore - well yes, including the film. Enjoy! — Dan
Re: [RFC] Automatic 'make check' in CI
Am Samstag, den 21.11.2020, 09:46 +0100 schrieb Jonas Hahnfeld: > Am Freitag, den 20.11.2020, 17:55 -0500 schrieb Dan Eble: > > On Nov 20, 2020, at 12:22, Jonas Hahnfeld wrote: > > > > > Despite these shortcomings, I think it would make sense to enable > > > > > this first implementation in lilypond/lilypond. What do you think? > > > > > > > > > > > > > yes, +1 . I am especially happy that this will likely have no extra > > > > overhead. > > > > > > No extra overhead per pipeline in MRs, but you get an extra pipeline > > > for every merge to master. We'll see if this poses a serious problem… > > > > Given the positive feedback this change has already received, can we > > accelerate its progress? I'm eager to have it in place because I've > > got a branch with new some regression tests ready. > > Sure, I can merge !517 later today and then work on the second MR to > actually enable 'make check' in the pipelines. Something's not working correctly here: https://hahnjo.gitlab.io/-/lilypond/-/jobs/864846942/artifacts/test-results/index.html The test-baseline was generated by frog (using parallelism), check executed on a shared runner. I thought I had tested using a test- baseline with different number of cores... Probably won't figure this out today since I want to listen to some great film music by Howard Shore - well yes, including the film. Jonas signature.asc Description: This is a digitally signed message part
Re: [RFC] Automatic 'make check' in CI
Am Freitag, den 20.11.2020, 17:55 -0500 schrieb Dan Eble: > On Nov 20, 2020, at 12:22, Jonas Hahnfeld wrote: > > > > Despite these shortcomings, I think it would make sense to enable > > > > this first implementation in lilypond/lilypond. What do you think? > > > > > > > > > > yes, +1 . I am especially happy that this will likely have no extra > > > overhead. > > > > No extra overhead per pipeline in MRs, but you get an extra pipeline > > for every merge to master. We'll see if this poses a serious problem… > > Given the positive feedback this change has already received, can we > accelerate its progress? I'm eager to have it in place because I've > got a branch with new some regression tests ready. Sure, I can merge !517 later today and then work on the second MR to actually enable 'make check' in the pipelines. signature.asc Description: This is a digitally signed message part
Re: [RFC] Automatic 'make check' in CI
On Nov 20, 2020, at 12:22, Jonas Hahnfeld wrote: > >>> Despite these shortcomings, I think it would make sense to enable >>> this first implementation in lilypond/lilypond. What do you think? >>> >> >> yes, +1 . I am especially happy that this will likely have no extra >> overhead. > > No extra overhead per pipeline in MRs, but you get an extra pipeline > for every merge to master. We'll see if this poses a serious problem… Given the positive feedback this change has already received, can we accelerate its progress? I'm eager to have it in place because I've got a branch with new some regression tests ready. — Dan
Re: [RFC] Automatic 'make check' in CI
Am Freitag, den 20.11.2020, 10:11 +0100 schrieb Han-Wen Nienhuys: > > There are a few known issues that I'm aware of: > > - I needed to delete input/regression/option-help.ly because it > > logs the options currently in use by lilypond, including the job- > > count which varies when using our own runners with more than one > > core. > > > > we could overwrite the job-count option in the option help, as it is > irrelevant by the time you get to the file processing. While this might work for job-count (I'd still consider it a hack), there is at least also log-file which depends on the current process. The right solution would be to show the original default value in the help output, but the test prints exactly what you enter into scm/lily.scm so I don't really see its benefit at this point.. > > > - Sometimes the test-results contain spurious diffs, for example > > here: > > https://hahnjo.gitlab.io/-/lilypond/- > > /jobs/858441670/artifacts/test-results/index.html > > I can reproduce this locally with --enable-checking, but haven't > > investigated further yet (there were a couple of other problems > > that I needed to solve in order to get things working...). If > > somebody has an idea for a fix, that would be great but I think > > these can be safely ignored for now. > > > > This could happen because of false-positives in the conservative > garbage collection. It's not super-likely, but at the same time, it > can't be ruled out. Does it always happen with the same files? I > have never seen this, and the files are not doing anything out of the > ordinary. It doesn't happen every time and AFAICT it might hit many different files, but I see it pretty consistently when building with --enable- checking. > IIRC, the dead-object detection can't be made to work anyway with > GUILE 2.x, so this might be a good moment to scrap it. As I said, I didn't investigate yet. But an explicit warning that consistently triggers on different systems at least makes me wonder if there's something going on. > > Despite these shortcomings, I think it would make sense to enable > > this first implementation in lilypond/lilypond. What do you think? > > > > yes, +1 . I am especially happy that this will likely have no extra > overhead. No extra overhead per pipeline in MRs, but you get an extra pipeline for every merge to master. We'll see if this poses a serious problem... Jonas signature.asc Description: This is a digitally signed message part
Re: [RFC] Automatic 'make check' in CI
On Wed, Nov 18, 2020 at 9:25 PM Jonas Hahnfeld wrote: > Hi all, > > I'd like to present a first workable version of 'make check' for use in > our CI pipelines. I've pushed the necessary commits to my personal fork > and created two merge requests to demonstrate the results: > > 1. Run 'make check' for merge requests (no difference) > URL: https://gitlab.com/hahnjo/lilypond/-/merge_requests/5 > Job: https://gitlab.com/hahnjo/lilypond/-/jobs/858498690 > test-results: > > https://hahnjo.gitlab.io/-/lilypond/-/jobs/858498690/artifacts/test-results/index.html > > 2. Introduce difference in bar-lines.ly > URL: https://gitlab.com/hahnjo/lilypond/-/merge_requests/6 > Job: https://gitlab.com/hahnjo/lilypond/-/jobs/852618720 > test-results: > > https://hahnjo.gitlab.io/-/lilypond/-/jobs/852618720/artifacts/test-results/index.html > > This first workable implementation contains the minimum functionality: > It runs 'make test-baseline' for every push to the master branch and > replaces 'make test' in the pipelines for MRs with 'make check'. The > test-results are uploaded as artifacts and can be either downloaded as > zip archive or viewed directly (see above). > > There are a few known issues that I'm aware of: > - I needed to delete input/regression/option-help.ly because it logs > the options currently in use by lilypond, including the job-count which > varies when using our own runners with more than one core. > we could overwrite the job-count option in the option help, as it is irrelevant by the time you get to the file processing. > - Sometimes the test-results contain spurious diffs, for example here: > > https://hahnjo.gitlab.io/-/lilypond/-/jobs/858441670/artifacts/test-results/index.html > I can reproduce this locally with --enable-checking, but haven't > investigated further yet (there were a couple of other problems that I > needed to solve in order to get things working...). If somebody has an > idea for a fix, that would be great but I think these can be safely > ignored for now. > This could happen because of false-positives in the conservative garbage collection. It's not super-likely, but at the same time, it can't be ruled out. Does it always happen with the same files? I have never seen this, and the files are not doing anything out of the ordinary. IIRC, the dead-object detection can't be made to work anyway with GUILE 2.x, so this might be a good moment to scrap it. > There are a few more elaborate things that I'd like to work on in the > future. For example, GitLab can show a list of 'failing' tests which > can tell us at first glance if we need to look into the test-results. > I've prototyped this integration in the second MR, but it's very > misleading because the file extensions are missing and GitLab prints "0 > failed out of null" when there are no tests. The obvious solution is to > mark all existing tests as success, but this requires a bit more > thought to integrate into output-distance.py (or somewhere else). > I was going to suggest to use junit XML files, but it looks you already found this. Fantastic! > Despite these shortcomings, I think it would make sense to enable this > first implementation in lilypond/lilypond. What do you think? > yes, +1 . I am especially happy that this will likely have no extra overhead. -- Han-Wen Nienhuys - hanw...@gmail.com - http://www.xs4all.nl/~hanwen
Re: [RFC] Automatic 'make check' in CI
Am Donnerstag, den 19.11.2020, 21:44 +0100 schrieb Michael Käppler: > Am 18.11.2020 um 21:25 schrieb Jonas Hahnfeld: > > Hi all, > Hi Jonas, > > > > I'd like to present a first workable version of 'make check' for use in > > our CI pipelines. I've pushed the necessary commits to my personal fork > > and created two merge requests to demonstrate the results: > Great! > > > > 1. Run 'make check' for merge requests (no difference) > > URL: https://gitlab.com/hahnjo/lilypond/-/merge_requests/5 > > Job: https://gitlab.com/hahnjo/lilypond/-/jobs/858498690 > > test-results: > > https://hahnjo.gitlab.io/-/lilypond/-/jobs/858498690/artifacts/test-results/index.html > I stumbled across the git error in > https://gitlab.com/hahnjo/lilypond/-/jobs/858498690#L85 > but this is from print-gittxt.sh, right? Exactly, this is issue #6061 also mentioned below. > > > > 2. Introduce difference in bar-lines.ly > > URL: https://gitlab.com/hahnjo/lilypond/-/merge_requests/6 > > Job: https://gitlab.com/hahnjo/lilypond/-/jobs/852618720 > > test-results: > > https://hahnjo.gitlab.io/-/lilypond/-/jobs/852618720/artifacts/test-results/index.html > > > > This first workable implementation contains the minimum functionality: > > It runs 'make test-baseline' for every push to the master branch and > > replaces 'make test' in the pipelines for MRs with 'make check'. The > > test-results are uploaded as artifacts and can be either downloaded as > > zip archive or viewed directly (see above). > > > > There are a few known issues that I'm aware of: > > - I needed to delete input/regression/option-help.ly because it logs > > the options currently in use by lilypond, including the job-count which > > varies when using our own runners with more than one core. > Sounds like this should be fixed in get_help_string(), but I'm fine with > removing > the regtest, too. I can not imagine how this 'test' would be useful. > > - The test-results are pretty hard to find: Even if you know where to > > look, it takes at least three clicks to download the zip archive plus > > extracting and opening index.html; viewing the results directly takes > > another three clicks from the job page (via 'Browse'). To ease the > > second option, I've included a link in the latest jobs of !5 above. The > > link is clickable which is a new feature of GitLab that I asked to be > > enabled for my repo, so it won't work directly for lilypond/lilypond. > > - The gittree information does not contain the right merge base. This > > is related to issue #6061 and I've submitted a change to GitLab to > > expose the needed information in CI, but it wasn't merged yet. > > - The modified job for MRs always downloads the latest test-baseline > > from master and not the one corresponding to the merge base. Same > > reason as the previous point, I hope we can get the needed information > > soon-ish. Until then, we need to rebase to adapt the MR commits to the > > available test-baseline. > But this is also the way James is testing currently, isn't it? No, it's not: > He does a `test-baseline` against latest master Yes, but ... > and a `make check` on the feature branch, the feature branch gets applied on whatever latest master is (I personally merged the branch, but the end result is the same). This comes from the time when we really submitted patches / diffs via Rietveld. As discussed in the other messages with Dan, I think this would be the right way to test things, but the currently available options make this unattractive. > which may be lagging behing `master`...right? > But I agree that for reproducibility it would be nice to take the > baseline from the merge base. > When would you do the `make test-baseline`, then? > As part of the pipeline on the feature branch? Or would we keep a > repository of, say, the > last 20 baselines on `master` and pick the right one according to the > merge base sha? Generating test-baseline in the MR pipeline would be very hard to implement, so I'd leave this to pipelines on the master branch. In my book, this should also use less resources because, by definition, there are less merges to master than pipelines on MRs. Downloading the right artifact should be easy once we know the commit -> pipeline -> job. GitLab takes care of removing the artifacts after 7 days, so we don't have to worry about this. > > - Sometimes the test-results contain spurious diffs, for example here: > > https://hahnjo.gitlab.io/-/lilypond/-/jobs/858441670/artifacts/test-results/index.html > > I can reproduce this locally with --enable-checking, but haven't > > investigated further yet (there were a couple of other problems that I > > needed to solve in order to get things working...). If somebody has an > > idea for a fix, that would be great but I think these can be safely > > ignored for now. > > > > There are a few more elaborate things that I'd like to work on in the > > future. For example, GitLab can show a list of 'failing' tests which > > can
Re: [RFC] Automatic 'make check' in CI
Am 18.11.2020 um 21:25 schrieb Jonas Hahnfeld: Hi all, Hi Jonas, I'd like to present a first workable version of 'make check' for use in our CI pipelines. I've pushed the necessary commits to my personal fork and created two merge requests to demonstrate the results: Great! 1. Run 'make check' for merge requests (no difference) URL: https://gitlab.com/hahnjo/lilypond/-/merge_requests/5 Job: https://gitlab.com/hahnjo/lilypond/-/jobs/858498690 test-results: https://hahnjo.gitlab.io/-/lilypond/-/jobs/858498690/artifacts/test-results/index.html I stumbled across the git error in https://gitlab.com/hahnjo/lilypond/-/jobs/858498690#L85 but this is from print-gittxt.sh, right? 2. Introduce difference in bar-lines.ly URL: https://gitlab.com/hahnjo/lilypond/-/merge_requests/6 Job: https://gitlab.com/hahnjo/lilypond/-/jobs/852618720 test-results: https://hahnjo.gitlab.io/-/lilypond/-/jobs/852618720/artifacts/test-results/index.html This first workable implementation contains the minimum functionality: It runs 'make test-baseline' for every push to the master branch and replaces 'make test' in the pipelines for MRs with 'make check'. The test-results are uploaded as artifacts and can be either downloaded as zip archive or viewed directly (see above). There are a few known issues that I'm aware of: - I needed to delete input/regression/option-help.ly because it logs the options currently in use by lilypond, including the job-count which varies when using our own runners with more than one core. Sounds like this should be fixed in get_help_string(), but I'm fine with removing the regtest, too. I can not imagine how this 'test' would be useful. - The test-results are pretty hard to find: Even if you know where to look, it takes at least three clicks to download the zip archive plus extracting and opening index.html; viewing the results directly takes another three clicks from the job page (via 'Browse'). To ease the second option, I've included a link in the latest jobs of !5 above. The link is clickable which is a new feature of GitLab that I asked to be enabled for my repo, so it won't work directly for lilypond/lilypond. - The gittree information does not contain the right merge base. This is related to issue #6061 and I've submitted a change to GitLab to expose the needed information in CI, but it wasn't merged yet. - The modified job for MRs always downloads the latest test-baseline from master and not the one corresponding to the merge base. Same reason as the previous point, I hope we can get the needed information soon-ish. Until then, we need to rebase to adapt the MR commits to the available test-baseline. But this is also the way James is testing currently, isn't it? He does a `test-baseline` against latest master and a `make check` on the feature branch, which may be lagging behing `master`...right? But I agree that for reproducibility it would be nice to take the baseline from the merge base. When would you do the `make test-baseline`, then? As part of the pipeline on the feature branch? Or would we keep a repository of, say, the last 20 baselines on `master` and pick the right one according to the merge base sha? - Sometimes the test-results contain spurious diffs, for example here: https://hahnjo.gitlab.io/-/lilypond/-/jobs/858441670/artifacts/test-results/index.html I can reproduce this locally with --enable-checking, but haven't investigated further yet (there were a couple of other problems that I needed to solve in order to get things working...). If somebody has an idea for a fix, that would be great but I think these can be safely ignored for now. There are a few more elaborate things that I'd like to work on in the future. For example, GitLab can show a list of 'failing' tests which can tell us at first glance if we need to look into the test-results. I've prototyped this integration in the second MR, but it's very misleading because the file extensions are missing and GitLab prints "0 failed out of null" when there are no tests. The obvious solution is to mark all existing tests as success, but this requires a bit more thought to integrate into output-distance.py (or somewhere else). Despite these shortcomings, I think it would make sense to enable this first implementation in lilypond/lilypond. What do you think? Does definitely make sense, however, I would value James' opinion on that, too. Thanks for your work! Michael
Re: [RFC] Automatic 'make check' in CI
> I'd like to present a first workable version of 'make check' for use > in our CI pipelines. I've pushed the necessary commits to my > personal fork and created two merge requests to demonstrate the > results: [...] This looks very nice, thanks! I don't know enough about the inner workings of gitlab's CI system to give any more qualified comments, though. > Despite these shortcomings, I think it would make sense to enable > this first implementation in lilypond/lilypond. What do you think? Definitely! Werner
Re: [RFC] Automatic 'make check' in CI
Am Donnerstag, den 19.11.2020, 07:28 -0500 schrieb Dan Eble: > On Nov 19, 2020, at 02:30, Jonas Hahnfeld wrote: > > > > Am Mittwoch, den 18.11.2020, 21:49 -0500 schrieb Dan Eble: > > > On Nov 18, 2020, at 15:25, Jonas Hahnfeld wrote: > > > > > > > > Until then, we need to rebase to adapt the MR commits to the > > > > available test-baseline. > > > > > > Do you mean that we would need to rebase to master before any push that > > > updates an MR? > > > > At the very least if there were changes to regression tests added to > > master in the mean time and you don't want them to appear as in your > > run of "make check". So always doing this should be a safe default > > (even if this renders the "Compare with previous version" kind of > > useless; some compromise we have to make here…) > > Hold on. I understand that "Compare with previous version" becomes > harder to read after a rebase, but is that the only reason you've > been thinking of testing against the merge base? No, the primary reason is reproducibility: If always downloading from master, you cannot re-run a job on the same input. The merge base stays constant for a given commit, so you can run the job a second time (if there was an error on the runner for example) or even restart the entire pipeline and still get the same test-baseline (or an error if it's not available anymore). > I think I would prefer testing against the current version of master. > That will expose a patch that lags behind as soon as it goes into > Patch::new rather than delaying until the submitter chooses to rebase > it, which might be immediately before merging it when nobody is > paying attention. Yes, I agree on that. What we could do is enabling Pipelines for Merged Results, which is a Premium feature (but we have it as OSS project): https://docs.gitlab.com/ee/ci/merge_request_pipelines/pipelines_for_merged_results/ This simulates a (temporary) merge to master at the time the pipeline starts. I don't like it very much because we have a linear history in master, so the final pipeline will run on a temporary merge commit that will never actually be part of the repository. GitLab is working on enabling the rebase strategy for this (so it would temporarily rebase your branch on master and run the pipeline), but it's not there yet... (Btw this would also give us a working merge train when multiple MRs count down on the same day!) Jonas signature.asc Description: This is a digitally signed message part
Re: [RFC] Automatic 'make check' in CI
On Nov 19, 2020, at 02:30, Jonas Hahnfeld wrote: > > Am Mittwoch, den 18.11.2020, 21:49 -0500 schrieb Dan Eble: >> On Nov 18, 2020, at 15:25, Jonas Hahnfeld wrote: >>> >>> Until then, we need to rebase to adapt the MR commits to the >>> available test-baseline. >> >> Do you mean that we would need to rebase to master before any push that >> updates an MR? > > At the very least if there were changes to regression tests added to > master in the mean time and you don't want them to appear as in your > run of "make check". So always doing this should be a safe default > (even if this renders the "Compare with previous version" kind of > useless; some compromise we have to make here…) Hold on. I understand that "Compare with previous version" becomes harder to read after a rebase, but is that the only reason you've been thinking of testing against the merge base? I think I would prefer testing against the current version of master. That will expose a patch that lags behind as soon as it goes into Patch::new rather than delaying until the submitter chooses to rebase it, which might be immediately before merging it when nobody is paying attention. >> I could live with that. >> >> Thanks for the work you've put into this. It's going to be helpful. > > So OK to start with the initial version and improve upon that in > smaller steps? OK with me. — Dan
Re: [RFC] Automatic 'make check' in CI
Am Mittwoch, den 18.11.2020, 21:49 -0500 schrieb Dan Eble: > On Nov 18, 2020, at 15:25, Jonas Hahnfeld wrote: > > > > Until then, we need to rebase to adapt the MR commits to the > > available test-baseline. > > Do you mean that we would need to rebase to master before any push that > updates an MR? At the very least if there were changes to regression tests added to master in the mean time and you don't want them to appear as in your run of "make check". So always doing this should be a safe default (even if this renders the "Compare with previous version" kind of useless; some compromise we have to make here...) > I could live with that. > > Thanks for the work you've put into this. It's going to be helpful. So OK to start with the initial version and improve upon that in smaller steps? Jonas > — > Dan signature.asc Description: This is a digitally signed message part
Re: [RFC] Automatic 'make check' in CI
On Nov 18, 2020, at 15:25, Jonas Hahnfeld wrote: > > Until then, we need to rebase to adapt the MR commits to the > available test-baseline. Do you mean that we would need to rebase to master before any push that updates an MR? I could live with that. Thanks for the work you've put into this. It's going to be helpful. — Dan
[RFC] Automatic 'make check' in CI
Hi all, I'd like to present a first workable version of 'make check' for use in our CI pipelines. I've pushed the necessary commits to my personal fork and created two merge requests to demonstrate the results: 1. Run 'make check' for merge requests (no difference) URL: https://gitlab.com/hahnjo/lilypond/-/merge_requests/5 Job: https://gitlab.com/hahnjo/lilypond/-/jobs/858498690 test-results: https://hahnjo.gitlab.io/-/lilypond/-/jobs/858498690/artifacts/test-results/index.html 2. Introduce difference in bar-lines.ly URL: https://gitlab.com/hahnjo/lilypond/-/merge_requests/6 Job: https://gitlab.com/hahnjo/lilypond/-/jobs/852618720 test-results: https://hahnjo.gitlab.io/-/lilypond/-/jobs/852618720/artifacts/test-results/index.html This first workable implementation contains the minimum functionality: It runs 'make test-baseline' for every push to the master branch and replaces 'make test' in the pipelines for MRs with 'make check'. The test-results are uploaded as artifacts and can be either downloaded as zip archive or viewed directly (see above). There are a few known issues that I'm aware of: - I needed to delete input/regression/option-help.ly because it logs the options currently in use by lilypond, including the job-count which varies when using our own runners with more than one core. - The test-results are pretty hard to find: Even if you know where to look, it takes at least three clicks to download the zip archive plus extracting and opening index.html; viewing the results directly takes another three clicks from the job page (via 'Browse'). To ease the second option, I've included a link in the latest jobs of !5 above. The link is clickable which is a new feature of GitLab that I asked to be enabled for my repo, so it won't work directly for lilypond/lilypond. - The gittree information does not contain the right merge base. This is related to issue #6061 and I've submitted a change to GitLab to expose the needed information in CI, but it wasn't merged yet. - The modified job for MRs always downloads the latest test-baseline from master and not the one corresponding to the merge base. Same reason as the previous point, I hope we can get the needed information soon-ish. Until then, we need to rebase to adapt the MR commits to the available test-baseline. - Sometimes the test-results contain spurious diffs, for example here: https://hahnjo.gitlab.io/-/lilypond/-/jobs/858441670/artifacts/test-results/index.html I can reproduce this locally with --enable-checking, but haven't investigated further yet (there were a couple of other problems that I needed to solve in order to get things working...). If somebody has an idea for a fix, that would be great but I think these can be safely ignored for now. There are a few more elaborate things that I'd like to work on in the future. For example, GitLab can show a list of 'failing' tests which can tell us at first glance if we need to look into the test-results. I've prototyped this integration in the second MR, but it's very misleading because the file extensions are missing and GitLab prints "0 failed out of null" when there are no tests. The obvious solution is to mark all existing tests as success, but this requires a bit more thought to integrate into output-distance.py (or somewhere else). Despite these shortcomings, I think it would make sense to enable this first implementation in lilypond/lilypond. What do you think? Jonas P.S. I probably forgot a few important things; please ask if some details are not clear... signature.asc Description: This is a digitally signed message part