Han-Wen Nienhuys <[email protected]> writes:

> On Fri, Mar 6, 2020 at 11:18 PM <[email protected]> wrote:
>>
>> Sigh. I just noticed that, as opposed to the patch title, this does
>> not just introduce a file lock for lilypond-book but _also_ changes
>> the build system such that now almost double the number of allocated
>> jobs get used. It would be good if different topics weren't conflated
>> into single issues so that it's easier to discuss what one is
>> actually dealing with and make decisions based on the respective
>> merits of the individual parts.
>>
>> "It doesn't actually work well as a job control measure in connection
>> with parallel Make" should likely have been an indicator of what I
>> thought I was talking about.
>
> Can you tell me what problem you are currently experiencing?
Harm has a system with memory pressure. That means that so far he has
only been able to work with

  CPU_COUNT=2 make -j2 doc

Since lilypond-doc is now no longer serialised, he'd need to reduce
this to

  CPU_COUNT=1 make -j2 doc

or

  CPU_COUNT=2 make -j1 doc

to get similar memory utilisation, at a considerable loss in
performance.

I've taken a look at Make's jobserver implementation, and it is pretty
straightforward (a rough sketch of the client side is appended below).
The real solution would, of course, be to make lilypond-book, with its
directory-based database, not lock out other instances of lilypond-book
but take over their job load. However, the current interaction is that
lilypond-book hands the whole work load to LilyPond, which splits it
into n copies with a fixed share of the work each.

To make that work, one would rather have a "job server" of LilyPond
itself which does all the initialisation work and then waits for job
requests. Upon receiving one, it forks off a copy to work on it (see
the second sketch below).

Working with freshly forked copies would have the advantage of
reproducible stats that do not depend on the exact work distribution,
and the disadvantage that things like typical font loading and symbol
memoization in frequent code paths happen in each copy. On the other
hand, the question of "gc between files?" would not be an issue since
one would just throw the current state of memory away.

One would probably want fresh forks for the regtests because of the
stats and reproducibility, and would accept continuous forks for
documentation building. (I assume that continuous forks, by which I
mean one instance of LilyPond processing several files in sequence like
we do now, would be faster in the long run, but probably not by all
that much.)

I previously thought of trying to pin down the job distribution of the
regtests upon "make test-baseline" so that only new regtests (rather
than the preexisting ones) would get distributed arbitrarily on "make
check", but starting with fresh forks seems like a much better deal for
reproducibility.

Of course, that's all for the long haul. To get back to your question:
the consequences are worst when the job count is constrained due to
memory pressure. My laptop has uncommonly large memory for its overall
age and power, so I am not hit worst: the rough doubling of jobs does
not cause me to run into swap space.
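For the record, the client side of Make's jobserver protocol really is
small. Here is a minimal Python sketch -- the function names are mine
and this is not code from lilypond-book -- assuming the classic
pipe-based --jobserver-auth=R,W / --jobserver-fds=R,W form passed in
MAKEFLAGS; the newer fifo: variant of GNU Make 4.4 is not handled:

import os
import re

def jobserver_fds():
    """Find the jobserver pipe that GNU Make passes to sub-makes (and
    to recipes prefixed with '+') via MAKEFLAGS.  Returns (read_fd,
    write_fd), or None when not running under a parallel make."""
    flags = os.environ.get("MAKEFLAGS", "")
    match = re.search(r"--jobserver-(?:auth|fds)=(\d+),(\d+)", flags)
    if not match:
        return None
    return int(match.group(1)), int(match.group(2))

def acquire_slot(read_fd):
    """Block until Make hands out a job token (one byte from the pipe)."""
    return os.read(read_fd, 1)

def release_slot(write_fd, token):
    """Give the token back so other jobs can proceed."""
    os.write(write_fd, token)

The invoking process already holds one implicit slot, so a client that
wants n parallel children only needs to acquire n-1 extra tokens, and
it has to write back every token it read, even on error paths, or the
rest of the build loses parallelism.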
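And a toy sketch of the "job server of LilyPond itself" idea, in Python
purely for brevity. The socket path, the one-filename-per-connection
protocol and the process_file() entry point are all made up for
illustration; nothing like this exists in LilyPond today:

import os
import socket

def serve(socket_path, process_file):
    """Assumes all expensive initialisation has been done by the caller;
    then listens for job requests and forks a fresh copy per request."""
    srv = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    srv.bind(socket_path)
    srv.listen()
    while True:
        conn, _ = srv.accept()
        filename = conn.makefile("r").readline().strip()
        if not filename:          # empty request: shut down the server
            conn.close()
            break
        if os.fork() == 0:        # child: a fresh copy of the initialised state
            srv.close()
            process_file(filename)
            conn.sendall(b"done\n")
            os._exit(0)           # discard the memory state; no GC needed
        conn.close()
        # reap finished children without blocking new requests
        try:
            while os.waitpid(-1, os.WNOHANG)[0]:
                pass
        except ChildProcessError:
            pass
    srv.close()

The fork is the whole point: every job starts from the same fully
initialised, reproducible state, and "gc between files?" goes away
because the child simply exits.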
--
David Kastrup