On Wed, 2022-10-05 at 15:24 -0600, Karl Berry wrote:
> What troubles me most is that there's no obvious way to debug any
> test failure involving parallelism, since they go away with serial
> execution. Any ideas about how to determine what is going wrong in
> the parallel make? Any way to make parallel failures more
> reproducible?

I don't have any great ideas myself. In the new prerelease of GNU make
there's a --shuffle option that randomizes (or just reverses) the order
in which prerequisites are built. Often, if you have a timing-dependent
failure, forcing the prerequisites to build in a different order can
make the failure more obvious.

In general, though, the best way to attack the issue is to try to
understand why the failure happens: what actually goes wrong. Once that
is understood, we can often envision how a parallel or "out of order"
build might cause that problem.

Alternatively, since you seem to have relatively well-defined "good"
and "bad" commits, you could use git bisect to figure out which commit
actually causes the problem. Obviously you need to be able to force the
failure, if not every time then at least often enough to detect a "bad"
commit. Maybe that will shed some light.

But I expect there's nothing here you haven't already thought of
yourself :( :).
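
For what it's worth, the bisect idea above could be sketched roughly as
follows. This is only a minimal sketch, not something I have run against
your tree: the script name (flaky_bisect.py), the helper
fails_at_least_once, and the ATTEMPTS value are all made up for
illustration. It reruns a command several times and reports "bad" to
git bisect if any single run fails, which is one way to cope with a
failure that only shows up some of the time.

```python
#!/usr/bin/env python3
"""Retry wrapper for `git bisect run` when a failure is intermittent."""
import subprocess
import sys

# Hypothetical value: raise it if the failure is rare.
ATTEMPTS = 10


def fails_at_least_once(cmd, attempts):
    """Run cmd up to `attempts` times; return True as soon as one run fails."""
    for _ in range(attempts):
        if subprocess.run(cmd).returncode != 0:
            return True
    return False


def main(argv):
    # argv[1:] is the command to test, e.g. make -j8 --shuffle check
    cmd = argv[1:]
    # git bisect run treats exit status 0 as "good" and 1 as "bad".
    return 1 if fails_at_least_once(cmd, ATTEMPTS) else 0


# Only act when a command was actually given on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    sys.exit(main(sys.argv))
```

It would be used something like this, after marking the known good and
bad commits (the -j8 and the "check" target are placeholders, and
--shuffle needs a make that supports it):

    git bisect run python3 flaky_bisect.py make -j8 --shuffle check

Note that a "good" verdict is only probabilistic here: if the failure
shows up less often than once in ATTEMPTS runs, bisect can be misled.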