Hi,

I’ve been working on a proposal that I’m really excited to share. It could significantly improve BuildStream’s performance on large dependency graphs, especially for C/C++ codebases, without changing correctness or requiring new semantics from build tools. It builds on the current state of the art: Remote Execution for parallelism, and recc to cache and remotely execute compilation at the translation unit level.

I've put the full proposal in issue #2083 <https://github.com/apache/buildstream/issues/2083>, as that renders the included diagrams nicely. I won't rehash it here, as it is quite long.

A quick summary: when we build an element, sub-actions are spawned, e.g. compilations through recc. We can record these sub-actions and, in the next build, speculatively execute those same sub-actions, with updated files, at the start of the build. We can do this aggressively for all actions, regardless of which element they correspond to in the dependency graph. When the elements are then built, the sub-actions within them will receive cache hits, dramatically speeding up the overall end-to-end build.

I am quite happy with the limited amount of additional bookkeeping required to implement the proposal. I do think it is a step towards Bazel's promise of "{ Fast, Correct } - Choose two", while not requiring all projects/elements to adopt a different build system.

I’d love to hear thoughts on:
- the proposal in general
- the best way to integrate speculative action scheduling with BuildStream’s element execution model
- any projects you'd like to try this on

I am looking for all the help I can get to prove this idea 😉; if you have an interest and some spare cycles to throw at this, let me know. I'll try to make a start this week.

Cheers,
Sander
