Hi Sander,

On Mon, 20 Oct 2025 at 18:09, Sander Striker <[email protected]> wrote:
>
> Hi,
>
> I’ve been working on a proposal that I’m really excited to share. It could
> significantly improve BuildStream’s performance on large dependency graphs,
> especially for C/C++ codebases, without changing correctness or requiring
> new semantics from build tools. This builds on top of the current state of
> the art, using both Remote Execution for parallelism and using recc to
> cache and remote execute compilation at the translation unit level.
>
> I've put the full proposal in issue #2083
> <https://github.com/apache/buildstream/issues/2083>, as that renders the
> included diagrams nicely. I'll not rehash it here as it is quite long. A
> quick summary:
> When we build an element, sub-actions are spawned, e.g. compile through
> recc. We can record these subactions, and in a next build speculatively
> execute those same sub-actions, with updated files, at the start of the
> build. We can do this aggressively for all actions regardless of what
> element they correspond to in the dependency graph. When elements are
> built, the sub-actions within will receive cache hits, dramatically
> speeding up the overall end-to-end build.
>
> I am quite happy with the limited amount of additional bookkeeping that is
> required to implement the proposal. I do think it is a step towards
> Bazel's promise of "{ Fast, Correct } — Choose two", while not requiring
> all projects/elements to adopt a different build system.
>
> I’d love to hear thoughts on
>
> - the proposal in general

I've talked with a few colleagues about this proposal and overall people
are confused. I think it would be useful to state your premise before
diving into the technical details. Let me try to state my understanding,
and please let me know if it's correct. We assume that:

* The project is already using RECC (or similar) in BuildStream
* While this provides a welcome improvement, we want even faster builds
* The user is using remote execution (and not just building locally
  using the remote execution API), and has access to a large "build
  grid" that is potentially sitting idle
* One thing we could do is try to predict the actions that RECC will
  execute and schedule them early, so that RECC will find them in the
  (remote) action cache when needed

Given this, I think it might be too early to start working on this. We
should look at how much improvement RECC-in-BuildStream brings by itself
before trying to optimize further. I have a draft merge request for
freedesktop-sdk [1], and I hope to start working on integrating it in
gnome-build-meta [2] once that MR lands.

Another point is that I'm not sure the complexity of this proposal is
proportionate to the performance improvement we hope to gain. Happy to
be proven wrong on this aspect, though :-)

> - the best way to integrate speculative action scheduling with
> BuildStream’s element execution model

I think the best way to implement this would be as an additional
(optional) stage between source fetch and build. I'm not very familiar
with the inner workings of the scheduler, but I think it is possible to
have the job run when all the build dependencies of an element have
either their sources or artifacts ready. We could potentially require
that some elements have their artifacts, and not accept only sources
(I'm thinking of stuff like compilers here). At this point, we can start
scheduling the speculative actions.
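
To make that readiness condition a bit more concrete, here is a rough
sketch in Python. It is only an illustration of the rule described
above: the Element class, its fields and can_prime() are hypothetical
names I made up, not BuildStream's actual scheduler or element API.

```python
from dataclasses import dataclass, field


@dataclass
class Element:
    """Hypothetical stand-in for an element; not BuildStream's Element."""
    name: str
    build_deps: list["Element"] = field(default_factory=list)
    sources_ready: bool = False
    artifact_ready: bool = False
    # Assumption: some dependencies (e.g. compilers) must be built
    # before any speculative action that uses them can run.
    needs_artifact: bool = False


def can_prime(element: Element) -> bool:
    """Return True once every build dependency is usable for priming."""
    for dep in element.build_deps:
        if dep.needs_artifact:
            # Sources alone are not enough for this dependency.
            if not dep.artifact_ready:
                return False
        elif not (dep.sources_ready or dep.artifact_ready):
            return False
    return True
```

With a compiler element marked needs_artifact=True, an element that
depends on it would not be eligible for the priming stage until the
compiler's artifact is available, even if all sources are fetched.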
I feel that we could schedule more speculative actions as more build
dependencies have their artifacts ready, but this means potentially
scheduling the "cache priming" job more than once for any given element.

One thing I notice is that in your proposal, generating speculative
actions happens after the element is built. My thinking above points
more towards having it be part of the second build. Either way, there is
something important to consider: we need to store information between
the two builds. Your proposal seems to suggest putting it in the
artifact, but I am not sure how to retrieve the old artifact from the
new build: the weak cache key will change if the element's sources
change (which is the most important use case IMO).

> - any projects you'd like to try this on

gnome-build-meta seems like an ideal target. We try to build with the
latest main/master of every GNOME project at least once per day, and
every time we upgrade glib everything needs to be rebuilt, including two
instances of WebKitGTK. However, it doesn't have access to a
"potentially idle large build grid", so I'm not sure if we can actually
implement this.

Anyway, these are some quick thoughts (as quick as the complexity of the
proposal allows). I'll keep thinking about it; I may have more ideas.

Cheers,
Abderrahim

[1] https://gitlab.com/freedesktop-sdk/freedesktop-sdk/-/merge_requests/22859
[2] https://gitlab.gnome.org/GNOME/gnome-build-meta/
