Hi again,

Apologies for the late reply.
On Tue, Nov 4, 2025 at 1:05 PM Abderrahim Kitouni <[email protected]> wrote:
> Hi again,
>
> On Tue, Oct 28, 2025 at 03:13, Sander Striker <[email protected]> wrote:
> [...]
> > In the abstract, let's assume that translation unit compilation is the
> > slowest part of the build. Then if you take the slowest translation unit
> > per element, and build the elements in dependency order, you can project
> > the minimal time that you need. Now if you could start these translation
> > units all at the same time, then you are looking at just the single
> > slowest of all translation units (plus the cache hit overhead).
>
> Sounds reasonable, except that the "cache hit overhead" might be a
> little too optimistic: one very slow aspect of building a large C/C++
> project is running the cmake/configure script.

There is always going to be one action per project in the dependency graph
that is the slowest. By observing what we can speculatively execute, we also
learn what we can't, and that may reveal further opportunities for
optimization. If we already know that cmake/configure is a particularly slow
operation, looking into making that operation cacheable (e.g. via the Remote
Execution API) could help make the entire build faster.

> > > Another point is that I'm not sure the complexity of this proposal is
> > > proportionate to the performance improvement we hope to gain. Happy to
> > > be proven wrong on this aspect, though :-)
> >
> > The complexity may turn out to be less than expected :-) And the gains,
> > well, I would love to do a PoC, so we can test it out.
>
> Indeed :)
>
> > > > - the best way to integrate speculative action scheduling with
> > > >   BuildStream's element execution model
> > >
> > > I think the best way to implement this would be as an additional
> > > (optional) stage between source fetch and build.
> >
> > Yes, that was what I was thinking as well. A new queue in between the
> > pull queue and build queue.
> >
> > [...]
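To put rough numbers on the translation-unit timing argument quoted above,
here is a toy model (the element names, timings, and the linear dependency
chain are all made up for illustration; this is not BuildStream code):

```python
# Toy model of the projection: compile times per translation unit, grouped
# by element. All names and numbers below are hypothetical.
elements = {
    "libbase.bst": [4.0, 9.0, 2.5],
    "libmid.bst": [7.0, 3.0],
    "app.bst": [6.0, 1.0],
}

# Assume a linear dependency chain: libbase -> libmid -> app.
critical_path = ["libbase.bst", "libmid.bst", "app.bst"]

# Dependency-ordered lower bound: the slowest translation unit per element,
# summed along the chain.
sequential_bound = sum(max(elements[e]) for e in critical_path)

# Speculative lower bound: start every translation unit at once; the build
# is then limited by the single slowest unit, plus per-action cache-hit
# overhead (modelled here as a flat constant).
CACHE_HIT_OVERHEAD = 0.5
speculative_bound = max(t for tus in elements.values() for t in tus) + CACHE_HIT_OVERHEAD

print(sequential_bound)   # 9.0 + 7.0 + 6.0 = 22.0
print(speculative_bound)  # 9.0 + 0.5 = 9.5
```

Even in this tiny example the speculative bound is less than half of the
dependency-ordered one; the gap grows with the depth of the chain.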
> > I think this would be another queue, so that the build queue is not held
> > up by speculative action generation.
>
> Of course. I'm more thinking whether we need two queues (one for
> mapping subactions to speculative actions after the build, and one to
> trigger speculative actions before the next build) or if one would be
> enough (just mapping the old subactions to new actions and scheduling
> them). Your approach has the advantage that it is possible to support
> ACTION overlays, but I am not sure how useful they would be.

I'm not sure I follow what it would look like with just one additional
queue. The mapping of subactions to speculative actions happens after the
build, because that is when the information needed to generate the overlay
data is present. This is not unique to ACTION overlays.

> > > Either way, there is
> > > something important to consider: we need to store information between
> > > the two builds. Your proposal seems to suggest putting it in the
> > > artifact, but I am not sure how to retrieve the old artifact from the
> > > new build: the weak cache key will change if the sources of the
> > > elements change (which is the most important use case IMO).
> >
> > The trick to that is the ReferencedSpeculativeActions, which backfill via
> > the elements that have a dependency on the changed element.
>
> Yeah, I understand this. But how can I link speculative actions coming
> from the old element build to the new element build?

I'm not sure I'm understanding your question correctly; it appears to be
the same as the question in the paragraph below.

> > I've been trying to refamiliarize myself with the BuildStream internals
> > to figure out how to integrate the proposal. I've put a potential
> > implementation approach as an issue comment:
> > https://github.com/apache/buildstream/issues/2083#issuecomment-3454136984.
> > You can potentially skip ahead to "Scheduler Queue Flow"; I'd love to
> > validate if that is a sensible approach.
> > Note that it is describing a
> > functionally complete implementation; however, I intend to keep it
> > initially limited to SOURCE type overlays. I am aware that the
> > implementation will be performance sensitive for this to work well.
>
> I believe "Retrieve Artifacts & SpeculativeActions by element
> weak-ref" is the weak link here: the weak ref will change if the
> sources of the element change, which is the most likely reason we
> want to do this. The case where the dependencies change but the
> element stays the same should be pretty well covered by just using
> RECC.

What you identify as the weak link is actually at the core of the proposal.
We know that sources churn relatively slowly. If you run `bst build --track`
on a regular short interval (continuous integration), each run would pick up
a small number of changed elements. The lower in the dependency tree a
change occurs, the more expensive the whole build becomes; a low-level
header change causes every transitive dependent element to rebuild. Those
elements have no source changes of their own (their weak refs are the same),
so we can execute their associated speculative actions (with the updated
header file applied via overlay) to prime the cache for e.g. recc, so that
once the build reaches these elements we get cache hits. To realize these
benefits, it is assumed a remote execution service is being used, to achieve
the broader parallelism that this brings.

For the element that did change we would not get a weak ref hit, but its
dependents will build, and they will backfill the speculative actions from a
previous version of that element.

I'll simplify the initial PoC and leave ReferencedSpeculativeActions out for
now. While those can be useful beyond ACTION overlays, they were initially
aimed at ACTION overlays, specifically for when multiple elements in a
chain change.
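To sketch what I mean by the weak-ref retrieval in the header-change
scenario (hypothetical data shapes and function names; BuildStream's real
weak cache keys and artifact metadata look different):

```python
# When a low-level element changes, its transitive dependents keep their
# weak keys, so their recorded speculative actions can be replayed (with the
# changed dependency's outputs overlaid) to prime the cache before the real
# build reaches them. All names and data below are invented for illustration.

def elements_to_prime(elements, changed, weak_key_of, stored_actions):
    """Return (element, actions) pairs whose weak key still matches a stored entry.

    elements: iterable of element names in dependency order
    changed: set of elements whose own sources changed (weak key differs)
    weak_key_of: element -> current weak cache key
    stored_actions: weak key -> speculative actions recorded by a prior build
    """
    primable = []
    for element in elements:
        if element in changed:
            # Sources changed: weak key differs, nothing stored to replay.
            # Dependents will backfill new actions once this element rebuilds.
            continue
        actions = stored_actions.get(weak_key_of(element))
        if actions:
            primable.append((element, actions))
    return primable

# Toy scenario: a header change in "libbase" leaves dependents' weak keys intact.
weak_keys = {"libbase": "wk-libbase-2", "libmid": "wk-libmid-1", "app": "wk-app-1"}
store = {"wk-libmid-1": ["cc mid.c"], "wk-app-1": ["cc app.c"]}
hits = elements_to_prime(["libbase", "libmid", "app"], {"libbase"},
                         weak_keys.get, store)
print(hits)  # [('libmid', ['cc mid.c']), ('app', ['cc app.c'])]
```

The point being: the more dependents a change has, the more of the build we
can prime this way, which is exactly the expensive case.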
> The "Scheduler Interleaves" is something that buildstream doesn't
> currently do, but I think is possible with the current design of the
> scheduler.
>
> The rest seems reasonable.

Cool. Let me see when I can get a first PR up.

> Cheers,
>
> Abderrahim

Cheers,

Sander
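P.S. To make the "new queue between pull and build" idea slightly more
concrete ahead of the PoC, here is a rough sketch of the dispatch side
(class and method names are invented for illustration and are not the
actual BuildStream scheduler API):

```python
# Minimal sketch of the proposed queue flow. The speculative queue sits
# between the pull queue and the build queue; mapping subactions to new
# speculative actions happens after a build completes, since only then is
# the overlay data available.
from collections import deque


class SpeculativeQueue:
    """Dispatch previously recorded speculative actions before an element builds."""

    def __init__(self):
        self.pending = deque()

    def enqueue(self, element, recorded_actions):
        # recorded_actions: actions captured from a prior build, to be
        # replayed with overlaid sources to prime the remote cache.
        for action in recorded_actions:
            self.pending.append((element, action))

    def dispatch(self, execute):
        # Fire all pending speculative actions; their results prime the
        # cache so the real build later hits instead of recompiling.
        results = []
        while self.pending:
            element, action = self.pending.popleft()
            results.append(execute(element, action))
        return results


q = SpeculativeQueue()
q.enqueue("app.bst", ["compile main.o", "compile util.o"])
done = q.dispatch(lambda e, a: f"{e}:{a}")
print(done)  # ['app.bst:compile main.o', 'app.bst:compile util.o']
```

The real thing would of course hand actions to the remote execution
service asynchronously rather than executing them inline.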
