>
> I think a solution based on multi-stage programming is the most
> natural way to tackle the stream fusion problem in a robust manner.


How close are we to being able to do that in Haskell?

Simon

On Tue, 9 Dec 2025 at 09:56, Matthew Pickering <[email protected]>
wrote:

> I think a solution based on multi-stage programming is the most
> natural way to tackle the stream fusion problem in a robust manner.
>
> For example: https://andraskovacs.github.io/pdfs/2ltt_icfp24.pdf
>
> By allowing a programmer to express which parts of the program should
> only exist code generation time, you guarantee that fusion occurs.
> This seems much more aligned with how Haskell programmers think about
> other things.
>
> Cheers,
>
> Matt
>
> On Tue, Dec 9, 2025 at 12:03 AM Harendra Kumar <[email protected]>
> wrote:
> >
> > On Mon, 8 Dec 2025 at 23:35, Jaro Reinders <[email protected]>
> wrote:
> > >
> > > I suggested we could integrate this into GHC by giving an inlining
> discount to
> > > any function which contains function that is also mentioned in a
> rewrite rule.
> >
> > There are two distinct fusion mechanisms under consideration here -
> > foldr/build fusion and stream fusion. Rewrite rules allow us to
> > replace a combination of functions with an equivalent function,
> > forming the basis of foldr/build fusion by fusing or eliminating
> > intermediate functions. Stream fusion on the other hand eliminates
> > intermediate constructors, the key optimizations for achieving this
> > are case-of-case and worker-wrapper. Both types of fusion require
> > inlining but they are clearly different at the implementation level,
> > even though the same high level ideas may apply in both cases.
> >
> > The fusion-plugin deals with stream fusion only -- where inlining and
> > spec-constr are used to expose constructors which are then eliminated
> > by case-of-case (for linear pipelines) or worker wrapper (for
> > recursive loops) transformations. fusion-plugin has been used
> > extensively in streamly and we have been able to fuse almost all cases
> > that we wanted to. I would recommend that you try out fusion-plugin on
> > the vector package, it can help fuse the pesky cases and you can also
> > gain some insight into how much code bloat it actually causes in
> > practice. Vector is primarily based on stream fusion, and even though
> > it uses rewrite rules, those are not for fusion as such I believe.
> >
> > > However, I also noted I think this will have a large impact on code
> size and
> > > compilation time in larger code bases. Simon agreed, but noted that
> Harendra's
> > > plugin is much more selective than what I suggested.
> >
> > Typically we have a pipeline of functions connected by fusible
> > constructors. To fuse the entire pipeline completely we need to inline
> > all the functions involved in the pipeline. Thus instead of having
> > multiple small functions we end up creating one large combined
> > function. However, when we use case-of-case transformation on the
> > combined function, the size often gets reduced dramatically. In fact
> > the total size of the end product is usually smaller than the combined
> > original components in the pipeline. However, ghc requires more memory
> > and CPU to simplify, especially if the loop becomes too large.
> >
> > In our experience aggressive inlining for fusing linear segments of
> > pipelines even if they are pretty big has almost never been a problem.
> > If it becomes large we can always break the pipeline by using a
> > NOINLINE, making a trade off between runtime perf and compile time
> > resource requirement, but we rarely needed to do that. The code bloat
> > comes more often from the spec-constr optimization when we are fusing
> > recursive loops. In some cases we have used
> > -fspec-constr-recursive=16, which I admit is a very large value, the
> > compile times can sometimes increase dramatically due to this. We can
> > always choose to use a lower value which may result in unfused code in
> > some cases. There may be scope of improvement in GHC for making this
> > optimization smarter and more selective.
> >
> > We can also put some hard thresholds or heuristics here for the cases
> > when the resulting monolithic functions become too large. One possible
> > way to decide whether to inline or not could be to check if the size
> > of the combined/optimized function is smaller than or roughly
> > equivalent to the sum of the individual parts. If inlining does not
> > result in elimination of the constructors in subsequent passes, then
> > the inlining is not useful. But we have seen that it is almost always
> > possible to eliminate all or most of the constructors.
> >
> > > Translating this to the fold/build fusion system in GHC today, we
> would force
> > > inlining of any function that calls `foldr` or `build` (or `augment`).
> > > Essentially, we would want to inline any binding that refers to a
> function that
> > > partakes in the rewrite rules for fusion. I'll admit we probably don't
> have to
> > > generalise this to all rewrite rules, but even with this restriction
> it seems
> > > like a rather aggressive optimisation strategy to me.
> >
> > I do not have much experience with using foldr/build. But conceptually
> > this is quite similar to what we do in stream fusion and similar ideas
> > should be applicable here as well. In this case we inline to expose
> > foldr/build primitives and then use rewrite rules to combine those. If
> > the inlining does not result in further elimination or firing of
> > rewrite rules then the inlining is not useful. You can probably try
> > out automatic inlining in the same way as we do in fusion-plugin and
> > see if code-bloat/compilation times become a real issue, and in which
> > particular cases, then put heuristics/thresholds to avoid that. Is
> > there a driving use case for extending this to foldr/build?
> >
> > -harendra
> > _______________________________________________
> > ghc-devs mailing list -- [email protected]
> > To unsubscribe send an email to [email protected]
> _______________________________________________
> ghc-devs mailing list -- [email protected]
> To unsubscribe send an email to [email protected]
>
_______________________________________________
ghc-devs mailing list -- [email protected]
To unsubscribe send an email to [email protected]

Reply via email to