Hello!

I'm curious about how the approach based on metaprogramming is better than
as a compiler optimization + inspection-testing [1]. If someone has
opinions on the matter, I think they would be interesting to read.

Cheers!
Facundo

[1] https://hackage.haskell.org/package/inspection-testing

On Wed, Dec 10, 2025 at 11:31 AM Matthew Pickering <
[email protected]> wrote:

> When it comes to fusion of `Generics`, we wrote a paper about
> performing that fusion in a multi-stage language:
> https://mpickering.github.io/papers/staged-sop.pdf
>
> In general there are soundness issues with Typed Template Haskell
> which make programming like this brittle in practice, which is what I
> discussed in my PhD thesis.
>
> Matt
>
> On Tue, Dec 9, 2025 at 4:35 PM Iavor Diatchki <[email protected]>
> wrote:
> >
> > Hello,
> > I am sure folks are aware of this, but since it has not been mentioned
> another example of useful fusion is `to . from` from `Generics`, where we'd
> like the representation types to never exist at runtime.  I think
> conceptually the problem is similar, but it's a bit interesting in that it
> has a different flavor to streams, and also there's a type family involved.
> > Cheers,
> > Iavor
> >
> > On Tue, Dec 9, 2025 at 5:27 AM Sebastian Graf <[email protected]>
> wrote:
> >>
> >> (Perhaps I should have said "returning `Step`" instead of "involving
> `Step`".)
> >>
> >> Am Di., 9. Dez. 2025 um 14:26 Uhr schrieb Sebastian Graf <
> [email protected]>:
> >>>
> >>> I agree that multi-stage programming is the principled green field
> solution. It provides the right model in which to think about solutions for
> GHC.
> >>> For GHC, that model certainly means "Step is just an intermediate data
> structure", i.e., any computation involving `Step` should only live at
> compile-time ("template polymorphism") and it should be inlined/specialised
> for aggressively.
> >>>
> >>> This is not too different for the situation with list fusion. Any
> compiled program that *runs* `map`/`filter`/`foldr`/`build` has a
> performance bug. You always want to end up with `foldr`'s SAT'd
> unfolding/explicit `(:)`/`[]` instead of a higher-order function `build`
> which does the same thing. That is achieved by the intricate multiphase
> rewrite rule framework.
> >>> If all goes well, the final program never contains any lists
> whatsoever; only a nesting of tight recursive loops. Note that there is no
> code sharing of `map`/`filter`/`foldr` here; they must all have been
> inlined and rewritten and were purely a compile-time thing.
> >>>
> >>> We want the same for stream fusion because it has better fusion
> support for `zipWith`. Inlining any function involving `Step` is absolutely
> the right thing to do! Aggressive specialization as well, but that's really
> difficult to do using SpecConstr because it blows up so easily. The stream
> fusion paper instead suggests applying the static argument transformation
> in section 7.2. A draft patch of mine tried to do that on the fly to
> compute unfoldings for recursive functions with static args:
> https://gitlab.haskell.org/ghc/ghc/-/merge_requests/4553. It lacks
> polishing and my time and interest, but I firmly believe this should help
> the stream fusion use case, because SAT+inlining is *exactly*
> specialization of recursive functions for static arguments.
> >>> (Side Note: Andrasz pointed out to me two years ago that there were
> further complications which I don't recall and which mean that he settled
> on the practical compromise of going for foldr/build fusion and enabling
> zip fusion/parallel for with stream fusion.)
> >>>
> >>> ---
> >>>
> >>> > what happens if you have sharing
> >>> >   let ys :: Stream Int = h x
> >>> >   in f ys + g ys
> >>>
> >>> This is a flaw in the way that GHC says "lists are streams". IMO a
> stream (such as in streamly) should happily duplicate all the work that
> goes into computing the elements of `ys`, since it is just code. By
> contrast, data structures such as lists do not have that luxury. `let ys ::
> [Int] = ...; in f ys ++ g ys` should not try to fuse `ys`.
> >>> Here's a repo that implements just foldr/build fusion without
> rewriting lists to/from it: https://github.com/sgraf812/foldr-build. It
> doesn't have the "fusion duplicates work" flaw because there is no data
> that should have shared the work in the first place.
> >>>
> >>> ---
> >>>
> >>> TLDR; inlining and specialising `Step` sounds like exactly what is
> needed. Specialisation of concatMap is the hard part for stream fusion,
> though, and that means we'll end up with `Step` in the program nonetheless.
> >>>
> >>> Btw., Lean's standard `Iterator` framework (inspired by Rust's) will
> implement "stream fusion". (That's what external iteration is.) I'm not
> saying that it's simple or that the optimiser is equipped to handle it,
> though :)
> >>>
> >>> Cheers,
> >>> Sebastian
> >>>
> >>>
> >>> Am Di., 9. Dez. 2025 um 13:45 Uhr schrieb Simon Peyton Jones <
> [email protected]>:
> >>>>>
> >>>>> I think a solution based on multi-stage programming is the most
> >>>>> natural way to tackle the stream fusion problem in a robust manner.
> >>>>
> >>>>
> >>>> How close are we to being able to do that in Haskell?
> >>>>
> >>>> Simon
> >>>>
> >>>> On Tue, 9 Dec 2025 at 09:56, Matthew Pickering <
> [email protected]> wrote:
> >>>>>
> >>>>> I think a solution based on multi-stage programming is the most
> >>>>> natural way to tackle the stream fusion problem in a robust manner.
> >>>>>
> >>>>> For example: https://andraskovacs.github.io/pdfs/2ltt_icfp24.pdf
> >>>>>
> >>>>> By allowing a programmer to express which parts of the program should
> >>>>> only exist code generation time, you guarantee that fusion occurs.
> >>>>> This seems much more aligned with how Haskell programmers think about
> >>>>> other things.
> >>>>>
> >>>>> Cheers,
> >>>>>
> >>>>> Matt
> >>>>>
> >>>>> On Tue, Dec 9, 2025 at 12:03 AM Harendra Kumar <
> [email protected]> wrote:
> >>>>> >
> >>>>> > On Mon, 8 Dec 2025 at 23:35, Jaro Reinders <
> [email protected]> wrote:
> >>>>> > >
> >>>>> > > I suggested we could integrate this into GHC by giving an
> inlining discount to
> >>>>> > > any function which contains function that is also mentioned in a
> rewrite rule.
> >>>>> >
> >>>>> > There are two distinct fusion mechanisms under consideration here -
> >>>>> > foldr/build fusion and stream fusion. Rewrite rules allow us to
> >>>>> > replace a combination of functions with an equivalent function,
> >>>>> > forming the basis of foldr/build fusion by fusing or eliminating
> >>>>> > intermediate functions. Stream fusion on the other hand eliminates
> >>>>> > intermediate constructors, the key optimizations for achieving this
> >>>>> > are case-of-case and worker-wrapper. Both types of fusion require
> >>>>> > inlining but they are clearly different at the implementation
> level,
> >>>>> > even though the same high level ideas may apply in both cases.
> >>>>> >
> >>>>> > The fusion-plugin deals with stream fusion only -- where inlining
> and
> >>>>> > spec-constr are used to expose constructors which are then
> eliminated
> >>>>> > by case-of-case (for linear pipelines) or worker wrapper (for
> >>>>> > recursive loops) transformations. fusion-plugin has been used
> >>>>> > extensively in streamly and we have been able to fuse almost all
> cases
> >>>>> > that we wanted to. I would recommend that you try out
> fusion-plugin on
> >>>>> > the vector package, it can help fuse the pesky cases and you can
> also
> >>>>> > gain some insight into how much code bloat it actually causes in
> >>>>> > practice. Vector is primarily based on stream fusion, and even
> though
> >>>>> > it uses rewrite rules, those are not for fusion as such I believe.
> >>>>> >
> >>>>> > > However, I also noted I think this will have a large impact on
> code size and
> >>>>> > > compilation time in larger code bases. Simon agreed, but noted
> that Harendra's
> >>>>> > > plugin is much more selective than what I suggested.
> >>>>> >
> >>>>> > Typically we have a pipeline of functions connected by fusible
> >>>>> > constructors. To fuse the entire pipeline completely we need to
> inline
> >>>>> > all the functions involved in the pipeline. Thus instead of having
> >>>>> > multiple small functions we end up creating one large combined
> >>>>> > function. However, when we use case-of-case transformation on the
> >>>>> > combined function, the size often gets reduced dramatically. In
> fact
> >>>>> > the total size of the end product is usually smaller than the
> combined
> >>>>> > original components in the pipeline. However, ghc requires more
> memory
> >>>>> > and CPU to simplify, especially if the loop becomes too large.
> >>>>> >
> >>>>> > In our experience aggressive inlining for fusing linear segments of
> >>>>> > pipelines even if they are pretty big has almost never been a
> problem.
> >>>>> > If it becomes large we can always break the pipeline by using a
> >>>>> > NOINLINE, making a trade off between runtime perf and compile time
> >>>>> > resource requirement, but we rarely needed to do that. The code
> bloat
> >>>>> > comes more often from the spec-constr optimization when we are
> fusing
> >>>>> > recursive loops. In some cases we have used
> >>>>> > -fspec-constr-recursive=16, which I admit is a very large value,
> the
> >>>>> > compile times can sometimes increase dramatically due to this. We
> can
> >>>>> > always choose to use a lower value which may result in unfused
> code in
> >>>>> > some cases. There may be scope of improvement in GHC for making
> this
> >>>>> > optimization smarter and more selective.
> >>>>> >
> >>>>> > We can also put some hard thresholds or heuristics here for the
> cases
> >>>>> > when the resulting monolithic functions become too large. One
> possible
> >>>>> > way to decide whether to inline or not could be to check if the
> size
> >>>>> > of the combined/optimized function is smaller than or roughly
> >>>>> > equivalent to the sum of the individual parts. If inlining does not
> >>>>> > result in elimination of the constructors in subsequent passes,
> then
> >>>>> > the inlining is not useful. But we have seen that it is almost
> always
> >>>>> > possible to eliminate all or most of the constructors.
> >>>>> >
> >>>>> > > Translating this to the fold/build fusion system in GHC today,
> we would force
> >>>>> > > inlining of any function that calls `foldr` or `build` (or
> `augment`).
> >>>>> > > Essentially, we would want to inline any binding that refers to
> a function that
> >>>>> > > partakes in the rewrite rules for fusion. I'll admit we probably
> don't have to
> >>>>> > > generalise this to all rewrite rules, but even with this
> restriction it seems
> >>>>> > > like a rather aggressive optimisation strategy to me.
> >>>>> >
> >>>>> > I do not have much experience with using foldr/build. But
> conceptually
> >>>>> > this is quite similar to what we do in stream fusion and similar
> ideas
> >>>>> > should be applicable here as well. In this case we inline to expose
> >>>>> > foldr/build primitives and then use rewrite rules to combine
> those. If
> >>>>> > the inlining does not result in further elimination or firing of
> >>>>> > rewrite rules then the inlining is not useful. You can probably try
> >>>>> > out automatic inlining in the same way as we do in fusion-plugin
> and
> >>>>> > see if code-bloat/compilation times become a real issue, and in
> which
> >>>>> > particular cases, then put heuristics/thresholds to avoid that. Is
> >>>>> > there a driving use case for extending this to foldr/build?
> >>>>> >
> >>>>> > -harendra
> >>>>> > _______________________________________________
> >>>>> > ghc-devs mailing list -- [email protected]
> >>>>> > To unsubscribe send an email to [email protected]
> >>>>> _______________________________________________
> >>>>> ghc-devs mailing list -- [email protected]
> >>>>> To unsubscribe send an email to [email protected]
> >>>>
> >>>> _______________________________________________
> >>>> ghc-devs mailing list -- [email protected]
> >>>> To unsubscribe send an email to [email protected]
> >>
> >> _______________________________________________
> >> ghc-devs mailing list -- [email protected]
> >> To unsubscribe send an email to [email protected]
> >
> > _______________________________________________
> > ghc-devs mailing list -- [email protected]
> > To unsubscribe send an email to [email protected]
> _______________________________________________
> ghc-devs mailing list -- [email protected]
> To unsubscribe send an email to [email protected]
>


-- 
All views and opinions expressed in this email message are the personal
opinions of the author and do not represent those of the organization or
its customers.
_______________________________________________
ghc-devs mailing list -- [email protected]
To unsubscribe send an email to [email protected]

Reply via email to