Hello! I'm curious about how the approach based on metaprogramming is better than as a compiler optimization + inspection-testing [1]. If someone has opinions on the matter, I think they would be interesting to read.
Cheers! Facundo [1] https://hackage.haskell.org/package/inspection-testing On Wed, Dec 10, 2025 at 11:31 AM Matthew Pickering < [email protected]> wrote: > When it comes to fusion of `Generics`, we wrote a paper about > performing that fusion in a multi-stage language: > https://mpickering.github.io/papers/staged-sop.pdf > > In general there are soundness issues with Typed Template Haskell > which make programming like this brittle in practice, which is what I > discussed in my PhD thesis. > > Matt > > On Tue, Dec 9, 2025 at 4:35 PM Iavor Diatchki <[email protected]> > wrote: > > > > Hello, > > I am sure folks are aware of this, but since it has not been mentioned > another example of useful fusion is `to . from` from `Generics`, where we'd > like the representation types to never exist at runtime. I think > conceptually the problem is similar, but it's a bit interesting in that it > has a different flavor to streams, and also there's a type family involved. > > Cheers, > > Iavor > > > > On Tue, Dec 9, 2025 at 5:27 AM Sebastian Graf <[email protected]> > wrote: > >> > >> (Perhaps I should have said "returning `Step`" instead of "involving > `Step`".) > >> > >> Am Di., 9. Dez. 2025 um 14:26 Uhr schrieb Sebastian Graf < > [email protected]>: > >>> > >>> I agree that multi-stage programming is the principled green field > solution. It provides the right model in which to think about solutions for > GHC. > >>> For GHC, that model certainly means "Step is just an intermediate data > structure", i.e., any computation involving `Step` should only live at > compile-time ("template polymorphism") and it should be inlined/specialised > for aggressively. > >>> > >>> This is not too different for the situation with list fusion. Any > compiled program that *runs* `map`/`filter`/`foldr`/`build` has a > performance bug. You always want to end up with `foldr`'s SAT'd > unfolding/explicit `(:)`/`[]` instead of a higher-order function `build` > which does the same thing. That is achieved by the intricate multiphase > rewrite rule framework. > >>> If all goes well, the final program never contains any lists > whatsoever; only a nesting of tight recursive loops. Note that there is no > code sharing of `map`/`filter`/`foldr` here; they must all have been > inlined and rewritten and were purely a compile-time thing. > >>> > >>> We want the same for stream fusion because it has better fusion > support for `zipWith`. Inlining any function involving `Step` is absolutely > the right thing to do! Aggressive specialization as well, but that's really > difficult to do using SpecConstr because it blows up so easily. The stream > fusion paper instead suggests applying the static argument transformation > in section 7.2. A draft patch of mine tried to do that on the fly to > compute unfoldings for recursive functions with static args: > https://gitlab.haskell.org/ghc/ghc/-/merge_requests/4553. It lacks > polishing and my time and interest, but I firmly believe this should help > the stream fusion use case, because SAT+inlining is *exactly* > specialization of recursive functions for static arguments. > >>> (Side Note: Andrasz pointed out to me two years ago that there were > further complications which I don't recall and which mean that he settled > on the practical compromise of going for foldr/build fusion and enabling > zip fusion/parallel for with stream fusion.) > >>> > >>> --- > >>> > >>> > what happens if you have sharing > >>> > let ys :: Stream Int = h x > >>> > in f ys + g ys > >>> > >>> This is a flaw in the way that GHC says "lists are streams". IMO a > stream (such as in streamly) should happily duplicate all the work that > goes into computing the elements of `ys`, since it is just code. By > contrast, data structures such as lists do not have that luxury. `let ys :: > [Int] = ...; in f ys ++ g ys` should not try to fuse `ys`. > >>> Here's a repo that implements just foldr/build fusion without > rewriting lists to/from it: https://github.com/sgraf812/foldr-build. It > doesn't have the "fusion duplicates work" flaw because there is no data > that should have shared the work in the first place. > >>> > >>> --- > >>> > >>> TLDR; inlining and specialising `Step` sounds like exactly what is > needed. Specialisation of concatMap is the hard part for stream fusion, > though, and that means we'll end up with `Step` in the program nonetheless. > >>> > >>> Btw., Lean's standard `Iterator` framework (inspired by Rust's) will > implement "stream fusion". (That's what external iteration is.) I'm not > saying that it's simple or that the optimiser is equipped to handle it, > though :) > >>> > >>> Cheers, > >>> Sebastian > >>> > >>> > >>> Am Di., 9. Dez. 2025 um 13:45 Uhr schrieb Simon Peyton Jones < > [email protected]>: > >>>>> > >>>>> I think a solution based on multi-stage programming is the most > >>>>> natural way to tackle the stream fusion problem in a robust manner. > >>>> > >>>> > >>>> How close are we to being able to do that in Haskell? > >>>> > >>>> Simon > >>>> > >>>> On Tue, 9 Dec 2025 at 09:56, Matthew Pickering < > [email protected]> wrote: > >>>>> > >>>>> I think a solution based on multi-stage programming is the most > >>>>> natural way to tackle the stream fusion problem in a robust manner. > >>>>> > >>>>> For example: https://andraskovacs.github.io/pdfs/2ltt_icfp24.pdf > >>>>> > >>>>> By allowing a programmer to express which parts of the program should > >>>>> only exist code generation time, you guarantee that fusion occurs. > >>>>> This seems much more aligned with how Haskell programmers think about > >>>>> other things. > >>>>> > >>>>> Cheers, > >>>>> > >>>>> Matt > >>>>> > >>>>> On Tue, Dec 9, 2025 at 12:03 AM Harendra Kumar < > [email protected]> wrote: > >>>>> > > >>>>> > On Mon, 8 Dec 2025 at 23:35, Jaro Reinders < > [email protected]> wrote: > >>>>> > > > >>>>> > > I suggested we could integrate this into GHC by giving an > inlining discount to > >>>>> > > any function which contains function that is also mentioned in a > rewrite rule. > >>>>> > > >>>>> > There are two distinct fusion mechanisms under consideration here - > >>>>> > foldr/build fusion and stream fusion. Rewrite rules allow us to > >>>>> > replace a combination of functions with an equivalent function, > >>>>> > forming the basis of foldr/build fusion by fusing or eliminating > >>>>> > intermediate functions. Stream fusion on the other hand eliminates > >>>>> > intermediate constructors, the key optimizations for achieving this > >>>>> > are case-of-case and worker-wrapper. Both types of fusion require > >>>>> > inlining but they are clearly different at the implementation > level, > >>>>> > even though the same high level ideas may apply in both cases. > >>>>> > > >>>>> > The fusion-plugin deals with stream fusion only -- where inlining > and > >>>>> > spec-constr are used to expose constructors which are then > eliminated > >>>>> > by case-of-case (for linear pipelines) or worker wrapper (for > >>>>> > recursive loops) transformations. fusion-plugin has been used > >>>>> > extensively in streamly and we have been able to fuse almost all > cases > >>>>> > that we wanted to. I would recommend that you try out > fusion-plugin on > >>>>> > the vector package, it can help fuse the pesky cases and you can > also > >>>>> > gain some insight into how much code bloat it actually causes in > >>>>> > practice. Vector is primarily based on stream fusion, and even > though > >>>>> > it uses rewrite rules, those are not for fusion as such I believe. > >>>>> > > >>>>> > > However, I also noted I think this will have a large impact on > code size and > >>>>> > > compilation time in larger code bases. Simon agreed, but noted > that Harendra's > >>>>> > > plugin is much more selective than what I suggested. > >>>>> > > >>>>> > Typically we have a pipeline of functions connected by fusible > >>>>> > constructors. To fuse the entire pipeline completely we need to > inline > >>>>> > all the functions involved in the pipeline. Thus instead of having > >>>>> > multiple small functions we end up creating one large combined > >>>>> > function. However, when we use case-of-case transformation on the > >>>>> > combined function, the size often gets reduced dramatically. In > fact > >>>>> > the total size of the end product is usually smaller than the > combined > >>>>> > original components in the pipeline. However, ghc requires more > memory > >>>>> > and CPU to simplify, especially if the loop becomes too large. > >>>>> > > >>>>> > In our experience aggressive inlining for fusing linear segments of > >>>>> > pipelines even if they are pretty big has almost never been a > problem. > >>>>> > If it becomes large we can always break the pipeline by using a > >>>>> > NOINLINE, making a trade off between runtime perf and compile time > >>>>> > resource requirement, but we rarely needed to do that. The code > bloat > >>>>> > comes more often from the spec-constr optimization when we are > fusing > >>>>> > recursive loops. In some cases we have used > >>>>> > -fspec-constr-recursive=16, which I admit is a very large value, > the > >>>>> > compile times can sometimes increase dramatically due to this. We > can > >>>>> > always choose to use a lower value which may result in unfused > code in > >>>>> > some cases. There may be scope of improvement in GHC for making > this > >>>>> > optimization smarter and more selective. > >>>>> > > >>>>> > We can also put some hard thresholds or heuristics here for the > cases > >>>>> > when the resulting monolithic functions become too large. One > possible > >>>>> > way to decide whether to inline or not could be to check if the > size > >>>>> > of the combined/optimized function is smaller than or roughly > >>>>> > equivalent to the sum of the individual parts. If inlining does not > >>>>> > result in elimination of the constructors in subsequent passes, > then > >>>>> > the inlining is not useful. But we have seen that it is almost > always > >>>>> > possible to eliminate all or most of the constructors. > >>>>> > > >>>>> > > Translating this to the fold/build fusion system in GHC today, > we would force > >>>>> > > inlining of any function that calls `foldr` or `build` (or > `augment`). > >>>>> > > Essentially, we would want to inline any binding that refers to > a function that > >>>>> > > partakes in the rewrite rules for fusion. I'll admit we probably > don't have to > >>>>> > > generalise this to all rewrite rules, but even with this > restriction it seems > >>>>> > > like a rather aggressive optimisation strategy to me. > >>>>> > > >>>>> > I do not have much experience with using foldr/build. But > conceptually > >>>>> > this is quite similar to what we do in stream fusion and similar > ideas > >>>>> > should be applicable here as well. In this case we inline to expose > >>>>> > foldr/build primitives and then use rewrite rules to combine > those. If > >>>>> > the inlining does not result in further elimination or firing of > >>>>> > rewrite rules then the inlining is not useful. You can probably try > >>>>> > out automatic inlining in the same way as we do in fusion-plugin > and > >>>>> > see if code-bloat/compilation times become a real issue, and in > which > >>>>> > particular cases, then put heuristics/thresholds to avoid that. Is > >>>>> > there a driving use case for extending this to foldr/build? > >>>>> > > >>>>> > -harendra > >>>>> > _______________________________________________ > >>>>> > ghc-devs mailing list -- [email protected] > >>>>> > To unsubscribe send an email to [email protected] > >>>>> _______________________________________________ > >>>>> ghc-devs mailing list -- [email protected] > >>>>> To unsubscribe send an email to [email protected] > >>>> > >>>> _______________________________________________ > >>>> ghc-devs mailing list -- [email protected] > >>>> To unsubscribe send an email to [email protected] > >> > >> _______________________________________________ > >> ghc-devs mailing list -- [email protected] > >> To unsubscribe send an email to [email protected] > > > > _______________________________________________ > > ghc-devs mailing list -- [email protected] > > To unsubscribe send an email to [email protected] > _______________________________________________ > ghc-devs mailing list -- [email protected] > To unsubscribe send an email to [email protected] > -- All views and opinions expressed in this email message are the personal opinions of the author and do not represent those of the organization or its customers.
_______________________________________________ ghc-devs mailing list -- [email protected] To unsubscribe send an email to [email protected]
