On Tue, 9 Dec 2025 at 15:19, Simon Peyton Jones
<[email protected]> wrote:

>
> I'm not really getting this distinction.   foldr/build fusion eliminates 
> intermediate lists (cons cells); stream fusion eliminates intermediate Step 
> constructors.   What is the difference, really?

My point was that these two fusion approaches have very different
implementations: foldr/build relies on identifying the respective
functions and their rewrite rules, while stream fusion relies on
recognizing the constructors involved. It wasn’t clear which one we
were discussing. Jaro was coming from the point of view of the vector
library, where fusion is based on the Step constructor, and then he
mentioned rewrite rules, which don’t apply to stream fusion. So I’m
unsure of the goal: is it improving list fusion, or making
vector-style stream fusion more reliable, possibly by integrating
fusion-plugin ideas into GHC.

> The only difference in my mind is that you can declare the entire Step data 
> type to be fusible, thereby driving aggressive inlining.  Step is *only* an 
> intermediate data type.    But we can't do that for lists because they are 
> ubiquitous.

In the list fusion case, constructor marking scheme will not apply, in
this case we will have to identify if a function results in
foldr/build primitives and if a rewrite rule exists that can eliminate
it, if so we can mark the function with INLINE. So the corresponding
concept here would be to mark and find the foldr/build functions
rather than the list constructors. However, this is already achieved
by manually marking all such functions to be inlined, though it may be
fragile. In case of stream-fusion also the programmer marks the
involved function with INLINE, but the core problem that the
fusion-plugin solves is that the internal join-points/let bindings
created by GHC during core-2-core transformations are not marked
inline, therefore, the programmer loses control over inlining. I am
not sure if such a problem even exists in case of list fusion.

> BTW Harendra, what happens if you have sharing
>
>    let ys :: Stream Int = h x
>    in f ys + g ys
>
> where one Stream is consumed by two consumers, f and g.   Now it's not so 
> easy to eliminate the intermediate, and aggressively inlining f, g, and h 
> will simply increase code size.

First of all, I would discourage writing code which duplicates the
stream. The recommended approach is to use fold combinators to combine
the folds "f" and "g" here, this is achieved by using the "tee" or
"teeWith" combinators in streamly. The resulting fold consumes a
single stream and distributes it to both the folds, achieving complete
fusion. If someone does write this code, in the case of stream fusion
both f and g will get inlined and fuse with "ys", so fusion is not an
issue. However, "ys" gets duplicated which has its own ramifications
in impure code, but that's a different issue altogether.

-harendra
_______________________________________________
ghc-devs mailing list -- [email protected]
To unsubscribe send an email to [email protected]

Reply via email to