RE: Fusing loops by specializing on functions with SpecConstr?

2020-04-06 Thread Simon Peyton Jones via ghc-devs
Cool -- but please do write a blog post or something to distil what you have 
learned. I have not followed this thread in detail, and I bet others haven't 
either. But it'd be a pity for your learning not to be shared somehow!

Thanks

Simon

| -----Original Message-----
| From: ghc-devs  On Behalf Of Alexis King
| Sent: 04 April 2020 02:46
| To: Sebastian Graf 
| Cc: ghc-devs 
| Subject: Re: Fusing loops by specializing on functions with SpecConstr?
| 
| 
| I fiddled with alternative representations for a while and didn’t make any
| progress—it was too easy to end up with code explosion in the presence of
| any unknown calls—but I seem to have found a RULES-based approach that
| works very well on the examples I’ve tried. It’s quite simple, which makes
| it especially appealing!
| 
| I started by defining a wrapper around the `SF` constructor to attach
| rules to:
| 
| mkSF :: (a -> s -> Step s b) -> s -> SF a b
| mkSF = SF
| {-# INLINE CONLIKE [1] mkSF #-}
| 
| I then changed the definitions of (.), (***), (&&&), (+++), and (|||) to
| use `mkSF` instead of `SF`, but I left the other methods alone, so they
| just use `SF` directly. Then I defined two rewrite rules:
| 
| {-# RULES
| "mkSF @((), _)" forall f s. mkSF f ((), s) =
|   SF (\a s1 -> case f a ((), s1) of Step ((), s2) b -> Step s2 b) s
| "mkSF @(_, ())" forall f s. mkSF f (s, ()) =
|   SF (\a s1 -> case f a (s1, ()) of Step (s2, ()) b -> Step s2 b) s
| #-}
| 
| That’s it. These two rules alone are enough to eliminate the redundant
| tupling. Now the optimized version of `mapMaybeSF` is beautiful!
| 
| mapMaybeSF = \ @ a @ b f -> case f of { SF @ s f2 s2 ->
|   SF (\ a1 s1 -> case a1 of {
|Nothing -> case s1 of dt { __DEFAULT -> Step dt Nothing }
|Just x -> case f2 x s1 of {
|  Step s2' c1 -> Step s2' (Just c1) }})
|  s2 }
| 
| So unless this breaks down in some larger situation I’m not aware of, I
| think this solves my problem without the need for any fancy SpecConstr
| shenanigans. Many thanks to you, Sebastian, for pointing me in the right
| direction!
| 
| Alexis
| ___
| ghc-devs mailing list
| ghc-devs@haskell.org
| http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: Fusing loops by specializing on functions with SpecConstr?

2020-04-05 Thread Sebastian Graf
>
> That’s it. These two rules alone are enough to eliminate the redundant
> tupling. Now the optimized version of `mapMaybeSF` is beautiful!
>

Beautiful indeed! That's wonderful to hear. Good luck messing about with
your FRP framework!

Sebastian

On Sat, 4 Apr 2020 at 03:45, Alexis King wrote:

>
> I fiddled with alternative representations for a while and didn’t make
> any progress—it was too easy to end up with code explosion in the
> presence of any unknown calls—but I seem to have found a RULES-based
> approach that works very well on the examples I’ve tried. It’s quite
> simple, which makes it especially appealing!
>
> I started by defining a wrapper around the `SF` constructor to attach
> rules to:
>
> mkSF :: (a -> s -> Step s b) -> s -> SF a b
> mkSF = SF
> {-# INLINE CONLIKE [1] mkSF #-}
>
| > I then changed the definitions of (.), (***), (&&&), (+++), and (|||)
> to use `mkSF` instead of `SF`, but I left the other methods alone, so
> they just use `SF` directly. Then I defined two rewrite rules:
>
> {-# RULES
> "mkSF @((), _)" forall f s. mkSF f ((), s) =
>   SF (\a s1 -> case f a ((), s1) of Step ((), s2) b -> Step s2 b) s
> "mkSF @(_, ())" forall f s. mkSF f (s, ()) =
>   SF (\a s1 -> case f a (s1, ()) of Step (s2, ()) b -> Step s2 b) s
> #-}
>
> That’s it. These two rules alone are enough to eliminate the redundant
> tupling. Now the optimized version of `mapMaybeSF` is beautiful!
>
> mapMaybeSF = \ @ a @ b f -> case f of { SF @ s f2 s2 ->
>   SF (\ a1 s1 -> case a1 of {
>Nothing -> case s1 of dt { __DEFAULT -> Step dt Nothing }
>Just x -> case f2 x s1 of {
>  Step s2' c1 -> Step s2' (Just c1) }})
>  s2 }
>
> So unless this breaks down in some larger situation I’m not aware of, I
> think this solves my problem without the need for any fancy SpecConstr
> shenanigans. Many thanks to you, Sebastian, for pointing me in the right
> direction!
>
> Alexis
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: Fusing loops by specializing on functions with SpecConstr?

2020-04-03 Thread Alexis King

I fiddled with alternative representations for a while and didn’t make
any progress—it was too easy to end up with code explosion in the
presence of any unknown calls—but I seem to have found a RULES-based
approach that works very well on the examples I’ve tried. It’s quite
simple, which makes it especially appealing!

I started by defining a wrapper around the `SF` constructor to attach
rules to:

mkSF :: (a -> s -> Step s b) -> s -> SF a b
mkSF = SF
{-# INLINE CONLIKE [1] mkSF #-}

I then changed the definitions of (.), (***), (&&&), (+++), and (|||)
to use `mkSF` instead of `SF`, but I left the other methods alone, so
they just use `SF` directly. Then I defined two rewrite rules:

{-# RULES
"mkSF @((), _)" forall f s. mkSF f ((), s) =
  SF (\a s1 -> case f a ((), s1) of Step ((), s2) b -> Step s2 b) s
"mkSF @(_, ())" forall f s. mkSF f (s, ()) =
  SF (\a s1 -> case f a (s1, ()) of Step (s2, ()) b -> Step s2 b) s
#-}
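
For reference, the `(.)` that feeds these rules looks roughly like this (a
sketch, not my exact code); composition pairs the two states, which is where
the `((), s)` and `(s, ())` arguments the rules match on come from whenever
one side is stateless:

{-# LANGUAGE GADTs #-}

data Step s a = Step !s a

data SF a b where
  SF :: !(a -> s -> Step s b) -> !s -> SF a b

-- The composite state is the pair of both states, so composing with a
-- stateless arrow (state ()) produces exactly the shapes the rules clean up.
compSF :: SF b c -> SF a b -> SF a c
compSF (SF g s2) (SF f s1) = mkSF h (s1, s2)
  where
    h a (t1, t2) = case f a t1 of
      Step t1' b -> case g b t2 of
        Step t2' c -> Step (t1', t2') c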

That’s it. These two rules alone are enough to eliminate the redundant
tupling. Now the optimized version of `mapMaybeSF` is beautiful!

mapMaybeSF = \ @ a @ b f -> case f of { SF @ s f2 s2 ->
  SF (\ a1 s1 -> case a1 of {
   Nothing -> case s1 of dt { __DEFAULT -> Step dt Nothing }
   Just x -> case f2 x s1 of {
 Step s2' c1 -> Step s2' (Just c1) }})
 s2 }
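
For reference, the consumer of this value is a driver in the style of
Sebastian's `runSF2` from earlier in the thread (again a sketch, not my
exact code):

runSF :: SF a b -> a -> (b, SF a b)
runSF (SF f s) a = case f a s of
  Step s' b -> (b, SF f s')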

So unless this breaks down in some larger situation I’m not aware of, I
think this solves my problem without the need for any fancy SpecConstr
shenanigans. Many thanks to you, Sebastian, for pointing me in the right
direction!

Alexis
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


RE: Fusing loops by specializing on functions with SpecConstr?

2020-04-01 Thread Simon Peyton Jones via ghc-devs
I have started a wiki page for join points here
https://gitlab.haskell.org/ghc/ghc/-/wikis/Join-points-in-GHC

Do add to it

Simon

| -----Original Message-----
| From: ghc-devs  On Behalf Of Joachim
| Breitner
| Sent: 01 April 2020 19:37
| To: ghc-devs 
| Subject: Re: Fusing loops by specializing on functions with SpecConstr?
| 
| Hi,
| 
| I think most of the docs about exitification are the notes in the Exitify
| module, and then there is the original ticket at
| https://gitlab.haskell.org/ghc/ghc/issues/14152
| 
| I don’t immediately see the connection to SpecConstr on function values,
| though, so I don't really know what’s tickling your neurons right now.
| 
| Cheers,
| Joachim
| 
| 
| Am Dienstag, den 31.03.2020, 22:49 + schrieb Simon Peyton Jones:
| > Joachim: this conversation is triggering some hind-brain neurons
| > related to exitification, or something like that.  I recall that we
| > discovered we could get some surprising fusion of recursive functions
| > expressed as  join points.  Something like   f . g . h
| > where h loops for a while and returns, and same for g and f.  Then the
| > call to g landed up in the return branch of h, and same for f.
| >
| > But I can’t find anything in writing.  The Exitify module doesn’t say
| much. I thought we had a wiki page but I can’t find it.  Can you remember?
| >
| > Thanks
| >
| > Simon
| >
| > From: Alexis King 
| > Sent: 31 March 2020 22:18
| > To: Sebastian Graf ; Simon Peyton Jones
| > 
| > Cc: ghc-devs 
| > Subject: Re: Fusing loops by specializing on functions with SpecConstr?
| >
| > Sebastian and Simon,
| >
| > Thank you both for your responses—they are all quite helpful! I agree
| with both of you that figuring out how to do this kind of specialization
| without any guidance from the programmer seems rather intractable. It’s
| too hard to divine where it would actually be beneficial, and even if you
| could, it seems likely that other optimizations would get in the way of it
| actually working out.
| >
| > I’ve been trying to figure out if it would be possible to help the
| optimizer out by annotating the program with special combinators like the
| existing ones provided by GHC.Magic. However, I haven’t been able to come
| up with anything yet that seems like it would actually work.
| >
| > > On Mar 31, 2020, at 06:12, Simon Peyton Jones 
| wrote:
| > >
| > > Wow – tricky stuff!   I would never have thought of trying to optimise
| that program, but it’s fascinating that you get lots and lots of them from
| FRP.
| >
| >
| > For context, the reason you get all these tiny loops is that arrowized
| FRP uses the Arrow and ArrowChoice interfaces to build its programs, and
| those interfaces use tiny combinator functions like these:
| >
| > first :: Arrow a => a b c -> a (b, d) (c, d)
| > (***) :: Arrow a => a b d -> a c e -> a (b, c) (d, e)
| > (|||) :: ArrowChoice a => a b d -> a c d -> a (Either b c) d
| >
| > This means you end up with programs built out of dozens or hundreds of
| > uses of these tiny combinators. You get code that looks like
| >
| > first (left (arr f >>> g ||| right h) *** second i)
| >
| > and this is a textbook situation where you want to specialize and inline
| all the combinators! For arrows without this tricky recursion, doing that
| works as intended, and GHC’s simplifier will do what it’s supposed to, and
| you get fast code.
| >
| > But with FRP, each of these combinators is recursive. This means you
| often get really awful code that looks like this:
| >
| > arr (\case { Nothing -> Left (); Just x -> Right x }) >>> (f |||
| > g)
| >
| > This converts a Maybe to an Either, then branches on it. It’s analogous
| to writing something like this in direct-style code:
| >
| > let y = case x of { Nothing -> Left (); Just x -> Right x }
| > in case y of { Left () -> f; Right x -> g x }
| >
| > We really want the optimizer to eliminate the intermediate Either and
| just branch on it directly, and if GHC could fuse these tiny recursive
| loops, it could! But without that, all this pointless shuffling of values
| around remains in the optimized program.
| >
| >
| > >  I wonder whether it’d be possible to adjust the FRP library to
| generate easier-to-optimise code. Probably not, but worth asking.
| >
| >
| > I think it’s entirely possible to somehow annotate these combinators
| > to communicate this information to the optimizer, but I don’t know what
| > the annotations ought to look like. (That’s the research part!)

RE: Fusing loops by specializing on functions with SpecConstr?

2020-04-01 Thread Simon Peyton Jones via ghc-devs
Thanks.  Perhaps I was thinking of Section 5 of the join-point paper
https://www.microsoft.com/en-us/research/publication/compiling-without-continuations/

That's about compositions of tiny tail recursive loops.  Alexis, just 
conceivably this might be relevant to your thinking on FRP ... but I'm waving 
my arms here so might be wide of the mark.

Simon

| -----Original Message-----
| From: ghc-devs  On Behalf Of Joachim
| Breitner
| Sent: 01 April 2020 19:37
| To: ghc-devs 
| Subject: Re: Fusing loops by specializing on functions with SpecConstr?
| 
| Hi,
| 
| I think most of the docs about exitification are the notes in the Exitify
| module, and then there is the original ticket at
| https://gitlab.haskell.org/ghc/ghc/issues/14152
| 
| I don’t immediately see the connection to SpecConstr on function values,
| though, so I don't really know what’s tickling your neurons right now.
| 
| Cheers,
| Joachim
| 
| 
| Am Dienstag, den 31.03.2020, 22:49 + schrieb Simon Peyton Jones:
| > Joachim: this conversation is triggering some hind-brain neurons
| > related to exitification, or something like that.  I recall that we
| > discovered we could get some surprising fusion of recursive functions
| > expressed as  join points.  Something like   f . g . h
| > where h loops for a while and returns, and same for g and f.  Then the
| > call to g landed up in the return branch of h, and same for f.
| >
| > But I can’t find anything in writing.  The Exitify module doesn’t say
| much. I thought we had a wiki page but I can’t find it.  Can you remember?
| >
| > Thanks
| >
| > Simon
| >
| > From: Alexis King 
| > Sent: 31 March 2020 22:18
| > To: Sebastian Graf ; Simon Peyton Jones
| > 
| > Cc: ghc-devs 
| > Subject: Re: Fusing loops by specializing on functions with SpecConstr?
| >
| > Sebastian and Simon,
| >
| > Thank you both for your responses—they are all quite helpful! I agree
| with both of you that figuring out how to do this kind of specialization
| without any guidance from the programmer seems rather intractable. It’s
| too hard to divine where it would actually be beneficial, and even if you
| could, it seems likely that other optimizations would get in the way of it
| actually working out.
| >
| > I’ve been trying to figure out if it would be possible to help the
| optimizer out by annotating the program with special combinators like the
| existing ones provided by GHC.Magic. However, I haven’t been able to come
| up with anything yet that seems like it would actually work.
| >
| > > On Mar 31, 2020, at 06:12, Simon Peyton Jones 
| wrote:
| > >
| > > Wow – tricky stuff!   I would never have thought of trying to optimise
| that program, but it’s fascinating that you get lots and lots of them from
| FRP.
| >
| >
| > For context, the reason you get all these tiny loops is that arrowized
| FRP uses the Arrow and ArrowChoice interfaces to build its programs, and
| those interfaces use tiny combinator functions like these:
| >
| > first :: Arrow a => a b c -> a (b, d) (c, d)
| > (***) :: Arrow a => a b d -> a c e -> a (b, c) (d, e)
| > (|||) :: ArrowChoice a => a b d -> a c d -> a (Either b c) d
| >
| > This means you end up with programs built out of dozens or hundreds of
| > uses of these tiny combinators. You get code that looks like
| >
| > first (left (arr f >>> g ||| right h) *** second i)
| >
| > and this is a textbook situation where you want to specialize and inline
| all the combinators! For arrows without this tricky recursion, doing that
| works as intended, and GHC’s simplifier will do what it’s supposed to, and
| you get fast code.
| >
| > But with FRP, each of these combinators is recursive. This means you
| often get really awful code that looks like this:
| >
| > arr (\case { Nothing -> Left (); Just x -> Right x }) >>> (f |||
| > g)
| >
| > This converts a Maybe to an Either, then branches on it. It’s analogous
| to writing something like this in direct-style code:
| >
| > let y = case x of { Nothing -> Left (); Just x -> Right x }
| > in case y of { Left () -> f; Right x -> g x }
| >
| > We really want the optimizer to eliminate the intermediate Either and
| just branch on it directly, and if GHC could fuse these tiny recursive
| loops, it could! But without that, all this pointless shuffling of values
| around remains in the optimized program.
| >
| >
| > >  I wonder whether it’d be possible to adjust the FRP library to
| > > generate easier-to-optimise code. Probably not, but worth asking.

Re: Fusing loops by specializing on functions with SpecConstr?

2020-04-01 Thread Joachim Breitner
Hi,

I think most of the docs about exitification are the notes in the
Exitify module, and then there is the original ticket at
https://gitlab.haskell.org/ghc/ghc/issues/14152

I don’t immediately see the connection to SpecConstr on function
values, though, so I don't really know what’s tickling your neurons
right now.

Cheers,
Joachim


Am Dienstag, den 31.03.2020, 22:49 + schrieb Simon Peyton Jones:
> Joachim: this conversation is triggering some hind-brain neurons
> related to exitification, or something like that.  I recall that we
> discovered we could get some surprising fusion of recursive functions
> expressed as  join points.  Something like   f . g . h
> where h loops for a while and returns, and same for g and f.  Then
> the call to g landed up in the return branch of h, and same for f.
>  
> But I can’t find anything in writing.  The Exitify module doesn’t say much. I 
> thought we had a wiki page but I can’t find it.  Can you remember?
>  
> Thanks
>  
> Simon
>  
> From: Alexis King  
> Sent: 31 March 2020 22:18
> To: Sebastian Graf ; Simon Peyton Jones 
> 
> Cc: ghc-devs 
> Subject: Re: Fusing loops by specializing on functions with SpecConstr?
>  
> Sebastian and Simon,
>  
> Thank you both for your responses—they are all quite helpful! I agree with 
> both of you that figuring out how to do this kind of specialization without 
> any guidance from the programmer seems rather intractable. It’s too hard to 
> divine where it would actually be beneficial, and even if you could, it seems 
> likely that other optimizations would get in the way of it actually working 
> out.
>  
> I’ve been trying to figure out if it would be possible to help the optimizer 
> out by annotating the program with special combinators like the existing ones 
> provided by GHC.Magic. However, I haven’t been able to come up with anything 
> yet that seems like it would actually work.
>  
> > On Mar 31, 2020, at 06:12, Simon Peyton Jones  wrote:
> >  
> > Wow – tricky stuff!   I would never have thought of trying to optimise that 
> > program, but it’s fascinating that you get lots and lots of them from FRP.
> 
>  
> For context, the reason you get all these tiny loops is that arrowized FRP 
> uses the Arrow and ArrowChoice interfaces to build its programs, and those 
> interfaces use tiny combinator functions like these:
>  
> first :: Arrow a => a b c -> a (b, d) (c, d)
> (***) :: Arrow a => a b d -> a c e -> a (b, c) (d, e)
> (|||) :: ArrowChoice a => a b d -> a c d -> a (Either b c) d
>  
> This means you end up with programs built out of dozens or hundreds of uses 
> of these tiny combinators. You get code that looks like
>  
> first (left (arr f >>> g ||| right h) *** second i)
>  
> and this is a textbook situation where you want to specialize and inline all 
> the combinators! For arrows without this tricky recursion, doing that works 
> as intended, and GHC’s simplifier will do what it’s supposed to, and you get 
> fast code.
>  
> But with FRP, each of these combinators is recursive. This means you often 
> get really awful code that looks like this:
>  
> arr (\case { Nothing -> Left (); Just x -> Right x }) >>> (f ||| g)
>  
> This converts a Maybe to an Either, then branches on it. It’s analogous to 
> writing something like this in direct-style code:
>  
> let y = case x of { Nothing -> Left (); Just x -> Right x }
> in case y of { Left () -> f; Right x -> g x }
>  
> We really want the optimizer to eliminate the intermediate Either and just 
> branch on it directly, and if GHC could fuse these tiny recursive loops, it 
> could! But without that, all this pointless shuffling of values around 
> remains in the optimized program.
> 
> 
> >  I wonder whether it’d be possible to adjust the FRP library to generate 
> > easier-to-optimise code. Probably not, but worth asking.
> 
>  
> I think it’s entirely possible to somehow annotate these combinators to 
> communicate this information to the optimizer, but I don’t know what the 
> annotations ought to look like. (That’s the research part!)
>  
> But I’m not very optimistic about getting the library to generate 
> easier-to-optimize code with the tools available today. Sebastian’s example 
> of SF2 and stream fusion sort of works, but in my experience, something like 
> that doesn’t handle enough cases well enough to work on real arrow programs.
>  
> >  Unrolling one layer of a recursive function.  That seems harder: how we 
> > know to *stop* unrolling as we successively simplify?  One idea: do one 
> > layer of unrolling by hand, perhaps even in FRP source code:
> > add1rec = SF (\a -> let !b = a+1 in (b,add1rec))
> > add1 = SF (\a -> let !b = a+1 in (b,add1rec))

Re: Fusing loops by specializing on functions with SpecConstr?

2020-04-01 Thread Alexis King
> On Apr 1, 2020, at 03:21, Sebastian Graf  wrote:
> 
> That is indeed true. But note that as long as you manage to inline 
> `mapMaybeSF`, the final `runSF` will only allocate once on the "edge" of each 
> iteration, all intermediate allocations will have been fused away. But the 
> allocation of these non-sense records seems unfortunate.

Yes, that is technically true, but note that even if we inline mapMaybeSF, 
those nonsense records don’t go away, they just bubble up to the “fringe” of 
the enclosing computation. And consider how tiny mapMaybeSF is: I shudder to 
think how enormous that “fringe” would be for a large program written in SF!

(And of course, nothing prevents the runSF itself from appearing in a 
loop—quite probable, in fact, given its use in the hypothetical `lazy` 
combinator.)

> So this already seems quite brittle. Maybe a very targeted optimisation that 
> gets rid of the boring ((), _) wrappers could be worthwhile, given that a 
> potential caller is never able to construct such a thing themselves. But that 
> very much hinges on being able to prove that in fact every such ((), _) 
> constructed in the function itself terminates.

Yes, that is a good point. I concede that seems much less tractable than I had 
initially hoped.

Still, as you suggest, it does seem plausible that a different encoding could 
avoid this problem. I will experiment with a few different things and get back 
to you if I find anything interesting (assuming you don’t beat me to it first!).
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: Fusing loops by specializing on functions with SpecConstr?

2020-04-01 Thread Sebastian Graf
>
> Looking at the optimized core, it’s true that the conversion of Maybe to
> Either and back again gets eliminated, which is wonderful! But what’s less
> wonderful is the value passed around through `s`:
>
> mapMaybeSF
>   = \ (@ a) (@ b) (f :: SF a b) ->
>   case f of { SF @ s f2 s2 ->
>   SF
> (\ (a1 :: Maybe a) (ds2 :: ((), ((), (((), (((), (((), s),
> ())), ((), ((), (), ((), ()) ->
>

That is indeed true. But note that as long as you manage to inline
`mapMaybeSF`, the final `runSF` will only allocate once on the "edge" of
each iteration, all intermediate allocations will have been fused away. But
the allocation of these non-sense records seems unfortunate.

Optimisation-wise, I see two problems here:

   1. `mapMaybeSF` is already too huge to inline without INLINE. That is
   because its lambda isn't floated out to the top-level, which is because of
   the existential @s (that shouldn't be a problem), but also its mention of
   f2. The fact that f2 occurs free rather than as an argument makes the
   simplifier specialise `mapMaybeSF` for it, so if it were floated out
   (thereby necessarily lambda-lifted) to top-level, then we'd lose the
   ability to specialise without SpecConstr (which currently only applies to
   recursive functions anyway).
   2. The lambda isn't let-bound (which is probably a consequence of the
   previous point), so it isn't strictness analysed and we have no W/W split.
   If we had, I imagine we would have a worker of type `s -> ...` here. W/W is
   unnecessary if we manage to inline the function anyway, but I'm pretty
   certain we won't inline for larger programs (like `mapMaybeSF` already), in
   which case every failure to inline leaves behind such a residue of records.

So this already seems quite brittle. Maybe a very targeted optimisation
that gets rid of the boring ((), _) wrappers could be worthwhile, given
that a potential caller is never able to construct such a thing themselves.
But that very much hinges on being able to prove that in fact every such
((), _) constructed in the function itself terminates.

There are a few ways I can think of in which we as the programmer could
have been smarter, though:


   - Simply by specialising `SF` for the `()` case:

   data SF a b where
 SFState :: !(a -> s -> Step s b) -> !s -> SF a b
 SFNoState :: !(a -> Step () b) -> SF a b

   And then implementing every action 2^n times, where n is the number of
   `SF` arguments. That undoubtedly leads to even more code bloat.
   - An alternative that I'm a little uncertain would play out would be

   data SMaybe a = SNothing | SJust !a
   data SF a b where
  SF :: !(SMaybe (s :~: ())) -> !(a -> s -> Step s b) -> !s -> SF a b

   and try to match on the proof everywhere it is needed to justify, e.g.,
   storing only s1 instead of (s1, s2) in `(.)`. Basically, do some type
   algebra in the implementation (a rough sketch follows after this list).
   - An even simpler thing would be to somehow use `Void#` (which should
   have been named `Unit#`), but I think that doesn't work due to runtime rep
   polymorphism restrictions.
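
To make the `SMaybe` proof idea concrete, here is a rough, unchecked sketch
(with assumed names) of how `(.)` could match on the proofs to avoid pairing
a () state with a real one:

{-# LANGUAGE GADTs #-}

import Data.Type.Equality ((:~:) (..))

data Step s a = Step !s a
data SMaybe a = SNothing | SJust !a

data SF a b where
  SF :: !(SMaybe (s :~: ())) -> !(a -> s -> Step s b) -> !s -> SF a b

compSF :: SF b c -> SF a b -> SF a c
-- outer state is provably (): keep only the inner state
compSF (SF (SJust Refl) g _) (SF p1 f s1) =
  SF p1 (\a t1 -> case f a t1 of
           Step t1' b -> case g b () of
             Step () c -> Step t1' c) s1
-- inner state is provably (): keep only the outer state
compSF (SF p2 g s2) (SF (SJust Refl) f _) =
  SF p2 (\a t2 -> case f a () of
           Step () b -> case g b t2 of
             Step t2' c -> Step t2' c) s2
-- general case: pair the states; a pair is never (), hence SNothing
compSF (SF _ g s2) (SF _ f s1) =
  SF SNothing (\a (t1, t2) -> case f a t1 of
                 Step t1' b -> case g b t2 of
                   Step t2' c -> Step (t1', t2') c) (s1, s2)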

I think there is lots that can be done to tune this idea.

On Wed, 1 Apr 2020 at 01:16, Alexis King wrote:

> > On Mar 31, 2020, at 17:05, Sebastian Graf  wrote:
> >
> > Yeah, SPEC is quite unreliable, because IIRC at some point it's either
> consumed or irrelevant. But none of the combinators you mentioned should
> rely on SpecConstr! They are all non-recursive, so the Simplifier will take
> care of "specialisation". And it works just fine, I just tried it
>
> Ah! You are right, I did not read carefully enough and misinterpreted.
> That approach is clever, indeed. I had tried something similar with a CPS
> encoding, but the piece I was missing was using the existential to tie the
> final knot.
>
> I have tried it out on some of my experiments. It’s definitely a
> significant improvement, but it isn’t perfect. Here’s a small example:
>
> mapMaybeSF :: SF a b -> SF (Maybe a) (Maybe b)
> mapMaybeSF f = proc v -> case v of
>   Just x -> do
> y <- f -< x
> returnA -< Just y
>   Nothing -> returnA -< Nothing
>
> Looking at the optimized core, it’s true that the conversion of Maybe to
> Either and back again gets eliminated, which is wonderful! But what’s less
> wonderful is the value passed around through `s`:
>
> mapMaybeSF
>   = \ (@ a) (@ b) (f :: SF a b) ->
>   case f of { SF @ s f2 s2 ->
>   SF
> (\ (a1 :: Maybe a) (ds2 :: ((), ((), (((), (((), (((), s),
> ())), ((), ((), (), ((), ()) ->
>
> Yikes! GHC has no obvious way to clean this type up, so it will just grow
> indefinitely, and we end up doing a dozen pattern-matches in the body
> followed by another dozen allocations, just wrapping and unwrapping tuples.
>
> Getting rid of that seems probably a lot more tractable than fusing the
> recursive loops, but I’m still not immediately certain how to do it. GHC
> would have to so

Re: Fusing loops by specializing on functions with SpecConstr?

2020-04-01 Thread Harendra Kumar
On Wed, 1 Apr 2020 at 02:49, Alexis King  wrote:

>
> I’ve been trying to figure out if it would be possible to help the
> optimizer out by annotating the program with special combinators like the
> existing ones provided by GHC.Magic. However, I haven’t been able to come
> up with anything yet that seems like it would actually work.
>

You may want to take a look at https://github.com/composewell/fusion-plugin
which uses annotations to help GHC fuse, not specifically what you want but
might possibly be relevant to your work.
https://github.com/composewell/streamly relies heavily on case-of-case and
SpecConstr for stream fusion. There are several cases that GHC is unable to
fuse currently. We use a "Fuse" annotation to tell GHC that any function
involving this type must be inlined so that fusion can occur reliably. With
the help of fusion-plugin we have been able to fuse almost every known case
in streamly till now.
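
For reference, the annotation usage looks roughly like this in streamly
(going by the fusion-plugin README; check the repository for the
authoritative form):

import Fusion.Plugin.Types (Fuse (..))

data Step s a = Yield a s | Skip s | Stop
{-# ANN type Step Fuse #-}

The plugin then forces inlining of functions whose types involve an
annotated type such as Step, so that case-of-case can fire reliably.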

-harendra
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: Fusing loops by specializing on functions with SpecConstr?

2020-03-31 Thread Alexis King
> On Mar 31, 2020, at 17:05, Sebastian Graf  wrote:
> 
> Yeah, SPEC is quite unreliable, because IIRC at some point it's either 
> consumed or irrelevant. But none of the combinators you mentioned should rely 
> on SpecConstr! They are all non-recursive, so the Simplifier will take care 
> of "specialisation". And it works just fine, I just tried it

Ah! You are right, I did not read carefully enough and misinterpreted. That 
approach is clever, indeed. I had tried something similar with a CPS encoding, 
but the piece I was missing was using the existential to tie the final knot.

I have tried it out on some of my experiments. It’s definitely a significant 
improvement, but it isn’t perfect. Here’s a small example:

mapMaybeSF :: SF a b -> SF (Maybe a) (Maybe b)
mapMaybeSF f = proc v -> case v of
  Just x -> do
y <- f -< x
returnA -< Just y
  Nothing -> returnA -< Nothing

Looking at the optimized core, it’s true that the conversion of Maybe to Either 
and back again gets eliminated, which is wonderful! But what’s less wonderful 
is the value passed around through `s`:

mapMaybeSF
  = \ (@ a) (@ b) (f :: SF a b) ->
  case f of { SF @ s f2 s2 ->
  SF
(\ (a1 :: Maybe a) (ds2 :: ((), ((), (((), (((), (((), s), ())), 
((), ((), (), ((), ()) ->

Yikes! GHC has no obvious way to clean this type up, so it will just grow 
indefinitely, and we end up doing a dozen pattern-matches in the body followed 
by another dozen allocations, just wrapping and unwrapping tuples.

Getting rid of that seems probably a lot more tractable than fusing the 
recursive loops, but I’m still not immediately certain how to do it. GHC would 
have to somehow deduce that `s` is existentially-bound, so it can rewrite 
something like

SF (\a ((), x) -> ... Yield ((), y) b ...) ((), s)

to

SF (\a x -> ... Yield y b) s

by parametricity. Is that an unreasonable ask? I don’t know!

Another subtlety I considered involves recursive arrows, where I currently 
depend on laziness in (|||). Here’s one example:

mapSF :: SF a b -> SF [a] [b]
mapSF f = proc xs -> case xs of
  x:xs -> do
y <- f -< x
ys <- mapSF f -< xs
returnA -< (y:ys)
  [] -> returnA -< []

Currently, GHC will just compile this to `mapSF f = mapSF f` under your 
implementation, since (|||) and (>>>) are both strict. However, I think this is 
not totally intractable—we can easily introduce an explicit `lazy` combinator 
to rein in strictness:

lazy :: SF a b -> SF a b
lazy sf0 = SF g (Unit sf0) where
  g a (Unit sf1) = case runSF sf1 a of
(b, sf2) -> Yield (Unit sf2) b
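
(`Unit` here is presumably just a lazy one-field box, along the lines of

data Unit a = Unit a  -- the field is deliberately lazy

since a strict field would force the recursive sf0 when the state is built
and reintroduce the very loop `lazy` is meant to break.)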

And now we can write `lazy (mapSF f)` at the point of the recursive call to 
avoid the infinite loop. This defeats some optimizations, of course, but 
`mapSF` is fundamentally recursive, so there’s only so much we can really 
expect.

So perhaps my needs here are less ambitious, after all! Getting rid of all 
those redundant tuples is my next question, but that’s rather unrelated from 
what we’ve been talking about so far.

Alexis
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


RE: Fusing loops by specializing on functions with SpecConstr?

2020-03-31 Thread Simon Peyton Jones via ghc-devs
Joachim: this conversation is triggering some hind-brain neurons related to 
exitification, or something like that.  I recall that we discovered we could 
get some surprising fusion of recursive functions expressed as  join points.  
Something like   f . g . h
where h loops for a while and returns, and same for g and f.  Then the call to 
g landed up in the return branch of h, and same for f.

But I can’t find anything in writing.  The Exitify module doesn’t say much. I 
thought we had a wiki page but I can’t find it.  Can you remember?

Thanks

Simon

From: Alexis King 
Sent: 31 March 2020 22:18
To: Sebastian Graf ; Simon Peyton Jones 

Cc: ghc-devs 
Subject: Re: Fusing loops by specializing on functions with SpecConstr?

Sebastian and Simon,

Thank you both for your responses—they are all quite helpful! I agree with both 
of you that figuring out how to do this kind of specialization without any 
guidance from the programmer seems rather intractable. It’s too hard to divine 
where it would actually be beneficial, and even if you could, it seems likely 
that other optimizations would get in the way of it actually working out.

I’ve been trying to figure out if it would be possible to help the optimizer 
out by annotating the program with special combinators like the existing ones 
provided by GHC.Magic. However, I haven’t been able to come up with anything 
yet that seems like it would actually work.

On Mar 31, 2020, at 06:12, Simon Peyton Jones <simo...@microsoft.com> wrote:

Wow – tricky stuff!   I would never have thought of trying to optimise that 
program, but it’s fascinating that you get lots and lots of them from FRP.

For context, the reason you get all these tiny loops is that arrowized FRP uses 
the Arrow and ArrowChoice interfaces to build its programs, and those 
interfaces use tiny combinator functions like these:

first :: Arrow a => a b c -> a (b, d) (c, d)
(***) :: Arrow a => a b d -> a c e -> a (b, c) (d, e)
(|||) :: ArrowChoice a => a b d -> a c d -> a (Either b c) d

This means you end up with programs built out of dozens or hundreds of uses of 
these tiny combinators. You get code that looks like

first (left (arr f >>> g ||| right h) *** second i)

and this is a textbook situation where you want to specialize and inline all 
the combinators! For arrows without this tricky recursion, doing that works as 
intended, and GHC’s simplifier will do what it’s supposed to, and you get fast 
code.

But with FRP, each of these combinators is recursive. This means you often get 
really awful code that looks like this:

arr (\case { Nothing -> Left (); Just x -> Right x }) >>> (f ||| g)

This converts a Maybe to an Either, then branches on it. It’s analogous to 
writing something like this in direct-style code:

let y = case x of { Nothing -> Left (); Just x -> Right x }
in case y of { Left () -> f; Right x -> g x }

We really want the optimizer to eliminate the intermediate Either and just 
branch on it directly, and if GHC could fuse these tiny recursive loops, it 
could! But without that, all this pointless shuffling of values around remains 
in the optimized program.



  *   I wonder whether it’d be possible to adjust the FRP library to generate 
easier-to-optimise code. Probably not, but worth asking.

I think it’s entirely possible to somehow annotate these combinators to 
communicate this information to the optimizer, but I don’t know what the 
annotations ought to look like. (That’s the research part!)

But I’m not very optimistic about getting the library to generate 
easier-to-optimize code with the tools available today. Sebastian’s example of 
SF2 and stream fusion sort of works, but in my experience, something like that 
doesn’t handle enough cases well enough to work on real arrow programs.


 *   Unrolling one layer of a recursive function.  That seems harder: how 
we know to *stop* unrolling as we successively simplify?  One idea: do one 
layer of unrolling by hand, perhaps even in FRP source code:
add1rec = SF (\a -> let !b = a+1 in (b,add1rec))
add1 = SF (\a -> let !b = a+1 in (b,add1rec))

Yes, I was playing with the idea at one point of some kind of RULE that inserts 
GHC.Magic.inline on the specialized RHS. That way the programmer could ask for 
the unrolling explicitly, as otherwise it seems unreasonable to ask the 
compiler to figure it out.

On Mar 31, 2020, at 08:08, Sebastian Graf <sgraf1...@gmail.com> wrote:

We can formulate SF as a classic Stream that needs an `a` to produce its next 
element of type `b` like this (SF2 below)

This is a neat trick, though I’ve had trouble getting it to work reliably in my 
experiments (even though I was using GHC.Types.SPEC). That said, I also feel 
like I don’t understand the subtleties of SpecConstr very well, so it could 
have been my fault.

The more fundamental problem I’ve found with that approach is that it
doesn’t do very well for arrow combinators like (***) and (|||), which come
up very often in arrow programs but rarely in streaming.

Re: Fusing loops by specializing on functions with SpecConstr?

2020-03-31 Thread Sebastian Graf
>
> This is a neat trick, though I’ve had trouble getting it to work reliably
> in my experiments (even though I was using GHC.Types.SPEC). That said, I
> also feel like I don’t understand the subtleties of SpecConstr very well,
> so it could have been my fault.
>

Yeah, SPEC is quite unreliable, because IIRC at some point it's either
consumed or irrelevant. But none of the combinators you mentioned should
rely on SpecConstr! They are all non-recursive, so the Simplifier will take
care of "specialisation". And it works just fine, I just tried it:

https://gist.github.com/sgraf812/d15cd3ee9cc9bd2e72704f90567ef35b

`test` there is optimised reasonably well. The problem is that we don't
have the concrete a..f so we can't cancel away all allocations.
If you give me a closed program where we fail to optimise away every bit of
allocation (and it isn't due to size concerns), then I would be surprised.
Although there might be a bug in how I encoded the streams, maybe we can be
a bit stricter here or there if need be.

`test2 = (double &&& inc) >>> arr (uncurry (+)) :: SF Int Int` is such a
function that we optimise down to (the equivalent of) `arr (\n -> 3*n+1)`.
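
(Assuming the obvious definitions from the gist, something like

double, inc :: SF Int Int
double = arr (* 2)
inc    = arr (+ 1)

the pipeline maps n to 2*n + (n + 1), i.e. 3*n + 1, which matches the
optimised result.)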

Maybe you can give a medium-sized program where you think GHC does a poor
job at optimising?

On Tue, 31 Mar 2020 at 23:18, Alexis King <lexi.lam...@gmail.com> wrote:

> Sebastian and Simon,
>
> Thank you both for your responses—they are all quite helpful! I agree with
> both of you that figuring out how to do this kind of specialization without
> any guidance from the programmer seems rather intractable. It’s too hard to
> divine where it would actually be beneficial, and even if you could, it
> seems likely that other optimizations would get in the way of it actually
> working out.
>
> I’ve been trying to figure out if it would be possible to help the
> optimizer out by annotating the program with special combinators like the
> existing ones provided by GHC.Magic. However, I haven’t been able to come
> up with anything yet that seems like it would actually work.
>
> On Mar 31, 2020, at 06:12, Simon Peyton Jones 
> wrote:
>
> Wow – tricky stuff!   I would never have thought of trying to optimise
> that program, but it’s fascinating that you get lots and lots of them from
> FRP.
>
>
> For context, the reason you get all these tiny loops is that arrowized FRP
> uses the Arrow and ArrowChoice interfaces to build its programs, and those
> interfaces use tiny combinator functions like these:
>
> first :: Arrow a => a b c -> a (b, d) (c, d)
> (***) :: Arrow a => a b d -> a c e -> a (b, c) (d, e)
> (|||) :: ArrowChoice a => a b d -> a c d -> a (Either b c) d
>
> This means you end up with programs built out of dozens or hundreds of
> uses of these tiny combinators. You get code that looks like
>
> first (left (arr f >>> g ||| right h) *** second i)
>
> and this is a textbook situation where you want to specialize and inline
> all the combinators! For arrows without this tricky recursion, doing that
> works as intended, and GHC’s simplifier will do what it’s supposed to, and
> you get fast code.
>
> But with FRP, each of these combinators is recursive. This means you often
> get really awful code that looks like this:
>
> arr (\case { Nothing -> Left (); Just x -> Right x }) >>> (f ||| g)
>
> This converts a Maybe to an Either, then branches on it. It’s analogous to
> writing something like this in direct-style code:
>
> let y = case x of { Nothing -> Left (); Just x -> Right x }
> in case y of { Left () -> f; Right x -> g x }
>
> We really want the optimizer to eliminate the intermediate Either and just
> branch on it directly, and if GHC could fuse these tiny recursive loops, it
> could! But without that, all this pointless shuffling of values around
> remains in the optimized program.
>
>
>- I wonder whether it’d be possible to adjust the FRP library to
>generate easier-to-optimise code. Probably not, but worth asking.
>
>
> I think it’s entirely possible to somehow annotate these combinators to
> communicate this information to the optimizer, but I don’t know what the
> annotations ought to look like. (That’s the research part!)
>
> But I’m not very optimistic about getting the library to generate
> easier-to-optimize code with the tools available today. Sebastian’s example
> of SF2 and stream fusion sort of works, but in my experience, something
> like that doesn’t handle enough cases well enough to work on real arrow
> programs.
>
>
>- Unrolling one layer of a recursive function.  That seems harder: how
>   we know to **stop** unrolling as we successively simplify?  One
>   idea: do one layer of unrolling by hand, perhaps even in FRP source 
> code:
>
> add1rec = SF (\a -> let !b = a+1 in (b,add1rec))
> add1 = SF (\a -> let !b = a+1 in (b,add1rec))
>
>
> Yes, I was playing with the idea at one point of some kind of RULE that
> inserts GHC.Magic.inline on the specialized RHS. That way the programmer
>

Re: Fusing loops by specializing on functions with SpecConstr?

2020-03-31 Thread Alexis King
Sebastian and Simon,

Thank you both for your responses—they are all quite helpful! I agree with both 
of you that figuring out how to do this kind of specialization without any 
guidance from the programmer seems rather intractable. It’s too hard to divine 
where it would actually be beneficial, and even if you could, it seems likely 
that other optimizations would get in the way of it actually working out.

I’ve been trying to figure out if it would be possible to help the optimizer 
out by annotating the program with special combinators like the existing ones 
provided by GHC.Magic. However, I haven’t been able to come up with anything 
yet that seems like it would actually work.

> On Mar 31, 2020, at 06:12, Simon Peyton Jones  wrote:
> 
> Wow – tricky stuff!   I would never have thought of trying to optimise that 
> program, but it’s fascinating that you get lots and lots of them from FRP.

For context, the reason you get all these tiny loops is that arrowized FRP uses 
the Arrow and ArrowChoice interfaces to build its programs, and those 
interfaces use tiny combinator functions like these:

first :: Arrow a => a b c -> a (b, d) (c, d)
(***) :: Arrow a => a b d -> a c e -> a (b, c) (d, e)
(|||) :: ArrowChoice a => a b d -> a c d -> a (Either b c) d

This means you end up with programs built out of dozens or hundreds of uses of 
these tiny combinators. You get code that looks like

first (left (arr f >>> g ||| right h) *** second i)

and this is a textbook situation where you want to specialize and inline all 
the combinators! For arrows without this tricky recursion, doing that works as 
intended, and GHC’s simplifier will do what it’s supposed to, and you get fast 
code.

But with FRP, each of these combinators is recursive. This means you often get 
really awful code that looks like this:

arr (\case { Nothing -> Left (); Just x -> Right x }) >>> (f ||| g)

This converts a Maybe to an Either, then branches on it. It’s analogous to 
writing something like this in direct-style code:

let y = case x of { Nothing -> Left (); Just x -> Right x }
in case y of { Left () -> f; Right x -> g x }

We really want the optimizer to eliminate the intermediate Either and just 
branch on it directly, and if GHC could fuse these tiny recursive loops, it 
could! But without that, all this pointless shuffling of values around remains 
in the optimized program.

> I wonder whether it’d be possible to adjust the FRP library to generate 
> easier-to-optimise code. Probably not, but worth asking.

I think it’s entirely possible to somehow annotate these combinators to 
communicate this information to the optimizer, but I don’t know what the 
annotations ought to look like. (That’s the research part!)

But I’m not very optimistic about getting the library to generate 
easier-to-optimize code with the tools available today. Sebastian’s example of 
SF2 and stream fusion sort of works, but in my experience, something like that 
doesn’t handle enough cases well enough to work on real arrow programs.

> Unrolling one layer of a recursive function.  That seems harder: how we know 
> to *stop* unrolling as we successively simplify?  One idea: do one layer of 
> unrolling by hand, perhaps even in FRP source code:
> add1rec = SF (\a -> let !b = a+1 in (b,add1rec))
> add1 = SF (\a -> let !b = a+1 in (b,add1rec))

Yes, I was playing with the idea at one point of some kind of RULE that inserts 
GHC.Magic.inline on the specialized RHS. That way the programmer could ask for 
the unrolling explicitly, as otherwise it seems unreasonable to ask the 
compiler to figure it out.

> On Mar 31, 2020, at 08:08, Sebastian Graf  wrote:
> 
> We can formulate SF as a classic Stream that needs an `a` to produce its next 
> element of type `b` like this (SF2 below)

This is a neat trick, though I’ve had trouble getting it to work reliably in my 
experiments (even though I was using GHC.Types.SPEC). That said, I also feel 
like I don’t understand the subtleties of SpecConstr very well, so it could 
have been my fault.

The more fundamental problem I’ve found with that approach is that it doesn’t 
do very well for arrow combinators like (***) and (|||), which come up very 
often in arrow programs but rarely in streaming. Fusing long chains of 
first/second/left/right is actually pretty easy with ordinary RULEs, but (***) 
and (|||) are much harder, since they have multiple continuations.

It seems at first appealing to rewrite `f *** g` into `first f >>> second g`, 
which solves the immediate problem, but this is actually a lot less efficient 
after repeated rewritings. You end up rewriting `(f ||| g) *** h` into `first 
(left f) >>> first (right g) >>> second h`, turning two distinct branches into 
four, and larger programs have much worse exponential blowups.
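
Spelling that out, using the standard definitions f *** g = first f >>>
second g, f +++ g = left f >>> right g, and f ||| g = f +++ g >>> arr untag
from Control.Arrow:

(f ||| g) *** h
  = first (f ||| g) >>> second h
  = first (f +++ g >>> arr untag) >>> second h
  = first (left f) >>> first (right g) >>> first (arr untag) >>> second h

Each such rewrite multiplies the number of stages, and nesting compounds it.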

So that’s where I’ve gotten stuck! I’ve been toying with the idea of thinking 
about expression “shells”, so if you have something like

first (a ||| b) >>> c ***

Re: Fusing loops by specializing on functions with SpecConstr?

2020-03-31 Thread Sebastian Graf
We can formulate SF as a classic Stream that needs an `a` to produce its
next element of type `b` like this (SF2 below):

{-# LANGUAGE BangPatterns #-}
{-# LANGUAGE GADTs #-}

module Lib where

newtype SF a b = SF { runSF :: a -> (b, SF a b) }

inc1 :: SF Int Int
inc1 = SF $ \a -> let !b = a+1 in (b, inc1)

data Step s a = Yield !s a

data SF2 a b where
  SF2 :: !(a -> s -> Step s b) -> !s -> SF2 a b

inc2 :: SF2 Int Int
inc2 = SF2 go ()
  where
go a _ = let !b = a+1 in Yield () b

runSF2 :: SF2 a b -> a -> (b, SF2 a b)
runSF2 (SF2 f s) a = case f a s of
  Yield s' b -> (b, (SF2 f s'))

Note the absence of recursion in inc2. This resolves the tension around
having to specialise for a function argument that is recursive and having
to do the unrolling. I bet that similar to stream fusion, we can arrange
that only the consumer has to be explicitly recursive. Indeed, I think this
will help you inline mapping combinators such as `second`, because it won't
be recursive itself anymore.
Now we "only" have to solve the same problems as with good old stream
fusion.

The tricky case (after realising that we need to add `Skip` to `Step` for
`filterSF2`) is when we want to optimise a signal of signals, e.g.
something like `concatMapSF2 :: (b -> SF2 a c) -> SF2 a b -> SF2 a c` or
some such. And here we are again in #855/#915.
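
To make the `Skip` point concrete, here is my guess at what `filterSF2`
forces on us; note that the driver's type has to change too, since a step
may now produce no output:

data Step s a = Yield !s a | Skip !s

filterSF2 :: (b -> Bool) -> SF2 a b -> SF2 a b
filterSF2 p (SF2 f s0) = SF2 go s0
  where
    go a s = case f a s of
      Yield s' b
        | p b       -> Yield s' b
        | otherwise -> Skip s'
      Skip s'       -> Skip s'

runSF2' :: SF2 a b -> a -> (Maybe b, SF2 a b)
runSF2' (SF2 f s) a = case f a s of
  Yield s' b -> (Just b, SF2 f s')
  Skip s'    -> (Nothing, SF2 f s')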



Also if you need convincing that we can embed any SF into SF2, look at this:

embed :: SF Int Int -> SF2 Int Int
embed origSF = SF2 go origSF
  where
go a sf = case runSF sf a of
  (b, sf') -> Yield sf' b

Please do open a ticket about this, though. It's an interesting data point!

Cheers,
Sebastian


On Tue, 31 Mar 2020 at 13:12, Simon Peyton Jones <simo...@microsoft.com> wrote:

> Wow – tricky stuff!   I would never have thought of trying to optimise
> that program, but it’s fascinating that you get lots and lots of them from
> FRP.
>
>
>
>- Don’t lose this thread!  Make a ticket, or a wiki page. If the
>former, put the main payload (including Alexis’s examples) into the
>Descriptions, not deep in the discussion.
>- I wonder whether it’d be possible to adjust the FRP library to
>generate easier-to-optimise code. Probably not, but worth asking.
>- Alexis’s proposed solution relies on
>   - Specialising on a function argument.  Clearly this must be
>   possible, and it’d be very beneficial.
>   - Unrolling one layer of a recursive function.  That seems harder:
>   how we know to **stop** unrolling as we successively simplify?  One
>   idea: do one layer of unrolling by hand, perhaps even in FRP source 
> code:
>
> add1rec = SF (\a -> let !b = a+1 in (b,add1rec))
>
> add1 = SF (\a -> let !b = a+1 in (b,add1rec))
>
>
>
> Simon
>
>
>
> *From:* ghc-devs  *On Behalf Of *Sebastian
> Graf
> *Sent:* 29 March 2020 15:34
> *To:* Alexis King 
> *Cc:* ghc-devs 
> *Subject:* Re: Fusing loops by specializing on functions with SpecConstr?
>
>
>
> Hi Alexis,
>
>
>
> I've been wondering the same things and have worked on it on and off. See
> my progress in https://gitlab.haskell.org/ghc/ghc/issues/855#note_149482
> and https://gitlab.haskell.org/ghc/ghc/issues/915#note_241520.
>
>
>
> The big problem with solving the higher-order specialisation problem
> through SpecConstr (which is what I did in my reports in #855) is indeed
> that it's hard to
>
>1. Anticipate what the rewritten program looks like without doing a
>Simplifier pass after each specialisation, so that we can see and exploit
>new specialisation opportunities. SpecConstr does use the simple Core
>optimiser but, that often is not enough IIRC (think of ArgOccs from
>recursive calls). In particular, it will not do RULE rewrites. Interleaving
>SpecConstr with the Simplifier, apart from nigh impossible conceptually, is
>computationally intractable and would quickly drift off into Partial
>Evaluation swamp.
>2. Make the RULE engine match and rewrite call sites in all call
>patterns they can apply.

RE: Fusing loops by specializing on functions with SpecConstr?

2020-03-31 Thread Simon Peyton Jones via ghc-devs
Wow – tricky stuff!   I would never have thought of trying to optimise that 
program, but it’s fascinating that you get lots and lots of them from FRP.


  *   Don’t lose this thread!  Make a ticket, or a wiki page. If the former, 
put the main payload (including Alexis’s examples) into the Descriptions, not 
deep in the discussion.
  *   I wonder whether it’d be possible to adjust the FRP library to generate 
easier-to-optimise code. Probably not, but worth asking.
  *   Alexis’s proposed solution relies on
 *   Specialising on a function argument.  Clearly this must be possible, 
and it’d be very beneficial.
 *   Unrolling one layer of a recursive function.  That seems harder: how 
we know to *stop* unrolling as we successively simplify?  One idea: do one 
layer of unrolling by hand, perhaps even in FRP source code:

add1rec = SF (\a -> let !b = a+1 in (b,add1rec))

add1 = SF (\a -> let !b = a+1 in (b,add1rec))

Simon

From: ghc-devs  On Behalf Of Sebastian Graf
Sent: 29 March 2020 15:34
To: Alexis King 
Cc: ghc-devs 
Subject: Re: Fusing loops by specializing on functions with SpecConstr?

Hi Alexis,

I've been wondering the same things and have worked on it on and off. See my 
progress in 
https://gitlab.haskell.org/ghc/ghc/issues/855#note_149482
 and
https://gitlab.haskell.org/ghc/ghc/issues/915#note_241520.

The big problem with solving the higher-order specialisation problem through 
SpecConstr (which is what I did in my reports in #855) is indeed that it's hard 
to

  1.  Anticipate what the rewritten program looks like without doing a 
Simplifier pass after each specialisation, so that we can see and exploit new 
specialisation opportunities. SpecConstr does use the simple Core optimiser 
but, that often is not enough IIRC (think of ArgOccs from recursive calls). In 
particular, it will not do RULE rewrites. Interleaving SpecConstr with the 
Simplifier, apart from nigh impossible conceptually, is computationally 
intractable and would quickly drift off into Partial Evaluation swamp.
  2.  Make the RULE engine match and rewrite call sites in all call patterns 
they can apply.
I.e., `f (\x -> Just (x +1))` calls its argument with one argument and 
scrutinises the resulting Maybe (that's what is described by the argument's 
`ArgOcc`), so that we want to specialise to a call pattern `f (\x -> Just
<hole>)`, giving rise to the specialisation `$sf ctx`, where `ctx
x` describes the `<hole>` part. In an ideal world, we want a
(higher-order pattern unification) RULE for `forall f ctx. f (\x -> Just (ctx 
x)) ==> $sf ctx`. But from what I remember, GHC's RULE engine works quite 
different from that and isn't even concerned with finding unifiers (rather than 
just matching concrete call sites without meta variables against RULEs with 
meta variables) at all.
Note that matching on specific Ids binding functions is just an approximation 
using representational equality (on the Id's Unique) rather than some sort of
more semantic equality. My latest endeavour into the matter in #915 from 
December was using types as the representational entity and type class 
specialisation. I think I got ultimately blocked on 
https://gitlab.haskell.org/ghc/ghc/issues/17548,
 but apparently I didn't document the problematic program.

Maybe my failure so far is that I want it to apply and optimise all cases and 
for more complex stream pipelines, rather than just doing a better best effort 
job.

Hope that helps. Anyway, I'm also really keen on nailing this! It's one of my 
high-risk, high-reward research topics. So if you need someone to 
collaborate/exchange ideas with, I'm happy to help!

All the best,
Sebastian

On Sun, 29 Mar 2020 at 10:39, Alexis King <lexi.lam...@gmail.com> wrote:
Hi all,

I have recently been toying with FRP, and I’ve noticed that
traditional formulations generate a lot of tiny loops that GHC does
a very poor job optimizing.

Re: Fusing loops by specializing on functions with SpecConstr?

2020-03-29 Thread Sebastian Graf
 Hi Alexis,

I've been wondering the same things and have worked on it on and off. See
my progress in https://gitlab.haskell.org/ghc/ghc/issues/855#note_149482
and https://gitlab.haskell.org/ghc/ghc/issues/915#note_241520.

The big problem with solving the higher-order specialisation problem
through SpecConstr (which is what I did in my reports in #855) is indeed
that it's hard to

   1. Anticipate what the rewritten program looks like without doing a
   Simplifier pass after each specialisation, so that we can see and exploit
   new specialisation opportunities. SpecConstr does use the simple Core
   optimiser, but that often is not enough IIRC (think of ArgOccs from
   recursive calls). In particular, it will not do RULE rewrites. Interleaving
   SpecConstr with the Simplifier, apart from being nigh impossible
   conceptually, is computationally intractable and would quickly drift off
   into Partial Evaluation swamp.
   2. Make the RULE engine match and rewrite call sites under every call
   pattern to which a RULE can apply.
   I.e., `f (\x -> Just (x + 1))` calls its argument with one argument and
   scrutinises the resulting Maybe (that's what is described by the argument's
   `ArgOcc`), so that we want to specialise to a call pattern `f (\x -> Just
   <expr>)`, giving rise to the specialisation `$sf ctx`, where `ctx x`
   describes the `<expr>` part (see the sketch after this list). In an ideal
   world, we want a (higher-order pattern unification) RULE for `forall f ctx.
   f (\x -> Just (ctx x)) ==> $sf ctx`. But from what I remember, GHC's RULE
   engine works quite differently from that and isn't even concerned with
   finding unifiers at all; it merely matches concrete call sites without
   meta variables against RULEs with meta variables.
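
To make point 2 concrete, here is a sketch of the specialisation we are
after. The names `f`, `caller`, and `sf_spec` are made up for illustration
(source Haskell cannot bind a name like `$sf`), and the RULE in the final
comment is hypothetical syntax, not something GHC accepts today:

    -- A higher-order function that calls its argument and scrutinises
    -- the resulting Maybe:
    f :: (Int -> Maybe Int) -> Int -> Int
    f g n = case g n of
      Just m  -> m * 2
      Nothing -> 0

    -- A call site matching the call pattern `f (\x -> Just <expr>)`:
    caller :: Int -> Int
    caller = f (\x -> Just (x + 1))

    -- The specialisation we would like to derive; `ctx` abstracts the
    -- `<expr>` part, and the intermediate Maybe never materialises:
    sf_spec :: (Int -> Int) -> Int -> Int
    sf_spec ctx n = ctx n * 2

    -- The hypothetical higher-order RULE:
    -- {-# RULES "f/Just" forall ctx.
    --       f (\x -> Just (ctx x)) = sf_spec ctx #-}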

Note that matching on specific Ids binding functions is just an
approximation using representational equality (on the Id's Unique) rather
than some sort of more semantic equality. My latest endeavour into the
matter in #915 from December was using types as the representational entity
and type class specialisation. I think I got ultimately blocked on
https://gitlab.haskell.org/ghc/ghc/issues/17548, but apparently I didn't
document the problematic program.
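
For a rough idea of what "types as the representational entity and type
class specialisation" could look like, here is a speculative sketch
(illustrative only; the actual #915 prototype may well differ). The known
function argument is reified as a singleton type, so GHC's ordinary
type-class Specialiser, rather than SpecConstr, gets to do the work:

    -- Reify the known function argument as a singleton type with a
    -- class method, making the consumer class-polymorphic:
    class Kont k where
      apply :: k -> Int -> Maybe Int

    data Add1 = Add1                -- stands for \x -> Just (x + 1)

    instance Kont Add1 where
      apply Add1 x = Just (x + 1)

    f :: Kont k => k -> Int -> Int
    f k n = case apply k n of
      Just m  -> m * 2
      Nothing -> 0
    {-# SPECIALISE f :: Add1 -> Int -> Int #-}
    -- The Specialiser now produces the fused version: `apply Add1`
    -- inlines and the Maybe is eliminated.

    caller :: Int -> Int
    caller = f Add1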

Maybe my failure so far is that I want it to apply to and optimise all
cases, even for more complex stream pipelines, rather than just doing a
better best-effort job.

Hope that helps. Anyway, I'm also really keen on nailing this! It's one of
my high-risk, high-reward research topics. So if you need someone to
collaborate/exchange ideas with, I'm happy to help!

All the best,
Sebastian

On Sun, 29 Mar 2020 at 10:39, Alexis King <lexi.lam...@gmail.com> wrote:

> Hi all,
>
> I have recently been toying with FRP, and I’ve noticed that
> traditional formulations generate a lot of tiny loops that GHC does
> a very poor job optimizing. Here’s a simplified example:
>
> newtype SF a b = SF { runSF :: a -> (b, SF a b) }
>
> add1_snd :: SF (String, Int) (String, Int)
> add1_snd = second add1 where
>   add1 = SF $ \a -> let !b = a + 1 in (b, add1)
>   second f = SF $ \(a, b) ->
> let !(c, f') = runSF f b
> in ((a, c), second f')
>
> Here, `add1_snd` is defined in terms of two recursive bindings,
> `add1` and `second`. Because they’re both recursive, GHC doesn’t
> know what to do with them, and the optimized program still has two
> separate recursive knots. But this is a missed optimization, as
> `add1_snd` is equivalent to the following definition, which fuses
> the two loops together and consequently has just one recursive knot:
>
> add1_snd_fused :: SF (String, Int) (String, Int)
> add1_snd_fused = SF $ \(a, b) ->
>   let !c = b + 1
>   in ((a, c), add1_snd_fused)
>
> How could GHC get from `add1_snd` to `add1_snd_fused`? In theory,
> SpecConstr could do it! Suppose we specialize `second` at the call
> pattern `second add1`:
>
> {-# RULES "second/add1" second add1 = second_add1 #-}
>
> second_add1 = SF $ \(a, b) ->
>   let !(c, f') = runSF add1 b
>   in ((a, c), second f')
>
> This doesn’t immediately look like an improvement, but we’re
> actually almost there. If we unroll `add1` once on the RHS of
> `second_add1`, the simplifier will get us the rest of the way. We’ll
> end up with
>
> let !b1 = b + 1
> !(c, f') = (b1, add1)
> in ((a, c), second f')
>
> and after substituting f' to get `second add1`, the RULE will tie
> the knot for us.
>
> This may look like small potatoes in isolation, but real programs
> can generate hundreds of these tiny, tiny loops, and fusing them
> together would be a big win. The only problem is SpecConstr doesn’t
> currently specialize on functions! The original paper, “Call-pattern
> Specialisation for Haskell Programs,” mentions this as a possibility
> in Section 6.2, but it points out that actually doing this in
> practice would be pretty tricky:
>
> > Specialising for function arguments is more slippery than for
> > constructor arguments. In the example above the argument was a
> > simple variable, but what if it was instead a lambda term? [...]

Fusing loops by specializing on functions with SpecConstr?

2020-03-29 Thread Alexis King
Hi all,

I have recently been toying with FRP, and I’ve noticed that
traditional formulations generate a lot of tiny loops that GHC does
a very poor job optimizing. Here’s a simplified example:

newtype SF a b = SF { runSF :: a -> (b, SF a b) }

add1_snd :: SF (String, Int) (String, Int)
add1_snd = second add1 where
  add1 = SF $ \a -> let !b = a + 1 in (b, add1)
  second f = SF $ \(a, b) ->
let !(c, f') = runSF f b
in ((a, c), second f')

Here, `add1_snd` is defined in terms of two recursive bindings,
`add1` and `second`. Because they’re both recursive, GHC doesn’t
know what to do with them, and the optimized program still has two
separate recursive knots. But this is a missed optimization, as
`add1_snd` is equivalent to the following definition, which fuses
the two loops together and consequently has just one recursive knot:

add1_snd_fused :: SF (String, Int) (String, Int)
add1_snd_fused = SF $ \(a, b) ->
  let !c = b + 1
  in ((a, c), add1_snd_fused)

How could GHC get from `add1_snd` to `add1_snd_fused`? In theory,
SpecConstr could do it! Suppose we specialize `second` at the call
pattern `second add1`:

{-# RULES "second/add1" second add1 = second_add1 #-}

second_add1 = SF $ \(a, b) ->
  let !(c, f') = runSF add1 b
  in ((a, c), second f')

This doesn’t immediately look like an improvement, but we’re
actually almost there. If we unroll `add1` once on the RHS of
`second_add1`, the simplifier will get us the rest of the way. We’ll
end up with

let !b1 = b + 1
!(c, f') = (b1, add1)
in ((a, c), second f')

and after substituting f' to get `second add1`, the RULE will tie
the knot for us.
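
Spelling that out: after the unrolling, `f'` is just `add1`, so `second
f'` becomes `second add1`, the RULE rewrites it to `second_add1`, and the
Simplifier cancels the intermediate pair. The result (my reconstruction
of the post-simplification code, assuming everything fires as described)
is exactly the fused loop:

    second_add1 :: SF (String, Int) (String, Int)
    second_add1 = SF $ \(a, b) ->
      let !c = b + 1
      in ((a, c), second_add1)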

This may look like small potatoes in isolation, but real programs
can generate hundreds of these tiny, tiny loops, and fusing them
together would be a big win. The only problem is SpecConstr doesn’t
currently specialize on functions! The original paper, “Call-pattern
Specialisation for Haskell Programs,” mentions this as a possibility
in Section 6.2, but it points out that actually doing this in
practice would be pretty tricky:

> Specialising for function arguments is more slippery than for
> constructor arguments. In the example above the argument was a
> simple variable, but what if it was instead a lambda term? [...]
>
> The trouble is that lambda abstractions are much more fragile than
> constructor applications, in the sense that simple transformations
> may make two abstractions look different although they have the
> same value.
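
To illustrate the fragility: all of the following denote the same
function, yet a syntactic call-pattern match would treat them as three
different arguments, so a specialisation keyed on one form misses the
others (my example, not one from the paper):

    g1, g2, g3 :: Int -> Int
    g1 = \x -> x + 1                 -- the form a RULE might be keyed on
    g2 = (+ 1)                       -- eta-reduced
    g3 = \x -> let y = 1 in x + y    -- after a binding floats inwards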

Still, the difference this could make in a program of mine is so
large that I am interested in exploring it anyway. I am wondering if
anyone has investigated this possibility any further since the paper
was published, or if anyone knows of other use cases that would
benefit from this capability.

Thanks,
Alexis
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs