Re: [Haskell-cafe] Reifying case expressions [was: Re: On stream processing, and a new release of timeplot coming]

Eugene Kirpichov Mon, 26 Dec 2011 08:18:00 -0800

Whoa. Sebastian, you're my hero — I've been struggling with defining Arrow for 
ListTransformer for a substantial time without success, and here you got it, 
dramatically simpler than I thought it could be done (I was using explicit 
queues).
I wonder if now this datatype of yours is isomorphic to StreamSummary b r -> 
StreamSummary a r.




26.12.2011, в 19:56, Sebastian Fischer <fisc...@nii.ac.jp> написал(а):

> On Sun, Dec 25, 2011 at 11:25 AM, Heinrich Apfelmus 
> <apfel...@quantentunnel.de> wrote:
> Your  StreamSummary  type has a really nice interpretation: it's a 
> reification of  case  expressions [on lists].
> 
> nice observation!
>  
> For instance, consider the following simple function from lists to integers
> 
>    length :: [a] -> Int
>    length xs = case xs of
>        []     -> 0
>        (y:ys) -> 1 + length ys
> 
> We want to reify the case expression as constructor of a data type. [...]
> 
>    data ListTo a r = CaseOf r (a -> ListTo a r)
> 
>    interpret :: ListTo a r -> ([a] -> r)
>    interpret (CaseOf nil cons) xs =
>        case xs of
>            []     -> nil
>            (y:ys) -> interpret (cons y) ys
> 
> [...]
> 
> Likewise, each function from lists can be represented in terms of our new 
> data type [...]
> 
>    length' :: ListTo a Int
>    length' = CaseOf
>        (0)
>        (\x -> fmap (1+) length')
> 
>    length = interpret length'
> 
> This version of `length` is tail recursive while the previous version is not. 
> In general, all functions defined in terms of `ListTo` and `interpret` are 
> spine strict - they return a result only after consuming all input list 
> constructors.
> 
> This is what Eugene observed when defining the identity function as
> 
>     idC = CaseOf [] (\x -> (x:) <$> idC)
> 
> This version does not work for infinite lists. Similarly, `head` and `take` 
> cannot be defined as lazily as in the standard libraries.
> 
> We can support lazier list consumers by adding a case to the ListTo type that 
> allows to stop consuming the list. To avoid confusion, I chose new names for 
> my new types.
> 
>     data ListConsumer a b
>       = Done !b
>       | Continue !b (a -> ListConsumer a b)
> 
> The interpretation function just ignores the remaining input in the case of 
> `Done`:
> 
>     consumeList :: ListConsumer a b -> [a] -> b
>     consumeList (Done b)       _      = b
>     consumeList (Continue b _) []     = b
>     consumeList (Continue _ f) (x:xs) = consumeList (f x) xs
> 
> We can define lazier versions of `head` and `take` as follows:
> 
>     headC :: ListConsumer a a
>     headC = Continue (error "head of empty list") Done
>     
>     takeC :: Int -> ListConsumer a [a]
>     takeC 0 = Done []
>     takeC n = Continue [] (\x -> (x:) <$> takeC (n-1))
> 
> However, we still cannot define a lazy version of the identity funtion with 
> list consumers. 
> 
> The identity function and `takeC` belong to a special case of a list 
> consumers because they also *produce* lists. We can define a specialized type 
> for list transformers that consume and produce lists. One advantage of this 
> specialization will be that we can define a lazy version of the identity 
> function. The transformer type can have functor and applicative instances 
> similar to the consumer type to compose transformers in parallel. 
> Additionally, it can have category and arrow instances to compose 
> transformers sequentially.
> 
> Here is a type for lazy list transformers:
> 
>     data ListTransformer a b
>       = Cut
>       | Put b (ListTransformer a b)
>       | Get (a -> ListTransformer a b)
> 
> A transformer can either cut off the input list and return the empty list, 
> output a new element before transforming the input, or consume one element 
> from the input list and transform the remaining elements. The interpretation 
> function should make this clearer:
> 
>     transformList :: ListTransformer a b -> [a] -> [b]
>     transformList Cut       _      = []
>     transformList (Put b t) xs     = b : transformList t xs
>     transformList (Get _)   []     = []
>     transformList (Get f)   (x:xs) = transformList (f x) xs
> 
> Note that, if the transformer wants to read another element that is not 
> there, it simply returns the empty list.
> 
> Now we can define a lazy identity function and another version of `take`:
> 
>     idT :: ListTransformer a a
>     idT = Get (\x -> Put x idT)
> 
>     takeT :: Int -> ListTransformer a a
>     takeT 0 = Cut
>     takeT n = Get (\x -> Put x (takeT (n-1)))
> 
> Here is another translation of a common list function:
> 
>     filterT :: (a -> Bool) -> ListTransformer a a
>     filterT p = Get (\x -> if p x then Put x (filterT p) else filterT p)
> 
> `filterT` is an example for a function that can consume several input 
> elements before producing an output element. Other examples for functions of 
> this kind are chunking functions:
> 
>     pairsT :: ListTransformer a (a,a)
>     pairsT = Get (\x -> Get (\y -> Put (x,y) pairsT))
>     
>     chunks :: Int -> ListTransformer a [a]
>     chunks n = collect n
>      where
>       collect 0 = Put [] (chunks n)
>       collect m = Get (\x -> collect (m-1) >>> Get (\xs -> Put (x:xs) id))
> 
> Here are some example calls in GHCi that demonstrate the category and 
> applicative instances for sequential and parallel composition (see below for 
> a link to the complete source code):
> 
>     ghci> transformList pairsT [1..5]
>     [(1,2),(3,4)]                                    -- 5 is ignored
>     ghci> transformList pairsT [1..6]
>     [(1,2),(3,4),(5,6)]
>     ghci> transformList (chunks 2) [1..5]
>     [[1,2],[3,4]]                                    -- similar to pairsT
>     ghci> transformList (chunks 3) [1..6]
>     [[1,2,3],[4,5,6]]
>     ghci> transformList (takeT 3 . chunks 3) [1..]         -- infinite input
>     [[1,2,3],[4,5,6],[7,8,9]]
>     ghci> transformList ((,) <$> takeT 3 . chunks 3 <*> chunks 2 . filterT 
> even) [1..]
>     [([1,2,3],[2,4]),([4,5,6],[6,8]),([7,8,9],[10,12])]
> 
> When we compose transformers in parallel, the memory requirements depend on 
> how far they run out of sync. If they consume elements at the same pace, 
> memory requirements should be constant. Otherwise, some part of the input is 
> retained to satisfy all transformers. In the final call above bigger and 
> bigger parts are retained because the first transformer is slower than the 
> second.
> 
> As transformers are a special case of consumers, we can compose a consumer 
> and a transformer to give a new consumer:
> 
>     (<.) :: ListConsumer b c -> ListTransformer a b -> ListConsumer a c
>     Done c       <. _       = Done c
>     Continue c _ <. Cut     = Done c
>     Continue _ f <. Put x t = f x <. t
>     Continue c f <. Get g   = Continue c (\a -> Continue c f <. g a)
> 
> Using the instances for parallel and sequential transformer composition as 
> well as the instances for parallel consumer composition, we can build complex 
> consumers that still execute in lockstep and consume their input lazily.
> 
> Sebastian
> 
> P.S. https://gist.github.com/1521467
> 
> 
> _______________________________________________
> Haskell-Cafe mailing list
> Haskell-Cafe@haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe

_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: [Haskell-cafe] Reifying case expressions [was: Re: On stream processing, and a new release of timeplot coming]

Reply via email to