Re: [haskell-pipes] Some code review for a pipes tutorial?

Gabriel Gonzalez Tue, 20 May 2014 18:45:07 -0700

`pack` is defined like this:

    pack = dimap to (fmap from)


... which is equivalent to:

    pack = iso to from

When you apply `view` to an isomorphism, it gives you the forward direction:

    view (iso to from) = to

That means you can narrow the problem down to understanding how `to`works, which is defined in this case as:

to p = Pipes.Group.folds step id done (p ^. Pipes.Group.chunksOfdefaultChunkSize)


    step diffAs w8 = diffAs . (w8:)

    done diffAs = BS.pack (diffAs [])

This is the same thing as:

purely folds (fmap BS.pack list) (p ^. Pipes.Group.chunksOfdefaultChunkSize)

In other words, it's splitting up the stream of `Word8`s into groups of`defaultChunkSize`, folding each group into a list, and then `pack`ingthat list into a `ByteString`.

To learn more about how `chunksOf` and `folds` work, you can read the`pipes-group` tutorial here:


http://hackage.haskell.org/package/pipes-group-1.0.0/docs/Pipes-Group-Tutorial.html

You might also find this blog post useful, which is on a similar subject:

http://www.haskellforall.com/2013/09/perfect-streaming-using-pipes-bytestring.html

On 5/20/14, 11:22 AM, Justin Le wrote:

Ah, that makes sense.

You're right, I was removing the wrong runEffect...and I see why itwas unnecessary :)

One last thing I am stuck on; what exactly does `view` do in thiscontext? And how does it know when to properly chunk up ByteStrings,as to keep constant space?


On Saturday, May 10, 2014 7:40:52 AM UTC-7, Gabriel Gonzalez wrote:


    On 05/09/2014 09:21 PM, Justin Le wrote:

    Thanks Gabriel!  This was very helpful :)

    I've updated the repo with my modifications, and I had a couple
    of comments.

    1. Deleting `runEffect` appears to bring about a type error,
    trying to unify the Proxy type with IO ().  Did I do something
    wrong here?


    Make sure you are deleting the correct `runEffect` (from
    encode.hs).  The result of `freqs` is not an `Effect`, so you
    don't need to run it.

    2. Is there anywhere I can read up on Consumer' and (>~)?  I sort
    of have been avoided using them because I don't fully understand
    the differences between the (>->) category and the (>~) category,
    actually.


    The English explanation of what `(>~)` does is that `p' >~ p`
    replaces every `await` in `p` with `p'`.  So, for example, when
    you write:

        p' >~ cat

    That's equivalent to:

        forever $ do
            a <- p'
            yield a

    In other words, `p'` gets reused in its entirety every time the
    downstream pipe `await`s.

    This behavior can also be specified more formally using these laws:

        -- (p' >~) is a monad morphism
        p' >~ (do
            x <- m
            f x )
        = do
            x <- p' >~ m
            p' >~ f x

        p' >~ return r = return

        -- `await` is the right-identity of `(>~)`
        p' >~ await = p'

        -- The next two equations are free theorems
        p' >~ yield x = yield x

        p' >~ lift m = lift m

    So, using the example of `p' >~ cat`, you can evaluate it like so:

        p' >~ cat

        -- Definition of `cat`
        = p' >~ (forever $ do
            a <- await
            yield a )

        -- Monad morphisms distribute over `forever`
        = forever$ p' >~ (do
            a <- await
            yield a )

        -- Monad morphism distributes over bind
        = forever $ do
            a <- p' >~ await
            p' >~ yield a

        -- p' >~ await = p'
        = forever $ do
            a <- p'
            p' >~ yield a

        -- p' >~ yield a = yield a
        = forever $ do
            a <- p'
            yield a

    It may also help to read the "Consumers" section of the `pipes`
    tutorial:

    http://hackage.haskell.org/package/pipes-4.1.1/docs/Pipes-Tutorial.html#g:4
    
<http://hackage.haskell.org/package/pipes-4.1.1/docs/Pipes-Tutorial.html#g:4>

    Consumer' is just Consumer, but with the output not technically
    "closed" off for good (just effectively), right?  And how does
    (>~ cat) turn it into a Pipe?

    Thank you again!

    Justin

    On Friday, May 2, 2014 4:06:07 PM UTC-7, Gabriel Gonzalez wrote:

        This is the perfect kind of question to post to the mailing list!

        I will go down the two programs and make minor comments and
        then review their overall structure.

        -- encode.hs

        * Delete `runEffect`.  It's not doing anything. The reason
        that it still type-checked was because your base monad was
        polymorphic over `MonadIO`, so it let you accidentally insert

an additional `Pipe` layer (which was not doing anything).As a side note, I think I made a mistake by parametrizing the

        `Pipes.Prelude` utilities over `MonadIO` (I prefer using
        `hoist` now), but I don't want to make a breaking change to
        fix it.

        * Good use of `withFile` instead of `pipes-safe`. I feel like
        too many people unnecessarily use `pipes-safe` when
        `withFile` suffices.

        * Use `view Pipes.ByteString.pack p` instead of `p >-> PP.map
        B.singleton`.  It will group your Word8's into a more
        efficient chunk size.  Your current formulation will call a
        separate write command for every single byte, which is very
        inefficient.

        * For the reverse direction (i.e. `bytes`), you can either:

        A) Use `view (from Pipes.ByteString.pack)`, but that requires
        a `lens` dependency (which I think is not good).  I plan to
        fix that by providing an `unpack` lens in an upcoming
        `pipes-bytestring` release.  I created an issue for this:

        https://github.com/Gabriel439/Haskell-Pipes-ByteString-Library/issues/36
        
<https://github.com/Gabriel439/Haskell-Pipes-ByteString-Library/issues/36>

        B) Use `mapFoldable`:

            bytes = Pipes.Prelude.mapFoldable BS.unpack

        That's much more efficient.  The problem with your `bytes`
        function is that it uses `foldl`, which triggers a bunch of
        left-associated binds, generating quadratic time complexity
        in the number of bytes:

            ((((return ()) >> yield byte1) >> yield byte2) >> yield byte3

        `mapFoldable`, on the other hand, is implemented in terms of
        `each`, which uses a right-fold like this:

        each = Data.Foldable.foldr (\a p -> yield a >> p) (return ())

        ... which triggers build/fold fusion and also gives linear
        time complexity:

            yield byte1 >> (yield byte2 >> (yield byte3 >> return ()))

        * If you're willing to skip the error message, you can
        shorten `encodeByte` to:

            encodeByte t = for cat $ \b -> each(b `M.lookup` t)

        ... which is the same thing as:

            encodeByte t = Pipes.Prelude.mapFoldable (`M.lookup` t)

        * I should probably provide a function that transforms
        `Parser`s to functions between `Producer`s to simplify your
        `dirsBytes` code.  I also find myself writing that same
        pattern way too many times.  I just created an issue to
        remind myself to do this:

        https://github.com/Gabriel439/Haskell-Pipes-Parse-Library/issues/28
        <https://github.com/Gabriel439/Haskell-Pipes-Parse-Library/issues/28>

        -- decode.hs

        * Is there any reason why you `drain` unused input using
        `limit` instead of just using `take` by itself?

        * Same thing as `encode`.hs: try using
        `Pipes.ByteString.pack` and `Pipes.Prelude.mapFoldable
        Data.ByteString.unpack` for much greater efficiency
        translating between `Word8`s and `ByteString`s

        * You can make the code for `searchPT` more reusable by first
        defining a `Consumer'` (note the prime!) that produces a
        single `Direction`, like this:

            searchPT :: forall m. Monad m => PreTree Word8 ->
        Consumer' Direction m Word8
            searchPT pt0 = go pt0
              where
                go :: PreTree Word8 -> Consumer Direction m Word8
                go (PTLeaf x      ) = return x
                go (PTNode pt1 pt2) = do
                    dir <- await
        go $ case dir of
                        DLeft  -> pt1
        DRight -> pt2

        ... and then you can optionally upgrade that to a `Pipe` like
        this:

            searchPT pt >~ cat:: Pipe Direction Word8 m r

        That decouples the logic for parsing one direction from the
        logic for looping.

        * Also, there's nothing `Word8`-specific about your
        `searchPT` function.  Consider generalizing the type to any
        value.

        * You can simplify the implementation of `dirs` using
        `mapFoldable`:

            dirs = Pipes.Prelude.mapFoldable byteToDirs

        Overall the architecture of your program looks correct.  I
        don't see any obvious non-idiomatic things that you are doing.

        On 5/2/14, 2:43 AM, Justin Le wrote:

        Hi pipes people;

        I really don't know too much about pipes, but an entire
        section in a project tutorial I am writing is going to be
        dedicated to hooking up all of the pipes plumbing together.
         Seeing as this might also be possibly used as a pipes
        tutorial, I just wanted to make sure that my pipes code is
        idiomatic/not awful/not going to set back your progress by
        generations.  Does anyone mind maybe giving it a quick look
        over? :)  I would really appreciate it, and credit will be
        given where deserved :)  I hope it is not too imposing for
        me to ask!

        It's actually a pair of programs --- a Huffman compression
        encoder and decoder.

        The encoder:
        
https://github.com/mstksg/inCode/blob/master/code-samples/huffman/encode.hs
        
<https://github.com/mstksg/inCode/blob/master/code-samples/huffman/encode.hs>
        The decoer:
        
https://github.com/mstksg/inCode/blob/master/code-samples/huffman/decode.hs
        
<https://github.com/mstksg/inCode/blob/master/code-samples/huffman/decode.hs>

        I tried my best to abstract away the actual mechanisms of
        the huffman logic where I could; it does peak in at some
        times, but the comments should give you a general high-level
        idea of what each function is trying to do.  For reference,
        the series itself explaining the logic is hosted at
        http://blog.jle.im/entries/series/+huffman-compression
        <http://blog.jle.im/entries/series/+huffman-compression>

        I am pretty sure that the code gives away my unfamiliarity :)

        Thank you all!
        Justin

--You received this message because you are subscribed to the

        Google Groups "Haskell Pipes" group.
        To unsubscribe from this group and stop receiving emails
        from it, send an email to [email protected].
        To post to this group, send email to [email protected].

--You received this message because you are subscribed to the

    Google Groups "Haskell Pipes" group.
    To unsubscribe from this group and stop receiving emails from it,
    send an email to [email protected] <javascript:>.
    To post to this group, send email to [email protected]
    <javascript:>.

--

You received this message because you are subscribed to the GoogleGroups "Haskell Pipes" group.To unsubscribe from this group and stop receiving emails from it, sendan email to [email protected]<mailto:[email protected]>.To post to this group, send email to [email protected]<mailto:[email protected]>.


--
You received this message because you are subscribed to the Google Groups "Haskell 
Pipes" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To post to this group, send email to [email protected].

Re: [haskell-pipes] Some code review for a pipes tutorial?

Reply via email to