Re: [haskell-pipes] Some code review for a pipes tutorial?

Gabriel Gonzalez Thu, 19 Jun 2014 10:08:32 -0700

Alright.  I'll take a look at it this afternoon.

On 6/19/14, 10:06 AM, Justin Le wrote:

Hi,

It's been about a month exactly since your last reply, but now thatschool is over I've found time to finish the post; if you had time,would you mind helping me make sure it is correct and useful forpeople trying to learn lens? it's online athttp://blog.jle.im/entry/pipes-streaming-huffman-compression-in-haskell-part-3<http://blog.jle.im/entry/pipes-streaming-huffman-compression-in-haskell-part-3>(it only slightly builds on some non-pipes material from theprevious two parts of the tutorial.) I'd greatly appreciate any help;my goal is to hopefully help with the availability of basic pipesprojects available online and I'd hate to be spreading misinformationor bad intuitions.


Thanks again :)
Justin

On Tuesday, May 20, 2014 6:44:29 PM UTC-7, Gabriel Gonzalez wrote:

    `pack` is defined like this:

        pack = dimap to (fmap from)

    ... which is equivalent to:

        pack = iso to from

    When you apply `view` to an isomorphism, it gives you the forward
    direction:

        view (iso to from) = to

    That means you can narrow the problem down to understanding how
    `to` works, which is defined in this case as:

    to p = Pipes.Group.folds step id done (p ^. Pipes.Group.chunksOf
    defaultChunkSize)

        step diffAs w8 = diffAs . (w8:)

        done diffAs = BS.pack (diffAs [])

    This is the same thing as:

        purely folds (fmap BS.pack list) (p ^. Pipes.Group.chunksOf
    defaultChunkSize)

    In other words, it's splitting up the stream of `Word8`s into
    groups of `defaultChunkSize`, folding each group into a list, and
    then `pack`ing that list into a `ByteString`.

    To learn more about how `chunksOf` and `folds` work, you can read
    the `pipes-group` tutorial here:

    
http://hackage.haskell.org/package/pipes-group-1.0.0/docs/Pipes-Group-Tutorial.html
    
<http://hackage.haskell.org/package/pipes-group-1.0.0/docs/Pipes-Group-Tutorial.html>

    You might also find this blog post useful, which is on a similar
    subject:

    
http://www.haskellforall.com/2013/09/perfect-streaming-using-pipes-bytestring.html
    
<http://www.haskellforall.com/2013/09/perfect-streaming-using-pipes-bytestring.html>

    On 5/20/14, 11:22 AM, Justin Le wrote:

    Ah, that makes sense.

    You're right, I was removing the wrong runEffect...and I see why
    it was unnecessary :)

    One last thing I am stuck on; what exactly does `view` do in this
    context?  And how does it know when to properly chunk up
    ByteStrings, as to keep constant space?

    On Saturday, May 10, 2014 7:40:52 AM UTC-7, Gabriel Gonzalez wrote:


        On 05/09/2014 09:21 PM, Justin Le wrote:

        Thanks Gabriel!  This was very helpful :)

        I've updated the repo with my modifications, and I had a
        couple of comments.

        1. Deleting `runEffect` appears to bring about a type error,
        trying to unify the Proxy type with IO ().  Did I do
        something wrong here?


        Make sure you are deleting the correct `runEffect` (from
        encode.hs).  The result of `freqs` is not an `Effect`, so you
        don't need to run it.

        2. Is there anywhere I can read up on Consumer' and (>~)?  I
        sort of have been avoided using them because I don't fully
        understand the differences between the (>->) category and
        the (>~) category, actually.


        The English explanation of what `(>~)` does is that `p' >~ p`
        replaces every `await` in `p` with `p'`.  So, for example,
        when you write:

            p' >~ cat

        That's equivalent to:

            forever $ do
                a <- p'
                yield a

        In other words, `p'` gets reused in its entirety every time
        the downstream pipe `await`s.

        This behavior can also be specified more formally using these
        laws:

            -- (p' >~) is a monad morphism
            p' >~ (do
                x <- m
                f x )
            = do
                x <- p' >~ m
                p' >~ f x

            p' >~ return r = return

            -- `await` is the right-identity of `(>~)`
            p' >~ await = p'

            -- The next two equations are free theorems
            p' >~ yield x = yield x

            p' >~ lift m = lift m

        So, using the example of `p' >~ cat`, you can evaluate it
        like so:

            p' >~ cat

            -- Definition of `cat`
            = p' >~ (forever $ do
                a <- await
                yield a )

            -- Monad morphisms distribute over `forever`
            = forever$ p' >~ (do
                a <- await
                yield a )

            -- Monad morphism distributes over bind
            = forever $ do
                a <- p' >~ await
                p' >~ yield a

            -- p' >~ await = p'
            = forever $ do
                a <- p'
                p' >~ yield a

            -- p' >~ yield a = yield a
            = forever $ do
                a <- p'
                yield a

        It may also help to read the "Consumers" section of the
        `pipes` tutorial:

        
http://hackage.haskell.org/package/pipes-4.1.1/docs/Pipes-Tutorial.html#g:4
        
<http://hackage.haskell.org/package/pipes-4.1.1/docs/Pipes-Tutorial.html#g:4>

        Consumer' is just Consumer, but with the output not
        technically "closed" off for good (just effectively), right?
         And how does (>~ cat) turn it into a Pipe?

        Thank you again!

        Justin

        On Friday, May 2, 2014 4:06:07 PM UTC-7, Gabriel Gonzalez
        wrote:

            This is the perfect kind of question to post to the
            mailing list!

            I will go down the two programs and make minor comments
            and then review their overall structure.

            -- encode.hs

            * Delete `runEffect`.  It's not doing anything.  The
            reason that it still type-checked was because your base
            monad was polymorphic over `MonadIO`, so it let you
            accidentally insert an additional `Pipe` layer (which
            was not doing anything).  As a side note, I think I made
            a mistake by parametrizing the `Pipes.Prelude` utilities
            over `MonadIO` (I prefer using `hoist` now), but I don't
            want to make a breaking change to fix it.

            * Good use of `withFile` instead of `pipes-safe`.  I
            feel like too many people unnecessarily use `pipes-safe`
            when `withFile` suffices.

            * Use `view Pipes.ByteString.pack p` instead of `p >->
            PP.map B.singleton`.  It will group your Word8's into a
            more efficient chunk size.  Your current formulation
            will call a separate write command for every single
            byte, which is very inefficient.

            * For the reverse direction (i.e. `bytes`), you can either:

            A) Use `view (from Pipes.ByteString.pack)`, but that
            requires a `lens` dependency (which I think is not
            good).  I plan to fix that by providing an `unpack` lens
            in an upcoming `pipes-bytestring` release.  I created an
            issue for this:

            
https://github.com/Gabriel439/Haskell-Pipes-ByteString-Library/issues/36
            
<https://github.com/Gabriel439/Haskell-Pipes-ByteString-Library/issues/36>

            B) Use `mapFoldable`:

                bytes = Pipes.Prelude.mapFoldable BS.unpack

            That's much more efficient.  The problem with your
            `bytes` function is that it uses `foldl`, which triggers
            a bunch of left-associated binds, generating quadratic
            time complexity in the number of bytes:

                ((((return ()) >> yield byte1) >> yield byte2) >>
            yield byte3

            `mapFoldable`, on the other hand, is implemented in
            terms of `each`, which uses a right-fold like this:

            each = Data.Foldable.foldr (\a p -> yield a >> p)
            (return ())

            ... which triggers build/fold fusion and also gives
            linear time complexity:

                yield byte1 >> (yield byte2 >> (yield byte3 >>
            return ()))

            * If you're willing to skip the error message, you can
            shorten `encodeByte` to:

                encodeByte t = for cat $ \b -> each(b `M.lookup` t)

            ... which is the same thing as:

                encodeByte t = Pipes.Prelude.mapFoldable (`M.lookup` t)

            * I should probably provide a function that transforms
            `Parser`s to functions between `Producer`s to simplify
            your `dirsBytes` code.  I also find myself writing that
            same pattern way too many times.  I just created an
            issue to remind myself to do this:

            https://github.com/Gabriel439/Haskell-Pipes-Parse-Library/issues/28
            
<https://github.com/Gabriel439/Haskell-Pipes-Parse-Library/issues/28>

            -- decode.hs

            * Is there any reason why you `drain` unused input using
            `limit` instead of just using `take` by itself?

            * Same thing as `encode`.hs: try using
            `Pipes.ByteString.pack` and `Pipes.Prelude.mapFoldable
            Data.ByteString.unpack` for much greater efficiency
            translating between `Word8`s and `ByteString`s

            * You can make the code for `searchPT` more reusable by
            first defining a `Consumer'` (note the prime!) that
            produces a single `Direction`, like this:

                searchPT :: forall m. Monad m => PreTree Word8 ->
            Consumer' Direction m Word8
                searchPT pt0 = go pt0
                  where
                    go :: PreTree Word8 -> Consumer Direction m Word8
                    go (PTLeaf x      ) = return x
                    go (PTNode pt1 pt2) = do
                        dir <- await
            go $ case dir of
                            DLeft  -> pt1
            DRight -> pt2

            ... and then you can optionally upgrade that to a `Pipe`
            like this:

                searchPT pt >~ cat:: Pipe Direction Word8 m r

            That decouples the logic for parsing one direction from
            the logic for looping.

            * Also, there's nothing `Word8`-specific about your
            `searchPT` function.  Consider generalizing the type to
            any value.

            * You can simplify the implementation of `dirs` using
            `mapFoldable`:

                dirs = Pipes.Prelude.mapFoldable byteToDirs

Overall the architecture of your program looks correct.I don't see any obvious non-idiomatic things that you

            are doing.

            On 5/2/14, 2:43 AM, Justin Le wrote:

            Hi pipes people;

            I really don't know too much about pipes, but an entire
            section in a project tutorial I am writing is going to
            be dedicated to hooking up all of the pipes plumbing
            together.  Seeing as this might also be possibly used
            as a pipes tutorial, I just wanted to make sure that my
            pipes code is idiomatic/not awful/not going to set back
            your progress by generations.  Does anyone mind maybe
            giving it a quick look over? :)  I would really
            appreciate it, and credit will be given where deserved
            :)  I hope it is not too imposing for me to ask!

            It's actually a pair of programs --- a Huffman
            compression encoder and decoder.

            The encoder:
            
https://github.com/mstksg/inCode/blob/master/code-samples/huffman/encode.hs
            
<https://github.com/mstksg/inCode/blob/master/code-samples/huffman/encode.hs>
            The decoer:
            
https://github.com/mstksg/inCode/blob/master/code-samples/huffman/decode.hs
            
<https://github.com/mstksg/inCode/blob/master/code-samples/huffman/decode.hs>

            I tried my best to abstract away the actual mechanisms
            of the huffman logic where I could; it does peak in at
            some times, but the comments should give you a general
            high-level idea of what each function is trying to do.
             For reference, the series itself explaining the logic
            is hosted at
            http://blog.jle.im/entries/series/+huffman-compression
            <http://blog.jle.im/entries/series/+huffman-compression>

            I am pretty sure that the code gives away my
            unfamiliarity :)

            Thank you all!
            Justin

--You received this message because you are subscribed to

            the Google Groups "Haskell Pipes" group.
            To unsubscribe from this group and stop receiving
            emails from it, send an email to
            [email protected].
            To post to this group, send email to
            [email protected].

--You received this message because you are subscribed to the

        Google Groups "Haskell Pipes" group.
        To unsubscribe from this group and stop receiving emails
        from it, send an email to [email protected].
        To post to this group, send email to [email protected].

--You received this message because you are subscribed to the

    Google Groups "Haskell Pipes" group.
    To unsubscribe from this group and stop receiving emails from it,
    send an email to [email protected] <javascript:>.
    To post to this group, send email to [email protected]
    <javascript:>.

--

You received this message because you are subscribed to the GoogleGroups "Haskell Pipes" group.To unsubscribe from this group and stop receiving emails from it, sendan email to [email protected]<mailto:[email protected]>.To post to this group, send email to [email protected]<mailto:[email protected]>.


--
You received this message because you are subscribed to the Google Groups "Haskell 
Pipes" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To post to this group, send email to [email protected].

Re: [haskell-pipes] Some code review for a pipes tutorial?

Reply via email to