Re: [haskell-pipes] Some code review for a pipes tutorial?

Gabriel Gonzalez Sat, 10 May 2014 07:41:23 -0700


On 05/09/2014 09:21 PM, Justin Le wrote:

Thanks Gabriel!  This was very helpful :)
I've updated the repo with my modifications, and I had a couple ofcomments.
1. Deleting `runEffect` appears to bring about a type error, trying tounify the Proxy type with IO (). Did I do something wrong here?

Make sure you are deleting the correct `runEffect` (from encode.hs).The result of `freqs` is not an `Effect`, so you don't need to run it.

2. Is there anywhere I can read up on Consumer' and (>~)? I sort ofhave been avoided using them because I don't fully understand thedifferences between the (>->) category and the (>~) category, actually.

The English explanation of what `(>~)` does is that `p' >~ p` replacesevery `await` in `p` with `p'`. So, for example, when you write:


    p' >~ cat

That's equivalent to:

    forever $ do
        a <- p'
        yield a

In other words, `p'` gets reused in its entirety every time thedownstream pipe `await`s.


This behavior can also be specified more formally using these laws:

    -- (p' >~) is a monad morphism
    p' >~ (do
        x <- m
        f x )
    = do
        x <- p' >~ m
        p' >~ f x

    p' >~ return r = return

    -- `await` is the right-identity of `(>~)`
    p' >~ await = p'

    -- The next two equations are free theorems
    p' >~ yield x = yield x

    p' >~ lift m = lift m

So, using the example of `p' >~ cat`, you can evaluate it like so:

    p' >~ cat

    -- Definition of `cat`
    = p' >~ (forever $ do
        a <- await
        yield a )

    -- Monad morphisms distribute over `forever`
    = forever$ p' >~ (do
        a <- await
        yield a )

    -- Monad morphism distributes over bind
    = forever $ do
        a <- p' >~ await
        p' >~ yield a

    -- p' >~ await = p'
    = forever $ do
        a <- p'
        p' >~ yield a

    -- p' >~ yield a = yield a
    = forever $ do
        a <- p'
        yield a

It may also help to read the "Consumers" section of the `pipes` tutorial:

http://hackage.haskell.org/package/pipes-4.1.1/docs/Pipes-Tutorial.html#g:4

Consumer' is just Consumer, but with the output not technically"closed" off for good (just effectively), right? And how does (>~cat) turn it into a Pipe?


Thank you again!

Justin

On Friday, May 2, 2014 4:06:07 PM UTC-7, Gabriel Gonzalez wrote:

    This is the perfect kind of question to post to the mailing list!

    I will go down the two programs and make minor comments and then
    review their overall structure.

    -- encode.hs

    * Delete `runEffect`.  It's not doing anything.  The reason that
    it still type-checked was because your base monad was polymorphic
    over `MonadIO`, so it let you accidentally insert an additional
    `Pipe` layer (which was not doing anything).  As a side note, I
    think I made a mistake by parametrizing the `Pipes.Prelude`
    utilities over `MonadIO` (I prefer using `hoist` now), but I don't
    want to make a breaking change to fix it.

    * Good use of `withFile` instead of `pipes-safe`.  I feel like too
    many people unnecessarily use `pipes-safe` when `withFile` suffices.

    * Use `view Pipes.ByteString.pack p` instead of `p >-> PP.map
    B.singleton`.  It will group your Word8's into a more efficient
    chunk size.  Your current formulation will call a separate write
    command for every single byte, which is very inefficient.

    * For the reverse direction (i.e. `bytes`), you can either:

    A) Use `view (from Pipes.ByteString.pack)`, but that requires a
    `lens` dependency (which I think is not good). I plan to fix that
    by providing an `unpack` lens in an upcoming `pipes-bytestring`
    release.  I created an issue for this:

    https://github.com/Gabriel439/Haskell-Pipes-ByteString-Library/issues/36
    <https://github.com/Gabriel439/Haskell-Pipes-ByteString-Library/issues/36>

    B) Use `mapFoldable`:

        bytes = Pipes.Prelude.mapFoldable BS.unpack

    That's much more efficient.  The problem with your `bytes`
    function is that it uses `foldl`, which triggers a bunch of
    left-associated binds, generating quadratic time complexity in the
    number of bytes:

        ((((return ()) >> yield byte1) >> yield byte2) >> yield byte3

    `mapFoldable`, on the other hand, is implemented in terms of
    `each`, which uses a right-fold like this:

    each = Data.Foldable.foldr (\a p -> yield a >> p) (return ())

    ... which triggers build/fold fusion and also gives linear time
    complexity:

        yield byte1 >> (yield byte2 >> (yield byte3 >> return ()))

    * If you're willing to skip the error message, you can shorten
    `encodeByte` to:

        encodeByte t = for cat $ \b -> each(b `M.lookup` t)

    ... which is the same thing as:

        encodeByte t = Pipes.Prelude.mapFoldable (`M.lookup` t)

    * I should probably provide a function that transforms `Parser`s
    to functions between `Producer`s to simplify your `dirsBytes`
    code.  I also find myself writing that same pattern way too many
    times.  I just created an issue to remind myself to do this:

    https://github.com/Gabriel439/Haskell-Pipes-Parse-Library/issues/28 
<https://github.com/Gabriel439/Haskell-Pipes-Parse-Library/issues/28>

    -- decode.hs

    * Is there any reason why you `drain` unused input using `limit`
    instead of just using `take` by itself?

    * Same thing as `encode`.hs: try using `Pipes.ByteString.pack` and
    `Pipes.Prelude.mapFoldable Data.ByteString.unpack` for much
    greater efficiency translating between `Word8`s and `ByteString`s

    * You can make the code for `searchPT` more reusable by first
    defining a `Consumer'` (note the prime!) that produces a single
    `Direction`, like this:

        searchPT :: forall m. Monad m => PreTree Word8 -> Consumer'
    Direction m Word8
        searchPT pt0 = go pt0
          where
            go :: PreTree Word8 -> Consumer Direction m Word8
            go (PTLeaf x      ) = return x
            go (PTNode pt1 pt2) = do
                dir <- await
    go $ case dir of
                    DLeft  -> pt1
    DRight -> pt2

    ... and then you can optionally upgrade that to a `Pipe` like this:

        searchPT pt >~ cat:: Pipe Direction Word8 m r

    That decouples the logic for parsing one direction from the logic
    for looping.

    * Also, there's nothing `Word8`-specific about your `searchPT`
    function.  Consider generalizing the type to any value.

    * You can simplify the implementation of `dirs` using `mapFoldable`:

        dirs = Pipes.Prelude.mapFoldable byteToDirs

    Overall the architecture of your program looks correct.  I don't
    see any obvious non-idiomatic things that you are doing.

    On 5/2/14, 2:43 AM, Justin Le wrote:

    Hi pipes people;

    I really don't know too much about pipes, but an entire section
    in a project tutorial I am writing is going to be dedicated to
    hooking up all of the pipes plumbing together.  Seeing as this
    might also be possibly used as a pipes tutorial, I just wanted to
    make sure that my pipes code is idiomatic/not awful/not going to
    set back your progress by generations.  Does anyone mind maybe
    giving it a quick look over? :)  I would really appreciate it,
    and credit will be given where deserved :)  I hope it is not too
    imposing for me to ask!

    It's actually a pair of programs --- a Huffman compression
    encoder and decoder.

    The encoder:
    https://github.com/mstksg/inCode/blob/master/code-samples/huffman/encode.hs
    
<https://github.com/mstksg/inCode/blob/master/code-samples/huffman/encode.hs>
    The decoer:
    https://github.com/mstksg/inCode/blob/master/code-samples/huffman/decode.hs
    
<https://github.com/mstksg/inCode/blob/master/code-samples/huffman/decode.hs>

    I tried my best to abstract away the actual mechanisms of the
    huffman logic where I could; it does peak in at some times, but
    the comments should give you a general high-level idea of what
    each function is trying to do.  For reference, the series itself
    explaining the logic is hosted at
    http://blog.jle.im/entries/series/+huffman-compression
    <http://blog.jle.im/entries/series/+huffman-compression>

    I am pretty sure that the code gives away my unfamiliarity :)

    Thank you all!
    Justin

--You received this message because you are subscribed to the

    Google Groups "Haskell Pipes" group.
    To unsubscribe from this group and stop receiving emails from it,
    send an email to [email protected] <javascript:>.
    To post to this group, send email to [email protected]
    <javascript:>.

--

You received this message because you are subscribed to the GoogleGroups "Haskell Pipes" group.To unsubscribe from this group and stop receiving emails from it, sendan email to [email protected]<mailto:[email protected]>.To post to this group, send email to [email protected]<mailto:[email protected]>.


--
You received this message because you are subscribed to the Google Groups "Haskell 
Pipes" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To post to this group, send email to [email protected].

Re: [haskell-pipes] Some code review for a pipes tutorial?

Reply via email to