Re: [haskell-pipes] Some code review for a pipes tutorial?

Justin Le Thu, 19 Jun 2014 20:05:32 -0700

Thanks Gabriel,

* I don't think it's under high load; I'll investigate a bit about the load 
times.  It sometimes was slow for me but a minute is a bit alarming indeed.


* Ah more lens to learn :)  I'll admit that I learned more about lens while 
researching this topic than I had ever before.  What is the timeframe for 
the next `pipes-bytestring` release?  If it's not planned any time soon I 
can just leave a footnote with that comment about future compatibility and 
better options.

* Thanks for the heads up about `pack`, I'll look into the impure version.

* It actually is all on github 
:) https://github.com/mstksg/inCode/tree/master/code-samples/huffman  but I 
should probably mention that explicitly before I put it up.

Thanks again for the prompt reply.  Hope I can do your great library 
justice.  Cheers!
Justin



On Thursday, June 19, 2014 4:47:52 PM UTC-7, Gabriel Gonzalez wrote:
>
>  Comments:
>
> * Your site took a minute to load the page.  Is it under high load?
>
> * I'd rephrase point #1 under "Design" as "A component that emits 
> `ByteString`s read from a file ..."
>
> * I don't like the use of `error`, but if you do choose to use it I would 
> recommend instead pattern matching on `Just` instead, which will give a 
> more informative line number if if throws an error.
>
> * If you can wait for me to release the next `pipes-bytestring`, I plan to 
> provide `unpack`, which is the reverse of `pack`.  Basically the new 
> version of `pipes-bytestring` is downgrading every `Iso` to a `Lens` and 
> providing the reverse `Lens` so that people can get the flipped versions 
> without the `lens` dependency.  That means you will have:
>
>     pack   :: Monad m => Lens' (Producer Word8      m r) (Producer 
> ByteString m r)
>
>     unpack :: Monad m => Lens' (Producer ByteString m r) (Producer 
> Word8      m r)
>
> Using the `unpack` lens, you can simplify your code to:
>
>     over unpack transform (PB.fromhandle hIn)
>       where
>         transform bytes = dirsBytes (bytes >-> encodeByte encTable)
>
> You can read that as saying: "Apply `transform` over an unpacked view of 
> `fromHandle hIn`).  Under the hood, it takes care of unpacking the byte 
> stream, applying the transformation, and then repacking the stream for you.
>
> * `pack` might be the culprit for poor performance.  There is a much more 
> efficient way to pack a `ByteString`, but it requires an `IO` base monad 
> and an impure `PG.foldsM` call
>
> * I'd also recommend putting everything into a Github repository so that 
> people can download it and test it out themselves.
>
> Other than that, it looks good.  The code is pretty idiomatic.
>
> On 6/19/14, 10:06 AM, Justin Le wrote:
>  
> Hi, 
>
>  It's been about a month exactly since your last reply, but now that 
> school is over I've found time to finish the post; if you had time, would 
> you mind helping me make sure it is correct and useful for people trying to 
> learn lens?  it's online at 
> http://blog.jle.im/entry/pipes-streaming-huffman-compression-in-haskell-part-3
>  
>  
>   (it only slightly builds on some non-pipes material from the previous 
> two parts of the tutorial.)  I'd greatly appreciate any help; my goal is to 
> hopefully help with the availability of basic pipes projects available 
> online and I'd hate to be spreading misinformation or bad intuitions.
>
>  Thanks again :)
> Justin
>
> On Tuesday, May 20, 2014 6:44:29 PM UTC-7, Gabriel Gonzalez wrote: 
>>
>>  `pack` is defined like this:
>>
>>     pack = dimap to (fmap from)
>>
>> ... which is equivalent to:
>>
>>     pack = iso to from
>>
>> When you apply `view` to an isomorphism, it gives you the forward 
>> direction:
>>
>>     view (iso to from) = to
>>
>> That means you can narrow the problem down to understanding how `to` 
>> works, which is defined in this case as:
>>
>>     to p = Pipes.Group.folds step id done (p ^. Pipes.Group.chunksOf 
>> defaultChunkSize)
>>  
>>     step diffAs w8 = diffAs . (w8:)
>>
>>     done diffAs = BS.pack (diffAs [])
>>
>> This is the same thing as:
>>
>>     purely folds (fmap BS.pack list) (p ^. Pipes.Group.chunksOf 
>> defaultChunkSize)
>>
>> In other words, it's splitting up the stream of `Word8`s into groups of 
>> `defaultChunkSize`, folding each group into a list, and then `pack`ing that 
>> list into a `ByteString`.
>>
>> To learn more about how `chunksOf` and `folds` work, you can read the 
>> `pipes-group` tutorial here:
>>
>>
>> http://hackage.haskell.org/package/pipes-group-1.0.0/docs/Pipes-Group-Tutorial.html
>>
>> You might also find this blog post useful, which is on a similar subject:
>>
>>
>> http://www.haskellforall.com/2013/09/perfect-streaming-using-pipes-bytestring.html
>>
>> On 5/20/14, 11:22 AM, Justin Le wrote:
>>  
>>  Ah, that makes sense.
>>
>>  You're right, I was removing the wrong runEffect...and I see why it was 
>> unnecessary :)
>>
>>  One last thing I am stuck on; what exactly does `view` do in this 
>> context?  And how does it know when to properly chunk up ByteStrings, as to 
>> keep constant space?
>>
>> On Saturday, May 10, 2014 7:40:52 AM UTC-7, Gabriel Gonzalez wrote: 
>>>
>>>  
>>> On 05/09/2014 09:21 PM, Justin Le wrote:
>>>  
>>> Thanks Gabriel!  This was very helpful :) 
>>>
>>>  I've updated the repo with my modifications, and I had a couple of 
>>> comments.
>>>
>>>  1. Deleting `runEffect` appears to bring about a type error, trying to 
>>> unify the Proxy type with IO ().  Did I do something wrong here?
>>>  
>>>
>>> Make sure you are deleting the correct `runEffect` (from encode.hs).  
>>> The result of `freqs` is not an `Effect`, so you don't need to run it.
>>>
>>>  2. Is there anywhere I can read up on Consumer' and (>~)?  I sort of 
>>> have been avoided using them because I don't fully understand the 
>>> differences between the (>->) category and the (>~) category, actually.  
>>>  
>>>
>>> The English explanation of what `(>~)` does is that `p' >~ p` replaces 
>>> every `await` in `p` with `p'`.  So, for example, when you write:
>>>
>>>     p' >~ cat
>>>
>>> That's equivalent to:
>>>
>>>     forever $ do
>>>         a <- p'
>>>         yield a
>>>
>>> In other words, `p'` gets reused in its entirety every time the 
>>> downstream pipe `await`s.
>>>
>>> This behavior can also be specified more formally using these laws:
>>>
>>>     -- (p' >~) is a monad morphism
>>>     p' >~ (do
>>>         x <- m
>>>         f x )
>>>     = do
>>>         x <- p' >~ m
>>>         p' >~ f x
>>>
>>>     p' >~ return r = return 
>>>
>>>     -- `await` is the right-identity of `(>~)`
>>>     p' >~ await = p'
>>>
>>>     -- The next two equations are free theorems
>>>     p' >~ yield x = yield x
>>>
>>>     p' >~ lift m = lift m
>>>
>>> So, using the example of `p' >~ cat`, you can evaluate it like so:
>>>
>>>     p' >~ cat
>>>
>>>     -- Definition of `cat`
>>>     = p' >~ (forever $ do
>>>         a <- await
>>>         yield a )
>>>
>>>     -- Monad morphisms distribute over `forever`
>>>     = forever $ p' >~ (do
>>>         a <- await
>>>         yield a )
>>>
>>>     -- Monad morphism distributes over bind
>>>     = forever $ do
>>>         a <- p' >~ await
>>>         p' >~ yield a
>>>
>>>     -- p' >~ await = p'
>>>     = forever $ do
>>>         a <- p'
>>>         p' >~ yield a
>>>
>>>     -- p' >~ yield a = yield a
>>>     = forever $ do
>>>         a <- p'
>>>         yield a
>>>
>>> It may also help to read the "Consumers" section of the `pipes` tutorial:
>>>
>>>
>>> http://hackage.haskell.org/package/pipes-4.1.1/docs/Pipes-Tutorial.html#g:4
>>>
>>>
>>>  Consumer' is just Consumer, but with the output not technically 
>>> "closed" off for good (just effectively), right?  And how does (>~ cat) 
>>> turn it into a Pipe?
>>>
>>>  Thank you again!
>>>
>>>  Justin
>>>
>>> On Friday, May 2, 2014 4:06:07 PM UTC-7, Gabriel Gonzalez wrote: 
>>>>
>>>>  This is the perfect kind of question to post to the mailing list!
>>>>
>>>> I will go down the two programs and make minor comments and then review 
>>>> their overall structure.
>>>>
>>>> -- encode.hs
>>>>
>>>> * Delete `runEffect`.  It's not doing anything.  The reason that it 
>>>> still type-checked was because your base monad was polymorphic over 
>>>> `MonadIO`, so it let you accidentally insert an additional `Pipe` layer 
>>>> (which was not doing anything).  As a side note, I think I made a mistake 
>>>> by parametrizing the `Pipes.Prelude` utilities over `MonadIO` (I prefer 
>>>> using `hoist` now), but I don't want to make a breaking change to fix it.
>>>>
>>>> * Good use of `withFile` instead of `pipes-safe`.  I feel like too many 
>>>> people unnecessarily use `pipes-safe` when `withFile` suffices.
>>>>
>>>> * Use `view Pipes.ByteString.pack p` instead of `p >-> PP.map 
>>>> B.singleton`.  It will group your Word8's into a more efficient chunk 
>>>> size.  Your current formulation will call a separate write command for 
>>>> every single byte, which is very inefficient.
>>>>
>>>> * For the reverse direction (i.e. `bytes`), you can either:
>>>>
>>>> A) Use `view (from Pipes.ByteString.pack)`, but that requires a `lens` 
>>>> dependency (which I think is not good).  I plan to fix that by providing 
>>>> an 
>>>> `unpack` lens in an upcoming `pipes-bytestring` release.  I created an 
>>>> issue for this:
>>>>
>>>> https://github.com/Gabriel439/Haskell-Pipes-ByteString-Library/issues/36
>>>>
>>>> B) Use `mapFoldable`:
>>>>
>>>>     bytes = Pipes.Prelude.mapFoldable BS.unpack
>>>>
>>>> That's much more efficient.  The problem with your `bytes` function is 
>>>> that it uses `foldl`, which triggers a bunch of left-associated binds, 
>>>> generating quadratic time complexity in the number of bytes:
>>>>
>>>>     ((((return ()) >> yield byte1) >> yield byte2) >> yield byte3
>>>>
>>>> `mapFoldable`, on the other hand, is implemented in terms of `each`, 
>>>> which uses a right-fold like this:
>>>>
>>>>     each = Data.Foldable.foldr (\a p -> yield a >> p) (return ())
>>>>
>>>> ... which triggers build/fold fusion and also gives linear time 
>>>> complexity:
>>>>
>>>>     yield byte1 >> (yield byte2 >> (yield byte3 >> return ()))
>>>>
>>>> * If you're willing to skip the error message, you can shorten 
>>>> `encodeByte` to:
>>>>
>>>>     encodeByte t = for cat $ \b -> each (b `M.lookup` t)
>>>>
>>>> ... which is the same thing as:
>>>>
>>>>     encodeByte t = Pipes.Prelude.mapFoldable (`M.lookup` t)
>>>>
>>>> * I should probably provide a function that transforms `Parser`s to 
>>>> functions between `Producer`s to simplify your `dirsBytes` code.  I also 
>>>> find myself writing that same pattern way too many times.  I just created 
>>>> an issue to remind myself to do this:
>>>>
>>>> https://github.com/Gabriel439/Haskell-Pipes-Parse-Library/issues/28
>>>>
>>>> -- decode.hs
>>>>
>>>> * Is there any reason why you `drain` unused input using `limit` 
>>>> instead of just using `take` by itself?
>>>>
>>>> * Same thing as `encode`.hs: try using `Pipes.ByteString.pack` and 
>>>> `Pipes.Prelude.mapFoldable Data.ByteString.unpack` for much greater 
>>>> efficiency translating between `Word8`s and `ByteString`s
>>>>
>>>> * You can make the code for `searchPT` more reusable by first defining 
>>>> a `Consumer'` (note the prime!) that produces a single `Direction`, like 
>>>> this:
>>>>
>>>>      searchPT :: forall m. Monad m => PreTree Word8 -> Consumer' 
>>>> Direction m Word8
>>>>     searchPT pt0 = go pt0
>>>>       where
>>>>         go :: PreTree Word8 -> Consumer Direction m Word8
>>>>         go (PTLeaf x      ) = return x
>>>>         go (PTNode pt1 pt2) = do
>>>>             dir <- await
>>>>             go $ case dir of
>>>>                 DLeft  -> pt1
>>>>                 DRight -> pt2
>>>>
>>>> ... and then you can optionally upgrade that to a `Pipe` like this:
>>>>
>>>>     searchPT pt >~ cat :: Pipe Direction Word8 m r
>>>>
>>>> That decouples the logic for parsing one direction from the logic for 
>>>> looping.
>>>>
>>>> * Also, there's nothing `Word8`-specific about your `searchPT` 
>>>> function.  Consider generalizing the type to any value.
>>>>
>>>> * You can simplify the implementation of `dirs` using `mapFoldable`:
>>>>
>>>>     dirs = Pipes.Prelude.mapFoldable byteToDirs
>>>>
>>>> Overall the architecture of your program looks correct.  I don't see 
>>>> any obvious non-idiomatic things that you are doing.
>>>>
>>>> On 5/2/14, 2:43 AM, Justin Le wrote:
>>>>  
>>>> Hi pipes people; 
>>>>
>>>>  I really don't know too much about pipes, but an entire section in a 
>>>> project tutorial I am writing is going to be dedicated to hooking up all 
>>>> of 
>>>> the pipes plumbing together.  Seeing as this might also be possibly used 
>>>> as 
>>>> a pipes tutorial, I just wanted to make sure that my pipes code is 
>>>> idiomatic/not awful/not going to set back your progress by generations. 
>>>>  Does anyone mind maybe giving it a quick look over? :)  I would really 
>>>> appreciate it, and credit will be given where deserved :)  I hope it is 
>>>> not 
>>>> too imposing for me to ask!
>>>>
>>>>  It's actually a pair of programs --- a Huffman compression encoder 
>>>> and decoder.
>>>>
>>>>  The encoder: 
>>>> https://github.com/mstksg/inCode/blob/master/code-samples/huffman/encode.hs
>>>> The decoer: 
>>>> https://github.com/mstksg/inCode/blob/master/code-samples/huffman/decode.hs
>>>>
>>>>  I tried my best to abstract away the actual mechanisms of the huffman 
>>>> logic where I could; it does peak in at some times, but the comments 
>>>> should 
>>>> give you a general high-level idea of what each function is trying to do. 
>>>>  For reference, the series itself explaining the logic is hosted at 
>>>> http://blog.jle.im/entries/series/+huffman-compression
>>>>
>>>>  I am pretty sure that the code gives away my unfamiliarity :)
>>>>
>>>>  Thank you all!
>>>> Justin
>>>>  -- 
>>>> You received this message because you are subscribed to the Google 
>>>> Groups "Haskell Pipes" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send 
>>>> an email to [email protected].
>>>> To post to this group, send email to [email protected].
>>>>
>>>>
>>>>   -- 
>>> You received this message because you are subscribed to the Google 
>>> Groups "Haskell Pipes" group.
>>> To unsubscribe from this group and stop receiving emails from it, send 
>>> an email to [email protected].
>>> To post to this group, send email to [email protected].
>>>
>>>
>>>   -- 
>> You received this message because you are subscribed to the Google Groups 
>> "Haskell Pipes" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to [email protected].
>> To post to this group, send email to [email protected].
>>
>>
>>   -- 
> You received this message because you are subscribed to the Google Groups 
> "Haskell Pipes" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to [email protected] <javascript:>.
> To post to this group, send email to [email protected] 
> <javascript:>.
>
>
>  

-- 
You received this message because you are subscribed to the Google Groups 
"Haskell Pipes" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To post to this group, send email to [email protected].

Re: [haskell-pipes] Some code review for a pipes tutorial?

Reply via email to