Thanks Gabriel, * I don't think it's under high load; I'll investigate a bit about the load times. It sometimes was slow for me but a minute is a bit alarming indeed.
* Ah more lens to learn :) I'll admit that I learned more about lens while researching this topic than I had ever before. What is the timeframe for the next `pipes-bytestring` release? If it's not planned any time soon I can just leave a footnote with that comment about future compatibility and better options. * Thanks for the heads up about `pack`, I'll look into the impure version. * It actually is all on github :) https://github.com/mstksg/inCode/tree/master/code-samples/huffman but I should probably mention that explicitly before I put it up. Thanks again for the prompt reply. Hope I can do your great library justice. Cheers! Justin On Thursday, June 19, 2014 4:47:52 PM UTC-7, Gabriel Gonzalez wrote: > > Comments: > > * Your site took a minute to load the page. Is it under high load? > > * I'd rephrase point #1 under "Design" as "A component that emits > `ByteString`s read from a file ..." > > * I don't like the use of `error`, but if you do choose to use it I would > recommend instead pattern matching on `Just` instead, which will give a > more informative line number if if throws an error. > > * If you can wait for me to release the next `pipes-bytestring`, I plan to > provide `unpack`, which is the reverse of `pack`. Basically the new > version of `pipes-bytestring` is downgrading every `Iso` to a `Lens` and > providing the reverse `Lens` so that people can get the flipped versions > without the `lens` dependency. That means you will have: > > pack :: Monad m => Lens' (Producer Word8 m r) (Producer > ByteString m r) > > unpack :: Monad m => Lens' (Producer ByteString m r) (Producer > Word8 m r) > > Using the `unpack` lens, you can simplify your code to: > > over unpack transform (PB.fromhandle hIn) > where > transform bytes = dirsBytes (bytes >-> encodeByte encTable) > > You can read that as saying: "Apply `transform` over an unpacked view of > `fromHandle hIn`). Under the hood, it takes care of unpacking the byte > stream, applying the transformation, and then repacking the stream for you. > > * `pack` might be the culprit for poor performance. There is a much more > efficient way to pack a `ByteString`, but it requires an `IO` base monad > and an impure `PG.foldsM` call > > * I'd also recommend putting everything into a Github repository so that > people can download it and test it out themselves. > > Other than that, it looks good. The code is pretty idiomatic. > > On 6/19/14, 10:06 AM, Justin Le wrote: > > Hi, > > It's been about a month exactly since your last reply, but now that > school is over I've found time to finish the post; if you had time, would > you mind helping me make sure it is correct and useful for people trying to > learn lens? it's online at > http://blog.jle.im/entry/pipes-streaming-huffman-compression-in-haskell-part-3 > > > (it only slightly builds on some non-pipes material from the previous > two parts of the tutorial.) I'd greatly appreciate any help; my goal is to > hopefully help with the availability of basic pipes projects available > online and I'd hate to be spreading misinformation or bad intuitions. > > Thanks again :) > Justin > > On Tuesday, May 20, 2014 6:44:29 PM UTC-7, Gabriel Gonzalez wrote: >> >> `pack` is defined like this: >> >> pack = dimap to (fmap from) >> >> ... which is equivalent to: >> >> pack = iso to from >> >> When you apply `view` to an isomorphism, it gives you the forward >> direction: >> >> view (iso to from) = to >> >> That means you can narrow the problem down to understanding how `to` >> works, which is defined in this case as: >> >> to p = Pipes.Group.folds step id done (p ^. Pipes.Group.chunksOf >> defaultChunkSize) >> >> step diffAs w8 = diffAs . (w8:) >> >> done diffAs = BS.pack (diffAs []) >> >> This is the same thing as: >> >> purely folds (fmap BS.pack list) (p ^. Pipes.Group.chunksOf >> defaultChunkSize) >> >> In other words, it's splitting up the stream of `Word8`s into groups of >> `defaultChunkSize`, folding each group into a list, and then `pack`ing that >> list into a `ByteString`. >> >> To learn more about how `chunksOf` and `folds` work, you can read the >> `pipes-group` tutorial here: >> >> >> http://hackage.haskell.org/package/pipes-group-1.0.0/docs/Pipes-Group-Tutorial.html >> >> You might also find this blog post useful, which is on a similar subject: >> >> >> http://www.haskellforall.com/2013/09/perfect-streaming-using-pipes-bytestring.html >> >> On 5/20/14, 11:22 AM, Justin Le wrote: >> >> Ah, that makes sense. >> >> You're right, I was removing the wrong runEffect...and I see why it was >> unnecessary :) >> >> One last thing I am stuck on; what exactly does `view` do in this >> context? And how does it know when to properly chunk up ByteStrings, as to >> keep constant space? >> >> On Saturday, May 10, 2014 7:40:52 AM UTC-7, Gabriel Gonzalez wrote: >>> >>> >>> On 05/09/2014 09:21 PM, Justin Le wrote: >>> >>> Thanks Gabriel! This was very helpful :) >>> >>> I've updated the repo with my modifications, and I had a couple of >>> comments. >>> >>> 1. Deleting `runEffect` appears to bring about a type error, trying to >>> unify the Proxy type with IO (). Did I do something wrong here? >>> >>> >>> Make sure you are deleting the correct `runEffect` (from encode.hs). >>> The result of `freqs` is not an `Effect`, so you don't need to run it. >>> >>> 2. Is there anywhere I can read up on Consumer' and (>~)? I sort of >>> have been avoided using them because I don't fully understand the >>> differences between the (>->) category and the (>~) category, actually. >>> >>> >>> The English explanation of what `(>~)` does is that `p' >~ p` replaces >>> every `await` in `p` with `p'`. So, for example, when you write: >>> >>> p' >~ cat >>> >>> That's equivalent to: >>> >>> forever $ do >>> a <- p' >>> yield a >>> >>> In other words, `p'` gets reused in its entirety every time the >>> downstream pipe `await`s. >>> >>> This behavior can also be specified more formally using these laws: >>> >>> -- (p' >~) is a monad morphism >>> p' >~ (do >>> x <- m >>> f x ) >>> = do >>> x <- p' >~ m >>> p' >~ f x >>> >>> p' >~ return r = return >>> >>> -- `await` is the right-identity of `(>~)` >>> p' >~ await = p' >>> >>> -- The next two equations are free theorems >>> p' >~ yield x = yield x >>> >>> p' >~ lift m = lift m >>> >>> So, using the example of `p' >~ cat`, you can evaluate it like so: >>> >>> p' >~ cat >>> >>> -- Definition of `cat` >>> = p' >~ (forever $ do >>> a <- await >>> yield a ) >>> >>> -- Monad morphisms distribute over `forever` >>> = forever $ p' >~ (do >>> a <- await >>> yield a ) >>> >>> -- Monad morphism distributes over bind >>> = forever $ do >>> a <- p' >~ await >>> p' >~ yield a >>> >>> -- p' >~ await = p' >>> = forever $ do >>> a <- p' >>> p' >~ yield a >>> >>> -- p' >~ yield a = yield a >>> = forever $ do >>> a <- p' >>> yield a >>> >>> It may also help to read the "Consumers" section of the `pipes` tutorial: >>> >>> >>> http://hackage.haskell.org/package/pipes-4.1.1/docs/Pipes-Tutorial.html#g:4 >>> >>> >>> Consumer' is just Consumer, but with the output not technically >>> "closed" off for good (just effectively), right? And how does (>~ cat) >>> turn it into a Pipe? >>> >>> Thank you again! >>> >>> Justin >>> >>> On Friday, May 2, 2014 4:06:07 PM UTC-7, Gabriel Gonzalez wrote: >>>> >>>> This is the perfect kind of question to post to the mailing list! >>>> >>>> I will go down the two programs and make minor comments and then review >>>> their overall structure. >>>> >>>> -- encode.hs >>>> >>>> * Delete `runEffect`. It's not doing anything. The reason that it >>>> still type-checked was because your base monad was polymorphic over >>>> `MonadIO`, so it let you accidentally insert an additional `Pipe` layer >>>> (which was not doing anything). As a side note, I think I made a mistake >>>> by parametrizing the `Pipes.Prelude` utilities over `MonadIO` (I prefer >>>> using `hoist` now), but I don't want to make a breaking change to fix it. >>>> >>>> * Good use of `withFile` instead of `pipes-safe`. I feel like too many >>>> people unnecessarily use `pipes-safe` when `withFile` suffices. >>>> >>>> * Use `view Pipes.ByteString.pack p` instead of `p >-> PP.map >>>> B.singleton`. It will group your Word8's into a more efficient chunk >>>> size. Your current formulation will call a separate write command for >>>> every single byte, which is very inefficient. >>>> >>>> * For the reverse direction (i.e. `bytes`), you can either: >>>> >>>> A) Use `view (from Pipes.ByteString.pack)`, but that requires a `lens` >>>> dependency (which I think is not good). I plan to fix that by providing >>>> an >>>> `unpack` lens in an upcoming `pipes-bytestring` release. I created an >>>> issue for this: >>>> >>>> https://github.com/Gabriel439/Haskell-Pipes-ByteString-Library/issues/36 >>>> >>>> B) Use `mapFoldable`: >>>> >>>> bytes = Pipes.Prelude.mapFoldable BS.unpack >>>> >>>> That's much more efficient. The problem with your `bytes` function is >>>> that it uses `foldl`, which triggers a bunch of left-associated binds, >>>> generating quadratic time complexity in the number of bytes: >>>> >>>> ((((return ()) >> yield byte1) >> yield byte2) >> yield byte3 >>>> >>>> `mapFoldable`, on the other hand, is implemented in terms of `each`, >>>> which uses a right-fold like this: >>>> >>>> each = Data.Foldable.foldr (\a p -> yield a >> p) (return ()) >>>> >>>> ... which triggers build/fold fusion and also gives linear time >>>> complexity: >>>> >>>> yield byte1 >> (yield byte2 >> (yield byte3 >> return ())) >>>> >>>> * If you're willing to skip the error message, you can shorten >>>> `encodeByte` to: >>>> >>>> encodeByte t = for cat $ \b -> each (b `M.lookup` t) >>>> >>>> ... which is the same thing as: >>>> >>>> encodeByte t = Pipes.Prelude.mapFoldable (`M.lookup` t) >>>> >>>> * I should probably provide a function that transforms `Parser`s to >>>> functions between `Producer`s to simplify your `dirsBytes` code. I also >>>> find myself writing that same pattern way too many times. I just created >>>> an issue to remind myself to do this: >>>> >>>> https://github.com/Gabriel439/Haskell-Pipes-Parse-Library/issues/28 >>>> >>>> -- decode.hs >>>> >>>> * Is there any reason why you `drain` unused input using `limit` >>>> instead of just using `take` by itself? >>>> >>>> * Same thing as `encode`.hs: try using `Pipes.ByteString.pack` and >>>> `Pipes.Prelude.mapFoldable Data.ByteString.unpack` for much greater >>>> efficiency translating between `Word8`s and `ByteString`s >>>> >>>> * You can make the code for `searchPT` more reusable by first defining >>>> a `Consumer'` (note the prime!) that produces a single `Direction`, like >>>> this: >>>> >>>> searchPT :: forall m. Monad m => PreTree Word8 -> Consumer' >>>> Direction m Word8 >>>> searchPT pt0 = go pt0 >>>> where >>>> go :: PreTree Word8 -> Consumer Direction m Word8 >>>> go (PTLeaf x ) = return x >>>> go (PTNode pt1 pt2) = do >>>> dir <- await >>>> go $ case dir of >>>> DLeft -> pt1 >>>> DRight -> pt2 >>>> >>>> ... and then you can optionally upgrade that to a `Pipe` like this: >>>> >>>> searchPT pt >~ cat :: Pipe Direction Word8 m r >>>> >>>> That decouples the logic for parsing one direction from the logic for >>>> looping. >>>> >>>> * Also, there's nothing `Word8`-specific about your `searchPT` >>>> function. Consider generalizing the type to any value. >>>> >>>> * You can simplify the implementation of `dirs` using `mapFoldable`: >>>> >>>> dirs = Pipes.Prelude.mapFoldable byteToDirs >>>> >>>> Overall the architecture of your program looks correct. I don't see >>>> any obvious non-idiomatic things that you are doing. >>>> >>>> On 5/2/14, 2:43 AM, Justin Le wrote: >>>> >>>> Hi pipes people; >>>> >>>> I really don't know too much about pipes, but an entire section in a >>>> project tutorial I am writing is going to be dedicated to hooking up all >>>> of >>>> the pipes plumbing together. Seeing as this might also be possibly used >>>> as >>>> a pipes tutorial, I just wanted to make sure that my pipes code is >>>> idiomatic/not awful/not going to set back your progress by generations. >>>> Does anyone mind maybe giving it a quick look over? :) I would really >>>> appreciate it, and credit will be given where deserved :) I hope it is >>>> not >>>> too imposing for me to ask! >>>> >>>> It's actually a pair of programs --- a Huffman compression encoder >>>> and decoder. >>>> >>>> The encoder: >>>> https://github.com/mstksg/inCode/blob/master/code-samples/huffman/encode.hs >>>> The decoer: >>>> https://github.com/mstksg/inCode/blob/master/code-samples/huffman/decode.hs >>>> >>>> I tried my best to abstract away the actual mechanisms of the huffman >>>> logic where I could; it does peak in at some times, but the comments >>>> should >>>> give you a general high-level idea of what each function is trying to do. >>>> For reference, the series itself explaining the logic is hosted at >>>> http://blog.jle.im/entries/series/+huffman-compression >>>> >>>> I am pretty sure that the code gives away my unfamiliarity :) >>>> >>>> Thank you all! >>>> Justin >>>> -- >>>> You received this message because you are subscribed to the Google >>>> Groups "Haskell Pipes" group. >>>> To unsubscribe from this group and stop receiving emails from it, send >>>> an email to [email protected]. >>>> To post to this group, send email to [email protected]. >>>> >>>> >>>> -- >>> You received this message because you are subscribed to the Google >>> Groups "Haskell Pipes" group. >>> To unsubscribe from this group and stop receiving emails from it, send >>> an email to [email protected]. >>> To post to this group, send email to [email protected]. >>> >>> >>> -- >> You received this message because you are subscribed to the Google Groups >> "Haskell Pipes" group. >> To unsubscribe from this group and stop receiving emails from it, send an >> email to [email protected]. >> To post to this group, send email to [email protected]. >> >> >> -- > You received this message because you are subscribed to the Google Groups > "Haskell Pipes" group. > To unsubscribe from this group and stop receiving emails from it, send an > email to [email protected] <javascript:>. > To post to this group, send email to [email protected] > <javascript:>. > > > -- You received this message because you are subscribed to the Google Groups "Haskell Pipes" group. To unsubscribe from this group and stop receiving emails from it, send an email to [email protected]. To post to this group, send email to [email protected].
