Yeah, leftovers handling is a little bit more subtle. First, read this post if you haven’t already, particularly the section titled “Leftovers”:
http://www.haskellforall.com/2014/02/pipes-parse-30-lens-based-parsing.html

I can also add a more sophisticated example in addition to the one given in the above post. Suppose that you have a pipe named `encoder` that encodes `Text` into `ByteString` and has this rough type:

    encoder :: Pipe Text ByteString ()

… which you can freely modify to add in any leftover functionality you want. Now suppose we hook that up in this configuration:

    consumesBytes :: Pipe ByteString Out ()

    consumesText :: Pipe Text Out ()

    example :: Pipe Text Out ()
    example = do
        encoder >-> consumesBytes
        consumesText

Now think about what should happen if `consumesBytes` terminates while `encoder` is holding onto a non-empty queue of leftovers that haven't been used up by `consumesBytes`.

Well, first off, what type of leftovers would `encoder` be holding onto? In this case the leftovers will be `ByteString`s that were returned to `encoder` by the `consumesBytes` `Pipe`. There are two possible things that `encoder` could do with those leftovers upon termination:

* (A) Discard the leftovers. However, that means that `consumesText` will begin at the wrong position in the stream
* (B) Transform the `ByteString` leftovers into `Text` leftovers and push those further upstream before terminating

Option (B) sounds reasonable at first, except that there might not be a way to transform the `ByteString` leftovers into `Text` that can be pushed further upstream, for a couple of reasons:

* The encoding might not necessarily round-trip
* Even if the encoding *did* round-trip (like UTF-8), there is nothing that requires that `consumesBytes` consumes bytes only along Unicode character boundaries

To elaborate on the latter case, assume that `encoder` received a text chunk containing a single character: "⌘". If you UTF-8-encode that you get three bytes: "e2 8c 98". If `consumesBytes` only consumes the first byte (i.e. "e2") then `encoder` is now holding onto two bytes in its leftovers queue, "8c 98", and there's no longer a way to push those two bytes further upstream as `Text`, since they cannot be (correctly) decoded back into `Text`. Now, if `consumesBytes` terminates there is no legitimate way for `consumesText` to begin where `consumesBytes` left off.

> On Mar 4, 2016, at 10:44 AM, Tom Ellis <tom-lists-haskell-pipes-2...@jaguarpaw.co.uk> wrote:
>
> I never really grasped what leftovers are. I don't understand why it
> wouldn't suffice to have a "pushback pipe"
>
>     pushback :: Proxy a b (Either a b) b m r
>
> that allows you to push "unused" 'b's back into it, to be stored in a queue.
> The next 'b's then extracted from the pushback pipe will be the ones most
> recently pushed in. If the queue is empty then we request a 'b' from the
> other end.
>
> Does this make no sense? Are leftovers much more subtle than I am
> realising?
>
> Thanks,
>
> Tom
>
> --
> You received this message because you are subscribed to the Google Groups "Haskell Pipes" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to haskell-pipes+unsubscr...@googlegroups.com.
> To post to this group, send email to haskell-pipes@googlegroups.com.
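
For anyone who wants to check the "⌘" byte-boundary case from my reply by hand, here is a minimal sketch. It does not use `pipes` or `pipes-parse` at all, only the `text` and `bytestring` packages. The names `chunk`, `encoded`, `consumed`, `leftover`, `hexBytes`, and `pushBack` are hypothetical helpers I made up for illustration; `pushBack` just stands in for the conversion that option (B) would need, under the assumption that the encoding is UTF-8.

```haskell
import qualified Data.ByteString as BS
import qualified Data.Text as T
import Data.Text.Encoding (decodeUtf8', encodeUtf8)
import Numeric (showHex)

-- The single-character text chunk from the example: U+2318, "⌘".
chunk :: T.Text
chunk = T.singleton '\x2318'

-- What a UTF-8 `encoder` would hand downstream for that chunk.
encoded :: BS.ByteString
encoded = encodeUtf8 chunk

-- Assumption: the downstream consumer used exactly one byte and
-- returned the rest as leftovers.
consumed, leftover :: BS.ByteString
(consumed, leftover) = BS.splitAt 1 encoded

-- Render each byte as lowercase hex so it can be compared by eye.
hexBytes :: BS.ByteString -> [String]
hexBytes = map (\w -> showHex w "") . BS.unpack

-- Hypothetical stand-in for option (B): try to turn byte leftovers
-- back into Text. Returns Nothing when the bytes are not valid UTF-8.
pushBack :: BS.ByteString -> Maybe T.Text
pushBack = either (const Nothing) Just . decodeUtf8'

main :: IO ()
main = do
  print (hexBytes encoded)   -- all bytes of the encoded chunk
  print (hexBytes consumed)  -- the byte the consumer used
  print (hexBytes leftover)  -- the bytes stuck in the leftovers queue
  print (pushBack leftover)  -- can the leftovers go back upstream?
  print (pushBack encoded)   -- the untouched chunk does round-trip
```

The point of the sketch is only that `pushBack leftover` has no `Text` to return once the split falls inside a character, while the same function succeeds on the untouched `encoded` bytes. A real `encoder` would face that same failure at termination time.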