Re: [haskell-pipes] Producers, Purity and Resumability

Gabriel Gonzalez Thu, 30 Jan 2014 18:21:00 -0800

Now the correct API is that the user should supply a `Parser ByteString IO ()`, to ensure that they don't need to keep track of unused input. For example, if the user just wanted to save the first 10 lines of a response body to some handle, they could write the following `example` parser:


    import Control.Monad (join)
    import Data.ByteString (ByteString)
    import Lens.Family2
    import Lens.Family2.Unchecked (iso)
    import Lens.Family2.State.Strict (zoom)
    import Pipes
    import Pipes.Parse
    import qualified Pipes.ByteString as PB
    import System.IO (Handle)

    -- This reminds me that I need to provide a function to convert 'Consumer's to
    -- their equivalent 'Parser's
    toHandle' :: Handle -> Parser ByteString IO ()
    toHandle' h = StateT $ \p -> do
        r <- runEffect (p >-> PB.toHandle h)
        return ((), return r)

    -- Here's an example for how to define a custom parsing lens
    maxLines
        :: Monad m
        => Int
        -> Lens' (Producer ByteString m x)
                 (Producer ByteString m (Producer ByteString m x))
    maxLines n = iso (cut n) join
      where
        cut 0 p = return p
        cut n p = p ^. PB.line >>= cut (n - 1)

    example :: Handle -> Parser ByteString IO ()
    example h = zoom (maxLines 10) (toHandle' h)

That will run in constant space and automatically handle unused input for the user.

Note that the implementation of `cut` from `maxLines` could be simplified further, but at the expense of complicating the type of `Pipes.ByteString.lines` (by generalizing that `Iso'` to a four-parameter `Iso`). I'm a little bit reluctant to complicate that type signature, though.


On 01/31/2014 04:33 AM, Jeremy Shaw wrote:

For an HTTP server, we want the user to be able to supply a handler
that optionally consumes the request body.

We could use (and I have even implemented) a system where you have a
Request type like:


data RequestF next = RequestF
     { requestHead   :: Request
     , requestBody   :: Producer ByteString IO next
     } deriving (Functor)

type Requests = FreeT RequestF IO ()

And in this system, the user supplied handler is responsible for
draining the current request body and returning the remainder of
the Request stream. But this seems to put a lot of trust and hassle on
the poor end user.

  -jeremy


On Thu, Jan 30, 2014 at 3:12 PM, Carter Schonwald
<[email protected]> wrote:

What's a good minimal example of this resumption problem?  Couldn't a
functional approach be that running a pipe till it returns a value also
returns a continuation? In the style of attoparsec and friends?
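
As a rough illustration of that continuation-returning style, here is a minimal sketch built on `next` from `Pipes`. The helper name `takeN` is hypothetical (it is not part of pipes); it pulls up to `n` elements and hands back the rest of the `Producer` as the "continuation":

```haskell
import Pipes
import qualified Pipes.Prelude as P

-- Hypothetical helper, not part of pipes: pull up to n elements and
-- return them together with the remainder of the Producer.
takeN :: Monad m => Int -> Producer a m r -> m ([a], Producer a m r)
takeN n p
    | n <= 0    = return ([], p)
    | otherwise = do
        step <- next p
        case step of
            -- The producer finished early; the remainder just returns r.
            Left r        -> return ([], return r)
            Right (x, p') -> do
                (xs, rest) <- takeN (n - 1) p'
                return (x : xs, rest)

demo :: IO ()
demo = do
    (firstFive, rest) <- takeN 5 (each [1 .. 10 :: Int])
    print firstFive
    -- The remainder picks up at 6 because we hold on to it explicitly.
    runEffect (rest >-> P.print)
```

The caller has to thread `rest` through by hand, which is the bookkeeping that `StateT` or an `IORef` would otherwise hide.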


On Thursday, January 30, 2014, Jeremy Shaw <[email protected]> wrote:

Yes -- io-streams forces everything to be done in the IO monad and uses
hidden IORefs. conduit also uses hidden IORefs for resumable streams.

But is that really the best choice?

- jeremy

On Thu, Jan 30, 2014 at 2:12 PM, Carter Schonwald
<[email protected]> wrote:

I think this precise issue is why the snap server http parser tooling
uses the io-streams lib!


On Thu, Jan 30, 2014 at 1:02 PM, Jeremy Shaw <[email protected]>
wrote:

I have been thinking about what it means to run a 'Producer'
twice. Specifically -- whether the Producer resumes where it left off
or not. I think that in general the behavior is undefined. I feel like
this has not been explicitly stated much -- so I am going to say it
now. In some sense, it should be obvious -- but when peering through
the haze of Pipes, StateT, and IO, the simple things can get lost.

Consider two different cases:

  1. a producer that produces values from a pure list

  2. a producer that produces values from a network connection


If we run the first producer twice we will get the same answer each
time. If we run the second producer twice -- we will likely get
different results -- depending on what data is available from the
network stream.

Now -- that is not entirely surprising -- one value is pure and one is
based on IO. So that is no different than calling a normal pure
function versus a normal IO function.

But -- I think it can be easy to forget that when writing pipes
code. Imagine we write some pipes code that processes a network stream
-- and it relies on the fact that the network Producer automatically
resumes from where it left off.

Now, let's pretend we want to test our code. So we create a pure
Producer that produces the same bytestring that the network pipe was
producing. Alas, our code will not work because the pure Producer does
not automatically resume when called multiple times.

I think this means that we must assume, by default, that the Producer
does not have resumable behavior. If we want to write code that relies
on the resumable behavior -- then we must explicitly ensure that it
happens.

In pipes-parse the resumability is handled by storing the 'Producer'
in 'StateT'.
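
A minimal sketch of the 'StateT' approach, assuming the pipes-parse 3.x API in which `draw` returns a `Maybe`. The names `drawN` and `take5Twice` are made up for illustration; the state is spelled out as `StateT (Producer a m x) m`, which is what the `Parser` synonym expands to:

```haskell
import Control.Monad (replicateM)
import Control.Monad.Trans.State.Strict (StateT, evalStateT)
import Data.Maybe (catMaybes)
import Pipes
import Pipes.Parse (draw)

-- Hypothetical helper: draw up to n elements from the Producer held in
-- the state. 'draw' yields Nothing once the Producer is exhausted.
drawN :: Monad m => Int -> StateT (Producer a m x) m [a]
drawN n = fmap catMaybes (replicateM n draw)

-- Take five elements twice. The second call resumes where the first
-- stopped, because the leftover Producer is carried in the state.
take5Twice :: Monad m => StateT (Producer a m x) m ([a], [a])
take5Twice = do
    xs <- drawN 5
    ys <- drawN 5
    return (xs, ys)

demo :: IO ()
demo = evalStateT take5Twice (each [1 .. 10 :: Int]) >>= print
```

Because the remainder lives in the state rather than in the Producer itself, this behaves the same way for a pure Producer as for one backed by IO.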

Another alternative is to use an 'IORef'. I have an example of the
'IORef' solution below.

module Main where
import Data.IORef             (IORef(..), newIORef, readIORef, writeIORef)
import           Pipes
import qualified Pipes.Prelude as P

Here is our pure Producer:

pure10 :: (Monad m) => Producer Int m ()
pure10 = mapM_ yield [1..10]

And here is a function which uses a Producer twice.

take5_twice :: Show a => Producer a IO () -> IO ()
take5_twice p =
     do runEffect $ p >-> P.take 5 >-> P.print
        putStrLn "<<Intermission>>"
        runEffect $ p >-> P.take 5 >-> P.print

Note that we have limited ability to reason about the results since we do
not know if the 'Producer' is resumable or not.

If we run 'take5_twice' using our pure Producer:

pure10_test :: IO ()
pure10_test =
     take5_twice pure10

it will restart from 1 each time:

     > pure10_test
     1
     2
     3
     4
     5
     <<Intermission>>
     1
     2
     3
     4
     5

Here is a (not very generalized) function that uses an 'IORef' to
store the current position in the 'Producer' -- similar to how
'StateT' works:
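
One way such a function might look is sketched below. This is an assumption-laden illustration, not the code from the original message: the name `resumable` is hypothetical, and it only relies on `next` from `Pipes` plus the standard `Data.IORef` operations.

```haskell
import Data.IORef (newIORef, readIORef, writeIORef)
import Pipes
import qualified Pipes.Prelude as P

-- Hypothetical sketch: wrap a Producer so that its remaining input is
-- kept in an IORef. Every run of the returned Producer continues from
-- wherever the previous run stopped.
resumable :: Producer a IO () -> IO (Producer a IO ())
resumable p0 = do
    ref <- newIORef p0
    let go = do
            p    <- lift (readIORef ref)
            step <- lift (next p)
            case step of
                Left ()       -> lift (writeIORef ref (return ()))
                Right (a, p') -> do
                    -- Save the remainder before yielding, so the position
                    -- survives even if downstream stops after this element.
                    lift (writeIORef ref p')
                    yield a
                    go
    return go

demo :: IO ()
demo = do
    p <- resumable (each [1 .. 10 :: Int])
    runEffect $ p >-> P.take 5 >-> P.print
    putStrLn "<<Intermission>>"
    runEffect $ p >-> P.take 5 >-> P.print
```

With this wrapper the second `P.take 5` continues at 6 instead of restarting from 1, at the cost of tying the Producer to IO and hiding the state in a mutable reference.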

--
You received this message because you are subscribed to the Google Groups
"Haskell Pipes" group.
To unsubscribe from this group and stop receiving emails from it, send an
email to [email protected].
To post to this group, send email to [email protected].

