As Michael mentioned, you want `Pipes.Prelude.fold'`.
If you know that your `FreeT` list only has one `Producer`, then you
should encode that in the type by keeping it as a `Producer`. Then the
question becomes how to fold that `Producer` directly instead of folding
it within the context of a `FreeT` list, and that's what the `fold'`
function does: it folds the producer and also preserves the return value:
    fold' :: Monad m => (x -> a -> x) -> x -> (x -> b) -> Producer a m r -> m (b, r)
... or combined with the `foldl` library it would be:
    purely fold' :: Monad m => Fold a b -> Producer a m r -> m (b, r)
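As a minimal sketch of both forms, assuming the `pipes` and `foldl` packages are installed: the `numbers` producer and its "done" return value are made up for illustration and are not from the thread.

```haskell
import qualified Control.Foldl as L
import Pipes
import qualified Pipes.Prelude as P

-- Hypothetical producer: yields three numbers, then returns a status string.
numbers :: Monad m => Producer Int m String
numbers = do
    each [1, 2, 3]
    return "done"

main :: IO ()
main = do
    -- Explicit step function, initial accumulator, and extraction function.
    (total, status) <- P.fold' (+) 0 id numbers
    print (total, status)        -- prints (6,"done")
    -- The same fold packaged as a `Fold` from the foldl library.
    (total', status') <- L.purely P.fold' L.sum numbers
    print (total', status')      -- prints (6,"done")
```

In both cases the producer's return value comes back next to the fold result, so nothing has to be discarded with `void` first.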
On 9/24/2015 8:47 PM, Dylan Tisdall wrote:
I have a quick follow-up question, actually; pipes-group defines:
Pipes.Group.folds
:: Monad m
=> (x -> a -> x)
-- ^ Step function
-> x
-- ^ Initial accumulator
-> (x -> b)
-- ^ Extraction function
-> FreeT (Producer a m) m r
-- ^
-> Producer b m r
If I'm reading this right, when my FreeT "list" consists of just one
Producer, then Pipes.Group.folds returns a Producer that yields one
output, and preserves the original Producer's return type, r, in the
returned Producer. This is in contrast to the similar function
    Pipes.Prelude.fold
        :: Monad m => (x -> a -> x) -> x -> (x -> b) -> Producer a m () -> m b
which only works on Producers with return type (). You note in the
documentation for Pipes.Prelude.fold that this type is required
because it may stop drawing from the Producer early, so you don't
necessarily get to compute the return value. I'm wondering if it's easy
to define a function
    foldToProducer
        :: Monad m => (x -> a -> x) -> x -> (x -> b) -> Producer a m r -> Producer b m r
that does what I think Pipes.Group.folds is doing, but without needing
all the FreeT bits as well. As an exercise, I tried to write
foldToProducer, but couldn't figure it out.
Thanks again,
Dylan
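One possible sketch of `foldToProducer`, built on `Pipes.Prelude.fold'`: run the fold in the base monad, yield the single result, and pass the original return value through. The `source` producer in `main` is hypothetical, and this is not claimed to be how `Pipes.Group.folds` is implemented internally.

```haskell
import Pipes
import qualified Pipes.Prelude as P

-- Fold a whole producer, yield the one result, and keep the return value.
foldToProducer
    :: Monad m
    => (x -> a -> x) -> x -> (x -> b)
    -> Producer a m r -> Producer b m r
foldToProducer step begin done p = do
    (b, r) <- lift (P.fold' step begin done p)
    yield b
    return r

main :: IO ()
main = do
    -- Hypothetical input: three numbers followed by a return value.
    let source = each [1, 2, 3 :: Int] >> return "done"
    (outputs, r) <- P.toListM' (foldToProducer (+) 0 id source)
    print (outputs, r)  -- prints ([6],"done")
```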
On Thursday, September 24, 2015 at 11:20:06 PM UTC-4, Dylan Tisdall
wrote:
Right, I wasn't recognizing that `Producer` was an instance of
`Functor` since it's an instance of `Monad`, so I wasn't even
looking there. Thanks again for all your help!
On Tuesday, September 22, 2015 at 6:56:49 PM UTC-4, Gabriel
Gonzalez wrote:
Use the `void` function from `Control.Monad` if you want to
erase the return type of a `Producer`:
void :: Functor f => f a -> f ()
void = fmap (\_ -> ())
I might even re-export this from `pipes` as a convenience
since this question comes up a lot.
Originally functions like `Pipes.Prelude.length` had a more
general type like this:
Pipes.Prelude.length :: Producer a m r -> m Int
... but then at the advice of others I restricted the type to
this:
Pipes.Prelude.length :: Producer a m () -> m Int
... so that the user would have to explicitly discard the
return value to signal that they were okay with ignoring that
data. This is similar in principle to the warning you get if
you turn on the `-Wall` flag that (among other things) warns
if you have an unused non-empty return value, like this:
    example = do
        getLine  -- Compiler warning because you didn't use the result
        ...
... and you usually have to explicitly ignore the value using
something like this syntax to indicate that you are
intentionally ignoring the value:
    example = do
        _ <- getLine
        ...
So the requirement to explicitly discard the value using
`void` is in the same spirit as that compiler warning.
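A small sketch of that pattern, assuming only the `pipes` package: the `letters` producer and its `Either String ()` return type are invented for illustration.

```haskell
import Control.Monad (void)
import Pipes
import qualified Pipes.Prelude as P

-- Hypothetical producer whose return value is not ().
letters :: Monad m => Producer Char m (Either String ())
letters = do
    each "abc"
    return (Right ())

main :: IO ()
main = do
    -- `void` replaces the return value with (), which is the type
    -- that `P.length` requires.
    n <- P.length (void letters)
    print n  -- prints 3
```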
On 9/22/15 3:50 PM, Dylan Tisdall wrote:
Hi Gabriel,
Thanks again for your help. That really clarified that I
should be using lift to keep everything inside the Producer
transformer. To make all the types work, I ended up with:
    type MDHAndScanLineProducer =
        P.Producer MDHAndScanLine IO
            (Either (P.DecodingError, P.Producer P.ByteString IO ()) ())

    measDatMDHScanLinePairs :: Handle -> MDHAndScanLineProducer
    measDatMDHScanLinePairs h = do
        (hLen, leftovers) <- lift $ P.runStateT (P.decodeGet getWord32le) p
        case (hLen :: Either P.DecodingError Word32) of
            Left err -> return $ Left (err, leftovers)
            Right len -> do
                lift (hSeek h AbsoluteSeek (fromIntegral len))
                view P.decoded p
      where
        p = PB.fromHandle h
This seems to work exactly as I'd hoped.
As a follow-up, I'm now wondering how to use this producer
and ignore its return type; effectively how to turn it into a
Producer MDHAndScanLine IO (). This seems to be necessary to
access many library functions. For example, I can't use
    Pipes.Prelude.length :: Monad m => Producer a m () -> m Int
directly on the output of measDatMDHScanLinePairs because the
return type doesn't match.
Thanks again for all your help as I get up to speed on this!
Dylan
On Monday, September 21, 2015 at 11:43:58 PM UTC-4, Gabriel
Gonzalez wrote:
You're definitely on the right track. The type I would
aim for would be something like this:
    example :: Handle -> Producer MDHAndScanLine IO
        (Either DecodingError (Producer ByteString IO ()))
Notice that this slightly differs from your type; I'm
merging the outer `IO (Either DecodingError ...)` into
the first `Producer` to simplify the type.
The implementation for that type would be very similar to
the one you wrote in your second e-mail:
    example :: Handle -> Producer MDHAndScanLine IO
        (Either DecodingError (Producer ByteString IO ()))
    example handle = do
        let p = Pipes.ByteString.fromHandle handle
        x <- lift (evalStateT (decodeGet getWord32le) p)
        case x of
            Left err -> return (Left err)
            Right len -> do
                lift (hSeek handle AbsoluteSeek (fromIntegral len))
                view decoded p
That will definitely run in constant memory, meaning that
it won't ever load more than one chunk of bytes at a time
(where a chunk is something like 32 kB, I think). You
can profile the heap if you want to verify this by
following these instructions:
https://downloads.haskell.org/~ghc/latest/docs/html/users_guide/prof-heap.html
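To see `view decoded` on its own, separate from the file handling, here is a minimal round-trip sketch. It assumes the `pipes-binary` and `lens-family-core` packages; the `encodedWords` producer is hypothetical and stands in for the bytes read from the handle.

```haskell
import Control.Monad (void)
import Data.ByteString (ByteString)
import Data.Word (Word32)
import Lens.Family (view)
import Pipes
import qualified Pipes.Binary as PBin
import qualified Pipes.Prelude as P

-- Hypothetical byte stream: three Word32 values encoded with their
-- `Binary` instance.
encodedWords :: Monad m => Producer ByteString m ()
encodedWords = do
    PBin.encode (1 :: Word32)
    PBin.encode (2 :: Word32)
    PBin.encode (3 :: Word32)

main :: IO ()
main = do
    -- `decoded` turns a stream of bytes into a stream of decoded values.
    -- Its return value carries either a decoding error plus the leftover
    -- bytes, or the original producer's return value.
    let decodedWords
            :: Producer Word32 IO
                 (Either (PBin.DecodingError, Producer ByteString IO ()) ())
        decodedWords = view PBin.decoded encodedWords
    xs <- P.toListM (void decodedWords)
    print xs  -- prints [1,2,3]
```

Each value is decoded as soon as enough bytes for it have arrived, so the whole input never needs to be held in memory at once.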
Also, to answer your other question, `pipes-attoparsec`
runs in constant memory. The difference between
`pipes-attoparsec` and `attoparsec` is that
`pipes-attoparsec` runs a separate parser for each
element in the stream, which is equivalent to
"committing" after each parsed element. That means that
it can only backtrack while parsing a single element in
the stream, but no further back. This is why
`pipes-attoparsec` runs in constant space over a large
file and why `attoparsec` does not, because `attoparsec`
backtracks indefinitely and `pipes-attoparsec` does not.
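The per-element behaviour can be sketched with `Pipes.Attoparsec.parsed`, assuming the `pipes-attoparsec` and `attoparsec` packages. The `chunks` input is hypothetical, and it is deliberately split in the middle of a number to show that an element may span chunk boundaries.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Control.Monad (void)
import Data.Attoparsec.ByteString.Char8 (decimal, endOfLine)
import Data.ByteString (ByteString)
import Pipes
import qualified Pipes.Attoparsec as PA
import qualified Pipes.Prelude as P

-- Hypothetical input: newline-terminated numbers arriving in two chunks,
-- with the number 23 split across the chunk boundary.
chunks :: Monad m => Producer ByteString m ()
chunks = each ["1\n2", "3\n4\n"]

main :: IO ()
main = do
    -- `parsed` runs the parser once per element and yields each result
    -- as soon as that element has been parsed.
    let numbers = PA.parsed (decimal <* endOfLine) chunks
    xs <- P.toListM (void numbers) :: IO [Int]
    print xs  -- prints [1,23,4]
```

Once an element has been yielded, the parser no longer needs the bytes it consumed, which is what keeps memory use bounded.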
On 9/21/15 12:10 PM, Dylan Tisdall wrote:
Following up on my last question, my next issue is also
probably a very straightforward use of pipes, but
I've managed to get tangled up going back and forth in
the packages' documentation.
I've got a file whose first 4 bytes give the offset into
the file of a series of binary data elements (called
MDHs in my case). Given a Handle to the start of such a
file, I want to:
1. read the first Word32 in the file, to retrieve the
offset;
2. skip the Handle to that offset; and
3. turn the rest of the file into a Producer MDH IO ()
Given that the file I'm reading may be large, I want to
make sure this process is going to run in constant
memory. I thought I could use pipes-attoparsec, but I
couldn't get straight whether it would need to read the
whole file before it could produce anything (as I
understand is normally the case with attoparsec).
So far I have the following, which isn't complete, but
at least does the skip and converts the remaining file
to a ByteString producer.
    handleToMDHs :: Handle -> IO (Either P.DecodingError (P.Producer P.ByteString IO ()))
    handleToMDHs h = do
        hLen <- P.evalStateT (P.decodeGet getWord32le) (PB.fromHandle h)
        case (hLen :: Either P.DecodingError Word32) of
            Left err -> return $ Left err
            Right len -> fmap Right (skipAndProceed h len)
      where
        skipAndProceed :: Handle -> Word32 -> IO (P.Producer P.ByteString IO ())
        skipAndProceed handle l = do
            (hSeek handle AbsoluteSeek) (fromIntegral l)
            return $ PB.fromHandle handle
My MDH type is an instance of Binary, so there is a get
method available. I'm wondering:
a) What's the right way to turn this into a Producer of
MDHs instead of a Producer of ByteStrings while
operating in constant memory?
b) Is there a more elegant way to deal with error
handling here? I'm not even dealing with possible
failure in hSeek, and I already think this looks pretty
messy. I'm not wedded to my function type being
    handleToMDHs :: Handle -> IO (Either P.DecodingError (P.Producer MDH IO ()))
I just am not sure how else to express the possibility
of failure in this kind of operation.
Thanks,
Dylan
--
You received this message because you are subscribed to
the Google Groups "Haskell Pipes" group.
To unsubscribe from this group and stop receiving emails
from it, send an email to haskell-pipe...@googlegroups.com.
To post to this group, send email to
haskel...@googlegroups.com.