This is what the `pipes-group` library was designed to handle: you can
split a stream into groups without having to fully materialize each
group. If you are new to `pipes-group`, you should check out the
tutorial here:

http://hackage.haskell.org/package/pipes-group-1.0.2/docs/Pipes-Group-Tutorial.html

... but I'll summarize the parts relevant to your problem.

First, you need a way to split `theFiles` into groups of 20 elements.
The relevant utility from `pipes-group` is the `chunksOf` lens, which
you would use like this:

    view (chunksOf 20) theFiles :: FreeT (Producer ByteString IO) IO ()

That creates a "linked list" of `Producer`s, each of which contains
20 `ByteString`s, except for the last one, which contains up to 20.
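
To see what `chunksOf` does on a small input, here is a minimal sketch
that groups a pure stream of `Int`s in threes and collects each group
into a list with `folds`. It assumes `view` comes from
`lens-family-core` (`Lens.Family`); the `view` from `Control.Lens`
should behave the same way. The name `groupsOfThree` and the sample
input are made up for illustration.

```haskell
import Lens.Family (view)  -- assumption: lens-family-core; Control.Lens also exports `view`
import Pipes (each)
import Pipes.Group (chunksOf, folds)
import qualified Pipes.Prelude as P

-- Split a pure stream into chunks of 3, then fold each chunk into a list.
-- The fold conses onto an accumulator and reverses it once per chunk.
groupsOfThree :: [Int] -> [[Int]]
groupsOfThree xs =
    P.toList (folds (flip (:)) [] reverse (view (chunksOf 3) (each xs)))

main :: IO ()
main = print (groupsOfThree [1 .. 7])
-- prints [[1,2,3],[4,5,6],[7]]
```

Collecting each group into a list materializes it, which is fine for a
demonstration but is exactly what a streaming consumer of the groups
avoids.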

Then you need a way to reduce each `Producer` by writing it out to a
single file. The most direct way to do this is to explicitly recurse
over the linked list of `Producer`s, extracting one `Producer` at a
time. You can "pattern match" on the head of the "linked list" by using
the `runFreeT` function:

    import Control.Monad.Trans.Free (FreeF(..), FreeT(..))
    import Data.ByteString (ByteString)
    import Data.Monoid ((<>))
    import Pipes
    import Pipes.Group (chunksOf)
    import System.IO

    loop :: Int -> FreeT (Producer ByteString IO) IO () -> IO ()
    loop n f = do
        x <- runFreeT f
        case x of
            -- No more `Producer`s left, we're done
            -- Think of this case as analogous to `Nil`
            Pure () -> return ()

            -- Found a `Producer`, process it
            -- Think of this case as analogous to `Cons`
            -- The `Producer`'s return value is the rest of the "linked list"
            Free p -> do
                -- p :: Producer ByteString IO (FreeT (Producer ByteString IO) IO ())
                f' <- withFile ("output" <> show n) WriteMode (\handle ->
                    runEffect (for p (\bytestring ->
                        liftIO (processFile handle bytestring) )) )
                -- f' :: FreeT (Producer ByteString IO) IO ()
                loop (n + 1) f'

... and once you have that, the final solution is:

    loop 0 (view (chunksOf 20) theFiles)

There are some higher-order combinators for folding a `FreeT` data
structure, such as `iter` and `iterT` in `Control.Monad.Trans.Free`
(from the `free` package), and there are also the `folds`/`foldsM`
utilities from `pipes-group`. However, I think in this particular case
the simplest approach is just explicit manual recursion.
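
For comparison, here is a rough sketch of how the same job might look
with `foldsM`. Since `foldsM` does not hand you the group's index, this
version keeps a counter in an `IORef`. The name `writeGroups`, its
`prefix` argument, and the `processFile` body are all stand-ins I made
up, and `view` is again assumed to come from `lens-family-core`. Note
that the handle is opened in the "begin" step and closed in the "done"
step with no bracketing, so an exception in between would leak it.

```haskell
import Data.ByteString (ByteString)
import qualified Data.ByteString as B
import Data.IORef (newIORef, readIORef, writeIORef)
import Lens.Family (view)  -- assumption: lens-family-core
import Pipes
import Pipes.Group (chunksOf, foldsM)
import System.IO (Handle, IOMode (WriteMode), hClose, openFile)

-- Hypothetical stand-in for the `processFile` from the question.
processFile :: Handle -> ByteString -> IO ()
processFile = B.hPut

-- Write every group of `size` elements to its own file:
-- prefix0, prefix1, and so on.
writeGroups :: FilePath -> Int -> Producer ByteString IO () -> IO ()
writeGroups prefix size source = do
    counter <- newIORef (0 :: Int)
    let begin = do                    -- runs once at the start of each group
            n <- readIORef counter
            writeIORef counter (n + 1)
            openFile (prefix ++ show n) WriteMode
        step h bs = processFile h bs >> return h
        done = hClose                 -- runs once at the end of each group
    -- `foldsM` yields one `()` per finished group; discard them.
    runEffect (for (foldsM step begin done (view (chunksOf size) source)) discard)

main :: IO ()
main = writeGroups "output" 2 (each (map B.pack [[97], [98], [99]]))
-- writes two chunks to "output0" and one chunk to "output1"
```

The extra counter and the lack of exception safety are the main reasons
the explicit loop with `withFile` comes out simpler here.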

On 8/14/2015 4:49 PM, Erik Rantapaa wrote:

I'm trying to figure out how to process a large number of files
(ByteStrings) in groups of, say 20, with the results of each group
going to a different output file.

For instance, I have:

    theFiles :: Producer ByteString m r
    processFile :: Handle -> ByteString -> IO ()

and for the first 20 ByteStrings I want the handle passed to
processFile to be opened to "output-1", and for the next 20 it should
be opened to "output-2", etc.

I'm sure I could write it in a very mundane fashion using Pipes.foldM,
but I'm sure there is a better way.

Thanks!
--
You received this message because you are subscribed to the Google
Groups "Haskell Pipes" group.
To unsubscribe from this group and stop receiving emails from it, send
an email to haskell-pipes+unsubscr...@googlegroups.com
<mailto:haskell-pipes+unsubscr...@googlegroups.com>.
To post to this group, send email to haskell-pipes@googlegroups.com
<mailto:haskell-pipes@googlegroups.com>.