This is what the `pipes-group` library was designed to handle: you can split a stream into groups without having to fully materialize each group. If you are new to `pipes-group`, you should check out the tutorial here:

http://hackage.haskell.org/package/pipes-group-1.0.2/docs/Pipes-Group-Tutorial.html

... but I'll summarize the parts relevant to your problem.

First, you need a way to split `theFiles` into groups of 20 elements. The relevant utility from `pipes-group` is the `chunksOf` lens, which you would use like this:

    view (chunksOf 20) theFiles :: FreeT (Producer ByteString IO) IO ()

That creates a "linked list" of `Producer`s. Each `Producer` contains 20 `ByteString`s, except for the last one, which contains up to 20.
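
To see the grouping on a small input, here is a minimal sketch that counts the elements in each group using `folds` from `pipes-group`. The name `groupSizes` and the sample input are made up for illustration, and I'm assuming `view` comes from `lens-family-core` (the one from `lens` should work as well):

```haskell
import Lens.Family (view)            -- from the lens-family-core package
import Pipes
import Pipes.Group (chunksOf, folds)
import qualified Pipes.Prelude as P

-- Hypothetical helper: emit the number of elements in each group of
-- (up to) n elements, without ever holding a whole group in memory.
groupSizes :: Monad m => Int -> Producer a m () -> Producer Int m ()
groupSizes n = folds (\count _ -> count + 1) 0 id . view (chunksOf n)

main :: IO ()
main = runEffect (groupSizes 3 (each [1 .. 7 :: Int]) >-> P.print)
-- Seven elements in groups of 3: prints 3, 3 and 1 on separate lines
```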

Then you need a way to reduce each `Producer` to write out to a single file. The most direct way to do this is to just explicitly recurse over the linked list of `Producer`s to extract one `Producer` at a time. You can "pattern match" on the head of the "linked list" by using the `runFreeT` function:

    loop :: Int -> FreeT (Producer ByteString IO) IO () -> IO ()
    loop n f = do
        x <- runFreeT f
        case x of
            -- No more `Producer`s left, we're done
            -- Think of this case as analogous to `Nil`
            Pure () -> return ()

            -- Found a `Producer`, process it
            -- Think of this case as analogous to `Cons`
            -- The `Producer`'s return value is the rest of the "linked list"
            Free p -> do
                -- p :: Producer ByteString IO (FreeT (Producer ByteString IO) IO ())
                -- Open the output file for writing and drain this group into it
                f' <- withFile ("output" <> show n) WriteMode (\handle ->
                    runEffect (for p (\bytestring -> liftIO (processFile handle bytestring))) )

                -- f' :: FreeT (Producer ByteString IO) IO ()
                loop (n + 1) f'

... and once you have that then the final solution is:

    loop 0 (view (chunksOf 20) theFiles)
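
For reference, here is a self-contained sketch of the whole thing that you can compile and run. The definitions of `theFiles` and `processFile` are placeholders I made up (45 small `ByteString`s, each written as one line), so substitute your own. The sketch opens each file in `WriteMode` and wraps the `for` loop in `runEffect` so that it type-checks as an `IO` action:

```haskell
import Control.Monad.Trans.Free (FreeF (..), FreeT, runFreeT)
import Data.ByteString (ByteString)
import qualified Data.ByteString.Char8 as B
import Lens.Family (view)            -- from the lens-family-core package
import Pipes
import Pipes.Group (chunksOf)
import System.IO (Handle, IOMode (WriteMode), withFile)

-- Placeholder input: 45 small ByteStrings standing in for the real files
theFiles :: Producer ByteString IO ()
theFiles = each (map (B.pack . show) [1 .. 45 :: Int])

-- Placeholder processing step: write each ByteString as one line
processFile :: Handle -> ByteString -> IO ()
processFile = B.hPutStrLn

loop :: Int -> FreeT (Producer ByteString IO) IO () -> IO ()
loop n f = do
    x <- runFreeT f
    case x of
        -- No groups left
        Pure () -> return ()
        -- Drain one group into its own file, then continue with the rest
        Free p -> do
            f' <- withFile ("output" <> show n) WriteMode (\handle ->
                runEffect (for p (\bytestring -> liftIO (processFile handle bytestring))))
            loop (n + 1) f'

main :: IO ()
main = loop 0 (view (chunksOf 20) theFiles)
```

With the placeholder input this should leave three files behind: `output0` and `output1` with 20 lines each, and `output2` with the remaining 5.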

There are also higher-order combinators for folding a `FreeT` data structure, such as `iter` and `iterT` from `Control.Monad.Trans.Free` (in the `free` package) and the `folds`/`foldsM` utilities from `pipes-group`. However, I think in this particular case the simplest approach is explicit manual recursion.
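
For comparison, here is a rough sketch of what the `iterT` route looks like on a toy example that just prints each group. `printGroups` and `step` are names I made up. Note that `iterT` gives you no running group index, which is part of why I would not reach for it when each group needs a differently numbered file:

```haskell
import Control.Monad (join)
import Control.Monad.Trans.Free (iterT)
import Lens.Family (view)            -- from the lens-family-core package
import Pipes
import Pipes.Group (chunksOf)

-- Hypothetical helper: print a separator before each group of (up to) n
-- elements, then print the elements of that group.
printGroups :: Show a => Int -> Producer a IO () -> IO ()
printGroups n = iterT step . view (chunksOf n)
  where
    -- The group's return value is the IO action that handles the
    -- remaining groups, so run the group and then `join` into the rest.
    step :: Show a => Producer a IO (IO ()) -> IO ()
    step p = do
        putStrLn "-- new group"
        join (runEffect (for p (lift . print)))

main :: IO ()
main = printGroups 2 (each [1, 2, 3 :: Int])
```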

On 8/14/2015 4:49 PM, Erik Rantapaa wrote:
I'm trying to figure out how to process a large number of files (ByteStrings) in groups of, say 20, with the results of each group going to a different output file.

For instance, I have:

    theFiles :: Producer ByteString m r
    processFile :: Handle -> ByteString -> IO ()

and for the first 20 ByteStrings I want the handle passed to processFile to be opened to "output-1", and for the next 20 it should be opened to "output-2", etc.

I'm sure I could write it in a very mundane fashion using Pipes.foldM, but I'm sure there is a better way.

Thanks!

--
You received this message because you are subscribed to the Google Groups "Haskell Pipes" group. To unsubscribe from this group and stop receiving emails from it, send an email to haskell-pipes+unsubscr...@googlegroups.com <mailto:haskell-pipes+unsubscr...@googlegroups.com>. To post to this group, send email to haskell-pipes@googlegroups.com <mailto:haskell-pipes@googlegroups.com>.
