Re: [haskell-pipes] folding over groups?

Gabriel Gonzalez Sun, 16 Aug 2015 07:50:51 -0700

This is what the `pipes-group` library was designed to handle: you can split a stream into groups without having to fully materialize each group. If you are new to `pipes-group`, you should check out the tutorial here:


http://hackage.haskell.org/package/pipes-group-1.0.2/docs/Pipes-Group-Tutorial.html


... but I'll summarize the parts relevant to your problem.

First, you need a way to split `theFiles` into groups of 20 elements. The relevant utility from `pipes-group` is the `chunksOf` lens, which you would use like this:


    view (chunksOf 20) theFiles :: FreeT (Producer ByteString IO) IO ()

That creates a "linked list" of `Producer`s, where each `Producer` contains 20 `ByteString`s except for the last one, which contains up to 20 elements.
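To make the grouping concrete, here is a minimal sketch that splits a small stream of `Int`s into chunks of 3 and collects each chunk into a list. The helper name `groupsOf` is hypothetical, `view` is assumed to come from `Lens.Family` (the `lens-family-core` package), and collecting each group into a list is done only so the result can be printed; it gives up the streaming benefit described above.

```haskell
import Lens.Family (view)
import Pipes (Producer, each)
import Pipes.Group (chunksOf, folds)
import qualified Pipes.Prelude as P

-- Hypothetical helper: turn a stream of elements into a stream of lists,
-- one list per chunk. The fold builds a difference list so that elements
-- keep their original order.
groupsOf :: Monad m => Int -> Producer a m () -> Producer [a] m ()
groupsOf n = folds step id done . view (chunksOf n)
  where
    step diff a = diff . (a :)
    done diff   = diff []

main :: IO ()
main = do
    groups <- P.toListM (groupsOf 3 (each [1 .. 7 :: Int]))
    print groups
```

Running this should print `[[1,2,3],[4,5,6],[7]]`: two full chunks followed by a shorter final chunk, with no empty trailing group.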

Then you need a way to reduce each `Producer` to write out to a single file. The most direct way to do this is to just explicitly recurse over the linked list of `Producer`s to extract one `Producer` at a time. You can "pattern match" on the head of the "linked list" by using the `runFreeT` function:


    import Control.Monad.Trans.Free (FreeF (..), FreeT, runFreeT)
    import Data.ByteString (ByteString)
    import Pipes
    import System.IO (IOMode (WriteMode), withFile)

    loop :: Int -> FreeT (Producer ByteString IO) IO () -> IO ()
    loop n f = do
        x <- runFreeT f
        case x of
            -- No more `Producer`s left, we're done
            -- Think of this case as analogous to `Nil`
            Pure () -> return ()

            -- Found a `Producer`, process it
            -- Think of this case as analogous to `Cons`
            -- The `Producer`'s return value is the rest of the "linked list"
            Free p -> do
                -- p :: Producer ByteString IO (FreeT (Producer ByteString IO) IO ())
                -- The file is opened in `WriteMode` because it receives output
                f' <- withFile ("output" <> show n) WriteMode (\handle ->
                    -- `runEffect` runs the group and returns the rest of the list
                    runEffect (for p (\bytestring ->
                        liftIO (processFile handle bytestring))))

                -- f' :: FreeT (Producer ByteString IO) IO ()
                loop (n + 1) f'

... and once you have that then the final solution is:

    loop 0 (view (chunksOf 20) theFiles)

There are some higher-order combinators that you can find in `Control.Monad.Trans.Free` (from the `free` package) for folding a `FreeT` data structure like `iter` and `iterT` and there are also the `folds`/`foldsM` utilities from `pipes-group` as well. However, I think in this particular case the simplest approach is just explicit manual recursion.
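For comparison, here is a rough sketch of what the `foldsM` route can look like when the per-group result is a simple value rather than a numbered file. The name `byteTotals` and the sample input are made up for illustration, and `view` is again assumed to come from `Lens.Family`. Note that `foldsM` runs the same starting action for every group and does not hand the fold a group index, which is one reason manual recursion is convenient when each group needs its own file name.

```haskell
import Data.ByteString (ByteString)
import qualified Data.ByteString as BS
import qualified Data.ByteString.Char8 as BC
import Lens.Family (view)
import Pipes (Producer, each)
import Pipes.Group (chunksOf, foldsM)
import qualified Pipes.Prelude as P

-- Hypothetical helper: emit the total byte count of every group of `n`
-- ByteStrings, logging a line each time a group is finished.
byteTotals :: Int -> Producer ByteString IO () -> Producer Int IO ()
byteTotals n = foldsM step begin done . view (chunksOf n)
  where
    -- Runs at the start of every group
    begin = return 0
    -- Runs once per ByteString in the group
    step total bs = return (total + BS.length bs)
    -- Runs at the end of every group
    done total = do
        putStrLn ("finished a group of " ++ show total ++ " bytes")
        return total

main :: IO ()
main = do
    let input = each (map BC.pack ["ab", "cde", "f"])
    totals <- P.toListM (byteTotals 2 input)
    print totals
```

With groups of 2, the sample input should produce the totals `[5,1]`, with one log line printed per group before the final list.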


On 8/14/2015 4:49 PM, Erik Rantapaa wrote:

I'm trying to figure out how to process a large number of files (ByteStrings) in groups of, say 20, with the results of each group going to a different output file.
For instance, I have:

    theFiles :: Producer ByteString m r
    processFile :: Handle -> ByteString -> IO ()
and for the first 20 ByteStrings I want the handle passed to processFile to be opened to "output-1", and for the next 20 it should be opened to "output-2", etc.
I'm sure I could write it in a very mundane fashion using Pipes.foldM, but I'm sure there is a better way.
Thanks!

--
You received this message because you are subscribed to the Google Groups "Haskell Pipes" group. To unsubscribe from this group and stop receiving emails from it, send an email to haskell-pipes+unsubscr...@googlegroups.com. To post to this group, send email to haskell-pipes@googlegroups.com.

