I recently had reason to buffer up chunks of streamed data and coalesce them
into strict ByteStrings that are at least a given buffer size. Splitting up
chunks of data is not allowed.

The data flow looks like this (reading top to bottom):

   Big File      (Lazy ByteString)
                ▼
   Parsed Chunks (Int, Builder)
                ▼
   ByteStrings   (Strict ByteString)
                ▼
   Network       (sent via ZeroMQ)

Which led to this function:

  -- Take a producer of (Int, Builder), where Int is the number of bytes in the
  -- builder, and produce strict ByteStrings of at least the buffer size.

  chunkBuilder :: Monad m
               => Producer (Int, Builder) m r
               -> Producer S.ByteString m r

Which uses, internally:

  builderChunks :: Monad m
                => Int
                -- ^ The size to split a stream of builders at
                -> Producer (Int, Builder) m r
                -- ^ The input producer
                -> FreeT (Producer Builder m) m r
                -- ^ The FreeT delimited chunks of that producer, split into
                -- the desired chunk length

So it just splits the Producer into groups and folds each group back together
with mappend. This works exactly as expected, which is lovely, but it is quite
verbose and possibly over-complicated.
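
To make the fold step concrete, here is a minimal sketch of how each group can
be collapsed back into one strict ByteString. It assumes Builder comes from
Data.ByteString.Builder and that folds is imported from Pipes.Group;
concatGroups is a hypothetical name used only for illustration, not what the
linked code calls it.

```haskell
import           Control.Monad.Trans.Free (FreeT)
import qualified Data.ByteString          as S
import           Data.ByteString.Builder  (Builder, toLazyByteString)
import qualified Data.ByteString.Lazy     as L
import           Pipes                    (Producer)
import           Pipes.Group              (folds)

-- Hypothetical helper: mappend every Builder in a FreeT-delimited group
-- together, then render the result as a single strict ByteString.
concatGroups :: Monad m
             => FreeT (Producer Builder m) m r
             -> Producer S.ByteString m r
concatGroups = folds mappend mempty (L.toStrict . toLazyByteString)
```

With something like this, chunkBuilder would be roughly concatGroups composed
with builderChunks applied to the buffer size.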

The actual implementation can be found here:
https://github.com/anchor/marquise/blob/master/lib/Marquise/Server.hs

By the way, can anyone tell me how to get the type annotation on "go" to
typecheck?

My question: It seems like builderChunks could be re-usable. Possibly
             implemented as a lens similar to chunksOf?

  -- | someAwesomeName is a lens that splits a Producer into a FreeT group of
  --  Producers of at least the minimum size provided.
  someAwesomeName :: Monad m
                  => Int
                  -- ^ Minimum size
                  -> Lens' (Producer (Int, a) m x) (FreeT (Producer a m) m x)

Another question: Would it be reasonable to expose the splitting part of the
lenses in pipes-parse as separate functions? I.e.

  chunksOf
      :: Monad m => Int -> Lens' (Producer a m x) (FreeT (Producer a m) m x)

  chunksOf' :: Monad m => Int -> Producer a m x -> FreeT (Producer a m) m x

This would be convenient for users wanting to avoid lens as a dependency.
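
As far as I can tell, the second form can be recovered from the first without
any lens library, because viewing a lens only needs Const from base. A minimal
sketch, assuming chunksOf is importable from Pipes.Group (adjust the import to
wherever your version exports it):

```haskell
import Control.Applicative      (Const (..))
import Control.Monad.Trans.Free (FreeT)
import Pipes                    (Producer)
import Pipes.Group              (chunksOf)

-- View the chunksOf lens by instantiating its functor to Const. This is
-- what a lens library's `view` does, so only the getter half is used.
chunksOf' :: Monad m => Int -> Producer a m x -> FreeT (Producer a m) m x
chunksOf' n = getConst . chunksOf n Const
```

Having this exported directly would still be nicer than asking every user to
write the Const trick themselves.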

-- 
Christian Marie - Sparkly Code Princess
