The most efficient way to consume it line by line is to (paradoxically) not consume it line by line.

The `pipes` ecosystem does not expose an efficient way of concatenating chunks into lines because `pipes` tries to guarantee an upper bound on memory usage and there is no upper bound on how large a line can be.

To get an example of how to do line-based processing idiomatically in `pipes`, read this post about processing large lines in constant space:

http://www.haskellforall.com/2013/09/perfect-streaming-using-pipes-bytestring.html

... and also read the `pipes-group` tutorial after that:

http://hackage.haskell.org/package/pipes-group-1.0.2/docs/Pipes-Group-Tutorial.html
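As a rough illustration of the idiom those two links describe, here is a minimal sketch that folds over the chunks of each line instead of concatenating them first. It assumes a `pipes-bytestring` version where `lines` is a lens (as in your `view BS.lines` code) and `view` from `lens-family-core`; the name `lineLengths` is made up for this example and is not part of any library.

```haskell
import Pipes
import qualified Pipes.Prelude as P
import qualified Pipes.ByteString as PB
import qualified Pipes.Group as PG
import qualified Data.ByteString as B
import Lens.Family (view)

-- Hypothetical helper: emit the length of every line in the stream.
-- Each line arrives as a sub-stream of chunks, and the fold only keeps
-- a running Int, so no line is ever assembled in memory.
lineLengths :: Monad m => Producer B.ByteString m r -> Producer Int m r
lineLengths = PG.folds step 0 id . view PB.lines
  where
    -- Add the size of the current chunk to the running total for this line.
    step n chunk = n + B.length chunk

main :: IO ()
main = runEffect $ lineLengths PB.stdin >-> P.print
```

The same shape should work for other per-line reductions: swap the step function, initial value, and extraction function passed to `folds`. If you really do need each line as a single `ByteString`, you are back to buffering, and the memory bound is gone.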

I can point you to more specific functions if you describe what you want to do with each line.

On 8/20/2015 4:43 AM, Alexey Raga wrote:
Hi,

I have a huge file (~40M rows) in a custom format where each line represents a data type, so I want to process it line-by-line.

This code runs very fast (~20 seconds):

    import Pipes.ByteString as BS

    runEffect $ BS.stdin >-> BS.stdout


while this one runs much slower (>2 minutes to execute):

    bslines :: (MonadIO m) => Producer ByteString m ()
    bslines = purely folds mconcat . view BS.lines $ BS.stdin

    main :: IO ()
    main = runEffect $ bslines >-> BS.stdout

Why does this happen, and what would be the fastest way to consume a file line by line? For comparison, consuming the same file line by line in Node.js takes ~40 seconds. How can similar results be achieved?

Regards,
Alexey.
--
You received this message because you are subscribed to the Google Groups "Haskell Pipes" group. To unsubscribe from this group and stop receiving emails from it, send an email to haskell-pipes+unsubscr...@googlegroups.com. To post to this group, send email to haskell-pipes@googlegroups.com.
