The most efficient way to consume it line by line is (paradoxically)
not to consume it line by line.
The `pipes` ecosystem does not expose an efficient way of concatenating
chunks into lines because `pipes` tries to guarantee an upper bound on
memory usage and there is no upper bound on how large a line can be.
To get an example of how to do line-based processing idiomatically in
`pipes`, read this post about processing large lines in constant space:
http://www.haskellforall.com/2013/09/perfect-streaming-using-pipes-bytestring.html
... and also read the `pipes-group` tutorial after that:
http://hackage.haskell.org/package/pipes-group-1.0.2/docs/Pipes-Group-Tutorial.html
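As a rough illustration of that style, here is a minimal sketch that
counts lines without ever materializing a whole line. It assumes the
`pipes`, `pipes-bytestring`, `pipes-group` and `lens-family-core`
packages; `countLines` is a name made up for this example, not
something the libraries provide. Each line is treated as a sub-stream
of chunks and folded away as the chunks go by:

```haskell
import Data.ByteString (ByteString)
import Lens.Family (view)
import Pipes
import qualified Pipes.ByteString as BS
import qualified Pipes.Group as PG
import qualified Pipes.Prelude as P

-- Hypothetical helper (not part of any library): count the lines of a
-- byte stream. 'view BS.lines' splits the stream into a 'FreeT' of
-- per-line producers, and 'PG.folds' reduces each line to a single ()
-- while discarding its chunks, so no line is ever concatenated.
countLines :: Monad m => Producer ByteString m () -> m Int
countLines p = P.length (PG.folds step () id (view BS.lines p))
  where
    -- Ignore every chunk; only the existence of the line matters here.
    step _ _ = ()

main :: IO ()
main = countLines BS.stdin >>= print
```

Swapping the fold passed to `PG.folds` (for example, summing chunk
lengths instead of ignoring them) changes what is computed per line
while keeping the same streaming shape.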
I can point you to more specific functions if you describe what you want
to do with each line.
On 8/20/2015 4:43 AM, Alexey Raga wrote:
Hi,
I have a huge file (~40M rows) in a custom format where each line
represents a data type, so I want to process it line-by-line.
This code runs very fast (~20 seconds):
import Pipes.ByteString as BS
runEffect $ BS.stdin >-> BS.stdout
while this one runs much slower (>2 minutes to execute):
bslines :: (MonadIO m) => Producer ByteString m ()
bslines = purely folds mconcat . view BS.lines $ BS.stdin
main :: IO ()
main = runEffect $ bslines >-> BS.stdout
Why does this happen? And what would be the fastest way to consume a
file line-by-line?
For comparison, consuming the same file line-by-line in Node.js takes
~40 seconds. How can similar results be achieved?
Regards,
Alexey.
--
You received this message because you are subscribed to the Google
Groups "Haskell Pipes" group.
To unsubscribe from this group and stop receiving emails from it, send
an email to haskell-pipes+unsubscr...@googlegroups.com
<mailto:haskell-pipes+unsubscr...@googlegroups.com>.
To post to this group, send email to haskell-pipes@googlegroups.com
<mailto:haskell-pipes@googlegroups.com>.