Generally, `pipes` (or any other streaming library) is not designed for
random access and instead promotes going over the data in a single
pass. However, I'm still more than happy to brainstorm how to
efficiently implement this tool.

First off, I **highly** recommend you watch this lecture by Edward Kmett:
http://www.youtube.com/watch?v=uA0Z7_4J7u8
... and read this paper he mentions in the talk:
http://www.di.unipi.it/~ottavian/files/semi_index_cikm.pdf

They describe how you can efficiently index and browse very large and
heterogeneous data sets on the fly in tiny space. This makes it
possible to store logs blockwise-compressed and still jump to specific
lines very rapidly. In fact, the use case you are proposing is much
simpler than the one covered in the talk and paper, which deal with
indexing into tree-like data structures.
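
To make the simpler case concrete, here is a rough sketch of my own, not the semi-index from the talk or paper. It assumes an uncompressed file with `\n` line endings, makes one streaming pass to record the byte offset where each line starts, and then seeks straight to a requested line. The names `buildIndex` and `readLineAt` are made up for this illustration, and it only needs `base` and `bytestring`:

```haskell
module Main where

import qualified Data.ByteString       as B
import qualified Data.ByteString.Char8 as BC
import           System.Environment    (getArgs)
import           System.IO

-- | One pass over the file in fixed-size chunks, collecting the byte
-- offset at which every line starts. (Hypothetical helper.)
-- A plain list is used for brevity; a real tool would likely want a
-- more compact structure for multi-gigabyte files.
buildIndex :: FilePath -> IO [Integer]
buildIndex path = withBinaryFile path ReadMode (go 0 [0])
  where
    chunkSize = 64 * 1024

    go :: Integer -> [Integer] -> Handle -> IO [Integer]
    go base acc h = do
      chunk <- B.hGetSome h chunkSize
      if B.null chunk
        -- 'base' now equals the file size. Drop a start offset that
        -- sits exactly at end of file (trailing newline, or empty file).
        then return (reverse (dropWhile (>= base) acc))
        else do
          -- Byte 10 is '\n'; the next line starts one byte after it.
          let starts = [ base + fromIntegral i + 1
                       | i <- B.elemIndices 10 chunk ]
          go (base + fromIntegral (B.length chunk))
             (reverse starts ++ acc)
             h

-- | Seek to the start of the n-th line (0-based) and read just that
-- line. Returns Nothing when n is out of range. (Hypothetical helper.)
readLineAt :: Handle -> [Integer] -> Int -> IO (Maybe B.ByteString)
readLineAt h index n
  | n < 0     = return Nothing
  | otherwise =
      case drop n index of
        []        -> return Nothing
        (off : _) -> do
          hSeek h AbsoluteSeek off
          Just <$> BC.hGetLine h

main :: IO ()
main = do
  args <- getArgs
  case args of
    [path, nStr] | [(n, "")] <- reads nStr -> do
      index  <- buildIndex path
      result <- withBinaryFile path ReadMode $ \h -> readLineAt h index n
      case result of
        Just line -> BC.putStrLn line
        Nothing   -> hPutStrLn stderr "line number out of range"
    _ -> hPutStrLn stderr "usage: lineat FILE LINE_NUMBER"
```

The index is built once, so every later jump costs a single seek and a short read rather than a rescan of the file. The approach in the talk goes further by keeping the index itself very small and by working over blockwise-compressed data, which this sketch does not attempt.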

If you would like to try out the approach described in the talk, let me
know and I can help you flesh it out a bit more. I've already been
experimenting along these lines for similar reasons (browsing Twitter
logs, which are huge).

On 6/29/14, 6:57 AM, Jonathan Johnsson wrote:
Hi all! I intend to try to build a utility similar to less, using
Haskell. I want it to be efficient and able to jump around in
multi-gigabyte text files (logs), and have some simple parse, filter
and highlighting functionality, that I can tailor to my needs. From
what I have read about Pipes, it sounds like it would be suitable as a
building block. I hope to learn and understand Haskell and Pipes
better in the process.

I have a decent grasp of how Haskell works and I have read through the
Pipes tutorial and numerous blog posts, but I haven't really coded
that much before, except for some toy examples. I just wonder if you,
being much more knowledgeable than I in this subject, could give me
any starting pointers on which packages I should check out and what I
should think about. Right now my idea is to start out by building a
program that runs through a file without interaction, and try to
augment it to support interactivity later.

Thank you for any help! :)
--
You received this message because you are subscribed to the Google
Groups "Haskell Pipes" group.
To unsubscribe from this group and stop receiving emails from it, send
an email to [email protected]
<mailto:[email protected]>.
To post to this group, send email to [email protected]
<mailto:[email protected]>.