Generally, `pipes` (or any other streaming library) is not designed for random access and instead promotes going over the data in a single pass. However, I'm still more than happy to brainstorm how to efficiently implement this tool.
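For the non-interactive first version you mention, a single pass might look roughly like the sketch below. It uses `Pipes.Prelude.fromHandle`, `Pipes.Prelude.filter`, and `Pipes.Prelude.stdoutLn` from the `pipes` package; the file name `example.log`, the `"ERROR"` pattern, and the `errorsOnly` helper are made up for illustration.

```haskell
import Data.List (isInfixOf)
import Pipes
import qualified Pipes.Prelude as P
import System.IO (IOMode (ReadMode), withFile)

-- Hypothetical filter: keep only the lines that mention "ERROR".
errorsOnly :: Monad m => Pipe String String m r
errorsOnly = P.filter ("ERROR" `isInfixOf`)

main :: IO ()
main =
  -- "example.log" is a placeholder path; the file is read once, front to back.
  withFile "example.log" ReadMode $ \h ->
    runEffect $ P.fromHandle h >-> errorsOnly >-> P.stdoutLn
```

Note that `fromHandle` yields `String` lines, which is fine for experimenting but probably too slow for multi-gigabyte logs; a `ByteString`- or `Text`-based producer would be the likely next step.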

First off, I **highly** recommend you watch this lecture by Edward Kmett:

http://www.youtube.com/watch?v=uA0Z7_4J7u8

... and read this paper he mentions in the talk:

http://www.di.unipi.it/~ottavian/files/semi_index_cikm.pdf

The talk and the paper describe how you can efficiently index and browse very large, heterogeneous data sets on the fly using a tiny amount of space. This makes it possible to store logs blockwise-compressed and still jump to specific lines very rapidly. In fact, the use case you are proposing (indexing into lines) is much simpler than the case described in the talk and paper (indexing into tree-like data structures).
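To make the simpler line-indexing case concrete, here is a minimal sketch. It is not the semi-index from the paper: it just records the byte offset of every line start in one pass over an uncompressed file and then seeks. The names `LineIndex`, `buildIndex`, `getLineAt`, and `chunkSize`, and the file name `example.log`, are my own assumptions. It only needs `bytestring` and `array`, which ship with GHC.

```haskell
import Data.Array.Unboxed (UArray, bounds, listArray, (!))
import qualified Data.ByteString as B
import qualified Data.ByteString.Char8 as BC
import Data.List (foldl')
import System.IO

-- | Byte offset at which each line starts (line 0 starts at offset 0).
type LineIndex = UArray Int Int

-- | Size of each read while scanning; an arbitrary choice for this sketch.
chunkSize :: Int
chunkSize = 64 * 1024

-- | Scan the file once in fixed-size chunks, recording where each line begins.
buildIndex :: Handle -> IO LineIndex
buildIndex h = do
  hSeek h AbsoluteSeek 0
  (starts, total) <- go 0 [0]
  -- Drop a start offset that sits exactly at end of file: no line begins there.
  let valid = reverse (dropWhile (>= total) starts)
  return (listArray (0, length valid - 1) valid)
  where
    -- 'acc' holds line-start offsets in descending order; 'base' is the
    -- file offset of the chunk about to be read.
    go :: Int -> [Int] -> IO ([Int], Int)
    go base acc = do
      chunk <- B.hGetSome h chunkSize
      if B.null chunk
        then return (acc, base)
        else do
          let step a i = (base + i + 1) : a
              acc' = foldl' step acc (BC.elemIndices '\n' chunk)
              base' = base + B.length chunk
          base' `seq` go base' acc'

-- | Fetch line n (0-based) by seeking straight to its start offset.
getLineAt :: Handle -> LineIndex -> Int -> IO (Maybe B.ByteString)
getLineAt h idx n
  | n < lo || n > hi = return Nothing
  | otherwise = do
      hSeek h AbsoluteSeek (fromIntegral (idx ! n))
      Just <$> BC.hGetLine h
  where
    (lo, hi) = bounds idx

main :: IO ()
main = do
  let path = "example.log" -- placeholder file, created here for the demo
  BC.writeFile path (BC.pack "alpha\nbeta\ngamma\n")
  withBinaryFile path ReadMode $ \h -> do
    idx <- buildIndex h
    getLineAt h idx 2 >>= print
```

The index costs one machine word per line, which is far more than the succinct structures in the talk need, and the intermediate list is held in memory while scanning. It should still be enough to experiment with jumping around before moving to compressed blocks.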

If you would like to try out the solution described in the talk then let me know and I can help you flesh out the solution a bit more. I've already been experimenting along these lines for similar reasons (browsing Twitter logs, which are huge).

On 6/29/14, 6:57 AM, Jonathan Johnsson wrote:
Hi all! I intend to try to build a utility similar to less, using Haskell. I want it to be efficient and able to jump around in multi-gigabyte text files (logs), and have some simple parse, filter and highlighting functionality, that I can tailor to my needs. From what I have read about Pipes, it sounds like it would be suitable as a building block. I hope to learn and understand Haskell and Pipes better in the process.

I have a decent grasp of how Haskell works and I have read through the Pipes tutorial and numerous blog posts, but I haven't really coded that much before, except for some toy examples. I just wonder if you, being much more knowledgeable than I in this subject, could give me any starting pointers on which packages I should check out and what I should think about. Right now my idea is to start out by building a program that runs through a file without interaction, and try to augment it to support interactivity later.

Thank you for any help! :)
--
You received this message because you are subscribed to the Google Groups "Haskell Pipes" group. To unsubscribe from this group and stop receiving emails from it, send an email to [email protected] <mailto:[email protected]>. To post to this group, send email to [email protected] <mailto:[email protected]>.
