Hi,

for mmap-ing, there are several mmap libraries on Hackage, e.g. mmap and bytestring-mmap.
In general, for libraries accessing this kind of data (huge biological databases), we want both: efficient streaming and efficient loading of the whole dataset.

Regards,
Christian

* Johannes Waldmann <johannes.waldm...@htwk-leipzig.de> [27.06.2014 15:58]:
> Hi,
>
> >> what is meant by "the parsing is lazy" exactly?
> > I don't know, did I use that term?
>
> Yes, in the docs
> http://hackage.haskell.org/package/blastxml-0.3.1/docs/Bio-BlastXML.html
>
> >> You want a BlastResult with a lazy list of results
> >> (containing BlastRecords with a lazy list of hits, etc)?
> >
> > No - that is the case now, but I generally just discard the top
> > BlastResult "node", and extract the results -- as a lazy list.
>
> what is the typical usage pattern of that list?
>
> if (memory) space is very tight, then the result of the parse
> could just be a list of file offsets, to be used later,
> when elements are accessed.
>
> NB: There's also
> http://hackage.haskell.org/package/xml-conduit-1.2.0.2/docs/Text-XML-Stream-Parse.html
> (parseBytes) (haven't used it,
> but Snoyman's libraries are generally well-optimized for performance.)
>
> NB: ignoring that it's about Haskell - what do you want the OS to do?
> mmap the whole file? (
> http://stackoverflow.com/questions/258091/when-should-i-use-mmap-for-file-access
> )
> Can this be done with some Haskell library?
>
> - J.
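
Regarding the list-of-file-offsets idea quoted above, here is a minimal sketch of how it might look, using only base and bytestring. This is not how blastxml works. The names recordOffsets and readAt are made up for illustration, and the sketch assumes that every record starts on its own line (possibly indented with spaces) with a known tag such as <Iteration>, and that lines end in "\n".

```haskell
import qualified Data.ByteString.Char8 as B
import qualified Data.ByteString.Lazy.Char8 as BL
import Data.Int (Int64)
import System.IO (IOMode (ReadMode), SeekMode (AbsoluteSeek), hSeek, withBinaryFile)

-- | Byte offsets of all lines that start (after leading spaces) with the
-- given tag. Hypothetical helper; assumes "\n" line endings, so every
-- line occupies (length + 1) bytes in the file.
recordOffsets :: B.ByteString -> BL.ByteString -> [Int64]
recordOffsets tag = go 0 . BL.lines
  where
    lazyTag = BL.fromStrict tag
    isStart l = lazyTag `BL.isPrefixOf` BL.dropWhile (== ' ') l
    go _ [] = []
    go off (l : ls) =
      let next = off + BL.length l + 1
          -- force the running offset so no chain of thunks builds up
          rest = next `seq` go next ls
       in if isStart l then off : rest else rest

-- | Read n bytes starting at the given offset. Hypothetical helper that
-- reopens the file on every call; a real library would likely keep the
-- handle (or an mmap-ed region) around.
readAt :: FilePath -> Int64 -> Int -> IO B.ByteString
readAt path off n = withBinaryFile path ReadMode $ \h -> do
  hSeek h AbsoluteSeek (fromIntegral off)
  B.hGet h n
```

Used as recordOffsets (B.pack "<Iteration>") <$> BL.readFile path, the file is scanned lazily and only the Int64 offsets are retained. readAt path off n then fetches a single record when it is actually accessed, where n could be the distance to the next offset.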