Hello! I'm pleased to announce the second major release of the biostockholm library! This library allows you to parse and render files in the Stockholm 1.0 format, which is used by Pfam, Rfam, Infernal and others for holding information about families of proteins or non-coding RNAs.

http://hackage.haskell.org/package/biostockholm

Despite the small version bump from 0.1 to 0.2, this is actually a big rewrite of the library. Now we have:

- A streaming interface similar to what SAX parsers provide. This allows you to consume Stockholm files using constant memory (80k in a simple case).

- More test cases. It's able to consume its own pretty-printed version of Rfam through the document interface, and is also capable of reading the full Rfam Stockholm file (which has some huge families) through the streaming interface.

- QuickCheck properties. Now we have three different QuickCheck properties covering almost everything. These have helped uncover some tricky bugs that had never been found before. Two of these three properties still don't pass, but I consider the failing examples that I've investigated to be just corner cases. Unfortunately, Stockholm lacks a formal specification that would settle them.

- Conduit interface. Besides a lazy I/O version, there's now a conduit interface.

- Code that is much easier to read and reason about.

- Fast enough: the streaming interface achieves 12 MiB/s for parsing, which is pretty nice considering that there are some known overheads in its implementation.

For the tasks that biostockholm 0.1 already handled, biostockholm 0.2 tends to be slightly slower. However, biostockholm 0.2 is able to handle some previously impossible cases where a streaming solution is required.

Cheers!

-- Felipe.

_______________________________________________
Biohaskell mailing list
Biohaskell@biohaskell.org
http://malde.org/cgi-bin/mailman/listinfo/biohaskell