On Mar 29, 2011, at 4:01 AM, Ketil Malde wrote:
>
> I was thinking that we might want to keep e.g. Blast results' original
> offsets (which I believe are 1-based), so that you don't need to convert
> in order to do non-transforming operations (e.g. select a subset of
> results). Using a different type would be good, since it would catch
> inadvertent mixing of conceptually different values.

That sounds reasonable to me, though the time taken by one extra succ or pred is pretty small relative to the overall time required to convert between string and machine representations of numbers. However, that's a decision that's really independent of seqloc, and I think it makes sense to keep a single 0-based interface for seqloc. A Blast alignment library might have a data type with qstart and qend fields containing 1-based Int64 indices, and provide an accessor that handles any coordinate conversion needed to produce a seqloc location. That's what I do for GTF annotations (and for BED annotations, whose coordinate scheme wouldn't be captured perfectly by either 0-based or 1-based indexing).

> Thus my call for a general Alignment class or data type, converting
> to (or accessing through) Alignment should convert to standard choices,
> and make things comparable - also alignment from different tools.

There is a lot of heterogeneity in alignments produced by different tools--how they handle gaps, whether they're designed for mRNA-to-genome alignments, local versus global alignment, whether they score similarity in protein sequence, &c. I don't expect to find myself abstracting over different kinds of Alignments in tools or libraries that I write, though I'm happy to provide support for an Alignment typeclass in the seqloc library and in my samtools wrapper.

For the moment, my inclination is to update the seqloc library to use Int64 indices and leave it otherwise unchanged, and to release the GTF & BED library at the same time.

Best,
--Nick

_______________________________________________
Biohaskell mailing list
Biohaskell@biohaskell.org
http://malde.org/cgi-bin/mailman/listinfo/biohaskell
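
For concreteness, here is a minimal sketch of the Blast accessor idea described above. Everything in it is hypothetical: BlastHit, ZeroBasedSpan and querySpan are illustrative names, ZeroBasedSpan is only a stand-in for a seqloc location rather than the library's actual type, and the sketch assumes query coordinates are 1-based, inclusive, and satisfy qstart <= qend.

```haskell
import Data.Int (Int64)

-- Hypothetical record for one Blast hit. qstart and qend keep the
-- 1-based, inclusive query coordinates exactly as the tool reported them.
data BlastHit = BlastHit
  { qstart :: Int64
  , qend   :: Int64
  } deriving (Show, Eq)

-- Hypothetical 0-based span (offset plus length). This is a stand-in for
-- a seqloc location, NOT the seqloc library's real data type.
data ZeroBasedSpan = ZeroBasedSpan
  { spanOffset :: Int64
  , spanLength :: Int64
  } deriving (Show, Eq)

-- Accessor that performs the coordinate conversion in one place.
-- Assumes qstart <= qend (forward-oriented query coordinates).
querySpan :: BlastHit -> ZeroBasedSpan
querySpan hit = ZeroBasedSpan
  { spanOffset = pred (qstart hit)            -- 1-based start -> 0-based offset
  , spanLength = qend hit - qstart hit + 1    -- inclusive end -> length
  }

-- Non-transforming operations, such as selecting a subset of hits, can
-- work on the original 1-based fields without any conversion.
hitsStartingAtOrAfter :: Int64 -> [BlastHit] -> [BlastHit]
hitsStartingAtOrAfter pos = filter (\h -> qstart h >= pos)
```

With these definitions, querySpan (BlastHit 1 10) gives ZeroBasedSpan 0 10, while filtering a list of hits never touches the conversion; the single pred is paid only when a 0-based location is actually requested.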