Hi there,

I'm bringing this subject back to the mailing list to try to come up with a strategy without bothering Chris too much (he appears to be seriously preoccupied at present, for obvious reasons).

To sum up what Chris, Jenny and I have found out so far from communicating with Hiram at UCSC:

 - all Ensembl gene data are stored in three tables: ensGene (containing information about transcripts), ensPep (containing amino-acid sequences for individual peptides) and ensGtp (containing relations between transcripts, genes and proteins);
 - all exon information is stored as comma-separated lists of values, kept in the database as blobs (one each for the start and stop coordinates, and one for the frame). Ensembl exon IDs are NOT stored anywhere, on purpose (I have confirmed this with Hiram); exon information can be retrieved only by combining the transcript ID with the index pointing at that exon's data in the three blobs.
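To make the lookup described above concrete, here is a minimal Python sketch of retrieving one exon from a transcript ID plus an index. It assumes a row from ensGene is already available as a dict, that the three blobs live in columns called exonStarts, exonEnds and exonFrames, and that the transcript ID is in a column called name. Those column names, the helper functions and the sample values are assumptions for illustration only and should be checked against the actual schema.

```python
def parse_blob(blob):
    """Turn a comma-separated blob such as b'100,400,' into a list of ints."""
    if isinstance(blob, bytes):
        blob = blob.decode("ascii")
    # Skip empty fields so that a trailing comma does not break the parse.
    return [int(field) for field in blob.split(",") if field.strip()]


def get_exon(transcript_row, index):
    """Build exon data from a transcript row and the exon's index.

    No Ensembl exon ID is available, so the exon is identified only by
    (transcript ID, index).
    """
    starts = parse_blob(transcript_row["exonStarts"])
    stops = parse_blob(transcript_row["exonEnds"])
    frames = parse_blob(transcript_row["exonFrames"])
    return {
        "transcript_id": transcript_row["name"],
        "index": index,
        "start": starts[index],
        "stop": stops[index],
        "frame": frames[index],
    }


# Made-up row, standing in for the result of a query against ensGene.
example_row = {
    "name": "ENST_EXAMPLE",
    "exonStarts": b"100,400,",
    "exonEnds": b"200,500,",
    "exonFrames": b"0,-1,",
}
second_exon = get_exon(example_row, 1)
```

Parsing all three blobs on every call is wasteful for long transcripts; caching the parsed lists per transcript would be an obvious refinement.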

This of course raises several questions:

 1. Can we live without the original Ensembl exon IDs?
 2. Do all exons of each transcript in the Ensembl data lie within that transcript's transcription region, or can they "stick out"? It seems to me that the former is the case, but that is an empirical observation from a limited part of the data set. The answer to this question can affect how by-coordinates exon queries are executed.
 3. How can we implement support for dynamic querying/generation of exon information within AnnotationDB without a major redesign of the latter? In Perl it could be achieved easily using the "tie" mechanism (e.g. having "sliceAttrDict{'start'}" call an appropriate function); unfortunately I've got no idea how to achieve something similar in Python.
 4. (insert further questions that I have missed)

What do you think?

Cheers,

-- MS
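
A note on questions 2 and 3: the closest Python analogue of Perl's "tie" is probably a class that implements the mapping protocol, so that a lookup such as sliceAttrDict['start'] runs a function instead of reading a stored value. The sketch below shows that idea together with a simple containment check for question 2. The class, the function names and the sample numbers are all hypothetical, and nothing here reflects how AnnotationDB is actually structured.

```python
from collections.abc import Mapping


class LazyAttrDict(Mapping):
    """Read-only dict-like object whose values are computed on lookup."""

    def __init__(self, getters):
        # getters maps an attribute name to a zero-argument callable.
        self._getters = getters

    def __getitem__(self, key):
        # Unknown keys raise KeyError, as with a normal dict.
        return self._getters[key]()

    def __iter__(self):
        return iter(self._getters)

    def __len__(self):
        return len(self._getters)


def exons_inside_transcript(tx_start, tx_stop, exon_starts, exon_stops):
    """Return True if no exon sticks out of the transcript region."""
    return all(
        tx_start <= start and stop <= tx_stop
        for start, stop in zip(exon_starts, exon_stops)
    )


# Made-up exon coordinates, standing in for values parsed from the blobs.
exon_starts = [100, 400]
exon_stops = [200, 500]

# The 'start' and 'stop' of the first exon are looked up only on access.
slice_attr_dict = LazyAttrDict({
    "start": lambda: exon_starts[0],
    "stop": lambda: exon_stops[0],
})
first_start = slice_attr_dict["start"]
all_inside = exons_inside_transcript(100, 500, exon_starts, exon_stops)
```

Whether a dict-like object of this kind can be dropped in where AnnotationDB expects sliceAttrDict depends on how the latter is accessed internally, which would need checking.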