Hi there,

I'm bringing this subject back to the mailing list to try to come up
with a strategy without bothering Chris (who appears to be seriously
preoccupied at present, for obvious reasons) too much. To sum up what
Chris, Jenny and I have found out so far from communicating with Hiram
at UCSC:
 - all Ensembl gene data is stored in three tables: ensGene
(containing information about transcripts), ensPep (containing
amino-acid sequences for individual peptides) and ensGtp (containing
relations between transcripts, genes and proteins);
 - all exon information is stored as comma-separated lists kept in the
database as blobs (one each for start coordinates, stop coordinates and
frames). Ensembl exon IDs are NOT stored anywhere, on purpose (I have
confirmed this with Hiram); an exon's data can be retrieved only by
combining the transcript ID with the index of that exon's entries in
the three blobs.
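
To make the second point concrete, here is a minimal sketch of how the
three blobs could be unpacked into per-exon records keyed by
(transcript ID, index). The function names are hypothetical and not part
of pygr; I am also assuming that the blobs may arrive as bytes and may
carry a trailing comma, so please check that against the real tables.

```python
def decode_blob(blob):
    """Turn a comma-separated blob such as b'100,300,' into [100, 300]."""
    if isinstance(blob, bytes):
        # Assumption: the database driver may hand blobs back as bytes.
        blob = blob.decode("ascii")
    # Skip empty fields, e.g. the one left by a trailing comma.
    return [int(field) for field in blob.split(",") if field.strip()]


def exons_for_transcript(transcript_id, starts_blob, ends_blob, frames_blob):
    """Return one (transcript_id, index, start, stop, frame) tuple per exon."""
    starts = decode_blob(starts_blob)
    ends = decode_blob(ends_blob)
    frames = decode_blob(frames_blob)
    if not (len(starts) == len(ends) == len(frames)):
        raise ValueError("blob lengths differ for %s" % transcript_id)
    return [
        (transcript_id, index, start, stop, frame)
        for index, (start, stop, frame) in enumerate(zip(starts, ends, frames))
    ]


def exons_within(tx_start, tx_end, exons):
    """True if every exon lies inside the interval [tx_start, tx_end]."""
    return all(tx_start <= start and stop <= tx_end
               for (_, _, start, stop, _) in exons)
```

With something like this, the pair (transcript ID, index) acts as a
stand-in for the missing exon ID, and exons_within() gives a quick way
to test whether any exon extends beyond its transcript's bounds.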

This of course raises several questions:

1. Can we live without original Ensembl exon IDs?

2. Do all exons in each transcript in Ensembl data lie within that
transcript's transcription region, or can they "stick out"? It seems to
me the former is the case, but that is only an empirical observation
from a limited part of the data set. The answer affects how we execute
exon queries by coordinates.

3. How do we implement support for dynamic querying/generation of exon
information within AnnotationDB without a major redesign of it? In
Perl it could be achieved easily using the "tie" mechanism (e.g. having
"sliceAttrDict{'start'}" call an appropriate function). Unfortunately,
I've got no idea how to achieve something similar in Python.

4. (insert further questions that I have missed)
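
Thinking about question 3 a bit more: a dict-like class that overrides
__getitem__ might get close to what Perl's "tie" does. Below is a
minimal sketch, assuming all we need is for a lookup such as
attrs['start'] to call a function. The class name LazyAttrDict and the
getter functions are hypothetical, and whether AnnotationDB would
accept such an object in place of a plain dict is something I have not
checked.

```python
try:
    from collections.abc import Mapping  # Python 3
except ImportError:
    from collections import Mapping      # older Pythons


class LazyAttrDict(Mapping):
    """Read-only mapping whose values are computed on each lookup."""

    def __init__(self, getters):
        # getters maps an attribute name to a zero-argument callable.
        self._getters = dict(getters)

    def __getitem__(self, key):
        # Raises KeyError for unknown names, just like a plain dict.
        return self._getters[key]()

    def __iter__(self):
        return iter(self._getters)

    def __len__(self):
        return len(self._getters)


# Hypothetical usage: each lookup triggers the corresponding function.
calls = []

def exon_start():
    calls.append("start")
    return 100

attrs = LazyAttrDict({"start": exon_start, "stop": lambda: 200})
```

Nothing is computed until a key is actually looked up, so the getters
could decode the blobs on demand rather than up front.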

What do you think?

Cheers,
-- 
MS
