According to the Ensembl Perl API tutorial, an Ensembl feature table is a database table that contains the fields 'seq_region_id', 'seq_region_start', 'seq_region_end' and 'seq_region_strand'.
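
To make that definition concrete, here is a minimal sketch in plain Python. The helper is_feature_table() and the abbreviated column lists are hypothetical, made up for illustration; they are not part of the Ensembl or Pygr APIs.

```python
# Hypothetical helper for illustration; not part of Ensembl or Pygr.
FEATURE_FIELDS = frozenset([
    'seq_region_id',
    'seq_region_start',
    'seq_region_end',
    'seq_region_strand',
])


def is_feature_table(column_names):
    """Return True if the given columns include all four feature fields."""
    return FEATURE_FIELDS.issubset(column_names)


# Abbreviated, illustrative column lists (not the full Ensembl schema).
exon_columns = ['exon_id', 'seq_region_id', 'seq_region_start',
                'seq_region_end', 'seq_region_strand']
meta_columns = ['meta_id', 'meta_key', 'meta_value']

exon_is_feature = is_feature_table(exon_columns)   # True
meta_is_feature = is_feature_table(meta_columns)   # False
```
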
During the sprint session this summer at Caltech, Chris suggested that it would be more straightforward to model an Ensembl feature table using Pygr's seqdb.AnnotationDB rather than sqlgraph.SQLTable. The advantage is that a feature, such as a transcript, will automatically have a sequence attribute in addition to the attributes derived from the table fields.

For the past couple of months, I've been experimenting with this idea. I created a new branch in my pyensembl git repository named jenny_pyensembl_new. In this experimental branch, I've managed to model the Ensembl gene, transcript and exon tables as subclasses of seqdb.AnnotationDB. Consequently, the gene, transcript and exon are modeled as subclasses of seqdb.AnnotationSeq. (Hey, Titus and Rob, I've got rid of all the ugly 'import *' statements in my new branch :)

The problem is that the interval Ensembl defines by a transcript's seq_region_start and seq_region_end covers the pre-mRNA, i.e. the unspliced form of the transcript, so the automatic sequence attribute includes the introns. The Ensembl Perl API provides a spliced_seq() method and a translateable_seq() method for a user to obtain the desired sequences of a transcript.

I could write similar methods for my Ensembl transcript class. Or would it be less work to subclass seqdb.AnnotationDB to give a transcript splicedsequence and translateablesequence attributes?

Thanks,
Jenny
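
P.S. To make the question concrete, here is a minimal sketch of the first option (a spliced-sequence method) in plain Python. It assumes Ensembl's convention of 1-based, inclusive coordinates with start <= end on both strands, and uses plain strings in place of Pygr sequence objects. The names reverse_complement() and spliced_sequence() are hypothetical and are not part of Pygr or the Ensembl API.

```python
# Hypothetical sketch; plain strings stand in for Pygr sequence objects.
COMPLEMENT = {'A': 'T', 'C': 'G', 'G': 'C', 'T': 'A', 'N': 'N'}


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return ''.join(COMPLEMENT[base] for base in reversed(seq))


def spliced_sequence(region_seq, exons, strand):
    """Join the exon slices of a transcript, skipping the introns.

    region_seq -- forward-strand sequence of the whole seq region
    exons      -- list of (start, end) pairs, 1-based and inclusive
    strand     -- 1 for the forward strand, -1 for the reverse strand
    """
    # Slice each exon out of the forward strand in ascending order.
    pieces = [region_seq[start - 1:end] for start, end in sorted(exons)]
    joined = ''.join(pieces)
    # For a reverse-strand transcript, reverse-complement the joined exons.
    if strand == -1:
        return reverse_complement(joined)
    return joined


region = 'AAACCCGGGTTT'
exons = [(1, 3), (7, 9)]           # the intron is positions 4-6 ('CCC')

pre_mrna = region[1 - 1:9]                       # 'AAACCCGGG', intron included
forward = spliced_sequence(region, exons, 1)     # 'AAAGGG'
reverse = spliced_sequence(region, exons, -1)    # 'CCCTTT'
```

A translateable sequence would additionally need the coding start and end positions, which this sketch leaves out.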
