[Genome] Human (Feb 2009) gene models: gene/isoform naming convention

Duff Fri, 19 Jun 2009 10:29:15 -0700

I have been developing informatics scripts used primarily in our analysis of
RNA-seq data for Drosophila. One of the starting points for our analysis is a
gene model obtained from the UCSC Table Browser in the form of a .BED file,
which lists each isoform name (e.g. CG1674-RA, CG1674-RB, ...) along with the
coordinates of each isoform's exons. The association between isoform and gene
is straightforward from the isoform ID/name.
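
To make the fly case concrete, here is a minimal sketch of how the gene can be
read off the isoform name. It assumes the isoform name sits in the fourth
(name) column of the BED file and always ends in "-R" plus a letter code; the
function names, chromosome, and coordinates are made up for illustration and
are not taken from our actual scripts.

```python
import re
from collections import defaultdict

# Assumption: fly isoform names look like <geneID>-R<letters>, e.g. CG1674-RA.
ISOFORM_RE = re.compile(r"^(?P<gene>.+)-R(?P<tag>[A-Z]+)$")


def gene_of(isoform_name):
    """Return the gene ID embedded in an isoform name, or the name itself
    if it does not follow the assumed <geneID>-R<letters> pattern."""
    match = ISOFORM_RE.match(isoform_name)
    return match.group("gene") if match else isoform_name


def group_bed_by_gene(bed_lines):
    """Map gene ID -> list of isoform names, reading BED column 4."""
    genes = defaultdict(list)
    for line in bed_lines:
        line = line.strip()
        # Skip blank lines and the optional header lines of a BED file.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split("\t")
        if len(fields) < 4:
            continue  # no name column on this line
        genes[gene_of(fields[3])].append(fields[3])
    return dict(genes)


# Hypothetical records: the chromosome and coordinates are invented.
example = [
    "chr2R\t100\t900\tCG1674-RA\t0\t+",
    "chr2R\t100\t1200\tCG1674-RB\t0\t+",
]
groups = group_bed_by_gene(example)
print(groups)
```

An NM_* accession passed to gene_of() comes back unchanged, which is exactly
the problem described in the next paragraph.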


Lately, I've been attempting to adapt the analysis scripts to human expression
data, and I'm having difficulty locating, or piecing together, a similar gene
model. I'm trying to work with the most up-to-date (Feb 2009) annotations, but
the gene/isoform naming convention there seems quite different from that for
fly. For example, NM_001145277, NM_001145278, and NM_018090 appear (judging
from txStart and txEnd) to be different isoforms of a common gene, though
nothing in the isoform names themselves indicates a common gene (and using
shared txStart/txEnd values to associate isoforms with genes would seem, in
general, to be incorrect).

My question is: for the Human Feb 2009 annotations, does there exist a table
that translates from NM_* IDs to an ID scheme similar to that adopted for fly,
i.e., a standard gene name followed by an isoform sub-tag?
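
To illustrate the kind of translation I have in mind, here is a rough sketch.
It assumes a hypothetical two-column, tab-separated table (NM_* accession,
gene name); whether such a table exists for the Feb 2009 annotations is
precisely what I am asking. The gene name "GENE1" is a placeholder, and the
lettered sub-tags are my own invention modeled on the fly convention, not an
existing standard.

```python
import csv
import io
from collections import defaultdict


def letter_tag(index):
    """0 -> 'A', 1 -> 'B', ..., 25 -> 'Z', 26 -> 'AA' (spreadsheet-style)."""
    n = index + 1
    tag = ""
    while n:
        n, rem = divmod(n - 1, 26)
        tag = chr(ord("A") + rem) + tag
    return tag


def build_labels(table_text):
    """Map each accession to '<gene>-R<letters>' using a two-column table."""
    by_gene = defaultdict(list)
    for row in csv.reader(io.StringIO(table_text), delimiter="\t"):
        if len(row) != 2:
            continue  # skip blank or malformed lines
        accession, gene = row
        by_gene[gene].append(accession)
    labels = {}
    for gene, accessions in by_gene.items():
        # Sub-tags are assigned in sorted accession order (arbitrary choice).
        for i, accession in enumerate(sorted(accessions)):
            labels[accession] = f"{gene}-R{letter_tag(i)}"
    return labels


# Hypothetical table: "GENE1" is a placeholder, not the real gene name.
table = (
    "NM_001145277\tGENE1\n"
    "NM_001145278\tGENE1\n"
    "NM_018090\tGENE1\n"
)
labels = build_labels(table)
print(labels)
```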

Any suggestions you might have would be appreciated.



-Mike


-- 
Mike Duff
Graveley Lab
Department of Genetics and Developmental Biology
University of Connecticut Health Center
[email protected]
http://intron.ccam.uchc.edu/~mikeduff
860.970.2283
