I have been developing informatics scripts used primarily in our analysis of RNA-seq data for Drosophila. One of the starting points for our analysis is a gene model exported from the UCSC Table Browser as a .BED file, which lists each isoform name (e.g., CG1674-RA, CG1674-RB, ...) along with the coordinates of that isoform's exons. The association between isoform and gene follows directly from the isoform ID/name.
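To make the fly case concrete, here is a minimal Python sketch of the isoform-to-gene association described above. It is an illustration only, not our actual scripts: it assumes a tab-separated BED file with the isoform name in the fourth column and names of the form gene ID + "-R" + isoform letters, and the function names (gene_from_isoform, group_isoforms_by_gene) are made up for this example.

```python
import re
from collections import defaultdict

# Assumed naming scheme: gene ID, then "-R" and one or more isoform letters,
# e.g. "CG1674-RA" -> gene "CG1674", isoform "A".
ISOFORM_RE = re.compile(r"^(?P<gene>.+)-R(?P<isoform>[A-Z]+)$")


def gene_from_isoform(name):
    """Return the gene ID embedded in an isoform name, or None if the
    name does not follow the assumed '<gene>-R<letters>' pattern."""
    match = ISOFORM_RE.match(name)
    return match.group("gene") if match else None


def group_isoforms_by_gene(bed_lines):
    """Map each gene ID to the list of isoform names found in BED lines."""
    genes = defaultdict(list)
    for line in bed_lines:
        # Skip blank lines and the header lines a Table Browser export may carry.
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 4:
            continue  # no name column on this line
        name = fields[3]
        gene = gene_from_isoform(name)
        if gene is not None:
            genes[gene].append(name)
    return dict(genes)
```

With human RefSeq names this approach has nothing to work with: gene_from_isoform("NM_018090") returns None, because the accession carries no gene component, which is exactly the problem described below.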
Lately, I've been attempting to adapt the analysis scripts to human expression data, and I'm having difficulty locating, or piecing together, a similar gene model. I'm trying to work with the most up-to-date (Feb 2009) annotations, but the gene/isoform naming convention there seems quite different from that for fly. For example, NM_001145277, NM_001145278, and NM_018090 appear (judging from txStart and txEnd) to be different isoforms of a common gene, though nothing in the isoform names themselves indicates a common gene (and using shared txStart/txEnd values to associate isoforms with genes would seem, in general, to be incorrect).

My question is: for the Human Feb 2009 annotations, does there exist a table that translates from NM_* IDs to an ID scheme similar to that adopted for fly, i.e., a standard gene name followed by an isoform sub-tag?

Any suggestions you might have would be appreciated.

-Mike

--
Mike Duff
Graveley Lab
Department of Genetics and Developmental Biology
University of Connecticut Health Center
[email protected]
http://intron.ccam.uchc.edu/~mikeduff
860.970.2283
_______________________________________________
Genome maillist - [email protected]
https://lists.soe.ucsc.edu/mailman/listinfo/genome
