Hello, I downloaded the UCSC gene annotation track as follows:
http://genome.ucsc.edu/cgi-bin/hgTables
group: 'Genes and Gene Prediction Tracks'
track: 'UCSC Genes'
table: 'knownGene'

The table looks like:

#name       chrom  strand  txStart  txEnd  cdsStart  cdsEnd  exonCount  exonStarts            exonEnds              proteinID  alignID
uc001aaa.2  chr1   +       1115     4121   1115      1115    3          1115,2475,3083,       2090,2584,4121,                  uc001aaa.2
uc009vip.1  chr1   +       1115     4272   1115      1115    2          1115,2475,            2090,4272,                       uc009vip.1
uc009vis.1  chr1   -       4268     6628   4268      4268    4          4268,4832,5658,6469,  4692,4901,5805,6628,             uc009vis.1
uc001aag.1  chr1   -       5658     7231   5658      5658    4          5658,6469,6738,7095,  5810,6628,6918,7231,             uc001aag.1

I have a few questions:

1. Why do cdsStart and cdsEnd have the same coordinate in these rows? I would think that cdsStart should coincide with the first exon's start and cdsEnd should coincide with the last exon's end.

2. In some cases cdsStart differs from cdsEnd, but cdsStart also differs from the first exon's start, and cdsEnd differs from the last exon's end. For example:

uc001abe.2 chr1 - 653074 654579 653880 654321 1 653074, 654579, uc001abe.2

Could you please explain what cdsStart and cdsEnd are?

Thank you,
Maria
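
P.S. In case it helps to make the two questions concrete, the comparison can be sketched in Python. This is only a minimal sketch, assuming the rows are whitespace-separated as pasted above; the function names are hypothetical (not part of any UCSC tool), and only the first ten columns are read because the proteinID column appears to be empty in these rows.

```python
def parse_known_gene_row(line):
    """Parse one whitespace-separated knownGene row into a dict.

    Assumption: the first ten columns are present in the header's order.
    Trailing columns (proteinID, alignID) are ignored here.
    """
    fields = line.split()
    return {
        "name": fields[0],
        "chrom": fields[1],
        "strand": fields[2],
        "txStart": int(fields[3]),
        "txEnd": int(fields[4]),
        "cdsStart": int(fields[5]),
        "cdsEnd": int(fields[6]),
        "exonCount": int(fields[7]),
        # Exon lists end with a trailing comma, so skip empty pieces.
        "exonStarts": [int(x) for x in fields[8].split(",") if x],
        "exonEnds": [int(x) for x in fields[9].split(",") if x],
    }


def compare_cds_to_exons(row):
    """Report the three comparisons raised in questions 1 and 2."""
    return {
        "cds_start_equals_cds_end": row["cdsStart"] == row["cdsEnd"],
        "cds_start_equals_first_exon_start": row["cdsStart"] == row["exonStarts"][0],
        "cds_end_equals_last_exon_end": row["cdsEnd"] == row["exonEnds"][-1],
    }


if __name__ == "__main__":
    rows = [
        "uc001aaa.2 chr1 + 1115 4121 1115 1115 3 1115,2475,3083, 2090,2584,4121, uc001aaa.2",
        "uc001abe.2 chr1 - 653074 654579 653880 654321 1 653074, 654579, uc001abe.2",
    ]
    for line in rows:
        row = parse_known_gene_row(line)
        print(row["name"], compare_cds_to_exons(row))
```

The first row is the question 1 case (cdsStart equals cdsEnd), and the second is the question 2 case (the CDS coordinates fall strictly inside the single exon).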
