Hi,
   I'm using the latest human genome assembly (hg19, Feb. 2009) and have
downloaded the refGene table. I'm having trouble understanding what the
cdsStart, cdsEnd, exonStarts, and exonEnds fields stand for. I had
originally thought that if you concatenate all the exons (obtained by
pairing corresponding entries in exonStarts and exonEnds), you would get
the transcribed mRNA that is eventually translated into amino acids.
That does not seem to be the case.

For example, in the refGene table, the entry with name='NM_001037165' has
the following fields:

cdsStart = 4721939
cdsEnd = 4802095
exonStarts = 4721929,4780468,4794089,4794867,4796624,4798681,4798941,4800694,4801814
exonEnds = 4722499,4780654,4794246,4795014,4796818,4798848,4799226,4800919,4811074,

As you can see, cdsEnd does not equal the last entry of exonEnds, which is
4811074. What exactly, then, is the region 4802096 - 4811074? Is it an
exon? Is it transcribed or translated? Or is it part of the UTR?
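
To make the question concrete, here is a minimal Python sketch of how the
fields above might relate to each other. It rests on two assumptions that are
not stated in the table itself: that the coordinates are 0-based and
half-open (start inclusive, end exclusive), and that any part of an exon
lying outside cdsStart..cdsEnd is transcribed but not translated (UTR). The
helper names parse_list and split_exons are made up for illustration.

```python
def parse_list(field):
    """Turn a comma-separated coordinate field into a list of ints.

    A trailing comma, as in the exonEnds value above, is tolerated.
    """
    return [int(x) for x in field.split(",") if x]


def split_exons(exon_starts, exon_ends, cds_start, cds_end):
    """Split each exon into coding and non-coding pieces.

    Assumes 0-based, half-open intervals. Returns two lists of
    (start, end) tuples: pieces inside the CDS and pieces outside it.
    """
    coding, noncoding = [], []
    for start, end in zip(exon_starts, exon_ends):
        # Clip the exon to the coding region.
        c_start, c_end = max(start, cds_start), min(end, cds_end)
        if c_start < c_end:
            coding.append((c_start, c_end))
            if start < c_start:
                noncoding.append((start, c_start))  # exon part before the CDS
            if c_end < end:
                noncoding.append((c_end, end))      # exon part after the CDS
        else:
            noncoding.append((start, end))          # exon entirely outside the CDS
    return coding, noncoding


exon_starts = parse_list(
    "4721929,4780468,4794089,4794867,4796624,4798681,4798941,4800694,4801814")
exon_ends = parse_list(
    "4722499,4780654,4794246,4795014,4796818,4798848,4799226,4800919,4811074,")

coding, noncoding = split_exons(exon_starts, exon_ends, 4721939, 4802095)
coding_length = sum(end - start for start, end in coding)

print(noncoding)
print(coding_length, coding_length % 3)
```

Under those assumptions, the span after cdsEnd comes out as
(4802095, 4811074), which is the same stretch as 4802096 - 4811074 when
counted from 1. It falls inside the last exon, so it would be transcribed
but lie outside the coding sequence. Whether it is the 5' or the 3' UTR
depends on the strand field, which is not shown above.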


Thanks in advance,
Huei-Hun Elizabeth Tseng
_______________________________________________
Genome maillist  -  [email protected]
https://lists.soe.ucsc.edu/mailman/listinfo/genome
