Hello Rani,

We are currently looking into this and will contact you shortly. Thank 
you for your patience.

Best regards,

Pauline Fujita
UCSC Genome Bioinformatics Group
http://genome.ucsc.edu



On 09/29/10 03:19, [email protected] wrote:
> Hello,
> 
> I have downloaded the refGene table from the RefSeq Genes track (hg18) and
> found the following problem: for hundreds of protein-coding
> transcripts, the length of the coding region is not a multiple of
> three.
> 
> As an example, I checked transcript NM_000804. According to the NCBI
> nucleotide DB record for this transcript, the coding length is 738
> (which is fine: 738=246*3); but calculating the coding region length
> from the coordinates provided in refGene gives 736.
> 
> To understand where the difference comes from, I compared exons’  
> lengths and found that the problem is in exon3: there is a difference  
> of 2 nucleotides in that exon – see below.
> 
> Tx=NM_000804, (chr11)
> 
> NCBI nucleotide DB info
> http://www.ncbi.nlm.nih.gov/nuccore/9257219
> =============================================
>       exon1            1..44          Len=44
>       exon2            45..218        Len=174
>       exon3            219..407       Len=189
>       exon4            408..543       Len=136
>       exon5            544..847       Len=304
> 
>       CDS             51..788 Len=738
>       polyA_site      847
> 
> 
> RefSeq table downloaded from UCSC
> =======================================
> exon1 len=44,  exS=71524418, exE=71524462
> exon2 len=174, exS=71524640, exE=71524814
> exon3 len=187, exS=71527654, exE=71527841 <----- (len is 187 instead of 189)
> exon4 len=136, exS=71528038, exE=71528174
> exon5 len=304, exS=71528278, exE=71528582
> 
> 5utrL=50, cdsL=736, 3utrL=59, mRNA_L=845
> ------------------------------------------------------
> 
> •     Could you please check why, for many protein-coding transcripts, the
> length of the coding region is not a multiple of three?
> 
> •     Another problem that I encountered when calculating exon lengths
> was that, in order to get the correct length (according to the NCBI
> nucleotide DB), one has to calculate (exonEnd – exonStart) rather than
> what I expected: (exonEnd – exonStart + 1). It seems that the exonStart
> positions (but not the exonEnd ones) are shifted by -1. Is this indeed the case?
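
On the coordinate question: UCSC tables store start positions as 0-based and end positions as exclusive (half-open intervals), so exon length is end minus start with no +1. A minimal Python sketch of that arithmetic follows, using only the coordinates and NCBI lengths quoted above. The function and variable names are illustrative only and are not part of any UCSC tool.

```python
# Exon coordinates for NM_000804 as quoted above from the refGene table (hg18).
# Assumption: starts are 0-based and ends are exclusive (half-open intervals).
exon_starts = [71524418, 71524640, 71527654, 71528038, 71528278]
exon_ends = [71524462, 71524814, 71527841, 71528174, 71528582]

# Exon lengths from the NCBI nucleotide record quoted above.
ncbi_lengths = [44, 174, 189, 136, 304]


def exon_lengths(starts, ends):
    """Length of each exon under half-open coordinates: end - start."""
    return [end - start for start, end in zip(starts, ends)]


ucsc_lengths = exon_lengths(exon_starts, exon_ends)

# Collect (exon number, UCSC length, NCBI length) wherever the two disagree.
mismatches = [
    (i + 1, ucsc, ncbi)
    for i, (ucsc, ncbi) in enumerate(zip(ucsc_lengths, ncbi_lengths))
    if ucsc != ncbi
]

print(ucsc_lengths)       # per-exon lengths from the genomic coordinates
print(sum(ucsc_lengths))  # total spliced length implied by the table
print(mismatches)         # exons whose length differs from the NCBI record
```

With these inputs, the only mismatch reported is exon 3 (187 versus 189), which agrees with the comparison in the tables above. The sketch only confirms the arithmetic and does not explain the 2-base difference.
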
> 
> Many thanks in advance,
> Rani
> 
> 
> 
> _______________________________________________
> Genome maillist  -  [email protected]
> https://lists.soe.ucsc.edu/mailman/listinfo/genome

