Hi Hervé,

Thank you for your email. One of our staff members is also part of the
CCDS project, and she has offered the following information:

CCDS43034.1 is actually a selenoprotein (SELO, selenoprotein O). Its
in-frame stop codon is intentional: in this protein, that codon is
translated as a selenocysteine instead of terminating translation. We
are currently determining whether this is also the case for the other
CCDS you found with in-frame stop codons.
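
As a rough illustration of the kind of check involved (a sketch only, not
the CCDS project's own pipeline), the Python below lists stop codons that
occur in frame before the final codon of a CDS. The function name
in_frame_stops and the toy sequence are made up for this example. A TGA
hit is flagged as a possible selenocysteine, since that is the codon
recoded in selenoproteins; confirming it still requires looking at the
gene itself.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def in_frame_stops(cds):
    """Return (offset, codon) for every stop codon before the final codon.

    `cds` is the spliced coding sequence as a DNA string, read 5' to 3'
    on the coding strand. Offsets are 0-based positions within `cds`.
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    hits = []
    # Stop before the last codon, which is expected to be the real stop.
    for offset in range(0, len(cds) - 3, 3):
        codon = cds[offset:offset + 3]
        if codon in STOP_CODONS:
            hits.append((offset, codon))
    return hits

# Toy sequence invented for illustration; not a real CCDS.
toy_cds = "ATGAAATGACCCTAA"
for offset, codon in in_frame_stops(toy_cds):
    note = "possible selenocysteine" if codon == "TGA" else "stop"
    print(offset, codon, note)
```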

As for the CCDS without ATG start codons, some CCDS have been annotated
with a non-ATG start codon (e.g., CTG) where experimental evidence
suggests that the protein is translated from that codon.
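
A start-codon check that tolerates such cases might look like the sketch
below. The function name classify_start and the default set of accepted
non-ATG codons are assumptions for illustration; CTG is the only example
named above, and whether a given non-ATG start is valid depends on the
experimental evidence for that particular gene.

```python
def classify_start(cds, accepted_non_atg=frozenset({"CTG"})):
    """Classify the first codon of a CDS (DNA string, coding strand).

    `accepted_non_atg` is a hypothetical allow-list; adjust it to the
    codons you are willing to treat as annotated non-ATG starts.
    """
    first = cds[:3].upper()
    if first == "ATG":
        return "ATG"
    if first in accepted_non_atg:
        return "accepted non-ATG"
    return "unexpected"

# Toy sequences invented for illustration; not real CCDS.
for seq in ("ATGGCCTAA", "CTGGCCTAA", "GGGGCCTAA"):
    print(seq[:3], classify_start(seq))
```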

Finally, CCDS is constantly being updated, and so the project members 
are continually reviewing CCDS and correcting any errors or updating 
annotations based on additional evidence that becomes available. These 
updates are released periodically.

We are currently looking into your additional observations in more
detail. Please don't hesitate to contact the mailing list again if you
have any further questions.

Katrina Learned
UCSC Genome Bioinformatics Group

Hervé Pagès wrote, On 08/13/10 12:50:
> Hi,
>
> According to the Methods section of the CCDS track page for hg18,
> one of the criteria used to assess each gene is:
>
>    - an initiating ATG, a valid stop codon, and no in-frame stop codons
>
> However when using some tools to extract and translate the transcripts
> for all the genes in the track, I find that some of the genes fail to
> satisfy the criteria. More precisely:
>
>    - 21 genes fail to have an initiating ATG (e.g. CCDS43136.1,
>      CCDS34059.1, etc..., see full listing at the end of the email).
>
>    - 15 genes fail to have no in-frame stop codons. E.g. the
>      CCDS43034.1 gene (on chr22 strand +) has an in-frame stop
>      codon 9 bases upstream of the stop codon located at the position
>      specified in the cdsEnd column of the ccdsGene table for
>      that gene.
>
> When using the Genome Browser to display CCDS43136.1 and CCDS43034.1
> for hg18, I can *see* a confirmation of the problem. But if I click on
> the CCDS43034.1 gene and then follow the link to the protein sequence
> then the sequence is truncated at the in-frame stop codon, not at the
> stop codon located at ccdsGene.cdsEnd. So I'm wondering why
> ccdsGene.cdsEnd isn't set to the end of the effective stop codon.
>
> For hg19, the situation is slightly worse. In addition to having genes
> with the same problems as reported above, 3 genes have a cumulative
> CDS length that is not even a multiple of 3 (CCDS47664.1, CCDS47663.1
> and CCDS45377.1).
>
> I would be very thankful if someone could provide some insight into
> this.
>
> Thanks,
> H.
>
> Full listing of failing genes for hg18:
>    - without an initiating ATG:
>        CCDS43136.1, CCDS34059.1, CCDS43376.1, CCDS34458.1, CCDS34457.1,
>        CCDS34737.1, CCDS6359.2, CCDS35004.1, CCDS35044.1, CCDS7878.2,
>        CCDS7877.2, CCDS41618.1, CCDS31428.1, CCDS31730.1, CCDS31729.1,
>        CCDS42102.1, CCDS32514.1, CCDS33104.1, CCDS33460.1, CCDS33646.1,
>        CCDS33647.1
>    - with one or more in-frame stop codons:
>        CCDS41340.1, CCDS41339.1, CCDS41283.1, CCDS41282.1, CCDS43091.1,
>        CCDS43389.1, CCDS43432.1, CCDS41964.1, CCDS41992.1, CCDS42100.1,
>        CCDS42150.1, CCDS42457.1, CCDS42981.1, CCDS43003.1, CCDS43034.1
>
>   