Hi Ivan,

One of our developers had this to say:

"Looking at the code, multiple mappings are apparently both normal and expected. What it takes for a CCDS-UCSC Gene mapping to make it into the table is for the two transcripts to overlap at 95% or more of the bases in the region starting at the earlier CDS start of the two transcripts and ending at the later CDS end. So this table describes more of a putative association than a precise, carefully-curated mapping."

I hope this information is helpful. Please feel free to contact the mail list again if you require further assistance.

Best,
Mary
------------------
Mary Goldman
UCSC Bioinformatics Group

On 6/14/11 3:30 PM, Ivan Adzhubey wrote:
> Hi,
>
> I was wondering if it is normal and expected for the CCDS clusters to contain
> overlapping lists of known gene transcripts? For example:
>
> mysql> select * from ccdsKgMap where geneId='uc001cry.3';
> +-------------+------------+-------+------------+----------+---------------+
> | ccdsId      | geneId     | chrom | chromStart | chromEnd | cdsSimilarity |
> +-------------+------------+-------+------------+----------+---------------+
> | CCDS44138.1 | uc001cry.3 | chr1  |   50513685 | 50667540 |      0.983784 |
> | CCDS44140.1 | uc001cry.3 | chr1  |   50513685 | 50667540 |      0.983784 |
> | CCDS553.1   | uc001cry.3 | chr1  |   50513685 | 50667540 |      0.955381 |
> +-------------+------------+-------+------------+----------+---------------+
> 3 rows in set (0.00 sec)
>
> According to this query, uc001cry.3 transcript belongs to three different CCDS
> clusters. My intuitive (and obviously incorrect) understanding of how CCDS is
> compiled would be that each CCDS entry maps to a unique set of knownGene
> transcripts. Is there a detailed description on how CCDS/KG relations are
> obtained?
>
> Thanks,
> Ivan
>
_______________________________________________
Genome maillist - [email protected]
https://lists.soe.ucsc.edu/mailman/listinfo/genome

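For readers who want to see why a threshold rule like the one described in the reply can place one transcript in several CCDS entries, here is a minimal Python sketch. It is only one possible reading of the 95% rule, not the actual UCSC code: it assumes that similarity means the number of CDS bases shared by both transcripts divided by the number of CDS bases in either, which implicitly covers the region from the earlier CDS start to the later CDS end. The function names (cds_bases, cds_similarity, is_mapped) and the toy coordinates are hypothetical and are not taken from the ccdsKgMap table.

```python
def cds_bases(exons):
    """Return the set of base positions covered by half-open (start, end) CDS blocks."""
    bases = set()
    for start, end in exons:
        bases.update(range(start, end))
    return bases


def cds_similarity(exons_a, exons_b):
    """Shared CDS bases divided by CDS bases in either transcript (assumed definition)."""
    a = cds_bases(exons_a)
    b = cds_bases(exons_b)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def is_mapped(exons_a, exons_b, threshold=0.95):
    """Apply the 95% cutoff mentioned in the reply to the assumed similarity score."""
    return cds_similarity(exons_a, exons_b) >= threshold


# Toy example: one gene transcript compared against two slightly different CDS models.
gene = [(100, 200), (300, 400)]
ccds_one = [(100, 200), (300, 395)]   # last block is 5 bases shorter
ccds_two = [(100, 200), (300, 398)]   # last block is 2 bases shorter

print(cds_similarity(gene, ccds_one))
print(is_mapped(gene, ccds_one), is_mapped(gene, ccds_two))
```

Under this reading nothing forces a transcript to pick a single best partner: every CCDS entry that clears the cutoff gets its own row, which would be consistent with uc001cry.3 appearing three times with different cdsSimilarity values.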