Hello Ivan,

Thank you very much for the detailed bug report!  There was an off-by-one 
coordinate bug that applied to this particular situation (rightmost base of 
exon is leftmost base of - strand codon with SNP on the exon to the right).  It 
has been fixed in the source code and will go out with the next code release.  
(In the meantime, it is fixed on our test server genome-test.cse.ucsc.edu, with 
the caveat that the code and data there may be unstable.)
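
For readers who want to reproduce the check from the report quoted below, here is a minimal sketch in Python. It assumes the exon 1 and exon 2 coding sequences exactly as pasted in the original message (transcript orientation, so already reverse-complemented for the minus strand), and it assumes the SNP falls on the first base of the split codon, as the dbSNP annotation (GGC --> AGC) implies. The function names and the four-entry codon table are illustrative only; this is not the Genome Browser's code.

```python
# Exon sequences copied from the quoted report (uc001add.2, transcript orientation).
EXON1 = "".join("""
ATG GCA CAG CAC GGG GCG ATG GGC GCG TTT CGG
GCC CTG TGC GGC CTG GCG CTG CTG TGC GCG CTC
AGC CTG GGT CAG CGC CCC ACC GGG GGT CCC GGG
TGC GGC CCT GGG CGC CTC CTG CTT GGG ACG GGA
ACG GAC GCG CGC TGC TGC CGG GTT CAC ACG ACG
CGC TGC TGC CGC GAT TAC CCG G
""".split())

# Only the start of exon 2 was given in the report; the rest was skipped.
EXON2_START = "".join("GC GAG GAG TGC TGT TCC GAG TGG GAC TGC ATG".split())

# Deliberately partial codon table: just the codons that appear in the report.
PARTIAL_CODE = {"GGC": "G", "AGC": "S", "GGG": "G", "AGG": "R"}


def junction_codon(exon1, exon2):
    """Return (codon, bases_in_exon1) for the codon split across the junction."""
    bases_in_exon1 = len(exon1) % 3
    if bases_in_exon1 == 0:
        raise ValueError("no codon is split across this exon boundary")
    start = len(exon1) - bases_in_exon1
    return (exon1 + exon2)[start:start + 3], bases_in_exon1


def apply_snp(codon, offset, alt_base):
    """Return the codon with the base at `offset` (0-based) replaced."""
    return codon[:offset] + alt_base + codon[offset + 1:]


ref_codon, bases_in_exon1 = junction_codon(EXON1, EXON2_START)
# Assumption: the SNP is the first codon base (G --> A), per the dbSNP annotation.
alt_codon = apply_snp(ref_codon, 0, "A")

print(f"{ref_codon} ({PARTIAL_CODE[ref_codon]}) --> "
      f"{alt_codon} ({PARTIAL_CODE[alt_codon]})")
# prints: GGC (G) --> AGC (S)
```

Joining the exons before slicing keeps the codon frame intact across the splice junction, which is the step where a one-base shift would turn GGC into GGG.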

Best,
Angie


----- "Ivan Adzhubey" <[email protected]> wrote:

> From: "Ivan Adzhubey" <[email protected]>
> To: [email protected]
> Sent: Saturday, August 14, 2010 9:24:57 PM GMT -08:00 US/Canada Pacific
> Subject: [Genome] Reference sequence annotation conflicts
>
> Hello,
> 
> I have a question about sequence conflicts that occasionally show up between
> the dbSNP missense SNP annotation page and the reference sequence. Here is one
> example from the hg19 / dbSNP 131 SNP annotation page:
> 
> dbSNP build 131 rs72894038
> 
> dbSNP: rs72894038
> Position: chr1:1141765-1141765
> Band: 1p36.33
> Genomic Size: 1
> 
> <skipped>
> 
> Coding annotations by dbSNP:
> NM_004195: missense G (GGC) --> S (AGC)
> NM_148901: missense G (GGC) --> S (AGC)
> NM_148902: missense G (GGC) --> S (AGC)
> 
> UCSC's predicted function relative to selected gene tracks:
> UCSC Genes    TNFRSF18 (uc001add.2)   missense G (GGG) --> R (AGG)
> UCSC Genes    TNFRSF18 (uc001adc.2)   missense G (GGG) --> R (AGG)
> UCSC Genes    TNFRSF18 (uc001adb.2)   missense G (GGG) --> R (AGG)
> 
> Now if you look at the sequence of the uc001add.2 transcript, this position in
> fact falls within codon GGC (NOT GGG) on the reference sequence. It is a bit
> complicated by the fact that this transcript is annotated on the minus strand
> and the codon in question is split between the end of exon 1 and the start of
> exon 2:
> 
> Exon 1:
> ATG GCA CAG CAC GGG GCG ATG GGC GCG TTT CGG
> GCC CTG TGC GGC CTG GCG CTG CTG TGC GCG CTC
> AGC CTG GGT CAG CGC CCC ACC GGG GGT CCC GGG
> TGC GGC CCT GGG CGC CTC CTG CTT GGG ACG GGA
> ACG GAC GCG CGC TGC TGC CGG GTT CAC ACG ACG
> CGC TGC TGC CGC GAT TAC CCG G
> 
> Exon 2:
>  GC GAG GAG TGC TGT TCC GAG TGG GAC TGC ATG
> <rest skipped>
> 
> I used the Genomic Sequence link on the gene annotation page of the Genome
> Browser to extract the above exon sequences, but I obtain the same results
> when using the exons/CDS annotations from the MySQL knownGene table for this
> transcript and fetching the corresponding sequences from the reference
> chromosome assemblies. I have also verified this using the hg19 Genome Browser
> web interface. In all cases I can see the GGC codon, not GGG. Note that the
> codon sequence in the NCBI annotations is correct.
> 
> This is not an isolated (although rare) case, and I can supply more examples
> if necessary. Since there is no explanation of how exactly this "UCSC's
> predicted function relative to selected gene tracks" section of the SNP
> annotation page is produced, I wonder what may have caused such
> discrepancies?
> 
> Best,
> Ivan
> 
> 
> 
> -- 
> Ivan Adzhubey, Ph.D.
> Instructor
> Division of Genetics, Dept of Medicine
> Brigham & Women's Hospital, Harvard Medical School
> HMS New Research Building, Room 0464C
> 77 Avenue Louis Pasteur
> Boston, MA 02115
> tel.: (617) 525-4728
> fax:  (617) 525-4705
> web:
> http://genetics.bwh.harvard.edu/genetics/members/Ivan_Adzhubey.html
> _______________________________________________
> Genome maillist  -  [email protected]
> https://lists.soe.ucsc.edu/mailman/listinfo/genome
