Hello Paul,

Yes, there is a sanity check as part of the process. Please examine a SNP's details page for information about surrounding sequence, SNP type, multiple hit locations, or other quality metrics.

For the SNPs that are absent, on examination it appears that many map to alternate versions of the genomic sequence (Celera, HuRef) or are noted as "no link established by analysis of contig annotation". One of our developers suggests reviewing the dbSNP report, as it has interesting information/statistics down in the GeneView section.

Thank you for following up and getting the full answer. This information will also help many other data users.

Jennifer Jackson
UCSC Genome Bioinformatics Group

Paul de Bakker wrote:
> Hi Jennifer:
>
> So based on your re-alignments, do you correct the strand field if the
> strand reported by dbSNP was wrong?
>
> I have 282 SNPs that have a valid entry in dbSNP (by using their
> browser) but do not appear in the dbSNP129 table in UCSC.
>
> http://www.broad.mit.edu/~debakker/282_not_annotated_snps.txt
>
> Any chance you could help me figure out why these do not appear?
>
> Best wishes,
>
> Paul
>
>
>
> On Mar 23, 2009, at 2:59 PM, Jennifer Jackson wrote:
>
>> Hello Paul,
>>
>> We get everything in the snp129 table directly from dbSNP. (Look in
>> the "Data Sources" section for details.)
>> However when you click through to the details page, we do re-align
>> the flanking sequences to the genomic (see "UCSC Re-alignment of
>> flanking sequences").
>>
>> And you are correct, the strand field means genomic strand. We save
>> most of our coordinates (including these) with respect to the (+)
>> strand, so you don't need to do anything if that is what you want.
>>
>> Coordinate help:
>> http://genomewiki.ucsc.edu/index.php/Coordinate_Transforms
>>
>> Thanks,
>> Jennifer Jackson
>> UCSC Genome Bioinformatics Group
>>
>>
>>
>> Paul de Bakker wrote:
>>> What I really want to know is what I need to do to get the listed
>>> alleles to be oriented on the fwd/+ strand?
>>>
>>> Do you internally align the dbSNP fasta sequences to the hg18 to
>>> ascertain the "strand" or are these records straight copies from dbSNP?
>>>
>>> Thanks!
>>> Paul
>>>
>>>
>>> On Mar 22, 2009, at 11:51 AM, Paul de Bakker wrote:
>>>
>>>
>>>> Hi [email protected]
>>>>
>>>> I have downloaded the dbSNP129 data from the Table browser
>>>> (database: hg18) and I wanted to check with you that the listed
>>>> "strand" field refers to the + or - strand of the genome assembly
>>>> (i.e. hg18)? That is, if I make everything on the + strand I
>>>> only need to flip the Watson-Crick bases for the SNPs where strand
>>>> = "-"?
>>>>
>>>> Thanks
>>>> Paul
>>>>
>>>>
>>>
>>> _______________________________________________
>>> Genome maillist - [email protected]
>>> http://www.soe.ucsc.edu/mailman/listinfo/genome
>>>
>
_______________________________________________
Genome maillist - [email protected]
http://www.soe.ucsc.edu/mailman/listinfo/genome
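
For readers who want to apply the strand handling discussed in this thread, here is a minimal Python sketch of the approach Paul describes: leave alleles alone when strand is "+", and reverse-complement them when strand is "-". It assumes the alleles arrive as a slash-separated string such as "A/G" (the form I believe the snp129 "observed" column uses, but please check your own download). The function names are hypothetical and are not part of any UCSC tool.

```python
# Complement table for plain DNA bases. IUPAC ambiguity codes are not handled.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string such as 'AC'."""
    return seq.translate(_COMPLEMENT)[::-1]


def alleles_on_plus_strand(observed: str, strand: str) -> str:
    """Re-express slash-separated alleles (e.g. 'A/G') on the + strand.

    Assumption: `observed` is slash-separated and `strand` is '+' or '-'.
    Alleles are returned unchanged for '+'. For '-', each allele made up
    only of A/C/G/T is reverse-complemented; any other token (for example
    '-' for a deletion) is passed through unchanged.
    """
    if strand == "+":
        return observed
    if strand != "-":
        raise ValueError(f"unexpected strand value: {strand!r}")
    flipped = []
    for allele in observed.split("/"):
        if allele and set(allele) <= set("ACGTacgt"):
            flipped.append(reverse_complement(allele))
        else:
            flipped.append(allele)  # not a plain base string; keep as-is
    return "/".join(flipped)


if __name__ == "__main__":
    print(alleles_on_plus_strand("A/G", "-"))   # T/C
    print(alleles_on_plus_strand("A/G", "+"))   # A/G
    print(alleles_on_plus_strand("-/AC", "-"))  # -/GT
```

Multi-base alleles are reversed as well as complemented, since reading the opposite strand also reverses the base order. The coordinates themselves need no change, because they are already stored with respect to the (+) strand.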
