Hello Haiyan,

When a SNP's flanking sequences map to multiple genomic locations, we add "MultipleAlignments" to the exceptions column as a red flag, as you can see in your example. The observed differences could be due to population variation, or simply to duplicated/low-complexity regions in the genome. As a quality filter, you might consider ignoring SNPs with the MultipleAlignments flag.
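
A minimal sketch of that filter in Python, assuming snp135.txt is tab-separated and that the rs ID and the exceptions field sit at 0-based positions 4 and 18, as they appear to in the example rows quoted below. The function name and constants are only illustrative, not part of any UCSC tool:

```python
NAME_COL = 4         # rs ID, e.g. rs74873759 (position assumed from the example rows)
EXCEPTIONS_COL = 18  # comma-separated exception flags (position assumed likewise)
FLAG = "MultipleAlignments"


def drop_multiply_aligned(lines):
    """Yield rows (lists of fields) whose exceptions column does not contain FLAG."""
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) <= EXCEPTIONS_COL:
            continue  # skip blank or truncated lines
        flags = fields[EXCEPTIONS_COL].split(",")
        if FLAG not in flags:
            yield fields
```

It could be used along the lines of `with open("snp135.txt") as f: kept = list(drop_multiply_aligned(f))`. Note that this drops every placement of a flagged rs ID, so rs74873759 would not appear in the output at all.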

There is probably a synchronization issue that explains the differences you see between dbSNP's display and our display. dbSNP constantly updates the data they display. We typically get the data that dbSNP makes available for download when they announce a new release, and we display that data until another new data set is released. One more note: the single mapping shown on dbSNP's details page for rs74873759 is to Craig Venter's genome, "HuRef", not the GRCh37 (hg19) assembly.

I hope this helps explain the SNP data we provide. If you have further questions, please contact us again at [email protected].

--
Brooke Rhead
UCSC Genome Bioinformatics Group

On 3/30/12 12:51 PM, Zhang, Haiyan wrote:
> Dear Sir/Madam,
>
> I want to download snp135.txt as for our GWAS studies. I found out that in this file that are multiple entries for one rs_id, just wonder which one should I use, why there are so many entries?
> e.g. rs74873759
> Three entries in snp135.txt:
> 683 chr1 12927802 12927803 rs74873759 0 +
> G G G/T genomic single unknown 0.5 0 unknown
> exact 3 MultipleAlignments 1 ENSEMBL, 2 G,T,
> 1.000000,1.000000, 0.500000,0.500000,
> 1564 chr11 128320751 128320752 rs74873759 0 -
> T T G/T genomic single unknown 0.5 0 unknown
> exact 3 ObservedMismatch,MultipleAlignments 1 ENSEMBL,
> 2 G,T, 1.000000,1.000000, 0.500000,0.500000,
> 1630 chr3 137089491 137089492 rs74873759 0 +
> G G G/T genomic single unknown 0.5 0 unknown
> exact 3 MultipleAlignments 1 ENSEMBL, 2 G,T,
> 1.000000,1.000000, 0.500000,0.500000,
>
> While If I searched NCBI dbSNP database, http://www.ncbi.nlm.nih.gov/projects/SNP/snp_ref.cgi?rs=74873759
> The only mapping locations is chr 11:124269136
>
> If I want to choose only one entry from snp135.txt for each rs_id, how do I
> choose?
>
> Thanks.
>
> -Haiyan
> _______________________________________________
> Genome maillist - [email protected]
> https://lists.soe.ucsc.edu/mailman/listinfo/genome
