Hi Greg,

As you indicated in another email, we recommend removing SNPs that map 
to multiple locations to help increase accuracy in mapping. One of our 
developers adds that "duplicated regions of the genome are difficult to 
assemble and might be more likely to change from one assembly to the 
next than unique regions.  Also, I believe dbSNP did not remap SNP 
flanking sequences to GRCh37/hg19, but did their own coordinate 
translation of NCBI36/hg18 snp130 mappings to GRCh37 and possibly some 
filtering." Thus, the liftOver file may not be of as high quality for 
these regions, and dbSNP's own process is another variable.
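
As a rough illustration of the filtering step (a sketch only, not a UCSC 
tool): assuming your SNPs are in tab-separated BED format with at least 
four columns and the SNP identifier in column 4, something like the 
following Python would drop every identifier that appears on more than 
one line before you run liftOver. The function name and the column 
layout are assumptions for the example.

```python
from collections import Counter


def drop_multiply_mapped(bed_lines):
    """Keep only BED records whose name (column 4) occurs exactly once.

    Assumes tab-separated BED input with at least four columns; header
    lines starting with '#', 'track', or 'browser' are skipped.
    """
    records = []
    for line in bed_lines:
        line = line.rstrip("\n")
        if not line or line.startswith(("#", "track", "browser")):
            continue
        records.append(line.split("\t"))

    # Count how many genomic locations each SNP name is mapped to.
    counts = Counter(fields[3] for fields in records)

    # A name seen more than once is multiply mapped, so drop all its records.
    return ["\t".join(fields) for fields in records if counts[fields[3]] == 1]
```

The surviving records can then be written back to a file and passed to 
liftOver as usual.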

She also added "As of snp132, we are going to do more to separate out 
those multiply-mapped SNPs because they cause trouble in all sorts of 
analyses. The 1000Genomes pilot project has identified genomic regions 
(in hg18) that are not unique enough for the alignment tools to be very 
confident that they have identified the right match, and, anecdotally, 
I've seen a lot of multiply-mapped SNPs in or next to those regions."

I hope this information is helpful. Please feel free to contact the mail 
list again if you require further assistance.

Best,
Mary
------------------
Mary Goldman
UCSC Bioinformatics Group

On 1/11/11 7:49 AM, Gregory Dougherty wrote:
> We are evaluating the liftOver tool for converting SNPs, that our researchers 
> have found, from hg18 to hg19 (IOW, not ones that are in dbSNPs).  As a test 
> I ran the hg18 dbSNPs 130 through liftOver, then compared the results to the 
> hg19 dbSNPs 130.
>
> My results:
> Starting: 18,833,531
> Converted by liftOver: 15,423,712
> Match Official position: 11,788,968
> Failed to map: 3,409,819
> Mapped to "wrong" location: 3,634,744
>
> 1: Should my experiment have worked?  "Should" the liftOver tool have gotten 
> all the SNPs to their official hg19 locations?
> 2: I'm not really worried about the SNPs that failed to map, but having ~20% of 
> the SNPs map to the wrong location is kind of concerning.  What is your 
> expected error rate?
>
> Thank you,
>
> Greg
> _______________________________________________
> Genome maillist  -  [email protected]
> https://lists.soe.ucsc.edu/mailman/listinfo/genome
