Hi Aanchal,

Since rendering to pixels inherently loses precision, and the volume of data for LD is quite large, our database table uses a lossy compression scheme for those values, developed by Daryl Thomas. This scheme (details below) could be reversed to get approximate/binned values for r^2, D' and LOD. However, it would probably be better to use the precise values from the .LD data files that were processed into our database representation.

Unfortunately, the .LD files used to make the hg18.hapmapLd* tables were lost in a disk crash. Fortunately, HapMap has .LD files from a more recent release of genotypes; see the files in ftp://ftp.hapmap.org/hapmap/ld_data/2009-02_phaseIII_r2/ including 00README.txt, which explains the .LD file format.

If you download HapMap's files from that ftp link and make a file "myRsIds.txt" that contains your rs#'s of interest, with one rs# per line, like this:

rs12627640
rs240444
rs10432925

then for each chromosome and population code, a command like this will extract the relevant lines of the downloaded file:

  zcat ld_${chr}_${pop}.txt.gz | grep -Fwf myRsIds.txt > myLD_${chr}_${pop}.txt

The paired rsIds will be in the 4th and 5th columns, and r^2 in the 7th column.

Details of the lossy compression scheme, for the record:

D' and r^2 values in the range [0,1] are encoded like this:

  encodedValue = 'a' + (actualValue * 9)

D' values in the range [-1,0) are encoded like this:

  encodedValue = 'A' - (actualValue * 9)

For LOD it's more complicated and involves the absolute value of D' (|D'|):

* if LOD >= 2 and |D'| < 0.5, then encodedValue = 'y' (pink).
* if LOD < 2 and |D'| < 0.99, then encodedValue = 'z' (blue).
* otherwise, encodedValue = 'a' + min(9, (LOD - |D'| - 1.5))

After the actual values are transformed into alphabetic characters, the characters are concatenated into strings in order of the SNPs' appearance after the current SNP. So the Nth character in the concatenated string represents the score between the current SNP and the Nth SNP that follows it.

Hope that helps,
Angie

----- "aanchal sharma" <[email protected]> wrote:

> From: "aanchal sharma" <[email protected]>
> To: [email protected]
> Sent: Friday, June 3, 2011 12:23:41 AM GMT -08:00 US/Canada Pacific
> Subject: [Genome] How to interprate D prime , r^2 and LOD values in LD phased
> data
>
> Dear Sir /Madam
>
> For a certain set of SNPs I want to know which are the SNPs in LD with the
> query SNPs and their r^2 values. The output that I am downloading from UCSC
> tables, is giving me a table in which D prime values, r^2 and LOD scores are
> represented in alphabets. I am unable to inertprate the results. Also how
> can I get the list of all other SNPs in LD with my query SNPs?
> Waiting for the soon reply.
>
> Regards
> Aanchal
> _______________________________________________
> Genome maillist - [email protected]
> https://lists.soe.ucsc.edu/mailman/listinfo/genome
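
For anyone who wants to reverse the encoding described under "Details of the lossy compression scheme" in the reply, here is a minimal Python sketch. It is not code from the Genome Browser source: the function names are made up for illustration, and it assumes the encoder truncates (actualValue * 9) to an integer, which the description does not state. The decoded numbers are therefore only the lower edge of a bin about 1/9 wide. LOD cannot be recovered as a number from its character alone, so the sketch only reports what the character implies.

```python
def decode_unit_score(ch):
    """Approximate r^2 or D' value for one encoded character.

    'a'..'j' cover [0, 1]; 'A'..'J' cover negative D' values in [-1, 0).
    Assumption: the result is the lower edge (in magnitude) of the bin.
    """
    if "a" <= ch <= "j":
        return (ord(ch) - ord("a")) / 9
    if "A" <= ch <= "J":
        return -(ord(ch) - ord("A")) / 9
    raise ValueError(f"unexpected r^2/D' character: {ch!r}")


def decode_lod(ch):
    """Describe what one encoded LOD character implies."""
    if ch == "y":
        return "LOD >= 2 and |D'| < 0.5"
    if ch == "z":
        return "LOD < 2 and |D'| < 0.99"
    if "a" <= ch <= "j":
        k = ord(ch) - ord("a")
        # k == 9 is the cap from min(9, ...), so it means "9 or more".
        suffix = " or more" if k == 9 else ""
        return f"LOD - |D'| - 1.5 is about {k}{suffix}"
    raise ValueError(f"unexpected LOD character: {ch!r}")


def decode_scores(encoded, decoder=decode_unit_score):
    """Decode a whole string; item N-1 is the score between the
    current SNP and the Nth SNP that follows it."""
    return [decoder(ch) for ch in encoded]
```

For example, decode_scores("aj") gives [0.0, 1.0], and decode_scores("yz", decode_lod) gives the two LOD/D' class descriptions for the next two SNPs.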
