Hello Jirong,

I am glad that you wrote back to clarify this, as one of our developers 
brought up the possibility of confusion over the score earlier in the 
week with regard to my initial answer. The score in the hapmapAlleles* 
tables you mention is a data quality metric. For the other hapmapSNP* 
tables, this field is also a score, but perhaps not the score you are 
interested in. If I understand your question correctly, you are 
interested in the score as represented in the Conservation track 
(phastCons*Way, multiple species alignments).

I am unable to view your attachments, but I can provide some help for 
obtaining the Conservation score for HapMap SNP locations.

There are a few methods for obtaining this data (Table Browser, Galaxy, 
and Flat File). If you are interested in a specific score for a 
particular HapMap SNP location, the best choice is to access the flat 
files directly.

The files are located in our downloads area ("Downloads" in the blue 
menu bar at http://genome.ucsc.edu/, left side of main browser window). 
From the downloads page, select "Human" and the track "Conservation 
scores for alignments of 43 vertebrate genomes with Human". The ftp path 
and a description of the files are in the README document.
Instructions for ftp: http://genome.ucsc.edu/FAQ/FAQdownloads#download1
File format help: http://genome.ucsc.edu/FAQ/FAQformat

The idea would be to extract the score at the base position of each 
HapMap SNP from the .wig formatted files. You would need to develop 
simple tools to search, match, and extract the data points.
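
As a rough illustration of such a tool, here is a minimal Python sketch. 
It assumes the score files are in fixedStep wiggle format (see the README 
and the file format help above to confirm) and that you have already 
built a set of (chrom, position) pairs for your SNPs. Wiggle positions 
are 1-based, while the chromStart field in our tables is 0-based, so for 
a single-base SNP the 1-based position is chromEnd. The function names 
are made up for this example and are not part of any UCSC tool.

```python
def parse_declaration(line):
    """Parse 'fixedStep chrom=chrN start=S step=T' into (chrom, start, step)."""
    fields = dict(token.split("=", 1) for token in line.split()[1:])
    return fields["chrom"], int(fields["start"]), int(fields.get("step", 1))


def scores_at(wig_lines, wanted):
    """Yield (chrom, pos, score) for every 1-based position found in `wanted`.

    wig_lines: any iterable of text lines (an open file works).
    wanted:    set of (chrom, pos) tuples, pos being 1-based.
    """
    chrom, pos, step = None, 0, 1
    for line in wig_lines:
        line = line.strip()
        if not line or line.startswith(("track", "browser", "#")):
            continue  # skip blank lines, header lines and comments
        if line.startswith("fixedStep"):
            # A new block starts: reset the running position.
            chrom, pos, step = parse_declaration(line)
            continue
        if line.startswith("variableStep"):
            raise ValueError("variableStep blocks are not handled in this sketch")
        if (chrom, pos) in wanted:
            yield chrom, pos, float(line)
        pos += step  # each data line advances the position by `step`
```

To use it, open one of the downloaded score files (for example with 
gzip.open(path, "rt") if it is compressed) and pass the file object in 
together with your set of SNP positions. Positions that fall in gaps 
between blocks simply produce no output row.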

If you decide to use the Table Browser, it is possible to start with the 
Conservation track and the phastCons* table of your choice and perform 
an intersection against the entire HapMap track, or against a subset by 
limiting the results to a genomic region or to a custom track containing 
a subset of the HapMap track. When doing this intersection, you will 
retrieve complete "blocks" of data from the original base table that 
have any overlap with the data you are using as a filter (the genomic 
region or the intersection track). This means the data will not 
correlate 1-1: any overlapping block will be returned in its entirety, 
and the original HapMap data point name will not be annotated in the 
output. This is why I advise against this method.

If you decide to use Galaxy, the data from two tables can also be 
intersected, with the added advantage that data from both the base and 
intersection tables will appear in the result. Send the tables to 
Galaxy, format as necessary, and join the data.

Suggested tables to get conservation score data from in the latest human 
assembly are phastCons44way (newest data, one score for a block of data) 
and multiz28waySummary (conservation scores per species). Galaxy can 
convert the file formats to interval or bed to provide "one row per base 
position", which will make comparing the data easier.
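
To show what "one row per base position" buys you, here is a small 
Python sketch of the same idea done by hand. It assumes a fixedStep 
block (1-based start) on one side and BED-style SNP rows (0-based 
chromStart) on the other. The function names and the SNP names in the 
usage note are invented for the example.

```python
def expand_block(chrom, start, values, step=1):
    """Turn one fixedStep block (1-based start) into 0-based BED-style rows."""
    for i, value in enumerate(values):
        chrom_start = start - 1 + i * step
        yield chrom, chrom_start, chrom_start + 1, value


def join_snps(per_base_rows, snp_rows):
    """Attach a score to each SNP row by matching on (chrom, chromStart).

    per_base_rows: iterable of (chrom, chromStart, chromEnd, score)
    snp_rows:      iterable of (chrom, chromStart, chromEnd, name)
    SNPs with no matching base get None as their score.
    """
    scores = {(chrom, start): value for chrom, start, _end, value in per_base_rows}
    for chrom, chrom_start, _chrom_end, name in snp_rows:
        yield name, chrom, chrom_start, scores.get((chrom, chrom_start))
```

For instance, expanding a block that starts at chr1:100 with three 
values and joining it against SNP rows ("chr1", 100, 101, "rsA") and 
("chr1", 500, 501, "rsB") gives a score for rsA and None for rsB. Unlike 
the Table Browser intersection, the SNP name is kept in the output.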

I hope this helps,
Jennifer Jackson
UCSC Genome Bioinformatics Group





Long, Jirong wrote:
> Many thanks, Jennifer.
>
> We downloaded hapmapAllelesSummary.txt. We just want to make sure the
> 6th column is for the conservation score, such as 0, 0, 65, 0, 4 in the
> first 5 rows of the hapmapAllelesSummary.txt. Please see attachments.
> Appreciate your kind help.
>
> Warmest regard,
>
> Jirong
>
> -----Original Message-----
> From: Jennifer Jackson [mailto:[email protected]] 
> Sent: Tuesday, February 17, 2009 5:40 PM
> To: Long, Jirong
> Cc: [email protected]
> Subject: Re: [Genome] conservate score for HapMap SNPs
>
> Hello,
> You can ftp the files from our downloads server or save the files from 
> the Table Browser.
>
> For ftp, go to the main browser web site http://genome.ucsc.edu/ and 
> click on "Downloads" in the left blue navigation bar. For the most 
> recent data for human (hg18), click on Human, then click into the 
> Annotation Database directory. Files named like hapmapSnps*.txt.gz (and 
> maybe hapmapAlleles*.txt.gz) are the files related to the HapMap SNPs
> track.
> Instructions for ftp access:
>
> For saving from the Table browser, go to the main browser web site 
> http://genome.ucsc.edu/ and click on "Table Browser" in the left blue 
> navigation bar. Select the clade, genome, assembly for the latest human.
>
> Set group: Variation and Repeats and track: HapMap SNPs. The associated 
> tables will be in the tables pull-down menu. Use the "View table schema"
> button to view table contents (may be a useful tool anyway, even if 
> you use ftp). Make sure that region is set to genome and output format 
> is set to all fields from selected tables. Name the output file and it 
> will save to your computer.
>
> For more info about the track, go into the Human assembly browser and 
> click on the track name for a full description of methods, sources, etc.
>
> Thanks!
> Jennifer Jackson
> UCSC Genome Bioinformatics Group
>
>
> Long, Jirong wrote:
>   
>> Dear Sir/Madam,
>>
>>  
>>
>> We are wondering whether you have a ftp address that we can use to
>> download the conservative score for each of the HapMap SNPs? Thanks.
>>
>>  
>>
>> Best,
>>
>>  
>>
>> Jirong
>>
>>  
>>
>> *******************************************
>>
>> Jirong Long, PhD
>>
>> Assistant Professor
>>
>> Vanderbilt Epidemiology Center
>>
>> Vanderbilt Ingram Cancer Center
>>
>> Eighth floor, Suite 800
>>
>> 2525 West End Avenue
>>
>> Nashville, TN 37203-1738
>>
>> Tel: 615-343-6741
>> Fax: 615-322-0502
>>
>> E-mail: [email protected]
>>
>>  
>>
>> _______________________________________________
>> Genome maillist  -  [email protected]
>> http://www.soe.ucsc.edu/mailman/listinfo/genome
>>   
>>     
>>
>> ------------------------------------------------------------------------
>>
>> -- MySQL dump 10.10
>> --
>> -- Host: localhost    Database: hg18
>> -- ------------------------------------------------------
>> -- Server version    5.0.21
>>
>> /*!40101 SET @OLD_CHARACTER_SET_CLIENT=@@CHARACTER_SET_CLIENT */;
>> /*!40101 SET @OLD_CHARACTER_SET_RESULTS=@@CHARACTER_SET_RESULTS */;
>> /*!40101 SET @OLD_COLLATION_CONNECTION=@@COLLATION_CONNECTION */;
>> /*!40101 SET NAMES utf8 */;
>> /*!40103 SET @OLD_TIME_ZONE=@@TIME_ZONE */;
>> /*!40103 SET TIME_ZONE='+00:00' */;
>> /*!40101 SET @OLD_SQL_MODE=@@SQL_MODE, SQL_MODE='' */;
>> /*!40111 SET @OLD_SQL_NOTES=@@SQL_NOTES, SQL_NOTES=0 */;
>>
>> --
>> -- Table structure for table `hapmapAllelesSummary`
>> --
>>
>> DROP TABLE IF EXISTS `hapmapAllelesSummary`;
>> CREATE TABLE `hapmapAllelesSummary` (
>>   `bin` int(10) unsigned NOT NULL default '0',
>>   `chrom` varchar(255) NOT NULL default '',
>>   `chromStart` int(10) unsigned NOT NULL default '0',
>>   `chromEnd` int(10) unsigned NOT NULL default '0',
>>   `name` varchar(255) NOT NULL default '',
>>   `score` int(10) unsigned NOT NULL default '0',
>>   `strand` enum('+','-','?') NOT NULL default '?',
>>   `observed` varchar(255) NOT NULL default '',
>>   `allele1` enum('A','C','G','T') NOT NULL default 'A',
>>   `allele2` enum('C','G','T','none') NOT NULL default 'C',
>>   `popCount` int(10) unsigned NOT NULL default '0',
>>   `isMixed` varchar(255) NOT NULL default '',
>>   `majorAlleleCEU` enum('A','C','G','T','none') NOT NULL default 'A',
>>   `majorAlleleCountCEU` int(10) unsigned NOT NULL default '0',
>>   `totalAlleleCountCEU` int(10) unsigned NOT NULL default '0',
>>   `majorAlleleCHB` enum('A','C','G','T','none') NOT NULL default 'A',
>>   `majorAlleleCountCHB` int(10) unsigned NOT NULL default '0',
>>   `totalAlleleCountCHB` int(10) unsigned NOT NULL default '0',
>>   `majorAlleleJPT` enum('A','C','G','T','none') NOT NULL default 'A',
>>   `majorAlleleCountJPT` int(10) unsigned NOT NULL default '0',
>>   `totalAlleleCountJPT` int(10) unsigned NOT NULL default '0',
>>   `majorAlleleYRI` enum('A','C','G','T','none') NOT NULL default 'A',
>>   `majorAlleleCountYRI` int(10) unsigned NOT NULL default '0',
>>   `totalAlleleCountYRI` int(10) unsigned NOT NULL default '0',
>>   `chimpAllele` enum('A','C','G','N','T','none') NOT NULL default 'A',
>>   `chimpAlleleQuality` int(10) unsigned NOT NULL default '0',
>>   `macaqueAllele` enum('A','C','G','N','T','none') NOT NULL default 'A',
>>   `macaqueAlleleQuality` int(10) unsigned NOT NULL default '0',
>>   KEY `name` (`name`),
>>   KEY `chrom` (`chrom`,`bin`)
>> ) ENGINE=MyISAM DEFAULT CHARSET=latin1;
>>
>> /*!40103 SET time_zo...@old_time_zone */;
>>
>> /*!40101 SET sql_mo...@old_sql_mode */;
>> /*!40101 SET character_set_clie...@old_character_set_client */;
>> /*!40101 SET character_set_resul...@old_character_set_results */;
>> /*!40101 SET collation_connecti...@old_collation_connection */;
>> /*!40111 SET sql_not...@old_sql_notes */;
>>
>> -- Dump completed on 2007-07-11 17:35:45
>>     
_______________________________________________
Genome maillist  -  [email protected]
http://www.soe.ucsc.edu/mailman/listinfo/genome
