Hello Chunjiang,

1. Both types of score are explained on the track description page. To
reach it, click the blue/gray bar to the left of the main display, or
click the track title in the drop-down menu:

"PhastCons (which has been used in previous Conservation tracks) is a 
hidden Markov model-based method that estimates the probability that 
each nucleotide belongs to a conserved element, based on the multiple 
alignment. It considers not just each individual alignment column, but 
also its flanking columns. By contrast, phyloP separately measures 
conservation at individual columns, ignoring the effects of their 
neighbors. As a consequence, the phyloP plots have a less smooth 
appearance than the phastCons plots, with more "texture" at individual 
sites. The two methods have different strengths and weaknesses. 
PhastCons is sensitive to "runs" of conserved sites, and is therefore 
effective for picking out conserved elements. PhyloP, on the other hand, 
is more appropriate for evaluating signatures of selection at particular 
nucleotides or classes of nucleotides (e.g., third codon positions, or 
first positions of miRNA target sites).

Another important difference is that phyloP can measure acceleration 
(faster evolution than expected under neutral drift) as well as 
conservation (slower than expected evolution). In the phyloP plots, 
sites predicted to be conserved are assigned positive scores (and shown 
in blue), while sites predicted to be fast-evolving are assigned 
negative scores (and shown in red). The absolute values of the scores 
represent -log p-values under a null hypothesis of neutral evolution. 
The phastCons scores, by contrast, represent probabilities of negative 
selection and range between 0 and 1.

Both phastCons and phyloP treat alignment gaps and unaligned nucleotides 
as missing data, and both were run with the same parameters for each 
species set (vertebrates, placental mammals, and primates). Thus, in 
regions in which only primates appear in the alignment, all three sets 
of scores will be the same, but in regions in which additional species 
are available, the mammalian and/or vertebrate scores may differ from 
the primate scores. The alternative plots help to identify sequences 
that are under different evolutionary pressures in, say, primates and 
non-primates, or mammals and non-mammals."

If you need more detail than the description page provides, please
consult the references listed at the end of that page.
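
As a rough illustration of the score scales described in the passage
quoted above, the sketch below converts a phyloP score into an
approximate p-value and labels its direction. It assumes the logarithm is
base 10, which the quoted text does not state explicitly; the function
name phylop_to_pvalue is made up for this example.

```python
def phylop_to_pvalue(score, log_base=10.0):
    """Convert a phyloP score to (p_value, direction).

    Assumption: |score| = -log_base(p). Base 10 is assumed here; the
    description page only says "-log p-values".
    """
    p_value = log_base ** (-abs(score))
    if score > 0:
        direction = "conserved"      # slower than neutral, shown in blue
    elif score < 0:
        direction = "accelerated"    # faster than neutral, shown in red
    else:
        direction = "neutral"
    return p_value, direction


if __name__ == "__main__":
    for s in (2.0, -1.0, 0.0):
        p, d = phylop_to_pvalue(s)
        print(f"score={s:+.3f}  p~{p:.4g}  {d}")
```

PhastCons scores need no such conversion, since they are already
probabilities between 0 and 1.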

2. Chromosome numbers in mouse do not correspond to chromosome numbers in
human. Any particular mouse chromosome is homologous to many different
regions spread across various human chromosomes. Within a species,
chromosomes are generally numbered from largest to smallest. Except in
very closely related species, there is no reason to expect a one-to-one
correspondence between a chromosome in one species and the chromosome
given the same number in another species. The conservation scores are
computed from sequence alignments, not from chromosome numbers, so human
chromosomes 20, 21 and 22 are scored against whichever regions of the
other genomes align to them.


Best regards,

Pauline Fujita
UCSC Genome Bioinformatics Group
http://genome.ucsc.edu

On 04/06/11 08:41, chunjiang he wrote:
> Dear Vanessa and Katrina,
> 
> Thank you very much for your answer. I think I can now get the
> conservation score for each base in the genome. I also see that there are
> many gaps in the results for each chromosome. I still have two questions
> about this work.
> 
> 1. I see that phastCons and phyloP both provide a conservation score for
> each base. Which one is better when I apply them to a small special
> genome region, e.g. 100~200 bp long?
> 
> 2. I see that both phyloP and phastCons use the 46-way alignment for human
> and the 30-way alignment for mouse. Human has 22 autosomes and mouse has
> 19 autosomes. If the conservation score comes from an alignment between
> different species, how is the conservation information obtained for human
> chromosomes 20, 21 and 22? Similarly, for other species that have fewer
> chromosomes than human, how should I interpret the conservation scores
> on the human chromosomes?
> 
> Thanks so much again.
> 
> Best,
> Chunjiang
> 
> 
> On Thu, Mar 24, 2011 at 2:32 PM, Vanessa Kirkup Swing
> <[email protected]> wrote:
> 
>> Dear Chunjiang,
>>
>> Here is a previously answered question about calculating the conservation
>> score in a region:
>>
>> https://lists.soe.ucsc.edu/pipermail/genome/2010-November/024065.html
>>
>> In this case you don't need to be concerned about how to calculate the
>> reverse strand coordinates. My colleague answered this in one of your
>> earlier questions:
>>
>>> To answer your second question, strand has no meaning in these phyloP
>>> tables. A conserved base prediction is simply a prediction of a conserved
>>> base; it is the same base in forward or reverse strand.
>> I hope that this helps to clarify things for you.
>>
>> Vanessa Kirkup Swing
>> UCSC Genome Bioinformatics Group
>>
>> ----- Original Message -----
>> From: "chunjiang he" <[email protected]>
>> To: "Vanessa Kirkup Swing" <[email protected]>
>> Cc: [email protected]
>>  Sent: Thursday, March 24, 2011 9:11:08 AM
>> Subject: Re: [Genome] a question about phylop conservation
>>
>>
>>
>> Dear Dr. Swing,
>>
>> Thanks for your guidance. After reading your answer, I am still a little
>> confused about how to calculate the conservation score for a multi-base
>> region when I only have the 1-base step scores.
>> I see there is a 'span' parameter, but I don't know how to use it with the
>> table files.
>>
>> fixedStep chrom=chr1 start=10918 step=1
>> 0.064
>> 0.056
>> 0.064
>>
>> I can't find the answer from
>> http://genome.ucsc.edu/cgi-bin/hgTrackUi?db=hg19&g=cons46way
>>
>> Also, I see that chromsize is used to calculate the reverse strand
>> coordinates. How can I get the chromsize for each chromosome?
>>
>> Thanks again,
>>
>> Best,
>> Chunjiang
>>
>>
>> On Wed, Mar 9, 2011 at 12:14 PM, Vanessa Kirkup Swing <
>> [email protected] > wrote:
>>
>>
>> Dear Chunjiang,
>>
>> See the answers below:
>>
>>
>> So should I treat the 0.064 as the conservation score of the base 10918?
>>
>> Yes, the coordinates for wiggle tracks are specified as 1-relative. For a
>> chromosome of length N, the first position is 1 and the last position is N.
>> Only positions specified have data. Here is more information on wiggle
>> tracks: http://genome.ucsc.edu/goldenPath/help/wiggle.html
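
To make the 1-relative convention above concrete, here is a minimal
sketch that reads fixedStep data such as the sample in this thread and
assigns each value to its base position. The function names
parse_fixed_step and mean_score are invented for this example, the
parser ignores any span setting, and taking the mean over a region is
only one possible summary, not a method recommended anywhere in this
thread.

```python
def parse_fixed_step(lines):
    """Return {chrom: {position: score}} using 1-based positions.

    Only fixedStep declarations are handled; span is ignored (treated as 1).
    """
    scores = {}
    chrom, pos, step = None, None, 1
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith(("#", "track", "browser")):
            continue
        if line.startswith("variableStep"):
            raise NotImplementedError("only fixedStep is handled in this sketch")
        if line.startswith("fixedStep"):
            # Declaration line: key=value pairs after the keyword.
            fields = dict(f.split("=", 1) for f in line.split()[1:])
            chrom = fields["chrom"]
            pos = int(fields["start"])      # first value belongs to this base
            step = int(fields.get("step", 1))
            scores.setdefault(chrom, {})
            continue
        if chrom is None:
            raise ValueError("data line before any fixedStep declaration")
        scores[chrom][pos] = float(line)
        pos += step
    return scores


def mean_score(scores, chrom, start, end):
    """Mean over 1-based inclusive [start, end]; bases without data are skipped."""
    chrom_scores = scores.get(chrom, {})
    values = [chrom_scores[p] for p in range(start, end + 1) if p in chrom_scores]
    return sum(values) / len(values) if values else None


if __name__ == "__main__":
    sample = ["fixedStep chrom=chr1 start=10918 step=1", "0.064", "0.056", "0.064"]
    parsed = parse_fixed_step(sample)
    print(parsed["chr1"][10918])
    print(mean_score(parsed, "chr1", 10918, 10920))
```

Bases that fall in gaps between fixedStep blocks simply have no entry,
which matches the note that only positions specified have data.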
>>
>>
>>
>> If I want to get the conservation score from base 10918 to 10920, could I
>> average the three scores "0.064,0.056,0.064" or by some other method?
>>
>> Please read the description page for this track and the papers that are
>> referenced:
>>
>> http://genome.ucsc.edu/cgi-bin/hgTrackUi?db=hg19&g=cons46way
>>
>>
>>
>> And also, If I have a part of genome sequence located in:
>>
>> "chr1/- 1617197 1617281 "
>>
>> How could I get the conservation score of this part as it is on the reverse
>> strand?
>>
>> If these coordinates are from the Genome Browser and not from a table they
>> are 1-based. If they are from a table, they are 0-based. This will help
>> explain our coordinate system:
>>
>> http://genome.ucsc.edu/FAQ/FAQtracks.html#tracks1
>>
>> Here is information on how we calculate the reverse strand:
>>
>> http://genomewiki.ucsc.edu/index.php/Coordinate_Transforms
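
For reference, the usual arithmetic for mapping reverse-strand
coordinates onto the forward strand can be sketched as below. This is the
standard calculation for 0-based, half-open intervals and is not copied
from the wiki page; the function name minus_to_plus and the toy
chromosome size are assumptions for illustration. It only applies when
coordinates are actually expressed relative to the reverse strand; as
noted elsewhere in this thread, the phyloP scores themselves have no
strand.

```python
def minus_to_plus(chrom_size, start, end):
    """Map a 0-based, half-open [start, end) interval given in
    reverse-strand coordinates onto the forward strand of a chromosome
    of length chrom_size."""
    if not 0 <= start <= end <= chrom_size:
        raise ValueError("interval must satisfy 0 <= start <= end <= chrom_size")
    # The interval is mirrored, so the old end becomes the new start.
    return chrom_size - end, chrom_size - start


if __name__ == "__main__":
    # Toy chromosome of length 100 (hypothetical value).
    print(minus_to_plus(100, 10, 20))
```

Applying the function twice returns the original interval, which is a
quick sanity check on any real coordinates.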
>>
>>
>> Hope this helps! Please contact the mailing list if you have further
>> questions.
>>
>> Vanessa Kirkup Swing
>> UCSC Genome Bioinformatics Group
>>
>>
>>
>>
>>
>>
>> ----- Original Message -----
>> From: "chunjiang he" < [email protected] >
>> To: "Katrina Learned" < [email protected] >
>> Cc: [email protected]
>> Sent: Tuesday, March 8, 2011 3:29:38 PM
>> Subject: Re: [Genome] a question about phylop conservation
>>
>> Thanks so much Katrina,
>>
>> So as you say, I get the file "chr1.phyloP46way.placental.wigFix" and open
>> it. And I see the file like this
>> "fixedStep chrom=chr1 start=10918 step=1
>> 0.064
>> 0.056
>> 0.064
>> ...
>> " .
>>
>> So should I treat the 0.064 as the conservation score of the base 10918?
>> If I want to get the conservation score from base 10918 to 10920, could I
>> average the three scores "0.064,0.056,0.064" or by some other method?
>>
>> And also, If I have a part of genome sequence located in:
>>
>> "chr1/- 1617197 1617281 "
>>
>> How could I get the conservation score of this part as it is on the reverse
>> strand?
>>
>> Thanks again,
>>
>> Chunjiang
>>
>>
>> On Tue, Mar 8, 2011 at 2:57 PM, Katrina Learned < [email protected] > wrote:
>>
>>> Hi Chunjiang,
>>>
>>> The score of each base isn't actually in the table. The best place to
>>> obtain the score information is from the files in the following
>>> directory:
>>> http://hgdownload.cse.ucsc.edu/goldenPath/hg19/phyloP46way/placentalMammals/
>>>
>>> For downloading large or multiple files, please see our recommended
>>> methods in the readme.txt file:
>>> http://hgdownload.cse.ucsc.edu/goldenPath/hg19/phyloP46way/README.txt
>>>
>>> This directory contains the original ascii files that we translated (via
>>> wigEncode) into the wiggle track data. These files are the best source
>>> for the actual values at each base since the wigEncode procedure
>>> introduces its own reduction in precision, as explained:
>>> http://genomewiki.ucsc.edu/index.php/Wiggle
>>>
>>> To answer your second question, strand has no meaning in these phyloP
>>> tables. A conserved base prediction is simply a prediction of a conserved
>>> base; it is the same base in forward or reverse strand.
>>>
>>> I hope this information is helpful. Please don't hesitate to contact the
>>> mail list again if you have any further questions.
>>>
>>> Katrina Learned
>>> UCSC Genome Bioinformatics Group
>>>
>>> chunjiang he wrote, On 3/4/2011 12:06 PM:
>>>
>>>> Dear Mr/Ms,
>>>>
>>>> I want to ask which score can be used to represent conservation in
>>>> phyloP46wayPlacental.
>>>> I checked, and it has column headers like this:
>>>> *chrom*
>>>> *chromStart*
>>>> *chromEnd*
>>>> *lowerLimit*
>>>> *dataRange*
>>>> *validCount*
>>>> *sumData*
>>>> *sumSquares*
>>>> But I am not sure which column is the most important one I should use.
>>>> I see it is different from the paper they published.
>>>>
>>>> Also, I want to ask whether the positive strand and negative strand of
>>>> a chromosome have the same genomic coordinates.
>>>> That is, do the genomic coordinates here apply to both the + and - strands?
>>>>
>>>> Thanks very much.
>>>>
>>>> Chunjiang
>>>> _______________________________________________
>>>> Genome maillist - [email protected]
>>>> https://lists.soe.ucsc.edu/mailman/listinfo/genome
>>>>
