Hi Pär,

On Wed, 2013-09-11 at 14:53 +0200, Pär Larsson wrote:

> We have recently noted, however, that the number of shared
> (orthologous) nucleotide sites in the final alignment actually becomes
> significantly smaller when we run the genomes in larger sets. The more
> genomes included in the subalignments, the worse the situation gets.
> We get the best result when we do subalignments for individual genomes
> with a reference before merging. Numbers of homoplastic SNPs in the
> different alignments appear to remain about the same.
> 
> This phenomenon is to me completely counterintuitive as you would
> expect that more genomes in an alignment would enable a more accurate
> end result.

Yes, in principle one would expect that more genomes might lead to a
more accurate alignment. I believe such results have been demonstrated
for multiple sequence alignments of individual genes. And years ago in
my Ph.D. thesis work I demonstrated that adding (simulated) phylogenetic
intermediates between two distant genomes resulted in improved genome
alignments. But the distances studied were far greater than the 1%
divergence you speak of.

> Has anyone else also experienced this? 

In order to answer this it would help to understand a bit more precisely
what you mean by "shared nucleotide sites in the final alignment". Do
you mean alignment columns in which all organisms are present with a
non-gap character? If so, it makes sense that this decays as genomes are
added, and I would expect it to do so at a rate proportional to the
phylogenetic distance of the organisms added. This is something I regularly
observe and expect, and the decrease in alignment columns can be
dramatic for collections of organisms with highly variable gene content.
However, in addition to the expected decay, there may be issues with the
aligner that cause a decay in the number of such alignment columns with
increasing genome count. I am not aware of any major systematic problems
with the current progressiveMauve implementation that would cause this,
but like all software there are bugs and limitations yet to be
discovered.
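
To make the definition above concrete, here is a minimal Python sketch of
how one might count alignment columns in which every genome has a non-gap
character. It is not part of Mauve; the function name count_core_columns
and the toy sequences are made up for illustration, and it assumes the
alignment has already been read into equal-length gapped strings with "-"
as the gap character.

```python
def count_core_columns(rows, gap_chars="-"):
    """Count alignment columns in which every row has a non-gap character."""
    rows = list(rows)
    if not rows:
        return 0
    length = len(rows[0])
    if any(len(row) != length for row in rows):
        raise ValueError("all aligned rows must have the same length")
    # zip(*rows) walks the alignment column by column.
    return sum(
        1 for column in zip(*rows)
        if all(char not in gap_chars for char in column)
    )


# Toy alignment (hypothetical sequences, not real data).
toy = {
    "ref":     "ACGTACGTAC",
    "genomeA": "ACGTAC--AC",
    "genomeB": "AC-TACGTAC",
    "genomeC": "----ACGTAC",
}

# Core column count as genomes are added one at a time.
names = list(toy)
core_counts = [
    count_core_columns(toy[name] for name in names[:k])
    for k in range(1, len(names) + 1)
]
print(core_counts)
```

With a fixed alignment this count can only stay the same or shrink as rows
are added, because each new genome can only add gaps to a column. Comparing
it between the pairwise-then-merged alignment and the larger subalignments
would show whether the loss goes beyond that expected decay.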

Best,
-Aaron
-- 
Aaron E. Darling, Ph.D.
Associate Professor, ithree institute
University of Technology Sydney
Australia

http://darlinglab.org
twitter: @koadman


