Hi Pär,

On Wed, 2013-09-11 at 14:53 +0200, Pär Larsson wrote:
> We have recently noted, however, that the number of shared
> (orthologous) nucleotide sites in the final alignment actually becomes
> significantly smaller when we run the genomes in larger sets. The more
> genomes included in the subalignments, the worse the situation gets.
> We get the best result when we do subalignments for individual genomes
> with a reference before merging. Numbers of homoplastic SNPs in the
> different alignments appear to remain about the same.
>
> This phenomenon is to me completely contra-intuitive as you would
> expect that more genomes in an alignment would enable a more accurate
> end result.

Yes, in principle one would expect that more genomes might lead to a more accurate alignment. I believe such results have been demonstrated for multiple sequence alignments of individual genes. And years ago, in my Ph.D. thesis work, I demonstrated that adding (simulated) phylogenetic intermediates between two distant genomes resulted in improved genome alignments. But the distances studied were far greater than the 1% divergence you speak of.

> Has anyone else also experienced this?

To answer this, it would help to understand a bit more precisely what you mean by "shared nucleotide sites in the final alignment". Do you mean alignment columns in which all organisms are present with a non-gap character? If so, it makes sense that this decays as genomes are added; I would expect it to do so at a rate proportional to the phylogenetic distance of the organisms added. This is something I regularly observe and expect, and the decrease in alignment columns can be dramatic for collections of organisms with highly variable gene content.

However, in addition to the expected decay, there may be issues with the aligner that cause a decay in the number of such alignment columns with increasing genome count.
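
To make the column definition above concrete, here is a minimal Python sketch of counting columns in which every genome has a non-gap character. It assumes the alignment has already been loaded as equal-length gapped strings keyed by genome name; the function name `count_core_columns` and the toy sequences are made up for illustration and are not part of Mauve.

```python
def count_core_columns(alignment, gap_chars="-"):
    """Count columns in which every sequence has a non-gap character."""
    seqs = list(alignment.values())
    if not seqs:
        return 0
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all aligned sequences must have the same length")
    # zip(*seqs) yields one tuple per alignment column
    return sum(all(c not in gap_chars for c in column) for column in zip(*seqs))


# Toy alignment (hypothetical data, not taken from any real genome)
toy = {
    "ref":     "ACGTACGTAC",
    "genomeA": "ACGTAC--AC",
    "genomeB": "AC-TACGTAC",
    "genomeC": "ACGTACGT--",
}

# Recount as genomes are added one at a time
names = list(toy)
counts = []
for k in range(1, len(names) + 1):
    subset = {name: toy[name] for name in names[:k]}
    counts.append(count_core_columns(subset))

print(counts)  # [10, 8, 7, 5]
```

In this toy case the count can only stay the same or drop as genomes are added, because each new genome can only introduce gaps into more columns. Comparing that expected decay against what the merged alignments actually show is one way to tell it apart from an aligner problem.
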
I am not aware of any major systematic problems with the current progressiveMauve implementation that would cause this, but like all software, there are bugs and limitations yet to be discovered.

Best,
-Aaron

--
Aaron E. Darling, Ph.D.
Associate Professor, ithree institute
University of Technology Sydney
Australia
http://darlinglab.org
twitter: @koadman

_______________________________________________
Mauve-users mailing list
Mauve-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/mauve-users