Hi Franck, see below...

On Thu, 2014-04-24 at 18:30 +1000, fceru...@toulouse.inra.fr wrote:
> Hello,
>
> I'm a student in bioinformatics and I'm currently undergoing training
> at INRA Toulouse (France) under the supervision of Helene Chiapello
> and Christine Gaspin.
>
> I have a question about the Mauve software concerning the "extract
> ortholog" feature provided by the GUI. I have encountered several
> problems while analyzing a large dataset of 27 E. coli genomes: for
> several ortholog groups, I identified multiple overlapping coding
> sequences in the same genome.
>
> After manual inspection, I realized that these overlapping CDS seem
> mainly to highlight two situations:
>
> => Annotation mistakes in the original GenBank file: overlapping
> coding sequences are located in different reading frames in the same
> genomic region. In this case, I was able to choose one of the two
> coding sequences by looking at %identity and coverage of the
> alignments.
>
> => Real overlapping coding sequences: strongly annotated and
> predicted CDS, located in the same reading frame (with a coding
> sequence included in another one, for example). It should also be
> possible here to choose the real ortholog CDS among the two coding
> sequences by looking at alignment coverage.
>
> I'd like to know if a fix already exists to handle these cases. Or
> maybe you plan to do it?
There is a bit of a nomenclature problem here. The software produces
something that should probably be called a positional homolog group
rather than an ortholog group. Inference of orthology requires
resolution of phylogenetic relationships, and that is not happening
within Mauve. Instead, a relatively simple set of criteria is applied
to identify clusters of positionally homologous features: if a pair of
features overlap by at least a fixed percent (>50%) and have at least a
particular percent sequence identity over their length, they are called
as homologous. Those thresholds are adjustable via the graphical
interface.

With that said, in the situation you describe, both overlapping CDS
should be positional homologs to the aligned region in the other
genome. It sounds like you are more interested in the notion of
functional equivalence. The term ortholog was unfortunately bastardized
by early genomics researchers in this context to refer to functionally
equivalent homologous genes. You might be able to do some relatively
simple postprocessing of the Mauve output to identify good candidates
for functional equivalence, but it's not something currently supported
by Mauve itself. The use of the term "ortholog" in Mauve was a
nomenclature mistake made when developing the feature.

> Last question: is there a way to extract orthologs using a
> command-line program?

This is available in the snapshots and will be in a future release, but
not in the 2.3.1 release. In the meantime you can get the snapshots
here:
http://gel.ahabs.wisc.edu/mauve/snapshots

You will want to run something like:

    java -cp Mauve.jar org.gel.mauve.analysis.OneToOneOrthologExporter

Best,
-Aaron

-- 
Aaron E. Darling, Ph.D.
Associate Professor, ithree institute
University of Technology Sydney
Australia
http://darlinglab.org
twitter: @koadman
_______________________________________________
Mauve-users mailing list
Mauve-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/mauve-users
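
The postprocessing suggested in the reply could look roughly like the sketch below. It is only an illustration, not part of Mauve: the `Candidate` record, the function names, and the input layout are all hypothetical, and it assumes you have already computed, for each CDS in a positional homolog group, its alignment coverage and percent identity. The 50% overlap threshold comes from the reply; the identity threshold has no value stated there, so the caller must supply one. Within each genome, the sketch keeps the CDS with the highest coverage, breaking ties by identity, as proposed in the original question.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Hypothetical record for one CDS in a positional homolog group."""
    genome: str      # genome identifier
    locus: str       # CDS locus tag
    coverage: float  # fraction of the CDS covered by the alignment (0..1)
    identity: float  # fraction of identical aligned positions (0..1)


def is_positional_homolog(coverage, identity, *, min_identity, min_coverage=0.5):
    """Apply the two thresholds described in the reply.

    min_coverage=0.5 mirrors the ">50%" overlap criterion; min_identity
    has no value given in the email and must be chosen by the caller.
    """
    return coverage > min_coverage and identity >= min_identity


def pick_one_per_genome(group, *, min_identity, min_coverage=0.5):
    """Keep, for each genome, the passing CDS with the best coverage.

    Ties on coverage are broken by identity. Returns {genome: Candidate}.
    """
    best = {}
    for cand in group:
        if not is_positional_homolog(
            cand.coverage, cand.identity,
            min_identity=min_identity, min_coverage=min_coverage,
        ):
            continue
        current = best.get(cand.genome)
        if current is None or (cand.coverage, cand.identity) > (
            current.coverage, current.identity
        ):
            best[cand.genome] = cand
    return best


# Made-up example values: two overlapping CDS in genomeA, one in genomeB.
group = [
    Candidate("genomeA", "cds_1", coverage=0.95, identity=0.98),
    Candidate("genomeA", "cds_2", coverage=0.60, identity=0.97),
    Candidate("genomeB", "cds_9", coverage=0.90, identity=0.96),
]
chosen = pick_one_per_genome(group, min_identity=0.6)  # 0.6 is arbitrary
for genome, cand in sorted(chosen.items()):
    print(genome, cand.locus)
```

Ranking by coverage first reflects the second case in the question (one CDS nested inside another in the same frame); for the first case (conflicting reading frames) you may prefer to rank by identity first, which only requires swapping the tuple order in the comparison.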