Hi Laura,

Thanks for writing. Yes, you are correct that a 0 in the .backbone file indicates that a particular genome lacks a positional homolog for that region. As for the coordinate system, the aligner concatenates all contigs into a single coordinate system, in the same order that they appear in the input file. Header characters are not counted, and coordinates do not restart at each contig. If you are observing coordinates that are out of range, it suggests a problem parsing the input sequence file, possibly due to the formatting of the file.
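To make the coordinate system concrete, here is a minimal Python sketch of how a region could be pulled out of a multi-FASTA file using a pair of backbone coordinates. It is not part of Mauve: the function names (read_fasta_concatenated, reverse_complement, extract_region) are hypothetical. It assumes that backbone coordinates are 1-based and inclusive. It also handles the negative (reverse strand) coordinates explained in the next paragraph.

```python
def read_fasta_concatenated(lines):
    """Join all contigs in file order into one string; header lines are skipped."""
    parts = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith(">"):
            continue  # headers and blank lines add nothing to the coordinates
        parts.append(line)
    return "".join(parts)


_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Reverse complement; characters other than ACGT (e.g. N) are left as-is."""
    return seq.translate(_COMPLEMENT)[::-1]


def extract_region(genome, left, right):
    """Extract one backbone segment from the concatenated genome.

    Assumption: coordinates are 1-based and inclusive.
    0/0 means the genome lacks the region, so None is returned.
    Negative values mean the region is on the reverse strand.
    """
    if left == 0 and right == 0:
        return None
    start, end = sorted((abs(left), abs(right)))
    if end > len(genome):
        raise ValueError("coordinates out of range: check the input file format")
    sub = genome[start - 1:end]  # convert to Python's 0-based, half-open slice
    if left < 0 or right < 0:
        return reverse_complement(sub)  # extract first, then reverse complement
    return sub


# Toy example with two contigs; real input would come from open("genome.fasta")
fasta = [">contig1", "ACGT", "AC", ">contig2", "GGTT"]
genome = read_fasta_concatenated(fasta)
forward = extract_region(genome, 5, 8)
reverse = extract_region(genome, -5, -8)
```

In the toy example the region 5..8 spans the boundary between the two contigs, which is expected since the coordinates run over the concatenated sequence.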
The negative coordinates indicate a region on the reverse complement strand. To extract such a region, take the subsequence between the absolute values (i.e. the positive values) of the left and right ends in the concatenated coordinate system, then reverse complement it. Do not reverse complement the entire genome and then extract; doing so will produce incorrect results.

In terms of utilities to help wrangle alignment data, if you are comfortable with C++ programming you can use libMems itself, which provides an API for manipulating alignments and is what both mauveAligner and progressiveMauve build upon. Alternatively, I believe BioPerl and some of the other Bio* projects can parse XMFA and MAF genome alignments. You can convert progressiveMauve's XMFA files to MAF with the xmfa2maf utility.

Best,
-Aaron

On Mon, 2014-03-31 at 13:08 +1100, Laura Perlaza wrote:
> Hi Everyone,
>
> I'm working with MAUVE. I'm interested in certain regions that are
> present in one of my genomes, but not in the others in the alignment.
> What I did was use the backbone file and pick the regions that
> have coordinates for that genome (I assumed zero meant the region was
> absent). Additionally, I made a script that uses the coordinates in the
> backbone file to extract the sequences of interest from the genome. I
> have several assumptions about the way the coordinates in the backbone
> file reflect the information in the genomes. However, I have been
> getting weird sequences, and this led me to ask you several questions,
> hoping you could help me.
>
> 1. When I have a fragmented genome, I have a multi-FASTA file
> representing this genome. When I want to extract the sequences using
> the coordinates from the backbone file, what assumptions do I have to
> make to deal with the different contigs? How does Mauve count
> over a multi-FASTA file? Is Mauve counting the letters in the
> headers? Do the coordinates restart each time a header appears?
> I mean, for each contig in my FASTA file, does Mauve start from 1 again?
>
> 2. How do I deal with the negative coordinates? Do I extract the
> sequences and then reverse that sequence? Or do I extract
> those coordinates from the whole genome reversed? Do I have to start
> counting from the end of the sequence to the beginning?
>
> 3. Is there a utility available that helps with these issues?
>
> Thanks very much for your attention!
> --
> Laura Perlaza-Jiménez
> Graduate Research Assistant
> Mycology and Plant Pathology Laboratory (LAMFU)
> Universidad de Los Andes. Bogotá, Colombia.

--
Aaron E. Darling, Ph.D.
Associate Professor, ithree institute
University of Technology Sydney
Australia
http://darlinglab.org
twitter: @koadman

_______________________________________________
Mauve-users mailing list
Mauve-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/mauve-users