Hi Daniel...

On Wed, 2015-07-01 at 23:09 +0200, Daniel Dörr wrote:

> Would you recommend partitioning each input sequence into multiple records 
> of a multi-fasta file so as to omit large masked regions? As I understood 
> from the manual and from previous posts on this mailing list, 
> progressiveMauve concatenates all records in multi-fasta files of the input. 
> I guess my question is: in constructing the backbone, does it prevent 
> homologous segments from covering more than one record of a multi-fasta file?

That's a great question. Currently the backbone entries are not split up
by contig but are reported in the concatenated coordinate space. So cutting
out the repeat-masked regions and breaking contigs would prevent these
regions from becoming aligned, but would potentially add extra complexity
to interpreting the backbone file.
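
To illustrate the extra bookkeeping, here is a minimal sketch of mapping a
backbone coordinate in the concatenated space back to a record of the
multi-fasta file. It is not part of Mauve. The names build_offsets and locate
are hypothetical, and the sketch assumes that records are concatenated in file
order without padding, that coordinates are 1-based, and that a negative sign
only marks orientation. Please check these assumptions against your own data.

```python
from bisect import bisect_right


def build_offsets(record_lengths):
    """Return the 1-based start of each record in the concatenated space.

    record_lengths is a list of (name, length) pairs in file order.
    Assumption: records are joined back to back with no padding.
    """
    starts = []
    pos = 1
    for _, length in record_lengths:
        starts.append(pos)
        pos += length
    return starts


def locate(coord, record_lengths, starts):
    """Map a concatenated coordinate to (record name, 1-based local position).

    Assumption: the sign of coord only encodes orientation, so the
    absolute value is used for the lookup.
    """
    pos = abs(coord)
    if pos == 0:
        raise ValueError("0 means the segment is absent in this genome")
    # Index of the last record whose start is <= pos.
    i = bisect_right(starts, pos) - 1
    name, length = record_lengths[i]
    local = pos - starts[i] + 1
    if local > length:
        raise ValueError("coordinate lies beyond the last record")
    return name, local


records = [("contigA", 100), ("contigB", 50)]  # made-up lengths
starts = build_offsets(records)
print(locate(101, records, starts))
```

A backbone segment whose left and right ends map to different records spans
a record boundary, which is the case you would need to handle separately.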

> 
> >> 2) I observe sometimes strange lines in the backbone file such as the 
> >> following:
> >> ___
> >> 7691835 7691966 -85715547 -85715547 0 0 0 0 349474437 349474583 -700243823 -700243822 0 0
> >> 8282300 8282275 0 0 0 0 0 0 0 0 0 0 0 0
> >> ___
> >> 
> >> Note that in the first line, the segments specified by columns [3,4] and 
> >> [11, 12] have lengths 0 and -1, respectively. Negative lengths mostly 
> >> occur for segments that are not homologous to segments in other genomes, 
> >> as shown in the second line (which makes me wonder why they are included 
> >> in the backbone file in the first place).
> > 
> > I've not seen this before but yes it does seem like a bug! As a
> > workaround, is it possible to ignore these segments in your downstream
> > processing until I can get a fix?
> 
> Yes, currently I identify and discard these homologous blocks when processing 
> the backbone file. I would like to note that these “strange lines” occur 
> extremely rarely in my dataset - only 89 out of 1693288 lines in the backbone 
> file contain entries of negative segment length. 
> 

OK, good to know that the extent of the problem is relatively small.
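
For anyone else who runs into this, the workaround could look roughly like the
sketch below. It is only an illustration, not code from Mauve. The function
names are hypothetical, and it assumes that each backbone line holds one
(left end, right end) pair of integers per genome, that "0 0" marks a genome
in which the segment is absent, that a negative sign only marks orientation,
and that any line whose first field is not an integer is a header. Segment
length is computed as abs(right) - abs(left), which reproduces the lengths of
0 and -1 quoted above.

```python
def segment_lengths(line):
    """Return one length per genome, or None where the segment is absent."""
    fields = [int(tok) for tok in line.split()]
    if len(fields) % 2 != 0:
        raise ValueError("expected an even number of columns")
    lengths = []
    for left, right in zip(fields[0::2], fields[1::2]):
        if left == 0 and right == 0:
            lengths.append(None)  # assumption: 0 0 means "not present"
        else:
            lengths.append(abs(right) - abs(left))
    return lengths


def has_negative_length(line):
    """True if any present segment on the line has a negative length."""
    return any(n is not None and n < 0 for n in segment_lengths(line))


def filter_backbone(lines):
    """Split backbone lines into (kept, dropped) lists."""
    kept, dropped = [], []
    for line in lines:
        stripped = line.strip()
        if not stripped:
            continue
        first = stripped.split()[0]
        if not first.lstrip("-").isdigit():
            kept.append(stripped)  # assumption: non-numeric line is a header
            continue
        if has_negative_length(stripped):
            dropped.append(stripped)
        else:
            kept.append(stripped)
    return kept, dropped


example = [
    "7691835 7691966 -85715547 -85715547 0 0 0 0 "
    "349474437 349474583 -700243823 -700243822 0 0",
    "8282300 8282275 0 0 0 0 0 0 0 0 0 0 0 0",
    "100 200 -300 -400 0 0 0 0 0 0 0 0 0 0",  # made-up well-formed line
]
kept, dropped = filter_backbone(example)
print(len(kept), len(dropped))
```

This drops the whole line when any genome has a negative length. Zero-length
segments are kept here; whether those should also be discarded is a separate
decision.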

-- 
Aaron E. Darling, Ph.D.
Associate Professor, ithree institute
University of Technology Sydney
Australia

http://darlinglab.org
twitter: @koadman





_______________________________________________
Mauve-users mailing list
Mauve-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/mauve-users
