Hi David,

Hmmm, interesting....

Can you send me example sequences for these cases? I will take a look.

All the best,

Peter Rice
EMBOSS Team

On 05/02/2019 22:58, David Mathog wrote:
EMBOSS 6.6.0 on CentOS 6.9.

I am trying to align a bunch of 2-20kbp contigs against a 175kbp BAC with stretcher, and some odd things are falling out.  I think these point to a problem in that program's handling of end gaps.  It is invoked like this:

   stretcher -aseq BAC.fasta -bseq contig.fasta \
           -outfile pairs.fasta -aformat3 fasta -auto


1.  If a ~25kbp contig aligns so that its final 12kbp overlaps (nearly exactly, about 99.9% identity) the first 12kbp of the BAC, the alignment produced is a total mess.  It seems that stretcher cannot handle end gaps in this context at all, and forces the 12kbp which should be dangling unpaired off the left end of the BAC into alignment internally.  needle doesn't work in this situation either, since both of these commands segfault (on a nearly idle machine with 512GB of RAM):

   needle BAC.fasta contig.fasta -outfile pairs.fasta -aformat3 fasta -auto
   needle BAC.fasta contig.fasta -outfile pairs.fasta -aformat3 fasta \
       -endweight T -endextend 0 -endopen 1 -auto

The ssw_test program from SSWlib handles this correctly.  (Unfortunately it does a local alignment and so cannot replace needle in this context.)
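
For reference, the end-gap behaviour expected in case 1 can be sketched with a toy dynamic-programming scorer. This is only an illustration in Python, not the algorithm used by stretcher or needle; the function name `alignment_score`, the linear gap penalty, and the scoring values are all assumptions chosen for the example. With end gaps free, a contig whose tail overlaps the head of the BAC can score well while the overhang dangles unpaired; with end gaps penalised, the same pair scores worse.

```python
def alignment_score(a, b, match=1, mismatch=-1, gap=-2, free_end_gaps=True):
    """Score the best alignment of a against b with a linear gap penalty.

    Hypothetical helper for illustration only. When free_end_gaps is True,
    leading and trailing gaps in either sequence cost nothing (an overlap or
    semi-global alignment); otherwise every gap is penalised (plain global).
    """
    n, m = len(a), len(b)
    # prev[j] holds the best score for aligning a[:i] with b[:j] (row i).
    prev = [0 if free_end_gaps else j * gap for j in range(m + 1)]
    best_last_col = prev[m]  # best score seen so far in the last column
    for i in range(1, n + 1):
        cur = [0 if free_end_gaps else i * gap]
        for j in range(1, m + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            cur.append(max(diag, prev[j] + gap, cur[j - 1] + gap))
        best_last_col = max(best_last_col, cur[m])
        prev = cur
    if free_end_gaps:
        # The alignment may end anywhere on the last row or last column.
        return max(best_last_col, max(prev))
    return prev[m]


if __name__ == "__main__":
    bac = "ACGTACGTTT"     # toy stand-in for the BAC
    contig = "GGGGGACGTA"  # its last 5 bases match the first 5 of the BAC
    print("free end gaps:     ", alignment_score(bac, contig))
    print("penalised end gaps:", alignment_score(bac, contig, free_end_gaps=False))
```

The toy sequences are far too short to reproduce the problem itself; they only show which scoring regime rewards the dangling-end layout described above.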

2.  This happens a lot:

    AATTC(lots of sequence)ATGAC...  (BAC)
    A--------(...)----------TGAC...  (contig)

It will also shift two A's, but not 3, or shift AAT, and so forth.  Needle doesn't do this when it runs the same alignment (with the second command of the pair above), but Needle is also much slower than stretcher.
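
The pattern in case 2 could be screened for automatically in the aligned output. The following Python sketch is an illustration only: the function name `stray_leading_residues` and the thresholds `max_stray` and `min_gap` are hypothetical, and it assumes the gapped contig row has already been joined into a single string with `-` as the gap character.

```python
import re


def stray_leading_residues(aligned_row, max_stray=5, min_gap=10):
    """Return the few residues stranded before a long gap run, or ''.

    Hypothetical helper: flags rows such as 'A--------...----TGAC', where
    1..max_stray residues sit at the very start of the row and are followed
    by at least min_gap gap characters and then more sequence.
    """
    pattern = r"([A-Za-z]{1,%d})-{%d,}[A-Za-z]" % (max_stray, min_gap)
    m = re.match(pattern, aligned_row)  # re.match anchors at the row start
    return m.group(1) if m else ""


if __name__ == "__main__":
    row = "A" + "-" * 20 + "TGAC"
    print(repr(stray_leading_residues(row)))
```

Rows that begin with an ordinary end gap, or that have no long gap near the start, return an empty string.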

The contigs have been flipped if necessary so that they are all in the same direction
as the BAC.

Regards,

David Mathog
[email protected]
Manager, Sequence Analysis Facility, Biology Division, Caltech
_______________________________________________
EMBOSS mailing list
[email protected]
http://mailman.open-bio.org/mailman/listinfo/emboss
