Hi Christine,
There are two levels of the algorithm to consider when it comes to
ambiguities. The first is the alignment anchoring, which uses
spaced seeds to find strings of gap-free matches among the input
sequences. For this, the sequences are encoded in a two-bit
representation, e.g. 00 for A, 01 for C, etc. Any IUPAC ambiguity code
that includes A is collapsed to 00, any remaining ambiguity code that
includes C becomes 01, and so on for G and T. This means that, for
example, M (A or C) and S (C or G) will not match in the two-bit
representation even though both could encode a C. By the same rule,
(A, M) and (M, R) are both treated as matches at this stage, since M
and R each collapse to A. However, the anchoring tolerates mismatches in
positions dictated by the seed pattern; see the Darling et al. 2006 WABI
publication for more details about those seed patterns.
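
As a rough illustration of the collapsing rule described above, here is
a minimal Python sketch. It is not taken from the progressiveMauve
source. The names collapse, seed_position_match and spaced_seed_match
are hypothetical, the codes 10 for G and 11 for T are an assumption
(only 00 for A and 01 for C are stated above), and the seed pattern
used in the example is made up for illustration.

```python
# Standard IUPAC nucleotide codes and the bases each one can stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "M": "AC", "R": "AG", "W": "AT", "S": "CG", "Y": "CT", "K": "GT",
    "V": "ACG", "H": "ACT", "D": "AGT", "B": "CGT", "N": "ACGT",
}

# Two-bit codes. A=00 and C=01 follow the description above;
# G=10 and T=11 are assumed here to complete the table.
TWO_BIT = {"A": 0b00, "C": 0b01, "G": 0b10, "T": 0b11}


def collapse(code):
    """Collapse an IUPAC code to two bits, preferring A, then C, then G, then T."""
    bases = IUPAC[code.upper()]
    for base in "ACGT":
        if base in bases:
            return TWO_BIT[base]
    raise ValueError(f"no concrete base for code {code!r}")


def seed_position_match(a, b):
    """True if two characters are identical after two-bit collapsing."""
    return collapse(a) == collapse(b)


def spaced_seed_match(x, y, pattern):
    """True if x and y agree at every '1' position of the seed pattern.

    Positions marked '0' are ignored, so mismatches there are tolerated.
    """
    if not (len(x) == len(y) == len(pattern)):
        raise ValueError("sequences and pattern must have equal length")
    return all(
        seed_position_match(a, b)
        for a, b, p in zip(x, y, pattern)
        if p == "1"
    )


if __name__ == "__main__":
    for pair in [("A", "M"), ("M", "R"), ("M", "S")]:
        print(pair, seed_position_match(*pair))
    # Hypothetical 5-position pattern that ignores the middle column.
    print(spaced_seed_match("ACMGT", "ACSGT", "11011"))
```

Under these assumptions the pairs (A, M) and (M, R) compare equal
because M and R both collapse to A, while (M, S) does not, unless the
differing column falls on a position the seed pattern ignores.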
Second, once a set of anchors has been selected, progressiveMauve
uses the MUSCLE algorithm to compute the gapped alignment between
anchors and then to refine the alignment around the anchors. In
this stage the sequences, including any IUPAC codes, are passed on to
MUSCLE. For details about how MUSCLE handles these characters, I think
your best bet is to ask Bob Edgar, who should be able to give the
authoritative answer.
Best,
-Aaron
On Fri, 2016-11-04 at 15:03 +0000, Christine Jandrasits wrote:
> Dear mauve-users,
> 
> I am trying to align sequences with a lot of ambiguous DNA bases (M,
> R, W, S, Y, ...) with progressiveMauve and I was wondering how the
> tool is handling these.
> 
> E.g. is the pair (A, M) considered a match or mismatch? How about (M,
> R)?
> (M = A or C; R = A or G). Are ambiguous bases even considered when
> matching or are they replaced by "N" as with some other (alignment-)
> tools?
> 
> I tried to find my answer through the used substitution matrix but
> all references to the HOXD matrix only contain A, T, C, G...
> 
> Thanks in advance,
> Christine
> _______________________________________________
> Mauve-users mailing list
> Mauve-users@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/mauve-users
-- 
Aaron E. Darling, Ph.D.
Associate Professor, ithree institute
University of Technology Sydney
Australia

http://darlinglab.org
twitter: @koadman




