I was using the (passing) NW test as a baseline to compare the (failing)
anchored test to so as to make sure I was using the BioJava API
correctly. In my particular use case, I want to perform global alignment
with only the first base anchored but unfortunately, NW does not expose
get/set to anchor (and the meaning of anchoring a base for SW/local
alignment is a bit ambiguous).
A bug has been raised with a test case for ease of reproduction.
Cheers
Daniel
On 27/11/2013 3:34 PM, Andreas Prlic wrote:
needleman -wunsch is doing a global alignment, try smith waterman
instead.
can you file a bug report on github?
https://github.com/biojava/biojava/issues
Thanks,
Andreas
On Tue, Nov 26, 2013 at 7:54 PM, Daniel Cameron <[email protected]
<mailto:[email protected]>> wrote:
That API does make sense but unfortunately, it crashes when I try
it. Having traced through the source & test cases from github, it
appears that there isn't actually any test coverage for the
(anchor != null) path in align().
Additionally, align() also explicitly forces the final base of the
query to align to the final base of the target which in my case is
not supposed to be an anchor. I've copied my test cases with
expected behaviour. Is there a bug track system I should be
raising this with?
Cheers
Daniel
@Test
public void testNeedlemanWunschDNAAlignment() {
DNASequence query = new DNASequence("ACGTAACCGGTT",
AmbiguityDNACompoundSet.getDNACompoundSet());
DNASequence target = new
DNASequence("AACGTAACCGGTTACGTACGT",
AmbiguityDNACompoundSet.getDNACompoundSet());
// 123456789012345678900
// Expected: |----------|
NeedlemanWunsch<DNASequence, NucleotideCompound> nw = new
NeedlemanWunsch<DNASequence, NucleotideCompound>(query, target,
new SimpleGapPenalty((short)5, (short)2),
SubstitutionMatrixHelper.getNuc4_4());
AlignedSequence<DNASequence, NucleotideCompound> aligned =
nw.getPair().getQuery();
assertEquals(2, (int)aligned.getStart().getPosition());
assertEquals(13, (int)aligned.getEnd().getPosition());
}
@Test
public void testAnchoredDNAAlignment() {
DNASequence query = new DNASequence("ACGTAACCGGTT",
AmbiguityDNACompoundSet.getDNACompoundSet());
DNASequence target = new
DNASequence("AACGTAACCGGTTACGTACGT",
AmbiguityDNACompoundSet.getDNACompoundSet());
// 123456789012345678900
// Expected: |_----------|
AnchoredPairwiseSequenceAligner<DNASequence, NucleotideCompound>
aligner = new AnchoredPairwiseSequenceAligner<DNASequence,
NucleotideCompound>(query, target, new SimpleGapPenalty((short)5,
(short)2), SubstitutionMatrixHelper.getNuc4_4());
int[] anchors = new int[query.getLength()];
for (int i = 0; i < anchors.length; i++) anchors[i] = -1;
anchors[0] = 0;
AlignedSequence<DNASequence, NucleotideCompound> aligned =
aligner.getPair().getQuery();
assertEquals(1, (int)aligned.getStart().getPosition());
assertEquals(13, (int)aligned.getEnd().getPosition());
}
On 27/11/2013 12:00 PM, Andreas Prlic wrote:
Hi Daniel,
Clearly we need some documentation for this.
Looking at the source: if you get to
AbstractMatrixAligner.align() you can see how the anchors are
being used.
I just took a brief look, so I might be wrong, but I think the
int[] array should have the length of the query sequence and each
position indicates the counterpart in the target sequence.
Positions that are <=0 should be considered not to be anchored.
If this is right, then this should be very close to what you were
expecting.
Can you take a closer look and confirm if this works for you?
Thanks,
Andreas
On Sun, Nov 24, 2013 at 11:22 PM, Daniel Cameron
<[email protected] <mailto:[email protected]>> wrote:
Hello all
I'm looking the BioJava API for anchored alignment and I'm
unsure of how to set alignment anchors. The API exposes int[]
get/setAnchor() which is confusing me somewhat and I'm unsure
of what a BioJava anchor actually is. My use case is the
comparison of two sequences which I have a priori knowledge
of particular positions so I was expecting an API where I
could specify base positions in the query to correspond
different positions in the target. Eg: I was query base 4 to
align to target base 10 and query base 100 to align to target
base 80. Is this possible?
With anchor being only an int[] and the source looking like
this is not an array of paired values, how would one go about
performing the above alignment?
Thanks
Daniel Cameron
______________________________________________________________________
The information in this email is confidential and intended
solely for the addressee.
You must not disclose, forward, print or use it without the
permission of the sender.
______________________________________________________________________
_______________________________________________
Biojava-l mailing list - [email protected]
<mailto:[email protected]>
http://lists.open-bio.org/mailman/listinfo/biojava-l
______________________________________________________________________
The information in this email is confidential and intended solely
for the addressee.
You must not disclose, forward, print or use it without the
permission of the sender.
______________________________________________________________________
______________________________________________________________________
The information in this email is confidential and intended solely for the
addressee.
You must not disclose, forward, print or use it without the permission of the
sender.
______________________________________________________________________
_______________________________________________
Biojava-l mailing list - [email protected]
http://lists.open-bio.org/mailman/listinfo/biojava-l