I was using the (passing) NW test as a baseline to compare the (failing) anchored test to so as to make sure I was using the BioJava API correctly. In my particular use case, I want to perform global alignment with only the first base anchored but unfortunately, NW does not expose get/set to anchor (and the meaning of anchoring a base for SW/local alignment is a bit ambiguous).

A bug has been raised with a test case for ease of reproduction.

Cheers
Daniel

On 27/11/2013 3:34 PM, Andreas Prlic wrote:
needleman -wunsch is doing a global alignment, try smith waterman instead.

can you file a bug report on github? https://github.com/biojava/biojava/issues

Thanks,

Andreas


On Tue, Nov 26, 2013 at 7:54 PM, Daniel Cameron <[email protected] <mailto:[email protected]>> wrote:

    That API does make sense but unfortunately, it crashes when I try
    it. Having traced through the source & test cases from github, it
    appears that there isn't actually any test coverage for the
    (anchor != null) path in align().

    Additionally, align() also explicitly forces the final base of the
    query to align to the final base of the target which in my case is
    not supposed to be an anchor. I've copied my test cases with
    expected behaviour. Is there a bug track system I should be
    raising this with?

    Cheers
    Daniel

        @Test
        public void testNeedlemanWunschDNAAlignment() {
            DNASequence query = new DNASequence("ACGTAACCGGTT",
    AmbiguityDNACompoundSet.getDNACompoundSet());
            DNASequence target = new
    DNASequence("AACGTAACCGGTTACGTACGT",
    AmbiguityDNACompoundSet.getDNACompoundSet());
            // 123456789012345678900
            // Expected: |----------|
            NeedlemanWunsch<DNASequence, NucleotideCompound> nw = new
    NeedlemanWunsch<DNASequence, NucleotideCompound>(query, target,
    new SimpleGapPenalty((short)5, (short)2),
    SubstitutionMatrixHelper.getNuc4_4());
            AlignedSequence<DNASequence, NucleotideCompound> aligned =
    nw.getPair().getQuery();
            assertEquals(2, (int)aligned.getStart().getPosition());
            assertEquals(13, (int)aligned.getEnd().getPosition());
        }
        @Test
        public void testAnchoredDNAAlignment() {
            DNASequence query = new DNASequence("ACGTAACCGGTT",
    AmbiguityDNACompoundSet.getDNACompoundSet());
            DNASequence target = new
    DNASequence("AACGTAACCGGTTACGTACGT",
    AmbiguityDNACompoundSet.getDNACompoundSet());
            // 123456789012345678900
            // Expected: |_----------|
    AnchoredPairwiseSequenceAligner<DNASequence, NucleotideCompound>
    aligner = new AnchoredPairwiseSequenceAligner<DNASequence,
    NucleotideCompound>(query, target, new SimpleGapPenalty((short)5,
    (short)2), SubstitutionMatrixHelper.getNuc4_4());
            int[] anchors = new int[query.getLength()];
            for (int i = 0; i < anchors.length; i++) anchors[i] = -1;
            anchors[0] = 0;
            AlignedSequence<DNASequence, NucleotideCompound> aligned =
    aligner.getPair().getQuery();
            assertEquals(1, (int)aligned.getStart().getPosition());
            assertEquals(13, (int)aligned.getEnd().getPosition());

        }

    On 27/11/2013 12:00 PM, Andreas Prlic wrote:
    Hi Daniel,

    Clearly we need some documentation for this.

    Looking at the source: if you get to
    AbstractMatrixAligner.align() you can see how the anchors are
    being used.

    I just took a brief look, so I might be wrong, but I think the
    int[] array should have the length of the query sequence and each
    position indicates the counterpart in the target sequence.
    Positions that are <=0 should be considered not to be anchored.
    If this is right, then this should be very close to what you were
    expecting.

    Can you take a closer look and confirm if this works for you?

    Thanks,

    Andreas

    On Sun, Nov 24, 2013 at 11:22 PM, Daniel Cameron
    <[email protected] <mailto:[email protected]>> wrote:

        Hello all

        I'm looking the BioJava API for anchored alignment and I'm
        unsure of how to set alignment anchors. The API exposes int[]
        get/setAnchor() which is confusing me somewhat and I'm unsure
        of what a BioJava anchor actually is. My use case is the
        comparison of two sequences which I have a priori knowledge
        of particular positions so I was expecting an API where I
        could specify base positions in the query to correspond
        different positions in the target. Eg: I was query base 4 to
        align to target base 10 and query base 100 to align to target
        base 80. Is this possible?

        With anchor being only an int[] and the source looking like
        this is not an array of paired values, how would one go about
        performing the above alignment?


        Thanks
        Daniel Cameron

        ______________________________________________________________________
        The information in this email is confidential and intended
        solely for the addressee.
        You must not disclose, forward, print or use it without the
        permission of the sender.
        ______________________________________________________________________
        _______________________________________________
        Biojava-l mailing list  - [email protected]
        <mailto:[email protected]>
        http://lists.open-bio.org/mailman/listinfo/biojava-l


    ______________________________________________________________________
    The information in this email is confidential and intended solely
    for the addressee.
    You must not disclose, forward, print or use it without the
    permission of the sender.
    ______________________________________________________________________







______________________________________________________________________
The information in this email is confidential and intended solely for the 
addressee.
You must not disclose, forward, print or use it without the permission of the 
sender.
______________________________________________________________________
_______________________________________________
Biojava-l mailing list  -  [email protected]
http://lists.open-bio.org/mailman/listinfo/biojava-l

Reply via email to