Hi all,

I've been generating global alignments with BioJava and comparing them with what needle (EMBOSS) returns. I can't reproduce needle's alignment with BioJava: the score BioJava returns is worse than needle's, so I'm not sure what's happening here.

The sequences are AB004720 and Y17238 (I didn't attach a fasta file to avoid spamming people, let me know if you want one). I align them with:
    GapPenalty penalty = new SimpleGapPenalty((short) -14, (short) -4);
    PairwiseSequenceAligner<DNASequence, NucleotideCompound> aligner =
        Alignments.getPairwiseAligner(
            new DNASequence(query, AmbiguityDNACompoundSet.getDNACompoundSet()),
            new DNASequence(target, AmbiguityDNACompoundSet.getDNACompoundSet()),
            PairwiseSequenceAlignerType.GLOBAL,
            penalty,
            SubstitutionMatrixHelper.getNuc4_4());
    SequencePair<DNASequence, NucleotideCompound> alignment = aligner.getPair();

This gives me an alignment with only 23% similarity and a gap at the end. Varying the gap penalties can also give me a gap at the front, but that's about it. When I align the same sequences with needle, I get an alignment with a higher score (6784 vs (-)5862) and 94% similarity (which seems closer to home). I run needle with defaults (so it uses EDNAFULL) and a gap open/gap extend of 14/4.
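
In case it helps to have a tool-independent reference point, below is a minimal sketch of a global alignment scorer with affine gaps (Gotoh-style dynamic programming). It is neither BioJava nor needle code, and the class and method names are made up for illustration. It rests on three assumptions that may not match either tool: a flat match/mismatch score (5/-4, as EDNAFULL uses for unambiguous bases) instead of the full matrix, a gap of length n costing open + (n - 1) * extend, and end gaps penalized like internal ones.

```java
public final class AffineGlobalScore {
    // "Minus infinity" that can still absorb a few subtractions without overflowing.
    private static final int NEG = Integer.MIN_VALUE / 4;

    /**
     * Optimal global alignment score of a against b.
     * Assumed convention: a gap of length n costs open + (n - 1) * extend,
     * and end gaps are charged like any other gap.
     */
    public static int score(String a, String b, int match, int mismatch, int open, int extend) {
        int n = a.length();
        int m = b.length();
        int[][] sub = new int[n + 1][m + 1];  // last column aligns a[i-1] with b[j-1]
        int[][] gapB = new int[n + 1][m + 1]; // last column aligns a[i-1] with a gap
        int[][] gapA = new int[n + 1][m + 1]; // last column aligns b[j-1] with a gap

        sub[0][0] = 0;
        gapB[0][0] = NEG;
        gapA[0][0] = NEG;
        for (int i = 1; i <= n; i++) {
            sub[i][0] = NEG;
            gapA[i][0] = NEG;
            gapB[i][0] = -(open + (i - 1) * extend);
        }
        for (int j = 1; j <= m; j++) {
            sub[0][j] = NEG;
            gapB[0][j] = NEG;
            gapA[0][j] = -(open + (j - 1) * extend);
        }

        for (int i = 1; i <= n; i++) {
            for (int j = 1; j <= m; j++) {
                int s = a.charAt(i - 1) == b.charAt(j - 1) ? match : mismatch;
                sub[i][j] = max3(sub[i - 1][j - 1], gapB[i - 1][j - 1], gapA[i - 1][j - 1]) + s;
                // Either open a new gap after another state, or extend the current one.
                gapB[i][j] = Math.max(Math.max(sub[i - 1][j], gapA[i - 1][j]) - open,
                                      gapB[i - 1][j] - extend);
                gapA[i][j] = Math.max(Math.max(sub[i][j - 1], gapB[i][j - 1]) - open,
                                      gapA[i][j - 1] - extend);
            }
        }
        return max3(sub[n][m], gapB[n][m], gapA[n][m]);
    }

    private static int max3(int x, int y, int z) {
        return Math.max(x, Math.max(y, z));
    }

    public static void main(String[] args) {
        // Toy inputs only; the real sequences would be read from FASTA.
        System.out.println(score("ACGT", "ACGT", 5, -4, 14, 4));
        System.out.println(score("ACGT", "AGT", 5, -4, 14, 4));
    }
}
```

Running the two real sequences through something like this, once with each gap-cost convention, might show whether the difference comes from how the penalties are interpreted rather than from the alignment algorithm itself.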

Could this be a bug or am I misunderstanding some of the options?

BTW, if I use a really large gap extension penalty, say -4000, I also get a NullPointerException.

TIA,
Wim De Smet

--
Wim De Smet
http://www.straininfo.net/
_______________________________________________
Biojava-l mailing list  -  [email protected]
http://lists.open-bio.org/mailman/listinfo/biojava-l