Hi Khalil,

Whether you can get 100% sequence identity depends on your sequences. You can try to enforce a stricter alignment by increasing the gap penalties significantly (try doubling or tripling the gap opening and extension penalties).
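Note that higher gap penalties only discourage gaps; mismatches are still scored by the substitution matrix. If only 100% identical stretches are of interest, the limiting case (prohibitive gap and mismatch penalties) is simply the longest exact common substring. Below is a rough sketch of that idea in plain Java. It does not use the BioJava API, and the class and method names (ExactOverlap, longestExactMatch, reverseComplement) as well as the example reads are made up for illustration.

```java
// Illustrative sketch only: plain Java, not BioJava. All names are hypothetical.
final class ExactOverlap {

    /** Returns the longest stretch that occurs identically in both a and b. */
    static String longestExactMatch(String a, String b) {
        int[] prev = new int[b.length() + 1]; // match lengths for the previous row
        int bestLen = 0;
        int bestEndInA = 0; // exclusive end index of the best match within a
        for (int i = 1; i <= a.length(); i++) {
            int[] curr = new int[b.length() + 1];
            for (int j = 1; j <= b.length(); j++) {
                if (a.charAt(i - 1) == b.charAt(j - 1)) {
                    // extend the exact match that ended at (i-1, j-1)
                    curr[j] = prev[j - 1] + 1;
                    if (curr[j] > bestLen) {
                        bestLen = curr[j];
                        bestEndInA = i;
                    }
                }
                // on a mismatch curr[j] stays 0: no gaps or substitutions allowed
            }
            prev = curr;
        }
        return a.substring(bestEndInA - bestLen, bestEndInA);
    }

    /** Reverse complement of a DNA string; unknown symbols become 'N'. */
    static String reverseComplement(String dna) {
        StringBuilder sb = new StringBuilder(dna.length());
        for (int i = dna.length() - 1; i >= 0; i--) {
            switch (Character.toUpperCase(dna.charAt(i))) {
                case 'A': sb.append('T'); break;
                case 'C': sb.append('G'); break;
                case 'G': sb.append('C'); break;
                case 'T': sb.append('A'); break;
                default:  sb.append('N'); break;
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String direct = "TTGACCGTAGGCTA";   // made-up direct read
        String reverse = "CCTAGCCTACGG";    // made-up read from the opposite strand
        String overlap = longestExactMatch(direct, reverseComplement(reverse));
        System.out.println(overlap + " (" + overlap.length() + " bases, 100% identity)");
    }
}
```

This runs in O(n*m) time, like the local aligner, but needs only two rows of memory. Whether the resulting exact stretch is long enough to assemble on depends, again, on your sequences.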
A

On Sun, Jun 9, 2013 at 12:32 PM, Khalil El Mazouari <[email protected]> wrote:

> Hi,
>
> I am trying to assemble overlapping sequence (direct & reverse) via local
> alignment. I am only searching for local aln with 100% identity.
>
> Which parameters, matrix ... should I use in order to get 100% ident.
> local aln.
>
> Any other suggestion for assembling overlapping seq (in Java) is welcome.
>
> Thanks
>
> khalil
>
> SubstitutionMatrix<NucleotideCompound> matrix =
>     SubstitutionMatrixHelper.getNuc4_2();
> SimpleGapPenalty gapP = new SimpleGapPenalty();
> gapP.setOpenPenalty((short) 5);
> gapP.setExtensionPenalty((short) 1);
> SequencePair<DNASequence, NucleotideCompound> psa =
>     Alignments.getPairwiseAlignment(query, target,
>         PairwiseSequenceAlignerType.LOCAL, gapP, matrix);
>
> ========
>
> Local Alignment Identity: 97.84688995215312%
>
> query  GGGGAAAACACGAAAGGCCCTTGGTGGAGGCGCTTGAGACGGTGACAAGGGTTCCCTGGC 68
>        |||||| || |||  ||||||||||||||||||||||||||||||| |||||||||||||
> target GGGGAAGAC-CGATGGGCCCTTGGTGGAGGCGCTTGAGACGGTGACCAGGGTTCCCTGGC 417
>
> query  CCCAGTAGTCAAAGGTCCGTGAGGAGCTCCACTTGTGTGCACAGTAATATGTGGCTGAGT 128
>        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
> target CCCAGTAGTCAAAGGTCCGTGAGGAGCTCCACTTGTGTGCACAGTAATATGTGGCTGAGT 477
>
> query  CCACAGGGTCCATGTTGGTCATTGTAAGGACCACCTGGTCTTTGGAGGTGTCCTTGGTGA 188
>        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
> target CCACAGGGTCCATGTTGGTCATTGTAAGGACCACCTGGTCTTTGGAGGTGTCCTTGGTGA 537
>
> query  TGGTGAGCCTGCTCTTCAGAGATGGGCTGTAGCGCTTATCATCATTCCAATAAATGAGTG 248
>        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
> target TGGTGAGCCTGCTCTTCAGAGATGGGCTGTAGCGCTTATCATCATTCCAATAAATGAGTG 597
>
> query  CAAGCCACTCCAGGGCCTTTCCTGGGGGCTGACGGATCCAGCCCACACCCACTCCACTAG 308
>        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
> target CAAGCCACTCCAGGGCCTTTCCTGGGGGCTGACGGATCCAGCCCACACCCACTCCACTAG 657
>
> query  TGCTGAGTGAGAACCCAGAGAAGGTGCAGGTCAGCGTGAGGGTCTGTGTGGGTTTCACCA 368
>        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
> target TGCTGAGTGAGAACCCAGAGAAGGTGCAGGTCAGCGTGAGGGTCTGTGTGGGTTTCACCA 717
>
> query  GCGTAGGACCAGACTCCTTCAAGGTGATCTGGGCCATGGCCGGCTGGGCCGCGAGTAA 426
>        |||||||||||||||||||||||||| ||||||||| |||||||||| |||| |||||
> target GCGTAGGACCAGACTCCTTCAAGGTG-TCTGGGCCA-GGCCGGCTGG-CCGCAAGTAA 772
>
> _______________________________________________
> Biojava-l mailing list - [email protected]
> http://lists.open-bio.org/mailman/listinfo/biojava-l
