Hi,

I'm trying to align amino acids to nucleic acids. I'm using gapped sequences 
for both the protein and the DNA. I have several problems and I would very 
much appreciate it if someone could help.
1. How can I parse a DNA sequence into codons? I would like to start with 
DNA that looks like this "ATGTAT" and get a protein that looks like this "MY". 
I'm using "Alphabet alpha = DNATools.getCodonAlphabet();" but I can't find a 
tokenization for it to parse the DNA string (does this make any sense?).
2. My other problem is that there are frame shifts, so my gapped DNA actually 
looks like this "AT-G-TAT". Is there any way to translate locations between 
the codon symbol list and the DNA symbol list, in both directions?
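
To make both questions concrete, here is a rough plain-Java sketch of the 
behaviour I am after. It does not use BioJava at all: the class CodonSketch 
and its methods are names I made up for illustration. It assumes the standard 
genetic code, '-' as the only gap character, and it ignores a trailing 
incomplete codon.

```java
// Plain-Java sketch only; nothing here is BioJava API. All names are hypothetical.
class CodonSketch {
    // Standard genetic code, with codons enumerated over the bases T, C, A, G.
    private static final String BASES = "TCAG";
    private static final String AMINO =
        "FFLLSSSSYY**CC*W" + "LLLLPPPPHHQQRRRR" + "IIIMTTTTNNKKSSRR" + "VVVVAAAADDEEGGGG";

    // Remove gap characters from an aligned DNA string.
    static String ungap(String gapped) {
        return gapped.replace("-", "");
    }

    // Translate ungapped DNA codon by codon; 'X' marks a codon with a non-ACGT base.
    static String translate(String dna) {
        String s = dna.toUpperCase();
        StringBuilder protein = new StringBuilder();
        for (int i = 0; i + 3 <= s.length(); i += 3) {
            int a = BASES.indexOf(s.charAt(i));
            int b = BASES.indexOf(s.charAt(i + 1));
            int c = BASES.indexOf(s.charAt(i + 2));
            protein.append(a < 0 || b < 0 || c < 0 ? 'X' : AMINO.charAt(16 * a + 4 * b + c));
        }
        return protein.toString();
    }

    // map[i] is the 0-based column in the gapped string of the i-th ungapped base.
    static int[] ungappedToGapped(String gapped) {
        int[] map = new int[ungap(gapped).length()];
        int n = 0;
        for (int col = 0; col < gapped.length(); col++) {
            if (gapped.charAt(col) != '-') {
                map[n++] = col;
            }
        }
        return map;
    }

    // Gapped columns of the three bases that make up codon number codonIndex (0-based).
    static int[] codonColumns(String gapped, int codonIndex) {
        int[] map = ungappedToGapped(gapped);
        int first = 3 * codonIndex;
        return new int[] { map[first], map[first + 1], map[first + 2] };
    }

    public static void main(String[] args) {
        String gapped = "AT-G-TAT";
        System.out.println(translate(ungap(gapped)));      // prints MY
        int[] cols = codonColumns(gapped, 1);
        System.out.println(cols[0] + "," + cols[1] + "," + cols[2]); // prints 5,6,7
    }
}
```

So for "AT-G-TAT" the first codon (M) sits in gapped columns 0, 1, 3 and the 
second (Y) in columns 5, 6, 7. What I am looking for is the BioJava way to get 
the same translation and the same coordinate mapping.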

I would appreciate any pointers, or a hint as to whether this approach makes 
sense at all.

Thanks,
Alex Golubev.

_______________________________________________
Biojava-l mailing list  -  Biojava-l@biojava.org
http://biojava.org/mailman/listinfo/biojava-l
