Don Naki wrote:
Hi all,
I have a couple of 'novice' questions...
I can't seem to figure out how to create a SimpleGappedSymbolList from a String. I want to parse "-AQSD--VP-" and create a SimpleGappedSymbolList from it.
ProteinTools has methods to return a SymbolList, Sequence, and GappedSequence from a String, but not a GappedSymbolList. I understand GappedSequence extends GappedSymbolList, but I want just the GappedSymbolList. Alternatively, is there a way to get a GappedSymbolList from a GappedSequence?
We could add a utility method to do this. Why do you /have/ to have a
GappedSymbolList that is not a GappedSequence? Is there a specific
memory constraint?
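Since GappedSequence extends GappedSymbolList (as noted in the question), a GappedSequence can already be used wherever a GappedSymbolList is expected, with no conversion. The sketch below illustrates only that subtype relationship. The two interfaces and the createGapped helper are hypothetical stand-ins declared inside the example, not the BioJava classes, so it compiles without the library.

```java
public class UpcastSketch {
    // Hypothetical stand-ins that mirror only the inheritance relationship
    // described in the thread; these are NOT the BioJava interfaces.
    interface GappedSymbolList {
        String seqString();
    }

    interface GappedSequence extends GappedSymbolList {
        String getName();
    }

    // Hypothetical factory, standing in for a method that returns a GappedSequence.
    static GappedSequence createGapped(final String text, final String name) {
        return new GappedSequence() {
            @Override
            public String seqString() {
                return text;
            }

            @Override
            public String getName() {
                return name;
            }
        };
    }

    // A method written against the supertype accepts the subtype unchanged.
    static int countGaps(GappedSymbolList list) {
        int gaps = 0;
        for (char c : list.seqString().toCharArray()) {
            if (c == '-') {
                gaps++;
            }
        }
        return gaps;
    }

    public static void main(String[] args) {
        // Plain assignment to the supertype; no cast or copy is needed.
        GappedSymbolList list = createGapped("-AQSD--VP-", "demo");
        System.out.println(countGaps(list));
    }
}
```

The reference still points at the same underlying object, so this does not save memory; a separate utility method would only be needed if a lighter-weight object is really required.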
A second question is that ProteinTools.createGappedProteinSequence("-AQSD--VP-").seqString() results in the String "XAQSD--VPX". The first and last '-' characters are now represented by 'X'. Is this a special kind of gap symbol? If so, how can I distinguish between '-' and 'X' Symbols?
This is a tokenization bug: the leading/trailing gaps are not being
recognised by the tokenizer and are then replaced by X. It's probably in
CharacterTokenization, which needs a special case for
AlphabetManager.getGapSymbol(). Could someone look at this?
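The kind of special case suggested above can be sketched as follows. This is a minimal, self-contained illustration and not BioJava's CharacterTokenization: the class name, the string sentinels GAP and UNKNOWN, and the residue table are all assumptions made for the example. It also does not try to reproduce why only the leading and trailing gaps were affected; it only shows the gap character being checked before the normal lookup, so that it is never mistaken for an unknown residue.

```java
import java.util.ArrayList;
import java.util.List;

public class GapAwareTokenizer {
    // Hypothetical sentinels standing in for the library's gap and
    // unknown-residue symbols; they are not BioJava objects.
    static final String GAP = "<gap>";
    static final String UNKNOWN = "<unknown>";

    // The twenty standard amino-acid one-letter codes.
    static final String RESIDUES = "ACDEFGHIKLMNPQRSTVWY";

    // Map one character to a symbol, testing for the gap character first.
    static String parseToken(char c) {
        if (c == '-') {
            return GAP; // special case: checked before the residue lookup
        }
        char upper = Character.toUpperCase(c);
        if (RESIDUES.indexOf(upper) >= 0) {
            return String.valueOf(upper);
        }
        return UNKNOWN;
    }

    // Tokenize a whole string, one symbol per character.
    static List<String> parse(String text) {
        List<String> symbols = new ArrayList<>();
        for (char c : text.toCharArray()) {
            symbols.add(parseToken(c));
        }
        return symbols;
    }

    // Render symbols back to text: gaps as '-', unknown residues as 'X'.
    static String seqString(List<String> symbols) {
        StringBuilder sb = new StringBuilder();
        for (String s : symbols) {
            if (GAP.equals(s)) {
                sb.append('-');
            } else if (UNKNOWN.equals(s)) {
                sb.append('X');
            } else {
                sb.append(s);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(seqString(parse("-AQSD--VP-")));
    }
}
```

In this sketch a gap is told apart from an unknown residue by comparing against the GAP sentinel; in BioJava the analogous comparison would presumably be against the symbol returned by AlphabetManager.getGapSymbol().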
Thanks in advance,
Don
Matthew