On Fri, Oct 19, 2001 at 03:56:54PM -0700, David Waring wrote:
>
> I am working on bio.program.PhredSequence and its friends (for handling the
> qualitative data associated with the output of Phred). PhredSequence uses
> SymbolLists with an IntegerAlphabet. At present the getToken() method of
> IntegerAlphabet.IntegerSymbol returns '#'. I guess this is because the
> Symbol interface specifies that getToken() return a char. Shouldn't this be
> a String? After all, SymbolParser parseToken() parses a String, and aren't we
> dealing with alphabets that can have multi-character tokens such as the
> three-letter amino acid names? Has this issue come up before? Am I
> misunderstanding 'token'?
>
> One of the things that must be done with a PhredSequence is to write the
> quality data (an IntegerAlphabet based SymbolList) to a fasta-like format.
> I'd like to just create a Sequence with the quality SymbolList and be able
> to write this using a FastaFormat. But since FastaFormat calls seqString(),
> and that is coded in AbstractSymbolList to use getToken(), it can only deal
> with chars, so it can't handle IntegerSymbols. Another issue is that with
> an IntegerSymbolList one would really like seqString() to output something
> like '10 20 22 7' as opposed to '1020227'.
>
> Three options:
> 1) Create a new SequenceFormat just for this. If there will be no other
> use of IntegerSymbolList, perhaps this is the best way to go.
>
> 2) Create an IntegerSymbolList that extends SimpleSymbolList, overriding
> seqString().
>
> 3) (most invasive but perhaps cleanest) Change getToken() to return a
> String, or add toString() to Symbol and add a method paddedSeqString() to
> AbstractSymbolList.

4) Get rid of getToken() completely and change the way that sequences get
converted to strings, replacing hardwired code in SymbolList implementations
with pluggable `stringifiers'. This was the idea of my SymbolTokenizations
patch, which I posted a few days ago. My view is that it provides a much
cleaner framework for handling this kind of situation, and I'd urge you to
take a look.

Thomas
_______________________________________________
Biojava-l mailing list - [EMAIL PROTECTED]
http://biojava.org/mailman/listinfo/biojava-l
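
As a rough illustration of the pluggable `stringifier' idea in option 4, here
is a minimal, self-contained sketch. It does not use BioJava and is not the
SymbolTokenizations patch itself: the names StringifierSketch, Tokenizer and
stringify are hypothetical, and plain Java lists stand in for SymbolLists. The
point is only that the symbol-to-text conversion and the separator are
supplied by the caller instead of being hardwired around a char-returning
getToken().

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: none of these names come from BioJava.
final class StringifierSketch {

    /** Converts one symbol to its text token; a token may be longer than one char. */
    interface Tokenizer<S> {
        String token(S symbol);
    }

    /** Joins the tokens of a symbol list, placing the separator between neighbours. */
    static <S> String stringify(List<S> symbols, Tokenizer<S> tokenizer, String separator) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < symbols.size(); i++) {
            if (i > 0) {
                out.append(separator);
            }
            out.append(tokenizer.token(symbols.get(i)));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Quality values, standing in for an IntegerAlphabet based SymbolList.
        List<Integer> quality = Arrays.asList(10, 20, 22, 7);
        Tokenizer<Integer> decimal = q -> Integer.toString(q);
        System.out.println(stringify(quality, decimal, " ")); // 10 20 22 7
        System.out.println(stringify(quality, decimal, ""));  // 1020227

        // Single-character tokens, as for a DNA sequence, need no separator.
        List<Character> dna = Arrays.asList('g', 'a', 't', 'c');
        Tokenizer<Character> single = c -> c.toString();
        System.out.println(stringify(dna, single, ""));       // gatc
    }
}
```

With something along these lines, a fasta-like writer could be handed the
space-separated variant for quality data and the unseparated variant for
ordinary sequences, without either case needing its own SymbolList subclass.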