Hi Frans - Thanks for these changes. I have committed them to CVS and added "default" as a valid tokenization name for IntegerAlphabet (as a synonym of "token").
- Mark

-----Original Message-----
From: VERHOEF Frans [mailto:[EMAIL PROTECTED]]
Sent: Friday, 28 November 2003 4:34 p.m.
To: [EMAIL PROTECTED]; [EMAIL PROTECTED]
Subject: [Biojava-dev] PhredFormat

Hi,

I have fixed the little bugs in PhredFormat that have been bugging me for the last 2 days. I have attached my fixed version. Feel free to use it, change it or throw it away.

In short, what I have changed is this:

- PhredFormat implements ParseErrorSource and ParseErrorListener. This was not much of a job, as I basically copied it from FastaFormat.
- readSequenceData(BufferedReader br, SymbolTokenization parser, SeqIOListener listener) has changed. This method used to parse the char arrays for short number strings and feed those to the StreamParser, which in turn would try to do the same. Because the whitespace was removed in the process, the end result was an attempt to parse a String representing a humongous number as an integer. Now this method does not parse the char arrays, but just feeds whole chunks of char array to the StreamParser.

One new issue came up though, when I try to do the following:

    StreamReader qualityIter = PhredTools.readPhredQuality(
        new BufferedReader(new FileReader(phredQualityFile)));
    while (qualityIter.hasNext()) {
        Sequence seq = qualityIter.nextSequence();
        String str = seq.seqString();
    }

The seqString() call on the last line of the loop gave the following exception:

    java.util.NoSuchElementException: default parser not supported by IntegerAlphabet yet
        at org.biojava.bio.symbol.IntegerAlphabet.getTokenization(IntegerAlphabet.java:216)
        at org.biojava.bio.symbol.AbstractSymbolList.seqString(AbstractSymbolList.java:101)
        at org.biojava.bio.seq.impl.SimpleSequence.seqString(SimpleSequence.java:108)
        at org.gis.server.pipeline.apps.SequenceInfoParser.parseResults(SequenceInfoParser.java:82)

What happens is that SimpleSequence calls the AbstractSymbolList.seqString() method. This method in turn executes getAlphabet().getTokenization("default"), where getAlphabet returns the IntegerAlphabet.
But IntegerAlphabet throws the exception here, because it only accepts the name parameter value "token" and not the "default" that AbstractSymbolList passes. I do have a simple workaround: basically, make the method IntegerAlphabet.getTokenization(String name) accept both "default" and "token". But I am not sure I fully understand the philosophy behind the design...

Kind regards,

Frans Verhoef
Bioinformatics Specialist
Genome Institute of Singapore
Genome, #02-01, 60 Biopolis Street, Singapore 138672
Tel: +65 6478 8000
DID: +65 6478 8060
HP: +65 9848 4325
Email: [EMAIL PROTECTED]

_______________________________________________
Biojava-l mailing list - [EMAIL PROTECTED]
http://biojava.org/mailman/listinfo/biojava-l
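
For anyone reading this thread later, the name check discussed above (treating "default" as a synonym of "token") can be sketched roughly as below. This is only an illustration and not the code committed to BioJava: the class TokenizationNames and the method resolve are hypothetical names, and the real getTokenization(String name) returns a SymbolTokenization rather than a String.

```java
import java.util.NoSuchElementException;

/**
 * Hypothetical stand-in for the name check inside
 * IntegerAlphabet.getTokenization(String). Not the BioJava source.
 */
final class TokenizationNames {

    /** Canonical name of the only tokenization this sketch knows about. */
    static final String TOKEN = "token";

    private TokenizationNames() {
        // static helper only
    }

    /**
     * Maps a requested tokenization name to its canonical name.
     * "default" is accepted as a synonym of "token"; any other
     * name is rejected with a NoSuchElementException.
     */
    static String resolve(String name) {
        if ("token".equals(name) || "default".equals(name)) {
            return TOKEN;
        }
        throw new NoSuchElementException(
            "No tokenization named '" + name + "' in this sketch");
    }

    public static void main(String[] args) {
        // Both accepted names resolve to the same tokenization.
        System.out.println(resolve("token"));
        System.out.println(resolve("default"));
        try {
            resolve("name");
        } catch (NoSuchElementException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

With a check like this, a caller such as seqString() that asks for "default" gets the same tokenization as one that asks for "token", while unknown names still fail loudly.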
