Sounds like it _might_ be something to do with the carriage return itself. Is the Blast file generated on the same OS that you're running your analysis on? (E.g. you might run Blast on a Linux box but attempt to parse the file on a Windows box.) If the two OSes are different, that might be the cause: Windows ends lines with CR+LF while Linux uses LF alone, so a program on one won't necessarily understand the other's linebreaks and might misinterpret them. When you copy the portion of the file to a new file on the OS you're running the analysis on, it will substitute its own local linebreaks and thus mask the problem.
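
If you want to see which linebreaks the Blast output actually contains before changing anything, something along these lines might help. It is only a rough sketch, not part of BioJava: the class name LineEndingCheck and the method countLineEndings are made up for illustration, and it assumes the file is small enough to read into memory in one go.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper (not a BioJava class): counts the line-ending styles in a file.
final class LineEndingCheck {

    /** Returns the counts as {CRLF pairs, lone LF, lone CR}. */
    static int[] countLineEndings(byte[] data) {
        int crlf = 0, lf = 0, cr = 0;
        for (int i = 0; i < data.length; i++) {
            if (data[i] == '\r') {
                if (i + 1 < data.length && data[i + 1] == '\n') {
                    crlf++;
                    i++; // skip the LF that belongs to this CRLF pair
                } else {
                    cr++; // carriage return with no LF after it
                }
            } else if (data[i] == '\n') {
                lf++; // LF not preceded by CR
            }
        }
        return new int[] {crlf, lf, cr};
    }

    public static void main(String[] args) throws IOException {
        // Assumption: the path to the Blast output is passed as the first argument.
        byte[] data = Files.readAllBytes(Paths.get(args[0]));
        int[] c = countLineEndings(data);
        System.out.printf("CRLF=%d LF=%d CR=%d%n", c[0], c[1], c[2]);
    }
}
```

If more than one of the three counts is non-zero, the file has mixed linebreaks, which would fit a problem that disappears when the entry is copied into a fresh file.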
So the first thing I'd check is what the two OSes involved are. If they're different, try running your analysis program on the same OS as the Blast output was generated on. If that does fix it, then try putting your Blast files through dos2unix or something similar to convert the linebreaks before running your analysis program. If they're the same OS, then we still have a problem!

cheers,
Richard

2008/9/30 David Toomey <[EMAIL PROTECTED]>:
> Hi
>
> I am parsing a blast result and I am getting a
> StringIndexOutOfBoundsException. The stack trace is
>
>     at java.lang.String.substring(String.java:1938)
>     at java.lang.String.substring(String.java:1905)
>     at org.biojava.bio.program.sax.BlastLikeAlignmentSAXParser.parseLine(BlastLikeAlignmentSAXParser.java:291)
>     at org.biojava.bio.program.sax.BlastLikeAlignmentSAXParser.parse(BlastLikeAlignmentSAXParser.java:116)
>     at org.biojava.bio.program.sax.HitSectionSAXParser.outputHSPInfo(HitSectionSAXParser.java:517)
>     at org.biojava.bio.program.sax.HitSectionSAXParser.firstHSPEvent(HitSectionSAXParser.java:287)
>     at org.biojava.bio.program.sax.HitSectionSAXParser.interpret(HitSectionSAXParser.java:251)
>     at org.biojava.bio.program.sax.HitSectionSAXParser.parse(HitSectionSAXParser.java:117)
>     at org.biojava.bio.program.sax.BlastSAXParser.hitsSectionReached(BlastSAXParser.java:634)
>     at org.biojava.bio.program.sax.BlastSAXParser.interpret(BlastSAXParser.java:341)
>     at org.biojava.bio.program.sax.BlastSAXParser.parse(BlastSAXParser.java:168)
>     at org.biojava.bio.program.sax.BlastLikeSAXParser.onNewDataSet(BlastLikeSAXParser.java:314)
>     at org.biojava.bio.program.sax.BlastLikeSAXParser.interpret(BlastLikeSAXParser.java:276)
>     at org.biojava.bio.program.sax.BlastLikeSAXParser.parse(BlastLikeSAXParser.java:163)
>     at ie.rcsi.blast.StandardParser.parse(StandardParser.java:65)
>     at ie.rcsi.blast.BlastParser.parse(BlastParser.java:44)
>     at ie.rcsi.blast.Main.main(Main.java:30)
>
> I have updated BlastLikeAlignmentSAXParser to output some debug info and
> narrowed down the line causing the problem to the following line
>
> 2,4-cyclodiphosphate synthase OS=Plasmodium falciparum (isolate 3D7)
> GN=ISPF
>
> If I remove the carriage return and put it on a single line then everything
> works fine. Strangely if I copy this entry and put it in a file on its own
> it also parses correctly, even with the carriage return!!!
>
> Has anyone seen this before or does anyone have a suggestion on what I might
> do to fix it. I can send the complete blast result if it would help. I have
> tried using blast 2.2.18 and 2.2.17 and the problem is the same.
>
> Cheers
>
> Dave
>
> _______________________________________________
> Biojava-l mailing list - [email protected]
> http://lists.open-bio.org/mailman/listinfo/biojava-l
>

--
Richard Holland, BSc MBCS
Finance Director, Eagle Genomics Ltd
M: +44 7500 438846 | E: [EMAIL PROTECTED]
http://www.eaglegenomics.com/
