Actually, if it is an OS-specific carriage return, then there is still a minor issue. We should really try to code things so that they can handle files that originate from any major OS.
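
As a rough illustration of what OS-independent handling could look like, here is a minimal sketch. It is not BioJava code, and the names LineEndings, normalize and splitLines are made up for this example. The idea is to convert every CRLF or lone CR to LF before the text reaches a parser. Note that java.io.BufferedReader.readLine() already accepts \n, \r and \r\n as line terminators, so a stray carriage return would more likely come from code that splits on \n by hand. Whether that is what happens in this particular case is not established.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration only; not part of BioJava.
final class LineEndings {

    private LineEndings() {
        // static helpers only
    }

    /** Converts Windows (\r\n) and old Mac (\r) line endings to Unix (\n). */
    static String normalize(String text) {
        // Replace CRLF first so that a pair does not turn into two newlines.
        return text.replace("\r\n", "\n").replace('\r', '\n');
    }

    /** Splits text into lines, whichever OS produced it. */
    static List<String> splitLines(String text) {
        // String.split drops trailing empty strings, so a final newline
        // does not yield an extra empty line.
        return Arrays.asList(normalize(text).split("\n"));
    }

    public static void main(String[] args) {
        String windowsStyle =
            "2,4-cyclodiphosphate synthase OS=Plasmodium falciparum (isolate 3D7)\r\nGN=ISPF\r\n";
        for (String line : splitLines(windowsStyle)) {
            System.out.println("[" + line + "]");
        }
    }
}
```

Running the file contents through something like normalize() before parsing would have roughly the same effect as dos2unix, without needing an external tool.
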
- Mark

On Wed, Oct 1, 2008 at 12:31 AM, Richard Holland <[EMAIL PROTECTED]> wrote:
>
> Sounds like it _might_ be something to do with the carriage return
> itself. Is the blast file generated on the same OS that you're running
> your analysis on? (e.g. you might run Blast on a Linux box, but
> attempt to parse the file on a Windows box.) If the two OSes are
> different, this might point to it - as Linux won't necessarily
> understand the Windows linebreaks, or vice versa, and might
> misinterpret them. When you copy the portion of the file to a new file
> on the OS you're running the analysis on, it will substitute its own
> local linebreaks and thus mask the problem.
>
> So the first thing I'd check is what the two OSes involved are. If
> they're different, try running your analysis program on the same OS as
> the Blast output was generated on. If that does fix it, then try
> putting your Blast files through dos2unix or something similar to
> convert the linebreaks before running your analysis program.
>
> If they're the same OS, then we still have a problem!
>
> cheers,
> Richard
>
> 2008/9/30 David Toomey <[EMAIL PROTECTED]>:
> > Hi
> >
> > I am parsing a blast result and I am getting a
> > StringIndexOutOfBoundsException. The stack trace is
> >
> > at java.lang.String.substring(String.java:1938)
> > at java.lang.String.substring(String.java:1905)
> > at org.biojava.bio.program.sax.BlastLikeAlignmentSAXParser.parseLine(BlastLikeAlignmentSAXParser.java:291)
> > at org.biojava.bio.program.sax.BlastLikeAlignmentSAXParser.parse(BlastLikeAlignmentSAXParser.java:116)
> > at org.biojava.bio.program.sax.HitSectionSAXParser.outputHSPInfo(HitSectionSAXParser.java:517)
> > at org.biojava.bio.program.sax.HitSectionSAXParser.firstHSPEvent(HitSectionSAXParser.java:287)
> > at org.biojava.bio.program.sax.HitSectionSAXParser.interpret(HitSectionSAXParser.java:251)
> > at org.biojava.bio.program.sax.HitSectionSAXParser.parse(HitSectionSAXParser.java:117)
> > at org.biojava.bio.program.sax.BlastSAXParser.hitsSectionReached(BlastSAXParser.java:634)
> > at org.biojava.bio.program.sax.BlastSAXParser.interpret(BlastSAXParser.java:341)
> > at org.biojava.bio.program.sax.BlastSAXParser.parse(BlastSAXParser.java:168)
> > at org.biojava.bio.program.sax.BlastLikeSAXParser.onNewDataSet(BlastLikeSAXParser.java:314)
> > at org.biojava.bio.program.sax.BlastLikeSAXParser.interpret(BlastLikeSAXParser.java:276)
> > at org.biojava.bio.program.sax.BlastLikeSAXParser.parse(BlastLikeSAXParser.java:163)
> > at ie.rcsi.blast.StandardParser.parse(StandardParser.java:65)
> > at ie.rcsi.blast.BlastParser.parse(BlastParser.java:44)
> > at ie.rcsi.blast.Main.main(Main.java:30)
> >
> > I have updated BlastLikeAlignmentSAXParser to output some debug info and
> > narrowed down the line causing the problem to the following line
> >
> > 2,4-cyclodiphosphate synthase OS=Plasmodium falciparum (isolate 3D7)
> > GN=ISPF
> >
> > If I remove the carriage return and put it on a single line then everything
> > works fine. Strangely, if I copy this entry and put it in a file on its own
> > it also parses correctly, even with the carriage return!!!
> >
> > Has anyone seen this before, or does anyone have a suggestion on what I might
> > do to fix it? I can send the complete blast result if it would help. I have
> > tried using blast 2.2.18 and 2.2.17 and the problem is the same.
> >
> > Cheers
> >
> > Dave
>
> --
> Richard Holland, BSc MBCS
> Finance Director, Eagle Genomics Ltd
> M: +44 7500 438846 | E: [EMAIL PROTECTED]
> http://www.eaglegenomics.com/

_______________________________________________
Biojava-l mailing list - [email protected]
http://lists.open-bio.org/mailman/listinfo/biojava-l
