More problems with parsing nucleotide sequences from NCBI. Apparently,
there's an odd dbxref tag on some of the sequences submitted by ATCC that
causes an exception. I've run into two so far, but I'm sure there are more:
AA343569.1
AA325485.1
The exceptions produced are as follows:
--------------------------------------------------------------
Trying to get: AA343569.1
org.biojava.bio.BioException: Failed to read Genbank sequence
    at org.biojavax.bio.db.ncbi.GenbankRichSequenceDB.getRichSequence(GenbankRichSequenceDB.java:157)
    at exonhit.parsers.EventParser.getSeqFromNCBI(EventParser.java:250)
    at exonhit.parsers.EventParser.insertRglrSE(EventParser.java:197)
    at exonhit.parsers.EventParser.createSpliceEvents(EventParser.java:105)
    at exonhit.parsers.EventParser.main(EventParser.java:310)
Caused by: org.biojava.bio.BioException: Could not read sequence
    at org.biojavax.bio.seq.io.RichStreamReader.nextRichSequence(RichStreamReader.java:112)
    at org.biojavax.bio.db.ncbi.GenbankRichSequenceDB.getRichSequence(GenbankRichSequenceDB.java:153)
    ... 4 more
Caused by: org.biojava.bio.seq.io.ParseException: Bad dbxref found: ATCC (inhost):145151, accession:AA343569
    at org.biojavax.bio.seq.io.GenbankFormat.readRichSequence(GenbankFormat.java:438)
    at org.biojavax.bio.seq.io.RichStreamReader.nextRichSequence(RichStreamReader.java:109)
    ... 5 more
Java Result: -1
=========================================================
Trying to get: AA325485.1
org.biojava.bio.BioException: Failed to read Genbank sequence
    at org.biojavax.bio.db.ncbi.GenbankRichSequenceDB.getRichSequence(GenbankRichSequenceDB.java:157)
    at exonhit.parsers.EventParser.getSeqFromNCBI(EventParser.java:250)
    at exonhit.parsers.EventParser.insertRglrSE(EventParser.java:197)
    at exonhit.parsers.EventParser.createSpliceEvents(EventParser.java:105)
    at exonhit.parsers.EventParser.main(EventParser.java:312)
Caused by: org.biojava.bio.BioException: Could not read sequence
    at org.biojavax.bio.seq.io.RichStreamReader.nextRichSequence(RichStreamReader.java:112)
    at org.biojavax.bio.db.ncbi.GenbankRichSequenceDB.getRichSequence(GenbankRichSequenceDB.java:153)
    ... 4 more
Caused by: org.biojava.bio.seq.io.ParseException: Bad dbxref found: ATCC (inhost):125990, accession:AA325485
    at org.biojavax.bio.seq.io.GenbankFormat.readRichSequence(GenbankFormat.java:438)
    at org.biojavax.bio.seq.io.RichStreamReader.nextRichSequence(RichStreamReader.java:109)
    ... 5 more
Java Result: -1
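
Since more accessions are probably affected, one way to find them is to keep
going after a failure and record the innermost exception message for each
accession, rather than stopping at the first one. The following is only a
minimal sketch in plain Java, not BioJava code: AccessionScan, Fetcher, and
scan are hypothetical names introduced for illustration, and the fetcher in
main merely simulates a three-level exception chain with a made-up message.
In real use the Fetcher would presumably wrap whatever call retrieves the
sequence.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class AccessionScan {

    /** Hypothetical stand-in for the real retrieval call. */
    public interface Fetcher {
        void fetch(String accession) throws Exception;
    }

    /** Follows getCause() links down to the innermost exception. */
    public static Throwable rootCause(Throwable t) {
        Throwable current = t;
        while (current.getCause() != null && current.getCause() != current) {
            current = current.getCause();
        }
        return current;
    }

    /**
     * Tries every accession and returns a map from each failing accession
     * to the message of its root-cause exception, in input order.
     */
    public static Map<String, String> scan(List<String> accessions, Fetcher fetcher) {
        Map<String, String> failures = new LinkedHashMap<>();
        for (String accession : accessions) {
            try {
                fetcher.fetch(accession);
            } catch (Exception e) {
                // Record the failure and continue with the next accession.
                failures.put(accession, String.valueOf(rootCause(e).getMessage()));
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        // Simulated fetcher (assumption): fails for accessions starting with "AA3",
        // nesting the exceptions three levels deep. "X00001.1" is a dummy id.
        Fetcher simulated = accession -> {
            if (accession.startsWith("AA3")) {
                throw new Exception("Failed to read Genbank sequence",
                        new Exception("Could not read sequence",
                                new Exception("simulated parse failure for " + accession)));
            }
        };
        Map<String, String> failures =
                scan(Arrays.asList("AA343569.1", "AA325485.1", "X00001.1"), simulated);
        failures.forEach((accession, message) ->
                System.out.println(accession + " -> " + message));
    }
}
```

Reporting only the root cause keeps the list of failing accessions short
enough to attach to a bug report; the full traces can still be logged
separately if needed.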