Hi, I have to extract data from GenBank XML files, and I am using the BioJava API for this. However, I get a parser error:
java.lang.StringIndexOutOfBoundsException: String index out of range: 12
    at java.lang.String.substring(String.java:1477)
    at org.biojava.bio.seq.io.GenbankContext.processHeaderLine(GenbankContext.java:621)
    at org.biojava.bio.seq.io.GenbankContext.processLine(GenbankContext.java:263)
    at org.biojava.bio.seq.io.GenbankFormat.readSequence(GenbankFormat.java:144)
    at org.biojava.bio.seq.io.StreamReader.nextSequence(StreamReader.java:100)
rethrown as org.biojava.bio.BioException: Could not read sequence
    at org.biojava.bio.seq.io.StreamReader.nextSequence(StreamReader.java:103)
    at de.izbi.gbm.logistics.GenBankBioJavaImporter.readFile(GenBankBioJavaImporter.java:41)
    at de.izbi.gbm.gui.GenBankBaseFrame.actionPerformed(GenBankBaseFrame.java:134)
    at javax.swing.AbstractButton.fireActionPerformed(AbstractButton.java:1764)
    at javax.swing.AbstractButton$ForwardActionEvents.actionPerformed(AbstractButton.java:1817)
    at javax.swing.DefaultButtonModel.fireActionPerformed(DefaultButtonModel.java:419)
    at javax.swing.DefaultButtonModel.setPressed(DefaultButtonModel.java:257)
    at javax.swing.AbstractButton.doClick(AbstractButton.java:289)
    at javax.swing.plaf.basic.BasicMenuItemUI.doClick(BasicMenuItemUI.java:1109)
    at javax.swing.plaf.basic.BasicMenuItemUI$MouseInputHandler.mouseReleased(BasicMenuItemUI.java:943)
    at java.awt.Component.processMouseEvent(Component.java:5093)
    at java.awt.Component.processEvent(Component.java:4890)
    at java.awt.Container.processEvent(Container.java:1566)
    at java.awt.Component.dispatchEventImpl(Component.java:3598)
    at java.awt.Container.dispatchEventImpl(Container.java:1623)
    at java.awt.Component.dispatchEvent(Component.java:3439)
    at java.awt.LightweightDispatcher.retargetMouseEvent(Container.java:3450)
    at java.awt.LightweightDispatcher.processMouseEvent(Container.java:3165)
    at java.awt.LightweightDispatcher.dispatchEvent(Container.java:3095)
    at java.awt.Container.dispatchEventImpl(Container.java:1609)
    at java.awt.Window.dispatchEventImpl(Window.java:1585)
    at java.awt.Component.dispatchEvent(Component.java:3439)
    at java.awt.EventQueue.dispatchEvent(EventQueue.java:450)
    at java.awt.EventDispatchThread.pumpOneEventForHierarchy(EventDispatchThread.java:197)
    at java.awt.EventDispatchThread.pumpEventsForHierarchy(EventDispatchThread.java:150)
    at java.awt.EventDispatchThread.pumpEvents(EventDispatchThread.java:144)
    at java.awt.EventDispatchThread.pumpEvents(EventDispatchThread.java:136)
    at java.awt.EventDispatchThread.run(EventDispatchThread.java:99)
The program is quite simple. The user selects the path and file name with a FileChooser component. I then open the file and use the Sequence and Annotation classes, as shown in the method below, which is taken from a class that extends a file class.
What I need is the sequence data of each GenBank entry (accession, sequence, etc.) and also the data for its features (start and end position, and subtype such as tRNA, CDS, etc.).
Any hints are welcome. Thanks Tori
---------------------
public GenBankBioJavaImporter(String path, String fileName, Connection genDbCon) {
    super();
    super.setPath(path);
    super.setFileName(fileName);
}

public boolean readFile() {
    if (!super.createInputFile())
        return(false);

    //read the GenBank File
    SequenceIterator sequences = SeqIOTools.readGenbank(super.fileReaderHandler); // fileReaderHandler is a BufferedReader

    //iterate through the sequences
    while(sequences.hasNext()) {
        try {
            Sequence seq = sequences.nextSequence();
            //do stuff with the sequence
            System.out.println("Info: "+seq.getName()+", "+seq.getURN()+", "+seq.countFeatures());
            Annotation anno = seq.getAnnotation();
            //anno.getProperty()
        } catch (BioException ex) {
            //not in GenBank format
            ex.printStackTrace();
            super.closeInputFile();
            return(false);
        } catch (NoSuchElementException ex) {
            //request for more sequence when there isn't any
            ex.printStackTrace();
            super.closeInputFile();
            return(false);
        }
    }
    super.closeInputFile();
    return(true);
}
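
One possible reading of the trace above, offered as a guess rather than a diagnosis: GenBank flat-file header lines keep their keyword (LOCUS, DEFINITION, ...) in the first 12 columns, so a parser written for the flat-file format may take a 12-character substring of each header line. If the input is XML, a short line such as "<GBSet>" would then fail with exactly this kind of StringIndexOutOfBoundsException. The sketch below uses plain Java only and no BioJava calls. The class GenBankInputCheck and both of its methods are hypothetical names made up for illustration, and keywordColumn is only an assumption about what happens inside GenbankContext.processHeaderLine.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

// Hypothetical helper, not part of BioJava.
class GenBankInputCheck {

    // Returns true if the first non-blank line starts with "LOCUS",
    // which is how a GenBank flat-file record begins.
    // The reader is rewound afterwards so it can still be handed to a parser.
    // Assumption: the first non-blank line lies within the first 8192 characters;
    // otherwise reset() may throw an IOException.
    static boolean looksLikeFlatFile(BufferedReader reader) throws IOException {
        reader.mark(8192);
        try {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.trim().length() > 0) {
                    return line.startsWith("LOCUS");
                }
            }
            return false; // empty input
        } finally {
            reader.reset();
        }
    }

    // Assumed behaviour of a flat-file header parser: read the 12-column
    // keyword field. Throws StringIndexOutOfBoundsException for shorter lines.
    static String keywordColumn(String line) {
        return line.substring(0, 12);
    }

    public static void main(String[] args) throws IOException {
        BufferedReader xml = new BufferedReader(
                new StringReader("<?xml version=\"1.0\"?>\n<GBSet>\n"));
        System.out.println("flat file? " + looksLikeFlatFile(xml));
        try {
            keywordColumn("<GBSet>");
        } catch (StringIndexOutOfBoundsException ex) {
            System.out.println("short line rejected: " + ex.getMessage());
        }
    }
}
```

If a check like this returns false for the chosen file, the file is probably not in the flat-file format that readGenbank expects, and either a flat-file download of the same entries or an XML-aware reader would be needed.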
_______________________________________________ Biojava-l mailing list - [EMAIL PROTECTED] http://biojava.org/mailman/listinfo/biojava-l