Hi, I have to extract data from GenBank XML files. For this purpose I use the BioJava API, but I get a parser error:
java.lang.StringIndexOutOfBoundsException: String index out of range: 12
    at java.lang.String.substring(String.java:1477)
    at org.biojava.bio.seq.io.GenbankContext.processHeaderLine(GenbankContext.java:621)
    at org.biojava.bio.seq.io.GenbankContext.processLine(GenbankContext.java:263)
    at org.biojava.bio.seq.io.GenbankFormat.readSequence(GenbankFormat.java:144)
    at org.biojava.bio.seq.io.StreamReader.nextSequence(StreamReader.java:100)
rethrown as org.biojava.bio.BioException: Could not read sequence
    at org.biojava.bio.seq.io.StreamReader.nextSequence(StreamReader.java:103)
    at de.izbi.gbm.logistics.GenBankBioJavaImporter.readFile(GenBankBioJavaImporter.java:41)
    at de.izbi.gbm.gui.GenBankBaseFrame.actionPerformed(GenBankBaseFrame.java:134)
    at javax.swing.AbstractButton.fireActionPerformed(AbstractButton.java:1764)
    at javax.swing.AbstractButton$ForwardActionEvents.actionPerformed(AbstractButton.java:1817)
    at javax.swing.DefaultButtonModel.fireActionPerformed(DefaultButtonModel.java:419)
    at javax.swing.DefaultButtonModel.setPressed(DefaultButtonModel.java:257)
    at javax.swing.AbstractButton.doClick(AbstractButton.java:289)
    at javax.swing.plaf.basic.BasicMenuItemUI.doClick(BasicMenuItemUI.java:1109)
    at javax.swing.plaf.basic.BasicMenuItemUI$MouseInputHandler.mouseReleased(BasicMenuItemUI.java:943)
    at java.awt.Component.processMouseEvent(Component.java:5093)
    at java.awt.Component.processEvent(Component.java:4890)
    at java.awt.Container.processEvent(Container.java:1566)
    at java.awt.Component.dispatchEventImpl(Component.java:3598)
    at java.awt.Container.dispatchEventImpl(Container.java:1623)
    at java.awt.Component.dispatchEvent(Component.java:3439)
    at java.awt.LightweightDispatcher.retargetMouseEvent(Container.java:3450)
    at java.awt.LightweightDispatcher.processMouseEvent(Container.java:3165)
    at java.awt.LightweightDispatcher.dispatchEvent(Container.java:3095)
    at java.awt.Container.dispatchEventImpl(Container.java:1609)
    at java.awt.Window.dispatchEventImpl(Window.java:1585)
    at java.awt.Component.dispatchEvent(Component.java:3439)
    at java.awt.EventQueue.dispatchEvent(EventQueue.java:450)
    at java.awt.EventDispatchThread.pumpOneEventForHierarchy(EventDispatchThread.java:197)
    at java.awt.EventDispatchThread.pumpEventsForHierarchy(EventDispatchThread.java:150)
    at java.awt.EventDispatchThread.pumpEvents(EventDispatchThread.java:144)
    at java.awt.EventDispatchThread.pumpEvents(EventDispatchThread.java:136)
    at java.awt.EventDispatchThread.run(EventDispatchThread.java:99)
The program is simple. The user picks the path and file name with a FileChooser component. I then open the file and use the Sequence and Annotation classes, as shown in the attached methods taken from an extended file class.
What I need is the sequence data of each GenBank entry (accession, sequence, etc.) and the data for its features (start and end position, feature type such as tRNA, CDS, etc.).
Any hints are welcome. Thanks, Tori
---------------------
public GenBankBioJavaImporter(String path, String fileName, Connection genDbCon) {
    super();
    super.setPath(path);
    super.setFileName(fileName);
}

public boolean readFile() {
    if (!super.createInputFile()) return false;
    // read the GenBank file; fileReaderHandler is a BufferedReader
    SequenceIterator sequences = SeqIOTools.readGenbank(super.fileReaderHandler);
    // iterate through the sequences
    while (sequences.hasNext()) {
        try {
            Sequence seq = sequences.nextSequence();
            // do stuff with the sequence
            System.out.println("Info: " + seq.getName() + ", " + seq.getURN() + ", "
                    + seq.countFeatures());
            Annotation anno = seq.getAnnotation();
            // anno.getProperty()
        } catch (BioException ex) {
            // not in GenBank format
            ex.printStackTrace();
            super.closeInputFile();
            return false;
        } catch (NoSuchElementException ex) {
            // request for more sequence when there isn't any
            ex.printStackTrace();
            super.closeInputFile();
            return false;
        }
    }
    super.closeInputFile();
    return true;
}
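
One thing that may be worth checking: the trace fails in String.substring with index 12 inside GenbankContext.processHeaderLine, and the files are described as GenBank XML while readFile() calls SeqIOTools.readGenbank. The sketch below is only a guess at the cause, not BioJava code. It assumes the header parser takes the first 12 characters of each line as the keyword column (the keyword width of the GenBank flat-file layout), which would throw on any shorter line, such as an XML tag. The class GenBankFormatSniffer, the constant KEYWORD_WIDTH and the methods headerKeyword and sniff are hypothetical names made up for illustration.

```java
/** Hypothetical helper, not part of BioJava: guesses the file format before parsing. */
final class GenBankFormatSniffer {

    /** Assumed width of the keyword column in a GenBank flat-file header line. */
    static final int KEYWORD_WIDTH = 12;

    /** Returns the keyword column of a header line without failing on short lines. */
    static String headerKeyword(String line) {
        return line.length() >= KEYWORD_WIDTH
                ? line.substring(0, KEYWORD_WIDTH).trim()
                : line.trim();
    }

    /** Guesses the format from the first non-blank line: "xml", "flatfile" or "unknown". */
    static String sniff(String text) {
        for (String line : text.split("\\R")) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // skip leading blank lines
            }
            if (trimmed.startsWith("<")) {
                return "xml";
            }
            if (trimmed.startsWith("LOCUS")) {
                return "flatfile";
            }
            return "unknown";
        }
        return "unknown";
    }

    public static void main(String[] args) {
        String shortLine = "<GBSet>"; // illustrative XML line, shorter than 12 characters
        try {
            // Unchecked slice, as the parser is assumed to do it.
            shortLine.substring(0, KEYWORD_WIDTH);
        } catch (StringIndexOutOfBoundsException ex) {
            System.out.println("Unchecked slice failed: " + ex.getMessage());
        }
        System.out.println("Guarded keyword: " + headerKeyword(shortLine));
        System.out.println("Detected format: " + sniff("<?xml version=\"1.0\"?>\n<GBSet>"));
    }
}
```

If sniff reports "xml" for the first lines of the chosen file, the flat-file reader is probably not the right entry point for it, and either a flat-file export of the same entries or an XML-aware reader would be the thing to try.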
_______________________________________________ Biojava-l mailing list - [EMAIL PROTECTED] http://biojava.org/mailman/listinfo/biojava-l
