Hi,
I need to extract data from GenBank XML files.
I am using the BioJava API for this, but the parser fails with the following error:

java.lang.StringIndexOutOfBoundsException: String index out of range: 12
at java.lang.String.substring(String.java:1477)
at org.biojava.bio.seq.io.GenbankContext.processHeaderLine(GenbankContext.java:621)
at org.biojava.bio.seq.io.GenbankContext.processLine(GenbankContext.java:263)
at org.biojava.bio.seq.io.GenbankFormat.readSequence(GenbankFormat.java:144)
at org.biojava.bio.seq.io.StreamReader.nextSequence(StreamReader.java:100)

rethrown as org.biojava.bio.BioException: Could not read sequence
at org.biojava.bio.seq.io.StreamReader.nextSequence(StreamReader.java:103)
at de.izbi.gbm.logistics.GenBankBioJavaImporter.readFile(GenBankBioJavaImporter.java:41)
at de.izbi.gbm.gui.GenBankBaseFrame.actionPerformed(GenBankBaseFrame.java:134)
at javax.swing.AbstractButton.fireActionPerformed(AbstractButton.java:1764)
at javax.swing.AbstractButton$ForwardActionEvents.actionPerformed(AbstractButton.java:1817)
at javax.swing.DefaultButtonModel.fireActionPerformed(DefaultButtonModel.java:419)
at javax.swing.DefaultButtonModel.setPressed(DefaultButtonModel.java:257)
at javax.swing.AbstractButton.doClick(AbstractButton.java:289)
at javax.swing.plaf.basic.BasicMenuItemUI.doClick(BasicMenuItemUI.java:1109)
at javax.swing.plaf.basic.BasicMenuItemUI$MouseInputHandler.mouseReleased(BasicMenuItemUI.java:943)
at java.awt.Component.processMouseEvent(Component.java:5093)
at java.awt.Component.processEvent(Component.java:4890)
at java.awt.Container.processEvent(Container.java:1566)
at java.awt.Component.dispatchEventImpl(Component.java:3598)
at java.awt.Container.dispatchEventImpl(Container.java:1623)
at java.awt.Component.dispatchEvent(Component.java:3439)
at java.awt.LightweightDispatcher.retargetMouseEvent(Container.java:3450)
at java.awt.LightweightDispatcher.processMouseEvent(Container.java:3165)
at java.awt.LightweightDispatcher.dispatchEvent(Container.java:3095)
at java.awt.Container.dispatchEventImpl(Container.java:1609)
at java.awt.Window.dispatchEventImpl(Window.java:1585)
at java.awt.Component.dispatchEvent(Component.java:3439)
at java.awt.EventQueue.dispatchEvent(EventQueue.java:450)
at java.awt.EventDispatchThread.pumpOneEventForHierarchy(EventDispatchThread.java:197)
at java.awt.EventDispatchThread.pumpEventsForHierarchy(EventDispatchThread.java:150)
at java.awt.EventDispatchThread.pumpEvents(EventDispatchThread.java:144)
at java.awt.EventDispatchThread.pumpEvents(EventDispatchThread.java:136)
at java.awt.EventDispatchThread.run(EventDispatchThread.java:99)
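
One possible reading of the top frames, offered as an assumption rather than a confirmed diagnosis: the message would fit a fixed-width slice of the first 12 characters being taken from a line that is shorter than 12 characters. GenBank flat-file header lines reserve their first 12 columns for the keyword, whereas a line of XML can easily be shorter. The sketch below only illustrates that mechanism with plain JDK calls. The names HeaderKeywordDemo, keywordUnchecked and keywordChecked are hypothetical, the sample lines are made up, and none of this is BioJava's actual code.

```java
// Hypothetical illustration only; this is NOT BioJava's source code.
final class HeaderKeywordDemo {

    // Assumption: the keyword field of a flat-file header line
    // occupies the first 12 columns.
    static final int KEYWORD_WIDTH = 12;

    // Unguarded slice: throws StringIndexOutOfBoundsException
    // when the line has fewer than 12 characters.
    static String keywordUnchecked(String line) {
        return line.substring(0, KEYWORD_WIDTH).trim();
    }

    // Guarded slice: never reads past the end of the line.
    static String keywordChecked(String line) {
        int end = Math.min(line.length(), KEYWORD_WIDTH);
        return line.substring(0, end).trim();
    }

    public static void main(String[] args) {
        String flatFileLine = "LOCUS       AB000001"; // made-up flat-file header line
        String xmlLine = "<GBSet>";                   // made-up short XML line

        System.out.println(keywordUnchecked(flatFileLine));
        System.out.println(keywordChecked(xmlLine));
        try {
            keywordUnchecked(xmlLine);
        } catch (StringIndexOutOfBoundsException ex) {
            System.out.println("Short line rejected: " + ex.getMessage());
        }
    }
}
```

If this reading is right, the exception says more about the kind of input the flat-file parser received than about the code that calls it.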



The program is simple. The user chooses the path and file name with a
FileChooser component. I then open the file and use the Sequence and
Annotation classes, as shown in the method below, which is taken from an
extended file class.

What I need is the sequence data of the GenBank entry (accession,
sequence, etc.) and the data for its features (start and end position,
subtype such as tRNA, CDS, etc.).

Any hints are welcome.
Thanks Tori

---------------------

// Imports used by this excerpt (the enclosing class declaration is omitted).
import java.sql.Connection;
import java.util.NoSuchElementException;
import org.biojava.bio.Annotation;
import org.biojava.bio.BioException;
import org.biojava.bio.seq.Sequence;
import org.biojava.bio.seq.SequenceIterator;
import org.biojava.bio.seq.io.SeqIOTools;

// genDbCon is not used in this excerpt.
public GenBankBioJavaImporter(String path, String fileName, Connection genDbCon) {
   super();
   super.setPath(path);
   super.setFileName(fileName);
 }

public boolean readFile() {
   if (!super.createInputFile()) return false;

   // Read the GenBank file; fileReaderHandler is a BufferedReader.
   SequenceIterator sequences = SeqIOTools.readGenbank(super.fileReaderHandler);

   // Iterate through the sequences.
   while (sequences.hasNext()) {
     try {
       Sequence seq = sequences.nextSequence();
       // Do stuff with the sequence.
       System.out.println("Info: " + seq.getName() + ", " + seq.getURN() + ", "
           + seq.countFeatures());
       Annotation anno = seq.getAnnotation();
       // anno.getProperty()
     } catch (BioException ex) {
       // Not in GenBank format.
       ex.printStackTrace();
       super.closeInputFile();
       return false;
     } catch (NoSuchElementException ex) {
       // Request for more sequences when there aren't any.
       ex.printStackTrace();
       super.closeInputFile();
       return false;
     }
   }
   super.closeInputFile();
   return true;
 }
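
Since the files are described as GenBank XML while the method above calls readGenbank, it may help to check what the input actually looks like before handing it to a parser. The following is a minimal sketch using only the JDK. It assumes that a flat-file entry starts with the keyword LOCUS, that XML starts with '<', and that the first non-blank characters appear within the first 4096 characters. GenBankFormatSniffer, Kind and sniff are hypothetical names and are not part of BioJava.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

// Hypothetical helper; not part of BioJava.
final class GenBankFormatSniffer {

    enum Kind { FLAT_FILE, XML, UNKNOWN }

    // Assumption: the first non-blank characters occur within this many characters.
    private static final int LOOKAHEAD = 4096;

    // Peeks at the first non-whitespace characters and then rewinds the
    // reader, so the same reader can still be passed to a parser afterwards.
    static Kind sniff(BufferedReader reader) throws IOException {
        reader.mark(LOOKAHEAD);
        try {
            StringBuilder head = new StringBuilder();
            int c;
            int read = 0;
            while (read < LOOKAHEAD && (c = reader.read()) != -1) {
                read++;
                if (head.length() == 0 && Character.isWhitespace(c)) {
                    continue; // skip leading blank lines and indentation
                }
                head.append((char) c);
                if (head.charAt(0) == '<' || head.length() == 5) {
                    break; // enough characters to decide
                }
            }
            String start = head.toString();
            if (start.startsWith("<")) return Kind.XML;
            if (start.equals("LOCUS")) return Kind.FLAT_FILE;
            return Kind.UNKNOWN;
        } finally {
            reader.reset(); // rewind to the marked position
        }
    }

    public static void main(String[] args) throws IOException {
        BufferedReader flat = new BufferedReader(
            new StringReader("LOCUS       AB000001\n"));
        BufferedReader xml = new BufferedReader(
            new StringReader("<?xml version=\"1.0\"?>\n<GBSet>\n"));
        System.out.println(sniff(flat));
        System.out.println(sniff(xml));
    }
}
```

A check like this could run in readFile() right after createInputFile() succeeds, so that an unexpected format is reported before the parser is called.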






_______________________________________________ Biojava-l mailing list - [EMAIL PROTECTED] http://biojava.org/mailman/listinfo/biojava-l
