Hi Matthew,

I just finished some further investigation, which strengthens my feeling that using SeqIOTools.readGenbank() might not be thread-safe.
The strongest point is that the errors appear less frequently on uniprocessor machines than on multiprocessor ones.

As you requested, below is a snippet of the exception stack trace, with the bio.*-related part delimited by ===========. This pattern repeats for different Genbank files, except for the actual value of the corrupt(?) index.

Note that the problem goes away when the method that calls Sequence.seqString() is declared static synchronized, but that takes all the fun out of the pipeline ;)

If needed, I can send you the complete source file, but I thought I'd better not spam biojava-l with it.

Thanks for your support.

Mark

[java] org.quartz.JobExecutionException: java.lang.Exception: Unable to extract sequence from entry BA000019 [See nested exception: java.lang.Exception: Unable to extract sequence from entry BA000019]
[java]     at pipeline.jobs.EntryFeeder.execute(EntryFeeder.java:241)
[java]     at org.quartz.core.JobRunShell.run(JobRunShell.java:178)
[java]     at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:487)
[java] * Nested Exception (Underlying Cause) ---------------
[java] java.lang.Exception: Unable to extract sequence from entry BA000019
[java]     at pipeline.jobs.EntryFeeder.feedEntry(EntryFeeder.java:199)
[java]     at pipeline.jobs.EntryFeeder.execute(EntryFeeder.java:234)
[java]     at org.quartz.core.JobRunShell.run(JobRunShell.java:178)
[java]     at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:487)
============================================================
[java] java.lang.ArrayIndexOutOfBoundsException: -17406
[java]     at org.biojava.bio.symbol.PackedSymbolList.symbolAt(PackedSymbolList.java:275)
[java]     at org.biojava.bio.seq.io.ChunkedSymbolListFactory$ChunkedSymbolList.symbolAt(ChunkedSymbolListFactory.java:178)
[java]     at org.biojava.bio.symbol.AbstractSymbolList$SymbolIterator.next(AbstractSymbolList.java:191)
[java]     at org.biojava.bio.seq.io.CharacterTokenization.tokenizeSymbolList(CharacterTokenization.java:202)
[java]     at org.biojava.bio.symbol.AlphabetManager$WellKnownTokenizationWrapper.tokenizeSymbolList(AlphabetManager.java:1378)
[java]     at org.biojava.bio.symbol.AbstractSymbolList.seqString(AbstractSymbolList.java:93)
[java]     at org.biojava.bio.seq.impl.SimpleSequence.seqString(SimpleSequence.java:89)
============================================================
[java]     at pipeline.jobs.EntryFeeder.feedEntry(EntryFeeder.java:194)
[java]     at pipeline.jobs.EntryFeeder.execute(EntryFeeder.java:234)
[java]     at org.quartz.core.JobRunShell.run(JobRunShell.java:178)
[java]     at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:487)
[java] java.lang.Exception: Unable to extract sequence from entry AE004092
[java]     at pipeline.jobs.EntryFeeder.feedEntry(EntryFeeder.java:199)
[java]     at pipeline.jobs.EntryFeeder.execute(EntryFeeder.java:234)
[java]     at org.quartz.core.JobRunShell.run(JobRunShell.java:178)
[java]     at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:487)
[java] 2 avr. 2004 08:12:54 org.quartz.core.JobRunShell run

On Thu 01/04/2004 at 21:31, Matthew Pocock wrote:
> Hi,
>
> The biojava policy on synchronization is that we try to make things safe
> if possible, but expect the user to synchronize sanely. Unfortunately,
> this is usually not documented anywhere. I could not guarantee that
> GenbankFormat is threadsafe - it would be sensible for it to be, but the
> particular implementation may not be. To help us track this, could you
> include some example stack traces of erratic behavior?
>
> Matthew

-- 
[EMAIL PROTECTED]
Unité Statistique & Génome                  Unité MIG
+33 (0)1 60 87 38 03              Tel.      +33 (0)1 34 65 28 85
+33 (0)1 60 87 38 09              Fax.      +33 (0)1 34 65 29 01
Tour Evry 2, 523 pl. des Terrasses          INRA - Domaine de Vilvert
F - 91000 Evry                              F - 78352 Jouy-en-Josas CEDEX
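
P.S. For the archive, here is a minimal sketch of a narrower variant of the static synchronized workaround mentioned above: instead of serializing the whole calling method, only the seqString() call goes through one shared lock. It is written against modern Java (8+) and deliberately does not use BioJava. SeqLike, CountingSequence, SeqStringGuard and SeqStringGuardDemo are hypothetical stand-ins, not BioJava classes, so treat this as an illustration of the locking pattern under those assumptions, not as a fix for whatever is wrong inside the symbol list.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Assumption: stand-in for a BioJava Sequence; only seqString() matters here.
interface SeqLike {
    String seqString();
}

// Hypothetical test double that records how many threads are inside
// seqString() at the same time.
class CountingSequence implements SeqLike {
    private final String residues;
    private final AtomicInteger inside = new AtomicInteger();
    private final AtomicInteger maxInside = new AtomicInteger();

    CountingSequence(String residues) {
        this.residues = residues;
    }

    @Override
    public String seqString() {
        int now = inside.incrementAndGet();
        maxInside.accumulateAndGet(now, Math::max);
        try {
            // Copy character by character, loosely mimicking a per-symbol walk.
            StringBuilder sb = new StringBuilder(residues.length());
            for (int i = 0; i < residues.length(); i++) {
                sb.append(residues.charAt(i));
            }
            return sb.toString();
        } finally {
            inside.decrementAndGet();
        }
    }

    int maxConcurrentCallers() {
        return maxInside.get();
    }
}

// Hypothetical helper: one JVM-wide lock around the seqString() call only.
final class SeqStringGuard {
    private static final Object LOCK = new Object();

    private SeqStringGuard() {}

    static String safeSeqString(SeqLike seq) {
        synchronized (LOCK) {
            return seq.seqString();
        }
    }
}

class SeqStringGuardDemo {
    // Starts `threads` workers, each calling the guarded method `calls` times,
    // and returns the highest number of threads seen inside seqString() at once.
    static int runDemo(int threads, int calls) {
        CountingSequence seq = new CountingSequence("ACGTACGTACGT");
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int c = 0; c < calls; c++) {
                    SeqStringGuard.safeSeqString(seq);
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return seq.maxConcurrentCallers();
    }

    public static void main(String[] args) {
        System.out.println("max concurrent callers: " + runDemo(4, 1000));
    }
}
```

In the pipeline, the call in the feeding method would become something like SeqStringGuard.safeSeqString(sequence). Like the original workaround, this still serializes every seqString() call across threads; it only leaves the rest of the job free to run in parallel. Whether that is enough depends on where the shared state actually lives, which I have not verified.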