Hello,
I switched from uima-2.2.2-incubating to the current head
and now I do not encounter the issue anymore.
But I am still interested to know how to include the Collection Reader
in the Aggregate.
Jörn
Marshall Schor wrote:
Hi Jörn,
I took a look at the code you posted and the stack trace. I don't see
anything definitive, but here are some questions.
The Pear you are making use of - it has a collection reader and a main
aggregate. You pull out and parse the collection reader using
instDesc.getMainCollIteratorDesc() and you pull out and parse the main
Aggregate Analysis Engine using instPear.getComponentPearDescPath().
Can you verify that the instPear.getComponentPearDescPath() doesn't also
include the collection reader? (You can include Collection Readers
inside an Aggregate).
The 2nd thing I see is in the stack trace, where it looks like the code
that makes the call to process the CAS is running under some thread
pooling, concurrent execution stuff. Here's the top of the stack trace,
up to the point where the process call is made:
    at org.apache.uima.analysis_engine.impl.AnalysisEngineImplBase.process(AnalysisEngineImplBase.java:218)
    at dk.infopaq.trainserver.TrainingEngine$TrainingJob.run(TrainingEngine.java:229)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:619)
Is it possible there are multiple threads running, and if so is the
TrainingJob code set up to be thread safe?
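To make the thread-safety question concrete, here is a minimal sketch of one way to serialize access when a single engine instance is shared by jobs running on a thread pool. It does not use UIMA at all: FakeEngine, runJobs and the lock object are hypothetical stand-ins, not the TrainingJob code, and the only point is the pattern of guarding every call with one shared lock.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class SerializedEngineSketch {

    // Hypothetical stand-in for a component that is NOT thread safe:
    // process() does an unsynchronized read-modify-write on its state.
    static final class FakeEngine {
        private int processed = 0;

        void process(String document) {
            processed++;
        }

        int processedCount() {
            return processed;
        }
    }

    // Submits `jobs` tasks to a pool of `threads` workers that all share one
    // engine, and returns how many documents the engine saw.
    static int runJobs(int jobs, int threads) {
        final FakeEngine engine = new FakeEngine();
        final Object lock = new Object();
        ExecutorService pool = Executors.newFixedThreadPool(threads);

        for (int i = 0; i < jobs; i++) {
            final String document = "doc-" + i;
            pool.submit(() -> {
                // Only one worker at a time may touch the shared engine.
                synchronized (lock) {
                    engine.process(document);
                }
            });
        }

        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("jobs did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }

        // Read under the same lock so the final count is visible here.
        synchronized (lock) {
            return engine.processedCount();
        }
    }

    public static void main(String[] args) {
        System.out.println(runJobs(1000, 4));
    }
}
```

Without the synchronized block the count could come out lower than the number of jobs. An alternative design would be to give each worker thread its own engine instance instead of sharing one.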
Here's another observation.
The code is separately instantiating a collection reader and an
Aggregate. The Aggregate is created (produceAnalysisEngine) with a
resource manager that has no extra class paths specified.
The collection reader is instantiated with a resource manager that has
extra class paths set:
    rsrcMgr = UIMAFramework.newDefaultResourceManager();
    rsrcMgr.setExtensionClassPath(instPear.getComponentPearDescPath() + ":"
        + instPear.buildComponentClassPath(), false);
It seems to me that maybe the same resource manager should be used for
both? Or is there a reason for doing it this way?
Another observation: One thing UIMA does with the type system from all
the components is to merge the type system specifications. There are
APIs to do this manually, if you are loading different descriptors. I
see this isn't being done here though. You do send the Aggregate's type
system to the collection reader component, though - so this would work
unless some special stuff was going on with JCas cover types (but maybe
your collection reader isn't using JCas).
Last observation: the code that sends the type system to the collection
reader gets a new CAS (ae.newCAS()), but doesn't use that CAS for the
subsequent call to process; instead it gets another one with newCAS().
-Marshall (apologizing for not having a good definitive answer :-) )
Jörn Kottmann wrote:
Eddie Epstein wrote:
Hi Jörn,
From CVD I can repeatedly run aggregates containing CAS multiplier
delegates. Can you be more specific about the scenario? For example,
are you re-instantiating the AE between runs? Would it be easy for you
to make a minimal code sample that demonstrates the problem?
Here is my code, which interacts with the PEAR API. Maybe it differs a bit
from the usual use case, because I also instantiate a CollectionReader.
Errors occur if this code is run twice.
BTW, is it safe to run it concurrently in multiple threads for
different PEARs?
In the case where it failed it did not run concurrently, but it is possible
that it ran the second time in a different thread.
I can try to make a minimal code sample, but maybe you see something
wrong in my interaction with UIMA.
Jörn
PackageBrowser instPear = PackageInstaller.installPackage(...);

ResourceManager rsrcMgr = UIMAFramework.newDefaultResourceManager();

InstallationDescriptor instDesc = instPear.getInstallationDescriptor();

XMLInputSource in = new XMLInputSource(instPear.getComponentPearDescPath());
ResourceSpecifier specifier =
    UIMAFramework.getXMLParser().parseResourceSpecifier(in);

AnalysisEngine ae =
    UIMAFramework.produceAnalysisEngine(specifier, rsrcMgr, null);

rsrcMgr = UIMAFramework.newDefaultResourceManager();
rsrcMgr.setExtensionClassPath(
    instPear.getComponentPearDescPath() + ":"
        + instPear.buildComponentClassPath(), false);

XMLInputSource collectionReaderIn =
    new XMLInputSource(instDesc.getMainCollIteratorDesc());
ResourceSpecifier colReaderSpecifier =
    UIMAFramework.getXMLParser().parseResourceSpecifier(collectionReaderIn);

Thread.currentThread().setContextClassLoader(rsrcMgr.getExtensionClassLoader());

CollectionReader colReader =
    UIMAFramework.produceCollectionReader(colReaderSpecifier, rsrcMgr, null);
colReader.typeSystemInit(ae.newCAS().getTypeSystem());

CAS cas = ae.newCAS();
while (colReader.hasNext()) {
    colReader.getNext(cas);
    ae.process(cas);
    cas.reset();
}

isCompletingCollectionProcessing = true;
ae.collectionProcessComplete();