On 3/2/2012 11:03 AM, Jens Grivolla wrote:
> Hi,
> when using the ConceptMapper from addons as a PEAR we are having classpath
> problems. The ConceptMapper launches a tokenizer AE using its XML descriptor,
> but at that point the classpath set from the PEAR does not get used.
That is correct.
> This means that it is impossible to point to a tokenizer packaged together
> with the ConceptMapper-based AE; at the very least, the tokenizer classes
> (or jar) and all of their dependencies have to be added to the global classpath.
There are some other possibilities. One is to also package the tokenizer as a
PEAR, install it, and then change the parameter that specifies the UIMA pipeline
to run for tokenization so that it points to a pearSpecifier. In that case, the
tokenizer would run with its own classpath.
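A rough sketch of that approach using the PEAR installation API, just to make it
concrete (the install directory and the PEAR file name here are placeholders, and
the parameter on the ConceptMapper side is assumed to accept a path to the
generated PEAR descriptor):

    import java.io.File;

    import org.apache.uima.UIMAFramework;
    import org.apache.uima.analysis_engine.AnalysisEngine;
    import org.apache.uima.pear.tools.PackageBrowser;
    import org.apache.uima.pear.tools.PackageInstaller;
    import org.apache.uima.resource.ResourceSpecifier;
    import org.apache.uima.util.XMLInputSource;

    public class InstalledPearTokenizer {
      public static void main(String[] args) throws Exception {
        // Install the tokenizer PEAR into a local directory
        // (the final argument enables installation verification).
        PackageBrowser installed = PackageInstaller.installPackage(
            new File("/path/to/install/dir"), new File("my-tokenizer.pear"), true);

        // The installer generates a PEAR descriptor for the installed component;
        // pointing the ConceptMapper tokenizer parameter at this file instead of a
        // plain AE descriptor would let the tokenizer run with the PEAR's classpath.
        String pearDescPath = installed.getComponentPearDescPath();

        // An AE produced from that descriptor behaves like any other specifier.
        ResourceSpecifier spec = UIMAFramework.getXMLParser()
            .parseResourceSpecifier(new XMLInputSource(pearDescPath));
        AnalysisEngine tokenizer = UIMAFramework.produceAnalysisEngine(spec);
      }
    }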
> It all seems to come down to (AnnotatorAdapter.java:97):
>     ae = UIMAFramework.produceAnalysisEngine(aeSpecifier);
Right.
> But I don't see why the classpath that is used by the ConceptMapper would not
> apply here. It must have to do with how the classpath is adjusted "locally"
> for PEARs instead of being global to the whole JVM, but I haven't been able to
> figure it out yet.
I think this is because the framework sets up a special class loader for the
classes loaded by the PEAR's implementation class. However, in this case that
implementation class calls the UIMA Framework to produce an analysis engine, so
the class loader used for that call is the one the UIMA Framework itself has.
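To illustrate: when no ResourceManager is passed in, produceAnalysisEngine builds
a default one, and its extension class loader is unset, so component classes are
resolved against the framework/JVM classpath rather than the PEAR's. A minimal
check of that default behavior (not ConceptMapper's actual code):

    import org.apache.uima.UIMAFramework;
    import org.apache.uima.resource.ResourceManager;

    public class DefaultLoaderCheck {
      public static void main(String[] args) {
        // produceAnalysisEngine(spec) with no ResourceManager argument uses a
        // default ResourceManager like this one; its extension class loader is
        // null, so classes come from the JVM classpath, not a PEAR's loader.
        ResourceManager defaultRm = UIMAFramework.newDefaultResourceManager();
        System.out.println("extension class loader: " + defaultRm.getExtensionClassLoader());
      }
    }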
I'm pretty sure it is possible to change the design of how ConceptMapper works,
so that the tokenizer inherits the classpath (actually, the ResourceManager) of
the ConceptMapper. If done, this would have other potential implications: for
example, it would become possible to share an external resource specification
between the tokenizer and the ConceptMapper (and, indeed, if the ConceptMapper
were contained in some outer UIMA Aggregate, with any annotator in that
Aggregate).
This fix would entail capturing the ResourceManager instance that the
ConceptMapper is running with, and passing it in to the framework call that
produces the tokenizer resource.
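Roughly, that change could look something like the sketch below. The cast to
UimaContextAdmin to get at the ResourceManager is my assumption about how to
reach it, and the method/class names are made up for illustration, not the
existing ConceptMapper code:

    import org.apache.uima.UIMAFramework;
    import org.apache.uima.UimaContext;
    import org.apache.uima.UimaContextAdmin;
    import org.apache.uima.analysis_engine.AnalysisEngine;
    import org.apache.uima.resource.ResourceInitializationException;
    import org.apache.uima.resource.ResourceManager;
    import org.apache.uima.resource.ResourceSpecifier;

    public class TokenizerFactorySketch {
      /**
       * Produce the tokenizer AE with the ResourceManager of the calling
       * component (e.g. ConceptMapper), so the tokenizer sees the same
       * classpath and external resources as the caller.
       */
      public static AnalysisEngine produceTokenizer(UimaContext context,
          ResourceSpecifier aeSpecifier) throws ResourceInitializationException {
        // The admin view of the context exposes the ResourceManager the caller
        // is running with (including a PEAR's class loader, if any).
        ResourceManager rm = ((UimaContextAdmin) context).getResourceManager();

        // Passing that ResourceManager into produceAnalysisEngine makes the
        // tokenizer inherit the caller's class loader and resource context
        // instead of the framework's default one.
        return UIMAFramework.produceAnalysisEngine(aeSpecifier, rm, null);
      }
    }

With that in place, the shared external resource case mentioned above would also
fall out naturally, since both components would see the same ResourceManager.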
Do people think this would be a good change or does it make things too
complicated?
-Marshall
> Any ideas?
> Thanks,
> Jens