On 3/2/2012 11:03 AM, Jens Grivolla wrote:
Hi,

when using the ConceptMapper from addons as a PEAR we are having classpath problems. The ConceptMapper launches a tokenizer AE using its XML descriptor, but at that point the classpath set from the PEAR does not get used.

That is correct.


This means that it is impossible to point to a tokenizer packaged together with the CM based AE, or it is at least necessary to add the tokenizer classes (or jar) as well as all of its dependencies to the global classpath.

There are some other possibilities. One is to package the tokenizer as a PEAR and install it as well, and then update the parameter that specifies the UIMA pipeline to run for tokenization so that it points to a pearSpecifier. In this case, the tokenizer would run with its own classpath.


It all seems to come down to (AnnotatorAdapter.java:97):
ae = UIMAFramework.produceAnalysisEngine(aeSpecifier);

Right.

But I don't see why the classpath that is used by the ConceptMapper would not apply here. It must have to do with how the classpath is adjusted "locally" for PEARs instead of being global to the whole JVM, but I haven't been able to figure it out yet.

I think this is because the framework sets up a special class loader for the classes loaded by the PEAR's implementation class. However, in this case that implementation class calls the UIMA Framework to produce an analysis engine - so the loader used for that is the one the UIMA Framework has.

I'm pretty sure it is possible to change the design of how Concept Mapper works, to have the tokenizer inherit the classpath (actually, the Resource Manager) of the Concept Mapper. If done, this would have other potential implications. For example, it would be possible to have an external resource specification that was shared between the tokenizer and the Concept Mapper (and, indeed, if the Concept Mapper was contained in some outer UIMA Aggregate, with any annotator in that Aggregate).

This fix would entail capturing the resource manager instance that the Concept Mapper is running with, and passing it to the framework call that produces the tokenizer resource.
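
To make the idea concrete, here is a rough toy model in plain Java. It is only a sketch: ToyResourceManager and ToyFramework are hypothetical stand-ins, not UIMA classes, and the class names in the strings are made up. It shows why a call that falls back to the framework's default manager cannot see the PEAR's classes, while a call that is handed the caller's manager can. In UIMA itself, UIMAFramework.produceAnalysisEngine has an overload that accepts a ResourceManager; how Concept Mapper would obtain its own manager is not shown here.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Toy model only: none of these types are UIMA classes.
final class ResourceManagerSketch {

    /** Hypothetical stand-in for a resource manager: it knows which class names it can resolve. */
    static final class ToyResourceManager {
        final String label;
        final Set<String> visibleClasses;

        ToyResourceManager(String label, String... visibleClasses) {
            this.label = label;
            this.visibleClasses = new HashSet<>(Arrays.asList(visibleClasses));
        }
    }

    /** Hypothetical stand-in for the framework's "produce an engine" entry point. */
    static final class ToyFramework {
        // The framework's own manager sees only the global classpath (assumed contents).
        static final ToyResourceManager DEFAULT =
                new ToyResourceManager("framework default", "global.SomeAnnotator");

        // One-argument form: resolves against the framework's default manager.
        static String produce(String implClass) {
            return produce(implClass, DEFAULT);
        }

        // Two-argument form: resolves against whatever manager the caller passes in.
        static String produce(String implClass, ToyResourceManager manager) {
            if (!manager.visibleClasses.contains(implClass)) {
                throw new IllegalStateException(implClass + " not visible to " + manager.label);
            }
            return implClass + " resolved by " + manager.label;
        }
    }

    public static void main(String[] args) {
        // The manager the PEAR-wrapped component runs with (made-up class names).
        ToyResourceManager pear =
                new ToyResourceManager("PEAR manager", "pear.ConceptMapper", "pear.Tokenizer");

        // Current behaviour in this model: the nested call ignores the caller's manager.
        try {
            ToyFramework.produce("pear.Tokenizer");
        } catch (IllegalStateException e) {
            System.out.println("without the caller's manager: " + e.getMessage());
        }

        // Proposed behaviour: capture the caller's manager and pass it along.
        System.out.println("with the caller's manager: "
                + ToyFramework.produce("pear.Tokenizer", pear));
    }
}
```

The same manager could be handed to any other nested component, which is where the shared external resource possibility mentioned above comes from.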

Do people think this would be a good change or does it make things too complicated?

-Marshall

Any ideas?

Thanks,
Jens

