Questions about proper ClassLoader usage for running arbitrary analysis engines

Kirk True Mon, 02 Apr 2007 15:27:16 -0700

Hi all,

I have some general questions about how to create an
application that can run arbitrary annotators,
including how to set up the ClassLoader, the proper
timing for setting the thread context ClassLoader, and
so forth.


1. In an application in which an arbitrary aggregate
annotator is executed, isn't it possible that the
annotators may include versions of a given library
that conflict with each other or with the host
application?

2. Are there any examples of how to create
applications that *don't* have a priori knowledge of
the annotators they will be running?

3. Is it possible to have an annotator-specific
ClassLoader, or is there just one ClassLoader for all
annotators in a given aggregate analysis engine?

4. Should I be calling Thread.setContextClassLoader
with the ClassLoader returned by ResourceManager? If
so, when should I set it? Before creating the
AnalysisEngine or afterward?

Some details...

In my application I allow for running an arbitrary
aggregate annotator, where the descriptor is passed in
at runtime. I create the aggregate class path by
concatenating the individual UIMA annotators' class
paths as detailed in metadata/install.xml. I then call
ResourceManager's setExtensionClassPath method with
that string and set the thread's context ClassLoader
(via Thread's setContextClassLoader) to the one
returned by ResourceManager's getExtensionClassLoader.
I create the AnalysisEngine object via UIMAFramework's
produceAnalysisEngine method, passing in my
ResourceManager to ensure that my annotators use the
extension class path.
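
For illustration, the join-the-class-paths and
set-the-context-ClassLoader steps above might look
roughly like the following JDK-only sketch. It does not
use UIMA at all: the class name, the helper names, and
the directory entries are hypothetical, and a plain
URLClassLoader stands in for whatever
getExtensionClassLoader returns. The framework call
would go inside the action passed to withContextLoader.

```java
import java.io.File;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLClassLoader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Supplier;
import java.util.regex.Pattern;

public class ContextLoaderSketch {

    /** Joins per-annotator class path entries into one path string. */
    public static String joinClassPaths(List<String> entries) {
        return String.join(File.pathSeparator, entries);
    }

    /** Builds a loader over the joined path, delegating to the given parent. */
    public static URLClassLoader loaderFor(String classPath, ClassLoader parent) {
        List<URL> urls = new ArrayList<>();
        for (String entry : classPath.split(Pattern.quote(File.pathSeparator))) {
            if (entry.isEmpty()) {
                continue; // skip empty segments such as a trailing separator
            }
            try {
                urls.add(new File(entry).toURI().toURL());
            } catch (MalformedURLException e) {
                throw new IllegalArgumentException("Bad class path entry: " + entry, e);
            }
        }
        return new URLClassLoader(urls.toArray(new URL[0]), parent);
    }

    /** Runs the action with the given context ClassLoader, then restores the old one. */
    public static <T> T withContextLoader(ClassLoader loader, Supplier<T> action) {
        Thread current = Thread.currentThread();
        ClassLoader previous = current.getContextClassLoader();
        current.setContextClassLoader(loader);
        try {
            return action.get();
        } finally {
            // Restore even if the action throws, so later code sees the host loader.
            current.setContextClassLoader(previous);
        }
    }

    public static void main(String[] args) {
        // Hypothetical entries standing in for each annotator's class path.
        String joined = joinClassPaths(Arrays.asList("annotatorA/lib", "annotatorB/lib"));
        ClassLoader host = ContextLoaderSketch.class.getClassLoader();
        URLClassLoader extension = loaderFor(joined, host);

        ClassLoader seen = withContextLoader(extension,
                () -> Thread.currentThread().getContextClassLoader());
        System.out.println("context loader inside action is extension: " + (seen == extension));
    }
}
```

Restoring the previous loader in a finally block keeps
the change scoped to one call, which makes it easier to
experiment with setting the context ClassLoader around
some calls and not others.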

However, there's a timing issue. produceAnalysisEngine
parses all the relevant descriptors and then calls the
initialize method on each of the annotators. Because a
given annotator's initialize method may perform logic
that uses a given library, I had been setting the
Thread's context ClassLoader before calling
produceAnalysisEngine. However, when
produceAnalysisEngine is then run, it references a
ClassLoader that may offer a newer version of a
certain library, which causes problems. Specifically,
one annotator requires (and includes) a newer version
of Xerces, which causes produceAnalysisEngine to throw
a java.lang.LinkageError. However, if I don't set the
context ClassLoader before calling
produceAnalysisEngine, the annotator's initialize
method fails with a different error because it's
expecting the newer version. So I'm kind of stuck
between a rock and a hard place, so to speak.

What I've done at this point is to call
produceAnalysisEngine with the ResourceManager that
includes the extension class path, and *then* set the
context ClassLoader, then call the AnalysisEngine
process method. This only works because I've been able
to convince the authors of the affected annotators to
defer initialization to the process method. However,
this is obviously not a viable long-term solution given
the design of my product.
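
To make the workaround concrete, here is a minimal
sketch of the "defer initialization to process" pattern.
It is plain Java, not a real UIMA annotator: the class
name, method signatures, and the recorded fields are
hypothetical and exist only to show where the
library-dependent setup moves.

```java
public class LazyInitSketch {
    private boolean ready = false;
    private int setupCount = 0;
    private ClassLoader loaderAtSetup;

    /** Framework-style initialize hook: deliberately does no library work. */
    public void initialize() {
        // Intentionally empty; library-dependent setup is deferred to process.
    }

    /** Framework-style process hook: performs the deferred setup on first use. */
    public String process(String text) {
        ensureReady();
        return text.trim();
    }

    /** Runs the deferred setup once, under whatever context loader is current. */
    private synchronized void ensureReady() {
        if (ready) {
            return;
        }
        // A real annotator would load its libraries here; this sketch only
        // records which context ClassLoader was in effect at that moment.
        loaderAtSetup = Thread.currentThread().getContextClassLoader();
        setupCount++;
        ready = true;
    }

    public synchronized int getSetupCount() {
        return setupCount;
    }

    public synchronized ClassLoader getLoaderAtSetup() {
        return loaderAtSetup;
    }

    public static void main(String[] args) {
        LazyInitSketch annotator = new LazyInitSketch();
        annotator.initialize();          // no setup happens here
        annotator.process("  first  ");  // setup happens on this call
        annotator.process("second");     // setup is not repeated
        System.out.println("setup ran " + annotator.getSetupCount() + " time(s)");
    }
}
```

The drawback is visible in the sketch: every annotator
has to be written this way, which is exactly what an
application running arbitrary annotators cannot assume.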

Is there something I'm missing? I've scoured the
documentation looking for answers, but it seems a
little light on the whole 'dynamic running of
annotators' concept in general. Everything tends to
point to hard-coding the class path at JVM launch
time, which isn't possible for my application.

Thanks so much for the product and in advance for the
help.

Kirk

--------------------------------------------
Kirk True
Principal Engineer
Mustard Grain

[EMAIL PROTECTED]
(831) 588-7959
http://www.mustardgrain.com
