A few more high-level thoughts about OSGi and UIMA.
 
Both try to address "modularity".  OSGi is about package versioning (or bundle
versioning), and expressing wiring via exports and imports (dependencies) among
packages / bundles.  (OSGi is of course about lots of other stuff, too.)

UIMA addresses modularity by externalizing metadata about components in XML
descriptors.  These can specify versions (but we don't make use of that at the
moment), and "aggregation" can depend on delegates.

The UIMA way of binding to delegate info is either by location (absolute or
path-relative), or by name, which relies on an externally set-up classpath.
However, UIMA doesn't have much support for setting up the classpath (it has
PEAR files for "switching" the classpath, and the uima-bootstrap code).  Users
often have trouble getting the classpath set up properly.
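
To make the by-name case concrete, here is a minimal Java sketch of how a
by-name import can be mapped onto the classpath: dots in the name become
slashes and ".xml" is appended.  The class name and the descriptor name
"org.example.MyAnnotator" are made up for illustration, and the sketch only
covers the classpath lookup (UIMA can also consult its datapath), so treat it
as an approximation rather than the framework's actual code.

```java
import java.net.URL;

// Illustrative only: approximates a by-name descriptor lookup on the classpath.
class ByNameImportSketch {

    // "org.example.MyAnnotator" -> "org/example/MyAnnotator.xml"
    static String toResourcePath(String importName) {
        return importName.replace('.', '/') + ".xml";
    }

    // Returns the descriptor's URL, or null when it is not on the classpath.
    static URL resolve(String importName, ClassLoader loader) {
        return loader.getResource(toResourcePath(importName));
    }

    public static void main(String[] args) {
        String name = "org.example.MyAnnotator"; // hypothetical descriptor name
        System.out.println(toResourcePath(name)); // prints org/example/MyAnnotator.xml
        URL url = resolve(name, ClassLoader.getSystemClassLoader());
        System.out.println(url == null ? "not found on classpath" : "found at " + url);
    }
}
```

The null result is the situation users typically hit: the descriptor exists
somewhere, but not on the classpath that happens to be in effect.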

If we think of annotators as OSGi bundles, they can contain within them the
information needed to create a proper classpath.  They do this by importing
packages, or requiring bundles, by name and version range.
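
As a rough illustration of what a version range means, the Java sketch below
checks a version against an OSGi-style range such as "[1.2,2.0)" (square
bracket inclusive, round bracket exclusive; a bare "1.2" means "1.2 or
later").  It is a simplification written for this note: it compares only
major.minor.micro, ignores qualifiers, and does no error handling.  The class
and method names are made up; a real framework does this matching for you.

```java
// Simplified stand-in for OSGi version-range matching (qualifiers are ignored).
class VersionRangeSketch {

    // Parses "1", "1.2" or "1.2.3" into {major, minor, micro}; missing parts are 0.
    static int[] parse(String version) {
        String[] parts = version.trim().split("\\.");
        int[] v = new int[3];
        for (int i = 0; i < 3 && i < parts.length; i++) {
            v[i] = Integer.parseInt(parts[i]);
        }
        return v;
    }

    // Negative, zero or positive, like Comparator.compare.
    static int compare(int[] a, int[] b) {
        for (int i = 0; i < 3; i++) {
            if (a[i] != b[i]) {
                return Integer.compare(a[i], b[i]);
            }
        }
        return 0;
    }

    // range is either "[low,high)" style or a bare "low" meaning "at least low".
    static boolean includes(String range, String version) {
        String r = range.trim();
        int[] v = parse(version);
        char first = r.charAt(0);
        if (first != '[' && first != '(') {
            return compare(v, parse(r)) >= 0;
        }
        char last = r.charAt(r.length() - 1);
        String[] bounds = r.substring(1, r.length() - 1).split(",");
        int low = compare(v, parse(bounds[0]));
        int high = compare(v, parse(bounds[1]));
        boolean lowOk = (first == '[') ? low >= 0 : low > 0;
        boolean highOk = (last == ']') ? high <= 0 : high < 0;
        return lowOk && highOk;
    }

    public static void main(String[] args) {
        System.out.println(includes("[1.2,2.0)", "1.4.1")); // true
        System.out.println(includes("[1.2,2.0)", "2.0.0")); // false
        System.out.println(includes("1.2", "3.0"));         // true
    }
}
```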

When hooked up to bundle repositories (or Maven), support is already "out there"
to fetch needed dependencies at the right version level.

Sometimes the dependencies are already OSGi bundles; other times they are just
plain Java Jars.
I have found, in both Apache Karaf and pax-construct, support for automatically
wrapping the plain Java Jars into OSGi bundles.  For example, Karaf has a
feature where you can "drop" things into a monitored directory and they will be
installed into a running OSGi framework.  If you drop a plain Jar, it is
automatically wrapped into a bundle.

If we can get this working, a good outcome would be getting rid of "setting up
the classpath" issues for running UIMA pipelines.  One would instead construct a
top-level aggregate and have that "import" the versions wanted for the delegate
components, as well as any other dependencies.  (A button could be added to the
Eclipse configurator that, when pressed, would produce the appropriate OSGi
bundle for an aggregate, using version info etc. to specify the delegates.)
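
A minimal sketch of what such a generator might do, in Java: take the
delegates' bundle names and wanted version ranges and emit a bundle manifest
whose Require-Bundle header lists them.  Everything named here (the class,
the "org.example.*" symbolic names, the version ranges) is invented for
illustration; only java.util.jar.Manifest and the standard OSGi header names
are real.  Import-Package could be used in the same way instead of
Require-Bundle.

```java
import java.io.IOException;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;
import java.util.jar.Attributes;
import java.util.jar.Manifest;

// Illustrative generator: aggregate + delegate version ranges -> bundle manifest.
class AggregateBundleSketch {

    // Builds: name1;bundle-version="range1",name2;bundle-version="range2"
    static String requireBundleHeader(Map<String, String> delegates) {
        StringJoiner header = new StringJoiner(",");
        for (Map.Entry<String, String> d : delegates.entrySet()) {
            header.add(d.getKey() + ";bundle-version=\"" + d.getValue() + "\"");
        }
        return header.toString();
    }

    static Manifest manifestFor(String symbolicName, String version,
                                Map<String, String> delegates) {
        Manifest manifest = new Manifest();
        Attributes main = manifest.getMainAttributes();
        // Manifest-Version must be set, otherwise write() produces no main section.
        main.put(Attributes.Name.MANIFEST_VERSION, "1.0");
        main.putValue("Bundle-ManifestVersion", "2");
        main.putValue("Bundle-SymbolicName", symbolicName);
        main.putValue("Bundle-Version", version);
        main.putValue("Require-Bundle", requireBundleHeader(delegates));
        return manifest;
    }

    public static void main(String[] args) throws IOException {
        Map<String, String> delegates = new LinkedHashMap<>();
        delegates.put("org.example.tokenizer", "[1.0,2.0)"); // hypothetical delegate
        delegates.put("org.example.tagger", "[2.1,3.0)");    // hypothetical delegate
        manifestFor("org.example.my.aggregate", "1.0.0", delegates).write(System.out);
    }
}
```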

The various existing OSGi tools seem to support multiple styles of creating
bundles: if you have an Annotator that depends on other Jars, you can either
incorporate those Jars within the bundle, or you can depend on them (in the OSGi
import-package/bundle sense), and keep your bundle small.  This latter way seems
the preferable approach.  With this style, something like the TikaAnnotator,
which today might include the tika-core.jar and the tika-parser.jar, would
instead just have the UIMA code, and depend on these other Jars at some
version-range levels.  This would allow a more flexible evolution of the
application (e.g., one could upgrade the tika-core jar independently of other
things). 
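
To contrast the two styles for the Tika case, the Java sketch below just
prints the manifest headers each style would roughly need.  The bundle
symbolic name, the lib/ layout, and the "[1.0,2.0)" ranges are assumptions
made up for the example; a real bundle would also need to import the UIMA
packages it uses.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative comparison of the two bundling styles; all values are hypothetical.
class TikaBundleStylesSketch {

    // Style 1: the Tika Jars are embedded in the bundle and put on its own classpath.
    static Map<String, String> embeddedStyle() {
        Map<String, String> headers = new LinkedHashMap<>();
        headers.put("Bundle-SymbolicName", "org.example.tika.annotator");
        headers.put("Bundle-ClassPath", ".,lib/tika-core.jar,lib/tika-parser.jar");
        return headers;
    }

    // Style 2: the bundle holds only the UIMA code and imports Tika packages by range.
    static Map<String, String> importStyle() {
        Map<String, String> headers = new LinkedHashMap<>();
        headers.put("Bundle-SymbolicName", "org.example.tika.annotator");
        headers.put("Import-Package",
                "org.apache.tika;version=\"[1.0,2.0)\","
                + "org.apache.tika.parser;version=\"[1.0,2.0)\"");
        return headers;
    }

    public static void main(String[] args) {
        System.out.println("# embedded style");
        embeddedStyle().forEach((k, v) -> System.out.println(k + ": " + v));
        System.out.println("# import style");
        importStyle().forEach((k, v) -> System.out.println(k + ": " + v));
    }
}
```

With the import style, moving to a newer tika-core within the stated range
needs no change to the annotator bundle at all.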

-Marshall
