A few more high-level thoughts about OSGi and UIMA. Both try to address "modularity". OSGi does it through package (or bundle) versioning, and by expressing the wiring among packages / bundles via exports and imports (dependencies). (OSGi is of course about lots of other stuff, too.)
UIMA addresses modularity by externalizing metadata about components in XML descriptors. These can specify versions (but we don't make use of that at the moment), and "aggregates" can depend on delegates. The UIMA way of binding to delegate info is either by location (absolute or relative path), or by name, using an externally set-up classpath. However, UIMA doesn't have very much support for setting up the classpath (it has PEAR files for "switching" the classpath, and the uima-bootstrap code). Users often have trouble getting the classpath set up properly.
If we think of annotators as OSGi bundles, they can contain within them the information needed to create a proper classpath. They do this by importing packages or requiring bundles by "name" and "version" ranges. When hooked up to bundle repositories (or Maven), support is already "out there" to fetch needed dependencies at the right version level. Sometimes the dependencies are already OSGi bundles; other times they are just plain Java Jars. I have found, in both Apache Karaf and pax-construct, support for automatically wrapping plain Java Jars into OSGi bundles. For example, Karaf has a feature where you can "drop" things into a monitored directory and they will be installed into a running OSGi framework. If you drop a plain Jar, it is automatically wrapped into a bundle.
If we can get this working, a good outcome would be getting rid of the "setting up the classpath" issues for running UIMA pipelines. One would instead construct a top-level aggregate and have that "import" the versions wanted for the delegate components, as well as any other dependencies. (A button could be added to the Eclipse configurator that, when pressed, would produce the appropriate OSGi bundle for an aggregate, using version info etc. to specify the delegates.)
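To make the "import by name and version range" idea concrete, here is a rough sketch of the manifest headers such an annotator bundle might carry. Everything specific in it is an assumption for illustration only: the bundle symbolic name is made up, and the package names and version ranges are placeholders, not what Tika or UIMA actually export today. The range syntax is standard OSGi, though: "[1.4,2)" means 1.4 or later, but below 2.

```text
Bundle-ManifestVersion: 2
Bundle-SymbolicName: org.example.uima.tika.annotator
Bundle-Version: 1.0.0
Import-Package: org.apache.tika;version="[1.4,2)",
 org.apache.tika.parser;version="[1.4,2)",
 org.apache.uima.analysis_component;version="[2.4,3)"
```

With headers along these lines, the framework (rather than a hand-built classpath) would be responsible for finding providers of each imported package at an acceptable version, and would refuse to resolve the bundle if none is installed.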
The various existing OSGi tools seem to support multiple styles of creating bundles: if you have an Annotator that depends on other Jars, you can either embed those Jars within the bundle, or you can depend on them (in the OSGi import-package / require-bundle sense) and keep your bundle small. The latter seems the preferable approach. With this style, something like the TikaAnnotator, which today might include the tika-core.jar and the tika-parser.jar, would instead contain just the UIMA code, and depend on these other Jars at some version-range levels. This would allow a more flexible evolution of the application (e.g., one could upgrade the tika-core jar independently of other things).
-Marshall
