Recently, Michael proposed adding 'UIMA PEAR runtime' capabilities to automatically install PEARs and run the encapsulated analytics. In connection with that discussion, I would like to share some experience with using PEARs in UIMA and start a broader discussion on this topic. The issues I describe below were identified while developing the 2nd generation of PEAR (in 2005-2006).

The PEAR format was introduced as a convenient vehicle for packaging and distributing UIMA analytics and other resources. A PEAR package must include an installation descriptor file, which contains the PEAR identification, a reference to the main UIMA descriptor, runtime settings and other information.

A PEAR package has 3 different states: (1) ready for packing/archiving, (2) packed as a zip (*.pear) archive and (3) installed in a local file system. In state 1, the package is unpacked but may not be usable for UIMA deployment, because its installation descriptor, as well as other descriptor/configuration files, may contain $main_root expressions. In state 3, the package is ready for UIMA deployment. The transition from state 1 to state 2 (zipping) does not modify the package contents, while the transition from state 2 to state 3 (installation) may irreversibly modify the package contents by localizing the installation descriptor and other descriptor/configuration files, that is, by replacing $main_root with an absolute path. The installation/localization step is necessary for using a PEAR in UIMA.

As you can see from the previous paragraph, the current PEAR has the following issues:

1. The installation descriptor file contains several different kinds of data: component identification, runtime settings, etc. In the 2nd generation of PEAR we proposed separating the component identification part from the runtime settings. There are multiple reasons for doing this, but I would like to mention only one: the component identification is not modifiable, while the runtime settings may contain $main_root expressions and will be modified during the localization step.

2. The presence of $main_root expressions in component descriptor/configuration files makes the PEAR package (in state 1) unusable for UIMA deployment. As a result, the package cannot be validated with standard UIMA tooling before it is packed.

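To make the localization step concrete, here is a minimal sketch in Python of what replacing $main_root with an absolute path amounts to. It is not the PEAR installer API: the function names `localize` and `needs_localization`, the install directory and the descriptor path are hypothetical and chosen only for illustration.

```python
from pathlib import Path

# Placeholder that may appear in PEAR descriptor/configuration files.
MAIN_ROOT = "$main_root"


def needs_localization(text: str) -> bool:
    """Return True if the text still contains a $main_root expression."""
    return MAIN_ROOT in text


def localize(text: str, install_dir: str) -> str:
    """Replace every $main_root expression with the absolute install path.

    Hypothetical helper: it only mirrors the textual substitution described
    above, not the behaviour of any real UIMA class.
    """
    root = Path(install_dir).resolve().as_posix()
    return text.replace(MAIN_ROOT, root)


if __name__ == "__main__":
    # Hypothetical descriptor reference, as it might look in state 1 or 2.
    packaged = "$main_root/desc/MyAnnotator.xml"
    installed = localize(packaged, "/opt/pears/my-annotator")

    print(needs_localization(packaged))   # placeholder still present
    print(needs_localization(installed))  # placeholder gone after localization
    print(installed)
```

After the substitution the text no longer records where the placeholder used to be, which is one way to see why the installed copy cannot simply be re-packed as a portable PEAR.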
To remove the limitation described in item 2, we proposed completely removing $main_root expressions from the package files, including the runtime settings. This requires using only relative paths or import by name in component descriptors, and modifying the PEAR API so that it converts relative paths into absolute paths for the runtime settings.

Yet another point for discussion is the necessity of installing the same PEAR again and again for each instantiation of a UIMA component, as stated in the 'UIMA PEAR runtime' proposal. In the 2nd generation of PEAR we proposed a kind of local registry that keeps track of locally installed PEARs.

In general, I would like to start a discussion on possible ways of improving the processes and API for packaging, managing and deploying analytics in UIMA.

-- Lev
