On Sat, Sep 5, 2009 at 5:09 AM, Marshall Schor <[email protected]> wrote:
>
> Can you say a bit more what the problem is?
>
I think my problem is actually tangential to the issue with JCasRegistry. After retrieving types from the jcas TypeSystem, I still ran into issues with multiple Redaction definitions because multiple copies of the bytecode were being loaded (multiple class loaders, I think). I've worked around that -- see below if you're interested. I'd like to hear suggestions to make it cleaner, but at least it's working.

I still don't understand why JCasRegistry.register(...) shouldn't be a true function. It seems like there are at least two parallel ways to retrieve types, and in my experience they don't return the same results -- at least when getting filtered annotation indices. The two ways being:

  JCasRegistry.getClassForIndex(MyAnnotationType.type)

and

  aJCas.getTypeSystem().getTypeByName(MyAnnotationType.class.getName())

Anyhow, here's an overview of what we're doing -- it may shed some light on this issue:

The UIMA portion of our application is a self-contained module (let's call it 'core') that, once instantiated, takes a Document as input and returns a Collection<Violation>. Violations are moderately complex data structures that contain the fields of an Annotation object -- specifically, a Redaction. (Redaction is a JCasGen-generated annotation subtype with some minor additional metadata that the annotators populate.)

When core is instantiated, it gets a list of UIMA annotators to use to generate Redaction objects, which core in turn translates into Violations. So, from core's perspective, each UIMA annotator is just a module that generates a jcas with Redaction annotations. The UIMA annotators need to know nothing about core to function, although they do have a dependency on core at the moment so that they can all share the same implementation of Redaction and Redaction_Type.

My intent was to use PEARs as the distribution mechanism for UIMA annotators.
The core module would then be configured with a set of key/value pairs that are provided to the AnalysisEngine as parameters, and deployment would be a simple matter of dropping a PEAR in the right place and then specifying an additional small section of core config.

I now have this all working, but it reeks. (On the plus side, the generated PEARs can run stand-alone too, which makes testing/debugging a good bit easier; we can load them in the UIMA tools, for example.)

What I've ended up doing is installing the PEARs programmatically at runtime (to simplify deployment), but loading them as PEARs prevents core from providing parameter values (I don't understand why, but UIMA_IllegalStateExceptions abound if you try that). Instead, we're using the PackageBrowser returned by installing the PEAR to determine the non-PEAR descriptor and the classpath/datapath. After filtering the core dependency out of the classpath, the classpath goes to a ResourceManager that can be used to load the annotator properly, and all the code involved can see one definition of the Redaction class.

Thanks!

Rogan

> The use-case for Pears is to provide a shielded environment where the
> things in the PEAR can run with an independent classpath. For example,
> a Pear component can define a JCas class called Token, which might have
> a different cover class than anyone else's Token. While inside the
> PEAR, its Token JCas class would be used, while, outside the PEAR, other
> versions of this class might be used. This is done on purpose.
>
> If you don't want this shielding behavior, you can get the non-shielding
> behavior by 1) installing the PEAR, 2) resolving any class path issues
> by hand, 3) setting up a common, appropriate class path for both the
> PEAR component(s) and the remaining components, and then 4) running with
> the normal descriptor for the component (not the Pear-specifier descriptor).
>
> -Marshall
>
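
P.S. In case it helps anyone following along, the classpath-filtering step of the workaround described above can be sketched roughly as below. This is only a minimal illustration using the plain JDK, not our actual code: the class name ClasspathFilter, the method withoutEntriesContaining, and the jar names are all hypothetical, and it assumes the core dependency can be recognized by a fragment of its file name. The filtered string is what would then be handed on to the ResourceManager.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper (not part of UIMA): drops unwanted entries from a
// classpath string such as the one obtained for an installed PEAR.
public class ClasspathFilter {

    // Returns the classpath without any entry whose file name contains
    // `marker`. Entries are split and re-joined with the platform's
    // path separator (':' on Unix, ';' on Windows).
    public static String withoutEntriesContaining(String classpath, String marker) {
        List<String> kept = new ArrayList<>();
        for (String entry : classpath.split(Pattern.quote(File.pathSeparator))) {
            if (entry.isEmpty()) {
                continue; // skip empty segments from doubled separators
            }
            // Match on the file name only, so a parent directory that
            // happens to contain the marker does not exclude the entry.
            String fileName = new File(entry).getName();
            if (!fileName.contains(marker)) {
                kept.add(entry);
            }
        }
        return String.join(File.pathSeparator, kept);
    }

    public static void main(String[] args) {
        // Example entries are made up for illustration.
        String cp = String.join(File.pathSeparator,
                "pear/lib/annotator.jar", "pear/lib/core-1.0.jar", "pear/bin");
        System.out.println(withoutEntriesContaining(cp, "core-"));
    }
}
```

Matching on the file name rather than the whole path is a deliberate choice here: the install directory itself may well contain the word "core", and that should not knock out every entry.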
