On Sat, Sep 5, 2009 at 8:43 PM, Marshall Schor <[email protected]> wrote:
>
> The method JCasRegistry.getClassForIndex... returns a Java
> cover class corresponding to a particular CAS type.
> As previously discussed, there can be multiple definitions for
> these.
How do you map from one to the other, or how do you make use of the
annotation types that come out of a PEAR if they are different
CAS/cover types (I'm still a bit fuzzy on that distinction) than those
the invoking code is aware of? The definition for the cover class
that is used to annotate the content is contained within the PEAR (as
a jar that resides in the PEAR's lib dir, and on the PEAR's
classpath), so the calling code (the 'core' module in my case) can't
see that definition.
I could treat it as an Annotation instance, but then I can't access
the data I need -- nor can I filter the types that way. Since the
type id isn't a compile-time constant, I can't store that somewhere
that core can access, so filtering based on that is out too. I could
possibly filter with the Type from
jcas.getTypeSystem().getType(canonicalName), but that would require a
magic string somewhere that would have to be maintained alongside the
definition of Redaction (without access to the Redaction class file, I
can't use the Redaction.class.getCanonicalName() trick below).
> Rogan Creswick wrote:
>>
>> a true function. It seems like there are at least two parallel ways
>> to retrieve types, and in my experience, they don't return the same
>> results--at least when getting filtered annotation indices. (The ways
>> being: JCasRegistry.getClassForIndex(MyAnnotationType.type) and
>> aJCas.getTypeSystem().getTypeByName(MyAnnotationType.class.getType())
>>
>
> The second form I think is written incorrectly - it won't compile for me
My apologies, I was being lazy and I didn't look up the actual line of code.
Here's the actual method -- with both approaches to getting an AnnotationIndex:
/**
 * Collects the Redaction annotations on the JCas and creates a list of
 * Violations from them.
 *
 * @param jcas The JCas with annotations.
 * @return A list of Violations on the document.
 */
private ImmutableList<Violation> extractViolations(final JCas jcas) {
    Builder<Violation> builder = ImmutableList.builder();

    // First approach: (used in tutorial code, most obvious)
    // This delegates to JCasRegistry.getClassForIndex(int)
    //
    // AnnotationIndex index = jcas.getAnnotationIndex(Redaction.type);

    // Second approach: uses the canonical class name to extract a
    // type from the JCAS's type system:
    Type redactionType =
        jcas.getTypeSystem().getType(Redaction.class.getCanonicalName());
    AnnotationIndex index = jcas.getAnnotationIndex(redactionType);

    // the returned index from jcas is restricted based on the
    // Redaction.type passed in. See API docs at:
    //
    // http://incubator.apache.org/uima/downloads/releaseDocs/2.2.2-incubating/docs/api/org/apache/uima/jcas/JCas.html#getAnnotationIndex(int)
    @SuppressWarnings("unchecked")
    Iterator<Redaction> itr = index.iterator();
    while (itr.hasNext()) {
        Redaction r = itr.next();
        int start = r.getBegin();
        int end = r.getEnd();
        Set<IRestriction> restrictions =
            parseRestrictions(r.getRestrictions());
        builder.add(new Violation(start, end, restrictions));
    }
    return builder.build();
}
Thanks for taking the time to help!
--Rogan
> because the TypeSystem object returned by aJCas.getTypeSystem() doesn't
> have a method getTypeByName... so I'm not sure what you meant.
>> Anyhow, here's an overview of what we're doing -- it may shed some
>> light on this issue:
>>
>>
>> The UIMA portion of our application is a self-contained module (let's
>> call it 'core') that (once instantiated) takes a Document as input,
>> and returns a Collection<Violation>. Violations are moderately
>> complex data structures that contain the fields of an Annotation
>> object -- specifically, a Redaction (Redaction is a JCasGen-generated
>> annotation subtype with some minor additional metadata that the
>> Annotators populate).
>>
>> When core is instantiated, it gets a list of UIMA annotators to use to
>> generate Redaction objects, which are, in turn, translated by core
>> into Violations. So, from core's perspective, each UIMA annotator is
>> just a module that generates a jcas with Redaction annotations.
>>
>> The UIMA annotators need know nothing about core to function, although
>> they do have a dependency on core at the moment, so that they can all
>> share the same implementation of Redaction and Redaction_Type.
>>
>> My intent was to use PEARs as the distribution mechanism for UIMA
>> annotators. The core module would then be configured with a set of
>> key,value pairs that are provided to the AnalysisEngine as parameters,
>> and deployment would be a simple matter of dropping a pear in the
>> right place and then specifying an additional small section of core
>> config.
>>
>> I now have this all working--and the generated PEARs can run
>> stand-alone too, which makes testing/debugging a good bit easier. (we
>> can load them in the UIMA tools, for example.)--but it reeks.
>>
>> What I've ended up doing is installing the pears programmatically at
>> runtime (to simplify deployment), but loading them as PEARs prevents
>> core from providing parameter values (I don't understand why, but
>> UIMA_IllegalStateExceptions abound if you try that).
> I think this should now work. There were many changes to make the PEAR
> wrapper work better that were done since 2.2.2 release, including
> UIMA-1107 which I think fixed the parameter setting things. These
> changes are in the SVN trunk, which we're in the process of getting
> ready for the 2.3.0 release.
>> Instead, we're
>> using the PackageBrowser returned by installing the pear to determine
>> the non-pear descriptor and the classpath/datapath. After filtering
>> out the core dependency from the classpath, it goes to a
>> ResourceManager that can be used to load the annotator properly, and
>> all the code involved can see one definition of the Redaction class.
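
The classpath-filtering step described above could look roughly like the
following. This is a minimal sketch in plain Java, not the code from the
project: the class name ClasspathFilter, the method withoutEntries, and the
"core-" jar-name prefix are hypothetical, and it assumes the classpath is
available as a single separator-delimited string.

```java
import java.io.File;
import java.util.Arrays;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class ClasspathFilter {

    /**
     * Returns the given classpath with every entry removed whose file name
     * starts with fileNamePrefix. Empty entries are dropped as well.
     *
     * The separator is a parameter so the method behaves the same on every
     * platform; callers would normally pass File.pathSeparator.
     */
    static String withoutEntries(String classpath, String fileNamePrefix, String separator) {
        return Arrays.stream(classpath.split(Pattern.quote(separator)))
                .filter(entry -> !entry.isEmpty())
                // Compare only the last path segment, e.g. "core-1.0.jar".
                .filter(entry -> !new File(entry).getName().startsWith(fileNamePrefix))
                .collect(Collectors.joining(separator));
    }

    public static void main(String[] args) {
        // Hypothetical classpath; a real one would come from the installed PEAR.
        String original = String.join(File.pathSeparator,
                "lib/annotator.jar", "lib/core-1.0.jar", "lib/guava.jar");
        System.out.println(withoutEntries(original, "core-", File.pathSeparator));
    }
}
```

Matching on a file-name prefix is deliberately crude: it would also drop an
unrelated jar whose name happens to start with the same prefix, so a stricter
match (exact file name, or a version pattern) may be the safer choice.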
>>
>> Thanks!
>> Rogan
>>
>>
>>
>>
>>> The use-case for Pears is to provide a shielded environment where the
>>> things in the PEAR can run with an independent classpath. For example,
>>> a Pear component can define a JCas class called Token, which might have
>>> a different cover class than anyone else's Token. While inside the
>>> PEAR, its Token JCas class would be used, while, outside the PEAR, other
>>> versions of this class might be used. This is done on purpose.
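
As far as I understand it, this shielding rests on ordinary Java class-loader
behavior: a class is identified by its name plus the loader that defined it,
so two loaders can each hold a distinct class with the same name. The sketch
below shows only that effect, in plain Java with no UIMA involved. The names
ShieldingDemo, IsolatingLoader, and the nested Token class are hypothetical
stand-ins, and the approach assumes the compiled class file is readable as a
resource through the parent loader.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

class ShieldingDemo {

    /** Hypothetical stand-in for a cover class such as Token. */
    static class Token {
    }

    /** Defines its own copy of one named class; delegates all others to the parent. */
    static class IsolatingLoader extends ClassLoader {
        private final String isolatedName;

        IsolatingLoader(ClassLoader parent, String isolatedName) {
            super(parent);
            this.isolatedName = isolatedName;
        }

        @Override
        protected Class<?> loadClass(String name, boolean resolve)
                throws ClassNotFoundException {
            if (!name.equals(isolatedName)) {
                return super.loadClass(name, resolve);
            }
            synchronized (getClassLoadingLock(name)) {
                Class<?> c = findLoadedClass(name);
                if (c == null) {
                    // Skip parent delegation and define the class here instead.
                    byte[] bytes = readClassBytes(name);
                    c = defineClass(name, bytes, 0, bytes.length);
                }
                if (resolve) {
                    resolveClass(c);
                }
                return c;
            }
        }

        /** Reads the compiled class file through the parent loader's resources. */
        private byte[] readClassBytes(String name) throws ClassNotFoundException {
            String resource = name.replace('.', '/') + ".class";
            try (InputStream in = getParent().getResourceAsStream(resource)) {
                if (in == null) {
                    throw new ClassNotFoundException(name);
                }
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                byte[] buf = new byte[4096];
                int n;
                while ((n = in.read(buf)) != -1) {
                    out.write(buf, 0, n);
                }
                return out.toByteArray();
            } catch (IOException e) {
                throw new ClassNotFoundException(name, e);
            }
        }
    }

    /** Loads a second, isolated copy of Token, as a shielded component would see it. */
    static Class<?> loadIsolatedCopy() {
        String name = Token.class.getName();
        try {
            return new IsolatingLoader(Token.class.getClassLoader(), name).loadClass(name);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Class<?> inside = loadIsolatedCopy();
        System.out.println("same name:  " + inside.getName().equals(Token.class.getName()));
        System.out.println("same class: " + (inside == Token.class));
        System.out.println("instance:   " + inside.isInstance(new Token()));
    }
}
```

The two Class objects share a name but are not the same class, so an object
created against one cannot be cast to the other. That would be consistent
with code outside the PEAR being unable to use the cover class defined inside
it.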
>>>
>>> If you don't want this shielding behavior, you can get the non-shielding
>>> behavior by 1) installing the PEAR, 2) resolving any class path issues
>>> by hand, 3) setting up a common, appropriate class path for both the
>>> PEAR component(s) and the remaining components, and then 4) running with
>>> the normal descriptor for the component (not the Pear-specifier descriptor).
>>>
>>> -Marshall
>>>
>>>
>>
>>
>>
>