Hi there,

> Hi,
> 
> The uimaFIT manual (5.1) states that the preferred way to iterate over tokens in
> the CAS is the following:
> 
>    // JCas version
>    for (Token token : JCasUtil.select(jcas, Token.class)) {
>      ...
>    }
> 
> This assumes a Token.class is importable somewhere. But I'm using the OpenNLP
> tools, which don't provide such a type. Instead, it seems to be generated at run
> time during configuration steps, and is not accessible as a class in the AE (to
> my knowledge).

No, it is not generated at runtime. It is generated manually or at build time,
e.g. using the jcasgen-maven-plugin.

OpenNLP aims to be configurable with regard to types. So you must have *some*
type system that you configure OpenNLP to use, right? Open it in the Eclipse
UIMA Type-System Editor and hit the "JCasGen" button. It will generate the
JCas classes that you can then use with uimaFIT's JCasUtil.

> Additionally, when extending o.a.u.fit.component.JCasAnnotator_ImplBase instead
> of o.a.u.component.JCasAnnotator_ImplBase, the method void typeSystemInit(TypeSystem)
> is not provided, which makes instantiating the type system the same way OpenNLP
> does it rather cumbersome (I generate an empty CAS with the TypeSystemDescription,
> then get its TypeSystem and provide the Type and Feature objects from this
> TypeSystem instance as uimaFIT configuration parameters before deploying my AE.)

typeSystemInit() is meant for CAS-based analysis engines, not for JCas-based
annotators. You need the CAS-based API only if you want to configure your
components at runtime with regard to the annotation types they should use. If
you can stick to a specific type system, use the JCas-based analysis engines.

> Even then, I can only use the less type-safe method of iterating over
> annotations: for (AnnotationFS token : cas.getAnnotationIndex(tokenType)) where
> tokenType is the Type instance I acquired from the TypeSystem either during
> typeSystemInit() or during configuration with the above hack.

The CAS-API is not type-safe. Neither is the UIMA-JCas API, but the uimaFIT 
JCas-API is ;)
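
To illustrate what I mean by type-safe, here is a rough sketch in plain Java.
The classes Annotation, Token and Sentence and the select() helper are toy
stand-ins made up for this mail; they are not the UIMA or uimaFIT
implementation. The sketch only shows the idea of passing a Class object so
that the result comes back already typed, which is the shape of the
JCasUtil.select(jcas, Token.class) call you quoted.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: these classes mimic the shape of the API, they are NOT UIMA classes.
class TypeSafeSelectSketch {

    // Stand-in for a generic annotation (plays the role of AnnotationFS).
    static class Annotation {
        final int begin;
        final int end;

        Annotation(int begin, int end) {
            this.begin = begin;
            this.end = end;
        }
    }

    // Stand-in for a class that JCasGen would generate; real generated classes look different.
    static class Token extends Annotation {
        final String pos;

        Token(int begin, int end, String pos) {
            super(begin, end);
            this.pos = pos;
        }
    }

    // A second hypothetical type, so the index holds more than one kind of annotation.
    static class Sentence extends Annotation {
        Sentence(int begin, int end) {
            super(begin, end);
        }
    }

    // The Class<T> argument fixes the element type of the result,
    // so callers get List<T> back and never need to cast.
    static <T extends Annotation> List<T> select(List<Annotation> index, Class<T> type) {
        List<T> result = new ArrayList<>();
        for (Annotation a : index) {
            if (type.isInstance(a)) {
                result.add(type.cast(a));
            }
        }
        return result;
    }

    // A small mixed index: one sentence covering two tokens.
    static List<Annotation> sampleIndex() {
        List<Annotation> index = new ArrayList<>();
        index.add(new Sentence(0, 11));
        index.add(new Token(0, 5, "UH"));
        index.add(new Token(6, 11, "NN"));
        return index;
    }

    public static void main(String[] args) {
        // The loop variable is a Token, so token.pos is checked by the compiler.
        for (Token token : select(sampleIndex(), Token.class)) {
            System.out.println(token.begin + "-" + token.end + " " + token.pos);
        }
    }
}
```

With real generated JCas classes the loop has the same shape. The difference is
that the classes come from JCasGen instead of being written by hand.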

> Is there some good way of solving this dilemma while still using uimaFIT's
> classes? Obviously, I could go back to using just plain UIMA, but I quite like
> uimaFIT's way of dealing with external resources! And I don't like the
> type-system-through-cas hack.

Generate the JCas classes for your type system and you should be fine.

Alternatively, you could use a different OpenNLP binding for UIMA, e.g. the
one provided by DKPro Core [1] (not an Apache project, but one I'm working on
too).

Cheers,

-- Richard

[1] https://code.google.com/p/dkpro-core-asl/
