Hi Georg,

> Is there a way to obtain a specific UIMA-type without having access to a CAS ?

You can only get "Type" instances from a CAS (there may be ways around that, 
but it hardly makes sense).

> At the moment I create a TypeSystemDescription and a CollectionReader. With 
> that reader I get my CASes. Only from those CASes I can get the TypeSystem 
> from which I can get the type I want. The code always looks like this
> 
> void doSomething() {
>    TypeSystemDescription tsd = 
> UIMAFramework.getXMLParser().parseTypeSystemDescription(new 
> XMLInputSource(aTSFile));
>    CollectionReader reader = 
> CollectionReaderFactory.createCollectionReader(MyReaderClass.class, tsd);
>    JCasIterable iterable = new JCasIterable(reader);
>    Type myType = null;
>    for (JCas aJCas : iterable) {
>        if (myType == null) {
>            myType = aJCas.getCas().getTypeSystem().getType("my.type.atype1");
>        }
>        AnnotationIndex<Annotation> index = aJCas.getAnnotationIndex(myType);
>        doSomethingWithTheAnnotations()...
>    }
> }

If you have JCas wrappers for your types, stuff gets much easier. E.g. if you 
have a type "my.type.Atype1", you'd do something like

for (JCas jcas : new JCasIterable(reader)) {
  Collection<Atype1> annotations = JCasUtil.select(jcas, Atype1.class);
  // do something with the annotations
}

If you do not have JCas wrappers, then it's a bit more complicated. You can 
e.g. do this:

final static String ATYPE1 = "my.type.Atype1";
for (JCas jcas : new JCasIterable(reader)) {
  CAS cas = jcas.getCas();
  Collection<AnnotationFS> annotations =
      CasUtil.select(cas, CasUtil.getType(cas, ATYPE1));
  // do something with the annotations
}

With the CAS interface only, fetching features from the type is also more 
complicated. E.g. with JCas wrappers you could do this:

  Atype1 annotation = …
  annotation.setMyValue("this is my value");
  annotation.getMyValue();

With CAS it's like

  AnnotationFS annotation = …
  Feature f = annotation.getType().getFeatureByBaseName("myValue");
  annotation.setStringValue(f, "this is my value");
  annotation.getStringValue(f);

You may want to consider using a pre-defined type system (as opposed to 
defining types on-the-fly in TextMarker) and generating JCas wrappers for them, 
which you then can use with uimaFIT (as in the examples above with JCasUtil and 
CasUtil) or with the plain UIMA API that you used in your code.

> It looks a bit ugly from the logical structure that the type has to be 
> obtained inside the for loop. I would think it desirable to have a method to 
> get this type already outside the loop.

In any case (CAS or JCas), the annotations do not survive the loop, because the 
CAS is re-used within JCasIterable. You'd have to copy all values that you want 
to continue using. Most people would not want to use JCasIterable, but rather 
write a new UIMA component which further processes the CAS or which writes 
results to some file or database and use e.g. uimaFIT 
SimplePipeline.runPipeline(reader, other-components…) to run these.
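
To illustrate why values have to be copied out inside the loop, here is a minimal plain-Java sketch. It deliberately does not use the UIMA API: Holder is a hypothetical stand-in for the re-used CAS, and keepReferences/copyValues are made-up names for this example only. What actually happens to stale annotations in UIMA may differ in detail; the sketch only shows the general problem of keeping references into a container that is refilled on every iteration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ReuseSketch {

    // Hypothetical stand-in for a CAS: a single mutable holder that is
    // refilled for every document instead of being created anew.
    static final class Holder {
        private final StringBuilder text = new StringBuilder();

        void reset(String newText) {
            text.setLength(0);
            text.append(newText);
        }

        // Live view: always reflects what the holder currently contains.
        CharSequence view() {
            return text;
        }
    }

    // Keeps references to the live view. After the loop, every entry
    // shows the content of the last document.
    static List<CharSequence> keepReferences(List<String> docs) {
        Holder holder = new Holder();
        List<CharSequence> kept = new ArrayList<>();
        for (String doc : docs) {
            holder.reset(doc);
            kept.add(holder.view());
        }
        return kept;
    }

    // Copies the value while the holder still contains it, so the
    // result stays valid after the loop.
    static List<String> copyValues(List<String> docs) {
        Holder holder = new Holder();
        List<String> copied = new ArrayList<>();
        for (String doc : docs) {
            holder.reset(doc);
            copied.add(holder.view().toString());
        }
        return copied;
    }

    public static void main(String[] args) {
        List<String> docs = Arrays.asList("first", "second");
        System.out.println(keepReferences(docs)); // [second, second]
        System.out.println(copyValues(docs));     // [first, second]
    }
}
```

In the UIMA case, the equivalent of copyValues would be to read the covered text or feature values you need from each annotation while you are still inside the loop body.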

Cheers,

-- Richard
