Re: descriptor files in opennlp.uima wrappers (e.g. SentenceDetector)

Jörn Kottmann Tue, 28 Apr 2009 07:53:34 -0700

Tobias Wunner wrote:

Hello,
after switching from "org.apache.uima.examples" to the "opennlp.uima" UIMA wrappers I wanted to create an Aggregate Engine using a SentenceDetector, Tokenizer and POS Tagger. Each component comes in the OpenNLP 1.4.3 UIMA wrappers with a descriptor file. An Aggregate Engine applying all 3 steps in sequence is not included though. I encountered the following behavior of the components when trying to run them in the CAS Visual Debugger:
  1) Tokenizer: loads, runs, generates Token Annotation (opennlp.uima.token)
  2) PosTagger: loads, runs, no Annotation

      I guess this is expected since Tokenization was not done before.

  3) SentenceDetector: loads, does not run and generates error

The intended sequence in OpenNLP is a) Sentence Detector b) Tokenizer c) POS Tagger. In my opinion it does not matter whether you do tokenization or sentence detection first.

The POS Tagger cannot run without tokens and sentences. In your case it did not output anything because it could not find any sentences.
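
To illustrate why the tagger stays silent in that situation, here is a toy sketch in plain Java. It is not the opennlp.uima code: the class, the Span type and all method names are made up for illustration, and the stand-in sentence detector simply treats the whole text as one sentence. The only point is that a tagger which walks the text sentence by sentence produces nothing when no sentence annotations exist, even if tokens are present.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: these types are hypothetical and are not the opennlp.uima API.
class PipelineOrderSketch {

    // A typed span over the document text, similar in spirit to a CAS annotation.
    static final class Span {
        final String type;
        final int begin;
        final int end;

        Span(String type, int begin, int end) {
            this.type = type;
            this.begin = begin;
            this.end = end;
        }
    }

    // Marks every maximal run of non-whitespace characters as a "token".
    static List<Span> tokenize(String text) {
        List<Span> tokens = new ArrayList<>();
        int start = -1;
        for (int i = 0; i <= text.length(); i++) {
            boolean boundary = i == text.length() || Character.isWhitespace(text.charAt(i));
            if (!boundary && start < 0) {
                start = i;
            } else if (boundary && start >= 0) {
                tokens.add(new Span("token", start, i));
                start = -1;
            }
        }
        return tokens;
    }

    // Stand-in sentence detector (assumption): the whole text is one sentence.
    static List<Span> detectSentences(String text) {
        List<Span> sentences = new ArrayList<>();
        if (!text.isEmpty()) {
            sentences.add(new Span("sentence", 0, text.length()));
        }
        return sentences;
    }

    // Stand-in tagger: emits one "pos" span per token covered by a sentence.
    static List<Span> tagPos(List<Span> sentences, List<Span> tokens) {
        List<Span> tags = new ArrayList<>();
        for (Span sentence : sentences) {
            for (Span token : tokens) {
                if (token.begin >= sentence.begin && token.end <= sentence.end) {
                    tags.add(new Span("pos", token.begin, token.end));
                }
            }
        }
        return tags;
    }

    public static void main(String[] args) {
        String text = "The tagger needs sentences.";
        List<Span> tokens = tokenize(text);
        // Tokens but no sentences: the tagger has nothing to iterate over.
        System.out.println(tagPos(new ArrayList<>(), tokens).size());
        // Sentences and tokens: one tag per token.
        System.out.println(tagPos(detectSentences(text), tokens).size());
    }
}
```

In the sketch, running only the tokenizer before the tagger gives an empty result, while running sentence detection and tokenization (in either order) before the tagger gives one tag per token.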

"opennlp.uima.util.OpenNlpAnnotatorProcessException: The required parameter opennlp.uima.ContainerType can not be found!"

I am sorry for that. I updated the code but did not update the descriptor. Please get the corrected descriptor from the CVS repository. It is now also documented in the Javadoc of the SentenceDetector.

This indicates the missing parameter "opennlp.uima.ContainerType". But I do not know what value to set it to. Setting it to any value returns an empty Annotation. I assumed the SentenceDetector should run out of the box without dependencies since it is the first step in the pipeline.

Yes, that is true, I will make it optional.

Thanks for your response,
Jörn
