Tobias Wunner wrote:
Hello,

after switching from "org.apache.uima.examples" to the "opennlp.uima" UIMA wrappers I wanted to create an aggregate engine using a Sentence Detector, Tokenizer and POS Tagger. Each component in the OpenNLP 1.4.3 UIMA wrappers comes with its own descriptor file. An aggregate engine that applies all three steps in sequence is not included, though. I encountered the following behavior when trying to run the components in the CAS Visual Debugger:

1) Tokenizer: loads, runs, generates Token annotations (opennlp.uima.token)

2) PosTagger: loads, runs, generates no annotations

   I guess this is expected since tokenization was not done before.

3) SentenceDetector: loads, does not run, and generates an error

The intended sequence in OpenNLP is a) Sentence Detector, b) Tokenizer, c) POS Tagger. In my opinion it does not matter whether tokenization or sentence detection runs first.

The POS Tagger cannot run without tokens and sentences. In your case it did not output anything because
it could not find any sentences.
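
A minimal sketch of an aggregate descriptor that runs the three annotators in that order might look like the following. The file names SentenceDetector.xml, Tokenizer.xml and PosTagger.xml, the delegate keys and the engine name are placeholders chosen for illustration, not names taken from the distribution; adjust them to the descriptors you actually have.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<analysisEngineDescription xmlns="http://uima.apache.org/resourceSpecifier">
  <frameworkImplementation>org.apache.uima.java</frameworkImplementation>
  <!-- false marks this descriptor as an aggregate rather than a primitive engine -->
  <primitive>false</primitive>

  <delegateAnalysisEngineSpecifiers>
    <!-- The locations below are assumed file names, relative to this descriptor -->
    <delegateAnalysisEngine key="SentenceDetector">
      <import location="SentenceDetector.xml"/>
    </delegateAnalysisEngine>
    <delegateAnalysisEngine key="Tokenizer">
      <import location="Tokenizer.xml"/>
    </delegateAnalysisEngine>
    <delegateAnalysisEngine key="PosTagger">
      <import location="PosTagger.xml"/>
    </delegateAnalysisEngine>
  </delegateAnalysisEngineSpecifiers>

  <analysisEngineMetaData>
    <!-- Hypothetical name for the aggregate -->
    <name>OpenNlpAggregate</name>
    <flowConstraints>
      <!-- Delegates run in the listed order: sentences, then tokens, then POS tags -->
      <fixedFlow>
        <node>SentenceDetector</node>
        <node>Tokenizer</node>
        <node>PosTagger</node>
      </fixedFlow>
    </flowConstraints>
  </analysisEngineMetaData>
</analysisEngineDescription>
```

Each delegate keeps the parameters from its own descriptor, so the type names one annotator writes still have to match the ones the next annotator reads.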

"opennlp.uima.util.OpenNlpAnnotatorProcessException: The required parameter opennlp.uima.ContainerType can not be found!"
I am sorry about that. I updated the code but did not update the descriptor. Please get the corrected descriptor from the CVS repository. It is now also documented in the Javadoc of the SentenceDetector.
This indicates the missing parameter "opennlp.uima.ContainerType". But I do not know what value to set it to. Setting it to an arbitrary value returns an empty annotation. I assumed the SentenceDetector should run out of the box without dependencies since it is the first step in the pipeline.
Yes, that is true, I will make it optional.

Thanks for your response,
Jörn
