Tobias Wunner wrote:
Hello,
after switching from "org.apache.uima.examples" to the "opennlp.uima"
UIMA wrappers I wanted to create an Aggregate Engine using a Sentence
Detector, Tokenizer and POS Tagger. Each component comes in OpenNLP
1.4.3 UIMA wrappers with a descriptor file. An AggregateEngine
applying all 3 steps in sequence is not included though. I encountered
the following behavior of the components when trying to run them in
the CAS Visual Debugger:
1) Tokenizer: loads, runs, generates Token Annotation
(opennlp.uima.token)
2) PosTagger: loads, runs, no Annotation
I guess this is expected since Tokenization was not done before.
3) SentenceDetector: loads, does not run and generates error
The intended sequence in OpenNLP is a) Sentence Detector b) Tokenizer c)
POS Tagger. In my opinion it does not matter whether you run tokenization
or sentence detection first.
The POS Tagger cannot run without tokens and sentences. In your case it
did not output anything because it could not find any sentences.
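To illustrate why tokens alone are not enough, here is a minimal sketch in
plain Java. It is a toy model and not the actual opennlp.uima code: the
class PosTaggerFlowSketch, the Span type and the placeholder tag "X" are all
made up for this example. It only assumes that the tagger visits each
sentence and then tags the tokens covered by that sentence.

```java
import java.util.ArrayList;
import java.util.List;

public class PosTaggerFlowSketch {

    /** Half-open character span [begin, end); a hypothetical stand-in for an annotation. */
    public static final class Span {
        final int begin;
        final int end;

        public Span(int begin, int end) {
            this.begin = begin;
            this.end = end;
        }
    }

    /**
     * Toy tagger: walks over the sentences and tags every token that lies
     * inside the current sentence. "X" is a placeholder tag, no model is used.
     */
    public static List<String> tag(String text, List<Span> sentences, List<Span> tokens) {
        List<String> tagged = new ArrayList<>();
        for (Span sentence : sentences) {
            for (Span token : tokens) {
                boolean covered = token.begin >= sentence.begin && token.end <= sentence.end;
                if (covered) {
                    tagged.add(text.substring(token.begin, token.end) + "/X");
                }
            }
        }
        return tagged;
    }

    public static void main(String[] args) {
        String text = "It works.";
        List<Span> tokens = List.of(new Span(0, 2), new Span(3, 8), new Span(8, 9));

        // Tokens but no sentences: the outer loop never runs, so nothing is tagged.
        System.out.println(tag(text, List.of(), tokens));

        // One sentence covering the whole text: every token receives a tag.
        System.out.println(tag(text, List.of(new Span(0, 9)), tokens));
    }
}
```

Under that assumption, the first call yields an empty list even though the
tokens are present, which matches what you saw in the CAS Visual Debugger.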
"opennlp.uima.util.OpenNlpAnnotatorProcessException: The
required parameter opennlp.uima.ContainerType can not be found!"
I am sorry about that. I updated the code but did not update the
descriptor. Please get the corrected descriptor from the CVS repository.
It is now also documented in the Javadoc of the SentenceDetector.
This indicates the missing parameter
"opennlp.uima.ContainerType". But I do not know what value to set
it to. Setting it to any value returns an empty Annotation. I assumed
the SentenceDetector should run out of the box without dependencies
since it is the first step in the pipeline.
Yes, that is true, I will make it optional.
Thanks for your response,
Jörn