Re: descriptor files in opennlp.uima wrappers (e.g. SentenceDetector)

Tobias Wunner Tue, 28 Apr 2009 10:02:47 -0700

Hi Jörn,

thanks a lot for the fast fix. Everything works now!


Thanks.

Toby

On Apr 28, 2009, at 4:53 PM, Jörn Kottmann wrote:

Tobias Wunner wrote:
Hello,
after switching from "org.apache.uima.examples" to the "opennlp.uima" UIMA wrappers I wanted to create an Aggregate Engine using a Sentence Detector, Tokenizer and POS Tagger. Each component comes in the OpenNLP 1.4.3 UIMA wrappers with a descriptor file. An Aggregate Engine applying all 3 steps in sequence is not included though. I encountered the following behavior of the components when trying to run them in the CAS Visual Debugger:
 1) Tokenizer: loads, runs, generates Token annotation (opennlp.uima.token)
 2) PosTagger: loads, runs, no annotation

     I guess this is expected since Tokenization was not done before.

 3) SentenceDetector: loads, does not run and generates an error
The intended sequence in OpenNLP is a) Sentence Detector, b) Tokenizer, c) POS Tagger. In my opinion it does not matter whether you do tokenization or sentence detection first.
The POS Tagger cannot run without tokens and sentences. In your case it did not output anything because
it could not find any sentences.
"opennlp.uima.util.OpenNlpAnnotatorProcessException: The required parameter opennlp.uima.ContainerType can not be found!"
I am sorry about that. I updated the code but did not update the descriptor. Please get the corrected descriptor from the CVS repository. It is now also documented in the Javadoc of the SentenceDetector.
This indicates the missing parameter "opennlp.uima.ContainerType". But I do not know what value to set it to. Setting it to any value returns an empty annotation. I assumed the SentenceDetector should run out of the box without dependencies since it is the first step in the pipeline.
Yes, that is true, I will make it optional.

Thanks for your response,
Jörn
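
The ordering dependency discussed above (the POS tagger needs both sentence and token annotations, while the relative order of sentence detection and tokenization does not matter) can be illustrated with a small toy model. This is only a sketch under assumptions: it does not use OpenNLP or UIMA, and the names Document, detect_sentences, tokenize, tag_pos and run_pipeline are hypothetical helpers invented for the illustration, with naive regular expressions standing in for the real models.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Document:
    """Toy stand-in for a CAS: text plus annotation spans by type name."""
    text: str
    annotations: dict = field(default_factory=dict)


def detect_sentences(doc):
    # Naive rule (assumption): a sentence ends at '.', '!' or '?'.
    pattern = r"[^.!?\s][^.!?]*[.!?]?"
    doc.annotations["sentence"] = [
        (m.start(), m.end()) for m in re.finditer(pattern, doc.text)
    ]


def tokenize(doc):
    # Naive rule (assumption): runs of word characters or single symbols.
    doc.annotations["token"] = [
        (m.start(), m.end()) for m in re.finditer(r"\w+|[^\w\s]", doc.text)
    ]


def tag_pos(doc):
    # Tags only tokens that lie inside a sentence span, so with no
    # sentence annotations the tagger silently produces nothing.
    tags = []
    for s_begin, s_end in doc.annotations.get("sentence", []):
        for t_begin, t_end in doc.annotations.get("token", []):
            if s_begin <= t_begin and t_end <= s_end:
                word = doc.text[t_begin:t_end]
                tag = "WORD" if word[0].isalnum() else "PUNCT"
                tags.append((t_begin, t_end, tag))
    doc.annotations["pos"] = tags


def run_pipeline(text, steps):
    """Apply each step in order to a fresh Document, like an aggregate."""
    doc = Document(text)
    for step in steps:
        step(doc)
    return doc


if __name__ == "__main__":
    text = "It works. Thanks a lot!"
    full = run_pipeline(text, [detect_sentences, tokenize, tag_pos])
    no_sentences = run_pipeline(text, [tokenize, tag_pos])
    print(len(full.annotations["pos"]), "tags with sentences")
    print(len(no_sentences.annotations["pos"]), "tags without sentences")
```

Swapping detect_sentences and tokenize in the step list gives the same tags in this toy model, while leaving out detect_sentences yields an empty result without any error, which mirrors the "loads, runs, no annotation" behavior reported for the PosTagger.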
