It probably depends on your target data more than anything.  If you are
looking at newswire you'll have different requirements than if you are
looking at e-mail.  The other thing to consider is the cost of getting
it wrong: you could conceivably get good part-of-speech tagging results
even without great sentence boundaries, but if you are feeding a deep
parser it could break down a lot.
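
For what it's worth, a rule-based splitter for the easy cases is only a
few lines of Java. The sketch below is purely illustrative (the class
name and the regex are mine, nothing from the UIMA or OpenNLP APIs), and
the mis-split it makes after "Dr." is exactly the kind of error that's
easy to spot and patch with another rule:

import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

public class RuleBasedSentenceSplitter {

    // Break after ., ! or ? when followed by whitespace and a capital letter.
    private static final Pattern BOUNDARY =
            Pattern.compile("(?<=[.!?])\\s+(?=[A-Z])");

    public static List<String> split(String text) {
        List<String> sentences = new ArrayList<String>();
        for (String s : BOUNDARY.split(text)) {
            String trimmed = s.trim();
            if (!trimmed.isEmpty()) {
                sentences.add(trimmed);
            }
        }
        return sentences;
    }

    public static void main(String[] args) {
        // Note the mis-split after "Dr." -- that kind of error is easy to
        // see and easy to patch with another rule (e.g. an abbreviation list).
        for (String s : split("Dr. Smith arrived. He was late!")) {
            System.out.println(s);
        }
    }
}

The resulting spans would then just need to be written out as whatever
sentence annotation type the downstream POS/entity annotators expect.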

-----Original Message-----
From: jonathan doklovic [mailto:[EMAIL PROTECTED] 
Sent: Friday, December 14, 2007 12:55 PM
To: UIMA User
Subject: Sentence Rules vs. Models

Hi,

I've been playing around with the opennlp wrappers and will probably
make use of the entity detection, but I was wondering about the sentence
and token detection.

It seems that a model-based (statistical) approach may be overkill and
more of a pain to correct errors in.

I was wondering if there's any reason not to use a rule-based
sentence/token detector that then feeds the opennlp POS and entity
model-based annotators?

Any thoughts are welcome.

- Jonathan
