Did you modify the evaluation as well? If you only do it during training, the
evaluator will not be able to consider ":" as an EOS character.

To me it sounds like it fails to split on the ":" in some places.

The sentence detector uses a maxent model to classify every EOS character
as either a SPLIT or NO_SPLIT.
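
As a rough illustration of that two-step idea (find candidate EOS characters,
then classify each one), here is a minimal Python sketch. It is not OpenNLP
code: the names EOS_CHARS, candidate_positions, toy_classifier and
split_sentences are hypothetical, and a hand-written rule stands in for the
maxent model.

```python
# Toy illustration only: a hand-written rule stands in for the maxent model.
EOS_CHARS = {".", "!", "?", ":"}  # assumed candidate set, including ":"


def candidate_positions(text, eos_chars=EOS_CHARS):
    """Return the index of every character that could end a sentence."""
    return [i for i, ch in enumerate(text) if ch in eos_chars]


def toy_classifier(text, i):
    """Label the candidate at index i as SPLIT or NO_SPLIT.

    Rule of thumb: split at the end of the text, or when the candidate is
    followed by whitespace and then an uppercase letter.
    """
    rest = text[i + 1:]
    if not rest.strip():
        return "SPLIT"
    if rest[0].isspace() and rest.lstrip()[0].isupper():
        return "SPLIT"
    return "NO_SPLIT"


def split_sentences(text, eos_chars=EOS_CHARS, classify=toy_classifier):
    """Cut the text after every candidate that the classifier labels SPLIT."""
    sentences, start = [], 0
    for i in candidate_positions(text, eos_chars):
        if classify(text, i) == "SPLIT":
            sentence = text[start:i + 1].strip()
            if sentence:
                sentences.append(sentence)
            start = i + 1
    tail = text[start:].strip()
    if tail:
        sentences.append(tail)
    return sentences
```

A character that is not in the candidate set is never passed to the
classifier, so no split can happen there. In this sketch, leaving ":" out of
eos_chars keeps "Note: It costs 3.50 euros." together as one sentence.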

Jörn

On Thu, Feb 9, 2012 at 8:59 AM, Katrin Tomanek
<katrin.toma...@averbis.com> wrote:

> Hi Willian,
>
> I am currently using opennlp-1.5.2 and am trying to use it as an API, i.e. not
> to modify its code but to write my own code around it. However, what I
> described below (with the SDEventStream) results in the same as you are
> describing: I am changing the set of EOS characters.
>
> I am just wondering why adding ":" as an EOS character decreases the
> results (dropping from ~80F to 45F in sentence splitting, and ":" is always
> a sentence boundary symbol in my data!)
>
> Looks like I need to debug a little bit more to see what's happening in the
> DefaultSDContextGenerator.
>
