Did you modify the evaluation as well? If you only do it during training, the evaluator will not be able to consider ":" as an EOS character.
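To illustrate why the EOS set has to match on both sides, here is a minimal standalone sketch. It does not use any OpenNLP classes: `EosCandidateSketch` and `candidatePositions` are hypothetical names, and the default set of '.', '!' and '?' is an assumption about the configuration. It only shows that a scanner which does not know about ":" never proposes those positions, so nothing downstream can decide to split there.

```java
import java.util.ArrayList;
import java.util.List;

public class EosCandidateSketch {

    // Hypothetical helper, not an OpenNLP class: returns the index of every
    // character in text that is a member of the given EOS character set.
    public static List<Integer> candidatePositions(String text, char[] eosChars) {
        List<Integer> positions = new ArrayList<>();
        for (int i = 0; i < text.length(); i++) {
            for (char eos : eosChars) {
                if (text.charAt(i) == eos) {
                    positions.add(i);
                    break;
                }
            }
        }
        return positions;
    }

    public static void main(String[] args) {
        String text = "Diagnosis: none. Therapy: rest.";

        // Assumed default set, without ":".
        char[] defaultEos = {'.', '!', '?'};
        // Extended set as used during training in this thread.
        char[] extendedEos = {'.', '!', '?', ':'};

        // With the default set the two ":" positions are never candidates.
        System.out.println(candidatePositions(text, defaultEos));
        // With the extended set they show up and can be classified.
        System.out.println(candidatePositions(text, extendedEos));
    }
}
```

If training sees the extended set but evaluation scans with the default one, the ":" boundaries in the reference data can never be predicted at evaluation time.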
To me it sounds like it fails to split on the ":" in some places. The sentence detector uses a maxent model to classify every EOS character as either SPLIT or NO_SPLIT.

Jörn

On Thu, Feb 9, 2012 at 8:59 AM, Katrin Tomanek <katrin.toma...@averbis.com> wrote:

> Hi Willian,
>
> I am currently using opennlp-1.5.2 and try to use it as an API, i.e. not
> to modify this code but to write my own code around it. However, what I
> described below (with the SDEventStream) results in the same as you are
> describing: I am changing the set of EOS characters.
>
> I am just wondering why adding ":" as an EOS character decreases the
> results (dropping from ~80F to 45F in sentence splitting, and ":" is always
> a sentence boundary symbol in my data!).
>
> Looks like I need to debug a little bit more what's happening in the
> DefaultSDContextGenerator.
>