On Wed, Feb 8, 2012 at 3:16 PM, Katrin Tomanek
<katrin.toma...@averbis.com> wrote:

> Hi Jörn,
>
> Good, I'll have a look at the dev list tomorrow.
>
> But still a question on the EOS symbols:
>
> For some testing, I just overwrote the SentenceDetectorME.train(...)
> method, where I basically changed the way the EventStream was set up to:
>
> EventStream eventStream = new SDEventStream(sampleStreamTrain,
>        new DefaultSDContextGenerator(new char[]{'.', '!', '?', ':'}),
>        new DefaultEndOfSentenceScanner(new char[]{'.', '!', '?', ':'}));
>
>
> --> I thought that by doing so I would have added ":" as a possible
> sentence boundary. However, this did not really help; the model actually
> gets worse. Maybe I still misunderstood something in how the EOS symbols
> work?
>

The issue may be that the EOS symbols also need to be set in another place.
The SentenceDetectorME gets them from the Factory class, so changing
train(...) alone may not affect how sentences are detected later. You will
probably need to create a subclass of Factory.

http://svn.apache.org/viewvc/incubator/opennlp/trunk/opennlp-tools/src/main/java/opennlp/tools/sentdetect/lang/Factory.java?view=markup
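
To illustrate why the EOS characters matter on the detection side as well,
here is a toy sketch. It does not use any OpenNLP classes; EosScanDemo and
candidatePositions are hypothetical names made up for this example, and the
real scanner may behave differently. The point is only that a character
missing from the EOS set is never offered as a boundary candidate, no matter
what the model was trained on.

```java
import java.util.ArrayList;
import java.util.List;

public class EosScanDemo {

    // Hypothetical helper (not part of OpenNLP): returns the index of every
    // character in text that appears in eosChars.
    public static List<Integer> candidatePositions(String text, char[] eosChars) {
        List<Integer> positions = new ArrayList<>();
        for (int i = 0; i < text.length(); i++) {
            for (char eos : eosChars) {
                if (text.charAt(i) == eos) {
                    positions.add(i);
                    break; // one match per position is enough
                }
            }
        }
        return positions;
    }

    public static void main(String[] args) {
        String text = "Note: this works. Really!";
        char[] defaults = {'.', '!', '?'};
        char[] withColon = {'.', '!', '?', ':'};

        // With the default set, the ':' at index 4 is never a candidate.
        System.out.println(candidatePositions(text, defaults));  // prints [16, 24]
        // With ':' added, it becomes a candidate as well.
        System.out.println(candidatePositions(text, withColon)); // prints [4, 16, 24]
    }
}
```

If training uses the second character set but detection uses the first, the
colon cases seen in training can never be predicted at detection time.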

We are planning to make this process easier for the next release.

William
