On Wed, Feb 8, 2012 at 3:16 PM, Katrin Tomanek <katrin.toma...@averbis.com> wrote:
> Hi Jörn,
>
> Good, I'll have a look at the dev list tomorrow.
>
> But still a question on the EOS symbols:
>
> For some testing, I just overwrote the SentenceDetectorME.train(...)
> method, where I basically changed the way the EventStream was set up to:
>
> EventStream eventStream = new SDEventStream(sampleStreamTrain,
>     new DefaultSDContextGenerator(new char[]{'.', '!', '?', ':'}),
>     new DefaultEndOfSentenceScanner(new char[]{'.', '!', '?', ':'}));
>
> --> I thought doing so I would have added ":" as a possible sentence
> boundary. However, this did not really help -- the model rather gets worse.
> Maybe I still misunderstood something in how the EOS symbols work?

Maybe the issue is that the EOS symbols also need to be set in other places.
The SentenceDetectorME gets them from the Factory class, so changing them only
in train(...) is probably not enough. You may need to create a subclass of it.

http://svn.apache.org/viewvc/incubator/opennlp/trunk/opennlp-tools/src/main/java/opennlp/tools/sentdetect/lang/Factory.java?view=markup

We are planning to make this process easier for the next release.

William
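
P.S. To illustrate why setting the EOS symbols only at training time may not help, here is a minimal, self-contained sketch. It does not use the OpenNLP API: the class EosCandidateSketch and the method candidatePositions are hypothetical stand-ins for the idea of an end-of-sentence scanner. It also assumes that the detector builds its scanner from the Factory defaults, taken here to be '.', '!' and '?'. Under that assumption, a ':' that was a candidate during training is never proposed as a candidate at detection time.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in, NOT the OpenNLP API: it only mimics the idea that a
// scanner proposes candidate boundaries from a fixed set of EOS characters.
class EosCandidateSketch {

    // Returns the index of every character in text that is one of eosChars.
    static List<Integer> candidatePositions(String text, char[] eosChars) {
        List<Integer> positions = new ArrayList<>();
        for (int i = 0; i < text.length(); i++) {
            for (char eos : eosChars) {
                if (text.charAt(i) == eos) {
                    positions.add(i);
                    break; // each index is recorded at most once
                }
            }
        }
        return positions;
    }

    public static void main(String[] args) {
        String text = "Note: this works. Really?";

        // Characters passed explicitly when building the training events.
        char[] trainingEos = {'.', '!', '?', ':'};
        // Assumed defaults used when the detector is created for prediction.
        char[] detectionEos = {'.', '!', '?'};

        System.out.println(candidatePositions(text, trainingEos));  // [4, 16, 24]
        System.out.println(candidatePositions(text, detectionEos)); // [16, 24]
    }
}
```

The ':' at index 4 shows up only in the first list, so a model trained with it would never be asked about that position unless the detection side is given the same characters, for example through a Factory subclass.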