The artifactMap contains a manifest, which is a Properties object. You should store the EOS chars in this manifest. Since Properties values are Strings, we need a smart way to convert the chars into a String.
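
One possible conversion, as a minimal sketch: join the chars into a single String, store that under a manifest key, and split it back with toCharArray() when reading. The sketch uses only java.util.Properties, not the actual OpenNLP model classes. The class name, the helper names storeEosChars and loadEosChars, and the key "eosCharacters" are assumptions made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.Arrays;
import java.util.Properties;

public class EosCharsManifestSketch {

    // Hypothetical manifest key, chosen for this sketch only.
    static final String EOS_KEY = "eosCharacters";

    // Encode the EOS chars as one String so they fit into a Properties value.
    public static void storeEosChars(Properties manifest, char[] eosChars) {
        manifest.setProperty(EOS_KEY, new String(eosChars));
    }

    // Decode the EOS chars again; returns null if the key is absent,
    // so a caller can fall back to language dependent defaults.
    public static char[] loadEosChars(Properties manifest) {
        String value = manifest.getProperty(EOS_KEY);
        return value == null ? null : value.toCharArray();
    }

    public static void main(String[] args) throws IOException {
        Properties manifest = new Properties();
        storeEosChars(manifest, new char[] {'.', '!', '?'});

        // Round trip through the Properties text format to check that
        // the value survives serialization.
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        manifest.store(out, null);

        Properties loaded = new Properties();
        loaded.load(new ByteArrayInputStream(out.toByteArray()));

        System.out.println(Arrays.toString(loadEosChars(loaded)));
    }
}
```

Because the value is a plain String, no separate serializer for char[] is needed in this sketch. An old model without the key simply yields null.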

The Sentence Detector should then retrieve the EOS chars from the model, e.g. through a method getEosChars.

Have a look at the other model classes as well, e.g. the tokenizer model. It stores some settings in the manifest. That is a good place to look for a code sample.

Jörn

On Thu, Feb 9, 2012 at 12:38 PM, Katrin Tomanek <[email protected]> wrote:

> Hi,
>
> I am moving the discussion on making the EOS characters of the sentence
> splitter configurable to the dev list (it was previously on the user list).
>
> I am currently trying to make the EOS characters a parameter of the
> SentenceDetectorME and store it as model parameter.
>
> Thus far, this works fine (although it requires quite some positions in
> the code to change).
>
> I am putting a "char[] eosCharacters" to the artifactMap in SentenceModel.
> When predicting with a model, I test whether the eos parameter is set and
> if so I use these eos symbols, otherwise the language dependent ones.
>
> Anyways, I am now getting into troubles when serializing the model with
> the new "char[]" parameter:
>
> Writing sentence detector model ... Exception in thread "main"
> java.lang.IllegalStateException: Missing serializer for eosCharacters
>
> I know that I would have to write such a serializer, however, I am a bit
> lost here. Any hints (maybe there is already a serializer for char[] which
> I could easily use).
>
> Best
> Katrin
