Hello,

Chinese and Japanese use different punctuation characters, but passing them to the trainer tool (using -eosChars '。!?.!?') does not seem to do anything. The trained models have abysmal scores when evaluated with the SentenceDetectorEvaluator tool.

When I transform the 。 to . in the training data using sed and then train, the models have acceptable scores. I did notice that the eosChars do not seem to end up correctly in the manifest.properties file, where the entry becomes:

eosCharacters=\uFFFD\uFFFD\uFFFD\uFFFD\uFFFD\uFFFD\uFFFD\uFFFD\uFFFD.\!?

When I manually update the file to list 。!?.!?, nothing changes. What am I doing wrong?

Many thanks,
Markus