I am considering writing a grammar rule for Japanese. "ー" (the prolonged sound mark) is a common symbol in Japanese, and the rule itself is simple:
The mark may follow Hiragana or Katakana, but not Kanji. When LT tags a sentence that breaks this rule, the "ー" is tagged as an unknown word (未知語), as in the output below. Detecting the mistake is therefore not hard, but I don't have a good way to describe the correct usage. Can I write this as an LT grammar rule, or do I need to write a Java rule?

$ echo "不適切ー" | java -jar LanguageTool-2.9/languagetool-commandline.jar -t -l ja-JP -c UTF-8
Expected text language: Japanese
(no spell checking active, specify a language variant like 'en-GB' if available)
Working on STDIN...
<S> 不適切[不適切/名詞-形容動詞語幹]ー[ー/未知語,</S>]
Time: 296ms for 0 sentences (0.0 sentences/sec)
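To make the intended check concrete, here is a rough sketch in plain Java. It does not use any LanguageTool API; the class and method names (ProlongedSoundMarkCheck, findMisplaced, isKana) are made up for illustration, and it only looks at the character before each "ー" (U+30FC) using the JDK's Character.UnicodeScript. It assumes that a run of marks such as "あーー" is acceptable when the run itself follows kana, and it ignores the halfwidth mark (U+FF70). The source is kept ASCII-only with \u escapes so that it compiles regardless of file encoding.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of LanguageTool.
public class ProlongedSoundMarkCheck {

    // U+30FC KATAKANA-HIRAGANA PROLONGED SOUND MARK
    static final char MARK = '\u30FC';

    // True if the code point belongs to the Hiragana or Katakana script.
    static boolean isKana(int codePoint) {
        Character.UnicodeScript script = Character.UnicodeScript.of(codePoint);
        return script == Character.UnicodeScript.HIRAGANA
            || script == Character.UnicodeScript.KATAKANA;
    }

    // Returns the char index of every mark that does not follow kana.
    static List<Integer> findMisplaced(String text) {
        List<Integer> result = new ArrayList<>();
        for (int i = 0; i < text.length(); i++) {
            if (text.charAt(i) != MARK) {
                continue;
            }
            // Skip back over a run of marks; the mark itself has script
            // COMMON, so it must not be tested with isKana directly.
            int j = i - 1;
            while (j >= 0 && text.charAt(j) == MARK) {
                j--;
            }
            // Flag the mark if nothing precedes it or the preceding
            // character is not kana (for example, a Kanji).
            if (j < 0 || !isKana(text.codePointBefore(j + 1))) {
                result.add(i);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Three Kanji followed by the mark (the example from the command above).
        System.out.println(findMisplaced("\u4E0D\u9069\u5207\u30FC")); // prints [3]
        // Katakana word with two correctly placed marks.
        System.out.println(findMisplaced("\u30B3\u30FC\u30D2\u30FC")); // prints []
    }
}
```

The first call corresponds to "不適切ー" and the second to "コーヒー". Whether the same condition can be expressed in an LT grammar rule is exactly my question; the sketch only shows the logic I would otherwise put into a Java rule.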