Thank you, Silvan. I have learned more about Java regular expressions from this.

At Wed, 13 May 2015 20:43:06 +0200,
Silvan Jegen wrote:
> the following, using Unicode categories, should work as well.
>
> <token regexp="yes">\p{IsHan}+</token>
> <token >ー</token>

It seems IsHan matches only Han characters, so tokens of other character types, such as ASCII, would not be matched. I therefore changed the rule to use a negated combination of IsHiragana and IsKatakana, and it seems to work fine.

<token regexp="yes">[^\p{IsKatakana}\p{IsHiragana}]+</token>
<token >ー</token>
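
In case it helps anyone check the difference outside LanguageTool, here is a minimal sketch in plain Java that compares the two patterns. The class name `ScriptRegexDemo`, the helper names, and the sample tokens are my own, made up for illustration. It uses only `java.util.regex`, and it assumes that a token has to match the whole pattern, which may not be exactly how LanguageTool evaluates `<token>` elements.

```java
import java.util.regex.Pattern;

// Hypothetical demo class, not part of LanguageTool.
public class ScriptRegexDemo {
    // Pattern from the quoted suggestion: one or more Han-script characters.
    static final Pattern HAN = Pattern.compile("\\p{IsHan}+");

    // Pattern from the reply: one or more characters that are
    // neither Katakana nor Hiragana (so ASCII and Han both qualify).
    static final Pattern NOT_KANA =
            Pattern.compile("[^\\p{IsKatakana}\\p{IsHiragana}]+");

    // Assumption: the whole token must match the pattern.
    static boolean isHanOnly(String token) {
        return HAN.matcher(token).matches();
    }

    static boolean hasNoKana(String token) {
        return NOT_KANA.matcher(token).matches();
    }

    public static void main(String[] args) {
        // Sample tokens written as Unicode escapes to avoid source-encoding issues.
        String[] tokens = {
            "\u6F22\u5B57",             // two Han characters ("kanji")
            "ABC",                      // plain ASCII
            "\u30AB\u30BF\u30AB\u30CA", // four Katakana characters ("katakana")
            "\u3072\u3089\u304C\u306A"  // four Hiragana characters ("hiragana")
        };
        for (String token : tokens) {
            System.out.println(token
                    + " han=" + isHanOnly(token)
                    + " notKana=" + hasNoKana(token));
        }
    }
}
```

With these samples, the ASCII token is rejected by the IsHan pattern but accepted by the negated kana pattern, which is the behaviour I was after.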