Thank you, Silvan. I have learned more about Java regular expressions.

At Wed, 13 May 2015 20:43:06 +0200,
Silvan Jegen wrote:
> the following, using Unicode categories, should work as well. 
> 
>                               <token regexp="yes">\p{IsHan}+</token>
>                               <token >ー</token>

It seems IsHan matches only Han (unified ideograph) characters, so
other character types such as ASCII would not be matched.

So I switched to a negated character class combining IsHiragana and
IsKatakana, which seems to work fine.

                  <token regexp="yes">[^\p{IsKatakana}\p{IsHiragana}]+</token>
                  <token >ー</token>
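
To check the difference outside of LanguageTool, here is a minimal
sketch in plain Java. The class name ScriptRegexDemo and its helper
methods are made up for illustration, and it only exercises
java.util.regex, not how LanguageTool itself evaluates rule tokens.
The sample strings are written as Unicode escapes so the file compiles
regardless of source encoding.

```java
import java.util.regex.Pattern;

// Hypothetical demo class, not part of LanguageTool.
public class ScriptRegexDemo {
    // Matches strings made up only of Han script characters.
    static final Pattern HAN_ONLY = Pattern.compile("\\p{IsHan}+");
    // Matches strings containing no Katakana or Hiragana script characters.
    static final Pattern NO_KANA =
            Pattern.compile("[^\\p{IsKatakana}\\p{IsHiragana}]+");

    static boolean isHanOnly(String s) {
        return HAN_ONLY.matcher(s).matches();
    }

    static boolean hasNoKana(String s) {
        return NO_KANA.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "\u6F22\u5B57",             // two Han characters (U+6F22 U+5B57)
            "ABC",                      // ASCII letters
            "\u30AB\u30BF\u30AB\u30CA", // four Katakana characters
            "\u3072\u3089\u304C\u306A"  // four Hiragana characters
        };
        for (String s : samples) {
            System.out.println(s + " hanOnly=" + isHanOnly(s)
                    + " noKana=" + hasNoKana(s));
        }
    }
}
```

With these samples, the Han string matches both patterns, "ABC"
matches only the negated class, and the Katakana and Hiragana strings
match neither.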


_______________________________________________
Languagetool-devel mailing list
Languagetool-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/languagetool-devel
