Thank you, Daniel and Silvan.

On Wed, 13 May 2015 09:12:14 +0200
Daniel Naber <daniel.na...@languagetool.org> wrote:

> <token regexp="yes">[\u3040-\u309F]+</token>
> <token>X</token>

There are some exceptions, such as the long-consonant character (っ), but
ignoring them is acceptable because some slang breaks the rule anyway.

So I wrote a rule.

--- a/languagetool-language-modules/ja/src/main/resources/org/languagetool/rules/ja/grammar.xml
+++ b/languagetool-language-modules/ja/src/main/resources/org/languagetool/rules/ja/grammar.xml
@@ -72,6 +72,15 @@ Japanese Grammar and Typo Rules file for LanguageTool
                        <example><marker>おぼつかないです</marker></example>
                        <example correction=""><marker>おぼつきません</marker></example>
                </rule>
+               <rule id="SINGLE-MARKON" name="長音">
+                 <pattern case_sensitive="no">
+                   <token regexp="yes">[^\u3040-\u30FF]+</token>
+                   <token >ー</token>
+                 </pattern>
+                 <message>不適切な長音符</message>
+                 <example><marker>隣家</marker>ー</example>
+                 <example correction="リンカ"><marker>リンカ</marker>ー</example>
+               </rule>

It works:
(wrong example)
$ echo "隣家ー" | java -jar languagetool-standalone/target/LanguageTool-3.0-SNAPSHOT/LanguageTool-3.0-SNAPSHOT/languagetool-commandline.jar --rulefile ./grammar.xml -l ja-JP -c UTF-8
Expected text language: Japanese (no spell checking active, specify a language variant like 'en-GB' if available)
Working on STDIN...
1.) Line 1, column 1, Rule ID: SINGLE-MARKON[1]
Message: 不適切な長音符
隣家ー
^^^
Time: 400ms for 1 sentences (2.5 sentences/sec)

(correct example)
$ echo "リンカー" | java -jar languagetool-standalone/target/LanguageTool-3.0-SNAPSHOT/LanguageTool-3.0-SNAPSHOT/languagetool-commandline.jar --rulefile ./grammar.xml -l ja-JP -c UTF-8
Expected text language: Japanese (no spell checking active, specify a language variant like 'en-GB' if available)
Working on STDIN...
Time: 295ms for 1 sentences (3.4 sentences/sec)
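
For anyone who wants to sanity-check the character class outside
LanguageTool, here is a minimal sketch in Python. It only approximates the
rule with a plain regular expression over the raw string; it does not
reproduce LanguageTool's Japanese tokenization, and the function name
has_misplaced_long_mark is hypothetical, chosen just for this illustration.

```python
import re

# Approximation of the rule's pattern: one or more characters outside the
# hiragana/katakana blocks (U+3040-U+30FF), immediately followed by the
# long vowel mark "ー" (U+30FC). This is an assumption-level stand-in for
# the two <token> elements, not LanguageTool's actual matching logic.
PATTERN = re.compile(r"[^\u3040-\u30FF]+ー")


def has_misplaced_long_mark(text: str) -> bool:
    """Return True if a long vowel mark follows non-kana characters."""
    return PATTERN.search(text) is not None


print(has_misplaced_long_mark("隣家ー"))    # kanji followed by the mark
print(has_misplaced_long_mark("リンカー"))  # katakana followed by the mark
```

With the two inputs from the runs above, the first call should report a
match and the second should not, which agrees with the command-line output.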

It seems good. I'll make a pull request.

