Hi
In the Breton disambiguation file
languagetool-language-modules/br/target/classes/org/languagetool/resource/br/disambiguation.xml
I have the following immunization rule:
<rule id="FRANCE_3" name="France 3">
<pattern>
<token>France</token>
<token regexp="yes">[23]|Bleue</token>
</pattern>
<disambig action="immunize"/>
</rule>
Yet I get this kind of error:
$ echo "France 3 a zo ur chadenn skinwel." | \
  java -jar \
  languagetool/languagetool-standalone/target/LanguageTool-2.1-beta1/LanguageTool-2.1-beta1/languagetool-commandline.jar \
  -l br
Expected text language: Breton
Working on STDIN...
1.) Line 1, column 1, Rule ID: BR_TOPO
Message: France zo un anv lec’h gallek. Ha fellout a rae deoc’h
skrivañ 'Frañs' pe 'bro-C’hall'?
Suggestion: Frañs; bro-C’hall
France 3 a zo ur chadenn skinwel.
^^^^^^
Isn't this a bug? Since the words "France 3" are immunized, I expected
no rule to flag them.
My guess is that this happens because BR_TOPO is a Java rule
and immunization is somehow not applied to Java rules.
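To make that assumption concrete, here is a minimal, self-contained sketch.
It is not LanguageTool code: the Token class, its immunized flag, and both
check methods are hypothetical stand-ins. It only shows how a hand-written
rule could keep reporting a token unless it tests the flag itself.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ImmunizationSketch {

    // Hypothetical stand-in for an analyzed token; not a LanguageTool class.
    static final class Token {
        final String text;
        final boolean immunized;

        Token(String text, boolean immunized) {
            this.text = text;
            this.immunized = immunized;
        }
    }

    // A hand-written rule that never looks at the flag:
    // it reports "France" whether or not the token is immunized.
    static List<String> checkIgnoringFlag(List<Token> tokens) {
        List<String> matches = new ArrayList<>();
        for (Token t : tokens) {
            if (t.text.equals("France")) {
                matches.add(t.text);
            }
        }
        return matches;
    }

    // The same rule, but skipping tokens that carry the immunized flag.
    static List<String> checkHonoringFlag(List<Token> tokens) {
        List<String> matches = new ArrayList<>();
        for (Token t : tokens) {
            if (!t.immunized && t.text.equals("France")) {
                matches.add(t.text);
            }
        }
        return matches;
    }

    // "France 3 a zo", with the first two tokens marked as immunized
    // (an assumption about what the FRANCE_3 rule is meant to do).
    static List<Token> sampleSentence() {
        return Arrays.asList(
            new Token("France", true),
            new Token("3", true),
            new Token("a", false),
            new Token("zo", false));
    }

    public static void main(String[] args) {
        List<Token> sentence = sampleSentence();
        System.out.println(checkIgnoringFlag(sentence)); // prints [France]
        System.out.println(checkHonoringFlag(sentence)); // prints []
    }
}
```

If BR_TOPO behaves like the first method, that would explain the error
above, but I have not checked its source.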
Another oddity is the output of the verbose mode with the disambiguation rule:
$ echo "France 3 a zo ur chadenn skinwel." | \
  java -jar \
  languagetool/languagetool-standalone/target/LanguageTool-2.1-beta1/LanguageTool-2.1-beta1/languagetool-commandline.jar \
  -l br -v
Expected text language: Breton
Working on STDIN...
566 rules activated for language Breton
<S> France[France/Z e s top] 3[3] a[mont/V pres 3 s,mont/V impe 2
s,monet/V pres 3 s,monet/V impe 2 s,a/P,a/N m sp,a/L a] zo[teiñ/V
pres 3 s M:2:,teiñ/V impe 2 s M:2:,bezañ/V pres 3 s] ur[un/D e sp]
chadenn[chadenn/N f s] skinwel[skinwel/N m s].[</S>]<P/>
Disambiguator log:
FRANCE_3:1 France[France/Z e s top*] -> France[France/Z e s top*]
UR_N:1 chadenn[chadennañ/V pres 3 s,chadennañ/V impe 2 s,chadenn/N f
s] -> chadenn[chadenn/N f s]
1.) Line 1, column 1, Rule ID: BR_TOPO
Message: France zo un anv lec’h gallek. Ha fellout a rae deoc’h
skrivañ 'Frañs' pe 'bro-C’hall'?
Suggestion: Frañs; bro-C’hall
France 3 a zo ur chadenn skinwel.
^^^^^^
Notice that the verbose mode outputs:
FRANCE_3:1 France[France/Z e s top*] -> France[France/Z e s top*]
This is odd: the disambiguation rule contains 2 tokens and I did not put
any marker in it, so why does the log show only the first token of the
rule?
Regards
Dominique