Anti-patterns and immunization

2014-02-25 Thread Marcin Miłkowski
Hi all,

I have a simple idea for how to implement the anti-patterns that we talked 
about earlier on this list (see 
http://wiki.languagetool.org/xml-pattern-rule-extensions). The idea is 
to extend the current immunization by adding a simple parameter: a list 
of rule IDs for which the immunization is supposed to work. The pattern 
rule matcher (and the rest of the matchers) would simply check whether 
their own rule ID is on the list, and that would stop them from matching.
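
A minimal Java sketch of the check described above. Every name in it (ImmunizationSketch, Token, immunizedFor, mayMatch) is hypothetical and is not LanguageTool's actual API, and the rule IDs are only illustrative. It just shows a token carrying a list of rule IDs and a matcher consulting that list before it matches:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch; none of these types are LanguageTool's real classes.
class ImmunizationSketch {

    /** A token together with the IDs of the rules it is immunized against. */
    static final class Token {
        final String text;
        final boolean immunizedForAll;       // roughly the current behaviour: immune to every rule
        final Set<String> immunizedRuleIds;  // the proposed parameter: immune only to these rules

        Token(String text, boolean immunizedForAll, Set<String> immunizedRuleIds) {
            this.text = text;
            this.immunizedForAll = immunizedForAll;
            this.immunizedRuleIds = immunizedRuleIds;
        }

        /** Convenience factory: immunize the token for the listed rule IDs only. */
        static Token immunizedFor(String text, String... ruleIds) {
            return new Token(text, false, new HashSet<>(Arrays.asList(ruleIds)));
        }

        boolean isImmunizedFor(String ruleId) {
            return immunizedForAll || immunizedRuleIds.contains(ruleId);
        }
    }

    /** What a matcher would ask before matching on a token: is my own rule ID on the list? */
    static boolean mayMatch(String ruleId, Token token) {
        return !token.isImmunizedFor(ruleId);
    }
}
```

With a token created as Token.immunizedFor("had", "WORD_REPEAT_RULE"), mayMatch is false for that rule ID and true for any other, so other rules can still report matches on the same token.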

The only drawback is that we would store the antipatterns separately 
from the rules for which they apply but, on the other hand, we could 
apply them to several rules at the same time and save time and space 
loading them.

Of course, we could just as easily add the same code to the handlers of 
XML pattern rules, so that they could use the same construct. That would 
be a little more complex, though, as we would actually have to add these 
rules to the disambiguation rules for a given language (and implement 
disambiguation for all languages that want to use anti-patterns). Another 
possibility is to use the same code to immunize tokens for particular 
rules. Note, however, that sharing the same antipatterns among several 
rules in the same rulegroup may be a little bit tricky.

What do you think? Adding a rule ID to the current immunization rules 
seems very easy, and may do all the work we need.

Regards,
Marcin

___
Languagetool-devel mailing list
Languagetool-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/languagetool-devel


Re: Anti-patterns and immunization

2014-02-25 Thread Daniel Naber
On 2014-02-25 10:47, Marcin Miłkowski wrote:

> The only drawback is that we would store the antipatterns separately
> from the rules for which they apply but,

I think this is a problem... I also want the source code to be simple, 
but considering we have 10,000 rules or so, we shouldn't accept the 
rules becoming more difficult to understand just because it makes our 
code a bit simpler.

Regards
  Daniel




Re: Anti-patterns and immunization

2014-02-25 Thread Marcin Miłkowski
On 2014-02-25 15:57, Daniel Naber wrote:
> On 2014-02-25 10:47, Marcin Miłkowski wrote:
>
>> The only drawback is that we would store the antipatterns separately
>> from the rules for which they apply but,
>
> I think this is a problem... I also want the source code to be simple,
> but considering we have 10,000 rules or so, we shouldn't accept the
> rules becoming more difficult to understand just because it makes our
> code a bit simpler.

Note that Java rules cannot have selective anti-patterns right now, so 
we need to immunize tokens for all possible rules, which is unsafe, as 
we may suppress genuine rule matches as well. I use immunization to 
suppress the word repeat rule for several idiomatic expressions, as this 
is very easy. Adding exceptions that cover multi-word sequences to a Java 
rule has always been a pain.

I think we need both targeted immunization in the disambiguator (mostly 
for cases that span multiple rules, and for Java rules such as the word 
repeat rule) and anti-patterns. That would be flexible and simple for 
rule creators.
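
To illustrate the Java-rule case, here is a rough sketch of a word repeat check that skips only the tokens immunized for its own rule ID. All names (WordRepeatSketch, RULE_ID, immunize, findRepeats, tokenize) are made up for this example and are not the real LanguageTool classes, and the phrase "had had" only stands in for whatever multi-word expression a disambiguator rule would immunize:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; not LanguageTool's real word repeat rule or disambiguator.
class WordRepeatSketch {
    static final String RULE_ID = "WORD_REPEAT_SKETCH"; // made-up rule ID

    static final class Token {
        final String text;
        final Set<String> immuneTo = new HashSet<>(); // rule IDs this token is immunized for
        Token(String text) { this.text = text; }
    }

    /** Naive whitespace tokenizer, just enough for the example. */
    static List<Token> tokenize(String sentence) {
        List<Token> tokens = new ArrayList<>();
        for (String word : sentence.trim().split("\\s+")) {
            tokens.add(new Token(word));
        }
        return tokens;
    }

    /** Stand-in for a disambiguator rule: immunize each occurrence of a phrase for the given rule IDs. */
    static void immunize(List<Token> tokens, String phrase, String... ruleIds) {
        String[] words = phrase.trim().split("\\s+");
        for (int i = 0; i + words.length <= tokens.size(); i++) {
            boolean hit = true;
            for (int j = 0; j < words.length && hit; j++) {
                hit = tokens.get(i + j).text.equalsIgnoreCase(words[j]);
            }
            if (hit) {
                for (int j = 0; j < words.length; j++) {
                    tokens.get(i + j).immuneTo.addAll(Arrays.asList(ruleIds));
                }
            }
        }
    }

    /** Indices of tokens that repeat the previous token, unless immunized for this rule's ID. */
    static List<Integer> findRepeats(List<Token> tokens) {
        List<Integer> repeats = new ArrayList<>();
        for (int i = 1; i < tokens.size(); i++) {
            Token previous = tokens.get(i - 1);
            Token current = tokens.get(i);
            if (current.text.equalsIgnoreCase(previous.text)
                    && !current.immuneTo.contains(RULE_ID)) {
                repeats.add(i);
            }
        }
        return repeats;
    }
}
```

For "they had had enough of the the noise", immunizing "had had" for RULE_ID leaves only the repeated "the" reported. Immunizing the same phrase for some other rule ID would not affect this check at all, which is the point of making the immunization targeted.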

Regards,
Marcin
