Anti-patterns and immunization
Hi all,

I have a simple idea for implementing the anti-patterns we discussed earlier on this list (see http://wiki.languagetool.org/xml-pattern-rule-extensions). The idea is to extend the current immunization with one parameter: a list of rule IDs to which the immunization applies. The pattern rule matcher (and the other matchers) would simply check whether their own rule ID is on the list and, if it is, would not match.

The only drawback is that the anti-patterns would be stored separately from the rules they apply to. On the other hand, one anti-pattern could apply to several rules at once, which saves time and space when loading them.

Of course, the same code could easily be added to the handlers of XML pattern rules, so that the same construct could be used there. That would be a little more complex, though, as we would actually have to add these rules to the disambiguation rules for a given language (and implement disambiguation for every language that wants to use anti-patterns). Another possibility is to use the same code to immunize tokens for particular rules. Note, however, that sharing the same anti-patterns among several rules in the same rulegroup may be a little tricky.

What do you think? Adding a rule ID to the current immunization rules seems very easy and may do all the work we need.

Regards,
Marcin

___
Languagetool-devel mailing list
Languagetool-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/languagetool-devel
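A minimal sketch of the proposal in the message above, assuming a token carries a set of rule IDs it is immunized for. Every class, method, and rule ID here is hypothetical and chosen only for illustration; this is not LanguageTool's actual API.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch; none of these names are LanguageTool's actual API.
final class TargetedImmunizationSketch {

  /** A token that can be immunized for all rules or only for selected rule IDs. */
  static final class Token {
    final String text;
    private boolean immunizedForAll = false;
    private final Set<String> immunizedRuleIds = new HashSet<>();

    Token(String text) {
      this.text = text;
    }

    /** Untargeted immunization: no rule may match this token. */
    void immunize() {
      immunizedForAll = true;
    }

    /** Proposed extension: only the listed rules are blocked. */
    void immunizeFor(String... ruleIds) {
      immunizedRuleIds.addAll(Arrays.asList(ruleIds));
    }

    boolean isImmunizedFor(String ruleId) {
      return immunizedForAll || immunizedRuleIds.contains(ruleId);
    }
  }

  /** The check a matcher would run with its own rule ID before reporting a match. */
  static boolean mayMatch(String ruleId, Token... tokens) {
    for (Token token : tokens) {
      if (token.isImmunizedFor(ruleId)) {
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    Token first = new Token("had");
    Token second = new Token("had");
    // Illustrative rule IDs, assumed for this sketch only.
    first.immunizeFor("WORD_REPEAT_SKETCH");
    second.immunizeFor("WORD_REPEAT_SKETCH");
    System.out.println(mayMatch("WORD_REPEAT_SKETCH", first, second)); // false
    System.out.println(mayMatch("OTHER_RULE_SKETCH", first, second));  // true
  }
}
```

In this sketch a single immunization entry can name several rule IDs, which corresponds to the point about applying one anti-pattern to several rules at once.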
Re: Anti-patterns and immunization
On 2014-02-25 10:47, Marcin Miłkowski wrote:

> The only drawback is that we would store the antipatterns separately
> from the rules for which they apply but,

I think this is a problem... I also want the source code to be simple, but considering that we have 10,000 rules or so, we shouldn't accept the rules becoming harder to understand just because it makes our code a bit simpler.

Regards
Daniel
Re: Anti-patterns and immunization
On 2014-02-25 15:57, Daniel Naber wrote:

> On 2014-02-25 10:47, Marcin Miłkowski wrote:
>
>> The only drawback is that we would store the antipatterns separately
>> from the rules for which they apply but,
>
> I think this is a problem... I also want the source code to be simple,
> but considering we have 10,000 rules or so, we shouldn't accept the
> rules becoming more difficult to understand just because it makes our
> code a bit simpler.

Note that Java rules cannot have selective anti-patterns right now, so we have to immunize tokens against all rules. That is unsafe, because it may suppress genuine rule matches as well. I use immunization to suppress the word repeat rule for several idiomatic expressions, as this is very easy. Adding exceptions that cover multi-word sequences to a Java rule has always been a pain.

I think we need both targeted immunization in the disambiguator (mostly for cases that span multiple rules, and for Java rules such as the word repeat rule) and anti-patterns. That would be flexible and simple for rule creators.

Regards,
Marcin
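To illustrate the word repeat case from the message above, here is a rough sketch of how a Java rule might honor targeted immunization. It assumes each token records the rule IDs it has been immunized for; the class, the method names, and the rule ID are all hypothetical and do not come from LanguageTool's code base.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; not LanguageTool's actual classes or rule IDs.
final class WordRepeatSketch {
  static final String RULE_ID = "WORD_REPEAT_SKETCH";

  static final class Token {
    final String text;
    final Set<String> immunizedFor = new HashSet<>();

    // ruleIds: the rules this token is immunized for (empty means none).
    Token(String text, String... ruleIds) {
      this.text = text;
      immunizedFor.addAll(Arrays.asList(ruleIds));
    }
  }

  /** Returns the index of the second token of every reported repetition. */
  static List<Integer> findRepeats(List<Token> tokens) {
    List<Integer> matches = new ArrayList<>();
    for (int i = 1; i < tokens.size(); i++) {
      Token prev = tokens.get(i - 1);
      Token cur = tokens.get(i);
      boolean repeated = prev.text.equalsIgnoreCase(cur.text);
      // Skip only when both tokens were immunized for this specific rule.
      boolean immunized = prev.immunizedFor.contains(RULE_ID)
          && cur.immunizedFor.contains(RULE_ID);
      if (repeated && !immunized) {
        matches.add(i);
      }
    }
    return matches;
  }

  public static void main(String[] args) {
    List<Token> tokens = Arrays.asList(
        new Token("he"),
        new Token("had", RULE_ID), // idiomatic "had had", immunized for this rule only
        new Token("had", RULE_ID),
        new Token("the"),
        new Token("the"),          // genuine repetition, still reported
        new Token("book"));
    System.out.println(findRepeats(tokens)); // [4]
  }
}
```

Because the immunization names only this rule's ID, any other rule looking at the same tokens is unaffected, which is the safety concern raised about immunizing tokens for all rules.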