On 2014-01-28 11:58, Kumara Bhikkhu wrote:
> <rule id="CONDUCT_A" name="'Conduct' nominalization">
> <pattern>
> <token regexp="yes">conduct.*</token>

Try:

<token inflected="yes">conduct</token>

instead.

I tried your rule using our rule editor:

http://community.languagetool.org/ruleEditor/expert

and we get lots of matches, but some of them are quite useless, for example:

Mozart's Davide penitente (1785), his Piano Concerto KV 482 (1785), the Clarinet Quintet (1789) and the 40th Symphony (1788) had been premiered on the suggestion of Salieri, who supposedly conducted a performance of it in 1791. (Wikipedia)

You can use the following queries on the corpus.byu.edu corpora:

conduct a|an|the|no [n*] of|into
conduct a|an|the|no [n*] * of|into
conduct a|an|the|no [n*] * * of|into
conduct a|an|the|no [n*] * * * of|into

You will see that you also need to exclude other words, such as 'business', 'range', 'bit', and so forth.

Could you refine your rule? Right now it's really greedy and will result in lots of false alarms, besides the matches that really detect redundant phrases.

Thanks,
Marcin

_______________________________________________
Languagetool-devel mailing list
Languagetool-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/languagetool-devel
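
For illustration only, the suggestions in the reply could be combined roughly as follows. This is a sketch under assumptions, not the rule that was eventually committed: the quoted rule is cut off after its first token, so the remaining tokens here are modelled on the first corpus query (conduct a|an|the|no [n*] of|into), the noun tag pattern NN.* assumes the English tagset, and the message and example texts are hypothetical placeholders. The inflected token is used because the regular expression conduct.* also matches unrelated words such as "conductor" and "conductivity".

```xml
<rule id="CONDUCT_A" name="'Conduct' nominalization">
  <pattern>
    <!-- matches conduct, conducts, conducted, conducting; not "conductor" -->
    <token inflected="yes">conduct</token>
    <token regexp="yes">a|an|the|no</token>
    <!-- any noun (assumed tag pattern), minus the nouns named in the reply;
         the exception list is incomplete and needs extending from corpus results -->
    <token postag="NN.*" postag_regexp="yes"><exception regexp="yes">business|range|bit</exception></token>
    <token regexp="yes">of|into</token>
  </pattern>
  <!-- hypothetical wording, not taken from the original rule -->
  <message>This phrase may be wordy. Consider using a single verb instead.</message>
  <!-- hypothetical example sentence -->
  <example type="incorrect">They <marker>conducted an investigation of</marker> the incident.</example>
</rule>
```

Note that this sketch would still flag the Salieri sentence quoted above, since "performance" is not in the exception list; nouns like that would have to be added after checking the corpus queries.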