On 2014-01-28 11:58, Kumara Bhikkhu wrote:
> <rule id="CONDUCT_A" name="'Conduct' nominalization">
> <pattern>
> <token regexp="yes">conduct.*</token>

Try:

          <token inflected="yes">conduct</token>

instead.

I tried your rule using our rule editor:

http://community.languagetool.org/ruleEditor/expert

and it gets lots of matches, but some of them are quite useless, for
example:

Mozart's Davide penitente (1785), his Piano Concerto KV 482 (1785), the 
Clarinet Quintet (1789) and the 40th Symphony (1788) had been premiered 
on the suggestion of Salieri, who supposedly conducted a performance of 
it in 1791. (wikipedia)

You can use the following queries on corpus.byu.edu corpora:

conduct a|an|the|no [n*] of|into

conduct a|an|the|no [n*] * of|into

conduct a|an|the|no [n*] * * of|into

conduct a|an|the|no [n*] * * * of|into

You will see that you also need to exclude other words, such as
'business', 'range', 'bit', and so forth. Could you refine your rule?
Right now it's really greedy and will produce lots of false alarms
besides the matches that really are redundant phrases.
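
To make the suggestion concrete, here is a rough sketch of what a narrower
pattern might look like. It is only an illustration, not a tested rule:
the determiner list and the of|into ending are taken from the corpus
queries above, the exceptions are just the three words I mentioned, and
the NN.* tag regexp and skip="3" (standing in for the zero to three
wildcards in the queries) are my assumptions about how you would want to
express this. The message and examples are left out because they are not
shown in this thread.

```xml
<pattern>
  <!-- all inflected forms of the verb: conduct, conducts, conducted, conducting -->
  <token inflected="yes">conduct</token>
  <!-- determiners taken from the corpus queries -->
  <token regexp="yes">an?|the|no</token>
  <!-- a noun (tag regexp is an assumption), minus words that gave false alarms;
       skip="3" lets up to three tokens intervene before the next token -->
  <token postag="NN.*" postag_regexp="yes" skip="3"><exception regexp="yes">business|range|bit</exception></token>
  <token regexp="yes">of|into</token>
</pattern>
```

The exception list would certainly need to grow once you go through the
corpus results, and pasting the pattern into the rule editor should show
how many of the useless matches remain.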

Thanks,
Marcin

_______________________________________________
Languagetool-devel mailing list
Languagetool-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/languagetool-devel