Hi,

There's now a first, limited implementation of the <regexp> syntax in 
master. Instead of

<pattern><token>foo</token></pattern>

you can now use

<regexp>foo</regexp>

Be aware that this is a real regular expression that ignores token 
boundaries, so it matches any text containing the substring 'foo'. Also, 
the regular expression is case-insensitive by default. You can have a 
look at the German grammar.xml for many examples.
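
To illustrate the difference between substring and token matching, here 
is a rough sketch in plain Java. It does not use LanguageTool's code: the 
class and method names are made up, the token check is a simplified 
stand-in for <token>, and I'm assuming the regexp behaves like a 
case-insensitive java.util.regex search.

```java
import java.util.regex.Pattern;

public class RegexpVsToken {

    // Assumption: <regexp> acts like a case-insensitive search anywhere in the text.
    static boolean regexpMatches(String regex, String sentence) {
        return Pattern.compile(regex, Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE)
                .matcher(sentence)
                .find();
    }

    // Simplified stand-in for <token>: a whole whitespace-separated word must match.
    static boolean tokenMatches(String token, String sentence) {
        for (String word : sentence.split("\\s+")) {
            if (word.equalsIgnoreCase(token)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String sentence = "I like seaFOOd";
        System.out.println(regexpMatches("foo", sentence));         // true: substring, any case
        System.out.println(tokenMatches("foo", sentence));          // false: no word equals "foo"
        System.out.println(regexpMatches("\\bfoo\\b", sentence));   // false: word boundaries required
    }
}
```

If a converted rule matches too much, adding \b word boundaries to the 
regexp is one way to get closer to the old token behaviour.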

To make use of these, you can adapt and run RuleSimplifier in the dev 
package. It tries to convert simple rules automatically, but it's just a 
hack: the new rules need to be tested and adapted manually. It also only 
touches rules without '<marker>' elements. There's no <marker> for 
regexp; it's always the complete match that will be underlined. You 
cannot use the regexp to access the part-of-speech tags of the match. 
Replacements are also limited, e.g. changing case currently doesn't 
work. By using \1 you can access the first matching group, i.e. the 
first parenthesized group of the regexp, \2 the second, and so on.
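
As a rough picture of what \1 refers to, here is a small sketch in plain 
Java. The expand() helper is hypothetical and not part of LanguageTool; 
it only shows how a \1 in a replacement text maps to the first 
parenthesized group of the match.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GroupReference {

    // Hypothetical helper: fills \1, \2, ... in a template with the matched groups.
    static String expand(String regex, String text, String template) {
        Matcher m = Pattern.compile(regex, Pattern.CASE_INSENSITIVE).matcher(text);
        if (!m.find()) {
            return null; // no match, nothing to suggest
        }
        String result = template;
        // Go from the highest group down so that \1 does not clobber \10.
        for (int i = m.groupCount(); i >= 1; i--) {
            String group = m.group(i) == null ? "" : m.group(i);
            result = result.replace("\\" + i, group);
        }
        return result;
    }

    public static void main(String[] args) {
        // Group 1 is the text matched by "(foo)", with the case found in the input.
        System.out.println(expand("(foo)bar", "a FOObar b", "\\1 bar")); // prints: FOO bar
    }
}
```

Note that the group is inserted exactly as it was matched, which is in 
line with case changes not being available at the moment.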

Please let me know how this works for you.

Regards
  Daniel

