On 2014-04-27 22:18, Dominique Pellé wrote:

> I wish I could check the POS tag of a portion of
> a token.

(Replying to an old thread here...)

I think the new rule filter offers a solution for this that does not
require any changes to the XML. Mike posted an example where he wanted
to match in(.*) and un(.*), but only if the matching part is an
adjective. It might work with a PartialPosTagFilter (that is yet to be
developed) like this:

  <pattern>
    <token regexp="yes">(in|un).*</token>
  </pattern>
  <filter class="org.lt.PartialPosTagFilter"
          args="no:1 regexp:(?:in|un)(.*) postag_regexp:JJ message:'Approved adjective, but not-approved prefix. Use NOT + &lt;adjective&gt;.'"/>

no:1 refers to the relevant token
regexp:(?:in|un)(.*) specifies the part of the token to be checked
postag_regexp:JJ checks whether the POS tag of the part captured by
  the regex matches 'JJ' (adjective)
message: the message to be shown if postag_regexp matches; if it
  doesn't, the rule match will be discarded

So as with any rule filter, we have a rule that matches candidates, and
a filter that keeps only those matches that are actually useful.

What do you think?

Regards
 Daniel
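
P.S. To make the idea concrete, here is a rough, standalone sketch of the check such a filter might perform. It does not use the LanguageTool API: the class name PartialPosTagSketch, the accept() method, and the toy lexicon are all made up for illustration, and a real filter would ask the language's tagger for the POS tags instead of a hard-coded map.

```java
import java.util.Map;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PartialPosTagSketch {

    // Hypothetical stand-in for a real tagger: maps a word to its POS tags.
    static final Map<String, Set<String>> TOY_LEXICON = Map.of(
        "approved", Set.of("JJ", "VBD", "VBN"),
        "clear", Set.of("JJ", "VB"),
        "form", Set.of("NN", "VB"));

    /**
     * Returns true if the part of the token captured by group 1 of
     * partRegex has at least one POS tag that fully matches postagRegex.
     * Returning false corresponds to discarding the rule match.
     */
    static boolean accept(String token, String partRegex, String postagRegex) {
        Matcher m = Pattern.compile(partRegex).matcher(token);
        // The regex must match the whole token and capture a part to check.
        if (!m.matches() || m.groupCount() < 1 || m.group(1) == null) {
            return false;
        }
        String part = m.group(1);
        Pattern tagPattern = Pattern.compile(postagRegex);
        for (String tag : TOY_LEXICON.getOrDefault(part, Set.of())) {
            if (tagPattern.matcher(tag).matches()) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String regex = "(?:in|un)(.*)";
        System.out.println(accept("unapproved", regex, "JJ")); // true: "approved" can be JJ
        System.out.println(accept("inform", regex, "JJ"));     // false: "form" is not JJ
    }
}
```

The second call shows why the filter is needed: "inform" matches the token regexp (in|un).* but the remaining part is not an adjective, so the match would be dropped.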