Re: Suggestion: find POS tag of portion of a word in XML rules

Daniel Naber Tue, 02 Dec 2014 02:26:53 -0800

On 2014-09-09 13:50, Daniel Naber wrote:

> I think the new rule filter offers a solution for this that does not
> require any changes to the XML. Mike posted an example where he wanted
> to match in(.*) and un(.*), but only if the matching part is an
> adjective.


I've just added such a filter. It can be used like this:

      <pattern>
          <token regexp="yes">in.*</token>
      </pattern>
      <filter class="org.languagetool.rules.en.EnglishPartialPosTagFilter"
              args="no:1 regexp:in(.*) postag_regexp:JJ"/>

This will only keep matches for words that start with 'in' and where the 
part after the 'in' is an adjective (POS tag 'JJ'). The 'no:1' is the 
token position, i.e. here the first (and in this case only) matching 
<token> is referred to.
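
To make the check concrete, here is a rough, language-neutral sketch of
what the filter decides for a single token. It is not the LanguageTool
implementation: the function name keep_match and the tiny TOY_LEXICON
are made up for illustration and stand in for the real tagger, and
full-string matching of both regular expressions is an assumption.

```python
import re

# Hypothetical toy lexicon standing in for a real POS tagger (assumption,
# not LanguageTool data): word -> list of possible POS tags.
TOY_LEXICON = {
    "complete": ["JJ", "VB"],
    "visible": ["JJ"],
    "side": ["NN"],
    "deed": ["NN"],
}


def keep_match(token, regexp, postag_regexp, lexicon=TOY_LEXICON):
    """Return True if the part of `token` captured by the first group of
    `regexp` has at least one POS tag that matches `postag_regexp`."""
    m = re.fullmatch(regexp, token)
    if m is None:
        # The token does not have the expected shape, e.g. no 'in' prefix.
        return False
    part = m.group(1)
    # Unknown words have no tags, so they never pass the filter.
    tags = lexicon.get(part, [])
    return any(re.fullmatch(postag_regexp, tag) for tag in tags)


if __name__ == "__main__":
    for word in ["incomplete", "invisible", "inside", "indeed"]:
        print(word, keep_match(word, r"in(.*)", "JJ"))
```

With this toy lexicon, 'incomplete' and 'invisible' are kept because
'complete' and 'visible' can be tagged JJ, while 'inside' and 'indeed'
are dropped because 'side' and 'deed' are only nouns here.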

It's available only for English, but it can easily be made available for 
other languages: all the logic is in PartialPosTagFilter, which needs to 
be extended (see EnglishPartialPosTagFilter for an example).

Regards
  Daniel


