On 2013-06-25 13:02, gulp21 wrote:
>>> <token><exception postag="SUB:.+:PLU:.+|UNKNOWN"
>>> postag_regexp="yes"/></token> <!-- the token should be allowed to be
>>> SENT_END -->
>>>
>>> So how can I make sure that SENT_END is not included in the exception?
>>
>> Do you want to match anything or just punctuation marks? If the latter,
>> you could tag punctuation marks (via dictionary or via disambiguator),
>> and they would not have "UNKNOWN" tag.
>
> It is supposed to match every token except those which are tagged
> “SUB:.+:PLU:.+” or are “UNKNOWN” (where “UNKNOWN” is not supposed to
> match any punctuation marks).
>
> The token should match
>
> Tier[Tier/SUB:AKK:SIN:NEU, Tier/SUB:DAT:SIN:NEU, Tier/SUB:NOM:SIN:NEU]
>
> but not
>
> Blablubb[null/null]
> .[</S>]
> Tiere[Tier/SUB:AKK:PLU:NEU, Tier/SUB:DAT:SIN:NEU, Tier/SUB:GEN:PLU:NEU,
> Tier/SUB:NOM:PLU:NEU]

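So add a rule to the disambiguator that gives the dot a POS tag. Once it
has a tag, it no longer counts as "UNKNOWN", so your exception will not
exclude it. Easy peasy.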
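
Something along these lines in disambiguation.xml might do it. This is
only a rough sketch: the rule id, the name, and the "PKT" tag are made up
for illustration, so pick whatever tag fits the German tagset, and check
the exact syntax against the existing disambiguation files.

```xml
<!-- Hypothetical rule: the id, name and the "PKT" tag are assumptions. -->
<rule id="TAG_FULL_STOP" name="give the full stop a POS tag">
  <pattern>
    <!-- No regexp="yes" here, so "." is matched literally. -->
    <token>.</token>
  </pattern>
  <!-- Add a reading so the dot is no longer untagged. -->
  <disambig action="add"><wd lemma="." pos="PKT"/></disambig>
</rule>
```

With a tag on the dot, "UNKNOWN" in your exception should then only hit
real unknown words such as "Blablubb".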

Best,
marcin

_______________________________________________
Languagetool-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/languagetool-devel