On 2013-06-25 13:02, gulp21 wrote:
>>> <token><exception postag="SUB:.+:PLU:.+|UNKNOWN"
>>> postag_regexp="yes"/></token> <!-- the token should be allowed to be
>>> SENT_END -->
>>>
>>> So how can I make sure that SENT_END is not included in the exception?
>>
>> Do you want to match anything or just punctuation marks? If the latter,
>> you could tag punctuation marks (via dictionary or via disambiguator),
>> and they would not have the "UNKNOWN" tag.
>
> It is supposed to match every token except those which are tagged
> “SUB:.+:PLU:.+” or are “UNKNOWN” (where “UNKNOWN” is not supposed to
> match any punctuation marks).
>
> The token should match
>
> Tier[Tier/SUB:AKK:SIN:NEU, Tier/SUB:DAT:SIN:NEU, Tier/SUB:NOM:SIN:NEU]
>
> but not
>
> Blablubb[null/null]
> .[</S>]
> Tiere[Tier/SUB:AKK:PLU:NEU, Tier/SUB:DAT:SIN:NEU, Tier/SUB:GEN:PLU:NEU,
> Tier/SUB:NOM:PLU:NEU]

So add a rule in a disambiguator to tag the dot. Easy peasy.

Best,
marcin
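
For illustration, a minimal sketch of what such a disambiguation rule might look like. The rule id, the rule name, and the `PKT` tag are assumptions made up for this example and are not taken from the German tagset or the existing disambiguation.xml; the exact behaviour of the `add` action should be checked against the disambiguator documentation.

```xml
<!-- Hypothetical rule for the language's disambiguation.xml.
     The id, name, and the PKT tag are illustrative assumptions. -->
<rule id="TAG_SENTENCE_DOT" name="Tag the full stop as punctuation">
  <pattern>
    <!-- No regexp attribute, so "." is matched as a literal dot -->
    <token>.</token>
  </pattern>
  <!-- action="add" is intended to add a reading with the given POS tag,
       so the dot should no longer count as an untagged (UNKNOWN) token -->
  <disambig action="add"><wd pos="PKT"/></disambig>
</rule>
```

With a tag on the dot, the `UNKNOWN` alternative in the exception should stop excluding the sentence-final token, while `Blablubb` (which still has no reading) would remain excluded.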
