Hi,

while trying to improve the German VIELZAHL_PLUS_SINGULAR rule, I 
noticed what looks to me like strange behaviour of the POS tag UNKNOWN: 
it also matches a sentence-final period, which carries the SENT_END tag. 
Here is a simple example:

         <rule id="TEST" name="German Test">
             <pattern>
                 <token>Hallo</token>
                 <token postag="UNKNOWN"/>
             </pattern>
             <message><suggestion>Hi</suggestion></message>
             <example type="correct"><marker>Hallo du</marker>.</example>
             <example type="incorrect"><marker>Hallo dufff</marker>.</example>
             <example type="correct"><marker>Hallo.</marker></example>
             <!-- FAILS, POS tags: <S> Hallo[Hallo/SUB:AKK:SIN:NEU, Hallo/SUB:DAT:SIN:NEU, Hallo/SUB:NOM:SIN:NEU].[</S>] -->
         </rule>

I’d expect the last example to be treated as correct, because the token 
.[</S>] is punctuation and not an unknown “word”.

Is the period intentionally tagged as UNKNOWN?
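
As a stopgap, I could probably exclude the sentence end explicitly in 
the pattern. This is only an untested sketch; it assumes that an 
exception on the SENT_END tag works for an UNKNOWN token the same way 
it does for other tokens:

             <pattern>
                 <token>Hallo</token>
                 <!-- assumption: the exception stops the match on the final period -->
                 <token postag="UNKNOWN"><exception postag="SENT_END"/></token>
             </pattern>

That would also skip a genuinely unknown word in sentence-final 
position, so it would hide the problem rather than fix it.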

Regards
Markus

_______________________________________________
Languagetool-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/languagetool-devel

Reply via email to