On 27.04.2016 at 11:43, Mike Unwalla wrote:
> Hi All,
>
> English tagset.txt explains the meaning of each postag, and includes these 2 
> lines:
> NN:U    Mass noun             #new tag - deviation from Penn: admiration, 
> air, Afrikaans
> NN:UN    Noun used as mass    #new tag - deviation from Penn: establishment, 
> wax, afternoon
>
> What is the difference between NN:U and NN:UN?

NN:U - nouns that are always uncountable.

NN:UN - nouns that might be used in the plural form and with an 
indefinite article, depending on their meaning (for example, "information").


> What features of a word cause that word to be categorised the way it is?

There's a hard-coded list of such words, and some simple heuristics 
(words that end with -ation are usually abstract nouns).

The script that does this is found at

languagetool\languagetool-language-modules\en\src\main\resources\org\languagetool\resource\en\get_unc.awk

The list of uncountable nouns is in uncountable.txt, and the list of 
partly countable nouns is in partlycountable.txt. Feel free to edit 
those, and let me know when you want to recreate the dictionary (we 
shouldn't do it too frequently, as binary files do not sit well with 
GitHub's infrastructure).
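
To make the idea concrete, here is a rough sketch in Python of how a 
hard-coded list plus a suffix heuristic could assign the two tags. It is 
not a port of get_unc.awk: the function name, the in-memory sets, the 
order of the checks, and the choice to let the -ation heuristic fall 
back to NN:UN are all assumptions made for illustration. The sample 
words come from the tagset.txt lines quoted above.

```python
def tag_mass_noun(word, uncountable, partly_countable):
    """Return 'NN:U', 'NN:UN', or None for a noun (hypothetical helper)."""
    w = word.lower()
    # Explicit lists take precedence (assumed order of checks).
    if w in uncountable:
        return "NN:U"
    if w in partly_countable:
        return "NN:UN"
    # Heuristic: nouns ending in -ation are usually abstract nouns.
    # Assumption: fall back to NN:UN, the safer default.
    if w.endswith("ation"):
        return "NN:UN"
    return None  # not treated as a mass noun


# Stand-ins for the contents of uncountable.txt and partlycountable.txt.
uncountable = {"admiration", "afrikaans"}
partly_countable = {"establishment", "wax", "afternoon"}

for noun in ("admiration", "wax", "information", "table"):
    print(noun, tag_mass_noun(noun, uncountable, partly_countable))
```

In this sketch, moving a word from one set to the other is all it takes 
to change its tag, which corresponds to editing the two text files 
before the dictionary is rebuilt.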

> Why, for example, is 'air' NN:U and 'wax' NN:UN?

This list was prepared manually based on some corpus searches and 
dictionaries. Actually, 'air' should be 'NN:UN'. My principle was that 
if there's a plural or indefinite article use of a noun, it should be 
'NN:UN', but to be safe, I assumed NN:UN to be the default option (so 
that there would be no complaint in the rules that use the 
countable/uncountable distinction).

Feel

Regards,
Marcin

