--- Linas Vepstas <[EMAIL PROTECTED]> wrote:
> So, after asserting "aluminum is a mass noun", it might plausibly deduce 
> "most minerals are mass nouns" -- one could call this "data mining".
> This would use the same algo as deducing that many of the things called
> "lincoln" are "counties". 
> 
> I want to know how far down this path one can go, and how far anyone has
> gone. I can see that it might not be a good path, but I don't see any
> alternatives at the moment.

Well, one alternative is to infer that aluminum is a mass noun from the low
frequency of phrases like "an aluminum is" in a large corpus of text (or from
Google hit counts).  You could also infer that aluminum can act as an
adjective from phrases like "an aluminum chair", etc.  More generally, you
would cluster words in the high-dimensional vector space of their immediate
contexts, then derive rules for moving from cluster to cluster.
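
As a rough illustration of both ideas, here is a minimal Python sketch.  The
toy corpus, the function names (count_noun_ratio, context_vectors, cosine) and
the one-word context window are all assumptions made up for this example; a
real experiment would need a very large corpus and a proper clustering step
rather than pairwise comparison.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical toy corpus; far too small for real statistics.
CORPUS = (
    "a chair is in the room . an aluminum chair is light . "
    "aluminum is a metal . copper is a metal . a table is in the room . "
    "the aluminum is cheap . the copper is cheap . an aluminum table is light ."
)


def tokenize(text):
    """Split into lowercase words, keeping '.' as a sentence marker."""
    return re.findall(r"[a-z]+|\.", text.lower())


def count_noun_ratio(tokens, word):
    """Fraction of occurrences of `word` seen in the frame 'a/an <word> is'.

    A ratio near zero is weak evidence that the word is a mass noun.
    """
    total = 0
    in_frame = 0
    for i, tok in enumerate(tokens):
        if tok != word:
            continue
        total += 1
        has_article = i > 0 and tokens[i - 1] in ("a", "an")
        has_is = i + 1 < len(tokens) and tokens[i + 1] == "is"
        if has_article and has_is:
            in_frame += 1
    return in_frame / total if total else 0.0


def context_vectors(tokens):
    """Map each word to counts of its immediate left and right neighbors."""
    vecs = defaultdict(Counter)
    for i, tok in enumerate(tokens):
        if i > 0:
            vecs[tok]["L:" + tokens[i - 1]] += 1
        if i + 1 < len(tokens):
            vecs[tok]["R:" + tokens[i + 1]] += 1
    return vecs


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


tokens = tokenize(CORPUS)
vecs = context_vectors(tokens)

for w in ("aluminum", "copper", "chair", "table"):
    print(w, count_noun_ratio(tokens, w))

print("aluminum~copper", cosine(vecs["aluminum"], vecs["copper"]))
print("aluminum~chair ", cosine(vecs["aluminum"], vecs["chair"]))
```

On this toy corpus "aluminum" never appears in the "a/an X is" frame while
"chair" does, and the context vector of "aluminum" is closer to that of
"copper" than to that of "chair".  Words whose context vectors are close
would be candidates for the same cluster.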

However, the fact that this method is not used in the best language models
suggests it may exceed the computational limits of your PC.  This might
explain why we keep wading into the swamp.


-- Matt Mahoney, [EMAIL PROTECTED]
