--- Linas Vepstas <[EMAIL PROTECTED]> wrote:
> So, after asserting "aluminum is a mass noun", it might plausibly deduce
> "most minerals are mass nouns" -- one could call this "data mining".
> This would use the same algo as deducing that many of the things called
> "lincoln" are "counties".
>
> I want to know how far down this path one can go, and how far anyone has
> gone. I can see that it might not be good path, but I don't see any
> alternatives at the moment.

Well, one alternative is to deduce that aluminum is a mass noun from the low frequency of phrases like "an aluminum is" in a large corpus of text (or by counting Google hits). You could also deduce that aluminum can be used as an adjective from phrases like "an aluminum chair", etc. More generally, you would cluster words in the high-dimensional vector space of their immediate contexts, then derive rules for moving from cluster to cluster.

However, the fact that this method is not used in the best language models suggests it may exceed the computational limits of your PC. This might explain why we keep wading into the swamp.

-- Matt Mahoney, [EMAIL PROTECTED]
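
A minimal sketch of the two ideas in the reply above, offered as an illustration rather than as anyone's actual system. It assumes a tiny invented corpus in place of "a large corpus of text", uses the single frame "a/an WORD is" as a crude count-noun test, and uses cosine similarity between left/right neighbor counts as a stand-in for real clustering. All names here (CORPUS, count_noun_score, context_vectors, cosine) are hypothetical.

```python
import math
import re
from collections import Counter, defaultdict

# Toy corpus, invented for illustration only. A real test would need a
# large corpus (or hit counts from a search engine).
CORPUS = """
the chair is made of aluminum . aluminum is light . an aluminum chair is cheap .
the table is made of copper . copper is heavy . a copper pipe is cheap .
a dog is here . the dog is loud . a cat is here . the cat is quiet .
"""


def tokenize(text):
    """Lowercase the text and split it into words and sentence-ending periods."""
    return re.findall(r"[a-z]+|\.", text.lower())


def count_noun_score(tokens, word):
    """Fraction of occurrences of `word` that appear in the frame 'a/an WORD is'.

    A score near zero is weak evidence that the word is a mass noun.
    """
    total = 0
    framed = 0
    for i, tok in enumerate(tokens):
        if tok != word:
            continue
        total += 1
        has_article = i > 0 and tokens[i - 1] in ("a", "an")
        has_verb = i < len(tokens) - 1 and tokens[i + 1] == "is"
        if has_article and has_verb:
            framed += 1
    return framed / total if total else 0.0


def context_vectors(tokens):
    """Map each word to a sparse vector counting its left and right neighbors."""
    vectors = defaultdict(Counter)
    for i, tok in enumerate(tokens):
        if i > 0:
            vectors[tok][("L", tokens[i - 1])] += 1
        if i < len(tokens) - 1:
            vectors[tok][("R", tokens[i + 1])] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * v[key] for key, count in u.items() if key in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


TOKENS = tokenize(CORPUS)
VECTORS = context_vectors(TOKENS)

for w in ("aluminum", "copper", "dog", "cat"):
    print(w, count_noun_score(TOKENS, w))

print("aluminum~copper", cosine(VECTORS["aluminum"], VECTORS["copper"]))
print("aluminum~dog", cosine(VECTORS["aluminum"], VECTORS["dog"]))
```

On this toy corpus "aluminum" never occurs in the "a/an WORD is" frame while "dog" does half the time, and the context vector for "aluminum" is closer to "copper" than to "dog". With ten sentences that proves nothing; the point is only the shape of the computation, which would have to run over every word and context in a very large corpus.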
