On Sat, Jan 24, 2015 at 11:59 AM, Piaget Modeler via AGI
<[email protected]> wrote:
> How do you represent IS ? Do you differentiate IS from TYPE-OF (i.e., IS-A), 
> or INSTANCE-OF ?
>
> Take for example,
>
> IS(apple, fruit)  - TYPE-OF
> IS(John_Smith, Politician) - INSTANCE-OF
> IS(my_coat, green) -  ???

IS could also be Islamic State.

Language evolves. Knowledge representation systems that assign a fixed
set of meanings to words have a long history of failure. I don't know
why anyone still pursues this approach.

I understand that a structured knowledge representation doesn't
require a supercomputer the way a neural language model does. At first
it looks like the right approach too, because rule coverage has a power
law distribution, with the IS-A construct ranked right at the top. You
can cover half of the language with just a few hundred rules. The
problem is that nobody knows how many rules you need to cover the
other half. Doug Lenat (Cyc) has been plugging away at it for over 30
years. Apparently it takes a lot more rules than he thought.
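
To make the power law point concrete, here is a minimal sketch of how
the number of rules needed for half coverage depends on the total
number of rules. It assumes rule frequency is proportional to
1/rank^exponent (a Zipf-like law). That distribution, the exponent, and
the function name rules_for_coverage are illustrative assumptions, not
measurements of any real rule set.

```python
def rules_for_coverage(total_rules, target=0.5, exponent=1.0):
    """Smallest k such that the k most frequent rules account for
    `target` of all rule uses, ASSUMING frequency ~ 1 / rank**exponent.
    The Zipf-like shape is an assumption made for illustration."""
    weights = [1.0 / rank ** exponent for rank in range(1, total_rules + 1)]
    total = sum(weights)
    running = 0.0
    for k, weight in enumerate(weights, start=1):
        running += weight
        if running >= target * total:
            return k
    return total_rules  # reached only if rounding keeps us just short


if __name__ == "__main__":
    # Hypothetical totals, kept small enough to run quickly.
    for total in (10**4, 10**5, 10**6):
        k = rules_for_coverage(total)
        print(f"{total} rules in total: top {k} cover half of all uses")
```

Under this assumption the first half is cheap, and the count needed for
it keeps growing as the assumed total grows. How fast depends on the
exponent, which is exactly the part nobody knows in advance.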

First, our brains evolved to be able to learn language. Then language
evolved to have a structure that can be learned in a few years on a
noisy, massively parallel 10 petaflop computer with 100 terabits of
memory.

The rules (I believe there are 10^8 to 10^9 of them) can be grouped
roughly into lexical, semantic, and grammatical. Rules in each set can
be learned once a large portion of the previous set has been learned.
Note that I listed semantics before grammar, which is the opposite of
the way most parsers work (or actually, don't work). Children learn the
rules for splitting continuous speech into words by age 7 to 10 months. They
learn to associate words with other words and with nonverbal
perceptions (grounding) starting around 1 year. They start forming
grammatically correct sentences around age 2-3.

We can divide grammar rules into categorization (X is a noun) and
rules for ordering words (adjectives precede nouns in English). Most
rules are very specific. For example, we say "salt and pepper", not
"pepper and salt". We use high level grammar rules to solve math
problems, so there is an obvious learning hierarchy here too.
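
As a toy illustration of the two kinds of grammar rules, the sketch
below keeps a categorization table and two ordering tables, one general
(by category) and one specific (by word pair), with the specific rule
taking precedence. The tiny hand-written tables and the name
preferred_order are hypothetical. A real system would have to learn
such rules from data, which is the hard part.

```python
# Categorization rules: word -> category (hand-written toy table).
CATEGORY = {"green": "ADJ", "coat": "NOUN", "salt": "NOUN", "pepper": "NOUN"}

# General ordering rule: adjectives precede nouns in English.
GENERAL_ORDER = {("ADJ", "NOUN")}

# Very specific ordering rule: "salt and pepper", not "pepper and salt".
SPECIFIC_ORDER = {("salt", "pepper")}


def preferred_order(a, b):
    """Return the two words in their preferred order, or None if no
    rule applies. Specific word-pair rules override general ones."""
    if (a, b) in SPECIFIC_ORDER:
        return (a, b)
    if (b, a) in SPECIFIC_ORDER:
        return (b, a)
    cat_a, cat_b = CATEGORY.get(a), CATEGORY.get(b)
    if (cat_a, cat_b) in GENERAL_ORDER:
        return (a, b)
    if (cat_b, cat_a) in GENERAL_ORDER:
        return (b, a)
    return None


if __name__ == "__main__":
    print(preferred_order("pepper", "salt"))
    print(preferred_order("coat", "green"))
```

Note that the general rule says nothing about two nouns, so the
salt/pepper ordering has to be stored as its own rule.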

I am not sure how much this helps. Most of us don't have the resources
to do the 10^24 operations needed to properly learn natural language,
other than in our own brains. We usually compromise and do something
we can afford, but there is an obvious tradeoff between CPU, memory,
and text prediction accuracy which I have documented at
http://mattmahoney.net/dc/text.html
A highly optimized program running for a week on a high end desktop
with 32 GB of memory still falls well short of what humans can do.
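
The figures above are consistent with each other, as a quick check
shows: 10^24 operations at 10 petaflops (10^16 operations per second)
is 10^8 seconds, or a little over 3 years. The sketch below does that
arithmetic. The desktop rate of 10^11 operations per second is a
placeholder assumption for comparison only, not a figure from this
post.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def years_needed(total_ops, ops_per_second):
    """Years of continuous computation to perform total_ops."""
    return total_ops / ops_per_second / SECONDS_PER_YEAR


if __name__ == "__main__":
    total_ops = 1e24    # operations to learn language (from the text)
    brain_rate = 1e16   # 10 petaflops (from the text)
    desktop_rate = 1e11  # ASSUMED desktop rate, for illustration only

    print(f"brain-scale machine: {years_needed(total_ops, brain_rate):.2f} years")
    print(f"assumed desktop:     {years_needed(total_ops, desktop_rate):.0f} years")
```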

-- 
-- Matt Mahoney, [email protected]

