> In our rule encoding approach, we will need about 5000 mapping rules to
> map syntactic parses of commonsense sentences into term logic
> relationships. Our inference engine will then generalize these into
> hundreds of thousands or millions of specialized rules.
How would your rules handle the "on" cases that you gave? What do your
rules match on (specific words, word types, object types, something else)?
Are your rules all at the same level or are they "tiered" somehow?
My gut instinct is that 5000 rules is way, way high for both the most
general and second tiers, and that you can do exception-based learning
after those two tiers.
> We have about 1000 rules in place now and will soon stop coding them and
> start experimenting with using inference to generalize and apply them.
> If this goes well, then we'll put in the work to encode the rest of the
> rules (which is not very fun work, as you might imagine).
Can you give about ten examples of rules? (That would answer a lot of my
questions above.)
Where did you get the rules? Did you hand-code them or get them from
somewhere?
----- Original Message -----
From: "Benjamin Goertzel" <[EMAIL PROTECTED]>
To: <[email protected]>
Sent: Wednesday, January 09, 2008 5:04 PM
Subject: Re: [agi] Incremental Fluid Construction Grammar released
> And how would a young child or foreigner interpret "on the Washington
> Monument" or "shit list"? Both are physical objects and a book *could*
> be resting on them.
Sorry, my shit list is purely mental in nature ;-) ... at the moment, I
maintain a task list but not a shit list... maybe I need to get better
organized!!!
> Ben, your question is *very* disingenuous.
Who, **me** ???
> There is a tremendous amount of domain/real-world knowledge that is
> absolutely required to parse your sentences. Do you have any better way
> of approaching the problem?
> I've been putting a lot of thought and work into trying to build and
> maintain precedence of knowledge structures with respect to
> disambiguating (and overriding incorrect) parsing ... and I don't
> believe that it's going to be possible without a severe amount of
> knowledge ...
> What do you think?
OK...
Let's assume one is working within the scope of an AI system that
includes an NLP parser and a logical knowledge representation system,
and that needs some intelligent way to map the output of the former into
the latter.
Then, in this context, there are three approaches, which may be tried
alone or in combination:
1) Hand-code rules to map the output of the parser into a much less
ambiguous logical format.

2) Use statistical learning across a huge corpus of text to somehow infer
these rules. [I never fleshed out this approach, as it seemed
implausible, but I have to recognize its theoretical possibility.]

3) Use **embodied** learning, so that the system can statistically infer
the rules from the combination of parse trees with the logical
relationships it observes to describe situations it sees. [This is the
best approach in principle, but may require years and years of embodied
interaction for a system to learn; see the toy sketch below.]
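To make the learning signal behind approaches 2 and 3 concrete, here is a
toy Python sketch. The data, relation names, and pattern format are all
invented for illustration; this is not any actual Novamente or texai
representation. The idea is just to pair parse patterns with the logical
relationships observed to describe the same situations, and count which
mappings dominate:

from collections import Counter

# (parse pattern, observed logical relationship) pairs -- all invented
observations = [
    (("subj", "prep:on", "pobj"), ("on", "x", "y")),         # book on the table
    (("subj", "prep:on", "pobj"), ("on", "x", "y")),         # cup on the shelf
    (("subj", "prep:on", "pobj"), ("member-of", "x", "y")),  # name on the shit list
]

# Count which logical form each parse pattern co-occurs with; the
# relative frequencies become probabilistic mapping rules.
counts = Counter(observations)
for (pattern, logic), n in counts.most_common():
    print(pattern, "->", logic, "support =", n)

Note that the same surface pattern can map to different logical relations
("on the table" vs. "on the shit list"), which is exactly why the learned
rules need to be probabilistic rather than categorical.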
Obviously, Cycorp has taken Approach 1, with only modest success. But I
think part of the reason they have not been more successful is a
combination of a bad choice of parser with a bad choice of knowledge
representation. They use a phrase structure grammar parser and predicate
logic, whereas I believe that if one uses a dependency grammar parser and
term logic, the process becomes a lot easier. So far as I can tell, in
texai you are replicating Cyc's choices in this regard (phrase structure
grammar + predicate logic).
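As a toy illustration of why the dependency route may be easier (the
relation names are invented, not the output of any real parser): a
dependency parse exposes head/argument links directly, so a single
pattern over relation triples can pull out the term-logic relation.

# Dependency parse of "The book is on the table", as
# (relation, head, dependent) triples -- relation names are invented
parse = [
    ("subj", "is", "book"),
    ("prep", "is", "on"),
    ("pobj", "on", "table"),
]

def to_term_logic(deps):
    # Index dependents by (relation, head) so a rule can pattern-match.
    rels = {(rel, head): dep for (rel, head, dep) in deps}
    subj = rels.get(("subj", "is"))
    prep = rels.get(("prep", "is"))
    obj = rels.get(("pobj", prep)) if prep else None
    if subj and prep and obj:
        return (prep, subj, obj)  # term-logic style: on(book, table)
    return None

print(to_term_logic(parse))  # ('on', 'book', 'table')

A phrase-structure tree would bury "book" and "table" several nonterminal
levels deep, so the equivalent extraction rule has to walk tree structure
rather than match flat relation triples.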
In Novamente, we are aiming at a combination of the three approaches.
We are encoding a bunch of rules, but we don't ever expect to get
anywhere near complete coverage with them, and we have mechanisms (some
designed, some already in place) that can generalize the rule base to
learn new, probabilistic rules, based on statistical corpus analysis and
on embodied experience.
In our rule encoding approach, we will need about 5000 mapping rules to
map syntactic parses of commonsense sentences into term logic
relationships. Our inference engine will then generalize these into
hundreds of thousands or millions of specialized rules.
This is current work, research in progress.
We have about 1000 rules in place now and will soon stop coding them and
start experimenting with using inference to generalize and apply them.
If this goes well, then we'll put in the work to encode the rest of the
rules (which is not very fun work, as you might imagine).
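For concreteness, here is a hypothetical Python sketch of what one such
mapping rule and its inference-driven specialization might look like.
The rule format and every name in it are invented for illustration; this
is not the actual Novamente representation.

from dataclasses import dataclass

@dataclass
class MappingRule:
    pattern: list    # dependency-relation triples containing ?variables
    template: tuple  # term-logic relationship to emit

# One general hand-coded rule: "?x <verb> on ?y" -> on(?x, ?y)
rule = MappingRule(
    pattern=[("subj", "?v", "?x"), ("prep", "?v", "on"), ("pobj", "on", "?y")],
    template=("on", "?x", "?y"),
)

def specialize(rule, bindings):
    # Substitute variable bindings into the rule's output template.
    return tuple(bindings.get(t, t) for t in rule.template)

# Each binding pair the inference engine finds evidence for yields one
# specialized rule, which is how a few thousand general rules could fan
# out into hundreds of thousands of specialized ones.
print(specialize(rule, {"?x": "book", "?y": "table"}))  # ('on', 'book', 'table')
print(specialize(rule, {"?x": "cup", "?y": "shelf"}))   # ('on', 'cup', 'shelf')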
Emotionally and philosophically, I am more drawn to approach 3 (embodied
learning), but pragmatically, I have reluctantly concluded that the
hybrid approach
we're currently taking has the greatest odds of rapid success.
In the longer term, we intend to throw out the standalone grammar parser
we're using and have syntax parsing done via our core AI processing --
but for now we're using a standalone grammar parser as a sort of
"scaffolding."
I note that this is not the main NM R&D thrust right now -- at the moment
it is somewhat separate from our work on embodied
imitative/reinforcement/corrective learning of virtual agents. However,
the two streams of work are intended to come together, as I've outlined
in my paper for WCCI 2008:
http://www.goertzel.org/new_research/WCCI_AGI.pdf
-- Ben