> Ben: So, I feel much of the present discussion on NLP interpretation is
> bypassing the hard problem, which is enabling an AGI system to learn
> the millions or billions of commonsense (probabilistic) rules relating
> to basic relationships like with_tool, which humans learn from
> experience....
I hoped to get more into this area with the discussion.
That was a really good description below, and I agree with almost all of it.
It's hard to learn these on a one-by-one basis though, I'm afraid, without already having in the background that "salad is eaten with a fork, as a tool" in the DB, with confidences and probabilities.
After a few passes with salad / ate / fork, the frames should be primed, and then when it encounters new information it can use and reason with that. One of the easiest first-level differentiations of the roles of the prepositions is simply the ISA category of the types: Person vs. tool, or other.
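That first-level differentiation could be sketched roughly like this (the type categories and sense names here are hypothetical illustrations, not Novamente's actual ontology):

```python
# Toy sketch: guess the sense of "with" from the ISA category of its
# object noun. All categories and sense labels are made up for
# illustration.
ISA = {
    "fork": "tool",
    "tweezers": "tool",
    "uncle": "person",
    "gusto": "manner",
}

SENSE_BY_CATEGORY = {
    "tool": "with_tool",
    "person": "with_companion",
    "manner": "with_manner",
}

def with_sense(noun):
    """Map the object of 'with' to a preposition sense via its ISA type."""
    category = ISA.get(noun, "unknown")
    return SENSE_BY_CATEGORY.get(category, "with_unknown")
```

So "I ate the salad with a fork" routes to with_tool, while "with my favorite uncle" routes to a companion sense, purely from the type of the noun.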
It also allows for a lot of missing information assumption, which is really interesting to me.
Like, upon getting the sentence "I ate the salad", we would infer that the salad was most likely being eaten with a fork, so we could conditionally assume there is also a fork in the current environment, and things like what is in the salad (lettuce). The thing I want to make sure of there is that the confidence values and the probabilities are carried along through all these assumptions, so once it starts assuming something like an elephant in the room, it will realize that that is not probable or likely.
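Carrying the confidences along might look something like the following sketch, which just multiplies conditional probabilities down a chain of assumptions (the probability values are invented for illustration, and independence is assumed):

```python
# Sketch: propagate confidence through chained assumptions by
# multiplying conditional probabilities. Values are illustrative only.
ASSUMPTIONS = {
    # (antecedent, consequent): P(consequent | antecedent)
    ("ate_salad", "used_fork"): 0.85,
    ("used_fork", "fork_in_room"): 0.95,
    ("ate_salad", "salad_has_lettuce"): 0.80,
    ("ate_salad", "elephant_in_room"): 0.0001,
}

def chain_confidence(path):
    """Multiply conditional probabilities along a chain of assumptions."""
    conf = 1.0
    for antecedent, consequent in zip(path, path[1:]):
        conf *= ASSUMPTIONS.get((antecedent, consequent), 0.0)
    return conf
```

An assumption chain like ate_salad -> used_fork -> fork_in_room stays fairly confident, while the elephant assumption bottoms out at a score low enough to discard.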
One better source than a straight Google search I've found is a huge set of novels, 600+, with text stats on it; it removes much of the nonsense that you find on cruddy web pages. Alternatively, getting it from news sources is a fairly good way as well; I am using Google News and a special engine for that.
James
Ben Goertzel <[EMAIL PROTECTED]> wrote:
Hi,
About
> But a simple example is
> ate a pepperoni pizza
> ate a tuna pizza
> ate a VEGAN SUPREME pizza
> ate a Mexican pizza
> ate a pineapple pizza
I feel this discussion of sentence parsing and interpretation is
taking a somewhat misleading direction, by focusing on examples that
are in fact very easy to parse and semantically interpret.
When dealing with realistic sentence parsing, things don't work out so
simply as in "The cat ate the mouse."
There is even some complexity of course in the above example
"ate a Mexican pizza"
as the adjective Mexican could mean "in the country of Mexico",
"Mexican style", or "with a topping composed of pieces of a Mexican
person" ;-)
But the above examples, overall, are so simplistic they cover up the
real problems with commonsense knowledge and language understanding...
Among other complications that arise in practice (some even worse),
there are prepositions. Let's go back to our prior example of "with"
...
Humans can unproblematically assign senses to "with" in the following sentences.
**********
I ate the salad with a fork
I ate the salad with a tweezers
I ate the salad with my favorite uncle
I ate the salad with my favorite pepper
"I ate the salad with my favorite uncle," said the cannibal
"I ate the salad with my favorite pepper," said the salt.
I ate the salad with gusto
I ate the salad with Ragu
I ate the salad with Gusto
I eat steak with red wine, and fish with white wine
I eat fish with beer batter
**********
Our intended approach to this problem (preposition disambiguation)
within Novamente is to teach the system groundings for many sentences
of this nature within the AGISim simulation world. Then, when it sees
a sentence of this nature containing a concept that it hasn't seen in
the simulation world, it must match the concept to ones it has seen in
the simulation world, and make a guess.
For instance, it may have seen and learned to understand
"I ate the salad with a fork"
"I ate the salad with an olive"
in the sim world, so that when it sees
"I ate the salad with a tweezers"
it needs to realize that a tweezers is more like a fork than like an
olive (since it is not edible, and is a tool), and so the sense of
"with" in this latter sentence is probably like the sense in "I ate
the salad with a fork."
[One way for the system to realize the similarity between fork and
tweezers is to use WordNet, in which both are classified as
noun.artifact]
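A toy version of that matching step might look like the following, using a tiny hand-built hypernym table as a stand-in for WordNet: the unseen word inherits the "with"-sense grounding of whichever already-grounded concept it most resembles.

```python
# Toy sketch: a hand-built stand-in for WordNet hypernym chains.
# An unseen concept is matched to the most similar grounded concept.
HYPERNYMS = {
    "fork":     ["fork", "cutlery", "tool", "artifact", "entity"],
    "tweezers": ["tweezers", "tool", "artifact", "entity"],
    "olive":    ["olive", "fruit", "food", "entity"],
}

def similarity(a, b):
    """Fraction of hypernyms shared between the two chains."""
    shared = set(HYPERNYMS[a]) & set(HYPERNYMS[b])
    return len(shared) / max(len(HYPERNYMS[a]), len(HYPERNYMS[b]))

def nearest_grounded(word, grounded):
    """Pick the grounded concept most similar to the unseen word."""
    return max(grounded, key=lambda g: similarity(word, g))
```

Here "tweezers" shares tool/artifact hypernyms with "fork" but only "entity" with "olive", so the system would guess the with_tool sense. In practice one would use WordNet's actual hierarchy and a proper similarity measure rather than this toy overlap score.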
What the sim world grounding gives the AI system is a full
understanding of what the various senses of "with" actually mean. For
instance, in "I ate the salad with a fork", what we really have is
with_tool( I ate the salad, a fork)
i.e. (in one among many possible notations)
with_tool( A, B)
Inheritance(B, fork)
B := eat(I, salad)
past(B)
and thru interacting and learning in the sim world, the system learns
various relationships related to the predicate with_tool. Once it
guesses that "I ate the salad with a tweezers" is correctly
mapped into
with_tool( A, B)
Inheritance(B, tweezers)
B := eat(I, salad)
past(B)
then it can use its knowledge about with_tool, gained in the sim
world, to reason about the situation described.
For example, it often takes an agent practice to learn to use a tool.
A system that has had some experience with instances of with_tool in
the sim world will know this, and will have learned that the
effectiveness with which B can be used by C as a tool for A may depend
on the amount of experience that C has in using B as a tool,
particularly in using B as a tool in contexts similar to A.
Thus, the system could respond to
"I ate the salad with a tweezers"
with
"Is that difficult?"
(knowing that, since eating salad with tweezers is unusual (since e.g.
a Google search reveals few instances of it), it is likely that the
speaker may not have had much practice doing it, so it may be
difficult for the speaker.)
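The rarity heuristic behind that response could be sketched as follows, with a local corpus count standing in for the Google-hits check (the counts and threshold are invented for illustration):

```python
# Sketch: respond "Is that difficult?" when a with_tool pairing is
# unusual. A local co-occurrence count stands in for the Google-hits
# heuristic; all numbers are illustrative.
CORPUS_COUNTS = {
    # hypothetical co-occurrence counts for (action, tool)
    ("eat_salad", "fork"): 12000,
    ("eat_salad", "tweezers"): 3,
}

RARITY_THRESHOLD = 100

def respond(action, tool):
    """Ask about difficulty when the tool pairing is rarely attested."""
    if CORPUS_COUNTS.get((action, tool), 0) < RARITY_THRESHOLD:
        return "Is that difficult?"
    return "OK."
```

The interesting part, of course, is not the threshold test but the sim-world-learned rule linking rarity of a with_tool pairing to lack of practice, and lack of practice to difficulty.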
This is just one among very very many examples of probabilistic/fuzzy
commonsense knowledge about preposition senses. In order to interpret
texts correctly an AGI system needs to have this commonsense
knowledge. Otherwise, even if it correctly figures out that the
intended meaning is
with_tool( I eat the salad, a tweezers)
it won't be able to draw the commonsensically expected implications from this.
So, how to get all this probabilistic commonsense knowledge (which in
humans is mostly unconscious) into the AGI system?
This is where we are back to the good old alternatives, of
a-- embodied learning
b-- exhaustive education through NLP dialogue in very simple English
c-- exhaustive education through dialogue in some artificial language
like Lojban++
d-- make a big nasty database like Cyc (and try to do a better job)
My bet is that a is the best foundational approach, with some
augmentation by the other approaches. (Though I don't plan to embark
upon d at all, I am willing to make use of DB's constructed by others.
I note that there is no standard DB of preposition senses, though we
have made one within Novamente for a narrow-AI NLP consulting project,
a couple years ago.)
Note that in my above example of interpretation of a very simple
sentence I casually assumed integration of
* simple language parsing
* experience with tool usage in a sim world
* WordNet
* frequency counting based on Google searches
I think this kind of integrative approach has plenty of promise. We
are proceeding in this direction with Novamente, but slowly, due to
having only a small amount of staff focused on the project, and having
a pretty complex (necessarily, I believe) AI design.
But IMO, from an AGI point of view all this is sorta "surface level"
stuff. The key part in the above story is the learning engine that
allows the system to learn commonsense information about tool usage
from its embodied experience. The other inferences involved are not
that hard and are in fact easily carried out by Novamente's current
inference system. The language parsing involved in the above example
is trivial and is done by our NLP parser as by many, many others. We
are already using WordNet; we aren't using frequency counting based on
Google searches but that's obviously "just engineering".... What is
harder, and is the focus of much of our effort right now, is learning
useful generalizable commonsense rules from embodied experience, using
a combination of probabilistic inference, evolutionary learning and
(to be integrated in 2007) economic attention allocation.
So, I feel much of the present discussion on NLP interpretation is
bypassing the hard problem, which is enabling an AGI system to learn
the millions or billions of commonsense (probabilistic) rules relating
to basic relationships like with_tool, which humans learn from
experience....
Eric Baum argues that we humans have a lot of inbuilt inductive bias
that helps us to learn these rules more efficiently than an AI system
would be able to. This may be right.... OTOH our AGI systems have
WordNet and Google, and possibly cleverer learning algorithms than the
human brain....
-- Ben G
-----
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?list_id=303
_______________________________________
James Ratcliff - http://falazar.com
