The particular NL parser paper in question, Collins's "Convolution Kernels
for Natural Language"
(http://l2r.cs.uiuc.edu/~danr/Teaching/CS598-05/Papers/Collins-kernels.pdf),
is actually saying something quite important that extends well beyond
parsers and is highly applicable to AGI in general.

It actually shows that you can do something roughly equivalent to growing
neural gas (GNG) in a space with something approaching 500,000 dimensions,
but without normally having to deal with more than a few of those
dimensions at any one time.  GNG is an algorithm I learned about from
reading Peter Voss that lets one learn an efficient representation of a
distribution in a relatively high dimensional space in a totally
unsupervised manner.  But there really seems to be no reason why there
should be any limit to the dimensionality of the space in which Collins's
algorithm works, because it does not use an explicit vector representation,
nor, if I recollect correctly, a Euclidean distance metric, but rather a
similarity metric, which is generally much more appropriate for matching in
very high dimensional spaces.
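
To make that concrete, here is a minimal sketch (my own Python
illustration, not Collins's code) of the kind of similarity his convolution
kernel computes: the number of subtrees two parse trees share.  That count
is a dot product in a feature space with one dimension per possible subtree
-- easily hundreds of thousands of dimensions -- yet it is computed by a
simple recursion over pairs of nodes, and no feature vector is ever written
out.

class Tree:
    def __init__(self, label, children=()):
        self.label = label
        self.children = list(children)

    def nodes(self):
        yield self
        for child in self.children:
            yield from child.nodes()

    def production(self):
        # the grammar rule used at this node, e.g. ('NP', ('DT', 'NN'))
        return (self.label, tuple(c.label for c in self.children))

def common_subtrees(n1, n2):
    # C(n1, n2): number of shared subtrees rooted at this pair of nodes
    if n1.production() != n2.production():
        return 0
    if all(not c.children for c in n1.children):   # pre-terminal node
        return 1
    result = 1
    for c1, c2 in zip(n1.children, n2.children):
        result *= 1 + common_subtrees(c1, c2)
    return result

def tree_kernel(t1, t2):
    # K(T1, T2) = sum of C(n1, n2) over all pairs of internal nodes
    return sum(common_subtrees(n1, n2)
               for n1 in t1.nodes() if n1.children
               for n2 in t2.nodes() if n2.children)

# "the dog" vs. "the cat": three shared subtrees (DT -> the, NP -> DT NN,
# and NP -> (DT -> the) NN), so the kernel value is 3
np1 = Tree('NP', [Tree('DT', [Tree('the')]), Tree('NN', [Tree('dog')])])
np2 = Tree('NP', [Tree('DT', [Tree('the')]), Tree('NN', [Tree('cat')])])
print(tree_kernel(np1, np2))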

But what he is growing are not just points representing where data has
occurred in a high dimensional space, but sets of points that define the
hyperplanes separating classes.  My recollection is that this system
learns automatically from both labeled data (instances of correct parse
trees) and randomly generated deviations from those instances.  His
particular algorithm matches tree structures, but with modification it
would seem extensible to matching arbitrary nets.  Other versions of it
could be made to operate, like GNG, in an unsupervised manner.
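
The learning side can be sketched the same way.  Here is a toy kernel
perceptron (my own simplification -- the paper, if I recall correctly, uses
a voted perceptron over candidate parses, but the principle is the same):
correct trees are positive examples, the deviant trees are negative ones,
and the separating hyperplane in the huge implicit subtree space is stored
purely as a weighted list of training trees, never as an explicit
coordinate vector.

def kernel_perceptron(examples, kernel, epochs=10):
    # examples: list of (tree, label) pairs with label in {+1, -1};
    # returns dual weights alpha -- the hyperplane is just the set of
    # trees that were ever misclassified, each weighted by its alpha
    alpha = [0.0] * len(examples)
    for _ in range(epochs):
        for j, (xj, yj) in enumerate(examples):
            score = sum(a * yi * kernel(xi, xj)
                        for a, (xi, yi) in zip(alpha, examples) if a)
            if yj * score <= 0:        # mistake: pull the boundary toward xj
                alpha[j] += 1.0
    return alpha

def classify(tree, examples, alpha, kernel):
    score = sum(a * yi * kernel(xi, tree)
                for a, (xi, yi) in zip(alpha, examples) if a)
    return +1 if score > 0 else -1

Every operation above is a kernel evaluation between two trees (for
instance, the tree_kernel sketched earlier); the hundreds of thousands of
subtree dimensions are only ever touched implicitly.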

If you stop and think about what this is saying and generalize from it, it
provides an important possible component in an AGI toolkit.  What it shows
is not limited to parsing; it would seem applicable to virtually any
hierarchical or networked representation, including networks of Semantic
Web RDF triples, semantic nets, and predicate logic expressions.  At first
glance it appears it would even be applicable to kinkier net matching
algorithms, such as augmented transition network (ATN) matching.
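
As a toy illustration of that generalization (my own construction, not
anything from the paper), the same decompose-into-parts trick applies to a
set of RDF triples: score two graphs by the substructures they share --
here individual triples and two-step paths -- again without ever
materializing a feature vector over all possible substructures.

from collections import Counter

def graph_features(triples):
    # one feature per (s, p, o) triple, plus one per two-step path
    feats = Counter(triples)
    by_subject = {}
    for s, p, o in triples:
        by_subject.setdefault(s, []).append((p, o))
    for s, p1, o1 in triples:
        for p2, o2 in by_subject.get(o1, []):
            feats[(s, p1, o1, p2, o2)] += 1
    return feats

def graph_kernel(triples_a, triples_b):
    fa, fb = graph_features(triples_a), graph_features(triples_b)
    return sum(fa[k] * fb[k] for k in fa.keys() & fb.keys())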

So if one reads this paper with a mind not only to what it specifically
shows, but to how what it shows could be extended, it says something very
important: that one can represent, learn, and classify things in very high
dimensional spaces -- such as 10^1000000000000 dimensional spaces -- and do
it efficiently, provided the part of the space being represented is
sufficiently sparsely connected.

I had already assumed this before reading the paper, but the paper was
valuable to me because it provided mathematically rigorous support for my
prior models, and helped me better understand the mathematical foundations
of my own prior intuitive thinking.

It means that systems like Novamente can deal in very high dimensional
spaces relatively efficiently.  It does not mean that all processes that
can be performed in such spaces will be computationally cheap (for example,
combinatorial searches), but it does mean that many of them, such as
GNG-like recording of experience and simple index-based matching, can scale
relatively well in a sparsely connected world.
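
To illustrate that last point (again my own sketch, not anything from the
paper or from Novamente): if each recorded experience is a small set of
active features drawn from an astronomically large feature space, an
inverted index makes matching cost depend only on the handful of features a
query actually activates, not on the nominal dimensionality of the space.

from collections import defaultdict, Counter

class SparseIndex:
    def __init__(self):
        self.postings = defaultdict(set)  # feature -> ids of items having it
        self.items = {}                   # id -> frozenset of features

    def add(self, item_id, features):
        self.items[item_id] = frozenset(features)
        for f in features:
            self.postings[f].add(item_id)

    def most_similar(self, features, k=5):
        # touch only the query's active dimensions
        overlap = Counter()
        for f in features:
            for item_id in self.postings.get(f, ()):
                overlap[item_id] += 1
        return overlap.most_common(k)

index = SparseIndex()
index.add('memory-1', {'red', 'round', 'edible'})
index.add('memory-2', {'red', 'square'})
print(index.most_similar({'red', 'round'}))  # [('memory-1', 2), ('memory-2', 1)]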

That is important, for those with the vision to understand.

Ed Porter

-----Original Message-----
From: Benjamin Goertzel [mailto:[EMAIL PROTECTED] 
Sent: Tuesday, December 04, 2007 8:59 PM
To: agi@v2.listbox.com
Subject: Re: Hacker intelligence level [WAS Re: [agi] Funding AGI research]

> Thus: building a NL parser, no matter how good it is, is of no use
> whatsoever unless it can be shown to emerge from (or at least fit with)
> a learning mechanism that allows the system itself to generate its own
> understanding (or, at least, acquisition) of grammar IN THE CONTEXT OF A
> MECHANISM THAT ALSO ACCOMPLISHES REAL UNDERSTANDING. When that larger
> issue is dealt with, a NL parser will arise naturally, and any previous
> work on non-developmental, hand-built parsers will be completely
> discarded. You were trumpeting the importance of work that I know will
> be thrown away later, and in the mean time will be of no help in
> resolving the important issues.

Richard, you discount the possibility that said NL parser will play a key
role in the adaptive emergence of a system that can generate its own
linguistic understanding.  I.e., you discount the possibility that, with the
right learning mechanism and instructional environment, hand-coded
rules may serve as part of the initial seed for a learning process that will
eventually generate knowledge obsoleting these initial hand-coded
rules.

It's fine that you discount this possibility -- I just want to point out
that
in doing so, you are making a bold and unsupported theoretical hypothesis,
rather than stating an obvious or demonstrated fact.

Vaguely similarly, the "grammar" of child language is largely thrown
away in adulthood, yet it was useful as scaffolding in leading to the
emergence of adult language.

-- Ben G

-----
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?member_id=8660244&id_secret=72148682-ee7a63
