Matt,

You've stated that any knowledge that can be demonstrated verbally CAN
in principle be taught verbally.  I don't agree that this is
necessarily true for EVERY learning system, but that's not the point I
want to argue.

My larger point is that even if this is true, it doesn't imply that
this is how humans do it.  So, if a human has learned a verbal
behavior, and has been exposed to 1 GB of text, it does not follow
that said human learned said behavior from said text.

In fact there is much evidence that this is NOT the case -- this is
what the whole literature on "symbol grounding" is about.  Humans
happen to learn a lot of their verbal behaviors based on non-verbal
stimuli and actions.  But this is not to say that some other AI
system couldn't learn to IMITATE human verbal behaviors based only on
studying those behaviors, of course.

IMO, focusing AI narrowly on text processing is a bad direction for
near-term AGI research.  I think that focusing on symbol grounding and
perception/action/cognition integration is a better approach.  But
this "better approach" is not likely, in the immediate term, to be the
best approach to excelling at the Hutter Prize task -- which gets back
to my point that seeking to win the Hutter Prize is probably not a
good guide for near-term AGI development.

-- Ben G


On 8/13/06, Matt Mahoney <[EMAIL PROTECTED]> wrote:

I will try to answer several posts here.

First, I said that there is no knowledge that you can demonstrate verbally
that cannot also be learned verbally.  For simple cases, this is easy to
show.  If you test for knowledge X by asking question Q, expecting answer A,
then you can train a machine "the answer to Q is A".  I realize that in many
practical cases there could be many possible questions about X, and you
can't anticipate them all.  In other words, X could be a procedure or
algorithm for generating answers from an intractably large set of questions.
For example, X could be the rules for addition or playing chess.  In this
case, you could train the machine by giving it the algorithm in the form of
natural language text (here is how you play chess...).
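
As a toy sketch of the two cases (the question formats and the tiny
parser below are invented for illustration, not part of any actual
system):

  # Case 1: finite knowledge, trainable as explicit Q->A pairs.
  qa_pairs = {
      "What color is my hair?": "brown",
  }

  # Case 2: the knowledge is an algorithm (e.g. the rules of addition),
  # taught once as a rule instead of as an intractably large table.
  def answer(question: str) -> str:
      if question in qa_pairs:
          return qa_pairs[question]
      # "Here is how you add..." -- one rule covers every question
      # of the form "What is a + b?".
      if question.startswith("What is ") and "+" in question:
          a, b = question[len("What is "):].rstrip("?").split("+")
          return str(int(a) + int(b))
      return "I don't know."

  print(answer("What color is my hair?"))  # brown
  print(answer("What is 123 + 456?"))      # 579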

Humans possess a lot of knowledge that cannot be demonstrated verbally.
Examples: how to ride a bicycle, how to catch a ball, what a banana tastes
like, what my face looks like.  The English language is inadequate to convey
such knowledge fully, although some partial knowledge transfer is possible
(I have brown hair).  Now try to think of questions to test for the parts of
the knowledge that cannot be conveyed verbally.  Sure, you could ask what
color my hair is.  Try to ask a question about knowledge that cannot be
conveyed verbally to the machine at all.  If you can't convey this knowledge
to the machine, it can't convey it to you.

An important question is: how much information does a machine need to pass
the Turing test?  The machine only needs knowledge that can be verbally
tested.  Information theory says that this quantity cannot exceed the
entropy of the training data plus the algorithmic complexity (length of the
program) of the machine prior to training.  From my argument above, all of
the training data can be in the form of text.  I estimate that the average
adult has been exposed to about 1 GB of speech (transcribed) and writing
since birth.  This is why I chose 1 GB for the large text benchmark.  I do
not claim that the Wikipedia data is the *right* text to train an AI system,
but I think it is the right amount, and I believe that the algorithms we
would use on the right training set would be very similar to the ones we
would use on this training set.
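
As a rough sanity check on that 1 GB figure (the rates below are round
numbers I am assuming for illustration, not measurements):

  # Back-of-envelope estimate of lifetime language exposure.
  # All rates are assumed round numbers.
  words_per_minute = 150   # assumed rate of heard or read language
  hours_per_day = 4        # assumed daily exposure to speech and text
  years = 20
  bytes_per_word = 6       # about five letters plus a space

  words = words_per_minute * 60 * hours_per_day * 365 * years
  print(words * bytes_per_word / 1e9, "GB")  # ~1.6 GB, same order as 1 GB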


Second, on lossy vs. lossless compression.  It would be a good demonstration
of AI if we could compress text using lossy techniques and uncompress to
different text that had the same meaning.  We can already do this at a
simple level, e.g. swapping spaces and linefeeds, or substituting synonyms,
or swapping the order of articles.  We can't yet do this in the more
conceptual way that humans could, but I think that a lossless model could
demonstrate this capability.  For example, an AI-level language model would
recognize the similarity of "I ate a Big Mac" and "I ate at McDonalds" by
compressing the concatenated pair of strings to a size only slightly larger
than either string compressed by itself.  This ability could then be used to
generate conceptually similar strings (in O(n) time as I described earlier).
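
The mechanics of that test can be sketched with an off-the-shelf
compressor (zlib here is far too weak a model to detect the conceptual
similarity described above; the sketch only shows the measurement
itself):

  import zlib

  def C(s: bytes) -> int:
      # Compressed size in bytes under a stand-in model (zlib level 9).
      return len(zlib.compress(s, 9))

  def joint_overhead(x: bytes, y: bytes) -> int:
      # An AI-level model would compress x+y to only slightly more than
      # max(C(x), C(y)) when x and y mean roughly the same thing.
      return C(x + y) - max(C(x), C(y))

  x = b"I ate a Big Mac"
  y = b"I ate at McDonalds"
  print(C(x), C(y), C(x + y), joint_overhead(x, y))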


Third, on AIXI: its optimality is a mathematically proven result, so there
is no need to test it experimentally.  The purpose of the Hutter prize is to
encourage research in human intelligence with regard to verbally expressible
knowledge, not the more general case.  The general case is known to be
undecidable, or at least intractable in environments controlled by a finite
state machine.

AIXI requires the assumption that the environment be computable by a Turing
machine.  I think this is reasonable.  People actually do behave like
rational agents.  If they didn't, we would not have Occam's razor.

Here is an example: you draw 100 marbles from an urn.  All of them are red.
What do you predict will be the color of the next marble?  Answer this way:
what is the shortest program you could write that outputs 101 words, where
the first 100 are "red"?
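
Spelled out as literal programs (Python standing in for "a program";
the character counts are only suggestive of program length):

  # Hypothesis A: every marble is red.
  print("red\n" * 101)

  # Hypothesis B: the 101st marble is blue.  This program must encode
  # the exception, so it is longer; the shortest-program criterion
  # therefore predicts red.
  print("red\n" * 100 + "blue")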


Fourth, a program that downloads the Wikipedia benchmark violates the rules
of the prize.  The decompressor must run on a computer without a network
connection.  Rules are here:
http://cs.fit.edu/~mmahoney/compression/textrules.html
 -- Matt Mahoney, [EMAIL PROTECTED]
