Sorry if I did not make clear the distinction between knowing the learning 
algorithm for AGI (which we can do) and knowing what was learned (which we 
can't).

My point about Google is to illustrate that distinction.  The Google database 
is about 10^14 bits.  (It keeps a copy of the searchable part of the Internet 
in RAM.)  The algorithm is deterministic.  You could, in principle, model the 
Google server in a more powerful machine and use it to predict the result of a 
search.  But where does this get you?  You can't predict the result of the 
simulation any more than you could predict the result of the query you are 
simulating.  In practice the human brain has finite limits just like any other 
computer.
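
To make this concrete, here is a toy sketch (Python; the two-document index 
and the ranking rule are made up, not Google's).  The point is that 
"predicting" the output means executing the identical computation, so the 
simulation is exactly as expensive as the query it simulates.

    # A toy deterministic search engine.  The index and ranking are
    # hypothetical stand-ins for Google's database and algorithm.
    INDEX = {
        "agi": ["paper1.html", "blog2.html"],
        "symbol": ["blog2.html", "paper3.html"],
    }

    def search(query):
        # Deterministic: same index + same query => same ranked results.
        scores = {}
        for term in query.lower().split():
            for rank, doc in enumerate(INDEX.get(term, [])):
                scores[doc] = scores.get(doc, 0) + 1.0 / (rank + 1)
        return sorted(scores, key=lambda d: -scores[d])

    # The only way to know the answer is to run the algorithm itself.
    assert search("agi symbol") == search("agi symbol")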

My point about AGI is that constructing an internal representation that allows 
debugging the learned knowledge is pointless.  A more powerful AGI could do it, 
but you can't.  You can't do any better than to manipulate the input and 
observe the output.  If you tell your robot to do something and it sits in a 
corner instead, you can't do any better than to ask it why, hope for a sensible 
answer, and retrain it.  Trying to debug the reasoning for its behavior would 
be like trying to understand why a driver made a left turn by examining the 
neural firing patterns in the driver's brain.
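
In other words, the best available tool is behavioral testing.  A minimal 
sketch of what that looks like, treating the system as an opaque function 
(the stub opaque_agent below is hypothetical, of course):

    # Black-box probing: vary the input, observe the output.
    def opaque_agent(command):
        # Internals unknown to the tester; this stub misbehaves on one
        # phrasing just to give the probe something to find.
        return "sits in corner" if "fetch" in command else "complies"

    for command in ["fetch the ball", "bring me the ball", "come here"]:
        print(command, "->", opaque_agent(command))
    # The input/output pairs tell you *what* triggers the failure, never
    # *why* -- the analog of asking the robot and retraining it.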
 
-- Matt Mahoney, [EMAIL PROTECTED]

----- Original Message ----
From: Mark Waser <[EMAIL PROTECTED]>
To: [email protected]
Sent: Wednesday, November 15, 2006 9:39:14 AM
Subject: Re: [agi] A question on the symbol-system hypothesis

Mark Waser wrote:
>> Given sufficient time, anything should be able to be understood and 
>> debugged.
>> Give me *one* counter-example to the above . . . .
Matt Mahoney replied:
> Google.  You cannot predict the results of a search.  It does not help 
> that you have full access to the Internet.  It would not help even if 
> Google gave you full access to their server.

This is simply not correct.  Google uses a single deterministic algorithm 
against a database to determine what results it returns.  As long as you 
don't update the database, the same query will return exactly the same 
results, and, with knowledge of the algorithm, working through the database 
manually will produce those same results.

Full access to the Internet is a red herring.  Access to Google's database 
at the time of the query will give the precise answer.  This is also 
exactly analogous to an AGI, since access to the AGI's internal state will 
explain the AGI's decision (with appropriate caveats for systems that 
deliberately introduce randomness -- i.e. when the probability is 60/40, the 
AGI flips a weighted coin -- but even in those cases, the answer will still 
be of the form "the AGI ended up with a 60% probability of X and 40% 
probability of Y, and the weighted coin landed on the 40% side").

>> When we build AGI, we will understand it the way we understand Google. 
>> We know how a search engine works.  We will understand how learning 
>> works.  But we will not be able to predict or control what we build, even 
>> if we poke inside.

I agree with your first three statements but again, the fourth is simply not 
correct (as well as a blatant invitation to UFAI).  Google currently 
exercises numerous forms of control over their search engine.  It is known 
that they do successfully exclude sites (for visibly trying to game 
PageRank, etc.).  They constantly tweak their algorithms to change/improve 
the behavior and results.  Note also that there is a huge difference between 
saying that something is/can be exactly controlled (or able to be exactly 
predicted without knowing its exact internal state) and that something's 
behavior is bounded (i.e. that you can be sure that something *won't* 
happen -- like all of the air in a room suddenly deciding to occupy only 
half the room).  No complex and immense system is precisely controlled but 
many complex and immense systems are easily bounded.
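
One way to see that distinction in code (a hypothetical guard, not anyone's 
actual safety mechanism): we cannot predict which action the system picks, 
but we can guarantee that certain actions never escape.

    import random

    ALLOWED = {"move", "speak", "wait"}   # hypothetical action whitelist

    def unpredictable_policy():
        # Stand-in for a complex system whose choice we cannot predict...
        return random.choice(["move", "speak", "wait", "open_the_vault"])

    def bounded_step():
        action = unpredictable_policy()
        # ...but a hard outer bound keeps forbidden actions from ever
        # occurring, without controlling the inner choice at all.
        return action if action in ALLOWED else "wait"

    assert all(bounded_step() in ALLOWED for _ in range(1000))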

----- Original Message ----- 
From: "Matt Mahoney" <[EMAIL PROTECTED]>
To: <[email protected]>
Sent: Tuesday, November 14, 2006 10:34 PM
Subject: Re: [agi] A question on the symbol-system hypothesis


I will try to answer several posts here.  I said that the knowledge base of 
an AGI must be opaque because it has 10^9 bits of information, which is more 
than a person can comprehend.  By opaque, I mean that you can't do any 
better by examining or modifying the internal representation than you could 
by examining or modifying the training data.  For a text-based AI with 
natural language ability, the 10^9 bits of training data would be about a 
gigabyte of text (at roughly one bit of information per character, Shannon's 
estimate for English), or about 1000 books.  Of course you can sample it, add to it, 
edit it, search it, run various tests on it, and so on.  What you can't do 
is read, write, or know all of it.  There is no internal representation that 
you could convert it to that would allow you to do these things, because you 
still have 10^9 bits of information.  It is a limitation of the human brain 
that it can't store more information than this.
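
The arithmetic behind that figure, with the per-book size as an assumption:

    bits = 10**9
    bits_per_char = 1          # Shannon's ~1 bit/character for English
    chars = bits // bits_per_char
    chars_per_book = 10**6     # assume ~1 MB of text per book
    print(f"{chars:.0e} characters, ~{chars // chars_per_book} books")
    # ~1e9 characters, i.e. about a gigabyte of text, or ~1000 books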

It doesn't matter if you agree with the number 10^9 or not.  Whatever the 
number, either the AGI stores less information than the brain, in which case 
it is not AGI, or it stores more, in which case you can't know everything it 
does.


Mark Waser wrote:

> I certainly don't buy the "mystical" approach that says that sufficiently
> large neural nets will come up with sufficiently complex discoveries that
> we can't understand them.



James Ratcliff wrote:

> Having looked at the neural network type AI algorithms, I don't see any
> fathomable way that that type of architecture could create a full AGI by
> itself.



Nobody has created an AGI yet.  Currently the only working model of 
intelligence we have is based on neural networks.  Just because we can't 
understand it doesn't mean it is wrong.

James Ratcliff wrote:

> Also it is a critical task for expert systems to explain why they are
> doing what they are doing, and for business applications, I for one am
> not going to blindly trust what the AI says without a little background.

I expect this ability to be part of a natural language model.  However, any 
explanation will be based on the language model, not the internal workings 
of the knowledge representation.  That remains opaque.  For example:

Q: Why did you turn left here?
A: Because I need gas.

There is no need to explain that there is an opening in the traffic, that 
you can see a place where you can turn left without going off the road, that 
the gas gauge reads "E", and that you learned that turning the steering 
wheel counterclockwise makes the car turn left, even though all of this is 
part of the thought process.  The language model is responsible for knowing 
that you already know this.  There is no need either (or even the ability) 
to explain the sequence of neuron firings from your eyes to your arm 
muscles.
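
One way to picture that division of labor (everything here is hypothetical, 
just to show the shape of it): the explainer reports the highest-level fact 
the listener is missing, and the rest of the trace stays unreported.

    def explain(thoughts, listener_knows):
        # Report only what is novel; the full chain stays internal.
        for fact in thoughts:          # ordered from goal downward
            if fact not in listener_knows:
                return f"Because {fact}."
        return "You already know why."

    thoughts = ["I need gas", "there is a gap in the traffic",
                "the gauge reads E", "counterclockwise turns the wheel left"]
    print(explain(thoughts, listener_knows={
        "there is a gap in the traffic",
        "counterclockwise turns the wheel left"}))
    # -> Because I need gas.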

>and this is one of the requirements for the Project Halo contest (took and 
>passed the AP chemistry exam)
>http://www.projecthalo.com/halotempl.asp?cid=30

This is a perfect example of why a transparent KR does not scale.  The 
expert system described was coded from 70 pages of a chemistry textbook in 
28 person-months.  Assuming 1K bits per page, that is 70,000 bits encoded 
over roughly 270,000 working minutes -- a rate of about 4 minutes per bit, 
or some 2500 times slower than transmitting the same knowledge as natural 
language.
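
Checking that arithmetic (working hours per person-month and the bit rate 
of natural language are my assumptions):

    pages = 70
    bits_per_page = 1000                 # assumed above
    person_months = 28
    minutes = person_months * 160 * 60   # ~160 working hours per month
    min_per_bit = minutes / (pages * bits_per_page)
    print(f"{min_per_bit:.1f} minutes per bit")               # ~3.8

    nl_bits_per_sec = 10                 # rough rate for written language
    slowdown = min_per_bit * 60 * nl_bits_per_sec
    print(f"~{slowdown:.0f}x slower than natural language")   # ~2300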

Mark Waser wrote:
> Given sufficient time, anything should be able to be understood and
> debugged.
...
> Give me *one* counter-example to the above . . . .


Google.  You cannot predict the results of a search.  It does not help that 
you have full access to the Internet.  It would not help even if Google gave 
you full access to their server.

When we build AGI, we will understand it the way we understand Google.  We 
know how a search engine works.  We will understand how learning works.  But 
we will not be able to predict or control what we build, even if we poke 
inside.

-- Matt Mahoney, [EMAIL PROTECTED]





-----
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?list_id=303

