> The knowledge base has high complexity. You can't debug it. You can examine
> it and edit it but you can't verify its correctness.
While the knowledge base is complex, I disagree with the way in which you're
attempting to use the first sentence. The knowledge base *isn't* so complex
that it causes a truly insoluble problem. The true problem is that the
knowledge base will be large enough, and will grow and change quickly enough,
that you can't maintain 100% control over its contents or even its integrity.
I disagree with the second sentence, though the third suggests it may just be
a matter of semantics. The question is what we mean by "debug". If you mean
remove all incorrect knowledge, then obviously we can't: odd sequences of
observed events and incomplete knowledge mean that globally incorrect
knowledge *is* the correct deduction from experience. On the other hand, we
certainly should be
able to debug how the knowledge base operates, make sure that it maintains an
acceptable degree of internal integrity, and verify that it responds correctly
when it detects a major integrity problem. The *process* and global behavior of the knowledge
base is what is important and it *can* be debugged. Minor mistakes and errors
are just the cost of being limited in an erratic world.
> An AGI with a correct learning algorithm might still behave badly.
No! An AGI with a correct learning algorithm may, through an odd sequence of
events and incomplete knowledge, come to an incorrect conclusion and take an
action that it would not have taken if it had perfect knowledge -- BUT -- this
is entirely correct behavior, not bad behavior. Calling it bad behavior
dramatically obscures what you are trying to do.
> You can't examine the knowledge base to find out why.
No, no, no, no, NO! If you (or the AI) can't go back through the causal chain
and explain exactly why an action was taken, then you have created an unsafe
AI. A given action depends upon a small part of the knowledge base (which may
then depend upon ever larger sections in an ongoing pyramid), and you can
debug an action and see what led to it (an action that you believe is
incorrect but the AI believes is correct).
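In code, the kind of causal-chain tracing I mean might look something like this (a toy sketch in Python; every name here is hypothetical, and a real knowledge base would obviously be vastly larger):

```python
# Toy sketch: every derived fact records which facts it was deduced from,
# so any action can be traced back through its causal chain on demand.
# All statements and names are illustrative, not any real system's data.

class Fact:
    def __init__(self, statement, supports=None):
        self.statement = statement
        self.supports = supports or []  # the facts this one was derived from

def explain(fact, depth=0):
    """Walk back through the causal chain that produced a fact."""
    lines = ["  " * depth + fact.statement]
    for parent in fact.supports:
        lines.extend(explain(parent, depth + 1))
    return lines

obs = Fact("observed: the stove glows red")
rule = Fact("learned: glowing red things are hot")
conclusion = Fact("deduced: the stove is hot", supports=[obs, rule])
action = Fact("action: do not touch the stove", supports=[conclusion])

print("\n".join(explain(action)))
```

The point is only that each conclusion carries pointers to its support, so "why did you do that?" is a mechanical walk backward, not an impossibility.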
> You can't manipulate the knowledge base data to fix it.
Bull. You should be able to track down a piece of incorrect knowledge that
led to an incorrect decision. You should be able to find the supporting
knowledge structures. If the knowledge is truly incorrect, you should be able
to provide evidence/experiences to the AI that lead it to correct the
incorrect knowledge (or you could even just tack the correct knowledge into
the knowledge base, fix it so that it temporarily can't be altered, and run
your integrity repair routines -- which, I contend, any AI that is going to go
anywhere must have).
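The pin-the-correction-then-repair idea can be sketched in a few lines (again a hypothetical toy, not any particular AGI's actual machinery: facts are flat dicts and "integrity repair" is a single contradiction sweep):

```python
# Toy sketch: insert corrected knowledge, mark it unalterable ("pinned"),
# and let an integrity pass retract unpinned facts that contradict it.

facts = [
    {"statement": "penguins can fly", "truth": True, "pinned": False},   # wrong
    {"statement": "penguins are birds", "truth": True, "pinned": False},
]

def pin_correction(facts, statement, truth):
    """Tack the corrected knowledge in and protect it from alteration."""
    facts.append({"statement": statement, "truth": truth, "pinned": True})

def repair_integrity(facts):
    """Where a pinned fact contradicts an unpinned one (same statement,
    opposite truth value), the unpinned entry is retracted."""
    pinned = {f["statement"]: f["truth"] for f in facts if f["pinned"]}
    return [f for f in facts
            if f["pinned"] or pinned.get(f["statement"], f["truth"]) == f["truth"]]

pin_correction(facts, "penguins can fly", False)
facts = repair_integrity(facts)
```

A real repair routine would of course also chase everything *derived* from the retracted fact, but the shape of the operation is the same.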
> At least you can't do these things any better than manipulating the inputs
> and observing the outputs.
No. I can find structures in the knowledge base and alter them. I would
prefer not to. I would strongly prefer that it take the form of a conversation
where I "ask" the AGI what its reasoning was, where it answers, where I point
out where I believe its knowledge is incorrect and provide proof, and where it
can then alter its own knowledge base appropriately.
> The reason is that the knowledge base is too complex. In theory you could do
> these things if you lived long enough, but you won't. For practical
> purposes, the AGI knowledge base is a black box.
No. I disagreed with your previous statement and I disagree with the reason.
The knowledge base is not that complex. It is that large. And the AI should
not be a black box *at all*. You should be able to examine any given piece to
any given level of detail at will -- you just can't hold all of it in mind at
once. Yes, that will lead to circumstances where it surprises you -- but we're
not looking for 100% predictability. We're looking for an intelligence with
*bounded* behavior.
> You need to design your goals, learning algorithm, data set and test program
> with this in mind.
Prove to me that the AGI knowledge base is a black box and I will. However,
you have already told me that I "can examine it and edit it" -- so what the
heck do *you* mean by a black box?
> Trying to build transparency into the data structure would be pointless.
> Information theory forbids it.
Bull, information theory does not forbid transparency into the data structure.
Prove this and you would invalidate a huge swath of AGI research. What makes
you say this? I believe that this is the core of your argument and would like
to see *any* sort of evidence/argument to support this grandiose claim.
> I am sure I won't convince you, so maybe you have a different explanation why
> 50 years of building structured knowledge bases has not worked, and what you
> think can be done about it?
Hmmm. Let's see . . . . Codd's paper was published in 1970, so the first
fifteen years were devoted to getting to that point. And SQL wasn't even
commercially available until Oracle was released in 1979, so we're down to only
about half that time. Cyc didn't start until 1984 after Machine Learning
started in the early 1980s. Many people took horrible detours into neural
networks while a lot of the rest were forced to constrain their systems by the
limited computing ability available to them (I remember spending thousands of
dollars per month running biochemistry simulations on a Vax that I was able to
easily run on a PC less than five years later). In the past twenty years,
people have continued to make financially-proven progress in applications like
genome databases.
It looks to me like 50 years of building structured knowledge bases has worked,
that we're getting better at it all the time, and that we can do more and more
as space and computing power get cheaper and cheaper and languages and
techniques get more and more powerful. What hasn't
worked *yet* is self-structuring databases and we're learning more all the time
. . . .
So *prove* to me why information theory forbids transparency of a knowledge
base.
Mark
P.S. Yes, yes, I've seen that Google article before where the author believes
that he "proves" that "Google is currently storing multiple copies of the
entire web in RAM." Numerous debunking articles have also come out including
the facts that Google does not store HTML code, that what is stored *is* stored
in compressed form, and --from Google, itself -- that it does *not* store
sections that do not include "new instances of significant terms". But I can
certainly understand your personal knowledge base deriving the "fact" that
"Google DOES keep the searchable part of the Internet in memory" if that
article is all that you've seen on the topic (though one would have hoped that
an integrity check or a reality check would have prompted further evaluation --
particularly since the article itself mentions that this would require an
unreasonably/impossibly large amount of RAM.)
----- Original Message -----
From: "Matt Mahoney" <[EMAIL PROTECTED]>
To: <[email protected]>
Sent: Wednesday, November 15, 2006 6:41 PM
Subject: Re: [agi] A question on the symbol-system hypothesis
Mark Waser wrote:
>Are you conceding that you can predict the results of a Google
search?
OK, you are right. You can type the same query twice. Or if you live long
enough you can do it the hard way. But you won't.
>Are you now conceding that it is not true that "Models that are simple enough
>to debug are too simple to scale."?
OK, you are right again. Plain text is a simple way to represent knowledge. I
can search and edit terabytes of it.
But this is not the point I wanted to make. I am sure I expressed it badly.
The point is there are two parts to AGI, a learning algorithm and a knowledge
base. The learning algorithm has low complexity. You can debug it, meaning
you can examine the internals to test it and verify it is working the way you
want. The knowledge base has high complexity. You can't debug it. You can
examine it and edit it but you can't verify its correctness.
An AGI with a correct learning algorithm might still behave badly. You can't
examine the knowledge base to find out why. You can't manipulate the knowledge
base data to fix it. At least you can't do these things any better than
manipulating the inputs and observing the outputs. The reason is that the
knowledge base is too complex. In theory you could do these things if you
lived long enough, but you won't. For practical purposes, the AGI knowledge
base is a black box. You need to design your goals, learning algorithm, data
set and test program with this in mind. Trying to build transparency into the
data structure would be pointless. Information theory forbids it. Opacity is
not advantageous or desirable. It is just unavoidable.
I am sure I won't convince you, so maybe you have a different explanation why
50 years of building structured knowledge bases has not worked, and what you
think can be done about it?
And Google DOES keep the searchable part of the Internet in memory
http://blog.topix.net/archives/000011.html
because they have enough hardware to do it.
http://en.wikipedia.org/wiki/Supercomputer#Quasi-supercomputing
-- Matt Mahoney, [EMAIL PROTECTED]
-----
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?list_id=303