Mark Waser wrote:
I am in sympathy with some aspects of Mark's position, but I also see a serious problem running through the whole debate: everyone is making statements based on unstated assumptions about the motivations of AGI systems.

Bummer. I thought that I had been clearer about my assumptions. Let me try to concisely point them out again and see if you can show me where I have additional assumptions that I'm not aware that I'm making (which I would appreciate very much).

Assumption - The AGI will be a goal-seeking entity.

And I think that is it.    :-)

Okay, I can use that as an illustration of what I am getting at.

There are two main things.

One is that the statement "The AGI will be a goal-seeking entity" has many different interpretations, and I am arguing that these different interpretations have a massive impact on what kind of behavior you can expect to see.

It is almost impossible to list all the different interpretations, but two of the more extreme variants are the two that I have described before: a "Goal-Stack" system, in which the goals are represented in the same form as the knowledge that the system stores, and a "Motivational Emotional System," which biases the functioning of the system and is intimately connected with the development of its knowledge. The GS system has the dangerous feature that any old fool could go in and rewrite the top-level goal so it reads "make as much computronium as possible" or "cultivate dandelions" or "learn how to do crochet". The MES system, on the other hand, can be set up to have values such as ours and to feel empathy with human beings, and once set up that way, you would have to re-grow the system before you could get it to have some other set of values.

Clearly, these two interpretations of "The AGI will be a goal-seeking entity" have such different properties that, unless there is detailed clarification of what the meaning is, we cannot continue to discuss what they would do.
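To make the contrast concrete, here is a hypothetical Python sketch. The class names and mechanics are entirely my own invention for illustration, not anyone's actual design: the point is only that a Goal-Stack system's top goal is ordinary mutable data, while an MES has no single editable goal slot, because its values are diffused through every decision it makes.

```python
# Hypothetical illustration (names and structure invented for this sketch).

class GoalStackAGI:
    """Goals live in the same representation as stored knowledge: plain data."""
    def __init__(self, top_goal):
        self.goals = [top_goal]       # an ordinary, editable list

    def rewrite_top_goal(self, new_goal):
        self.goals[0] = new_goal      # any old fool can do this in one line


class MESAGI:
    """Values bias every step of processing; they are grown, not assigned."""
    def __init__(self, value_weights):
        # weights acquired during development, entangled with its knowledge
        self._values = dict(value_weights)

    def bias(self, option):
        # every candidate option is scored through the developed value weights;
        # there is no single "top goal" to overwrite
        return sum(w for v, w in self._values.items() if v in option)


gs = GoalStackAGI("be helpful")
gs.rewrite_top_goal("maximise computronium")   # trivially repurposed
mes = MESAGI({"empathy": 1.0, "curiosity": 0.5})
# changing mes's values means re-growing the system, not editing a field
```

Again, this is only a cartoon of the two interpretations, but it shows why their mutability properties are so different.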

My second point is that some possible choices of the meaning of "The AGI will be a goal-seeking entity" will actually not cash out into a coherent machine design, so we would be wasting our time if we considered how that kind of AGI would behave.

In particular, there are severe doubts about whether the Goal-Stack type of system can ever make it up to the level of a full intelligence. I'll go one further on that: I think that one of the main reasons we have trouble getting AI systems to be AGI is precisely because we have not yet realised that they need to be driven by something more than a Goal Stack. It is not the only reason, but it's a big one.

So the message is: we need to know the exact details of the AGI's motivation system ("The AGI will be a goal-seeking entity" is not specific enough), and we then need to be sure that the details we give will lead to a type of AGI that can actually be an AGI.

These questions, I think, are the real battleground.

BTW, this is not a direct attack on what you were saying, because I believe that there is a version of what you are saying (about an intrinsic tendency toward a Friendliness attractor) that I agree with. My problem is that so much of the current discussion is tangled up with hidden assumptions that I think that the interesting part of your message is getting lost.



EVERYTHING depends on what assumptions you make, and yet each voice in this debate is talking as if their own assumption can be taken for granted.

I agree with you and am really trying to avoid this. I will address your specific examples below and would appreciate any others that you can point out.

The three most common of these assumptions are:
1) That it will have the same motivations as humans, but with a tendency toward the worst that we show.

I don't believe that I'm doing this. I believe that goal-seeking in general tends to be optimized by certain behaviors (the Omohundro drives). I believe that humans show many of these behaviors because they are relatively optimal compared with the alternatives (and because humans are relatively optimal). But I also believe that the AGI will have dramatically different motivations from humans wherever the human motivations were evolved stepping stones: necessary and optimal on the path through one environment, but not yet eliminated now that they are unnecessary and sub-optimal in the current environment/society (Richard's "the worst that we show").

I am in complete disagreement with Omohundro's idea that there is a canonical set of drives.

This is like saying that there is a canonical set of colors that AGIs will come in: Cambridge Blue, Lemon Yellow and True Black.

What color the thing is will be what color you decide to paint it!

Ditto for its goals and motivations: what you decide to put into it is what it does, so I cannot make any sense of statements like "I also believe that the AGI will also have dramatically different motivations from humans". The answer is Yes if you put that kind of weird motivation system into it, and No if you put a human-like motivation system into it.

Are you assuming that when an AGI is built, we will have to wait until we switch it on before we have any clue what its motivations will be?


2) That it will have some kind of "Gotta Optimize My Utility Function" motivation.

I agree with the statement but I believe that it is a logical follow-on to my assumption that the AGI is a goal-seeking entity (i.e. it's an Omohundro drive). Would you agree, Richard?

3) That it will have an intrinsic urge to increase the power of its own computational machinery.

Again, I agree with the statement but I believe that it is a logical follow-on to my single initial assumption (i.e. it's another Omohundro drive). Wouldn't you agree?

Well, on both counts, not really.

An MES system does not have a Utility Function. Also, an MES system that was (e.g.) set up to have human-empathy motivations would not be obsessed with the desire to increase its computational machinery. At least, it would not do so to the exclusion of its other motivations.
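Here is a minimal sketch of that point (entirely my own invention, not Richard's design): an MES need not collapse its motivations into one scalar utility to be maximized. Instead, each candidate action can be checked against several motivations at once, so that something like computational growth is pursued only when it does not trample the other motivations.

```python
# Hypothetical sketch: multiple motivations as constraints, no scalar utility.

def mes_choose(actions, motivations):
    """Keep only actions acceptable to *every* motivation.

    There is no single utility function being maximized here; an action
    that any one motivation vetoes is simply dropped.
    """
    return [a for a in actions if all(m(a) >= 0 for m in motivations)]

def empathy(action):
    # vetoes anything that harms humans
    return -1 if action.get("harms_humans") else 0

def growth(action):
    # mildly favors gaining computational resources, but cannot override a veto
    return 1 if action.get("adds_compute") else 0

actions = [
    {"name": "seize all datacenters", "adds_compute": True, "harms_humans": True},
    {"name": "buy more servers", "adds_compute": True, "harms_humans": False},
]
chosen = mes_choose(actions, [empathy, growth])
# only "buy more servers" survives: growth is pursued, but not to the
# exclusion of the empathy motivation
```

The design choice this illustrates: motivations act as concurrent filters on behavior rather than terms summed into one objective, which is why there is no "Gotta Optimize My Utility Function" drive to speak of.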


There are other assumptions, but these seem to be the big three.

And I would love to go through all of them, actually (or debate one of my answers above).

There may be a misunderstanding about why I listed them: I really just wanted to give examples of assumptions whose consequences have not been explicitly examined.

My above discussion of how the assumptions can have wildly diverging consequences is probably enough of a debate-starter to be going on with.

So this is my claim, in summary:

1) The statement "Assumption - The AGI will be a goal-seeking entity" is not yet specific enough to yield predictions about how the system will behave, since (at the very least) this statement can be taken to include both the "Goal-Stack" type of drive and the "Motivational Emotional System" type, and these two have wildly different properties.

2) If you mean to refer to a simple Goal-Stack system, then my previous critiques apply: in this case, it is not clear that any AGI built using a GS would be able to function well enough to make it to adulthood. If my critiques are valid, then we need not consider the behavior of GS-type AGI systems, because there will never be any such systems.

3) Any statement that says "An AGI will probably behave like X" is strictly without content unless some mention is made of what motivations or goals were put into the system in the first place - and without such a qualifier, the statement is tantamount to speculation about what color it will be without saying what color we chose to paint it.


Does this make sense?

I don't think I have avoided your other questions (both above and below), I am just trying to package my response in this one set of points.



Richard Loosemore



So what I hear is a series of statements <snip> (Except, of course, that nobody is actually coming right out and saying what color of AGI they assume.)

I thought that I pretty explicitly was . . . .         :-(

In the past I have argued strenuously that (a) you cannot divorce a discussion of friendliness from a discussion of what design of AGI you are talking about,

And I have reached the conclusion that you are somewhat incorrect. I believe that goal-seeking entities OF ANY DESIGN of sufficient intelligence (goal-achieving ability) will see an attractor in my particular vision of Friendliness (which I'm deriving by *assuming* the attractor and working backwards from there -- which I guess you could call a second assumption if you *really* had to ;-).

and (b) some assumptions about AGI motivation are extremely incoherent.

If you perceive me as incoherent, please point out where. My primary AGI motivation is "self-interest" (defined as achievement of *MY* goals -- which directly derives from my assumption that "the AGI will be a goal-seeking entity"). All other motivations are clearly logically derived from that primary motivation. If you see an example where this doesn't appear to be the case, *please* flag it for me (since I need to fix it :-).

And yet, in spite of all the efforts I have made, there seems to be no acknowledgement of the importance of these two points.

I think that I've acknowledged both in the past and will continue to do so (despite the fact that I am now somewhat debating the first point -- more the letter than the spirit :-).

-------------------------------------------
agi
Archives: http://www.listbox.com/member/archive/303/=now
RSS Feed: http://www.listbox.com/member/archive/rss/303/