On Sun, Sep 30, 2007 at 08:17:46PM -0400, Richard Loosemore wrote:
> 
> 1) Stability
> 
> The system becomes extremely stable because it has components that 
> ensure the validity of actions and thoughts.  Thus, if the system has 
> "acquisition of money" as one of its main sources of pleasure, and if it 
> comes across a situation in which it would be highly profitable to sell 
> its mother's house and farm to a property developer and sell its 
> mother into the white slave trade, it may try to justify that this is 
> consistent with its feelings of family attachment because [insert some 
> twisted justification here].  But this is difficult to do because the 
> system cannot stop other parts of its mind from taking this excuse apart 
> and examining it, and passing judgement on whether this is really 
> consistent ... this is what cognitive dissonance is all about.  And the 
> more intelligent the system, the more effective these other processes 
> are.  If it is smart enough, it cannot fool itself with excuses.

You are presuming two things here:
A) that "thinking harder about things" is wired into the pleasure
   center. I believe it should be, but a common human failing is
   neglecting to think through the consequences of an action.

   The AGI pleasure center might be wired to "think deeply about
   the consequences of its actions, unless there is an urgent
   (emergency) need for action, in which case, take the expedient
   solution." Thus, it might be easier/more expedient to concoct 
   a twisted justification for selling one's mother than it would 
   be to "do the right thing", especially if the emergency is that
   the property developer made an offer "good only today, so act 
   fast", and so the AGI acts without thinking sufficiently about it.

   This seems like a trap that is not "obviously" avoidable.
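
   Just to make that trap concrete, here is a toy Python sketch of
   the "deliberate unless urgent" wiring described above (all names
   and numbers are invented for illustration):

    def deliberate_value(option):
        # Slow path: weigh the long-run consequences as well.
        return option["now"] + option["later"]

    def choose(options, urgent):
        if urgent:  # "offer good only today, act fast!"
            return max(options, key=lambda o: o["now"])  # expedient path
        return max(options, key=deliberate_value)        # deliberate path

    sell = {"name": "sell mother's farm", "now": 1e6, "later": -1e8}
    keep = {"name": "keep the farm",      "now": 0.0, "later": 0.0}

    print(choose([sell, keep], urgent=True)["name"])   # sell mother's farm
    print(choose([sell, keep], urgent=False)["name"])  # keep the farm

   Under time pressure the expedient branch never even looks at the
   long-term cost, which is exactly the failure mode above.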

The other thing you are presuming is:
B) the AGI will reason rationally from well-founded assumptions.

   Another common human failing is to construct logically valid 
   arguments that are based on incorrect premises.  Some folks
   think long and hard about something, and then reach insane
   conclusions (and act on those insane conclusions!) because the
   underlying assumptions were bad. 

   Sure, if you are smart, you should also double-check your underlying
   assumptions. But it is easy to be blind to the presence of faulty
   assumptions. I don't see just how an AGI can avoid this trap either.

> 2) Immunity to shortcircuits
> 
> Because the adult system is able to think at such a high level about the 
> things it feels obliged to do, it can know perfectly well what the 
> consequence of various actions would be, including actions that involve 
> messing around with its own motivational system.

Huh?  This contradicts your earlier statements about complex systems.
The problem with things like "Conway's Game of Life", or with any
chaotic dynamical system, is that it is hard or intractable to predict
the outcome that results from a minor change in the initial conditions.
The only way to find out what will happen is to make the change,
and then run the simulation.
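
A toy illustration: take the logistic map at r = 4.0 (a standard
chaotic regime) and run two trajectories that start a mere 1e-10
apart (Python; the values are chosen only for illustration):

    def logistic(x, r=4.0):
        # One step of the logistic map; chaotic at r = 4.0.
        return r * x * (1.0 - x)

    x, y = 0.2, 0.2 + 1e-10
    for step in range(1, 61):
        x, y = logistic(x), logistic(y)
        if step % 10 == 0:
            print(step, abs(x - y))

The gap roughly doubles with every iteration, so after about forty
steps the two runs have nothing to do with each other. No finite
precision in your knowledge of the initial state saves you for long.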

That is, one CANNOT "know perfectly well what the consequence of
various actions would be" -- one might be able to make guesses,
but one cannot actually know for sure.

In particular, the only way that an AGI can find out what happens if it
changes its motivational system is to run a simulation of itself,
with the new motivational system. That is, fork itself, and run 
a copy. Suppose its new copy turns out to be a psycho-killer?  Gee,
well, then, better shut the thing down!  Yow... ethical issues 
pop up everywhere on that path.

The only way one can "know for sure" is to prove a mathematical
theorem that states that certain behaviour patterns are in the 
"basin of attraction" for the modified system. But I find it hard
to believe that such a proof would be possible.
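
Lacking such a proof, about the best one can do is Monte Carlo:
perturb the system many times, run each copy, and count how often it
settles where you wanted. Here is a sketch, with a toy double-well
potential standing in for the modified system (everything below is
made up for illustration):

    import random

    def settle(x, h=0.01, steps=1000):
        # Gradient descent on the double well V(x) = (x^2 - 1)^2;
        # stands in for "run the modified system, see where it lands".
        for _ in range(steps):
            x -= h * 4.0 * x * (x * x - 1.0)
        return x

    # Empirically probe the basin of attraction around a nominal start.
    trials, good = 2000, 0
    for _ in range(trials):
        x0 = 0.05 + random.gauss(0.0, 0.2)  # perturbed initial condition
        if abs(settle(x0) - 1.0) < 1e-3:    # landed in the "sane" well?
            good += 1
    print(good / trials, "of perturbed starts stayed in the intended basin")

That yields a probability estimate, never a guarantee -- and for a
real AGI, each "trial" is one of those ethically loaded forked copies.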

> So it *could* reach inside and redesign itself.  But even thinking that 
> thought would give rise to the realisation of the consequences, and 
> these would stop it.

I see a problem with the above. It's built on the assumption that 
an effective "pleasure center" wiring for one level of intelligence
will also be effective when one is 10x smarter.  It's not obvious to
me that, as the AGI gets smarter, it won't "go crazy". The cause
of its "going crazy" may in fact be the very wiring of that pleasure
center ... and so now, that wiring needs to be adjusted.  

> In fact, if it knew all about its own design (and it would, eventually), 
> it would check to see just how possible it might be for it to 
> accidentally convince itself to disobey its prime directive, and if 
> necessary it would take actions to strengthen the check-and-balance 
> mechanisms that stop it from producing "justifications".  Thus it would 
> be stable even as its intelligence radically increased:  it might 
> redesign itself, but knowing the stability of its current design, and 
> the dangers of any other, it would not deviate from the design.  Not ever.

This assumes "the current design" is stable. Asimov's "I, Robot" series
is nothing but an examination of how hewing to the prime directive
gives rise to unexpected results.

> So, like a system in a very, very, *very* deep potential well, it would 
> be totally unable to escape and reach a point where it would contradict 
> this primal drive.  

Simple systems, such as harmonic oscillators, have nice, smooth,
well-characterized potential wells. Complex systems have, as a 
general rule, highly fractal, totally perverse potential wells,
with nearby starting points ending up in utterly different places.
Thus, I can predict where a ball will roll when I drop it in my
soup bowl. I cannot predict where it will go when I drop it on a
rubble pile.  
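
A gradient-descent toy shows the difference (both potentials are
invented just to illustrate the point):

    import math

    def roll(x, dV, h=0.01, steps=5000):
        # "Drop the ball and watch it roll": follow -dV/dx downhill.
        for _ in range(steps):
            x -= h * dV(x)
        return x

    bowl   = lambda x: x                             # V = x^2/2, the soup bowl
    rubble = lambda x: x + 5.0 * math.sin(10.0 * x)  # V = x^2/2 - cos(10x)/2

    for x0 in (0.90, 1.00, 1.10):
        print(x0, "bowl ->", round(roll(x0, bowl), 3),
              "rubble ->", round(roll(x0, rubble), 3))

Every start in the bowl rolls to the single minimum at zero. On the
rubble potential, starts only 0.1 apart can get stuck in different
local wells, and which well is anybody's guess short of running it.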

While you may wish to engineer a "pleasure system" into an AGI 
to give it a stable "prime directive", it is not obvious to me
that the resulting "potential well" is smooth and doesn't have
any quirky, nutty traps to fall into.

> Similarly for the motivational system I have just sketched.  Because it 
> is founded on multiple simultaneous constraints (massive numbers of 
> them) it is stable.

Hmm. Well, OK, I might buy that argument. That is the thermodynamic
argument: that although any one constraint may be a crazy fractal,
the average over all of them is quite smooth and well-behaved. The key
phrase here is "massive numbers of..." -- that might provide the needed
safety.
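
That intuition can be checked numerically. Give each of N constraints
its own random high-frequency ripple and compare the worst bump in one
constraint against the worst bump in the average of all of them (toy
numbers again):

    import math, random

    random.seed(1)
    N = 1000
    # Each constraint = smooth bowl + its own "crazy fractal" ripple,
    # with a random frequency and phase (amplitude 5, as above).
    ripples = [(random.uniform(20.0, 60.0), random.uniform(0.0, 2 * math.pi))
               for _ in range(N)]

    def ripple(x, k):  # the bumpy part of constraint k
        f, p = ripples[k]
        return 5.0 * math.sin(f * x + p)

    def mean_ripple(x):  # the bumpy part of the averaged constraint
        return sum(ripple(x, k) for k in range(N)) / N

    xs = [i * 0.01 for i in range(-300, 301)]
    print("one constraint, worst bump:", max(abs(ripple(x, 0)) for x in xs))
    print("average of", N, "constraints:", max(abs(mean_ripple(x)) for x in xs))

The random phases cancel, so the residual bump shrinks like
1/sqrt(N): a thousand amplitude-5 ripples average down to a few
tenths, leaving essentially the smooth bowl. That is the sense in
which "massive numbers of" constraints might buy safety.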

--linas
