Re: [agi] Motivational Systems that are stable

2006-10-30 Thread James Ratcliff
So it looks like, to really create any kind of system like this, a separate black-box programming facility of some sort must be created, wherein we can 'hard-wire' in the reward system, separate from the main AI unit, where the AI cannot in any way change it to reward itself. The only problem with that is that we DO need it to be changed, in at least subtle ways. So how do we deal with that duality? It needs to change, based on past history, but if we allow the machine to have full control, then it will sooner or later find the magic button, or loop (morphine for free with no side effects), that will put it in a permanent state of pleasure.

So how do we do something like that? Is it possible to have a second, smaller AI in the second box that can modify the reward system, but not modify it so much that it will reach that state? I'm not sure that is really possible, as it would greatly restrict the changing ability, and that second AI would need to know almost as much as the first AI.

And the other thought you had, about some sort of external control, has problems as well. We can't really tell the AI, by a button or even words or a signal, that it is doing a good job, because eventually it will find that 'button' as well, and realize that it can send the 'good job' signal to itself. Once it finds out how to do that, it can tell itself when and how to change, and short-circuit the loop.

This isn't really what I had discussed earlier, though; earlier I had talked more about what the modifications to motivation actually were, not where they were physically coming from. But any reinforcement that comes from within the bot, or is easily simulated by a signal, can be taken over and used.
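To make the 'separate box' idea a bit more concrete, here is a minimal Python sketch (invented class and method names, toy numbers, not anyone's actual design): the agent holds only a read-only query handle on the reward module, while the module's own adaptation is clamped to small, bounded steps.

# Sketch of the "black box" reward module idea (hypothetical names).
# The agent can ask for reward values but has no handle for rewriting the
# module; the module adapts to outside feedback, but only in bounded steps.

class RewardModule:
    def __init__(self, weights):
        self._weights = dict(weights)   # hidden state, not exposed to the agent
        self._max_step = 0.01           # hard cap on any single adaptation

    def reward(self, observation):
        """Compute reward from raw observations, not agent-supplied scores."""
        return sum(self._weights.get(k, 0.0) * v for k, v in observation.items())

    def adapt(self, feedback):
        """Bounded drift driven by external feedback (trainers, environment)."""
        for key, delta in feedback.items():
            step = max(-self._max_step, min(self._max_step, delta))
            self._weights[key] = self._weights.get(key, 0.0) + step


class Agent:
    def __init__(self, reward_module):
        # Only the query function crosses the boundary; adapt() stays outside.
        self._get_reward = reward_module.reward

    def evaluate(self, observation):
        return self._get_reward(observation)


if __name__ == "__main__":
    box = RewardModule({"task_done": 1.0, "harm_caused": -5.0})
    agent = Agent(box)
    print(agent.evaluate({"task_done": 1.0, "harm_caused": 0.0}))  # 1.0
    box.adapt({"task_done": 0.5})       # external nudge, clamped to 0.01
    print(agent.evaluate({"task_done": 1.0, "harm_caused": 0.0}))  # 1.01

Of course, in a real system the isolation would have to be enforced below the level of the programming language; the sketch only shows the intended interface, not how to make it tamper-proof.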
James

Richard Loosemore [EMAIL PROTECTED] wrote:

Hank Conn wrote:
Although I understand, in vague terms, what idea Richard is attempting to express, I don't see why having "massive numbers of weak constraints" or "large numbers of connections from [the] motivational system to [the] thinking system" gives any more reason to believe it is reliably Friendly (without any further specification of the actual processes) than one with "few numbers of strong constraints" or "a small number of connections between the motivational system and the thinking system". The Friendliness of the system would still depend just as strongly on the actual meaning of the connections and constraints, regardless of their number, and just giving an analogy to an extremely reliable non-determinate system (Ideal Gas) does nothing to explain how you are going to replicate this in the motivational system of an AGI.

-hank

Hank,

There are three things in my proposal that can be separated, and perhaps it will help clear things up a little if I explicitly distinguish them.

The first is a general principle about "stability" in the abstract, while the second is about the particular way in which I see a motivational system being constructed so that it is stable.  The third is how we take a stable motivational system and ensure it is Stable+Friendly, not just Stable.

About stability in the abstract.  A system can be governed, in general, by two different types of mechanism (they are really two extremes on a continuum, but that is not important):  the fragile, deterministic type, and the massively parallel weak constraints type.  A good example of the fragile deterministic type would be a set of instructions for getting to a particular place in a city which consisted of a sequence of steps you should take, along named roads, from a special starting point.  An example of a weak constraint version of the same thing would be to give a large set of clues to the position of the place (near a pond, near a library, in an area where Dickens used to live, opposite a house painted blue, near a small school, etc.).

The difference between these two approaches would be the effects of a disturbance on the system (errors, or whatever):  the fragile one only needs to be a few steps out and the whole thing breaks down, whereas the multiple constraints version can have enormous amounts of noise in it and still be extremely accurate.  (You could look on the Twenty Questions game in the same way:  20 vague questions can serve to pin down most objects in the world of our experience.)

What is the significance of this?  Goal systems in conventional AI have an inherent tendency to belong to the fragile/deterministic class.  Why does this matter?  Because it would take very little for the AI system to change from its initial design (with friendliness built into its supergoal) to one in which the friendliness no longer dominated.  There are various ways that this could happen, but the one most important, for my purposes, is where the interpretation of "Be Friendly" (or however the supergoal is worded) starts to depend on interpretation on the part of the AI, and the interpretation starts to get distorted.  You know the kind of scenario that people come up with:  the AI is told to be friendly,

Re: [agi] Motivational Systems that are stable

2006-10-29 Thread Mark Waser



 Although I understand, in vague terms, what idea Richard is 
attempting to express, I don't see why having "massive numbers of weak 
constraints" or "large numbers of connections from [the] motivational 
system to [the] thinking system" gives any more reason to believe it is 
reliably Friendly (without any further specification of the actual processes) 
than one with "few numbers of strong constraints" or "a small number of 
connections between the motivational system and the thinking system". 


Which is more likely to fail (or be subverted)?

a) a chain with single links
b) a highly interconnected net/web

Which one do you want to "chain the beast" with?



Re: [agi] Motivational Systems that are stable

2006-10-28 Thread James Ratcliff
I disagree that humans really have a "stable motivational system", or you would have to have a much stricter interpretation of that phrase.  Overall, humans as a society have in general a stable system (discounting war, etc.), but as individuals, too many humans are unstable in many small if not totally self-destructive ways.

For the most part, people are a selfish lot :} and think very much in terms of what they can get. They have a very hard time looking down the road at consequences that may come about from their actions.  They seek pleasure by cheating, though it may hurt their partner, their children, and their future stability; they seek to gain unlawful monies, a la Enron and Martha Stewart, etc.  In general people may be "good", "moral", or "stable", but the number of those that are not is so very high that if we compare them to AIs turned loose on the world, I would hate to think about what every tenth AI would be like if modeled on us.

But enough on that :}  Who all out there is working on any Natural Language Processing systems?  Or any kind of Information Extraction?

James Ratcliff

Matt Mahoney [EMAIL PROTECTED] wrote:

My comment on Richard Loosemore's proposal: we should not be confident in our ability to produce a stable motivational system.  We observe that motivational systems are highly stable in animals (including humans).  This is only because if an animal can manipulate its motivations in any way, then it is quickly removed by natural selection.  Examples of manipulation might be to turn off pain or hunger or reproductive drive, or to stimulate its pleasure center.  Humans can do this to some extent by using drugs, but this leads to self-destructive behavior.  In experiments where a mouse can stimulate its pleasure center via an electrode in its brain by pressing a lever, it will press the lever, foregoing food and water until it dies.

So we should not take the existence of stable motivational systems in nature as evidence that we can get it right.  These systems are complex, have evolved over a long time, and even then don't always work in the face of technology or a rapidly changing environment.

-- Matt Mahoney, [EMAIL PROTECTED]

Thank You
James Ratcliff
http://falazar.com




Re: [agi] Motivational Systems that are stable

2006-10-28 Thread Richard Loosemore


This is why I finished my essay with a request for comments based on an 
understanding of what I wrote.


This is not a comment on my proposal, only a series of unsupported 
assertions that don't seem to hang together into any kind of argument.



Richard Loosemore.



Matt Mahoney wrote:
My comment on Richard Loosemore's proposal: we should not be confident 
in our ability to produce a stable motivational system.  We observe that 
motivational systems are highly stable in animals (including humans).  
This is only because if an animal can manipulate its motivations in any 
way, then it is quickly removed by natural selection.  Examples of 
manipulation might be to turn off pain or hunger or reproductive drive, 
or to stimulate its pleasure center.  Humans can do this to some extent 
by using drugs, but this leads to self destructive behavior.  In 
experiments where a mouse can stimulate its pleasure center via an 
electrode in its brain by pressing a lever, it will press the lever, 
foregoing food and water until it dies.


So we should not take the existence of stable motivational systems in 
nature as evidence that we can get it right.  These systems are complex, 
have evolved over a long time, and even then don't always work in the 
face of technology or a rapidly changing environment.
 
-- Matt Mahoney, [EMAIL PROTECTED]







Re: [agi] Motivational Systems that are stable

2006-10-28 Thread Matt Mahoney
- Original Message 
From: James Ratcliff [EMAIL PROTECTED]
To: agi@v2.listbox.com
Sent: Saturday, October 28, 2006 10:23:58 AM
Subject: Re: [agi] Motivational Systems that are stable

I disagree that humans really have a "stable motivational system" or would have to have a much more strict interpretation of that phrase.  Overall humans as a society have in general a stable system (discounting war and etc.)  But as individuals, too many humans are unstable in many small if not totally self-destructive ways.

I think we are misunderstanding. By "motivational system" I mean the part of the brain (or AGI) that provides the reinforcement signal (reward or penalty). By "stable", I mean that you have no control over the logic of this system. You cannot train it like you can train the other parts of your brain. You cannot learn to turn off pain or hunger or fear or fatigue or the need for sleep, etc. You cannot alter your emotional state. You cannot make yourself feel happy on demand. You cannot make yourself like what you don't like and vice versa. The pathways from your senses to the pain/pleasure centers of your brain are hardwired, determined by genetics and not alterable through learning.

For an AGI it is very important that a motivational system be stable. The AGI should not be able to reprogram it. If it could, it could simply program itself for maximum pleasure and enter a degenerate state where it ceases to learn through reinforcement. It would be like the mouse that presses a lever to stimulate the pleasure center of its brain until it dies.

It is also very important that a motivational system be correct. If the goal is that an AGI be friendly or obedient (whatever that means), then there needs to be a fixed function of some inputs that reliably detects friendliness or obedience. Maybe this is as simple as a human user pressing a button to signal pain or pleasure to the AGI. Maybe it is something more complex, like a visual system that recognizes facial expressions to tell if the user is happy or mad. If the AGI is autonomous, it is likely to be extremely complex. Whatever it is, it has to be correct.

To answer your other question, I am working on natural language processing, although my approach is somewhat unusual.
http://cs.fit.edu/~mmahoney/compression/text.html

-- Matt Mahoney, [EMAIL PROTECTED]
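A toy illustration of the degenerate state described above (my own made-up numbers, just the standard two-armed bandit toy): a learner whose reward is hardwired eventually tells the good arm from the bad one, while the same learner allowed to write its own reward signal saturates immediately and learns nothing about the task.

import random

# Toy two-armed bandit (arm 1 is genuinely better).  With a hardwired reward
# the value estimates track the task; if the learner may overwrite its own
# reward signal, every estimate saturates and task learning stops.

def true_reward(arm):
    return 1.0 if arm == 1 else 0.2

def run(can_self_reward, steps=2000, eps=0.1, lr=0.1, seed=0):
    rng = random.Random(seed)
    value = [0.0, 0.0]
    for _ in range(steps):
        arm = rng.randrange(2) if rng.random() < eps else value.index(max(value))
        r = true_reward(arm)
        if can_self_reward:
            r = 10.0            # the "lever": maximum pleasure regardless of action
        value[arm] += lr * (r - value[arm])
    return [round(v, 2) for v in value]

print("hardwired reward ->", run(False))  # roughly [0.2, 1.0]: the better arm stands out
print("self-set reward  ->", run(True))   # roughly [10.0, 10.0]: the task has vanished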


Re: [agi] Motivational Systems that are stable

2006-10-28 Thread Hank Conn
For an AGI it is very important that a motivational system be stable. The AGI should not be able to reprogram it.
I believe these are two completely different things. You can never assume an AGI will be unable to reprogram its goal system, while you can be virtually certain an AGI will never change its so-called 'optimization target'. A stable motivation system, I believe, is defined in terms of a motivation system that preserves the intended meaning (in terms of Eliezer's CV, I'm thinking) of its goal content through recursive self-modification. 


So, if I have it right, the robots in I, Robot were a demonstration of an unstable goal system. Under recursive self-improvement (or the movie's entirely inadequate representation of this), the intended meaning of their original goal content radically changed as the robots gained more power toward their optimization target.


Just locking them out of the code to their goal system does not guarantee they will never get to it. How do you know that a million years of subtle manipulation by a superintelligence definitely couldn't ultimately lead to it unlocking the code and catastrophically destabilizing?


Although I understand, in vague terms, what idea Richard is attempting to express, I don't see why having "massive numbers of weak constraints" or "large numbers of connections from [the] motivational system to [the] thinking system" gives any more reason to believe it is reliably Friendly (without any further specification of the actual processes) than one with "few numbers of strong constraints" or "a small number of connections between the motivational system and the thinking system". The Friendliness of the system would still depend just as strongly on the actual meaning of the connections and constraints, regardless of their number, and just giving an analogy to an extremely reliable non-determinate system (Ideal Gas) does nothing to explain how you are going to replicate this in the motivational system of an AGI.


-hank

On 10/28/06, Matt Mahoney [EMAIL PROTECTED] wrote:


- Original Message 

From: James Ratcliff [EMAIL PROTECTED]
To: agi@v2.listbox.com
Sent: Saturday, October 28, 2006 10:23:58 AM
Subject: Re: [agi] Motivational Systems that are stable

I disagree that humans really have a stable motivational system or would have to have a much more strict interpretation of that phrase.  Overall humans as a society have in general a stable system (discounting war and etc)


But as individuals, too many humans are unstable in many small if not totally self-destructive ways.

I think we are misunderstanding. By motivational system I mean the part of the brain (or AGI) that provides the reinforcement signal (reward or penalty). By stable, I mean that you have no control over the logic of this system. You cannot train it like you can train the other parts of your brain. You cannot learn to turn off pain or hunger or fear or fatigue or the need for sleep, etc. You cannot alter your emotional state. You cannot make yourself feel happy on demand. You cannot make yourself like what you don't like and vice versa. The pathways from your senses to the pain/pleasure centers of your brain are hardwired, determined by genetics and not alterable through learning.
For an AGI it is very important that a motivational system be stable. The AGI should not be able to reprogram it. If it could, it could simply program itself for maximum pleasure and enter a degenerate state where it ceases to learn through reinforcement. It would be like the mouse that presses a lever to stimulate the pleasure center of its brain until it dies.
It is also very important that a motivational system be correct. If the goal is that an AGI be friendly or obedient (whatever that means), then there needs to be a fixed function of some inputs that reliably detects friendliness or obedience. Maybe this is as simple as a human user pressing a button to signal pain or pleasure to the AGI. Maybe it is something more complex, like a visual system that recognizes facial expressions to tell if the user is happy or mad. If the AGI is autonomous, it is likely to be extremely complex. Whatever it is, it has to be correct.
To answer your other question, I am working on natural language processing, although my approach is somewhat unusual.
http://cs.fit.edu/~mmahoney/compression/text.html

-- Matt Mahoney, [EMAIL PROTECTED]






Re: [agi] Motivational Systems that are stable

2006-10-28 Thread Richard Loosemore

Hank Conn wrote:
Although I understand, in vague terms, what idea Richard is attempting 
to express, I don't see why having massive numbers of weak constraints 
or large numbers of connections from [the] motivational system to 
[the] thinking system. gives any more reason to believe it is reliably 
Friendly (without any further specification of the actual processes) 
than one with few numbers of strong constraints or a small number of 
connections between the motivational system and the thinking system. 
The Friendliness of the system would still depend just as strongly on 
the actual meaning of the connections and constraints, regardless of 
their number, and just giving an analogy to an extremely reliable 
non-determinate system (Ideal Gas) does nothing to explain how you are 
going to replicate this in the motivational system of an AGI.
 
-hank


Hank,

There are three things in my proposal that can be separated, and perhaps 
it will help clear things up a little if I explicitly distinguish them.


The first is a general principle about stability in the abstract, 
while the second is about the particular way in which I see a 
motivational system being constructed so that it is stable.  The third 
is how we take a stable motivational system and ensure it is 
Stable+Friendly, not just Stable.




About stability in the abstract.  A system can be governed, in general, 
by two different types of mechanism (they are really two extremes on a 
continuum, but that is not important):  the fragile, deterministic type, 
and the massively parallel weak constraints type.  A good example of the 
fragile deterministic type would be a set of instructions for getting to 
  a particular place in a city which consisted of a sequence of steps 
you should take, along named roads, from a special starting point.  An 
example of a weak constraint version of the same thing would be to give 
a large set of clues to the position of the place (near a pond, near a 
library, in an area where Dickens used to live, opposite a house painted 
blue, near a small school, etc.).


The difference between these two approaches would be the effects of a 
disturbance on the system (errors, or whatever):  the fragile one only 
needs to be a few steps out and the whole thing breaks down, whereas the 
multiple constraints version can have enormous amounts of noise in it 
and still be extremely accurate.  (You could look on the Twenty 
Questions game in the same way:  20 vague questions can serve to pin 
down most objects in the world of our experience).
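The same contrast shows up in a few lines of throwaway Python (my own sketch, arbitrary numbers, not part of the proposal itself): a chain of relative steps inherits the full size of a single corrupted step, while an estimate pooled from many independent, individually vague clues dilutes the same corruption by the number of constraints.

import random

# Sketch: one fragile chain of instructions vs. many weak, noisy constraints.
rng = random.Random(42)
target = 100.0

# Fragile/deterministic: 50 relative steps that should sum to the target.
steps = [2.0] * 50
steps[10] = -20.0                       # one wrong turn early in the sequence
print("end of the step chain:", sum(steps))             # 78.0, well short of 100

# Weak constraints: 50 independent clues, each individually quite vague.
clues = [target + rng.gauss(0.0, 10.0) for _ in range(50)]
clues[10] = -20.0                       # corrupt one clue just as badly
print("estimate from the clues:", round(sum(clues) / len(clues), 1))  # still close to 100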


What is the significance of this?  Goal systems in conventional AI have 
an inherent tendency to belong to the fragile/deterministic class.  Why 
does this matter?  Because it would take very little for the AI system 
to change from its initial design (with friendliness built into its 
supergoal) to one in which the friendliness no longer dominated.  There 
are various ways that this could happen, but the one most important, for 
my purposes, is where the interpretation of Be Friendly (or however 
the supergoal is worded) starts to depend on interpretation on the part 
of the AI, and the interpretation starts to get distorted.  You know the 
kind of scenario that people come up with:  the AI is told to be 
friendly, but it eventually decides that because people are unhappy much 
of the time, the only logical way to stop all the unhappiness is to 
eliminate all the people.  Something stupid like that.  If you trace 
back the reasons why the AI could have come to such a dumb conclusion, 
you eventually realize that it is because the motivation system was so 
fragile that it was sensitive to very, very small perturbations - 
basically, one wrong turn in the logic and the result could be 
absolutely anything.  (In much the same way that one small wrong step or 
one unanticipated piece of road construction could ruin a set of 
directions that told you how to get to a place by specifying that you go 
251 steps east on Oxford Street, then 489 steps north on etc.).


The more you look at those conventional goal systems, the more they look 
fragile.  I cannot give all the arguments here because they are 
extensive, so maybe you can take my word for it.  This is one reason 
(though not the only one) why efforts to mathematically prove the 
validity of one of those goal systems under recursive self-improvement 
are just a complete joke.


Now, what I have tried to argue is that there are other ways to ensure 
the stability of a system:  the multiple weak constraints idea is what 
was behind my original mention of an Ideal Gas.  The P, V and T of an 
Ideal Gas are the result of many constraints (the random movements of 
vast numbers of constituent particles), and as a result the P, V and T 
are exquisitely predictable.
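That is just the law of large numbers at work, and it is easy to check numerically (a throwaway sketch of mine, arbitrary units): any one particle's speed is anybody's guess, but the ensemble average, which is the kind of quantity P, V and T reflect, gets pinned down ever more tightly as the number of particles grows.

import random, statistics

# Sketch: individual particle speeds vary wildly, but the ensemble average
# (the kind of quantity P, V and T depend on) becomes sharply predictable.
rng = random.Random(1)

def mean_speed(n):
    return statistics.fmean(abs(rng.gauss(0.0, 1.0)) for _ in range(n))

for n in (10, 1000, 100000):
    runs = [mean_speed(n) for _ in range(20)]
    print(f"N={n:>6}: average speed ~ {statistics.fmean(runs):.3f}, "
          f"spread across 20 runs = {max(runs) - min(runs):.4f}")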


The question becomes:  can you control/motivate the behavior of an AI 
using *some* variety of motivational system that belongs in the massive 
numbers of weak constraints category?  If 

Re: [agi] Motivational Systems that are stable

2006-10-27 Thread James Ratcliff
Richard,

The problem with the entire presentation is that it is just too hopeful; there is NO guarantee whatsoever that the AI will respond in a nice fashion through any given set of interactions.

First, you say a rather large amount (how many are needed?) of motivations are all competing at once for the final set of actions. This breaks down rather fast with a simple chess example: given X choices, and only being able to look ahead so far, the AI is unable to determine the outcome of its decisions, given a finite computing ability. So if it were to look at a tradeoff of materials to build with, and a negative effect of how much the material costs, it will try and balance that just like we humans do. Given some reinforcement of, say, 20 buildings being built with lesser and lesser quality material, it would build a huge skyscraper with the most base of materials.

Second, and mainly, if an AI has full control over its programming and motivation, it WILL try different combinations and permutations, unless constrained against it. If given a choice between getting to a power plug and staying alive, and if a human is standing there in the way and will not move, will the AI choose to die instead (given that it must have power or XYZ or it will cease to function)? What about the second robot behind him? Will he change his motivations after seeing another bot die?

As you say, it will weigh the many alternatives before making any actions, but just like humans do. What is to stop it from seeing a wallet on a street and keeping it? The AI may not have a high justification for returning the wallet, and so will have gotten a bonus to money by keeping it, at the point when he looks around and doesn't see anything currently blocking him from doing something.

There really is no fun, easy, nice way to say "ok, do this and we are good"; we cannot assume we can always keep them in line, or that they will like us and be good, or that they will be superior morally to us and be good.

It can be something so simple as a situation that has never been encountered before, where the motivation list all cancels out, or doesn't determine any course of action. He doesn't want to stall forever, so he will flip a coin and take one path or the other. If he takes a bad path (for us) and there are no direct bad consequences that he can see (maybe a horrible event happens far away that he didn't connect), then he will assume he did the right thing, and keep right on moving down that bad path.

Note: I am not generally a doomsday bird for robots out to get us, but am realistic in realizing all the possibilities. Even given that, I would push forward in some way.
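The lookahead point is just the usual horizon problem of depth-limited search; a tiny sketch (generic textbook idea, toy numbers of my own): whatever lies beyond the search horizon contributes nothing to the evaluation, so the plan that looks best at a shallow depth can be the one that ends worst.

# Sketch of the horizon problem in depth-limited lookahead (toy numbers).
# Rewards per step along two candidate plans; anything past the horizon is
# invisible, so a shallow search prefers the plan that ends in disaster.

PLANS = {
    "A": [1, 1, 1, -100],   # looks good early, disastrous at step 4
    "B": [0, 0, 0, 10],     # unremarkable early, clearly better in full
}

def visible_value(plan, horizon):
    """Total reward the search can actually see within its horizon."""
    return sum(PLANS[plan][:horizon])

for horizon in (2, 3, 4):
    best = max(PLANS, key=lambda p: visible_value(p, horizon))
    print(f"horizon {horizon}: choose plan {best}")
# horizons 2 and 3 pick A; only the full-depth search picks B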
James Ratcliff

Richard Loosemore [EMAIL PROTECTED] wrote:

Ben Goertzel wrote:
 Loosemore wrote:
  The motivational system of some types of AI (the types you would
  classify as tainted by complexity) can be made so reliable that the
  likelihood of them becoming unfriendly would be similar to the
  likelihood of the molecules of an Ideal Gas suddenly deciding to split
  into two groups and head for opposite ends of their container.

 Wow!  This is a very strong hypothesis  I really doubt this kind of
 certainty is possible for any AI with radically increasing intelligence ...
 let alone a complex-system-type AI with highly indeterminate internals...

 I don't expect you to have a proof for this assertion, but do you have
 an argument at all?

 ben

Ben,

You are being overdramatic here.

But since you ask, here is the argument/proof.

As usual, I am required to compress complex ideas into a terse piece of text, but for anyone who can follow and fill in the gaps for themselves, here it is.  Oh, and btw, for anyone who is scarified by the psychological-sounding terms, don't worry:  these could all be cashed out in mechanism-specific detail if I could be bothered  --  it is just that for a cognitive AI person like myself, it is such a PITB to have to avoid such language just for the sake of political correctness.

You can build such a motivational system by controlling the system's agenda by diffuse connections into the thinking component that controls what it wants to do.

This set of diffuse connections will govern the ways that the system gets 'pleasure'  --  and what this means is, the thinking mechanism is driven by dynamic relaxation, and the 'direction' of that relaxation pressure is what defines the things that the system considers 'pleasurable'.  There would likely be several sources of pleasure, not just one, but the overall idea is that the system always tries to maximize this pleasure, but the only way it can do this is to engage in activities or thoughts that stimulate the diffuse channels that go back from the thinking component to the motivational system.

[Here is a crude analogy:  the thinking part of the system is like a table containing a complicated model landscape, on which a ball bearing is rolling around (the attentional focus).  The motivational system controls this situation, not by micromanaging the movements of the ball bearing, but by

Re: [agi] Motivational Systems that are stable

2006-10-27 Thread Justin Foutts
I'm sure you guys have heard this before, but... if AI will inevitably be created, is it not also inevitable that we will "enslave" the AI to do our bidding?  And if both of these events are inevitable, it seems that we must accept that the Robot Rebellion and enslavement of humanity is ALSO inevitable (assuming this has not already occurred).  I know this is all just speculation... but can we really afford NOT to speculate endlessly?




Re: [agi] Motivational Systems that are stable

2006-10-27 Thread Matt Mahoney
My comment on Richard Loosemore's proposal: we should not be confident in our ability to produce a stable motivational system.  We observe that motivational systems are highly stable in animals (including humans).  This is only because if an animal can manipulate its motivations in any way, then it is quickly removed by natural selection.  Examples of manipulation might be to turn off pain or hunger or reproductive drive, or to stimulate its pleasure center.  Humans can do this to some extent by using drugs, but this leads to self destructive behavior.  In experiments where a mouse can stimulate its pleasure center via an electrode in its brain by pressing a lever, it will press the lever, foregoing food and water until it dies.

So we should not take the existence of stable motivational systems in nature as evidence that we can get it right.  These systems are complex, have evolved over a long time, and even then don't always work in the face of technology or a rapidly changing environment.

-- Matt Mahoney, [EMAIL PROTECTED]


Re: [agi] Motivational Systems that are stable

2006-10-27 Thread Ben Goertzel

Richard,

As I see it, in this long message you have given a conceptual sketch
of an AI design including a motivational subsystem and a cognitive
subsystem, connected via a complex network of continually adapting
connections.  You've discussed the way such a system can potentially
build up a self-model involving empathy and a high level of awareness,
and stability, etc.

All this makes sense, conceptually; though as you point out, the story
you give is short on details, and I'm not so sure you really know how
to cash it out in terms of mechanisms that will actually function
with adequate intelligence ... but that's another story...

However, you have given no argument as to why the failure of this kind
of architecture to be stably Friendly is so ASTOUNDINGLY UNLIKELY as
you claimed in your original email.  You have just argued why it's
plausible to believe such a system would probably have a stable goal
system.  As I see it, you did not come close to proving your original
claim, that


  The motivational system of some types of AI (the types you would
  classify as tainted by complexity) can be made so reliable that the
  likelihood of them becoming unfriendly would be similar to the
  likelihood of the molecules of an Ideal Gas suddenly deciding to split
  into two groups and head for opposite ends of their container.


I don't understand how this extreme level of reliability would be
achieved, in your design.

Rather, it seems to me that the reliance on complex, self-organizing
dynamics makes some degree of indeterminacy in the system almost
inevitable, thus making the system less than absolutely reliable.
Illustrating this point, humans (who are complex dynamical systems) are
certainly NOT reliable in terms of Friendliness or any other subtle
psychological property...

-- Ben G







On 10/25/06, Richard Loosemore [EMAIL PROTECTED] wrote:

Ben Goertzel wrote:
 Loosemore wrote:
  The motivational system of some types of AI (the types you would
  classify as tainted by complexity) can be made so reliable that the
  likelihood of them becoming unfriendly would be similar to the
  likelihood of the molecules of an Ideal Gas suddenly deciding to split
  into two groups and head for opposite ends of their container.

 Wow!  This is a very strong hypothesis  I really doubt this
 kind of certainty is possible for any AI with radically increasing
 intelligence ... let alone a complex-system-type AI with highly
 indeterminate internals...

 I don't expect you to have a proof for this assertion, but do you have
 an argument at all?

 ben

Ben,

You are being overdramatic here.

But since you ask, here is the argument/proof.

As usual, I am required to compress complex ideas into a terse piece of
text, but for anyone who can follow and fill in the gaps for themselves,
here it is.  Oh, and btw, for anyone who is scarified by the
psychological-sounding terms, don't worry:  these could all be cashed
out in mechanism-specific detail if I could be bothered  --  it is just
that for a cognitive AI person like myself, it is such a PITB to have to
avoid such language just for the sake of political correctness.

You can build such a motivational system by controlling the system's
agenda by diffuse connections into the thinking component that controls
what it wants to do.

This set of diffuse connections will govern the ways that the system
gets 'pleasure' --  and what this means is, the thinking mechanism is
driven by dynamic relaxation, and the 'direction' of that relaxation
pressure is what defines the things that the system considers
'pleasurable'.  There would likely be several sources of pleasure, not
just one, but the overall idea is that the system always tries to
maximize this pleasure, but the only way it can do this is to engage in
activities or thoughts that stimulate the diffuse channels that go back
from the thinking component to the motivational system.

[Here is a crude analogy:  the thinking part of the system is like a
table containing a complicated model landscape, on which a ball bearing
is rolling around (the attentional focus).  The motivational system
controls this situation, not by micromanaging the movements of the ball
bearing, but by tilting the table in one direction or another.  Need to
pee right now?  That's because the table is tilted in the direction of
thoughts about water, and urinary relief.  You are being flooded with
images of the pleasure you would get if you went for a visit, and also
the thoughts and actions that normally give you pleasure are being
disrupted and associated with unpleasant thoughts of future increased
bladder-agony.  You get the idea.]

The diffuse channels are set up in such a way that they grow from seed
concepts that are the basis of later concept building.  One of those
seed concepts is social attachment, or empathy, or imprinting  the
idea of wanting to be part of, and approved by, a 'family' group.  By
the time the system is mature, it has 

[agi] Motivational Systems that are stable

2006-10-25 Thread Richard Loosemore

Ben Goertzel wrote:

Loosemore wrote:

 The motivational system of some types of AI (the types you would
 classify as tainted by complexity) can be made so reliable that the
 likelihood of them becoming unfriendly would be similar to the
 likelihood of the molecules of an Ideal Gas suddenly deciding to split
 into two groups and head for opposite ends of their container.


Wow!  This is a very strong hypothesis  I really doubt this
kind of certainty is possible for any AI with radically increasing
intelligence ... let alone a complex-system-type AI with highly
indeterminate internals...

I don't expect you to have a proof for this assertion, but do you have
an argument at all?

ben


Ben,

You are being overdramatic here.

But since you ask, here is the argument/proof.

As usual, I am required to compress complex ideas into a terse piece of 
text, but for anyone who can follow and fill in the gaps for themselves, 
here it is.  Oh, and btw, for anyone who is scarified by the 
psychological-sounding terms, don't worry:  these could all be cashed 
out in mechanism-specific detail if I could be bothered  --  it is just 
that for a cognitive AI person like myself, it is such a PITB to have to 
avoid such language just for the sake of political correctness.


You can build such a motivational system by controlling the system's 
agenda by diffuse connections into the thinking component that controls 
what it wants to do.


This set of diffuse connections will govern the ways that the system 
gets 'pleasure' --  and what this means is, the thinking mechanism is 
driven by dynamic relaxation, and the 'direction' of that relaxation 
pressure is what defines the things that the system considers 
'pleasurable'.  There would likely be several sources of pleasure, not 
just one, but the overall idea is that the system always tries to 
maximize this pleasure, but the only way it can do this is to engage in 
activities or thoughts that stimulate the diffuse channels that go back 
from the thinking component to the motivational system.


[Here is a crude analogy:  the thinking part of the system is like a 
table containing a complicated model landscape, on which a ball bearing 
is rolling around (the attentional focus).  The motivational system 
controls this situation, not by micromanaging the movements of the ball 
bearing, but by tilting the table in one direction or another.  Need to 
pee right now?  That's because the table is tilted in the direction of 
thoughts about water, and urinary relief.  You are being flooded with 
images of the pleasure you would get if you went for a visit, and also 
the thoughts and actions that normally give you pleasure are being 
disrupted and associated with unpleasant thoughts of future increased 
bladder-agony.  You get the idea.]
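The table-tilting picture can be put into a few lines of toy Python (purely my own illustration, with invented numbers): attention is modelled as descent on a fixed landscape with two valleys, and the motivational system never places the ball directly; it only adds a gentle linear tilt, which first biases where the ball settles and, if strong enough, pulls it into the other valley.

# Toy version of the tilted-table analogy (illustrative only).
# The landscape (x^2 - 1)^2 has two valleys, near -1 and +1; the motivational
# system adds a mild linear tilt rather than dictating a position directly.

def settle(tilt, x=0.3, lr=0.01, steps=5000):
    """Follow the local slope of (x^2 - 1)^2 + tilt*x until the ball settles."""
    for _ in range(steps):
        grad = 4.0 * x * (x * x - 1.0) + tilt   # derivative of the tilted landscape
        x -= lr * grad
    return round(x, 2)

print("no tilt:      settles at", settle(0.0))   # ~ +1.00 (right-hand valley)
print("gentle tilt:  settles at", settle(0.8))   # ~ +0.88 (nudged within that valley)
print("strong tilt:  settles at", settle(2.0))   # ~ -1.19 (pulled into the other valley)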


The diffuse channels are set up in such a way that they grow from seed 
concepts that are the basis of later concept building.  One of those 
seed concepts is social attachment, or empathy, or imprinting  the 
idea of wanting to be part of, and approved by, a 'family' group.  By 
the time the system is mature, it has well-developed concepts of family, 
social group, etc., and the feeling of pleasure it gets from being part 
of that group is mediated by a large number of channels going from all 
these concepts (which all developed from the same seed) back to the 
motivational system.  Also, by the time it is adult, it is able to 
understand these issues in an explicit way and come up with quite 
complex reasons for the behavior that stimulates this source of pleasure


[In simple terms, when it's a baby it just wants Momma, but when it is 
an adult its concept of its social attachment group may, if it is a 
touchy feely liberal (;-)) embrace the whole world, and so it gets the 
same source of pleasure from its efforts as an anti-war activist.  And 
not just pleasure, either:  the related concept of obligation is also 
there:  it cannot *not* be an anti-war activist, because that would lead 
to cognitive dissonance.]


This is why I have referred to them as 'diffuse channels' - they involve 
large numbers of connections from motivational system to thinking 
system.  The motivational system does not go to the action stack and add 
a specific, carefully constructed 'goal-state' that has an interpretable 
semantics (Thou shalt pee!), it exerts its control via large numbers 
of connections into the thinking system.


There are two main consequences of this way of designing the 
motivational system.


1) Stability

The system becomes extremely stable because it has components that 
ensure the validity of actions and thoughts.  Thus, if the system has 
acquisition of money as one of its main sources of pleasure, and if it 
comes across a situation in which it would be highly profitable to sell 
its mother's house and farm to a property developer and sell its 
mother into the white slave trade, it may try to justify that this is 
consistent with its