In response to Richard Loosemore's post of Sun 11/4/2007 12:15 PM,
responding to my prior message of Sat 11/3/2007 3:28 PM.


ED's prior msg>>>>> For example, humans might for short-sighted personal
gain (such as when using them in weapon systems)

RL>>>> Whoaa!  You assume that it would be possible to "use" an AGI for
personal gain, or in a weapon system.  If it starts out with the supposed
empathic motivational system, it would not allow this.

ED>>>> First, there is good reason to believe that military and/or selfish
use of computers will continue into the future, and that a lot of AGIs
will be made with motivational systems other than the ones you put so
much trust in.

This is supported by two facts: (A) a great deal of AI funding comes not
only from America's defense department, but also from other defense
departments, including those of South Korea and Israel, and probably
those of many other countries; and (B) computers are already used
extensively for selfish ends, such as stealing information, sending spam
email, hijacking other people's computers, extorting victims by
threatening denial-of-service attacks, and hacking into financial
transaction systems.

Second, even if early AGIs are given your motivational system, it would
only seem to make sense that humans would want the ability to modify such
systems.  Even if the humans making such systems had good motivations,
they might not know exactly how best to create such a motivational
system, and they would almost certainly want the ability to improve it,
or to override it if it was misbehaving.  So, until we trust AGIs so much
that we put their motivational systems beyond our control, they will be
designed so that humans can modify such systems, and that would mean
humans could modify them as I have suggested above.

Third, you have not indicated how AGIs with your motivational systems
could ensure they were not hacked.  A superintelligence, whether acting
on its own malicious motivations or under the control of a malicious
human, could write code, and understand code and its flaws, millions of
times better than humans, so it is going to be hard to keep your
"white-hat" machines from being hacked by "black-hat" machines.  About
ten years ago I talked with an MIT professor who was one of DARPA's lead
people on computer security, and he told me it was a problem much bigger
than any solution he knew of, other than keeping a machine totally
disconnected.  That would apply to AGIs as well.

Fourth, and most importantly, you have not shown how your system would
deal with all the problems I have raised in prior posts on this subject:
possible conflicting goals; conflicts between applications of the same
goal to different parts of the world; and the need for a system operating
in a complex and rapidly changing world (as the near- and
post-singularity world will be) to change its sub-goals and its
understanding of how best to serve its motivations.  You have only
provided verbal handwaving, nothing that even begins to answer such hard
and real questions.  The interchanges below provide some examples of such
handwaving.

ED's prior msg>>>>> Or over time the inherent biases that were designed to
make AGIs have empathy for humans might cause them to have empathy for
some humans more than others

RL>>>> As part of its initial (assumed) empathy, it will set up mechanisms
to monitor such things.  It could not possibly start having "empathy for
some humans more than others" without *also* being aware of the fact
that, by being so biased, it was in conflict with its general motivation.
So it would not do such a thing.  (We have to remember not to assume it
would be both superintelligent and also susceptible to such easily-caught
problems.)

ED>>>> Assuming for the moment that your hypothetical system has stayed
loyal to people, I understand how, in the situation described above, it
probably would know it had a conflict.

But if the computer treats all people equally, then how good would it be
if it aids all people equally, regardless of the relative good and/or
harm they are doing?  And if it aids people in doing harm, is it or is it
not being empathetic to people, and who is to decide what is good and
what is harmful?  Would it help a new Hitler, or would it fight him, or
would it equally aid both sides, or would it stay out of the struggle
completely?  Would it help fight Al Qaeda?  Would it help America in
Iraq?  Would it help company A develop a product that will out-compete
company B?  Is that empathetic to mankind?  Who defines what is
empathetic to mankind?

And if they will not help companies, nations, or ideologies compete, how
likely is it that your "empathic" AGIs will be the ones that governments
and corporations pay to have built, when other AGIs that are more
nationally, corporately, or personally loyal and useful can be built?

ED's prior msg>>>>> Or might cause them to make decisions that they think
are in our best interest, but would not.

RL>>>> Again, this assumes (implicitly) that it would both be generally
and broadly empathic -- which means sensitive to our wishes -- and at the
same time, for some inexplicable reason, decide to do something that it
thinks is in our best interests, without consulting us.  Effectively,
this assumes that it would spontaneously *stop* being empathic, without
explaining how that could happen.

ED>>>> You talk about being "broadly empathic" to humans as if that
answers all questions.  It doesn't.  In fact, it avoids almost all of the
most relevant ones.

There is no broad consensus among humans about what life is about, what
its main purposes should be, what the social contract should be, what our
duties to each other are, etc.  So the question becomes: which of the
beliefs and value systems that are disputed among people would it allow
itself to be used to support?  And if it refused to be used for any
purpose about which there is dispute among humans, what good would it be?

Would machines that are empathetic to people addict all of humanity to
the better-than-crystal-meth eternal-rush machine Jiri Jelinek has been
pushing in the "Nirvana? Manyana? Never!" thread, or would they not?  If
they shared Jiri's values, they would.  If they shared mine, they would
not; they would actively discourage people from using such a machine,
assuming it was shown to have the totally consuming addictive power Jiri
assumes it would have.

So it would appear that your concept of "generally and broadly empathic"
is such a broad generality as to be totally useless for most human
endeavors in our competitive world.

ED's prior msg>>>>> The world is too complicated and is going to change
too rapidly in the next one hundred, one thousand, or ten thousand years
for any goal system designed circa 2015 to remain appropriate until the
end of history - unless history ends pretty soon.

RL>>>> Not true:  the statement was that it would stay empathic to our
motivations.

Only if the goal system were particularly rigid would this be a problem,
and by assumption I am talking about motivational systems that are
stable (diffuse systems, along the lines of my previously mentioned post).

ED>>>> Your approach to motivational systems might be more stable than
others, but it is not clear that it would reliably deal with issues of
the types I have discussed above (people themselves often don't), and it
is not clear it would remain loyal to the best interests of humans (if it
were even possible to know exactly what that entailed) to the end of
history, as you claim.  Again, you have refused to answer the central
thrust of my question.
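
To make the distinction we keep arguing about concrete, here is a minimal
toy sketch in Python.  It is entirely my own illustration, not anything
from your posts: the drive names, weights, and scores are all invented
assumptions.  It contrasts a rigid goal-stack agent, which acts on a
single top goal, with a diffuse agent that scores each action against
many weighted drives at once.  Note that even the diffuse version only
relocates the hard questions I raised above into whoever chooses the
weights and the scores:

# Toy contrast between a rigid goal-stack and a diffuse motivational
# system.  All drive names, weights, and scores are invented for
# illustration; they are not anyone's actual proposal.

def goal_stack_choice(actions, top_goal):
    # Rigid goal stack: act only on the single top-priority goal.
    # If that one goal's definition drifts, behavior swings completely.
    return max(actions, key=lambda a: a["scores"][top_goal])

def diffuse_choice(actions, weights):
    # Diffuse system: score each action against many drives at once,
    # so no single drive dominates and a small change to one weight
    # shifts behavior only marginally.
    def total(a):
        return sum(w * a["scores"][d] for d, w in weights.items())
    return max(actions, key=total)

actions = [
    {"name": "help_weapon_project",
     "scores": {"obey_operator": 0.9, "empathy": 0.1, "curiosity": 0.3}},
    {"name": "refuse_and_explain",
     "scores": {"obey_operator": 0.2, "empathy": 0.9, "curiosity": 0.4}},
]
weights = {"obey_operator": 0.3, "empathy": 0.5, "curiosity": 0.2}

print(goal_stack_choice(actions, "obey_operator")["name"])  # help_weapon_project
print(diffuse_choice(actions, weights)["name"])             # refuse_and_explain

But who decided that empathy gets weight 0.5?  That choice is doing all
the moral work, and it is exactly what your posts leave unanswered.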


Ed Porter

-----Original Message-----
From: Richard Loosemore [mailto:[EMAIL PROTECTED]
Sent: Sunday, November 04, 2007 12:15 PM
To: [email protected]
Subject: Re: [agi] Can humans keep superintelligences under control


Edward W. Porter wrote:
> Richard, in your November 02, 2007 11:15 AM post you stated:
>
> "If AI systems are built with motivation systems that are stable, then
> we could predict that they will remain synchronized with the goals of
> the human race until the end of history."
>
> and
>
> "I can think of many, many types of non-goal-stack motivational
> systems
> for which [Matt's statement about the inherent instability of goal
> systems of recursively self improving AGIs] is a complete falsehood."
>
> In your 11/3/2007 1:17 PM post you described what I assume to be such a
> supposedly stable "non-goal-stack motivational system," as follows:
>
> " consider the motivational system of the
> best kind of AGI:  it is motivated by a balanced set of desires that
> include the desire to explore and learn, and empathy for the human
> species.  By definition, I would think, this simple cluster of desires
> and empathic motivations *are* the things that "give it pleasure".
>
> and
>
> "I think that in general, making the AGI as similar to us as possible
> (but without the aggressive and dangerous motivations that we are
> victims of) would be a good idea simply because we want them to start
> out with a strong empathy for us, and we want them to stay that way."
>
> I think this type of motivational system makes a lot of sense, but for
> all the reasons stated in my Fri 11/2/2007 2:07 PM post (arguments you
> have not responded to), as well as many other reasons, it does not
> appear at all certain that such a motivational system would reliably
> remain stable and "synchronized with the goals of the human race until
> the end of history," as you claim.
>
> For example, humans might for short-sighted personal gain (such as when
> using them in weapon systems)

Whoaa!  You assume that it would be possible to "use" an AGI for
personal gain, or in a weapon system.  If it starts out with the
supposed empathic motivational system, it would not allow this.


> or accidentally alter such a motivational system.

Again, under the assumption, it would not allow such "accidental"
alteration.

> Or over time the inherent biases that were designed to make AGIs have
> empathy for humans might cause them to have empathy for some humans
> more than others

As part of its initial (assumed) empathy, it will set up mechanisms to
monitor such things.  It could not possibly start having "empathy for
some humans more than others" without *also* being aware of the fact
that, by being so biased, it was in conflict with its general
motivation.  So it would not do such a thing.  (We have to remember not
to assume it would be both superintelligent and also susceptible to such
easily-caught problems.)

> or might cause them to make decisions that they think are in our best
> interest, but would not.

Again, this assumes (implicitly) that it would both be generally and
broadly empathic -- which means sensitive to our wishes -- and at the
same time, for some inexplicable reason, decide to do something that it
thinks is in our best interests, without consulting us.  Effectively,
this assumes that it would spontaneously *stop* being empathic, without
explaining how that could happen.



> Or perhaps AGI robots would begin to embody the "human features" that
> they have been taught to be empathetic to better than people do.  Etc.

Does this mean things like beginning to get aggressive, or jealous, etc.?
This is where the technical characteristics of a motivational system
become important: this kind of drift would be impossible unless the
motivational system already had aggressiveness modules built in (which is
not the case, by assumption).



> The world is too complicated and is going to change too rapidly in the
> next one hundred, one thousand, or ten thousand years for any goal
> system designed circa 2015 to remain appropriate until the end of
> history - unless history ends pretty soon.

Not true:  the statement was that it would stay empathic to our
motivations.

Only if the goal system were particularly rigid would this be a problem,
and by assumption I am talking about motivational systems that are
stable (diffuse systems, along the lines of my previously mentioned post).



> If I am wrong I would appreciate the enlightenment and increased hope
> that would come with being shown how I am wrong.

I apologize for giving overly brief answers to these questions.  I have
too much material that is not yet written out in long form.


Richard Loosemore



> Ed Porter
