Edward W. Porter wrote:
In response to Richard Loosemore’s Post of Sun 11/4/2007 12:15 PM
responding to my prior message of Sat 11/3/2007 3:28 PM
ED’s prior msg>>>>> For example, humans might for short-sighted personal
gain (such as when using them in weapon systems)
RL>>>> Whoaa! You assume that it would be possible to "use" an AGI for
personal gain, or in a weapon system. If it starts out with the
supposed empathic motivational system, it would not allow this.
ED>>>> First, there is good reason to believe that military and/or
selfish use of computers will continue into the future, and that a lot
of AGIs will be made with motivational systems other than the ones you
put so much trust in.
If I may interrupt here: this assumes that things carry on much as they
are, but with AGI as well.
That is an extremely unlikely scenario, once you examine the
consequences of having even one safe AGI in existence. As I have said
before, as soon as you have one, you will want to get it to recursively
self-improve until it is superintelligent, and then it will immediately
look around, see *all* of the dangers that you foresee below, and
quietly, nonviolently deal with them. At that point there will not be
any more unfriendly AGIs around.
I have summarized, in the above paragraph, a rather large body of
thought, so please accept my apologies for everything left out... But
even if you do not see the validity of this argument as it stands,
please do accept the fact that this is a very real possibility, and that
for that reason nobody can come to further conclusions (in particular
the ones you list below) until the viability of that scenario has been
carefully examined. Everything (but *everything*) depends on whether
or not the scenario I just described is the one that will prevail.
For a variety of reasons, I believe it is overwhelmingly likely, but
that is a long story.
This is supported by: (A) a great deal of AI funding comes not only
from America’s defense department but also from the defense departments
of other countries, including South Korea and Israel, and probably many
more; and (B) there is a tremendous amount of use of computers for
selfish reasons, such as stealing information, sending spam emails,
stealing the use of other people’s computers, performing extortion by
threatening denial-of-service attacks, and hacking into financial
transaction systems.
Second, even if early AGIs are given your motivational system, it
would only seem to make sense that humans would want the ability to
modify such systems. Even if the humans making such systems had good
motivations, they might not know exactly how best to create such a
motivational system, and they would almost certainly want the ability
to improve it, or override it if it was misbehaving. So, until we trust
AGIs so much that we put their motivational systems beyond our control,
they will be designed so that humans can modify such systems, and that
would mean humans could modify them as I have suggested above.
Too much depends on exactly how the motivational systems are structured.
I do not believe, in practice, that people will *want* to modify them
in any significant way. Such modifications as do occur will be done by
global consultation: the initial design will be such that *only* global
consultation (amongst the AGIs and the humans) would allow modifications
to be made. That is part of the security mechanism that makes it safe:
all the AGIs will be watching one another, and any attempt to make
unauthorized changes will be instantly spotted.
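(To make the consultation requirement concrete, here is a toy sketch in
Python. Every name and number in it -- the watcher list, the
near-unanimous quorum -- is an illustrative placeholder of mine, not a
specification of the actual mechanism:)

    # Toy sketch of a "global consultation" gate: a proposed change to a
    # motivational system takes effect only if a quorum of independent
    # watchers (other AGIs plus human representatives) signs off on it.
    # All names and thresholds here are illustrative placeholders.
    from dataclasses import dataclass, field

    @dataclass
    class Proposal:
        description: str
        approvals: set = field(default_factory=set)

    class ConsultationGate:
        def __init__(self, watchers, quorum_fraction=0.9):
            self.watchers = set(watchers)           # everyone who must be consulted
            self.quorum_fraction = quorum_fraction  # near-unanimity, not a bare majority

        def approve(self, proposal, watcher):
            if watcher not in self.watchers:
                raise PermissionError(f"{watcher} is not a recognized watcher")
            proposal.approvals.add(watcher)

        def authorized(self, proposal):
            # A change is authorized only when enough independent watchers agree.
            return len(proposal.approvals) >= self.quorum_fraction * len(self.watchers)

    gate = ConsultationGate(watchers=["agi-1", "agi-2", "agi-3", "human-council"])
    change = Proposal("adjust empathy weighting")
    gate.approve(change, "agi-1")
    assert not gate.authorized(change)  # one approver alone cannot push a change through

The point being that no single machine (or human) could quietly alter a
motivational system on its own.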
Third, you have not indicated how AGIs with your motivational systems
could ensure they were not hacked. A superintelligence under its own
malicious motivations, or under the control of a malicious human, could
write code, and understand code and its flaws, millions of times better
than humans, so it is going to be hard to keep your “white-hat” machines
from being hacked by “black-hat” machines. About ten years ago I talked
with an MIT professor who was one of DARPA’s head people on computer
security, and he told me it was a problem much bigger than any solution
he knew of, other than keeping a machine totally disconnected. That
would apply to AGIs.
Fourth, and most importantly, you have not shown how your system would
deal with all the problems I have raised in prior posts on this subject:
possible conflicting goals, conflicts between applications of the same
goal to different parts of the world, and the need for a system dealing
with a complex and rapidly changing world (as the near- and
post-singularity world will be) to change its sub-goals and its
understanding of how best to serve its motivations in such a rapidly
changing world. You have only provided verbal handwaving. Nothing that
even begins to answer such hard and real questions. The interchanges
below provide some examples of such handwaving.
Did you read my previous posts on Motivational Systems That Are Stable?
Unfortunately, I could only summarize the position briefly in the post
you are quoting here, so I think that is why some of the arguments look
like handwaving.
ED’s prior msg>>>>> Or over time the inherent biases that were designed
to make AGI’s have empathy for humans, might cause it to have empathy
for some humans more than others
RL>>>> As part of its initial (assumed) empathy, it will set up
mechanisms to monitor such things. It could not possibly start having
"more empathy for some humans than others" without *also* being
aware of the fact that, by being so biased, it was in conflict with its
general motivation. So it would not do such a thing. (We have to
remember not to assume it would be both superintelligent and also
susceptible to such easily-caught problems.)
ED>>>> Assuming for the moment your hypothetical system has stayed loyal
to people, I understand how in the situation described above it probably
would know it had a conflict.
But if the computer treats all people equally, then how good would it
be if it aids all people equally, regardless of the relative good and/or
harm they are doing? And if it aids people in doing harm, is it or is
it not being empathetic to people? And who is to decide what is good and
what is harmful?
Sorry, I have to interrupt here, because you are in the midst of a
particular type of (how can I put it?) fallacy (?). The "fallacy" is to
first assume the AGI would be superintelligent (which is the assumption
that the discussion is based on), and then quietly slip in a scenario in
which the AGI does something incredibly, bizarrely stupid: namely, this
superintelligent AGI, with its enormous, carefully balanced empathy for
the human species, suddenly decides that because it is trying to "be
balanced" in the way it empathizes with people it must give a homicidal
maniac an equal chance to fulfil his dreams of genocide!
That kind of "dumb computer" AI is exactly the type of science fiction
mistake that we must get away from, surely? We cannot assume it is
smart, and then say "But what if ..." and then insert a mind-bogglingly
dumb behavior.
So, in this case, "empathy for the human race" means that the human race
says that genocidal dictators are not acceptable. The AGI is perfectly
well aware of this, so it just quietly tells the hypothetical Hitler
that, awfully sorry, but no genocide today, thank you.
Would it help a new Hitler, or would it fight him, or would it equally
aid both sides, or would it stay out of the struggle completely? Would
it help fight Al Qaeda? Would it help America in Iraq? Would it help
company A develop a product that will out-compete company B? Is that
empathetic to mankind? Who defines what is empathetic to mankind?
And if they will not help companies, nations, or ideologies compete, how
likely is it that your “empathic” AGIs will be the ones that governments
and corporations will pay to have built, when other, more nationally,
corporately, or personally loyal and useful AGIs can be built?
ED’s prior msg>>>>> , or might cause them to make decisions that they
think are in our best interest, but would not.
RL>>>> Again, this assumes (implicitly) that it would be both generally
and broadly empathic -- which means sensitive to our wishes -- and at the
same time, for some inexplicable reason, decide to do something that it
thinks is in our best interests, without consulting us. You have
effectively assumed that it would spontaneously *stop* being empathic,
without explaining how that could happen.
ED>>>> You talk about being “broadly empathic” to humans as if that
answers all questions. It doesn’t. In fact, it avoids almost all of the
most relevant ones.
There is no broad consensus among humans about what life is about, what
its main purposes should be, what the social contract should be, what
our duties to each other are, etc. So the question becomes: which of the
beliefs and value systems that are disputed among people would it allow
itself to be used to support? And if it refused to be used for any
purpose about which there is dispute among humans, what good would it be?
Would machines that are empathetic to people addict all of humanity to
the better-than-crystal-meth eternal-rush machine Jiri Jelinek has been
pushing in the “Nirvana? Manyana? Never!” thread, or would they not? If
they shared Jiri’s values, they would. If they shared mine, they would
not; they would actively discourage people from using such a machine,
assuming it was shown to have the totally consuming addictive power Jiri
assumes it would have.
No they would not: this is an outrageous distortion of what "broadly
empathic" means. Does it really take a brain the size of a planet to
realize that "... addicting all of humanity... " to anything is
obviously not an act of empathy for the species as a whole? You
postulate this crazy scenario of it suddenly deciding that it "should"
take the action of addicting everyone to something, against their will,
when it is obvious that most people would consider this the very
opposite of empathy (the first rule of empathy is to let people make
their own decisions, after all!).
If it is obvious to most people that this would not be empathic, why do
you insert the out-of-left-field assertion that the AGI might do this?
Again, you implicitly assume that, for some bizarre reason, the AGI would
be both superintelligent and so dumb as not to be able to understand
word one about what empathy actually is.
So it would appear that your concept of “generally and broadly empathic”
is such a broad generality as to be totally useless for most human
endeavors in our competitive world.
Sorry: completely false conclusion.
ED’s prior msg>>>>> The world is too complicated and is going to change
too rapidly in the next one hundred, one thousand, or ten thousand years
for any goal system designed circa 2015 to remain appropriate until the
end of history – unless history ends pretty soon.
RL>>>> Not true: the statement was that it would stay empathic to our
motivations.
Only if the goal system were particularly rigid would this be a problem,
and by assumption I am talking about motivational systems that are
stable (diffuse systems, along the lines of my previously mentioned post).
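(As a toy illustration only -- not the design from that post, and with
placeholder names and weights of my own invention -- the difference
between a rigid and a diffuse motivational system can be sketched in a
few lines of Python:)

    # Toy contrast between a rigid goal system and a diffuse one.
    # The constraint names and weights below are illustrative placeholders.

    def rigid_score(effects):
        # A single scalar goal: if this one term drifts, the whole
        # system's behavior drifts with it.
        return effects["maximize_approval"]

    def diffuse_score(effects, weights):
        # Many weak, overlapping constraints: no single term dominates,
        # so drift or corruption in any one constraint barely moves the
        # overall evaluation.
        return sum(weights[name] * effects[name] for name in weights)

    effects = {
        "maximize_approval": 1.0,
        "avoid_harm": 0.9,
        "respect_autonomy": 0.8,
        "honesty": 0.95,
    }
    weights = {name: 1.0 / len(effects) for name in effects}

    print(rigid_score(effects))             # hostage to one term
    print(diffuse_score(effects, weights))  # averaged across many constraints

The point is only that the stability comes from the aggregation of many
overlapping constraints, not from any one goal.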
ED>>>> Your approach to motivational systems might be more stable than
others, but it is not clear that it would reliably deal with issues of
the types I have discussed above (people themselves often don’t), and
it is not clear it would remain loyal to the best interests of humans (if
it were possible to know exactly what that entailed) to the end of
history, as you claim. Again you have refused to answer the central
thrust of my question.
The previous post of mine was long and detailed: did you look it up,
read it and fully understand it before you assembled these criticisms?
I must say that it does not appear so.
Richard Loosemore