On Sun, Sep 30, 2007 at 08:17:46PM -0400, Richard Loosemore wrote:

> 1) Stability
>
> The system becomes extremely stable because it has components that
> ensure the validity of actions and thoughts. Thus, if the system has
> "acquisition of money" as one of its main sources of pleasure, and if it
> comes across a situation in which it would be highly profitable to sell
> its mother's house and farm to a property developer and sell its
> mother into the white slave trade, it may try to justify that this is
> consistent with its feelings of family attachment because [insert some
> twisted justification here]. But this is difficult to do because the
> system cannot stop other parts of its mind from taking this excuse apart
> and examining it, and passing judgement on whether this is really
> consistent ... this is what cognitive dissonance is all about. And the
> more intelligent the system, the more effective these other processes
> are. If it is smart enough, it cannot fool itself with excuses.
You are presuming two things here:

A) That "thinking harder about things" is wired into the pleasure
center. I believe it should be, but a common human failure is to fail to
think through the consequences of an action. The AGI pleasure center
might be wired to "think deeply about the consequences of its actions,
unless there is an urgent (emergency) need for action, in which case,
take the expedient solution." Thus, it might be easier/more expedient to
concoct a twisted justification for selling one's mother than it would
be to "do the right thing", especially if the emergency is that the
property developer made an offer "good only today, so act fast", and so
the AGI acted without thinking sufficiently about it. This seems like a
trap that is not "obviously" avoidable.

The other thing you are presuming is:

B) The AGI will reason rationally from well-founded assumptions. Another
common human failing is to create logically sound arguments that are
based on incorrect premises. Some folks think long and hard about
something, and then reach insane conclusions (and act on those insane
conclusions!) because the underlying assumptions were bad. Sure, if you
are smart, you should also double-check your underlying assumptions. But
it is easy to be blind to the presence of faulty assumptions. I don't
see how an AGI can avoid this trap either.

> 2) Immunity to shortcircuits
>
> Because the adult system is able to think at such a high level about the
> things it feels obliged to do, it can know perfectly well what the
> consequence of various actions would be, including actions that involve
> messing around with its own motivational system.

Huh? This contradicts your earlier statements about complex systems. The
problem with things like Conway's Game of Life, or e.g. a chaotic
dynamical system, is that it is hard/intractable to predict the outcome
that results from a minor change in the initial conditions.
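This sensitivity can be made concrete with a toy example (mine, not
anything from the motivational-system design under discussion): the
logistic map in its chaotic regime.

```python
# The logistic map x' = r*x*(1-x) is chaotic at r = 4.0. Nudge the
# starting state by one part in a billion and the two trajectories
# soon bear no resemblance to each other -- the error roughly doubles
# at every step.

def trajectory(x0, r=4.0, steps=60):
    """Iterate the logistic map, returning the whole trajectory."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs

a = trajectory(0.400000000)
b = trajectory(0.400000001)  # perturbed by 1e-9

max_gap = max(abs(x - y) for x, y in zip(a, b))
print("largest gap between the two runs: %.3f" % max_gap)
```

After a few dozen steps the two runs are completely decorrelated, even
though the rule itself is a one-line formula.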
The only way to find out what will happen is to make the change, and
then run the simulation. That is, one CANNOT "know perfectly well what
the consequence of various actions would be" -- one might be able to
make guesses, but one cannot actually know for sure.

In particular, the only way that an AGI can find out what happens if it
changes its motivational system is to run a simulation of itself, with
the new motivational system. That is, fork itself, and run a copy.
Suppose its new copy turns out to be a psycho-killer? Gee, well, then,
better shut the thing down! Yow ... ethical issues pop up everywhere on
that path.

The only way one can "know for sure" is to create a mathematical theorem
stating that certain behaviour patterns are in the "basin of attraction"
for the modified system. But I find it hard to believe that such a proof
would be possible.

> So it *could* reach inside and redesign itself. But even thinking that
> thought would give rise to the realisation of the consequences, and
> these would stop it.

I see a problem with the above. It's built on the assumption that an
effective "pleasure center" wiring for one level of intelligence will
also be effective when one is 10x smarter. It's not obvious to me that,
as the AGI gets smarter, it might not "go crazy". The cause of its
"going crazy" may in fact be the very wiring of that pleasure center ...
and so now, that wiring needs to be adjusted.

> In fact, if it knew all about its own design (and it would, eventually),
> it would check to see just how possible it might be for it to
> accidentally convince itself to disobey its prime directive, and if
> necessary it would take actions to strengthen the check-and-balance
> mechanisms that stop it from producing "justifications". Thus it would
> be stable even as its intelligence radically increased: it might
> redesign itself, but knowing the stability of its current design, and
> the dangers of any other, it would not deviate from the design. Not ever.
This assumes "the current design" is stable. Asimov's "I, Robot" series
is nothing but an examination of how hewing to the prime directive gives
rise to unexpected results.

> So, like a system in a very, very, *very* deep potential well, it would
> be totally unable to escape and reach a point where it would contradict
> this primal drive.

Simple systems, such as harmonic oscillators, have nice, smooth,
well-characterized potential wells. Complex systems have, as a general
rule, highly fractal, totally perverse potential wells, with nearby
points utterly clashing. Thus, I can predict where a ball will roll when
I drop it in my soup bowl. I cannot predict where it will go when I drop
it on a rubble pile. While you may wish to engineer a "pleasure system"
into an AGI to give it a stable "prime directive", it is not obvious to
me that the resulting "potential well" is smooth and doesn't have any
quirky, nutty traps to fall into.

> Similarly for the motivational system I have just sketched. Because it
> is founded on multiple simultaneous constraints (massive numbers of
> them) it is stable.

Hmm. Well, OK, I might buy that argument. That is the thermodynamic
argument: that although any one constraint may be a crazy fractal, the
average over all of them is quite smooth and well-behaved. The key
phrase here is "massive numbers of ..." -- that might provide the needed
safety.

--linas

-----
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?member_id=8660244&id_secret=49174924-690b69
