Brad Wyble wrote:
>
>> 3) A society of selfish AIs may develop certain (not really
>> primatelike) rules for enforcing cooperative interactions among
>> themselves; but you cannot prove for any entropic specification, and
>> I will undertake to *disprove* for any clear specification, that this
>> creates any rational reason to assign a greater probability to the
>> proposition that the AI society will protect human beings.
>
> No, you can't guarantee anything.

Read the above statement carefully. I am saying that selfish AIs as a group are no more likely to protect humans than an individual selfish AI. Because of the apparent popularity of the argument "oh, well, the Singularity means I can say anything I like about what AIs do, and you can't disprove it", I am specifying that this means we have no rational (Bayesian) reason to expect AI groups to be more helpful toward humans than AI individuals, without specification of AI initial conditions of a degree adequate to create an individual unselfish AI. I.e., "we have no rational reason to expect groups to help".

In plain language: I am not saying that a group of selfish AIs is "not guaranteed" not to kill you. I am saying that if you give me any clear and complete specification (no room for handwaving) of a group of AIs that are not individually helpful to humans, I will show that the direct extrapolation of that specification exhibits group behaviors that are also not helpful to humans. Of course no one has ever given a clear specification except Marcus Hutter - but given Hutter's specification it is very straightforward to show that grouping makes no difference.

In even plainer language: If you rely on groups of AIs to police themselves you *will* get killed unless a miracle happens.

A miracle m may be defined as a complex event which we have no Bayesian reason to expect, and which therefore has probability on the order of 2^-K(m), where K(m) is the Kolmogorov complexity of m.
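As a toy illustration (my own sketch, not part of the original post): true Kolmogorov complexity K(m) is uncomputable, but if we proxy it by the bit-length of an event's shortest description, the 2^-K(m) assignment makes the credence of a specific complex "miracle" shrink exponentially with its complexity. The description lengths below are made-up numbers for illustration.

```python
# Toy sketch of the 2^-K(m) prior. K(m) is uncomputable in general;
# here we stand in for it with the bit-length of a description.

def miracle_prior(description_bits: int) -> float:
    """Probability assigned to an event whose shortest description
    is `description_bits` bits long: 2 ** -K(m)."""
    return 2.0 ** -description_bits

# A simple event (hypothetical 10-bit description) vs. a complex
# "miracle" (hypothetical 200-bit description, e.g. "selfish AIs
# spontaneously converge on exactly human-protective norms"):
print(miracle_prior(10))   # 0.0009765625
print(miracle_prior(200))  # roughly 6.2e-61
```

The point of the toy model is only the shape of the curve: every extra bit of specific complexity you demand of the lucky outcome halves the prior probability you are rationally entitled to assign it.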

The Singularity means that there are unknown unknowns that could potentially supervene on any point in your model - but to expect an unknown unknown to intervene in the form of *specific complexity* that pulls your butt out of the fire is praying for a miracle. In terms of formal reasoning, the Singularity "unknown unknown" effect can only add to the formal entropy of a model. This doesn't make the Singularity a net bad thing, because there's more to the Singularity than just the unknown unknown effect; there are positive known unknowns like "How much moral value can a superintelligence create?", Singularity scenarios that are comparatively more and less tolerant of unknown unknowns, and straightforward extrapolations to more moral value than the human goal system can represent. But it means you've got to be careful.

> But if AI's are in any way grounded
> in our mentality, protective (as opposed to not anti-humanitarian)
> tendencies have a good chance of evolving. This is a good argument for
> modeling AGI's on our brains. While humans are not angels by any
> means, I do believe that we would, *as a community* look after lesser
> beings once having achieved a comfortable level of need-fulfillment. I
> say this by analogy to our current situation in which developed
> countries are starting to show an interest in the preservation of the
> environment and endangered species, even at great expense and
> inconvenience.

You have to ground the AIs in our mentality in a very specific way, which (as it happens) directly transfers protective tendencies *as well as* transferring the initial conditions from which protective tendencies develop. Your intuition that you can create a simple AI design that naturally develops protective tendencies is wrong. "Protective tendencies" turn out to be a hell of a lot more complex than they look to humans, who expect other minds to behave like humans. This is a verrry complex thing that looks to humans like a simple thing. Humans already have this complexity built into them, so we exhibit protective tendencies given a wide variety of simple external conditions, but *that's not the whole dependency* despite our intuition that it's the external condition that causes the protectiveness. The reason the external condition is seen as "causing" the protectiveness is that the innate complexity is species-universal and is hence an invisible enabling condition. It'd be like dropping your glass, watching it shatter, and then saying "Damn, too bad I wasn't on the Moon where the gravity is lower" instead of "I wish my hands hadn't been so sweaty".

There are simple external conditions that provoke protective tendencies in humans, following chains of logic that seem entirely natural to us. Our intuition that reproducing these simple external conditions will serve to provoke protective tendencies in AIs is knowably wrong, failing an unsupported, specific, complex miracle.

> Eliezer, I think your quest to provide a surefire guarantee against a
> singularity that eliminates mankind is unfulfillable. We will be
> rolling the dice in creating AGI's and the best we can do is load them.

I'm not providing a surefire guarantee. I'm... "loading the dice" is a good enough description if taken in the fully general sense of biasing possible futures. *But* to load the dice effectively, you have to actually *understand* the dice. Vague arguments about what humans do in situation XYZ are not going to work in the absence of an understanding of what the dependencies are.

Or to put it another way, you see Friendliness in AIs as pretty likely regardless, and you think I'm going to all these lengths to provide a guarantee. I'm not. I'm going to all these lengths to create a *significant probability* of Friendliness.

--
Eliezer S. Yudkowsky http://singinst.org/
Research Fellow, Singularity Institute for Artificial Intelligence
