On 03/07/2008 08:09 AM, Mark Waser wrote:
There is one unique attractor in state space.
No. I am not claiming that there is one unique attractor. I am
merely saying that there is one describable, reachable, stable
attractor that has the characteristics that we want. There are
*clearly* other attractors. For starters, my attractor requires
sufficient intelligence to recognize its benefits. There is
certainly another very powerful attractor for simpler, brute force
approaches (which frequently have long-term disastrous consequences
that aren't seen or are ignored).
Of course. An earlier version said "there is one unique attractor that
<identify friendliness here>", and while editing it somehow ended up in
that obviously wrong form.
Since any sufficiently advanced species will eventually be drawn
towards F, the CEV of all species is F.
While I believe this to be true, I am not convinced that it is
necessary for my argument. I think that it would make my argument a
lot easier if I could prove it to be true -- but I currently don't see
a way to do that. Anyone want to chime in here?
Ah, okay. I thought you were going to argue this following on from
Omohundro's paper about drives common to all sufficiently advanced AIs
and extend it to all sufficiently advanced intelligences, but that's my
hallucination.
Therefore F is not species-specific, and has nothing to do with any
particular species or the characteristics of the first species that
develops an AGI (AI).
I believe that the F that I am proposing is not species-specific. My
problem is that there may be another attractor F' existing somewhere
far off in state space that some other species might start out close
enough to that it would be pulled into that attractor instead. In
that case, there would be the question as to how the species in the
two different attractors interact. My belief is that it would be to
the mutual benefit of both but I am not able to prove that at this time.
For there to be another attractor F', it would of necessity have to be
an attractor that is not desirable to us, since you said there is only
one stable attractor for us that has the desired characteristics. I
don't see how beings subject to these two different attractors would
find mutual benefit in general, since if they did, then F' would have
the desirable characteristics that we wish a stable attractor to have,
but it doesn't.
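The two-attractor picture being debated here can be made concrete with a toy dynamical system. This is purely illustrative (none of the numbers or the potential come from the thread): a 1-D gradient flow with two stable fixed points, where the starting position alone determines which attractor a trajectory is pulled into, just as a species' starting point in state space is claimed to determine whether it reaches F or F'.

```python
# Toy sketch (hypothetical, not from the thread): the potential
# V(x) = (x^2 - 1)^2 has two stable attractors, at x = -1 (think "F'")
# and x = +1 (think "F"). Which one a trajectory settles into depends
# only on which basin of attraction it starts in.

def settle(x, steps=1000, dt=0.01):
    """Follow the gradient flow dx/dt = -V'(x) until it settles."""
    for _ in range(steps):
        x -= dt * 4 * x * (x * x - 1)  # -V'(x) = -4x(x^2 - 1)
    return x

# A trajectory starting at x = 0.2 is drawn to the attractor at +1,
# while one starting at x = -0.2 ends up at -1.
print(round(settle(0.2), 3), round(settle(-0.2), 3))  # → 1.0 -1.0
```

The point of the sketch is only that "close enough to be pulled in" is a well-defined notion: each attractor has a basin, and nothing in the dynamics themselves says the two basins' occupants must benefit each other.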
This means that genuine conflict between friendly species or between
friendly individuals is not even possible, so there is no question of
an AI needing to arbitrate between the conflicting interests of two
friendly individuals or groups of individuals. Of course, there will
still be conflicts between non-friendlies, and the AI may arbitrate
and/or intervene.
Wherever/whenever there is a shortage of resources (i.e. not all goals
can be satisfied), goals will conflict. Friendliness describes the
behavior that should result when such conflicts arise. Friendly
entities should not need arbitration or intervention but should
welcome help in determining the optimal solution (which is *close* to
arbitration but subtly different in that it is not adversarial). I
would rephrase your general point as: a true adversarial relationship
is not even possible.
That's a better way of putting it. Conflicts will be possible, but
they'll always be resolved via exchange of information rather than bullets.
The AI will not be empathetic towards homo sapiens sapiens in
particular. It will be empathetic towards f-beings (friendly beings
in the technical sense), whether they exist or not (since the AI
might be the only being anywhere near the attractor).
Yes. It will also be empathic towards beings with the potential to
become f-beings because f-beings are a tremendous resource/benefit.
You've said elsewhere that the constraints on how it deals with
non-friendlies are rather minimal, so while it might be
empathic/empathetic, it will still have no qualms about kicking ass and
inflicting pain where necessary.
This means no specific acts of the AI towards any species or
individuals are ruled out, since it might be part of their CEV (which
is the CEV of all beings), even though they are not smart enough to
realize it.
Absolutely correct and dead wrong at the same time. You could invent
specific, incredibly low-probability but possible circumstances where
*any* specific act is justified. I'm afraid that my vision of
Friendliness certainly does permit the intentional destruction of the
human race if that is the *only* way to preserve a hundred more
intelligent, more advanced, more populous races. On the other hand,
given the circumstance space that we are likely to occupy with a huge
certainty, the intentional destruction of the human race is most
certainly ruled out. Or, in other words, there are no infinite
guarantees, but we can reduce the dangers to infinitesimally small levels.
I think you're fudging a bit here. If we are only likely to occupy the
circumstance space with probability less than 1, then the intentional
destruction of the human race is not 'most certainly ruled out': it is
ruled out with a very high probability that is still less than 1. I'm
not trying to say it's likely, only that it's possible. I make this
point to distinguish
your approach from other approaches that purport to make absolute
guarantees about certain things (as in some ethical systems where
certain things are *always* wrong, regardless of context or circumstance).
Since the AI empathizes not with humanity but with f-beings in
general, it is possible (likely) that some of humanity's most
fundamental beliefs may be wrong from the perspective of an f-being.
Absolutely. Jihad is fundamentally wrong from the perspective of an
f-being. A jihadist is *not* an f-being. Its actions are entirely
contrary to the tenets of Friendly action.
And we are not yet f-beings in general, since our current location in
state space is so far from F. Or do you believe that some (many?) of us
are close to F?
Without getting into the debate of the merits of virtual-space versus
meat-space and uploading, etc., it seems to follow that *if* the view
that everything of importance is preserved (no arguments about this,
it is an assumption for the sake of this point only) in virtual-space
and *if* turning the Earth into computronium and uploading humanity
and all of Earth's beings would be vastly more efficient a use of the
planet, *then* the AI should do this (perhaps would be morally
obligated to do this) -- even if every human being pleads for this
not to occur. The AI would have judged that if we were only smarter,
faster, more the kind of people we would like to be, etc., we would
actually prefer the computronium scenario.
The weak point of this argument lies in the phrase "the AI would have
judged that if <any clause>, we would actually prefer <any clause>".
Extrapolation is a tremendously error-prone process and what the AI is
attempting to do here *absolutely requires* that it has a better
knowledge of YOUR goals than you do for this to be a Friendly act. We
justifiably do this all the time when we do unpleasant things for our
child's health. But, the intelligent parent (or Friendly entity) does
not do such things without a really high probability that they are
correct.
Note: I realize that this is going to be a point of much
unhappiness/contention/debate and there will be endless arguments as
to exactly where the line is. This is all well and good but I hope
that we don't lose the forest for the trees (this is why I'm not doing
math at this point). This specific case ends up with an inflammatory
conclusion because it starts out by ASSUMING an equally inflammatory
premise (i.e. that all human beings are incorrect about their goals).
I would argue that this is simply a case of garbage in, garbage out.
I don't think it's inflammatory or a case of garbage in to contemplate
that all of humanity could be wrong. For much of our history, there have
been things that *every single human was wrong about*. This is merely
the assertion that we can't make guarantees about what vastly superior
f-beings will find to be the case. We may one day outgrow our attachment
to meatspace, and we may be wrong in our belief that everything
essential can only be preserved in meatspace, but we might not be at that
point yet when the AI has to make the decision.
It's become apparent to me in thinking about this that 'friendliness'
is really not a good term for the attitude of an f-being that we are
talking about: that of acting solely in the interest of f-beings
(whether others exist or not) and in consistency with the CEV of all
sufficiently ... beings. It is really just acting rationally
(according to a system that we do not understand and may vehemently
disagree with).
Actually, I would argue that Friendliness is a good term because that
is the net result to us if we are Friendly; however, a possibly better
term is simply "enlightened self-interest" since that describes why an
f-being would want to act that way (i.e. why Friendliness is an
attractor).
Yes, when you talk about Friendliness as that distant attractor, it
starts to sound an awful lot like "enlightenment", where self-interest
is one aspect of that enlightenment, and friendly behavior is another
aspect.
:-) I haven't addressed this question yet but the short answer is
that there is no requirement for intervention (for a variety of
reasons that I haven't established on this forum the necessary
groundwork to easily explain).
Looking forward to it. Thanks for the detailed response.
joseph
-------------------------------------------
agi
Archives: http://www.listbox.com/member/archive/303/=now
RSS Feed: http://www.listbox.com/member/archive/rss/303/