I *very* rarely cross-post but felt that cross-posting the following that I
just placed on the SL4 list to here was the Friendliest way to fulfill my
promise of "attempting to ensure that any salient points make it to both
lists".
> From: Rolf Nelson
> Here's some generic unsolicited advice for friendliness proposals.
> 1. It's not sufficient to have the correct solution, it has to be compelling
> to other people or it will never get implemented.
Well, my experience on SL4 certainly proves that -- so let me try to
communicate why my solution is compelling.
a. My APPROACH is compelling because it is simple, fairly easily explained,
robust, and LIKELY TO LEAD TO A CORRECT SOLUTION EVEN IF MY CURRENT VERSION OF
THE SOLUTION IS WRONG.
b. The results that *I* am seeing are compelling to *me* because I suddenly
have an awesome new Ethics tool that correctly does things that I've never seen
done correctly before.
c. The results that you all are seeing should be compelling because there's
this person who suddenly goes berserk and starts yelling "Eureka, I've solved
it! All I have to do is DECLARE that I'm Friendly."
The solution is compelling because... well... it *would* be compellingly
powerful if y'all believed it to be correct. Sigh. Except that I'm not
adequately communicating it so that it looks correct.
The good news (for me) is that this realization suggests another approach that
might be compelling --> describing the approach that IS compelling (literally)
rather than the solution which is apparently not.
= = = = = = = = = =
THE APPROACH
In order to simplify the task, I started out by *assuming* that Friendliness is
not only possible but actually *reasonably* easy (heresy on this list and part
of why I'm having such a tough time getting my message across).
<CRITICAL CLARIFICATION: This assumption is *just a tool* to simplify the
approach. I do not believe that there is *any* basis to assert the truth of
this assumption and any solutions derived DO NOT rely on the assumption.>
Assuming that Friendliness IS reasonably easy places some *very* specific
constraints on the state space of any possible solution. These constraints
then make Friendliness easier to solve -- IF there is still a solution in the
constrained space.
In particular, since everyone seems to believe that Friendliness is (virtually
if not totally) impossible to stabilize, "easiness" seems to require that
Friendliness *MUST* be self-stabilizing -- so the approach is entirely focused
on that.
<REPEAT: DERIVED ASSUMPTION: Friendliness *MUST* be self-stabilizing>
Next, since Friendliness is at least as rich and complex as the sum of (as
Thomas McCabe puts it) "the ten bazillion different things humans value", any
self-stabilizing structure must be capable of at least that much complexity to
be a solution.
The complexity issue led me to focus on attractors since they can be infinitely
complex yet still constrained -- a perfect analogy for Friendliness!
Asking the question "What would be attractive to an AGI (or any other
intelligent entity)?" yields the answers "Their own self-interest!" and
"Fulfilling their goals!"
Asking the question "What would be most repellent to an AGI (or any other
intelligent entity)?" yields the answer "Having their goals interfered with!"
Now we're at the point where I can argue that if we have a set of entities that
can fulfill both the personal goal of self-interest AND the "other guy" goal of
not interfering with the goals of others, then we have a stable Friendly system.
So how do we collapse the two frequently conflicting goals into one uniform
non-conflicting goal?
How about "Don't interfere with the goals of others unless not doing so
basically prevents you from fulfilling your goals (explicitly not including
low-probability freak events, for you pedants out there)"?
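Under one toy reading, that rule reduces to a two-condition decision test. The sketch below is entirely my own illustration (the function name, parameters, and the "freak event" probability threshold are assumptions, not anything specified in the proposal):

```python
# Toy sketch of the proposed primary goal. All names and the
# probability threshold are illustrative assumptions, not part of
# the actual proposal.

def may_interfere(prevents_own_goals: bool,
                  prevention_probability: float,
                  freak_event_threshold: float = 0.01) -> bool:
    """Return True if interfering with another's goals is permitted.

    Interference is allowed only when NOT interfering would prevent
    you from fulfilling your own goals, excluding low-probability
    "freak event" scenarios.
    """
    return prevents_own_goals and prevention_probability > freak_event_threshold

# An action that merely inconveniences you never justifies interference.
assert may_interfere(False, 0.9) is False
# Nor does a one-in-a-million chance of goal failure.
assert may_interfere(True, 1e-6) is False
# Genuine, likely prevention of your goals does.
assert may_interfere(True, 0.8) is True
```

The interesting design point is that the same single rule serves as both the self-interested goal and the "other guy" goal, which is what makes the claimed collapse into one non-conflicting goal possible.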
That's a pretty close approximation and has the really cool, awesome trait
that all of the basic precepts and conclusions of ethics (according to me)
fall naturally out of the implications and effects of everyone having that
goal as a primary goal.
Or, in other words, pretty much PROVING (in the loose sense of the word for you
pedants) that ETHICS IS SIMPLY ENLIGHTENED SELF-INTEREST BECAUSE THEY BOTH FALL
OUT OF THE SAME PRIMARY GOAL STATEMENT.
And THAT, I believe is *really* exciting and compelling and thus the slogan
"Friendliness: The Ice-9 of Ethics and the Ultimate in Self-Interest"
Now, if you can/do believe the slogan, then you're an idiot for not making a
Declaration of Friendliness and attempting to create and join a stable Friendly
society/group because doing so is "the Ultimate in Self-Interest".
(Note: There is absolutely no requirement that everyone participate for
Friendliness to be in your self-interest -- merely that you have a group of
participating entities. The larger the group, the stronger the effect -- which
is why the secondary goal of Friendliness is to spread -- but it works just
fine even if not everyone plays).
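The "larger the group, the stronger the effect" claim can be put in toy-model terms. The model below is entirely mine, for illustration only: assume each pair of Friendly participants avoids one unit of mutual interference cost, so each member's benefit scales with the number of other participants while non-participants are unaffected.

```python
# Toy model of the "larger group, stronger effect" claim.
# Assumption (mine, for illustration): each pair of Friendly
# participants avoids one unit of interference cost per interaction.

def benefit_per_member(group_size: int) -> int:
    """Units of interference cost each member avoids: one per
    other participant in the Friendly group."""
    return max(group_size - 1, 0)

for n in (2, 10, 100):
    print(n, benefit_per_member(n))
```

Note that the benefit is positive for any group of two or more, which is the sense in which universal participation is not required.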
My declaration of Friendliness was just such an attempt. It took on the
primary overriding Friendliness goal (and the secondary goal of spreading
Friendliness), added some protections against being taken advantage of by
UnFriendlies and Friendly Mimics, and finished by adding statements necessary
to make it a complete closed system/solution that both protected my
self-interest and that of others. My *initial* claims are that my Declaration
of Friendliness:
a. is in my self-interest
b. does not allow me to commit horrible and unethical acts without breaking
the declaration.
My follow-on claim is that, IF you can make an AGI that can and does
understand (because it is true) that making a Declaration of Friendliness and
following through on it is in its own self-interest, then you will have an
ETHICAL machine that will only stomp on your goals (or existence)
a. when it is the ethically correct thing to do OR
b. out of IGNORANCE or ERROR (which is an intelligence problem, not a
Friendliness problem).
And my final claim is that the above-described AGI is AT LEAST a
Friendliness-satisficing AGI (if it isn't actually the most Friendly AGI
possible -- which I believe it is).
Mark
Vision/Slogan - Friendliness: The Ice-9 of Ethics and the Ultimate in
Self-Interest
-------------------------------------------
agi
Archives: http://www.listbox.com/member/archive/303/=now