On 6/7/07, Eugen Leitl <[EMAIL PROTECTED]> wrote:

That's not an argument. That rapid environmental changes are dangerous
is an argument. You need an argument to refute that argument.

Speed kills most of the time.  A fast superintelligence that values
humanity will selectively use chunks of matter in such a way as to be
fast but not overtly destructive.  For instance, turning the core of
the Earth into a supercomputer while leaving the surface as is.

We don't have to suspect that evolutionary dynamics is full of extinctions,
we *know* it. If you think you can sustainably strip away the Darwinian
regime, I'd like to see an argument for how you propose to do that.

The Darwinian regime has already been stripped by, for instance,
medicine.  Evolution in humans does not select against the weaker or
the dumber, because the weaker get medical help whereas the dumber get
financial help from the government.  The standard argument for why
Darwinian dynamics wouldn't apply to superintelligence is here
(http://sl4.org/archive/0401/7506.html), but you've already seen it
and not been convinced.  According to a response of yours immediately
after (http://sl4.org/archive/0401/7551.html), you don't seem to
believe that the dynamics of intelligent-design-and-test or recursive
self-improvement are distinct from the dynamics of Darwinian natural
selection, but it's obvious to nearly everyone that they are.

When we take a hands-off approach, a Darwinian-style outcome is most
likely.  A human-indifferent seed will grow into a human-indifferent
superintelligence, one that grinds us up for lunch.  But a human-friendly
seed has a chance of growing into a human-friendly superintelligence.

If Mr. Rogers were somehow still alive and had access to intelligence
enhancement technology where he could look ahead and see the projected
consequences before making each change, would he choose to make
himself into a superintelligence that killed all of humanity, just to
"continue being a part of the Darwinian regime"?  According to you,
apparently yes.  But I have trouble believing Mr. Rogers, or any
sufficiently friendly SI seed, would do such a thing.

Darwinian behavioral tendencies fundamentally come from self-centered
goal systems, which are observed in every form of life on Earth.  But
there is no fundamental information-theoretic reason why every being
must be self-centered.  Some human beings seem quite selfless, even
despite the self-centered basic programming intrinsic to our genome.
We know that behaviors like trust and love are mediated by certain
neurotransmitters - oxytocin, for instance.  If benevolence or
indifference towards others is a property mediated by brain structure
and neurotransmitter densities, then it should be possible to boost
these qualities by studying which neurophysiological interventions enhance
them and which don't.  The same thing will be done with AI experiments
in virtual microworlds.
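
To make the "virtual microworlds" idea slightly more concrete, here is
a minimal, purely hypothetical sketch (in Python) of what such a
screening experiment might look like: vary a single "altruism"
parameter in a toy agent, drop it into a trivial resource-sharing
world, and measure how often it behaves benevolently.  The microworld,
the parameter, and the metric are all invented for illustration; a
real research program would of course need something far richer.

# Hypothetical sketch: screening candidate agent "temperaments" in a toy
# microworld and scoring how benevolent their behavior is toward a weaker
# agent.  All names and numbers are illustrative only.

import random

def run_microworld(altruism_weight, trials=1000, seed=0):
    """Simulate a toy world in which an agent repeatedly chooses between
    keeping a resource and sharing it with a weaker neighbor.
    Returns the fraction of trials in which it shared."""
    rng = random.Random(seed)
    shared = 0
    for _ in range(trials):
        own_gain = rng.random()        # utility of keeping the resource
        neighbor_need = rng.random()   # how much the neighbor would benefit
        # The agent shares when its weighted concern for the neighbor
        # outweighs its own immediate gain.
        if altruism_weight * neighbor_need > own_gain:
            shared += 1
    return shared / trials

if __name__ == "__main__":
    for w in (0.0, 0.5, 1.0, 2.0):
        print(f"altruism_weight={w:.1f} -> shared {run_microworld(w):.0%} of the time")

The point is only that "which interventions enhance benevolence" is, in
principle, an empirical question that can be asked of artificial minds
just as it can be asked of brains.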

If I'm still "sounding like a broken record", please tell me, but I'm
trying to come at this from a different angle each time I try to
explain it.

The key question seems to have to do with whether true selfless
benevolence, or even just compliance, is at all possible.  I see no
reason why it wouldn't be.  In human beings, decisions about whether
to act kindly or unkindly are partly based on appraisals of the other
guy's muscles or firepower.  We have a tendency to suck up and kick
down.  This tendency exists for obvious evolutionary reasons.  But for
a completely blank-slate mind, why suck up and kick down?  Why not
suck up and suck down, or kick up and suck down?  Why conditional
niceness?  Why not unconditional niceness?

I think people who have a good chance of precipitating a hard takeoff
runaway are dangerous, and need watching. As long as people are bipedal
primates, the dynamics should be s l o w. Slowing things down is a
hard problem, but this doesn't mean we shouldn't try.

I'd like an AI that can engage in fast dynamics but purposefully
compartmentalize those dynamics such that we aren't destroyed.  Note
that many rock structures change over slow timescales and we change
fast, yet we haven't destroyed all the world's rock structures.  This
may be partly because it's not yet technologically possible, but even
if it were, we could choose not to.

The difference between blind Darwinian evolution and intelligence is
that intelligence can *choose*.  Intelligence can *choose* not to
destroy something,
even if it has the capacity to do so.  An animal cannot make that
choice.

How do you define friendly? I keep asking this question, and I keep
asking it for a very good reason. Once you give me your definition,
I will explain the reasons.

Not wiping out humanity is a great start.  The personal definition of
Friendly AI I like is "an AI that we don't regret creating".

I'm not dismissing it because I'm suspicious, I'm dismissing it because
people who keep repeating the 'friendly friendly friendly' mantra are
dangerously deluded, and need a reality check.

So you believe Mr. Rogers would kill everyone if he became
superintelligent?  ;-)  If Mr. Rogers can hit a friendly region of
superintelligent mind configuration space, then an AI with the
explicit motivation to do so would be able to achieve the same thing.

Wise and charismatic are always relative. I don't believe that
*you* can make superintelligences which are wise and charismatic
relative to bipedal primates. (The *you* includes anyone who walks
on two legs, not just Michael A.).

Say that someone found a set of alleles associated with genius and
charisma, and created a child such that those alleles were expressed.
The child grows up to be wise and charismatic.  (Defined as: most
people who meet the child agree that he or she is obviously quite wise
and charismatic.)  This isn't too implausible, right?  If it happened, it
would invalidate your statement above.

Creating a friendly AI from scratch *might* be harder than marginally
improving human intelligence and charisma, but the only way to figure
it out is to have people focused on the problem full-time.

I'm also not interested in repeated assertions. I'm interested in how you
can make it so, spelled out formally. Your first step is describing
'friendly' in a formal system, constructively. Your second step is
using that constructive definition as a source of development
constraints. Your third step is building an open-ended supercritical
seed which utilizes results from your second step, asserting insertion
into the 'friendly' behaviour space region target, while maintaining a
sufficient fitness delta to anything else which is not you.

This is a huge, incredibly challenging problem, but it's what has to
be done if AI comes first.  It could take decades of work.  I do have
my own thoughts on the matter, but here I'm more interested in arguing
the overall feasibility of friendly superintelligence than in defending
any specific plan.  It's a bit like arguing that flight is possible
without presenting the blueprints for a working plane.

I sometimes find that friendly superintelligence is easier to accept
in principle if you think of it in terms of an enhanced friendly
human.  If the sweetest girl you know were made immortal and given
intelligence enhancement technology, would you actually behave as if
you were sure it would result in your destruction?  You seem to be
okay with IA but skeptical of AI, when in reality they're both subsets
of mind configuration space in general.  If friendly IA is possible,
then friendly AI must be as well.

If you can make a good case even for the first step, I'm willing
to listen. If you can't make even that first step, I continue to
point and laugh. You can continue to pout, but this doesn't
make your case any stronger.

Okay, this is fair enough.  I'll refrain from arguing about this in
the future until I have a good step 1 I can propose.  Intelligence and
goals/motivations are incredibly complex though, so it's a definite
challenge.  It's hard to define anything in psychology formally, which
is why most psychology is such a soft science.

These are not arguments. This is waffle. (I agree that humanity
is a random anchor, but I happen to be a member of that set, and
as long as I and my kids are, I can't help that particular bias.
If we are all dead the point is moot anyway).

So you agree that beings more charismatic and intelligent than humans
can exist, you just don't see a reliable way to get there from here.
If so, this is fair enough.

I see I'm being misunderstood. My point was that iterated interactions
between very asymmetrical players have no measurable payoffs for the
bigger player. Because of this the biosphere only gives, and the humans
only take. With bigger players than us, we only get a chance to see what
the receiving end of habitat destruction looks like. It's not personal,
but it still kills just fine.

Sorry for misunderstanding you; in recent days I've been embroiled in
online conversations with people who think constant insults are a form
of productive argument.

Payoffs are defined in terms of goals.  If the bigger player's goals
revolve around selfishness and personal gain, then yes, the little guy
is irrelevant.  If the bigger player's goals are explicitly concerned
with the welfare of the little guy, then the little guy has little to
worry about.  For example, I *choose* to be a vegetarian because I
care about animal welfare.  The "payoff" is a decrease in animal
suffering.
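
As a purely illustrative aside, the point that payoffs are defined in
terms of goals can be made with a two-line utility function: give the
bigger player's goal system even a modest explicit weight on the little
guy's welfare, and the "profitable but destructive" action stops being
optimal.  Every name and number below is made up for the sake of the
example.

# Hypothetical sketch: "payoffs are defined in terms of goals."
# A big player chooses between an action that maximizes its own gain but
# harms a small player, and one that costs it a little but spares the
# small player.  The only thing that varies is how much weight the big
# player's goal system puts on the small player's welfare.

def big_player_utility(own_gain, small_welfare, care_weight):
    """Utility = own gain plus (care_weight x the small player's welfare)."""
    return own_gain + care_weight * small_welfare

def best_action(care_weight):
    # Two illustrative actions: (own_gain, small_player_welfare)
    actions = {
        "pave over their habitat": (10.0, -100.0),
        "route around them":       (9.0, 0.0),
    }
    return max(actions, key=lambda a: big_player_utility(*actions[a], care_weight))

if __name__ == "__main__":
    for w in (0.0, 0.05, 0.5):
        print(f"care_weight={w}: best action = {best_action(w)}")

With care_weight at zero the habitat gets paved; with any appreciable
weight on the small player's welfare, it doesn't.  Indifference, not
malice, is what does the damage.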

Why put effort into raising children?  Just so they can take care of
you when you're old?  Or because it's actually possible to genuinely
care for the welfare of someone outside of one's self?

--
Michael Anissimov
Lifeboat Foundation      http://lifeboat.com
http://acceleratingfuture.com/michael/blog
