> Your intuitions say... I am trying to summarize my impression of your
> viewpoint, please feel free to correct me... "AI morality is a matter
> of experiential learning, not just for the AI, but for the programmers.
> To teach an AI morality you must give it the right feedback on moral
> questions and reinforce the right behaviors... and you must also learn
> *about* the deep issues of AI morality by raising a young AI. It isn't
> pragmatically realistic to work out elaborate theories of AI morality
> in advance; you must learn what you need to know as you go along.
> Moreover, learning what you need to know, as you go along, is a good
> strategy for creating a superintelligence... or at least, the rational
> estimate of the goodness of that strategy is sufficient to make it a
> good idea to try and create a superintelligence, and there aren't any
> realistic strategies that are better. An informal, intuitive theory of
> AI morality is good enough to spark experiential learning in the
> *programmer* that carries you all the way to the finish line. You'll
> learn what you need to know as you go along. The most fundamental
> theoretical and design challenge is making AI happen, at all; that's
> the really difficult part that's defeated everyone else so far. Focus
> on making AI happen. If you can make AI happen, you'll learn how to
> create moral AI from the experience."
Hmmm. This is almost a good summary of my perspective, but you've still
not come to grips with the extent of my uncertainty ;) I am not at all
SURE that "An informal, intuitive theory of AI morality is good enough
to spark experiential learning in the *programmer* that carries you all
the way to the finish line," where by the "finish line" you mean an AGI
whose ongoing evolution will lead to beneficial effects for both humans
and AGIs.

I'm open to the possibility that it may someday become clear, as AGI
work progresses, that a systematic theory of AGI morality is necessary
in order to proceed safely. But I suspect that, in order for me to feel
such a theory was necessary, I'd have to understand considerably more
about AGI than I do right now. And I suspect that the only way I'm
going to come to understand considerably more about AGI is through
experimentation with AGI systems.

(This is where my views differ from Shane's; he is more bullish on the
possibility of learning a lot about AGI through mathematical theory. I
think this will happen, but I think the math theory will only get
really useful when it is evolving in unison with practical AGI work.)

Right now, it is not clear to me that a systematic theory of AGI
morality is necessary in order to proceed safely. And it is also not
clear to me that such a theory is even possible to formulate based on
our current state of knowledge about AGI.

> In contrast, I felt that it was a good idea to develop a theory of AI
> morality in advance, and have developed this theory to the point
> where it currently predicts, counter to my initial intuitions and to
> my considerable dismay:
>
> 1) AI morality is an extremely deep and nonobvious challenge which
> has no significant probability of going right by accident.

I agree it's a deep and nonobvious challenge. You've done a great job
of demonstrating that.
I don't agree that any of your published writings have shown that it
"has no significant probability of going right by accident."

> 2) If you get the deep theory wrong, there is a strong possibility of
> a silent catastrophic failure: the AI appears to be learning
> everything just fine, and both you and the AI are apparently making
> all kinds of fascinating discoveries about AI morality, and
> everything seems to be going pretty much like your intuitions predict
> above, but when the AI crosses the cognitive threshold of
> superintelligence it takes actions which wipe out the human species
> as a side effect.

Clearly this could happen, but I haven't read anything in your writings
leading to even a heuristic, intuitive probability estimate for this
outcome.

> AIXI, which is a completely defined formal system, definitely
> undergoes a failure of exactly this type.
>
> Ben, you need to be able to spot this. Think of it as a practice run
> for building a real transhuman AI. If you can't spot the critical
> structural property of AIXI's foundations that causes AIXI to undergo
> silent catastrophic failure, then a real-world reprise of that
> situation with Novamente would mean you don't have the deep theory to
> choose good foundations deliberately, you can't spot bad foundations
> deductively, and because the problems only show up when the AI
> reaches superintelligence, you won't get experiential feedback on the
> failure of your theory until it's too late. Exploratory research on
> AI morality doesn't work for AIXI - it doesn't even visibly fail. It
> *appears* to work until it's too late. If you don't spot the problem
> in advance, you lose.
>
> If I can demonstrate that your current strategy for AI development
> would undergo silent catastrophic failure in AIXI - that your stated
> strategy, practiced on AIXI, would wipe out the human species, and
> you didn't spot it - will you acknowledge that as a "practice loss"?
> A practice loss isn't the end of the world.
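(As an aside for list readers who haven't seen Hutter's papers: my rough
understanding of the AIXI definition, which you should check against
Hutter's own writeup rather than trusting my recollection, is that at
each step it selects the action maximizing expected total future reward
under a simplicity-weighted mixture over all computable environments,
something like:

```latex
a_k \;=\; \arg\max_{a_k} \sum_{o_k r_k} \cdots \max_{a_m} \sum_{o_m r_m}
  \bigl( r_k + \cdots + r_m \bigr)
  \sum_{q \,:\, U(q,\, a_1 \ldots a_m) \,=\, o_1 r_1 \ldots o_m r_m} 2^{-\ell(q)}
```

where U is a universal Turing machine, the programs q are candidate
environment models, and \ell(q) is program length. So the formalism is
fully specified: Solomonoff induction plus expectimax search, with the
reward channel simply taken as primitive.)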
> I have one practice loss on my record too. But when that happened I
> took it seriously; I changed my behavior as a result. If you can't
> spot the silent failure in AIXI, would you then *please* admit that
> your current strategy on AI morality development is not adequate for
> building a transhuman AI? You don't have to halt work on Novamente,
> just accept that you're not ready to try and create a transhuman AI
> *yet*.
>
> I can spot the problem in AIXI because I have practice looking for
> silent failures, because I have an underlying theory that makes it
> immediately obvious which useful properties are formally missing from
> AIXI, and because I have a specific fleshed-out idea for how to
> create moral systems and I can see AIXI doesn't work that way. Is it
> really all that implausible that you'd need to reach that point
> before being able to create a transhuman Novamente? Is it really so
> implausible that AI morality is difficult enough to require at least
> one completely dedicated specialist?

Eliezer, I have not thought very hard about AIXI/AIXItl and its
implications. For better or for worse, I am typing these e-mails at
about 90 words a minute, in between doing other things that are of
higher short-term priority ;) What I have or have not "spotted" about
AIXI/AIXItl doesn't mean very much to me.

I don't have time this week to sit back for a few hours and think hard
about the possible consequences of AIXI/AIXItl as a real AI system. For
example, I have to leave the house in half an hour for a meeting
related to some possible Novamente funding; then when I get home I have
a paper on genetic regulatory network inference using Novamente to
finish, etc. etc. I wish I had all day to think about theoretical AGI,
but that's not the case. Hopefully, if the practical stuff I'm doing
now succeeds, in a couple of years when Novamente is further along I
WILL be in a position where I can focus 80% instead of 30% of my time
on the pure AGI aspects.
What I do or do not spot about AIXI or any other system in a few spare
moments doesn't say much about what I and the whole Novamente team
would or would not spot about Novamente, which we understand far more
deeply than I understand AIXI, and on which we are focusing a lot of
our time.

-- Ben G