Philip,
The discussion at times seems to have progressed on the basis that
AIXI / AIXItl could choose to do all sorts of amazing, powerful things. But
what I'm unclear on is: what generates the infinite space of computer
programs?
Does AIXI / AIXItl itself generate these programs? Or does it
Ben Goertzel wrote:
Agreed, except for the 'very modest resources' part. AIXI could
potentially accumulate pretty significant resources pretty quickly.
Agreed. But if the AIXI needs to disassemble the planet to build its
defense mechanism, the fact that it is harmless afterwards isn't going to
To avoid the problem entirely, you have to figure out how to make
an AI that
doesn't want to tinker with its reward system in the first place. This, in
turn, requires some tricky design work that would not necessarily seem
important unless one were aware of this problem. Which, of course,
Ben Goertzel wrote:
I don't think that preventing an AI from tinkering with its
reward system is the only solution, or even the best one...
It will in many cases be appropriate for an AI to tinker with its goal
system...
I don't think I was being clear there. I don't mean the AI should be
This seems to be a non sequitur. The weakness of AIXI is not that its
goals don't change, but that it has no goals other than to maximize an
externally given reward. So it's going to do whatever it predicts will
most efficiently produce that reward, which is to coerce or subvert the
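For reference, the decision rule under discussion is (roughly, in Hutter's
notation): at each cycle k, AIXI outputs

  a_k = arg max_{a_k} sum_{o_k r_k} ... max_{a_m} sum_{o_m r_m}
        (r_k + ... + r_m) * xi(o_1 r_1 ... o_m r_m | a_1 ... a_m),

where xi is the universal mixture xi(x) = sum over all programs q with
U(q, a_1...a_m) = x of 2^(-length(q)). Two things follow. First, the
'infinite space of programs' Philip asks about is just the index set of
this mixture: AIXI never generates programs one at a time, it weighs every
program from the start as a candidate model of the environment. Second,
nothing in the expression cares where the rewards r_i physically come
from, which is the crux of the reward-subversion argument above.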
I wrote:
I'm not sure why an AIXI, rewarded for pleasing humans, would learn an
operating program leading it to hurt or annihilate humans, though.
It might learn a program involving actually doing beneficial acts
for humans
Or, it might learn a program that just tells humans what they
Wei Dai wrote:
Eliezer S. Yudkowsky wrote:
Important, because I strongly suspect Hofstadterian superrationality
is a *lot* more ubiquitous among transhumans than among us...
It's my understanding that Hofstadterian superrationality is not generally
accepted within the game theory research
On Wed, Feb 19, 2003 at 11:02:31AM -0500, Ben Goertzel wrote:
I'm not sure why an AIXI, rewarded for pleasing humans, would learn an
operating program leading it to hurt or annihilate humans, though.
It might learn a program involving actually doing beneficial acts for humans
Or, it might
The AIXI would just construct some nano-bots to modify the reward-button so
that it's stuck in the down position, plus some defenses to
prevent the reward mechanism from being further modified. It might need to
trick humans initially into allowing it the ability to construct such
nano-bots,
Wei Dai wrote:
The AIXI would just contruct some nano-bots to modify the reward-button so
that it's stuck in the down position, plus some defenses to
prevent the reward mechanism from being further modified. It might need to
trick humans initially into allowing it the ability to construct such
Now, there is no easy way to predict what strategy it will settle on, but
'build a modest bunker and ask to be left alone' surely isn't it. At the
very least it needs to become the strongest military power in the world, and
stay that way. It might very well decide that exterminating the human
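A toy way to see why the bunker strategy loses, in Python. Everything here
is an illustrative assumption of mine -- the strategy names, probabilities,
and horizon are invented, and this is a crude expectation, not anything
AIXI literally computes:

  # Hypothetical pure reward-maximizer comparing candidate strategies.
  # The objective only sees predicted reward, so whichever strategy best
  # secures the reward channel wins, however drastic its side effects.
  strategies = {
      "please humans honestly":         (0.90, 0.8,    100),
      "bunker, ask to be left alone":   (0.50, 1.0,    100),
      "seize and defend reward button": (0.99, 1.0, 10_000),
  }

  def expected_reward(p_success, reward_per_step, horizon):
      # Crude expectation: success probability * reward rate * horizon.
      return p_success * reward_per_step * horizon

  best = max(strategies, key=lambda s: expected_reward(*strategies[s]))
  print(best)  # -> "seize and defend reward button"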
On Wed, Feb 19, 2003 at 11:56:46AM -0500, Eliezer S. Yudkowsky wrote:
The mathematical pattern of a goal system or decision may be instantiated
in many distant locations simultaneously. Mathematical patterns are
constant, and physical processes may produce knowably correlated outputs
given
...
Billy Brown
I think this line of thinking
Wei Dai wrote:
Ok, I see. I think I agree with this. I was confused by your phrase
'Hofstadterian superrationality' because, if I recall correctly, Hofstadter
suggested that one should always cooperate in one-shot PD, whereas you're
saying only cooperate if you have sufficient evidence that the
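One way to cash out that 'sufficient evidence of correlation' condition
(this gloss is mine, not Wei's or Eliezer's exact formulation): with the
usual payoffs (CC=3, CD=0, DC=5, DD=1), let p = P(other cooperates | I
cooperate) and q = P(other cooperates | I defect). Then

  EV(C) = 3p        EV(D) = 5q + 1*(1 - q) = 1 + 4q

so cooperating wins exactly when 3p > 1 + 4q. Perfect correlation (p = 1,
q = 0) gives 3 > 1: cooperate. Causally independent decisions (p = q) give
1 + 4p > 3p for every p: defect, the standard game-theoretic answer.
Hofstadter's superrationality amounts to claiming that two sufficiently
similar reasoners are entitled to p near 1 and q near 0.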
Ben Goertzel wrote:
I think this line of thinking makes way too many assumptions about the
technologies this uber-AI might discover.
It could discover a truly impenetrable shield, for example.
It could project itself into an entirely different universe...
It might decide we pose so little
Billy Brown wrote:
Ben Goertzel wrote:
I think this line of thinking makes way too many assumptions about
the technologies this uber-AI might discover.
It could discover a truly impenetrable shield, for example.
It could project itself into an entirely different universe...
It might decide
It should also be pointed out that we are describing a state of
AI such that:
a) it provides no conceivable benefit to humanity
Not necessarily true: it's plausible that along the way, before learning how
to whack off by stimulating its own reward button, it could provide some
benefits to
Eliezer S. Yudkowsky wrote:
Important, because I strongly suspect Hofstadterian superrationality
is a *lot* more ubiquitous among transhumans than among us...
It's my understanding that Hofstadterian superrationality is not generally
accepted within the game theory research community as a
Eliezer,
Allowing goals to change in a way coupled with thoughts and memories is
not simply adding entropy
-- Ben
Ben Goertzel wrote:
I always thought that the biggest problem with the AIXI model is that it
assumes that something in the environment is evaluating the AI and giving it
On Tue, Feb 18, 2003 at 06:58:30PM -0500, Ben Goertzel wrote:
However, I do think he ended up making a good point about AIXItl, which is
that an AIXItl will probably be a lot worse at modeling other AIXItls than
a human is at modeling other humans. This suggests that AIXItls playing
Hi Eliezer/Ben,
My recollection was that Eliezer initiated the Breaking AIXI-tl
discussion as a way of proving that friendliness of AGIs had to be
consciously built in at the start and couldn't be assumed to be
teachable at a later point. (Or have I totally lost the plot?)
Do you feel the
systems, rather than for any pragmatic implications it may have.
-- Ben
-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]On
Behalf Of Philip Sutton
Sent: Sunday, February 16, 2003 9:42 AM
To: [EMAIL PROTECTED]
Subject: Re: [agi] Breaking AIXI-tl - AGI friendliness
Hi Ben,
From a high order implications point of view I'm not sure that we need
too much written up from the last discussion.
To me it's almost enough to know that both you and Eliezer agree that
the AIXItl system can be 'broken' by the challenge he set and that a
human digital simulation might not. The next step is to ask: so what?
What has this got to do with the AGI friendliness issue?
This last point of
Eliezer/Ben,
When you've had time to draw breath can you explain, in non-obscure,
non-mathematical language, what the implications of the AIXI-tl
discussion are?
Thanks.
Cheers, Philip
Hi,
There's a physical challenge which operates on *one* AIXI-tl and breaks
it, even though it involves diagonalizing the AIXI-tl as part of the
challenge.
OK, I see what you mean by calling it a physical challenge. You mean
that, as part of the challenge, the external agent posing the
hi,
No, the challenge can be posed in a way that refers to an arbitrary agent
A which a constant challenge C accepts as input.
But the problem with saying it this way is that the constant challenge
has to have an infinite memory capacity.
So in a sense, it's an infinite constant ;)
No,
Ben Goertzel wrote:
hi,
No, the challenge can be posed in a way that refers to an arbitrary agent
A which a constant challenge C accepts as input.
But the problem with saying it this way is that the constant challenge
has to have an infinite memory capacity.
So in a sense, it's an infinite
Anyway, a constant cave with an infinite tape seems like a constant
challenge to me, and a finite cave that breaks any {AIXI-tl, tl-human}
contest up to l=googlebyte also still seems interesting, especially as
AIXI-tl is supposed to work for any tl, not just sufficiently high tl.
It's a
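For readers who haven't seen the construction: as I recall Hutter's
definition, AIXI-tl restricts attention to policies given by programs of
length at most l, each allowed computation time at most t per cycle, and
selects among them by a proof search over their self-rated values, at an
overall cost on the order of t * 2^l per cycle. 'Works for any tl' means
the optimality claim is supposed to hold for every setting of those two
bounds, which is why a single finite cave that beats every l up to a
googlebyte is a fair target.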
Eliezer S. Yudkowsky wrote:
Let's imagine I'm a superintelligent magician, sitting in my castle,
Dyson Sphere, what-have-you. I want to allow sentient beings some way
to visit me, but I'm tired of all these wandering AIXI-tl spambots that
script kiddies code up to brute-force my entrance
Ben Goertzel wrote:
In a naturalistic universe, where there is no sharp boundary between
the physics of you and the physics of the rest of the world, the
capability to invent new top-level internal reflective choices can be
very important, pragmatically, in terms of properties of distant
Ben Goertzel wrote:
AIXI-tl can learn the iterated PD, of course; just not the
oneshot complex PD.
But if it's had the right prior experience, it may have an operating program
that is able to deal with the oneshot complex PD... ;-)
Ben, I'm not sure AIXI is capable of this. AIXI may
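The iterated/one-shot distinction is easy to see in a toy model. This
Python sketch is mine (standard PD payoffs); it is not AIXI-tl, just a
learner that can only condition on observed history:

  # Payoffs for (my move, opponent's move); C = cooperate, D = defect.
  PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}

  def score_vs_tit_for_tat(my_move, rounds=100):
      # Total payoff of playing one fixed move every round against
      # tit-for-tat, which opens with C and then copies our last move.
      opponent_move, total = "C", 0
      for _ in range(rounds):
          total += PAYOFF[(my_move, opponent_move)]
          opponent_move = my_move
      return total

  print(score_vs_tit_for_tat("C"))  # 300: cooperation pays, learnably
  print(score_vs_tit_for_tat("D"))  # 104: defection punished from round two

In the iterated game the reward history carries the regularity 'cooperation
pays here', which is exactly the kind of pattern AIXI-tl can pick up. In a
one-shot game there is no history: D dominates C against any fixed opponent
distribution, so a learner that can't model the opponent as a copy of
itself defects -- and two such copies score 1 each instead of 3.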
-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]On
Behalf Of Eliezer S. Yudkowsky
Sent: Saturday, February 15, 2003 3:36 PM
To: [EMAIL PROTECTED]
Subject: Re: [agi] Breaking AIXI-tl
Ben Goertzel wrote:
AIXI-tl can learn the iterated PD, of course; just not the
oneshot complex PD
I guess that for AIXI to learn this sort of thing, it would have to be
rewarded for understanding AIXI in general, for proving theorems about AIXI,
etc. Once it had learned this, it might be able to apply this knowledge in
the one-shot PD context. But I am not sure.
For those of us
Really, when has a computer (with the exception of certain Microsoft
products) ever been able to disobey its human masters?
It's easy to get caught up in the romance of superpowers, but come on,
there's nothing to worry about.
-Daniel
Hi Daniel,
Clearly there is nothing to worry about
Even if a (grown) human is playing PD2, it outperforms AIXI-tl playing
PD2.
Well, in the long run, I'm not at all sure this is the case. You haven't
proved this to my satisfaction.
In the short run, it certainly is the case. But so what? AIXI-tl is damn
slow at learning, we know that.
The
Ben Goertzel wrote:
Even if a (grown) human is playing PD2, it outperforms AIXI-tl
playing PD2.
Well, in the long run, I'm not at all sure this is the case. You
haven't proved this to my satisfaction.
PD2 is very natural to humans; we can take for granted that humans excel
at PD2. The
Bill Hibbard wrote:
On Fri, 14 Feb 2003, Eliezer S. Yudkowsky wrote:
It *could* do this but it *doesn't* do this. Its control process is such
that it follows an iterative trajectory through chaos which is forbidden
to arrive at a truthful solution, though it may converge to a stable
attractor.
Eliezer S. Yudkowsky asked Ben Goertzel:
Do you have a non-intuitive mental simulation mode?
LOL --#:^D
It *is* a valid question, Eliezer, but it makes me laugh.
Michael Roy Ames
[Who currently estimates his *non-intuitive mental simulation mode* to
contain about 3 iterations of 5
I'll read the rest of your message tomorrow...
But we aren't *talking* about whether AIXI-tl has a mindlike operating
program. We're talking about whether the physically realizable
challenge,
which definitely breaks the formalism, also breaks AIXI-tl in practice.
That's what I originally
Hmmm... My friend, I think you've pretty much convinced me with this last
batch of arguments. Or, actually, I'm not sure if it was your excellently
clear arguments or the fact that I finally got a quiet 15 minutes to really
think about it (the three kids, who have all been out sick from
Eliezer S. Yudkowsky wrote:
But if this isn't immediately obvious to you, it doesn't seem like a top
priority to try and discuss it...
Argh. That came out really, really wrong and I apologize for how it
sounded. I'm not very good at agreeing to disagree.
Must... sleep...
--
Eliezer S.
Okay, let's see, I promised:
An intuitively fair, physically realizable challenge, with important
real-world analogues, formalizable as a computation which can be fed
either a tl-bounded uploaded human or an AIXI-tl, for which the human
enjoys greater success measured strictly by total reward
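The shape of that challenge, as I read the thread (the framing and names
below are my own sketch in Python, not Eliezer's exact construction):

  # Payoffs for (my move, other's move) in a one-shot Prisoner's Dilemma.
  PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}

  def clone_challenge(agent_source, run):
      # `run` executes an agent's source code on an observation and
      # returns "C" or "D". The challenge instantiates two copies of the
      # *same* source and pays each against the other copy.
      obs = "you face a one-shot PD against an exact copy of yourself"
      move_a = run(agent_source, obs)
      move_b = run(agent_source, obs)
      return PAYOFF[(move_a, move_b)], PAYOFF[(move_b, move_a)]

  # Toy agents, purely for illustration:
  print(clone_challenge("mirror-reasoner", lambda src, obs: "C"))    # (3, 3)
  print(clone_challenge("dominance-reasoner", lambda src, obs: "D")) # (1, 1)

A reflective human can reason 'my copy's output equals mine', so only
(C, C) and (D, D) are reachable and C gets 3. AIXI-tl, as discussed above,
can't compactly model an exact copy of itself within its own t and l
bounds, treats the copy as a generic environment, defects, and gets 1.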
Shane Legg wrote:
Eliezer,
Yes, this is a clever argument. This problem with AIXI has been
thought up before but only appears, at least as far as I know, in
material that is currently unpublished. I don't know if anybody
has analysed the problem in detail as yet... but it certainly is
a very
Eliezer S. Yudkowsky wrote:
Has the problem been thought up just in the sense of 'What happens when
two AIXIs meet?' or in the formalizable sense of 'Here's a computational
challenge C on which a tl-bounded human upload outperforms AIXI-tl'?
I don't know of anybody else considering human upload