Re: [agi] Breaking AIXI-tl

Eliezer S. Yudkowsky Sat, 15 Feb 2003 07:24:38 -0800

Ben Goertzel wrote:

It's really the formalizability of the challenge as a computation which
can be fed either a *single* AIXI-tl or a *single* tl-bounded uploaded
human that makes the whole thing interesting at all... I'm sorry I didn't
succeed in making clear the general class of real-world analogues for
which this is a special case.

OK....  I don't see how the challenge you've described is
"formalizable as a computation which can be fed either a tl-bounded uploaded
human or an AIXI-tl."


The challenge involves cloning the agent being challenged.  Thus it is not a
computation feedable to the agent, unless you assume the agent is supplied
with a cloning machine...

You're not feeding the *challenge* to the *agent*. You're feeding the *agent* to the *challenge*. There's a constant computation C, which accepts as input an arbitrary agent, either a single AIXI-tl or a single tl-bounded upload, and creates a problem environment on which the upload is superior to the AIXI-tl. As part of this operation, computation C internally clones the agent, but the cloning takes place entirely inside C. That's why I call it diagonalizing.
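To make the shape of that claim concrete, here is a minimal toy sketch of "feeding the agent to the challenge". Everything in it is an assumption for illustration: the names `challenge`, `CloneAwareAgent`, `AlwaysDefectAgent`, the `opponent_is_clone` flag, and the payoff numbers do not come from the thread. `AlwaysDefectAgent` merely stands in for an agent that treats its opponent as uncorrelated with itself; it is not an implementation of AIXI-tl, and nothing here proves anything about AIXI-tl. The only point is that `challenge` is one fixed function and the cloning happens inside it.

```python
import copy

COOPERATE, DEFECT = "C", "D"

# Conventional one-shot Prisoner's Dilemma payoffs for the first player
# (illustrative numbers, not taken from the thread).
PAYOFF = {
    (COOPERATE, COOPERATE): 3,
    (COOPERATE, DEFECT): 0,
    (DEFECT, COOPERATE): 5,
    (DEFECT, DEFECT): 1,
}


class CloneAwareAgent:
    """Hypothetical agent that reasons 'my copy will choose as I do'."""

    def act(self, opponent_is_clone):
        return COOPERATE if opponent_is_clone else DEFECT


class AlwaysDefectAgent:
    """Hypothetical agent that models the opponent as uncorrelated environment."""

    def act(self, opponent_is_clone):
        return DEFECT


def challenge(agent):
    """Constant computation C: accepts one agent, clones it internally,
    plays the agent against its clone once, and returns the agent's payoff."""
    clone = copy.deepcopy(agent)  # the cloning is part of C, not of the agent
    own_move = agent.act(opponent_is_clone=True)
    clone_move = clone.act(opponent_is_clone=True)
    return PAYOFF[(own_move, clone_move)]


print(challenge(CloneAwareAgent()))    # 3
print(challenge(AlwaysDefectAgent()))  # 1
```

The same `challenge` object is applied to both agents; it is never rebuilt or re-parameterized per agent, which is the sense of "constant" used above.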

If I were to take a very rough stab at it, it would be that the
cooperation case with your own clone is an extreme case of many scenarios
where superintelligences can cooperate with each other on the one-shot
Prisoner's Dilemma provided they have *loosely similar* reflective goal
systems and that they can probabilistically estimate that enough loose
similarity exists.

Yah, but the definition of a superintelligence is relative to the agent
being challenged.

For any fixed superintelligent agent A, there are AIXItl's big enough to
succeed against it in any cooperative game.

To "break" AIXI-tl, the challenge needs to be posed in a way that refers to
AIXItl's own size, i.e. one has to say something like "Playing a cooperative
game with other intelligences of intelligence at least f(t,l)"  where f is
some increasing function....

No, the challenge can be posed in a way that refers to an arbitrary agent A which a constant challenge C accepts as input. For the naturalistic metaphor of a physical challenge, visualize a cavern into which an agent walks, rather than a game the agent is given to play.

If the intelligence of the opponents is fixed, then one can always make an
AIXItl win by increasing t and l ...

So your challenges are all of the form:

* For any fixed AIXItl, here is a challenge that will defeat it

Here is a constant challenge C which accepts as input an arbitrary agent A, and defeats AIXI-tl but not tl-Corbin.

ForAll AIXItl's A(t,l), ThereExists a challenge C(t,l) so that fails_at(A,C)

or alternatively

ForAll AIXItl's A(t,l), ThereExists a challenge C(A(t,l)) so that
fails_at(A,C)

rather than of the form

* Here is a challenge that will defeat any AIXItl

No, the charm of the physical challenge is exactly that there exists a physically constant cavern which defeats any AIXI-tl that walks into it, while being tractable for wandering tl-Corbins.

ThereExists a challenge C so that ForAll AIXItl's A(t,l), fails_at(A,C)

The point is that the challenge C is a function C(t,l) rather than being
independent of t and l

Nope.  One cave.
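The disagreement in this exchange comes down to quantifier order. Written out with the thread's own fails_at predicate (the symbol \mathcal{A} for the family of AIXI-tl agents is introduced here purely as notation, and fails_at(A, C) is read as "A fails in the environment that C builds when given A"):

```latex
% Reading 1: the challenge may be chosen after, and depend on, t and l
\[
  \forall A(t,l) \in \mathcal{A} \;\; \exists C_{t,l} :\;
  \mathrm{fails\_at}\bigl(A(t,l),\, C_{t,l}\bigr)
\]

% Reading 2 ("one cave"): a single challenge fixed before t and l are chosen
\[
  \exists C \;\; \forall A(t,l) \in \mathcal{A} :\;
  \mathrm{fails\_at}\bigl(A(t,l),\, C\bigr)
\]
```

Reading 2 implies Reading 1 (take C_{t,l} = C for every t and l), but not the other way around, so the "one cave" claim is the strictly stronger of the two.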

This of course is why your challenge doesn't break Hutter's theorem.  But
it's a distinction that your initial verbal formulation didn't make very
clearly (and I understand, the distinction is not that easy to make in
words.)

No, the reason my challenge breaks Hutter's assumptions (though not disproving the theorem itself) is that it examines the internal state of the agent in order to clone it. My secondary thesis is that this is not a physically "unfair" scenario because correlations between self and environment are ubiquitous in naturalistic reality.

Of course, it's also true that

ForAll uploaded humans H, ThereExists a challenge C(H) so that fails_at(H,C)

What you've shown that's interesting is that

ThereExists a challenge C, so that:
-- ForAll AIXItl's A(t,l), fails_at(A,C(A))
-- for many uploaded humans H, succeeds_at(H,C(H))

(Where, were one to try to actually prove this, one would substitute
"uploaded humans" with "other AI programs" or something).

This is almost right but, again, the point is that I'm thinking of C as a constant physical situation a single agent can face, a real-world cavern that it walks into. You could, if you wanted to filter those mere golem AIXI-tls out of your magician's castle, but let in real Corbins, construct a computationally simple barrier that did the trick... (Assuming tabula rasa AIXI-tls, so as not to start that up again.)

The interesting part is that these little
natural breakages in the formalism create an inability to take part in
what I think might be a fundamental SI social idiom, conducting binding
negotiations by convergence to goal processes that are guaranteed to have
a correlated output, which relies on (a) Bayesian-inferred initial
similarity between goal systems, and (b) the ability to create a top-level
reflective choice that wasn't there before, that (c) was abstracted over
an infinite recursion in your top-level predictive process.

I think part of what you're saying here is that AIXItl's are not designed to
be able to participate in a community of equals....  This is certainly true.

Well, yes, as a special case of AIXI-tl's being unable to carry out reasoning where their internal processes are correlated with the environment.

--
Eliezer S. Yudkowsky http://singinst.org/
Research Fellow, Singularity Institute for Artificial Intelligence
