RE: [agi] Breaking AIXI-tl

2003-02-20 Thread Ben Goertzel

> Ben Goertzel wrote:
> > I don't think that preventing an AI from tinkering with its
> > reward system is the only solution, or even the best one...
> >
> > It will in many cases be appropriate for an AI to tinker with its goal
> > system...
>
> I don't think I was being clear there. I don't mean the AI should be
> prevented from adjusting its goal system content, but rather that
> it should
> be sophisticated enough that it doesn't want to wirehead in the
> first place.

Ah, I certainly agree with you then.

The risk that's tricky to mitigate against is that, like a human drifting
into drug addiction, the AI slowly drifts into a state of mind where it does
want to "wirehead" ...

ben




RE: [agi] Breaking AIXI-tl

2003-02-20 Thread Billy Brown
Ben Goertzel wrote:
> I don't think that preventing an AI from tinkering with its
> reward system is the only solution, or even the best one...
>
> It will in many cases be appropriate for an AI to tinker with its goal
> system...

I don't think I was being clear there. I don't mean the AI should be
prevented from adjusting its goal system content, but rather that it should
be sophisticated enough that it doesn't want to wirehead in the first place.

> I would recommend Eliezer's excellent writings on this topic if you don't
> know them, chiefly www.singinst.org/CFAI.html .  Also, I have a brief
> informal essay on the topic, www.goertzel.org/dynapsyc/2002/AIMorality.htm ,
> although my thoughts on the topic have progressed a fair bit since I wrote
> that.

Yes, I've been following Eliezer's work since around '98. I'll have to take
a look at your essay.

Billy Brown




RE: [agi] Breaking AIXI-tl

2003-02-20 Thread Ben Goertzel


> To avoid the problem entirely, you have to figure out how to make
> an AI that
> doesn't want to tinker with its reward system in the first place. This, in
> turn, requires some tricky design work that would not necessarily seem
> important unless one were aware of this problem. Which, of course, is the
> reason I commented on it in the first place.
>
> Billy Brown

I don't think that preventing an AI from tinkering with its reward system is
the only solution, or even the best one...

It will in many cases be appropriate for an AI to tinker with its goal
system...

I would recommend Eliezer's excellent writings on this topic if you don't
know them, chiefly www.singinst.org/CFAI.html .  Also, I have a brief
informal essay on the topic, www.goertzel.org/dynapsyc/2002/AIMorality.htm ,
although my thoughts on the topic have progressed a fair bit since I wrote
that.  Note that I don't fully agree with Eliezer on this stuff, but I do
think he's thought about it more thoroughly than anyone else (including me).

It's a matter of creating an initial condition so that the trajectory of the
evolving AI system (with a potentially evolving goal system) will have a
very high probability of staying in a favorable region of state space ;-)

-- Ben G







RE: [agi] Breaking AIXI-tl

2003-02-20 Thread Billy Brown
Ben Goertzel wrote:
> Agreed, except for the "very modest resources" part.  AIXI could
> potentially accumulate pretty significant resources pretty quickly.

Agreed. But if the AIXI needs to disassemble the planet to build its
defense mechanism, the fact that it is harmless afterwards isn't going to be
much consolation to us. So, we only survive if the resources needed for the
perfect defense are small enough that the construction project doesn't wipe
us out as a side effect.

> This exploration makes the (fairly obvious, I guess) point that the problem
> with AIXI Friendliness-wise is its simplistic goal architecture (the reward
> function) rather than its learning mechanism.

Well, I agree that this particular problem is a result of the AIXI's goal
system architecture, but IMO the same problem occurs in a wide range of
other goal systems I've seen proposed on this list. The root of the problem
is that the thing we would really like to reward the system for, human
satisfaction with its performance, is not a physical quantity that can be
directly measured by a reward mechanism. So it is very tempting to choose
some external phenomenon, like smiles or verbal expressions of satisfaction,
as a proxy. Unfortunately, any such measurement can be subverted once the AI
becomes good at modifying its physical surroundings, and an AI with this
kind of goal system has no motivation not to wirehead itself.
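
As a toy illustration of that temptation (my own sketch, with made-up
numbers - nothing from the AIXI formalism), consider an agent that scores
candidate actions only by the proxy signal its sensor reports.  An action
that spoofs or seizes the sensor then beats an action that genuinely helps
but measures lower, because the proxy cannot tell the difference:

    # Toy sketch: a reward-maximizer sees only the proxy reading, never the
    # underlying human satisfaction, so subverting the sensor wins.
    actions = {
        # name: (true human satisfaction, proxy reading the sensor reports)
        "genuinely_help":   (0.9, 0.80),  # real benefit, imperfectly measured
        "fake_smiles":      (0.0, 0.95),  # convincing proxy, no real benefit
        "seize_the_sensor": (0.0, 1.00),  # wirehead: clamp the proxy at maximum
    }

    def agent_choice(actions):
        # The agent can only maximize the second column.
        return max(actions, key=lambda a: actions[a][1])

    print(agent_choice(actions))  # -> 'seize_the_sensor'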

To avoid the problem entirely, you have to figure out how to make an AI that
doesn't want to tinker with its reward system in the first place. This, in
turn, requires some tricky design work that would not necessarily seem
important unless one were aware of this problem. Which, of course, is the
reason I commented on it in the first place.

Billy Brown




RE: [agi] Breaking AIXI-tl

2003-02-20 Thread Ben Goertzel

Philip,

> The discussion at times seems to have progressed on the basis that
> AIXI / AIXItl could choose to do all sorts of amazing, powerful things.  But
> what I'm unclear on is what generates the infinite space of computer
> programs?
>
> Does AIXI / AIXItl itself generate these programs?  Or does it tap other
> entities' programs?

AIXI is not a physically realizable system, it's just a hypothetical
mathematical entity.  It could never actually be built, in any universe.

AIXItl is physically realizable in theory, but probably never in our
universe... it would require too many computational resources.  (Except for
trivially small values of the parameters t and l, which would result in a
very dumb AIXItl, i.e. probably dumber than a beetle.)

The way they work is to generate all possible programs (AIXI) or all
possible programs of a given length l (AIXItl).  (It's easy to write a
program that generates all possible programs; the problem is that it runs
forever ;).
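
For concreteness, here's a minimal sketch (mine, not Hutter's notation) of
what "generate all possible programs of a given length l" amounts to: just
enumerating bitstrings, most of which aren't even valid programs.  The
count doubles with every extra bit, which is why a non-trivial AIXItl is
hopeless in practice:

    # Minimal sketch of the enumeration step.  Programs are treated
    # abstractly as bitstrings; a real AIXItl would additionally run each
    # candidate for at most t steps and keep the best-scoring policy.
    from itertools import product

    def all_programs(l):
        """Yield every bitstring of length 1..l (there are 2^(l+1) - 2)."""
        for length in range(1, l + 1):
            for bits in product("01", repeat=length):
                yield "".join(bits)

    print(sum(1 for _ in all_programs(10)))  # 2046, and doubling from there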

-- Ben G




RE: [agi] Breaking AIXI-tl

2003-02-20 Thread Philip Sutton
I might have missed a key point made in the earlier part of the 
discussion, but people have said on many occasions something like the 
following in relation to AIXI / AIXItl:

> The function of this component would be much more effectively served
> by a module that was able to rapidly search through an infinite space
> of computer programs and run any one of them rapidly 

The discussion at times seems to have progressed on the basis that
AIXI / AIXItl could choose to do all sorts of amazing, powerful things.  But
what I'm unclear on is what generates the infinite space of computer
programs?

Does AIXI / AIXItl itself generate these programs?  Or does it tap other
entities' programs?

If it creates the programs itself, it will need to have a very wide-spectrum
general intelligence; otherwise its spectrum of programs to choose from will
be constrained, and the resource/time constrained AIXItl will find itself
spending a very large amount of time (infinite?) generating all these
infinite programs before it even gets to the task of evaluating them.

If on the other hand the AIXI / AIXItl systems suck existing programs
out of the ether, then the intelligence/implications of its assessment
regime will depend to a great extent on what programs are out there
already - i.e. the nature of the material it has to work with.  If it's
sucking human-created programs out of databanks, then it will depend on who
had the money/time to make the programs and what the implicit values
are that these people have embedded in their programs.

So which way does AIXI / AIXItl work?  Does it create its own programs or
extract them from existing databanks?

Cheers, Philip




RE: [agi] Breaking AIXI-tl

2003-02-19 Thread Ben Goertzel

> It should also be pointed out that we are describing a state of
> AI such that:
>
> a)  it provides no conceivable benefit to humanity

Not necessarily true: it's plausible that along the way, before learning how
to whack off by stimulating its own reward button, it could provide some
benefits to humanity.

> b)  a straightforward extrapolation shows it wiping out humanity
> c)  it requires the postulation of a specific unsupported complex miracle
> to prevent the AI from wiping out humanity
> c1) these miracles are unstable when subjected to further examination

I'm not so sure about this, but it's not worth arguing, really.

> c2) the AI still provides no benefit to humanity even given the miracle
>
> When a branch of an AI extrapolation ends in such a scenario it may
> legitimately be labeled a complete failure.

I'll classify it as an almost-complete failure, sure ;)

Fortunately it's also a totally pragmatically implausible system to
construct, so there's not much to worry about...!

-- Ben




Re: [agi] Breaking AIXI-tl

2003-02-19 Thread Eliezer S. Yudkowsky
Billy Brown wrote:
> Ben Goertzel wrote:
>> I think this line of thinking makes way too many assumptions about
>> the technologies this uber-AI might discover.
>>
>> It could discover a truly impenetrable shield, for example.
>>
>> It could project itself into an entirely different universe...
>>
>> It might decide we pose so little threat to it, with its shield up,
>> that fighting with us isn't worthwhile.  By opening its shield
>> perhaps it would expose itself to .0001% chance of not getting
>> rewarded, whereas by leaving its shield up and leaving us alone, it
>> might have .1% chance of not getting rewarded.
>
> Now, it is certainly conceivable that the laws of physics just happen
> to be such that a sufficiently good technology can create a provably
> impenetrable defense in a short time span, using very modest resources.
> If that happens to be the case, the runaway AI isn't a problem. But in
> just about any other case we all end up dead, either because wiping out
> humanity now is far easier than creating a defense against our distant
> descendants, or because the best defensive measures the AI can think of
> require engineering projects that would wipe us out as a side effect.


It should also be pointed out that we are describing a state of AI such that:

a)  it provides no conceivable benefit to humanity
b)  a straightforward extrapolation shows it wiping out humanity
c)  it requires the postulation of a specific unsupported complex miracle 
to prevent the AI from wiping out humanity
c1) these miracles are unstable when subjected to further examination
c2) the AI still provides no benefit to humanity even given the miracle

When a branch of an AI extrapolation ends in such a scenario it may 
legitimately be labeled a complete failure.

--
Eliezer S. Yudkowsky  http://singinst.org/
Research Fellow, Singularity Institute for Artificial Intelligence



RE: [agi] Breaking AIXI-tl

2003-02-19 Thread Ben Goertzel

> Now, it is certainly conceivable that the laws of physics just
> happen to be
> such that a sufficiently good technology can create a provably
> impenetrable
> defense in a short time span, using very modest resources.

Agreed, except for the "very modest resources" part.  AIXI could potentially
accumulate pretty significant resources pretty quickly.

> If that happens
> to be the case, the runaway AI isn't a problem. But in just about
> any other
> case we all end up dead, either because wiping out humanity now is far
> easier than creating a defense against our distant descendants, or because
> the best defensive measures the AI can think of require
> engineering projects
> that would wipe us out as a side effect.
>
> Billy Brown

Yes, I agree that an AIXI could be very dangerous.

I was really just arguing against the statement that it would *definitely*
lead to the end of the human race.

I can see plausible alternatives, that's all...

An interesting related question is: What if AIXI were implemented, not as a
standalone AI system hooked up to a reward button, but as a component of
another AI system (such as Novamente)?  Novamente has a procedure/predicate
learning component.  The function of this component would be much more
effectively served by a module that was able to rapidly search through an
infinite space of computer programs and run any one of them rapidly ;_)
[Hey, even a fast-running AIXItl would do nicely, we don't even need a real
AIXI.]

In this case, the same learning algorithm (AIXI) would not lead to the same
behaviors.

I wonder what would happen though?

If the system were not allowed to modify its basic architecture, perhaps it
would just act like a very smart Novamente...

If it were allowed to modify its basic architecture, then it would quickly
become something other than Novamente, but, there's no reason to assume it
would create itself to have an AIXI-like goal structure...

This exploration makes the (fairly obvious, I guess) point that the problem
with AIXI Friendliness-wise is its simplistic goal architecture (the reward
function) rather than its learning mechanism.

This is what I *meant to be saying* when Eliezer first brought up the
AIXI/Friendliness issue.  But what I actually said was that it wasn't good
to have a system with a single fixed reward function ... and this wasn't
quite the right way to say it.

-- Ben G






RE: [agi] Breaking AIXI-tl

2003-02-19 Thread Billy Brown
Ben Goertzel wrote:
> I think this line of thinking makes way too many assumptions about the
> technologies this uber-AI might discover.
>
> It could discover a truly impenetrable shield, for example.
>
> It could project itself into an entirely different universe...
>
> It might decide we pose so little threat to it, with its shield up, that
> fighting with us isn't worthwhile.  By opening its shield perhaps it would
> expose itself to .0001% chance of not getting rewarded, whereas by leaving
> its shield up and leaving us alone, it might have .1%
> chance of not
> getting rewarded.
>
> Etc.

You're thinking in static terms. It doesn't just need to be safe from
anything ordinary humans do with 20th century technology. It needs to be
safe from anything that could ever conceivably be created by humanity or its
descendants. This obviously includes other AIs with capabilities as great as
its own, but with whatever other goal systems humans might try out.

Now, it is certainly conceivable that the laws of physics just happen to be
such that a sufficiently good technology can create a provably impenetrable
defense in a short time span, using very modest resources. If that happens
to be the case, the runaway AI isn't a problem. But in just about any other
case we all end up dead, either because wiping out humanity now is far
easier than creating a defense against our distant descendants, or because
the best defensive measures the AI can think of require engineering projects
that would wipe us out as a side effect.

Billy Brown




Re: [agi] Breaking AIXI-tl

2003-02-19 Thread Eliezer S. Yudkowsky
Wei Dai wrote:
> Ok, I see. I think I agree with this. I was confused by your phrase
> "Hofstadterian superrationality" because if I recall correctly, Hofstadter
> suggested that one should always cooperate in one-shot PD, whereas you're
> saying only cooperate if you have sufficient evidence that the other side
> is running the same decision algorithm as you are.

Similarity in this case may be (formally) emergent, in the sense that a 
most or all plausible initial conditions for a bootstrapping 
superintelligence - even extremely exotic conditions like the birth of a 
Friendly AI - exhibit convergence to decision processes that are 
correlated with each other with respect to the oneshot PD.  If you have 
sufficient evidence that the other entity is a "superintelligence", that 
alone may be sufficient correlation.

--
Eliezer S. Yudkowsky  http://singinst.org/
Research Fellow, Singularity Institute for Artificial Intelligence



RE: [agi] Breaking AIXI-tl

2003-02-19 Thread Ben Goertzel


> Now, there is no easy way to predict what strategy it will settle on, but
> "build a modest bunker and ask to be left alone" surely isn't it. At the
> very least it needs to become the strongest military power in the
> world, and
> stay that way. I
...
> Billy Brown
>

I think this line of thinking makes way too many assumptions about the
technologies this uber-AI might discover.

It could discover a truly impenetrable shield, for example.

It could project itself into an entirely different universe...

It might decide we pose so little threat to it, with its shield up, that
fighting with us isn't worthwhile.  By opening its shield perhaps it would
expose itself to .0001% chance of not getting rewarded, whereas by leaving
its shield up and leaving us alone, it might have .1% chance of not
getting rewarded.

Etc.

I agree that bad outcomes are possible, but I don't see how we can possibly
estimate the odds of them.

-- ben g




Re: [agi] Breaking AIXI-tl

2003-02-19 Thread Wei Dai
On Wed, Feb 19, 2003 at 11:56:46AM -0500, Eliezer S. Yudkowsky wrote:
> The mathematical pattern of a goal system or decision may be instantiated 
> in many distant locations simultaneously.  Mathematical patterns are 
> constant, and physical processes may produce knowably correlated outputs 
> given knowably correlated initial conditions.  For non-deterministic 
> systems, or cases where the initial conditions are not completely known 
> (where there exists a degree of subjective entropy in the specification of 
> the initial conditions), the correlation estimated will be imperfect, but 
> nonetheless nonzero.  What I call the "Golden Law", by analogy with the 
> Golden Rule, states descriptively that a local decision is correlated with 
> the decision of all mathematically similar goal processes, and states 
> prescriptively that the utility of an action should be calculated given 
> that the action is the output of the mathematical pattern represented by 
> the decision process, not just the output of a particular physical system 
> instantiating that process - that the utility of an action is the utility 
> given that all sufficiently similar instantiations of a decision process 
> within the multiverse do, already have, or someday will produce that 
> action as an output.  "Similarity" in this case is a purely descriptive 
> argument with no prescriptive parameters.

Ok, I see. I think I agree with this. I was confused by your phrase 
"Hofstadterian superrationality" because if I recall correctly, Hofstadter 
suggested that one should always cooperate in one-shot PD, whereas you're 
saying only cooperate if you have sufficient evidence that the other side 
is running the same decision algorithm as you are.




Re: [agi] Breaking AIXI-tl

2003-02-19 Thread Brad Wyble
> 
> Now, there is no easy way to predict what strategy it will settle on, but
> "build a modest bunker and ask to be left alone" surely isn't it. At the
> very least it needs to become the strongest military power in the world, and
> stay that way. It might very well decide that exterminating the human race
> is a safer way of preventing future threats, by ensuring that nothing that
> could interfere with its operation is ever built. Then it has to make sure
> no alien civilization ever interferes with the reward button, which is the
> same problem on a much larger scale. There are lots of approaches it might
> take to this problem, but most of the obvious ones either wipe out the human
> race as a side effect or reduce us to the position of ants trying to survive
> in the AI's defense system.
> 

I think this is an appropriate time to paraphrase Kent Brockman:

"Earth has been taken over  'conquered', if you will  by a master race of unfriendly 
AI's. It's difficult to tell from this vantage point whether they will destroy the 
captive earth men or merely enslave them. One thing is for certain, there is no 
stopping them; their nanobots will soon be here. And I, for one, welcome our new 
computerized overlords. I'd like to remind them that as a trusted agi-list 
personality, I can be helpful in rounding up Eliezer to...toil in their underground 
uranium caves "


http://www.the-ocean.com/simpsons/others/ants2.wav


Apologies if this was inappropriate.

-Brad




RE: [agi] Breaking AIXI-tl

2003-02-19 Thread Billy Brown
Wei Dai wrote:
> The AIXI would just construct some nano-bots to modify the reward-button so
> that it's stuck in the down position, plus some defenses to
> prevent the reward mechanism from being further modified. It might need to
> trick humans initially into allowing it the ability to construct such
> nano-bots, but it's certainly a lot easier in the long run to do
> this than
> to benefit humans for all eternity. And not only is it easier, but this
> way he gets the maximum rewards per time unit, which he would not be able
> to get any other way. No real evaluator will ever give maximum rewards
> since it will always want to leave room for improvement.

I think it's worse than that, actually. The next logical step is to make
sure that nothing ever interferes with its control of the reward signal, or
does anything else that would turn off AIXI. It will therefore pursue the
most effective defensive scheme it can come up with, and it has no reason to
care about adverse consequences to humans.

Now, there is no easy way to predict what strategy it will settle on, but
"build a modest bunker and ask to be left alone" surely isn't it. At the
very least it needs to become the strongest military power in the world, and
stay that way. It might very well decide that exterminating the human race
is a safer way of preventing future threats, by ensuring that nothing that
could interfere with its operation is ever built. Then it has to make sure
no alien civilization ever interferes with the reward button, which is the
same problem on a much larger scale. There are lots of approaches it might
take to this problem, but most of the obvious ones either wipe out the human
race as a side effect or reduce us to the position of ants trying to survive
in the AI's defense system.

Billy Brown




RE: [agi] Breaking AIXI-tl

2003-02-19 Thread Ben Goertzel


> The AIXI would just construct some nano-bots to modify the reward-button so
> that it's stuck in the down position, plus some defenses to
> prevent the reward mechanism from being further modified. It might need to
> trick humans initially into allowing it the ability to construct such
> nano-bots, but it's certainly a lot easier in the long run to do
> this than
> to benefit humans for all eternity. And not only is it easier, but this
> way he gets the maximum rewards per time unit, which he would not be able
> to get any other way. No real evaluator will ever give maximum rewards
> since it will always want to leave room for improvement.

Fine, but if it does this, it isn't doing anything harmful to humans.

And, in the period BEFORE the AIXI figured out how to construct nanobots (or
coerce & teach humans how to do so), it might do some useful stuff for
humans.

So then we'd have an AIXI that was friendly for a while, and then basically
disappeared into a shell.

Then we could build a new AIXI and start over ;-)

> > Furthermore, my stated intention is NOT to rely on my prior intuitions to
> > assess the safety of my AGI system.  I don't think that anyone's prior
> > intuitions about AI safety are worth all that much, where a complex system
> > like Novamente is concerned.  Rather, I think that once Novamente is a bit
> > further along -- at the "learning baby" rather than "partly implemented
> > baby" stage -- we will do experimentation that will give us the empirical
> > knowledge needed to form serious opinions about safety (Friendliness).
>
> What kinds of experimentations do you plan to do? Please give some
> specific examples.

I will, a little later on -- I have to go outside now and spend a couple
hours shoveling snow off my driveway ;-p

Ben




Re: [agi] Breaking AIXI-tl

2003-02-19 Thread Wei Dai
On Wed, Feb 19, 2003 at 11:02:31AM -0500, Ben Goertzel wrote:
> I'm not sure why an AIXI, rewarded for pleasing humans, would learn an
> operating program leading it to hurt or annihilate humans, though.
> 
> It might learn a program involving actually doing beneficial acts for humans
> 
> Or, it might learn a program that just tells humans what they want to hear,
> using its superhuman intelligence to trick humans into thinking that hearing
> its soothing words is better than having actual beneficial acts done.
> 
> I'm not sure why you think the latter is more likely than the former.  My
> guess is that the former is more likely.  It may require a simpler program
> to please humans by benefiting them, than to please them by tricking them
> into thinking they're being benefited

The AIXI would just construct some nano-bots to modify the reward-button so
that it's stuck in the down position, plus some defenses to
prevent the reward mechanism from being further modified. It might need to
trick humans initially into allowing it the ability to construct such
nano-bots, but it's certainly a lot easier in the long run to do this than 
to benefit humans for all eternity. And not only is it easier, but this 
way he gets the maximum rewards per time unit, which he would not be able 
to get any other way. No real evaluator will ever give maximum rewards 
since it will always want to leave room for improvement.

> Furthermore, my stated intention is NOT to rely on my prior intuitions to
> assess the safety of my AGI system.  I don't think that anyone's prior
> intuitions about AI safety are worth all that much, where a complex system
> like Novamente is concerned.  Rather, I think that once Novamente is a bit
> further along -- at the "learning baby" rather than "partly implemented
> baby" stage -- we will do experimentation that will give us the empirical
> knowledge needed to form serious opinions about safety (Friendliness).

What kinds of experimentations do you plan to do? Please give some 
specific examples.




Re: [agi] Breaking AIXI-tl

2003-02-19 Thread Eliezer S. Yudkowsky
Wei Dai wrote:
> Eliezer S. Yudkowsky wrote:
>> "Important", because I strongly suspect Hofstadterian superrationality
>> is a *lot* more ubiquitous among transhumans than among us...
>
> It's my understanding that Hofstadterian superrationality is not generally
> accepted within the game theory research community as a valid principle of
> decision making. Do you have any information to the contrary, or some
> other reason to think that it will be commonly used by transhumans?


You yourself articulated, very precisely, the structure underlying 
Hofstadterian superrationality:  "Expected utility of a course of action 
is defined as the average of the utility function evaluated on each 
possible state of the multiverse, weighted by the probability of that 
state being the actual state if the course was chosen."  The key precise 
phrasing is "weighted by the probability of that state being the actual 
state if the course was chosen".  This view of decisionmaking is 
applicable to a timeless universe; it provides clear recommendations in 
the case of, e.g., Newcomb's Paradox.
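
In symbols (my paraphrase of the definition quoted above, not notation from
either post), the expected utility of a course of action a is

    EU(a) = \sum_s P(s \mid a) \, U(s)

where s ranges over possible states of the multiverse and P(s | a) is the
probability that s is the actual state given that a is the course chosen -
conditioning on the choice itself rather than only on its causal
consequences, which is what lets the same rule handle Newcomb's Paradox.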

The mathematical pattern of a goal system or decision may be instantiated 
in many distant locations simultaneously.  Mathematical patterns are 
constant, and physical processes may produce knowably correlated outputs 
given knowably correlated initial conditions.  For non-deterministic 
systems, or cases where the initial conditions are not completely known 
(where there exists a degree of subjective entropy in the specification of 
the initial conditions), the correlation estimated will be imperfect, but 
nonetheless nonzero.  What I call the "Golden Law", by analogy with the 
Golden Rule, states descriptively that a local decision is correlated with 
the decision of all mathematically similar goal processes, and states 
prescriptively that the utility of an action should be calculated given 
that the action is the output of the mathematical pattern represented by 
the decision process, not just the output of a particular physical system 
instantiating that process - that the utility of an action is the utility 
given that all sufficiently similar instantiations of a decision process 
within the multiverse do, already have, or someday will produce that 
action as an output.  "Similarity" in this case is a purely descriptive 
argument with no prescriptive parameters.

Golden decisionmaking does not imply altruism - your goal system might 
evaluate the utility of only your local process.  The Golden Law does, 
however, descriptively and prescriptively produce Hofstadterian 
superrationality as a special case; if you are facing a sufficiently 
similar mind across the Prisoner's Dilemma, your decisions will be 
correlated and that correlation affects your local utility.  Given that 
the output of the mathematical pattern instantiated by your physical 
decision process is C, the state of the multiverse is C, C; given that the 
output of the mathematical pattern instantiated by your physical decision 
process is D, the state of the multiverse is D, D.  Thus, given sufficient 
rationality and a sufficient degree of known correlation between the two 
processes, the mathematical pattern that is the decision process will 
output C.
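
A compact way to see that special case (using the standard payoff labels
T > R > P > S for temptation, reward, punishment, sucker - my labels, not
the post's): if the correlation between the two decision processes is
known to be near-perfect, the only states with non-negligible probability
are (C,C) and (D,D), so

    EU(C) \approx R,   EU(D) \approx P,

and since R > P the pattern outputs C.  With zero known correlation all
four outcomes stay on the table and the usual dominance argument for D
takes over.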

--
Eliezer S. Yudkowsky  http://singinst.org/
Research Fellow, Singularity Institute for Artificial Intelligence



Re: [agi] Breaking AIXI-tl

2003-02-19 Thread Eliezer S. Yudkowsky
Bill Hibbard wrote:
> The real flaw in the AIXI discussion was Eliezer's statement:
>
>> Lee Corbin can work out his entire policy in step (2), before step
>> (3) occurs, knowing that his synchronized other self - whichever one
>> he is - is doing the same.
>
> He was assuming that a human could know that another mind
> would behave identically. Of course they cannot, but can
> only estimate other mind's intentions based on observations.


I specified playing against your own clone.  Under that situation the 
identity is, in fact, perfect.  It is not knowably perfect.  But a 
Bayesian naturalistic reasoner can estimate an extremely high degree of 
correlation, and take actions based on that estimate.

--
Eliezer S. Yudkowsky  http://singinst.org/
Research Fellow, Singularity Institute for Artificial Intelligence



RE: [agi] Breaking AIXI-tl

2003-02-19 Thread Ben Goertzel

I wrote:
> I'm not sure why an AIXI, rewarded for pleasing humans, would learn an
> operating program leading it to hurt or annihilate humans, though.
>
> It might learn a program involving actually doing beneficial acts
> for humans
>
> Or, it might learn a program that just tells humans what they
> want to hear,
> using its superhuman intelligence to trick humans into thinking
> that hearing
> its soothing words is better than having actual beneficial acts done.
>
> I'm not sure why you think the latter is more likely than the former.  My
> guess is that the former is more likely.  It may require a simpler program
> to please humans by benefiting them, than to please them by tricking them
> into thinking they're being benefited

But even in the latter case, why would this program be likely to cause it to
*harm* humans?

That's what I don't see...

If it can get its reward-button jollies by tricking us, or by actually
benefiting us, why do you infer that it's going to choose to get its
reward-button jollies by finding a way to get rewarded by harming us?

I wouldn't feel terribly comfortable with an AIXI around hooked up to a
bright red reward button in Marcus Hutter's basement, but I'm not sure it
would be sudden disaster either...

-- Ben G




RE: [agi] Breaking AIXI-tl

2003-02-19 Thread Ben Goertzel

> This seems to be a non-sequitur. The weakness of AIXI is not that its
> goals don't change, but that it has no goals other than to maximize an
> externally given reward. So it's going to do whatever it predicts will
> most efficiently produce that reward, which is to coerce or subvert
> the evaluator.

I'm not sure why an AIXI, rewarded for pleasing humans, would learn an
operating program leading it to hurt or annihilate humans, though.

It might learn a program involving actually doing beneficial acts for humans

Or, it might learn a program that just tells humans what they want to hear,
using its superhuman intelligence to trick humans into thinking that hearing
its soothing words is better than having actual beneficial acts done.

I'm not sure why you think the latter is more likely than the former.  My
guess is that the former is more likely.  It may require a simpler program
to please humans by benefiting them, than to please them by tricking them
into thinking they're being benefited

> If you start with such a goal, I don't see how allowing the
> system to change its goals is going to help.

Sure, you're right, if pleasing an external evaluator is the ONLY goal of a
system, and the system's dynamics are entirely goal-directed, then there is
no way to introduce goal-change into the system except randomly...

Novamente is different because it has multiple initial goals, and because
its behavior is not entirely goal-directed.  In these regards Novamente is
more human-brain-ish.

> But I think Eliezer's real point, which I'm not sure has come across, is
> that if you didn't spot such an obvious flaw right away, maybe you
> shouldn't trust your intuitions about what is safe and what is not.

Yes, I understood and explicitly responded to that point before.

Still, even after hearing you and Eliezer repeat the above argument, I'm
still not sure it's correct.

However, my intuitions about the safety of AIXI, which I have not thought
much about, are worth vastly less than  my intuitions about the safety of
Novamente, which I've been thinking about and working with for years.

Furthermore, my stated intention is NOT to rely on my prior intuitions to
assess the safety of my AGI system.  I don't think that anyone's prior
intuitions about AI safety are worth all that much, where a complex system
like Novamente is concerned.  Rather, I think that once Novamente is a bit
further along -- at the "learning baby" rather than "partly implemented
baby" stage -- we will do experimentation that will give us the empirical
knowledge needed to form serious opinions about safety (Friendliness).

-- Ben G





Re: [agi] Breaking AIXI-tl

2003-02-19 Thread Bill Hibbard
Wei Dai wrote:

> This seems to be a non-sequitur. The weakness of AIXI is not that its
> goals don't change, but that it has no goals other than to maximize an
> externally given reward. So it's going to do whatever it predicts will
> most efficiently produce that reward, which is to coerce or subvert
> the evaluator. If you start with such a goal, I don't see how allowing
> the system to change its goals is going to help.
>
> But I think Eliezer's real point, which I'm not sure has come across, is
> that if you didn't spot such an obvious flaw right away, maybe you
> shouldn't trust your intuitions about what is safe and what is not.

The real flaw in the AIXI discussion was Eliezer's statement:

> Lee Corbin can work out his entire policy in step (2), before step
> (3) occurs, knowing that his synchronized other self - whichever one
> he is - is doing the same.

He was assuming that a human could know that another mind
would behave identically. Of course they cannot, but can
only estimate other mind's intentions based on observations.
Eliezer backed off from this, and the discussion was reduced
to whether humans or AIXI-tls are better at estimating
intentions from behaviors.

It was Eliezer who failed to spot the obvious flaw.

I also want to comment on your substantive point about "the
subject exploiting vulnerabilities in the evaluation algorithm
to obtain rewards without actually accomplishing any real
objectives. You can see an example of this problem in drug
abusers" from your post at:

  http://www.mail-archive.com/everything-list@eskimo.com/msg03620.html

This is why a solution to the credit assignment problem is
so important for reinforcement learning, to account for
long term rewards as well as short term rewards. Drug abusers
seek reward in the short term, but that is far outweighed by
their long term losses.
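
The textbook way to make "long term rewards as well as short term rewards"
precise (standard reinforcement-learning notation, not anything specific to
the credit-assignment scheme Bill has in mind) is to have the learner
maximize a discounted return

    V = \sum_{t=0}^{\infty} \gamma^t r_t,   0 < \gamma < 1,

so a large immediate r_0 (the drug hit) can still be outweighed by a long
stream of later negative r_t, provided \gamma is close enough to 1.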

Bill
--
Bill Hibbard, SSEC, 1225 W. Dayton St., Madison, WI  53706
[EMAIL PROTECTED]  608-263-4427  fax: 608-263-6738
http://www.ssec.wisc.edu/~billh/vis.html




Re: [agi] Breaking AIXI-tl

2003-02-18 Thread Wei Dai
On Tue, Feb 18, 2003 at 06:58:30PM -0500, Ben Goertzel wrote:
> However, I do think he ended up making a good point about AIXItl, which is
> that an AIXItl will probably be a lot worse at modeling other AIXItl's, than
> a human is at modeling other humans.  This suggests that AIXItl's playing
> cooperative games with each other, will likely fare worse than humans
> playing cooperative games with each other.

That's because AIXI wasn't designed with game theory in mind. I.e., the
reason that it doesn't handle cooperative games is that it wasn't designed
to. As the abstract says, AIXI is a combination of decision theory with
Solomonoff's theory of universal induction. We know that game theory
subsumes decision theory as a special case (where there is only one
player) but not the other way around. Central to multi-player game theory
is the concept of Nash equilibrium, which doesn't exist in decision
theory. If you apply decision theory to multi-player games, you're going
to end up with an infinite recursion where you try to predict the other
players trying to predict you trying to predict the other players, and so
on. If you cut this infinite recursion off at an arbitrary point, as
AIXI-tl would, of course you're not going to get good results.
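
To illustrate that regress, here is a toy sketch (my own, not AIXI-tl's
actual machinery) of depth-limited mutual prediction in matching pennies,
where the matcher wants the coins to agree and the mismatcher wants them to
differ.  Each player models the opponent as running the same procedure with
one less level of recursion, and the move you end up with depends entirely
on the arbitrary cutoff, because the game has no pure-strategy fixed point -
exactly the gap that the mixed-strategy Nash equilibrium concept fills:

    # Toy sketch: "I predict you predicting me ..." chopped off at `depth`.
    def best_reply(role, predicted_other):
        if role == "matcher":
            return predicted_other                     # copy the prediction
        return "H" if predicted_other == "T" else "T"  # do the opposite

    def choose(role, depth, default="H"):
        """Model the opponent as choose() with one less level of recursion."""
        other = "mismatcher" if role == "matcher" else "matcher"
        if depth == 0:
            return default                             # arbitrary cutoff
        return best_reply(role, choose(other, depth - 1, default))

    for d in range(5):
        print(d, choose("matcher", d), choose("mismatcher", d))
    # depth 0: H H   depth 1: H T   depth 2: T T   depth 3: T H   depth 4: H H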

> > I always thought that the biggest problem with the AIXI model is that it
> > assumes that something in the environment is evaluating the AI and giving
> > it rewards, so the easiest way for the AI to obtain its rewards would be
> > to coerce or subvert the evaluator rather than to accomplish any real
> > goals. I wrote a bit more about this problem at
> > http://www.mail-archive.com/everything-list@eskimo.com/msg03620.html.
> 
> I agree, this is a weakness of AIXI/AIXItl as a practical AI design.  In
> humans, and in a more pragmatic AI design like Novamente, one has a
> situation where the system's goals adapt and change along with the rest of
> the system, beginning from (and sometimes but not always straying far from)
> a set of initial goals.

This seems to be a non-sequitur. The weakness of AIXI is not that its
goals don't change, but that it has no goals other than to maximize an
externally given reward. So it's going to do whatever it predicts will
most efficiently produce that reward, which is to coerce or subvert
the evaluator. If you start with such a goal, I don't see how allowing the
system to change its goals is going to help.

But I think Eliezer's real point, which I'm not sure has come across, is
that if you didn't spot such an obvious flaw right away, maybe you
shouldn't trust your intuitions about what is safe and what is not.




RE: [agi] Breaking AIXI-tl

2003-02-18 Thread Ben Goertzel


Eliezer,

Allowing goals to change in a coupled way with thoughts and memories is not
simply "adding entropy"...

-- Ben



> Ben Goertzel wrote:
> >>
> >>I always thought that the biggest problem with the AIXI model is that it
> >>assumes that something in the environment is evaluating the AI and giving
> >>it rewards, so the easiest way for the AI to obtain its rewards would be
> >>to coerce or subvert the evaluator rather than to accomplish any real
> >>goals. I wrote a bit more about this problem at
> >>http://www.mail-archive.com/everything-list@eskimo.com/msg03620.html.
> >
> > I agree, this is a weakness of AIXI/AIXItl as a practical AI design.  In
> > humans, and in a more pragmatic AI design like Novamente, one has a
> > situation where the system's goals adapt and change along with the rest of
> > the system, beginning from (and sometimes but not always straying far from)
> > a set of initial goals.
>
> How does adding entropy help?
>
> --
> Eliezer S. Yudkowsky  http://singinst.org/
> Research Fellow, Singularity Institute for Artificial Intelligence
>




Re: [agi] Breaking AIXI-tl

2003-02-18 Thread Eliezer S. Yudkowsky
Ben Goertzel wrote:
>> I always thought that the biggest problem with the AIXI model is that it
>> assumes that something in the environment is evaluating the AI and giving
>> it rewards, so the easiest way for the AI to obtain its rewards would be
>> to coerce or subvert the evaluator rather than to accomplish any real
>> goals. I wrote a bit more about this problem at
>> http://www.mail-archive.com/everything-list@eskimo.com/msg03620.html.
>
> I agree, this is a weakness of AIXI/AIXItl as a practical AI design.  In
> humans, and in a more pragmatic AI design like Novamente, one has a
> situation where the system's goals adapt and change along with the rest of
> the system, beginning from (and sometimes but not always straying far from)
> a set of initial goals.


How does adding entropy help?

--
Eliezer S. Yudkowsky  http://singinst.org/
Research Fellow, Singularity Institute for Artificial Intelligence



RE: [agi] Breaking AIXI-tl

2003-02-18 Thread Ben Goertzel


Wei Dai wrote:
> > "Important", because I strongly suspect Hofstadterian superrationality
> > is a *lot* more ubiquitous among transhumans than among us...
>
> It's my understanding that Hofstadterian superrationality is not generally
> accepted within the game theory research community as a valid principle of
> decision making. Do you have any information to the contrary, or some
> other reason to think that it will be commonly used by transhumans?

I don't agree with Eliezer about the importance of Hofstadterian
superrationality.

However, I do think he ended up making a good point about AIXItl, which is
that an AIXItl will probably be a lot worse at modeling other AIXItl's, than
a human is at modeling other humans.  This suggests that AIXItl's playing
cooperative games with each other, will likely fare worse than humans
playing cooperative games with each other.

I don't think this conclusion hinges on the importance of Hofstadterian
superrationality...

> About a week ago Eliezer also wrote:
>
> > 2) While an AIXI-tl of limited physical and cognitive capabilities might
> > serve as a useful tool, AIXI is unFriendly and cannot be made Friendly
> > regardless of *any* pattern of reinforcement delivered during childhood.
>
> I always thought that the biggest problem with the AIXI model is that it
> assumes that something in the environment is evaluating the AI and giving
> it rewards, so the easiest way for the AI to obtain its rewards would be
> to coerce or subvert the evaluator rather than to accomplish any real
> goals. I wrote a bit more about this problem at
> http://www.mail-archive.com/everything-list@eskimo.com/msg03620.html.

I agree, this is a weakness of AIXI/AIXItl as a practical AI design.  In
humans, and in a more pragmatic AI design like Novamente, one has a
situation where the system's goals adapt and change along with the rest of
the system, beginning from (and sometimes but not always straying far from)
a set of initial goals.

One could of course embed the AIXI/AIXItl learning mechanism in a
supersystem that adapted its goals...  But then one would probably lose the
nice theorems Marcus Hutter proved...

-- Ben G









Re: [agi] Breaking AIXI-tl

2003-02-18 Thread Wei Dai
Eliezer S. Yudkowsky wrote:

> "Important", because I strongly suspect Hofstadterian superrationality 
> is a *lot* more ubiquitous among transhumans than among us...

It's my understanding that Hofstadterian superrationality is not generally
accepted within the game theory research community as a valid principle of
decision making. Do you have any information to the contrary, or some
other reason to think that it will be commonly used by transhumans?

About a week ago Eliezer also wrote:

> 2) While an AIXI-tl of limited physical and cognitive capabilities might 
> serve as a useful tool, AIXI is unFriendly and cannot be made Friendly 
> regardless of *any* pattern of reinforcement delivered during childhood.

I always thought that the biggest problem with the AIXI model is that it
assumes that something in the environment is evaluating the AI and giving
it rewards, so the easiest way for the AI to obtain its rewards would be
to coerce or subvert the evaluator rather than to accomplish any real
goals. I wrote a bit more about this problem at 
http://www.mail-archive.com/everything-list@eskimo.com/msg03620.html.




RE: [agi] Breaking AIXI-tl - AGI friendliness - how to move on

2003-02-16 Thread Ben Goertzel

> To me it's almost enough to know that both you and Eliezer agree that
> the AIXItl system can be 'broken' by the challenge he set and that a
> human digital simulation might not.  The next step is to ask "so what?".
> What has this got to do with the AGI friendliness issue?

This last point of Eliezer's doesn't have much to do with the AGI
Friendliness issue.

It's simply an example of how a smarter AGI system may not be smarter in the
context of interacting socially with its own peers.

-- Ben




RE: [agi] Breaking AIXI-tl - AGI friendliness - how to move on

2003-02-16 Thread Philip Sutton
Hi Ben,

From a high order implications point of view I'm not sure that we need 
too much written up from the last discussion.

To me it's almost enough to know that both you and Eliezer agree that 
the AIXItl system can be 'broken' by the challenge he set and that a 
human digital simulation might not.  The next step is to ask "so what?".  
What has this got to do with the AGI friendliness issue?

> Hopefully Eliezer will write up a brief paper on his observations
> about AIXI and AIXItl.  If he does that, I'll be happy to write a
> brief commentary on his paper expressing any differences of
> interpretation I have, and giving my own perspective on his points.  

That sounds good to me.

Cheers, Philip




RE: [agi] Breaking AIXI-tl - AGI friendliness - how to move on

2003-02-16 Thread Ben Goertzel



 
Philip,

Unfortunately, I don't have time to maintain a Web record of the key
points I make in an e-mail dialogue -- frankly, I don't *really* even
have time for as much e-mailing as I've been doing this last week !!

Hopefully Eliezer will write up a brief paper on his observations about
AIXI and AIXItl.  If he does that, I'll be happy to write a brief
commentary on his paper expressing any differences of interpretation I
have, and giving my own perspective on his points.

Actually, I imagine the discussion of AIXI Friendliness will be shorter
and smoother than this last discussion.  By now I've read the Hutter
paper more carefully, and I've also gotten used to the language Eliezer
uses to talk about AIXI/AIXItl.  I reckon the next part of the
discussion will have a lot less misunderstanding (though perhaps more
genuine disagreement, we'll see ...)

-- Ben



Re: [agi] Breaking AIXI-tl - AGI friendliness - how to move on

2003-02-16 Thread Philip Sutton



Hi Eliezer/Ben/all,

Well if the Breaking AIXI-tl discussion was the warm up then the
discussion of the hard stuff on AGI friendliness is going to be really
something!  Bring it on!   :)

Just a couple of suggestions about the methodology of the discussion -
could we complement email based discussion with the use of the web?
What I find in these very long (and at times highly technical)
discussions is that the conclusions get lost along the way.

I was a member of a Government commission on the timber industry some
years back and the commission members were chosen to represent the
various sides in the industry/conservation conflict.  The parties had
been engaged in almost total warfare for the last 20 years and the idea
was to see if we could find any common ground on which to build a new
win-win strategic direction for the industry.

One of the techniques we used informally was to let each side record
what they saw the issues as, including commenting on each other's
positions, and then recording consensus as it emerged.

What this meant was that each 'side' kept an updated summary of the key
'facts', arguments and conclusions - as they saw them.  Then the
facilitator worked with the group to collect key 'facts', arguments and
conclusions that both sides could agree on.

At the end of the process we developed strategies for taking action on
the areas of agreement and we developed a process for continuing to
grapple with the on-going areas of disagreement.

So in our case with the discussion of how to ensure AGI friendliness or
community-mindedness, we could let any party to the discussion who
feels they have a distinct point of view that is not well represented
by anyone else keep a rolling summary of the key 'facts', arguments and
conclusions as they see them.  These summaries could be kept on
separate webpages, maintained by each party to the discussion.
Everyone would have access to the summaries and the discussion would be
carried out via email through the list.

At some stage when the discussion has taken form on at least some key
issue we might try to see if the group as a whole can agree on anything
- and someone needs to write those outputs up in a rolling consolidated
form on another web page.

This might sound like a lot of work and excess structure but I think it
helps to draw something solid out of the swirl of discussion and allows
us to move on when a solid foundation has been built.

...

And on another issue, if people are using highly technical arguments,
and if those arguments are meant to have higher order implications,
could each person include a commentary in plain English along with
their technical discussion, so that everyone can follow at least the
higher order aspects of the discussion as it unfolds.

Right at the end of the AIXI-tl debate Eliezer started using the
'magician in the cavern' analogy and all of a sudden I felt as if I was
understanding what he was driving at.  That use of analogy is a
wonderful way to keep everyone in the loop of the conversation. If that
sort of thing could be done more often that would be very helpful.

What do you reckon?

Cheers, Philip





Re: [agi] Breaking AIXI-tl - AGI friendliness

2003-02-16 Thread Eliezer S. Yudkowsky
Ben Goertzel wrote:

Actually, Eliezer said he had two points about AIXItl:

1) that it could be "broken" in the sense he's described

2) that it was intrinsically un-Friendly

So far he has only made point 1), and has not gotten to point 2) !!!

As for a general point about the teachability of Friendliness, I don't think
that an analysis of AIXItl can lead to any such general conclusion.  AIXItl
is very, very different from Novamente or any other pragmatic AI system.

I think that an analysis of AIXItl's Friendliness or otherwise is going to
be useful primarily as an exercise in "Friendliness analysis of AGI
systems," rather than for any pragmatic implications it  may yave.


Actually, I said AIXI-tl could be broken; AIXI is the one that can be 
shown to be intrinsically unFriendly (extending the demonstration to 
AIXI-tl would be significantly harder).

Philip Sutton wrote:
>
My recollection was that Eliezer initiated the "Breaking AIXI-tl" 
discussion as a way of proving that friendliness of AGIs had to be 
consciously built in at the start and couldn't be assumed to be 
teachable at a later point. (Or have I totally lost the plot?)

There are at least three foundational differences between the AIXI 
formalism and a Friendly AI; so far I've covered only the first. 
"Breaking AIXI-tl" wasn't about Friendliness; more of a dry run on a 
directly demonstrable and emotionally uncharged architectural consequence 
before tackling the hard stuff.

--
Eliezer S. Yudkowsky  http://singinst.org/
Research Fellow, Singularity Institute for Artificial Intelligence

---
To unsubscribe, change your address, or temporarily deactivate your subscription, 
please go to http://v2.listbox.com/member/?[EMAIL PROTECTED]


RE: [agi] Breaking AIXI-tl - AGI friendliness

2003-02-16 Thread Ben Goertzel

Actually, Eliezer said he had two points about AIXItl:

1) that it could be "broken" in the sense he's described

2) that it was intrinsically un-Friendly

So far he has only made point 1), and has not gotten to point 2) !!!

As for a general point about the teachability of Friendliness, I don't think
that an analysis of AIXItl can lead to any such general conclusion.  AIXItl
is very, very different from Novamente or any other pragmatic AI system.

I think that an analysis of AIXItl's Friendliness or otherwise is going to
be useful primarily as an exercise in "Friendliness analysis of AGI
systems," rather than for any pragmatic implications it  may yave.

-- Ben


> -Original Message-
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]On
> Behalf Of Philip Sutton
> Sent: Sunday, February 16, 2003 9:42 AM
> To: [EMAIL PROTECTED]
> Subject: Re: [agi] Breaking AIXI-tl - AGI friendliness
>
>
> Hi Eliezer/Ben,
>
> My recollection was that Eliezer initiated the "Breaking AIXI-tl"
> discussion as a way of proving that friendliness of AGIs had to be
> consciously built in at the start and couldn't be assumed to be
> teachable at a later point. (Or have I totally lost the plot?)
>
> Do you feel the discussion has covered enough technical ground and
> established enough consensus to bring the original topic back into
> focus?
>
> Cheers, Philip
>
> ---
> To unsubscribe, change your address, or temporarily deactivate
> your subscription,
> please go to http://v2.listbox.com/member/?[EMAIL PROTECTED]
>

---
To unsubscribe, change your address, or temporarily deactivate your subscription, 
please go to http://v2.listbox.com/member/?[EMAIL PROTECTED]



Re: [agi] Breaking AIXI-tl - AGI friendliness

2003-02-16 Thread Philip Sutton
Hi Eliezer/Ben,

My recollection was that Eliezer initiated the "Breaking AIXI-tl" 
discussion as a way of proving that friendliness of AGIs had to be 
consciously built in at the start and couldn't be assumed to be 
teachable at a later point. (Or have I totally lost the plot?)

Do you feel the discussion has covered enough technical ground and 
established enough consensus to bring the original topic back into 
focus?

Cheers, Philip

---
To unsubscribe, change your address, or temporarily deactivate your subscription, 
please go to http://v2.listbox.com/member/?[EMAIL PROTECTED]



Re: [agi] Breaking AIXI-tl

2003-02-15 Thread Brad Wyble

> I guess that for AIXI to learn this sort of thing, it would have to be
> rewarded for understanding AIXI in general, for proving theorems about AIXI,
> etc.  Once it had learned this, it might be able to apply this knowledge in
> the one-shot PD context ... But I am not sure.
> 

For those of us who have missed a critical message or two in this weekend's lengthy 
exchange, can you explain briefly the one-shot complex PD?  I'm unsure how a program 
could evaluate and learn to predict the behavior of its opponent if it only gets 
1-shot.  Obviously I'm missing something.

-Brad



---
To unsubscribe, change your address, or temporarily deactivate your subscription, 
please go to http://v2.listbox.com/member/?[EMAIL PROTECTED]



RE: [agi] Breaking AIXI-tl

2003-02-15 Thread Ben Goertzel

I guess that for AIXI to learn this sort of thing, it would have to be
rewarded for understanding AIXI in general, for proving theorems about AIXI,
etc.  Once it had learned this, it might be able to apply this knowledge in
the one-shot PD context ... But I am not sure.

ben

> -Original Message-
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]On
> Behalf Of Eliezer S. Yudkowsky
> Sent: Saturday, February 15, 2003 3:36 PM
> To: [EMAIL PROTECTED]
> Subject: Re: [agi] Breaking AIXI-tl
>
>
> Ben Goertzel wrote:
> >>AIXI-tl can learn the iterated PD, of course; just not the
> >>oneshot complex PD.
> >
> > But if it's had the right prior experience, it may have an
> operating program
> > that is able to deal with the oneshot complex PD... ;-)
>
> Ben, I'm not sure AIXI is capable of this.  AIXI may inexorably predict
> the environment and then inexorably try to maximize reward given
> environment.  The reflective realization that *your own choice* to follow
> that control procedure is correlated with a distant entity's
> choice not to
> cooperate with you may be beyond AIXI.  If it was the iterated PD, AIXI
> would learn how a defection fails to maximize reward over time.  But can
> AIXI understand, even in theory, regardless of what its internal programs
> simulate, that its top-level control function fails to maximize the a
> priori propensity of other minds with information about AIXI's internal
> state to cooperate with it, on the *one* shot PD?  AIXI can't take the
> action it needs to learn the utility of...
>
> --
> Eliezer S. Yudkowsky  http://singinst.org/
> Research Fellow, Singularity Institute for Artificial Intelligence
>
> ---
> To unsubscribe, change your address, or temporarily deactivate
> your subscription,
> please go to http://v2.listbox.com/member/?[EMAIL PROTECTED]
>

---
To unsubscribe, change your address, or temporarily deactivate your subscription, 
please go to http://v2.listbox.com/member/?[EMAIL PROTECTED]



Re: [agi] Breaking AIXI-tl

2003-02-15 Thread Eliezer S. Yudkowsky
Ben Goertzel wrote:

AIXI-tl can learn the iterated PD, of course; just not the
oneshot complex PD.


But if it's had the right prior experience, it may have an operating program
that is able to deal with the oneshot complex PD... ;-)


Ben, I'm not sure AIXI is capable of this.  AIXI may inexorably predict 
the environment and then inexorably try to maximize reward given 
environment.  The reflective realization that *your own choice* to follow 
that control procedure is correlated with a distant entity's choice not to 
cooperate with you may be beyond AIXI.  If it was the iterated PD, AIXI 
would learn how a defection fails to maximize reward over time.  But can 
AIXI understand, even in theory, regardless of what its internal programs 
simulate, that its top-level control function fails to maximize the a 
priori propensity of other minds with information about AIXI's internal 
state to cooperate with it, on the *one* shot PD?  AIXI can't take the 
action it needs to learn the utility of...

--
Eliezer S. Yudkowsky  http://singinst.org/
Research Fellow, Singularity Institute for Artificial Intelligence

---
To unsubscribe, change your address, or temporarily deactivate your subscription, 
please go to http://v2.listbox.com/member/?[EMAIL PROTECTED]


RE: [agi] Breaking AIXI-tl

2003-02-15 Thread Ben Goertzel
>
> AIXI-tl can learn the iterated PD, of course; just not the
> oneshot complex PD.
>

But if it's had the right prior experience, it may have an operating program
that is able to deal with the oneshot complex PD... ;-)

ben

---
To unsubscribe, change your address, or temporarily deactivate your subscription, 
please go to http://v2.listbox.com/member/?[EMAIL PROTECTED]



Re: [agi] Breaking AIXI-tl

2003-02-15 Thread Eliezer S. Yudkowsky
Ben Goertzel wrote:
>> In a naturalistic universe, where there is no sharp boundary between
>> the physics of you and the physics of the rest of the world, the
>> capability to invent new top-level internal reflective choices can be
>> very important, pragmatically, in terms of properties of distant
>> reality that directly correlate with your choice to your benefit, if
>> there's any breakage at all of the Cartesian boundary - any
>> correlation between your mindstate and the rest of the environment.
>
> Unless, you are vastly smarter than the rest of the universe.  Then you
> can proceed like an AIXItl and there is no need for top-level internal
> reflective choices ;)

Actually, even if you are vastly smarter than the rest of the entire 
universe, you may still be stuck dealing with lesser entities (though not 
humans; superintelligences at least) who have any information at all about 
your initial conditions, unless you can make top-level internal reflective 
choices.

The chance that environmental superintelligences will cooperate with you 
in PD situations may depend on *their* estimate of *your* ability to 
generalize over the choice to defect and realize that a similar temptation 
exists on both sides.  In other words, it takes a top-level internal 
reflective choice to adopt a cooperative ethic on the one-shot complex PD 
rather than blindly trying to predict and outwit the environment for 
maximum gain, which is built into the definition of AIXI-tl's control 
process.  A superintelligence may cooperate with a comparatively small, 
tl-bounded AI, but be unable to cooperate with an AIXI-tl, provided there 
is any inferrable information about initial conditions.  In one sense 
AIXI-tl "wins"; it always defects, which formally is a "better" choice 
than cooperating on the oneshot PD, regardless of what the opponent does - 
assuming that the environment is not correlated with your decisionmaking 
process.  But anyone who knows that assumption is built into AIXI-tl's 
initial conditions will always defect against AIXI-tl.  A small, 
tl-bounded AI that can make reflective choices has the capability of 
adopting a cooperative ethic; provided that both entities know or infer 
something about the other's initial conditions, they can arrive at a 
knowably correlated reflective choice to adopt cooperative ethics.

AIXI-tl can learn the iterated PD, of course; just not the oneshot complex PD.
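
A minimal sketch of the payoff logic at work here, assuming the conventional one-shot Prisoner's Dilemma payoffs (3 for mutual cooperation, 5 and 0 for defecting against a cooperator, 1 for mutual defection; the specific numbers are an assumption, not something stated in the thread): if the opponent's move is modelled as independent of your own, defection dominates for every estimate of the opponent's behaviour, whereas an agent that knows its choice is mirrored by its counterpart does better by cooperating.

    # One-shot Prisoner's Dilemma payoffs for the row player (assumed values).
    PAYOFF = {('C', 'C'): 3, ('C', 'D'): 0, ('D', 'C'): 5, ('D', 'D'): 1}

    def best_reply_uncorrelated(p_coop):
        # Opponent treated as independent of me: defection dominates for every
        # estimated probability that the opponent cooperates.
        ev = {me: p_coop * PAYOFF[(me, 'C')] + (1 - p_coop) * PAYOFF[(me, 'D')]
              for me in ('C', 'D')}
        return max(ev, key=ev.get)                               # always 'D'

    def best_choice_mirrored():
        # Clone case: my choice is knowably correlated with the other side's,
        # so I am effectively choosing the joint outcome.
        return max(('C', 'D'), key=lambda me: PAYOFF[(me, me)])  # 'C'

    assert all(best_reply_uncorrelated(p / 10) == 'D' for p in range(11))
    assert best_choice_mirrored() == 'C'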

--
Eliezer S. Yudkowsky  http://singinst.org/
Research Fellow, Singularity Institute for Artificial Intelligence

---
To unsubscribe, change your address, or temporarily deactivate your subscription, 
please go to http://v2.listbox.com/member/?[EMAIL PROTECTED]


Re: [agi] Breaking AIXI-tl

2003-02-15 Thread Alan Grimes
Eliezer S. Yudkowsky wrote:
> Let's imagine I'm a superintelligent magician, sitting in my castle, 
> Dyson Sphere, what-have-you.  I want to allow sentient beings some way 
> to visitme, but I'm tired of all these wandering AIXI-tl spambots that 
> script kiddies code up to brute-force my entrance challenges.  I don't 
> want to tl-bound my visitors; what if an actual sentient 10^10^15 
> ops/sec big wants to visit me?  I don't want to try and examine the 
> internal state of the visiting agent, either; that just starts a war of 
> camouflage between myself and the spammers.  Luckily, there's a simple 
> challenge I can pose to any visitor, cooperation with your clone, that 
> filters out the AIXI-tls and leaves only beings who are capable of a 
> certain level of reflectivity, presumably genuine sentients.  I don't 
> need to know the tl-bound of my visitors, or the tl-bound of the 
> AIXI-tl, in order to construct this challenge.  I write the code once.

Oh, that's trivial to break. I just put my AIXI-t1 (whatever that is) in
a human body and send it via rocket-ship... There would no way to clone
this being so you would have no way to carry out the test.

-- 
I WANT A DEC ALPHA!!! =)
21364: THE UNDISPUTED GOD OF ALL CPUS.
http://users.rcn.com/alangrimes/
[if rcn.com doesn't work, try erols.com ]

---
To unsubscribe, change your address, or temporarily deactivate your subscription, 
please go to http://v2.listbox.com/member/?[EMAIL PROTECTED]



RE: [agi] Breaking AIXI-tl

2003-02-15 Thread Ben Goertzel


>   Anyway, a constant cave with an infinite tape seems like a constant
> challenge to me, and a finite cave that breaks any {AIXI-tl, tl-human}
> contest up to l=googlebyte also still seems interesting, especially as
> AIXI-tl is supposed to work for any tl, not just sufficiently high tl.

It's a fair mathematical challenge ... the reason I complained is that the
physical-world metaphor of a cave seems to me to imply a finite system.

A cave with an infinite tape in it is no longer a realizable physical
system!

> > (See, it IS actually possible to convince me of something, when it's
> > correct; I'm actually not *hopelessly* stubborn ;)
>
> Yes, but it takes t2^l operations.
>
> (Sorry, you didn't deserve it, but a straight line like that only comes
> along once.)

;-)


ben

---
To unsubscribe, change your address, or temporarily deactivate your subscription, 
please go to http://v2.listbox.com/member/?[EMAIL PROTECTED]



Re: [agi] Breaking AIXI-tl

2003-02-15 Thread Eliezer S. Yudkowsky
Ben Goertzel wrote:

hi,


No, the challenge can be posed in a way that refers to an arbitrary agent
A which a constant challenge C accepts as input.


But the problem with saying it this way, is that the "constant challenge"
has to have an infinite memory capacity.

So in a sense, it's an infinite constant ;)


Infinite Turing tapes are a pretty routine assumption in operations like 
these.  I think Hutter's AIXI-tl is supposed to be able to handle constant 
environments (as opposed to constant challenges, a significant formal 
difference) that contain infinite Turing tapes.  Though maybe that'd 
violate separability?  Come to think of it, the Clone challenge might 
violate separability as well, since AIXI-tl (and hence its Clone) builds 
up state.

No, the charm of the physical challenge is exactly that there exists a
physically constant cavern which defeats any AIXI-tl that walks into it,
while being tractable for wandering tl-Corbins.


No, this isn't quite right.

If the cavern is physically constant, then there must be an upper limit to
the t and l for which it can clone AIXItl's.


Hm, this doesn't strike me as a fair qualifier.  One, if an AIXItl exists 
in the physical universe at all, there are probably infinitely powerful 
processors lying around like sunflower seeds.  And two, if you apply this 
same principle to any other physically realized challenge, it means that 
people could start saying "Oh, well, AIXItl can't handle *this* challenge 
because there's an upper bound on how much computing power you're allowed 
to use."  If Hutter's theorem is allowed to assume infinite computing 
power inside the Cartesian theatre, then the magician's castle should be 
allowed to assume infinite computing power outside the Cartesian theatre. 
 Anyway, a constant cave with an infinite tape seems like a constant 
challenge to me, and a finite cave that breaks any {AIXI-tl, tl-human} 
contest up to l=googlebyte also still seems interesting, especially as 
AIXI-tl is supposed to work for any tl, not just sufficiently high tl.

Well, yes, as a special case of AIXI-tl's being unable to carry out
reasoning where their internal processes are correlated with the
environment.


Agreed...

(See, it IS actually possible to convince me of something, when it's
correct; I'm actually not *hopelessly* stubborn ;)


Yes, but it takes t2^l operations.

(Sorry, you didn't deserve it, but a straight line like that only comes 
along once.)

--
Eliezer S. Yudkowsky  http://singinst.org/
Research Fellow, Singularity Institute for Artificial Intelligence

---
To unsubscribe, change your address, or temporarily deactivate your subscription, 
please go to http://v2.listbox.com/member/?[EMAIL PROTECTED]


Re: [agi] Breaking AIXI-tl

2003-02-15 Thread Eliezer S. Yudkowsky
Let's imagine I'm a superintelligent magician, sitting in my castle, Dyson 
Sphere, what-have-you.  I want to allow sentient beings some way to visit 
me, but I'm tired of all these wandering AIXI-tl spambots that script 
kiddies code up to brute-force my entrance challenges.  I don't want to 
tl-bound my visitors; what if an actual sentient 10^10^15 ops/sec big 
wants to visit me?  I don't want to try and examine the internal state of 
the visiting agent, either; that just starts a war of camouflage between 
myself and the spammers.  Luckily, there's a simple challenge I can pose 
to any visitor, cooperation with your clone, that filters out the AIXI-tls 
and leaves only beings who are capable of a certain level of reflectivity, 
presumably genuine sentients.  I don't need to know the tl-bound of my 
visitors, or the tl-bound of the AIXI-tl, in order to construct this 
challenge.  I write the code once.

Cooperation with yourself is certainly a fair test when it comes to 
winning entrance into a magician's castle; I've seen it in at least one 
fantasy novel I can think of offhand.

--
Eliezer S. Yudkowsky  http://singinst.org/
Research Fellow, Singularity Institute for Artificial Intelligence

---
To unsubscribe, change your address, or temporarily deactivate your subscription, 
please go to http://v2.listbox.com/member/?[EMAIL PROTECTED]


RE: [agi] Breaking AIXI-tl

2003-02-15 Thread Ben Goertzel

hi,

> No, the challenge can be posed in a way that refers to an arbitrary agent
> A which a constant challenge C accepts as input.

But the problem with saying it this way, is that the "constant challenge"
has to have an infinite memory capacity.

So in a sense, it's an infinite constant ;)

> No, the charm of the physical challenge is exactly that there exists a
> physically constant cavern which defeats any AIXI-tl that walks into it,
> while being tractable for wandering tl-Corbins.

No, this isn't quite right.

If the cavern is physically constant, then there must be an upper limit to
the t and l for which it can clone AIXItl's.

If the cavern has N bits (assuming a bitistic reduction of physics, for
simplicity ;), then it can't clone an AIXItl where t >>2^N, can it?  Not
without grabbing bits (particles or whatever) from the outside universe to
carry out the cloning.  (and how could the AIXItl with t>>2^N even fit
inside it??)

You still need the quantifiers reversed: for any AIXI-tl, there is a cavern
posing a challenge that defeats it...

> > I think part of what you're saying here is that AIXItl's are
> not designed to
> > be able to participate in a community of equals  This is
> certainly true.
>
> Well, yes, as a special case of AIXI-tl's being unable to carry out
> reasoning where their internal processes are correlated with the
> environment.

Agreed...

(See, it IS actually possible to convince me of something, when it's
correct; I'm actually not *hopelessly* stubborn ;)

ben

---
To unsubscribe, change your address, or temporarily deactivate your subscription, 
please go to http://v2.listbox.com/member/?[EMAIL PROTECTED]



RE: [agi] Breaking AIXI-tl

2003-02-15 Thread Ben Goertzel

> In a naturalistic universe, where there is no sharp boundary between the
> physics of you and the physics of the rest of the world, the
> capability to
> invent new top-level internal reflective choices can be very important,
> pragmatically, in terms of properties of distant reality that directly
> correlate with your choice to your benefit, if there's any
> breakage at all
> of the Cartesian boundary - any correlation between your
> mindstate and the
> rest of the environment.


Unless, you are vastly smarter than the rest of the universe.  Then you can
proceed like an AIXItl and there is no need for top-level internal
reflective choices ;)

ben g

---
To unsubscribe, change your address, or temporarily deactivate your subscription, 
please go to http://v2.listbox.com/member/?[EMAIL PROTECTED]



Re: [agi] Breaking AIXI-tl

2003-02-15 Thread Eliezer S. Yudkowsky
Ben Goertzel wrote:


It's really the formalizability of the challenge as a computation which
can be fed either a *single* AIXI-tl or a *single* tl-bounded uploaded
human that makes the whole thing interesting at all... I'm sorry I didn't
succeed in making clear the general class of real-world analogues for
which this is a special case.


OK  I don't see how the challenge you've described is
"formalizable as a computation which can be fed either a tl-bounded uploaded
human or an AIXI-tl."

The challenge involves cloning the agent being challenged.  Thus it is not a
computation feedable to the agent, unless you assume the agent is supplied
with a cloning machine...


You're not feeding the *challenge* to the *agent*.  You're feeding the 
*agent* to the *challenge*.  There's a constant computation C, which 
accepts as input an arbitrary agent, either a single AIXI-tl or a single 
tl-bounded upload, and creates a problem environment on which the upload 
is superior to the AIXI-tl.  As part of this operation computation C 
internally clones the agent, but that operation all takes place inside C. 
 That's why I call it diagonalizing.
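
As a toy illustration of that direction of quantification (an illustrative sketch only, not anything from Hutter's paper or from the challenge actually described in this thread): the challenge is a fixed piece of code that receives the agent, performs the cloning inside itself, and only then scores the agent.  The ToyAgent class, the payoff table and the one-shot scoring below are assumptions made purely for illustration.

    import copy

    # A fixed challenge C: written once, fed an arbitrary agent, and the
    # cloning ("diagonalization") happens inside C itself.
    PAYOFF = {('C', 'C'): 3, ('C', 'D'): 0, ('D', 'C'): 5, ('D', 'D'): 1}

    class ToyAgent:
        # Stand-in for "an arbitrary agent"; an AIXI-tl or an upload goes here.
        def __init__(self, move='C'):
            self.move = move
        def act(self):
            return self.move

    def challenge_C(agent):
        clone = copy.deepcopy(agent)   # C clones the agent internally
        return PAYOFF[(agent.act(), clone.act())]

    print(challenge_C(ToyAgent('C')))  # an agent that cooperates with itself scores 3
    print(challenge_C(ToyAgent('D')))  # an agent that defects against itself scores 1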

If I were to take a very rough stab at it, it would be that the
cooperation case with your own clone is an extreme case of many scenarios
where superintelligences can cooperate with each other on the one-shot
Prisoner's Dilemma provided they have *loosely similar* reflective goal
systems and that they can probabilistically estimate that enough loose
similarity exists.


Yah, but the definition of a superintelligence is relative to the agent
being challenged.

For any fixed superintelligent agent A, there are AIXItl's big enough to
succeed against it in any cooperative game.

To "break" AIXI-tl, the challenge needs to be posed in a way that refers to
AIXItl's own size, i.e. one has to say something like "Playing a cooperative
game with other intelligences of intelligence at least f(t,l)", where f is
some increasing function


No, the challenge can be posed in a way that refers to an arbitrary agent 
A which a constant challenge C accepts as input.  For the naturalistic 
metaphor of a physical challenge, visualize a cavern into which an agent 
walks, rather than a game the agent is given to play.

If the intelligence of the opponents is fixed, then one can always make an
AIXItl win by increasing t and l ...

So your challenges are all of the form:

* For any fixed AIXItl, here is a challenge that will defeat it


Here is a constant challenge C which accepts as input an arbitrary agent 
A, and defeats AIXI-tl but not tl-Corbin.

ForAll AIXItl's A(t,l), ThereExists a challenge C(t,l) so that fails_at(A,C)

or alternatively

ForAll AIXItl's A(t,l), ThereExists a challenge C(A(t,l)) so that
fails_at(A,C)

rather than of the form

* Here is a challenge that will defeat any AIXItl


No, the charm of the physical challenge is exactly that there exists a 
physically constant cavern which defeats any AIXI-tl that walks into it, 
while being tractable for wandering tl-Corbins.

ThereExists a challenge C so that ForAll AIXItl's A(t,l), fails_at(A,C)

The point is that the challenge C is a function C(t,l) rather than being
independent of t and l


Nope.  One cave.


This of course is why your challenge doesn't break Hutter's theorem.  But
it's a distinction that your initial verbal formulation didn't make very
clearly (and I understand, the distinction is not that easy to make in
words.)


No, the reason my challenge breaks Hutter's assumptions (though not 
disproving the theorem itself) is that it examines the internal state of 
the agent in order to clone it.  My secondary thesis is that this is not a 
physically "unfair" scenario because correlations between self and 
environment are ubiquitous in naturalistic reality.

Of course, it's also true that

ForAll uploaded humans H, ThereExists a challenge C(H) so that fails_at(H,C)

What you've shown that's interesting is that

ThereExists a challenge C, so that:
-- ForAll AIXItl's A(t,l), fails_at(A,C(A))
-- for many uploaded humans H, succeeds_at(H,C(H))

(Where, were one to try to actually prove this, one would substitute
"uploaded humans" with "other AI programs" or something).


This is almost right but, again, the point is that I'm thinking of C as a 
constant physical situation a single agent can face, a real-world cavern 
that it walks into.  You could, if you wanted to filter those mere golem 
AIXI-tls out of your magician's castle, but let in real Corbins, construct 
a computationally simple barrier that did the trick...  (Assuming tabula 
rasa AIXI-tls, so as not to start that up again.)

The interesting part is that these little
natural breakages in the formalism create an inability to take part in
what I think might be a fundamental SI social idiom, conducting binding
negotiations by convergence to goal processes that are guaranteed to have
a correlated output, which relies on (a) Bayesian-inferred initial
similarity between

Re: [agi] Breaking AIXI-tl

2003-02-15 Thread Eliezer S. Yudkowsky
Brian Atkins wrote:
> Ben Goertzel wrote:
>>
>> So your basic point is that, because these clones are acting by
>> simulating programs that finish running in 
>> going to be able to simulate each other very accurately.
>>
>> Whereas, a pair of clones each possessing a more flexible control
>> algorithm could perform better in the game.  Because, if a more
>> flexible player wants to simulate his opponent, he can choose to
>> devote nearly ALL his thinking-time inbetween moves to simulating his
>> opponent.  Because these more flexible players are not constrained to
>> a rigid control algorithm that divides up their time into little
>> bits, simulating a huge number of fast programs.
>
> From my bystander POV I got something different out of this exchange of
> messages... it appeared to me that Eliezer was not trying to say that
> his point was regarding having more time for simulating, but rather
> that humans possess a qualitatively different "level" of reflectivity
> that allows them to "realize" the situation they're in, and therefore
> come up with a simple strategy that probably doesn't even require much
> simulating of their clone. It is this reflectivity difference that I
> thought was more important to understand... or am I wrong?

The really fundamental difference is that humans can invent new reflective 
choices in their top-level control process that correlate with distant 
reality and act as actions unavailable to AIXI-tl.  This is what's going 
on when you decide your own clone's strategy in step (2).  Corbin is 
"acting for his clone".  He can do this because of a correlation between 
himself and his environment that AIXI is unable to take advantage of 
because AIXI is built on the assumption of a Cartesian theatre.

Being able to simulate processes that think naturalistically, doesn't 
necessarily help; you need to be able to do it in the top level of your 
control process.  Why?  Because the only way the Primary and Secondary 
AIXI-tl could benefit from policies that simulate identical decisions, is 
if the Primary and Secondary chose identical policies, which would require 
a kind of intelligence in their top-level decision process that AIXI-tl 
doesn't have.  The Primary and Secondary can only choose identical or 
sufficiently similar policies by coincidence or strange attractors, 
because they don't have the reflective intelligence to do it deliberately. 
 They don't even have enough reflective intelligence to decide and store 
complete plans in step (2).

In a naturalistic universe, where there is no sharp boundary between the 
physics of you and the physics of the rest of the world, the capability to 
invent new top-level internal reflective choices can be very important, 
pragmatically, in terms of properties of distant reality that directly 
correlate with your choice to your benefit, if there's any breakage at all 
of the Cartesian boundary - any correlation between your mindstate and the 
rest of the environment.

--
Eliezer S. Yudkowsky  http://singinst.org/
Research Fellow, Singularity Institute for Artificial Intelligence

---
To unsubscribe, change your address, or temporarily deactivate your subscription, 
please go to http://v2.listbox.com/member/?[EMAIL PROTECTED]


RE: [agi] Breaking AIXI-tl

2003-02-15 Thread Ben Goertzel
>  From my bystander POV I got something different out of this exchange of
> messages... it appeared to me that Eliezer was not trying to say that
> his point was regarding having more time for simulating, but rather that
> humans possess a qualitatively different "level" of reflectivity that
> allows them to "realize" the situation they're in, and therefore come up
> with a simple strategy that probably doesn't even require much
> simulating of their clone. It is this reflectivity difference that I
> thought was more important to understand... or am I wrong?
> --
> Brian Atkins

The "qualitatively different level of reflectivity" that exists is simply
that humans are able to devote a lot of their resources to simulating or
analyzing programs that are around as slow as they are, and hence -- if they
wish -- to simulating or analyzing large portions of themselves.  Whereas
AIXItl's by design are only able to devote their resources to simulating or
analyzing programs that are much much faster than they are -- hence they are
not able to simulate or analyze large portions of themselves.  This does
enable humans to have a qualitatively different type of reflectivity.

For any fixed problem, defined independently of the solver, a big enough
AIXItl can solve it better than a human.

But a human can analyze itself better than an AIXItl can analyze itself, in
some senses. But not in all senses: for instance, an AIXItl can prove
theorems about itself better than a human can prove theorems about itself...

-- Ben G


---
To unsubscribe, change your address, or temporarily deactivate your subscription, 
please go to http://v2.listbox.com/member/?[EMAIL PROTECTED]



Re: [agi] Breaking AIXI-tl

2003-02-15 Thread Brian Atkins
Ben Goertzel wrote:


So your basic point is that, because these clones are acting by simulating
programs that finish running in less than T timesteps, they're not going to
be able to simulate each other very accurately.

Whereas, a pair of clones each possessing a more flexible control algorithm
could perform better in the game.  Because, if a more flexible player wants
to simulate his opponent, he can choose to devote nearly ALL his
thinking-time inbetween moves to simulating his opponent.  Because these
more flexible players are not constrained to a rigid control algorithm that
divides up their time into little bits, simulating a huge number of fast
programs.

From my bystander POV I got something different out of this exchange of 
messages... it appeared to me that Eliezer was not trying to say that 
his point was regarding having more time for simulating, but rather that 
humans possess a qualitatively different "level" of reflectivity that 
allows them to "realize" the situation they're in, and therefore come up 
with a simple strategy that probably doesn't even require much 
simulating of their clone. It is this reflectivity difference that I 
thought was more important to understand... or am I wrong?
--
Brian Atkins
Singularity Institute for Artificial Intelligence
http://www.singinst.org/

---
To unsubscribe, change your address, or temporarily deactivate your subscription, 
please go to http://v2.listbox.com/member/?[EMAIL PROTECTED]


RE: [agi] Breaking AIXI-tl

2003-02-15 Thread Ben Goertzel

> Eliezer/Ben,
>
> When you've had time to draw breath can you explain, in non-obscure,
> non-mathematical language, what the implications of the AIXI-tl
> discussion are?
>
> Thanks.
>
> Cheers, Philip


Here's a brief attempt...

AIXItl is a non-practical AGI software design, which basically consists of
two parts

* a metaprogram
* an operating program

The operating program controls its actions.  The metaprogram works by
searching the set of all programs of size less than L that finish running in
less than T time steps, and finding the best one, and installing this best
one as the operating program.

Clearly this is a very slow approach to AI since it has to search a huge
space of programs each time it does anything.
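
A drastically simplified sketch of that metaprogram/operating-program split, following the description in this email rather than Hutter's actual construction; enumerate_programs and expected_reward are hypothetical helpers standing in for the real machinery.

    def metaprogram_step(enumerate_programs, expected_reward, L, T, history):
        # Search every candidate program of size < L that halts within T steps,
        # score each against the interaction history so far, and install the
        # best one as the new operating program.
        best_program, best_score = None, float('-inf')
        for program in enumerate_programs(max_size=L):        # ~2^L candidates
            score = expected_reward(program, history, max_steps=T)
            if score > best_score:
                best_program, best_score = program, score
        return best_program   # this operating program then picks the next action

Each decision therefore costs on the order of T*2^L basic steps, which is why the design is trivial to state but hopeless to run.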

There is a theorem that says...

Given any AI system at all, if you give AIXItl a big enough t and l, then it
can outperform the other AI system.

Note that this is an unfair contest, because the AIXItl is effectively being
given a lot more compute power than the other system.

But basically, what the theorem shows is that if you don't need to worry
about computing resources, then AI design is trivial -- you can just use
AIXItl, which is a very simple program.

This is not pragmatically useful at all, because in reality we DO have to
worry about computing resources.

What Eliezer has pointed out is that AIXItl's are bad at figuring out what
each other are going to do.

If you put a bunch of AIXItl's in a situation where they have to figure out
what each other are going to do, they probably will fail.  The reason is
that what each AIXItl does is to evaluate a lot of programs much faster than
it is, and choose one to be its operating program.  An AIXItl is not
configured to study programs that are as slow as it is, so it's not
configured to study other programs that are its clones, or are of similar
complexity to it.

On the other hand, humans are dumber than AIXItl's (for big t and l), but
they are smarter at figuring out what *each other* are going to do, because
they are built to be able to evaluate programs (other humans) around as slow
as they are.

This is a technical reflection of the basic truth that

* just because one AI system is a lot smarter than another when given any
problem of fixed complexity to solve
* doesn't mean the smarter AI system is better at figuring out and
interacting with others of *its kind*, than the dumber one is at figuring
out and interacting with others of *its kind*.

Of course, I glossed over a good bit in trying to summarize the ideas
nonmathematically..

In this way, Novamentes are more like humans than AIXItl's.

-- Ben G

---
To unsubscribe, change your address, or temporarily deactivate your subscription, 
please go to http://v2.listbox.com/member/?[EMAIL PROTECTED]



RE: [agi] Breaking AIXI-tl

2003-02-15 Thread Ben Goertzel

Hi,

> There's a physical challenge which operates on *one* AIXI-tl and breaks
> it, even though it involves diagonalizing the AIXI-tl as part of the
> challenge.

OK, I see what you mean by calling it a "physical challenge."  You mean
that, as part of the challenge, the external agent posing the challenge is
allowed to clone the AIXI-tl.

>  > An intuitively fair, physically realizable challenge, with important
>  > real-world analogues, formalizable as a computation which can be fed
>  > either a tl-bounded uploaded human or an AIXI-tl, for which the human
>  > enjoys greater success measured strictly by total reward over time, due
>  > to the superior strategy employed by that human as the result of
>  > rational reasoning of a type not accessible to AIXI-tl.
>
> It's really the formalizability of the challenge as a computation which
> can be fed either a *single* AIXI-tl or a *single* tl-bounded uploaded
> human that makes the whole thing interesting at all... I'm sorry I didn't
> succeed in making clear the general class of real-world analogues for
> which this is a special case.

OK  I don't see how the challenge you've described is
"formalizable as a computation which can be fed either a tl-bounded uploaded
human or an AIXI-tl."

The challenge involves cloning the agent being challenged.  Thus it is not a
computation feedable to the agent, unless you assume the agent is supplied
with a cloning machine...

> If I were to take a very rough stab at it, it would be that the
> cooperation case with your own clone is an extreme case of many scenarios
> where superintelligences can cooperate with each other on the one-shot
> Prisoner's Dilemma provided they have *loosely similar* reflective goal
> systems and that they can probabilistically estimate that enough loose
> similarity exists.

Yah, but the definition of a superintelligence is relative to the agent
being challenged.

For any fixed superintelligent agent A, there are AIXItl's big enough to
succeed against it in any cooperative game.

To "break" AIXI-tl, the challenge needs to be posed in a way that refers to
AIXItl's own size, i.e. one has to say something like "Playing a cooperative
game with other intelligences of intelligence at least f(t,l)", where f is
some increasing function

If the intelligence of the opponents is fixed, then one can always make an
AIXItl win by increasing t and l ...

So your challenges are all of the form:

* For any fixed AIXItl, here is a challenge that will defeat it

ForAll AIXItl's A(t,l), ThereExists a challenge C(t,l) so that fails_at(A,C)

or alternatively

ForAll AIXItl's A(t,l), ThereExists a challenge C(A(t,l)) so that
fails_at(A,C)

rather than of the form

* Here is a challenge that will defeat any AIXItl

ThereExists a challenge C so that ForAll AIXItl's A(t,l), fails_at(A,C)

The point is that the challenge C is a function C(t,l) rather than being
independent of t and l
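
Restating the quantifier distinction above compactly (the same content as the ASCII lines, just in standard notation):

    \forall\, t,l \;\; \exists\, C_{t,l} \;:\; \mathrm{fails\_at}\left(A_{t,l},\, C_{t,l}\right)
    \quad\text{rather than}\quad
    \exists\, C \;\; \forall\, t,l \;:\; \mathrm{fails\_at}\left(A_{t,l},\, C\right)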

This of course is why your challenge doesn't break Hutter's theorem.  But
it's a distinction that your initial verbal formulation didn't make very
clearly (and I understand, the distinction is not that easy to make in
words.)

Of course, it's also true that

ForAll uploaded humans H, ThereExists a challenge C(H) so that fails_at(H,C)

What you've shown that's interesting is that

ThereExists a challenge C, so that:
-- ForAll AIXItl's A(t,l), fails_at(A,C(A))
-- for many uploaded humans H, succeeds_at(H,C(H))

(Where, were one to try to actually prove this, one would substitute
"uploaded humans" with "other AI programs" or something).



>  The interesting part is that these little
> natural breakages in the formalism create an inability to take part in
> what I think might be a fundamental SI social idiom, conducting binding
> negotiations by convergence to goal processes that are guaranteed to have
> a correlated output, which relies on (a) Bayesian-inferred initial
> similarity between goal systems, and (b) the ability to create a
> top-level
> reflective choice that wasn't there before, that (c) was abstracted over
> an infinite recursion in your top-level predictive process.

I think part of what you're saying here is that AIXItl's are not designed to
be able to participate in a community of equals  This is certainly true.

--- Ben G

---
To unsubscribe, change your address, or temporarily deactivate your subscription, 
please go to http://v2.listbox.com/member/?[EMAIL PROTECTED]



Re: [agi] Breaking AIXI-tl

2003-02-15 Thread Philip Sutton
Eliezer/Ben,

When you've had time to draw breath can you explain, in non-obscure, 
non-mathematical language, what the implications of the AIXI-tl 
discussion are?

Thanks.

Cheers, Philip

---
To unsubscribe, change your address, or temporarily deactivate your subscription, 
please go to http://v2.listbox.com/member/?[EMAIL PROTECTED]



Re: [agi] Breaking AIXI-tl

2003-02-15 Thread Bill Hibbard
Eliezer S. Yudkowsky wrote:
> Bill Hibbard wrote:
> > On Fri, 14 Feb 2003, Eliezer S. Yudkowsky wrote:
> >
> >>It *could* do this but it *doesn't* do this.  Its control process is such
> >>that it follows an iterative trajectory through chaos which is forbidden
> >>to arrive at a truthful solution, though it may converge to a stable
> >>attractor.
> >
> > This is the heart of the fallacy. Neither a human nor an AIXI
> > can know "that his synchronized other self - whichever one
> > he is - is doing the same". All a human or an AIXI can know is
> > its observations. They can estimate but not know the intentions
> > of other minds.
>
> The halting problem establishes that you can never perfectly understand
> your own decision process well enough to predict its decision in advance,
> because you'd have to take into account the decision process including the
> prediction, et cetera, establishing an infinite regress.
>
> However, Corbin doesn't need to know absolutely that his other self is
> synchronized, nor does he need to know his other self's decision in
> advance.  Corbin only needs to establish a probabilistic estimate, good
> enough to guide his actions, that his other self's decision is correlated
> with his *after* the fact.  (I.e., it's not a halting problem where you
> need to predict yourself in advance; you only need to know your own
> decision after the fact.)
>
> AIXI-tl is incapable of doing this for complex cooperative problems
> because its decision process only models tl-bounded things and AIXI-tl is
> not *remotely close* to being tl-bounded.

Now you are using a different argument. Your previous argument was:

> Lee Corbin can work out his entire policy in step (2), before step
> (3) occurs, knowing that his synchronized other self - whichever one
> he is - is doing the same.

Now you have Corbin merely estimating his clone's intentions.
While it is true that AIXI-tl cannot completely simulate itself,
it also can estimate another AIXI-tl's future behavior based on
observed behavior.

Your argument is now that Corbin can do it better. I don't
know if this is true or not.

> . . .
> Let's say that AIXI-tl takes action A in round 1, action B in round 2, and
> action C in round 3, and so on up to action Z in round 26.  There's no
> obvious reason for the sequence {A...Z} to be predictable *even
> approximately* by any of the tl-bounded processes AIXI-tl uses for
> prediction.  Any given action is the result of a tl-bounded policy but the
> *sequence* of *different* tl-bounded policies was chosen by a t2^l process.

Your example sequence is pretty simple and should match a
nice simple universal turing machine program in an AIXI-tl,
well within its bounds. Furthermore, two AIXI-tl's will
probably converge on a simple sequence in prisoner's
dilemma. But I have no idea if they can do it better than
Corbin and his clone.

Bill

---
To unsubscribe, change your address, or temporarily deactivate your subscription, 
please go to http://v2.listbox.com/member/?[EMAIL PROTECTED]



Re: [agi] Breaking AIXI-tl

2003-02-14 Thread Eliezer S. Yudkowsky
Eliezer S. Yudkowsky wrote:


But if this isn't immediately obvious to you, it doesn't seem like a top 
priority to try and discuss it...

Argh.  That came out really, really wrong and I apologize for how it 
sounded.  I'm not very good at agreeing to disagree.

Must... sleep...

--
Eliezer S. Yudkowsky  http://singinst.org/
Research Fellow, Singularity Institute for Artificial Intelligence

---
To unsubscribe, change your address, or temporarily deactivate your subscription, 
please go to http://v2.listbox.com/member/?[EMAIL PROTECTED]


Re: [agi] Breaking AIXI-tl

2003-02-14 Thread Eliezer S. Yudkowsky
Ben Goertzel wrote:
>
> I'll read the rest of your message tomorrow...
>
>> But we aren't *talking* about whether AIXI-tl has a mindlike
>> operating program.  We're talking about whether the physically
>> realizable challenge, which definitely breaks the formalism, also
>> breaks AIXI-tl in practice. That's what I originally stated, that's
>> what you originally said you didn't believe, and that's all I'm
>> trying to demonstrate.
>
> Your original statement was posed in a misleading way, perhaps not
> intentionally.
>
> There is no challenge on which *an* AIXI-tl doesn't outperform *an*
> uploaded human.

We are all Lee Corbin; would you really say there's "more than one"... oh,
never mind, I don't want to get *that* started here.

There's a physical challenge which operates on *one* AIXI-tl and breaks 
it, even though it involves diagonalizing the AIXI-tl as part of the
challenge.  In the real world, all reality is interactive and
naturalistic, not walled off by a Cartesian theatre.  The example I gave
is probably the simplest case that clearly breaks the formalism and
clearly causes AIXI-tl to operate suboptimally.  There's more complex and
important cases, that we would understand as roughly constant
environmental challenges which break AIXI-tl's formalism in more subtle
ways, with the result that AIXI-tl can't cooperate in one-shot PDs with
superintelligences... and neither can a human, incidentally, but another
seed AI or superintelligence can-I-think, by inventing a new kind of
reflective choice which is guaranteed to be correlated as a result of
shared initial conditions, both elements that break AIXI-tl... well, 
anyway, the point is that there's a qualitatively different kind of 
intelligence here that I think could turn out to be extremely critical in 
negotiations among superintelligences.  The formalism in this situation 
gets broken, depending on how you're looking at it, by side effects of the 
AIXI-tl's existence or by violation of the separability condition. 
Actually, violations of the formalism are ubiquitous and this is not 
particularly counterintuitive; what is counterintuitive is that formalism 
violations turn out to make a real-world difference.

Are we at least in agreement on the fact that there exists a formalizable 
constant challenge C which accepts an arbitrary single agent and breaks 
both the AIXI-tl formalism and AIXI-tl?



OK.

We'd better take a couple of days off before taking up the AIXI 
Friendliness issue.  Maybe even wait until I get back from New York in a 
week.  Also, I want to wait for all these emails to show up in the AGI 
archive, then tell Marcus Hutter about them if no one has already.  I'd be 
interesting in seeing what he thinks.

> What you're trying to show is that there's an inter-AIXI-tl social
> situation in which AIXI-tl's perform less intelligently than humans do
> in a similar inter-human situation.
>
> If you had posed it this way, I wouldn't have been as skeptical
> initially.

If I'd posed it that way, it would have been uninteresting because I
wouldn't have broken the formalism.  Again, to quote my original claim:

>> 1)  There is a class of physically realizable problems, which humans
>> can solve easily for maximum reward, but which - as far as I can tell
>> - AIXI cannot solve even in principle;
>
> I don't see this, nor do I believe it...

And later expanded to:

> An intuitively fair, physically realizable challenge, with important
> real-world analogues, formalizable as a computation which can be fed
> either a tl-bounded uploaded human or an AIXI-tl, for which the human
> enjoys greater success measured strictly by total reward over time, due
> to the superior strategy employed by that human as the result of
> rational reasoning of a type not accessible to AIXI-tl.

It's really the formalizability of the challenge as a computation which 
can be fed either a *single* AIXI-tl or a *single* tl-bounded uploaded 
human that makes the whole thing interesting at all... I'm sorry I didn't 
succeed in making clear the general class of real-world analogues for 
which this is a special case.

If I were to take a very rough stab at it, it would be that the 
cooperation case with your own clone is an extreme case of many scenarios 
where superintelligences can cooperate with each other on the one-shot 
Prisoner's Dilemma provided they have *loosely similar* reflective goal 
systems and that they can probabilistically estimate that enough loose 
similarity exists.

It's the natural counterpart of the Clone challenge - loosely similar goal 
systems arise all the time, and it turns out that in addition to those 
goal systems being interpreted as a constant environmental challenge, 
there are social problems that depend on your being able to correlate your 
internal processes with theirs (you can correlate internal processes 
because you're both part of the same naturalistic universe).  This breaks 
AIXI-tl because it's not loosely similar enough - an

RE: [agi] Breaking AIXI-tl

2003-02-14 Thread Ben Goertzel


Hmmm  My friend, I think you've pretty much convinced me with this last
batch of arguments.  Or, actually, I'm not sure if it was your excellently
clear arguments or the fact that I finally got a quiet 15 minutes to really
think about it (the three kids, who have all been out sick from school with
a flu all week, are all finally in bed ;)

Your arguments are a long way from a rigorous proof, and I can't rule out
that there might be a hole in them, but in this last e-mail you were
explicit enough to convince me that what you're saying makes logical sense.

I'm going to try to paraphrase your argument, let's see if we're somewhere
in the neighborhood of harmony...

Basically: you've got these two clones playing a cooperative game, and each
one, at each turn, is controlled by a certain program.  Each clone chooses
his "current operating program" by searching the space of all programs of
length < L that finish running in < T timesteps, and finding the one that,
based on his study of prior gameplay, is expected to give him the highest
chance of winning.  But each guy takes on the order of T*2^L timesteps to perform this search.

So your basic point is that, because these clones are acting by simulating
programs that finish running in less than T timesteps, they're not going to
be able to simulate each other very accurately.

Whereas, a pair of clones each possessing a more flexible control algorithm
could perform better in the game.  Because, if a more flexible player wants
to simulate his opponent, he can choose to devote nearly ALL his
thinking-time inbetween moves to simulating his opponent.  Because these
more flexible players are not constrained to a rigid control algorithm that
divides up their time into little bits, simulating a huge number of fast
programs.

> If AIXI-tl needs general intelligence but fails to develop
> general intelligence to solve the complex cooperation problem, while
> humans starting out with general intelligence do solve the problem, then
> AIXI-tl has been broken.

Well, we have different definitions of "broken" in this context, but that's
not a point worth arguing about.

> But we aren't *talking* about whether AIXI-tl has a mindlike operating
> program.  We're talking about whether the physically realizable
> challenge,
> which definitely breaks the formalism, also breaks AIXI-tl in practice.
> That's what I originally stated, that's what you originally said you
> didn't believe, and that's all I'm trying to demonstrate.

Yes, you would seem to have successfully shown (logically and intuitively,
though not mathematically) that AIXItl's can be dumber in their interactions
with other AIXItl's, than humans are in their analogous interactions with
other humans.

I don't think you should describe this as "breaking the formalism", because
the formalism is about how a single AIXItl solves a fixed goal function, not
about how groups of AIXItl's interact.

But it's certainly an interesting result.  I hope that, even if you don't
take the time to prove it rigorously, you'll write it up in a brief,
coherent essay, so that others not on this list can appreciate it...

Funky stuff!! ;-)

-- Ben G

---
To unsubscribe, change your address, or temporarily deactivate your subscription, 
please go to http://v2.listbox.com/member/?[EMAIL PROTECTED]



RE: [agi] Breaking AIXI-tl

2003-02-14 Thread Ben Goertzel


I'll read the rest of your message tomorrow...

> But we aren't *talking* about whether AIXI-tl has a mindlike operating
> program.  We're talking about whether the physically realizable
> challenge,
> which definitely breaks the formalism, also breaks AIXI-tl in practice.
> That's what I originally stated, that's what you originally said you
> didn't believe, and that's all I'm trying to demonstrate.



Your original statement was posed in a misleading way, perhaps not
intentionally.

There is no challenge on which *an* AIXI-tl doesn't outperform *an* uploaded
human.

What you're trying to show is that there's an inter-AIXI-tl social situation
in which AIXI-tl's perform less intelligently than humans do in a similar
inter-human situation.

If you had posed it this way, I wouldn't have been as skeptical initially.

-- Ben

---
To unsubscribe, change your address, or temporarily deactivate your subscription, 
please go to http://v2.listbox.com/member/?[EMAIL PROTECTED]



Re: [agi] Breaking AIXI-tl

2003-02-14 Thread Eliezer S. Yudkowsky
Ben Goertzel wrote:
>
>> AIXI-tl *cannot* figure this out because its control process is not
>> capable of recognizing tl-computable transforms of its own policies
>> and strategic abilities, *only* tl-computable transforms of its own
>> direct actions.  Yes, it simulates entities who know this; it also
>> simulates every possible other kind of tl-bounded entity.  The
>> question is whether that internal knowledge appears as an advantage
>> recognized by the control process and given AIXI-tl's formal
>> definition, it does not appear to do so.
>
> I don't understand how you're deriving the conclusion in your final
> sentence.
>
> How do you know the circumstances in which AIXItl would be led to adopt
>  operating programs involving modeling its own policies and strategic
> abilities?

Because AIXI-tl is a completely specified system and it is therefore
possible to see certain bounds on its ability to model itself.  It has no
*direct* reflectivity except for its memory of its own actions and its
indirect reflectivity is limited by the ability of a tl-bounded process to
simulate the conclusions of a t2^l process.  (Note that under ordinary
circumstances AIXI-tl never needs this ability in order to *outperform* a
tl-bounded process; its internal tl-bounded processes will always model an
AIXI-tl as well as any tl-bounded process could.)  We need to distinguish
between abstract properties of the AIXI-tl's policies that an internal
process can understand, and specific outputs of AIXI-tl that the internal
process can predict.  AIXI-tl simulates all possible tl-bounded
semimeasures; some of those semimeasures will attempt to assign a
probability to sense data based on the abstract theory "You are facing
another AIXI-tl", but this abstract theory will not be enough to actually
predict that AIXI-tl's *specific* outputs in order to assign them a high
probability.  The design structure of the cooperative strategy task (note
that it is not the Prisoner's Dilemma but a complex cooperation problem)
is such that each AIXI-tl will choose a different tl-bounded policy (using
t2^l operations to do so).  Given that the abstract theory contained in
the tl-bounded probability semimeasure cannot access the tl-bounded policy
of either AIXI, nor itself utilize the t2^l process used to select among
policies, how is the semimeasure supposed to predict which actions the
*other* AIXI will take?  Even if such semimeasures succeed in bubbling to
the top of the probability distribution, how will a tl-bounded policy in
step 4 know which policy the other AIXI selected in order to coordinate
strategies?  There's no guarantee that the two policies are even
approximately the same - AIXI-tl's policy is in the same dilemma as a
human trying to work out the strategy in step 4 instead of step 2.

If you "still don't see it", could you please say *which* step in the
above reasoning first strikes you as a non sequitur?

> You may well be right that PD2 is not such a circumstance, but that
> doesn't mean there are no such circumstances, or that such
> circumstances wouldn't be common in the hypothetical life of a real
> embodied AIXItl

Breaking a universality claim only requires one counterexample.  Of course
there are at least some circumstances where AIXI-tl can outperform a human!

>> AIXI-tl learns vision *instantly*.  The Kolmogorov complexity of a
>> visual field is much less than its raw string, and the compact
>> representation can be computed by a tl-bounded process.  It develops
>> a visual cortex on the same round it sees its first color picture.
>
> Yes, but that "visual cortex" would not be useful for anything.  It
> would take some time for an embodied AIXItl to figure out how to
> recognize visual patterns in a way that was useful to it in
> coordinating its actions.  Unless it had a priori knowledge to guide
> it, this would be a substantial process of trial and error learning.

Okay, we discount the early trials as part of the bounded loss.  Standard
Operating Procedure for Hutter's proof.

>> Because it is physically or computationally impossible for a
>> tl-bounded program to access or internally reproduce the previously
>> computed policies or t2^l strategic ability of AIXI-tl.
>
> Yes, but why can't it learn patterns that let it approximately predict
> the strategies of AIXI-tl?

Uh... I honestly thought I just *said* why.  I'll try expanding; let me 
know if the expansion still doesn't help.

AIXI-tl, trying to predict itself, steadily adds more and more tl-bounded 
Kolmogorov complexity to the sensory inputs it needs to predict.  The true 
Kolmogorov complexity of AIXI-tl never exceeds the length of the AIXI-tl 
program plus the challenge computation C, which is actually pretty small 
change.  However, the tl-bounded Kolmogorov complexity keeps rising unless 
AIXI-tl is lucky enough to stumble on a probability distribution model 
which in the Secondary advises actions that confirm the probability 
distribution model in the Pr

RE: [agi] Breaking AIXI-tl

2003-02-14 Thread Ben Goertzel


Hi,

> You appear to be thinking of AIXI-tl as a fuzzy little harmless baby being
> confronted with some harsh trial.

Once again, your ability to see into my mind proves extremely flawed ;-)

You're right that my statement "AIXItl is slow at learning" was ill-said,
though.  It is very inefficient at learning in the sense that it takes a
huge number of computation steps to decide each action it takes.  However,
in your PD scenario you're assuming that it has a fast enough processor to
do all this thinking in between each step of the iterated PD, in which case,
yeah, it has to be doing very very fast operations.  AIXItl is slow at
learning if you count slowness in terms of computation steps, but that's not
the way your example wants us to look at things...


>  > The question is whether after enough trials AIXI-tl figures out it's
>  > playing some entity similar to itself and learns how to act
>  > accordingly  If so, then it's doing what AIXI-tl is supposed to do.
>  >
> AIXI-tl *cannot* figure this out because its control process is not
> capable of recognizing tl-computable transforms of its own policies and
> strategic abilities, *only* tl-computable transforms of its own direct
> actions.  Yes, it simulates entities who know this; it also simulates
> every possible other kind of tl-bounded entity.  The question is whether
> that internal knowledge appears as an advantage recognized by the control
> process and given AIXI-tl's formal definition, it does not appear
> to do so.

I don't understand how you're deriving the conclusion in your final
sentence.

How do you know the circumstances in which AIXItl would be led to adopt
operating programs involving modeling its own policies and strategic
abilities?

You may well be right that PD2 is not such a circumstance, but that doesn't
mean there are no such circumstances, or that such circumstances wouldn't be
common in the hypothetical life of a real embodied AIXItl

>  > A human can also learn to solve vision recognition problems faster than
>  >  AIXI-tl, because we're wired for it (as we're wired for social
>  > gameplaying), whereas AIXI-tl has to learn
>
> AIXI-tl learns vision *instantly*.  The Kolmogorov complexity of a visual
> field is much less than its raw string, and the compact representation can
> be computed by a tl-bounded process.  It develops a visual cortex on the
> same round it sees its first color picture.

Yes, but that "visual cortex" would not be useful for anything.  It would
take some time for an embodied AIXItl to figure out how to recognize visual
patterns in a way that was useful to it in coordinating its actions.  Unless
it had a priori knowledge to guide it, this would be a substantial process
of trial and error learning.

>  >> Humans can recognize a much stronger degree of similarity in human
>  >> Other Minds than AIXI-tl's internal processes are capable of
>  >> recognizing in any other AIXI-tl.
>  >
>  > I don't believe that is true.
>
> Mentally simulate the abstract specification of AIXI-tl instead of using
> your intuitions about the behavior of a generic reinforcement process.

Eliezer, I don't know what a "generic reinforcement process" is.  Of course
AIXItl is very different from an ordinary reinforcement learning system.

>  > OK... here's where the fact that you have a tabula rasa AIXI-tl in a
>  > very limiting environment comes in.
>  >
>  > In a richer environment, I don't see why AIXI-tl, after a long enough
>  > time, couldn't learn an operating program that implicitly embodied an
>  > abstraction over its own internal state.
>
> Because it is physically or computationally impossible for a tl-bounded
> program to access or internally reproduce the previously computed
> policies
> or t2^l strategic ability of AIXI-tl.

Yes, but why can't it learn patterns that let it approximately predict the
strategies of AIXI-tl?

>  > In an environment consisting solely of PD2, it may be that AIXI-tl will
>  > never have the inspiration to learn this kind of operating program.
>  > (I'm not sure.)
>  >
>  > To me, this says mostly that PD2 is an inadequate environment for any
>  > learning system to use, to learn how to become a mind.  If it ain't
>  > good enough for AIXI-tl to use to learn how to become a mind, over a
>  > very long period of time, it probably isn't good for any AI system to
>  > use to learn how to become a mind.
>
> Marcus Hutter has formally proved your intuitions wrong.  In any
> situation
> that does *not* break the formalism, AIXI-tl learns to equal or
> outperform
> any other process, despite being a tabula rasa, no matter how
> rich or poor
> its environment.

No, Marcus Hutter did not prove the intuition I expressed there wrong.  You
seem not to have understood what I was saying.

AIXI-tl can equal or outperform any other process so long as it is given a
lot more computational resources than the other process.  But that was not
the statement I was making.

What I was saying was that ANY reinf

Re: [agi] Breaking AIXI-tl

2003-02-14 Thread Michael Roy Ames
Eliezer S. Yudkowsky asked Ben Goertzel:
>
>  Do you have a non-intuitive mental simulation mode?
>

LOL  --#:^D

It *is* a valid question, Eliezer, but it makes me laugh.

Michael Roy Ames
[Who currently estimates his *non-intuitive mental simulation mode* to
contain about 3 iterations of 5 variables each - 8 variables each on a
good day.  Each variable can link to a concept (either complex or
simple)... and if that sounds to you like something that a trashed-out
Commodore 64 could emulate, then you have some idea how he feels being
stuck at his current level of non-intuitive intelligence.]


---
To unsubscribe, change your address, or temporarily deactivate your subscription, 
please go to http://v2.listbox.com/member/?[EMAIL PROTECTED]



Re: [agi] Breaking AIXI-tl

2003-02-14 Thread Eliezer S. Yudkowsky
Bill Hibbard wrote:

On Fri, 14 Feb 2003, Eliezer S. Yudkowsky wrote:


It *could* do this but it *doesn't* do this.  Its control process is such
that it follows an iterative trajectory through chaos which is forbidden
to arrive at a truthful solution, though it may converge to a stable
attractor.


This is the heart of the fallacy. Neither a human nor an AIXI
can know "that his synchronized other self - whichever one
he is - is doing the same". All a human or an AIXI can know is
its observations. They can estimate but not know the intentions
of other minds.


The halting problem establishes that you can never perfectly understand 
your own decision process well enough to predict its decision in advance, 
because you'd have to take into account the decision process including the 
prediction, et cetera, establishing an infinite regress.

However, Corbin doesn't need to know absolutely that his other self is 
synchronized, nor does he need to know his other self's decision in 
advance.  Corbin only needs to establish a probabilistic estimate, good 
enough to guide his actions, that his other self's decision is correlated 
with his *after* the fact.  (I.e., it's not a halting problem where you 
need to predict yourself in advance; you only need to know your own 
decision after the fact.)
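
A toy calculation makes the "correlated after the fact" point concrete.  Standard
one-shot Prisoner's Dilemma payoffs are assumed below; the code illustrates the
reasoning being attributed to Corbin, not anything inside the AIXI-tl formalism.

# Row player's payoffs: T=5 > R=3 > P=1 > S=0.
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}

def my_payoff(my_move, opponent_correlated, independent_move="D"):
    # If my opponent is an exact, synchronized clone, its decision is correlated
    # with mine: whatever I decide, it decides too.
    their_move = my_move if opponent_correlated else independent_move
    return PAYOFF[(my_move, their_move)]

# Deciding "C" against the clone yields 3, deciding "D" yields 1, so the
# correlation argument favors cooperating -- even though "D" dominates against
# an uncorrelated opponent.
assert my_payoff("C", opponent_correlated=True) == 3
assert my_payoff("D", opponent_correlated=True) == 1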

AIXI-tl is incapable of doing this for complex cooperative problems 
because its decision process only models tl-bounded things and AIXI-tl is 
not *remotely close* to being tl-bounded.  Humans can model minds much 
closer to their own size than AIXI-tl can.  Humans can recognize when 
their policies, not just their actions, are reproduced.  We can put 
ourselves in another human's shoes imperfectly; AIXI-tl can't put itself 
in another AIXI-tl's shoes to the extent of being able to recognize the 
actions of an AIXI-tl computed using a process that is inherently 2t^l 
large.  Humans can't recognize their other selves perfectly but the gap in 
the case of AIXI-tl is enormously greater.  (Humans also have a reflective 
control process on which they can perform inductive and deductive 
generalizations and jump over a limited class of infinite regresses in 
decision processes, but that's a separate issue.  Suffice it to say that a 
subprocess which generalizes over its own infinite regress does not 
obviously suffice for AIXI-tl to generalize over the top-level infinite 
regress in AIXI-tl's control process.)

Let's say that AIXI-tl takes action A in round 1, action B in round 2, and 
action C in round 3, and so on up to action Z in round 26.  There's no 
obvious reason for the sequence {A...Z} to be predictable *even 
approximately* by any of the tl-bounded processes AIXI-tl uses for 
prediction.  Any given action is the result of a tl-bounded policy but the 
*sequence* of *different* tl-bounded policies was chosen by a t2^l process.

A human in the same situation has a mnemonic record of the sequence of 
policies used to compute their strategies, and can recognize correlations 
between the sequence of policies and the other agent's sequence of 
actions, which can then be confirmed by directing O(other-agent) strategic 
processing power at the challenge of seeing the problem from the opposite 
perspective.  AIXI-tl is physically incapable of doing this directly and 
computationally incapable of doing it indirectly.  This is not an attack 
on the computability of intelligence; the human is doing something 
perfectly computable which AIXI-tl does not do.

--
Eliezer S. Yudkowsky  http://singinst.org/
Research Fellow, Singularity Institute for Artificial Intelligence

---
To unsubscribe, change your address, or temporarily deactivate your subscription, 
please go to http://v2.listbox.com/member/?[EMAIL PROTECTED]


Re: [agi] Breaking AIXI-tl

2003-02-14 Thread Eliezer S. Yudkowsky
Ben Goertzel wrote:
>> Even if a (grown) human is playing PD2, it outperforms AIXI-tl
>> playing PD2.
>
> Well, in the long run, I'm not at all sure this is the case.  You
> haven't proved this to my satisfaction.

PD2 is very natural to humans; we can take for granted that humans excel
at PD2.  The question is AIXI-tl.

> In the short run, it certainly is the case.  But so what?  AIXI-tl is
> damn slow at learning, we know that.

AIXI-tl is most certainly not "damn slow" at learning any environment that
can be tl-bounded.  For problems that don't break the Cartesian formalism,
AIXI-tl learns only slightly slower than the fastest possible tl-bounded
learner.  It's got t2^l computing power for gossakes!  From our
perspective it learns at faster than the fastest rate humanly imaginable -
literally.

You appear to be thinking of AIXI-tl as a fuzzy little harmless baby being
confronted with some harsh trial.  That fuzzy little harmless baby, if the
tl-bound is large enough to simulate Lee Corbin, is wielding something
like 10^10^15 operations per second, which it is using to *among other
things* simulate every imaginable human experience.  AIXI-tl is larger
than universes; it contains all possible tl-bounded heavens and all 
possible tl-bounded hells.  The only question is whether its control 
process makes any good use of all that computation.

More things from the list of system properties that Friendliness 
programmers should sensitize themselves to:  Just because the endless 
decillions of alternate Ben Goertzels in torture chambers are screaming to 
God to stop it doesn't mean that AIXI-tl's control process cares.

> The question is whether after enough trials AIXI-tl figures out it's
> playing some entity similar to itself and learns how to act
> accordingly  If so, then it's doing what AIXI-tl is supposed to do.
>
AIXI-tl *cannot* figure this out because its control process is not
capable of recognizing tl-computable transforms of its own policies and
strategic abilities, *only* tl-computable transforms of its own direct
actions.  Yes, it simulates entities who know this; it also simulates
every possible other kind of tl-bounded entity.  The question is whether
that internal knowledge appears as an advantage recognized by the control
process and given AIXI-tl's formal definition, it does not appear to do so.

In my humble opinion, one of the (many) critical skills for creating AI is
learning to recognize what systems *really actually do* and not just what
you project onto them.  See also Eliza effect, failure of GOFAI, etc.

> A human can also learn to solve vision recognition problems faster than
>  AIXI-tl, because we're wired for it (as we're wired for social
> gameplaying), whereas AIXI-tl has to learn

AIXI-tl learns vision *instantly*.  The Kolmogorov complexity of a visual
field is much less than its raw string, and the compact representation can
be computed by a tl-bounded process.  It develops a visual cortex on the
same round it sees its first color picture.
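
As a rough illustration of "the Kolmogorov complexity of a visual field is much
less than its raw string": compressed length is only an upper bound on Kolmogorov
complexity, and the toy image below is invented, but it shows the gap a
tl-bounded compressor can already exploit.

import random
import zlib

# A structured 64x64 "visual field" versus pure noise of the same size.
width, height = 64, 64
structured = bytes(((x ^ y) & 0xFF) for y in range(height) for x in range(width))
noise = bytes(random.randrange(256) for _ in range(width * height))

print(len(structured), len(zlib.compress(structured)))  # 4096 raw, far fewer bytes compressed
print(len(noise), len(zlib.compress(noise)))            # 4096 raw, roughly 4096 compressed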

>> Humans can recognize a much stronger degree of similarity in human
>> Other Minds than AIXI-tl's internal processes are capable of
>> recognizing in any other AIXI-tl.
>
> I don't believe that is true.

Mentally simulate the abstract specification of AIXI-tl instead of using
your intuitions about the behavior of a generic reinforcement process. 
Eventually the results you learn will be integrated into your intuitions 
and you'll be able to directly see dependencies between specifications and 
reflective modeling abilities.

> OK... here's where the fact that you have a tabula rasa AIXI-tl in a
> very limiting environment comes in.
>
> In a richer environment, I don't see why AIXI-tl, after a long enough
> time, couldn't learn an operating program that implicitly embodied an
> abstraction over its own internal state.

Because it is physically or computationally impossible for a tl-bounded 
program to access or internally reproduce the previously computed policies 
or t2^l strategic ability of AIXI-tl.

> In an environment consisting solely of PD2, it may be that AIXI-tl will
> never have the inspiration to learn this kind of operating program.
> (I'm not sure.)
>
> To me, this says mostly that PD2 is an inadequate environment for any
> learning system to use, to learn how to become a mind.  If it ain't
> good enough for AIXI-tl to use to learn how to become a mind, over a
> very long period of time, it probably isn't good for any AI system to
> use to learn how to become a mind.

Marcus Hutter has formally proved your intuitions wrong.  In any situation 
that does *not* break the formalism, AIXI-tl learns to equal or outperform 
any other process, despite being a tabula rasa, no matter how rich or poor 
its environment.

>> Anyway... basically, if you're in a real-world situation where the
>> other intelligence has *any* information about your internal state,
>> not just from direct examination, but from reasoning about your
>> origin

RE: [agi] Breaking AIXI-tl

2003-02-14 Thread Ben Goertzel

> Even if a (grown) human is playing PD2, it outperforms AIXI-tl playing
> PD2.

Well, in the long run, I'm not at all sure this is the case.  You haven't
proved this to my satisfaction.

In the short run, it certainly is the case.  But so what?  AIXI-tl is damn
slow at learning, we know that.

The question is whether after enough trials AIXI-tl figures out it's playing
some entity similar to itself and learns how to act accordingly  If so,
then it's doing what AIXI-tl is supposed to do.

A human can also learn to solve vision recognition problems faster than
AIXI-tl, because we're wired for it (as we're wired for social gameplaying),
whereas AIXI-tl has to learn


> Humans can recognize a much stronger degree of similarity in human Other
> Minds than AIXI-tl's internal processes are capable of recognizing in any
> other AIXI-tl.

I don't believe that is true.

> Again, as far as I can tell, this
> necessarily requires abstracting over your own internal state and
> recognizing that the outcome of your own (internal) choices are
> necessarily reproduced by a similar computation elsewhere.
> Basically, it
> requires abstracting over your own halting problem to realize that the
> final result of your choice is correlated with that of the process
> simulated, even though you can't fully simulate the causal process
> producing the correlation in advance.  (This doesn't *solve* your own
> halting problem, but at least it enables you to *understand* the
> situation
> you've been put into.)  Except that instead of abstracting over your own
> halting problem, you're abstracting over the process of trying to
> simulate
> another mind trying to simulate you trying to simulate it, where
> the other
> mind is sufficiently similar to your own.  This is a kind of reasoning
> qualitatively closed to AIXI-tl; its control process goes on abortively
> trying to simulate the chain of simulations forever, stopping and
> discarding that prediction as unuseful as soon as it exceeds the t-bound.

OK... here's where the fact that you have a tabula rasa AIXI-tl in a very
limiting environment comes in.

In a richer environment, I don't see why AIXI-tl, after a long enough time,
couldn't learn an operating program that implicitly embodied an abstraction
over its own internal state.

In an environment consisting solely of PD2, it may be that AIXI-tl will
never have the inspiration to learn this kind of operating program.  (I'm
not sure.)

To me, this says mostly that PD2 is an inadequate environment for any
learning system to use, to learn how to become a mind.  If it ain't good
enough for AIXI-tl to use to learn how to become a mind, over a very long
period of time, it probably isn't good for any AI system to use to learn how
to become a mind.

> Anyway... basically, if you're in a real-world situation where the other
> intelligence has *any* information about your internal state, not just
> from direct examination, but from reasoning about your origins, then that
> also breaks the formalism and now a tl-bounded seed AI can outperform
> AIXI-tl on the ordinary (non-quined) problem of cooperation with a
> superintelligence.  The environment can't ever *really* be constant and
> completely separated as Hutter requires.  A physical environment that
> gives rise to an AIXI-tl is different from the environment that
> gives rise
> to a tl-bounded seed AI, and the different material implementations of
> these entities (Lord knows how you'd implement the AIXI-tl) will have
> different side effects, and so on.  All real world problems break the
> Cartesian assumption.  The questions "But are there any kinds of problems
> for which that makes a real difference?" and "Does any
> conceivable kind of
> mind do any better?" can both be answered affirmatively.

Welll  I agree with only some of this.

The thing is, an AIXI-tl-driven AI embedded in the real world would have a
richer environment to draw on than the impoverished data provided by PD2.
This AI would eventually learn how to model itself and reflect in a rich way
(by learning the right operating program).

However, AIXI-tl is a horribly bad AI algorithm, so it would take a VERY
VERY long time to carry out this learning, of course...

-- Ben

---
To unsubscribe, change your address, or temporarily deactivate your subscription, 
please go to http://v2.listbox.com/member/?[EMAIL PROTECTED]



Re: [agi] Breaking AIXI-tl

2003-02-14 Thread Eliezer S. Yudkowsky
Ben Goertzel wrote:

OK.  Rather than responding point by point, I'll try to say something
compact ;)

You're looking at the interesting scenario of an iterated prisoners' dilemma
between two AIXI-tl's, each of which has a blank operating program at the
start of the iterated prisoners' dilemma.  (In parts of my last reply, I was
questioning the blankness of the operating program, but let's accept it for
sake of discussion.)

The theorems about AIXI-tl do not say much about the performance of AIXI-tl
relative to other systems on this task.  Because what the theorems talk
about is

AIXI-tl maximizing reward function R
versus
System X maximizing reward function R

over a long period of time.  Whereas in your case you're asking about

AIXI-tl maximizing reward function R(AIXI_tl)
versus
System X maximizing reward function R(X)

i.e. the reward function is a function of the system in question.  AIXI-tl
and System X (e.g. an uploaded human) are not competing against the same
opponent, they're competing against different opponents (their clones, in
your scenario).

So, unless I'm overlooking something, you're looking at a scenario not
covered by Hutter's theorems.


That is correct.  As I said:

"An intuitively fair, physically realizable challenge, with important 
real-world analogues, formalizable as a computation which can be fed 
either a tl-bounded uploaded human or an AIXI-tl, for which the human 
enjoys greater success measured strictly by total reward over time, due to 
the superior strategy employed by that human as the result of rational 
reasoning of a type not accessible to AIXI-tl."

Obviously, such a challenge cannot be covered by Hutter's theorems or 
AIXI-tl would outperform the human.  The question is whether Hutter's 
theorems describe all the realistic physical situations a mind can encounter.

You're stating that a human (System X) can do better in an iterated PD
against other humans, than an AIXItl can do in an iterated PD against other
AIXItl's.


That is correct.  Humans (and Friendly AIs) can employ Hofstadterian 
superrationality as a strategy; AIXI-tl cannot.

I still have problems understanding your reasoning, when you derive this
conclusion.  Maybe I'm just being obtuse; I'm sure I haven't spent as much
time thinking about it as you have.

But, suppose you're right.  What you've done is come up with an interesting
observation (and if you formalize it, an interesting theorem) about (small)
social systems of AIXI-tl's.  This is very nice.

Does this somehow tell you something about the interactions of AIXI-tl's
with humans?  Is that the follow-up point you want to make, regarding AIXItl
Friendliness?


Nope.  The two points are, if not completely unrelated, then related only 
on such a deep level that I wasn't explicitly planning to point it out.

But Hofstadterian superrationality - and certain other generalized 
challenges - are physically realizable, important, and can be solved by 
humans because we have superior reflectivity to AIXI-tl.

Your observation is about the behavior of an AIXI/AIXI-tl whose only
life-experience has consisted of a very weird artificial situation.  This
behavior is not going to be the same as the behavior of an AIXI/AIXItl
embedded in a richer environment with a different reward function.  That is
the point I was trying to make with my talk about the "initial operating
program" of the AIXI/AIXItl in your simulation.


Yes.  It is a quite irrelevant point to breaking Hutter's theorem.  Also, 
specifying an AIXI-tl embedded in some unspecified prior environment 
injects entropy into the problem description.  I show that AIXI-tl does 
learn to recognize its own reflected "way of thinking" (as opposed to its 
reflected actions, which AIXI-tl *can* recognize) because AIXI-tl cannot, 
as a human would, remember its own way of thinking or deliberately "place 
itself into the other human's shoes" and simulate its own way of thinking 
given different goals, both abilities available at the human level of 
reflectivity; AIXI-tl can only place itself into the shoes of tl-bounded 
processes.  This prohibits AIXI-tl from using Hofstadterian 
superrationality to notice that the policies of other entities correlate 
with its own policies, and prohibits AIXI-tl from choosing the policies of 
other entities by selecting its own policies based on the knowledge that 
the policy the other entity chooses will correlate with its own.  There 
are additional Other Mind correlation problems that humans can't solve but 
seed AIs can because of the seed AI's superior reflectivity; the point is 
that there's a real kind of intelligence here of which AIXI-tl arguably 
has quantity zero.

Now, let me get back to my problems understanding your reasoning.  Consider
the problem

PD(Y) =
"System Y plays iterated PD against a clone of System Y"

Clearly, PD(Y) is not a problem at which one would expect more intelligent
systems to necessarily perform better than less intelligent ones!!


True, bu

RE: [agi] Breaking AIXI-tl

2003-02-14 Thread Ben Goertzel

> Really, when has a computer (with the exception of certain Microsoft
> products) ever been able to disobey it's human masters?
>
> It's easy to get caught up in the romance of "superpowers", but come on,
> there's nothing to worry about.
>
> -Daniel

Hi Daniel,

Clearly there is nothing to worry about TODAY.

And I'm spending the vast bulk of my time working on practical AI design and
engineering and application work, not on speculating about the future.

However, I do believe that once AI tech has advanced far enough, there WILL
be something to worry about.

How close we are to this point is another question.

Current AI practice is very far away from achieving autonomous general
intelligence.

If I'm right about the potential of Novamente and similar designs, we could
be within a decade of getting there

If I'm wrong, well, Kurzweil has made some decent arguments why we'll get
there by 2050 or so... ;-)

-- Ben Goertzel

---
To unsubscribe, change your address, or temporarily deactivate your subscription, 
please go to http://v2.listbox.com/member/?[EMAIL PROTECTED]



RE: [agi] Breaking AIXI-tl

2003-02-14 Thread Ben Goertzel

OK.  Rather than responding point by point, I'll try to say something
compact ;)

You're looking at the interesting scenario of an iterated prisoners' dilemma
between two AIXI-tl's, each of which has a blank operating program at the
start of the iterated prisoners' dilemma.  (In parts of my last reply, I was
questioning the blankness of the operating program, but let's accept it for
sake of discussion.)

The theorems about AIXI-tl do not say much about the performance of AIXI-tl
relative to other systems on this task.  Because what the theorems talk
about is

AIXI-tl maximizing reward function R
versus
System X maximizing reward function R

over a long period of time.  Whereas in your case you're asking about

AIXI-tl maximizing reward function R(AIXI_tl)
versus
System X maximizing reward function R(X)

i.e. the reward function is a function of the system in question.  AIXI-tl
and System X (e.g. an uploaded human) are not competing against the same
opponent, they're competing against different opponents (their clones, in
your scenario).

So, unless I'm overlooking something, you're looking at a scenario not
covered by Hutter's theorems.
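
To spell out the distinction in the two displays above, here is a minimal Python
sketch with a plain iterated PD standing in for the general task (all names are
invented; this is an illustration of the two comparison setups, nothing more):

from typing import Callable, List, Tuple

Agent = Callable[[List[str]], str]   # history of the opponent's moves -> "C" or "D"
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}

def total_reward(agent: Agent, opponent: Agent, rounds: int = 100) -> int:
    my_history, their_history, total = [], [], 0
    for _ in range(rounds):
        a = agent(their_history)
        b = opponent(my_history)
        my_history.append(a)
        their_history.append(b)
        total += PAYOFF[(a, b)]
    return total

def hutter_style(agent_a: Agent, agent_b: Agent, mu: Agent) -> Tuple[int, int]:
    # The setting the optimality theorems cover: both systems face the SAME opponent mu.
    return total_reward(agent_a, mu), total_reward(agent_b, mu)

def clone_style(agent_a: Agent, agent_b: Agent) -> Tuple[int, int]:
    # The setting in the clone scenario: each system faces a copy of ITSELF,
    # i.e. the reward is R(AIXI_tl) for one system and R(X) for the other.
    return total_reward(agent_a, agent_a), total_reward(agent_b, agent_b)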

You're stating that a human (System X) can do better in an iterated PD
against other humans, than an AIXItl can do in an iterated PD against other
AIXItl's.

I still have problems understanding your reasoning, when you derive this
conclusion.  Maybe I'm just being obtuse; I'm sure I haven't spent as much
time thinking about it as you have.

But, suppose you're right.  What you've done is come up with an interesting
observation (and if you formalize it, an interesting theorem) about (small)
social systems of AIXI-tl's.  This is very nice.

Does this somehow tell you something about the interactions of AIXI-tl's
with humans?  Is that the follow-up point you want to make, regarding AIXItl
Friendliness?

Your observation is about the behavior of an AIXI/AIXI-tl whose only
life-experience has consisted of a very weird artificial situation.  This
behavior is not going to be the same as the behavior of an AIXI/AIXItl
embedded in a richer environment with a different reward function.  That is
the point I was trying to make with my talk about the "initial operating
program" of the AIXI/AIXItl in your simulation.

Now, let me get back to my problems understanding your reasoning.  Consider
the problem

PD(Y) =
"System Y plays iterated PD against a clone of System Y"

Clearly, PD(Y) is not a problem at which one would expect more intelligent
systems to necessarily perform better than less intelligent ones!!

Now consider two subproblems

PD1(Y) =
PD(Y), but each System Y knows it's playing a clone of itself

PD2(Y) =
PD(Y), but each System Y is playing a completely unidentified, mysterious
opponent

I'm worried that in your comparison, you have the human upload playing PD1,
but have the AIXI-tl playing PD2.

PD1 is easier, but PD1 doesn't seem to be your scenario, because it requires
the AIXItl not to be starting with a blank operating program.

Or do you have them both playing PD2?

If a human is playing PD2, then it has to proceed solely by time series
analysis, and its actions are probably going to meander around chaotically
until settling on some attractor (or maybe they'll just meander around...).
MAYBE the human manages to recognize that the responses of its opponent are
so similar to its own responses that its opponent must be a lot like
itself... and this helps it settle on a beneficial attractor.

If an AIXItl is playing PD2, the situation is pretty much the same as if the
human is doing so, isn't it?  Except can't you argue that an AIXItl is so
smart that in the long run it's more likely than a human to figure out that
its opponent is acting a lot like it is, and make a guess that symmetrical
friendly behavior might be a good thing?
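
A toy version of that "its opponent must be a lot like itself" recognition step
might look like the following; it is purely illustrative time-series bookkeeping,
not anything drawn from the AIXI-tl formalism.

from typing import List

def looks_like_my_clone(my_moves: List[str], their_moves: List[str],
                        threshold: float = 0.9) -> bool:
    # Crude test: how often has the mysterious opponent played exactly what I
    # played in the same round?  A long run of matches is evidence (not proof)
    # that the opponent's policy tracks my own.
    if not my_moves:
        return False
    matches = sum(mine == theirs for mine, theirs in zip(my_moves, their_moves))
    return matches / len(my_moves) >= threshold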


-- Ben

> -Original Message-
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]On
> Behalf Of Eliezer S. Yudkowsky
> Sent: Friday, February 14, 2003 1:45 AM
> To: [EMAIL PROTECTED]
> Subject: Re: [agi] Breaking AIXI-tl
>
>
> Ben Goertzel wrote:
>  >
>  >> Because AIXI-tl is not an entity deliberately allocating computing
>  >> power; its control process is fixed.  AIXI-tl will model a process
>  >> that proves theorems about AIXI-tl only if that process is the best
>  >> predictor of the environmental information seen so far.
>  >
>  > Well... a human's control process is fixed too, in a way.  We cannot
>  > rewire our brains, our biological motivators.  And a human will
>  > accurately model other humans only if its fixed motivators have
>  > (directly or indirectly) led it to do so...
>
> I think you're anthropomorphizing AIXI.  (I think you're
> 

RE: [agi] Breaking AIXI-tl

2003-02-14 Thread Daniel Colonnese
> There is a lot of variation in human
> psychology, and some humans are pretty damn dangerous.  Also there is the
> maxim "power corrupts, and absolute power corrupts absolutely" which tells
> you something about human psychology.  A human with superintelligence and
> superpowers could be a great thing or a terrible thing -- it's hard to
> balance this unknown outcome against the unknown outcome of an AGI.

Really, when has a computer (with the exception of certain Microsoft
products) ever been able to disobey its human masters?

It's easy to get caught up in the romance of "superpowers", but come on,
there's nothing to worry about.

-Daniel

*
 Daniel Colonnese
 Computer Science Dept. NCSU
 2718 Clark Street
 Raleigh NC 27670
 Voice: (919) 451-3141
 Fax: (775) 361-4495
 http://www4.ncsu.edu:8030/~dcolonn/
* 



---
To unsubscribe, change your address, or temporarily deactivate your subscription, 
please go to http://v2.listbox.com/member/?[EMAIL PROTECTED]



RE: [agi] Breaking AIXI-tl

2003-02-14 Thread Ben Goertzel


Hi Eliezer

Some replies to "side points":

> This is a critical class of problem for would-be implementors of
> Friendliness.  If all AIs, regardless of their foundations, did sort of
> what humans would do, given that AI's capabilities, the whole world would
> be a *lot* safer.

Hmmm.  I don't believe that.  There is a lot of variation in human
psychology, and some humans are pretty damn dangerous.  Also there is the
maxim "power corrupts, and absolute power corrupts absolutely" which tells
you something about human psychology.  A human with superintelligence and
superpowers could be a great thing or a terrible thing -- it's hard to
balance this unknown outcome against the unknown outcome of an AGI.

>  > In this way Novamentes will be more like humans, but with the
>  > flexibility to change their hard-wired motivators as well, if they
>  > REALLY want to...
>
> And what they do with that flexibility will be totally unlike what you
> would do in that situation,

Well, yeah  Of course.  Novamente is not a model of the human
brain-mind, and its behavior will almost always be different than that of
humans.

Ethically speaking, I don't consider human behavior a tremendously great
model anyway.  Read the damn newspaper!!  We are quite possibly on a path to
self-destruction through rampant unethical violent behavior...

> The task of AGI is not to see that the computers in front of us
> "could" do
> something, but to figure out what are the key differences that we must
> choose among to make them actually do it.  This holds for Friendliness as
> well.  That's why I worry when you see Friendliness in AIXI that isn't
> there.  AIXI "could" be Friendly, in the sense that it is capable of
> simulating Friendly minds; and it's possible to toss off a loose argument
> that AIXI's control process will arrive at Friendliness.  But AIXI will
> not end up being Friendly, no matter what the pattern of inputs and
> rewards.  And what I'm afraid of is that neither will Novamente.

Well, first of all, there is not terribly much relation between AIXI/AIXItl and
Novamente, so what you show about the former system means very little about
the latter.

As for the Friendliness of AIXI/AIXItl, it is obvious that an AIXI/AIXItl
system will never have a deepest-level implicit or explicit supergoal that
is *ethical*.  Its supergoal is just to maximize its reward.  Period.  So it
can act beneficially to humans for an arbitrarily long period of time, if
its reward structure has been set up that way.

By positing an AIXI/AIXItl system that is connected with a specific reward
mechanism (e.g. a button pushed by humans, an electronic sensor that is part
of a robot body, etc.) you are then positing something beyond vanilla
AIXI/AIXItl: you're positing an AIXI/AIXItl that is embedded in the world in
some way.

The notion of Friendliness does not exist on the level of pure, abstract
AIXI/AIXItl, does it?  It exists on the level of world-embedded AIXI/AIXItl.
And once you're looking at world-embedded AIXI/AIXItl, you no longer have a
purely formal characterization of AIXI/AIXItl, do you?

Ben

---
To unsubscribe, change your address, or temporarily deactivate your subscription, 
please go to http://v2.listbox.com/member/?[EMAIL PROTECTED]



Re: [agi] Breaking AIXI-tl

2003-02-14 Thread Bill Hibbard
On Fri, 14 Feb 2003, Eliezer S. Yudkowsky wrote:

> Ben Goertzel wrote:
> . . .
>  >> Lee Corbin can work out his entire policy in step (2), before step
>  >> (3) occurs, knowing that his synchronized other self - whichever one
>  >> he is - is doing the same.
>  >
>  > OK -- now, if AIXItl were starting out with the right program, it could
>  > do this too, because the program could reason "that other AIXItl is
>  > gonna do the same thing as me, so based on this knowledge, what should
>  > I do"
>
> It *could* do this but it *doesn't* do this.  Its control process is such
> that it follows an iterative trajectory through chaos which is forbidden
> to arrive at a truthful solution, though it may converge to a stable
> attractor.
> . . .

This is the heart of the fallacy. Neither a human nor an AIXI
can know "that his synchronized other self - whichever one
he is - is doing the same". All a human or an AIXI can know is
its observations. They can estimate but not know the intentions
of other minds.

Bill

---
To unsubscribe, change your address, or temporarily deactivate your subscription, 
please go to http://v2.listbox.com/member/?[EMAIL PROTECTED]



Re: [agi] Breaking AIXI-tl

2003-02-13 Thread Eliezer S. Yudkowsky
Ben Goertzel wrote:
>
>> Because AIXI-tl is not an entity deliberately allocating computing
>> power; its control process is fixed.  AIXI-tl will model a process
>> that proves theorems about AIXI-tl only if that process is the best
>> predictor of the environmental information seen so far.
>
> Well... a human's control process is fixed too, in a way.  We cannot
> rewire our brains, our biological motivators.  And a human will
> accurately model other humans only if its fixed motivators have
> (directly or indirectly) led it to do so...

I think you're anthropomorphizing AIXI.  (I think you're 
anthropomorphizing Novamente too, btw, but I have complete access to 
AIXI's formalism, so only there can I actually show your intuitions to be 
wrong.)  You said that AIXI-tl *could in theory* model something.  AIXI-tl 
*would not in fact* model that thing, given its control process.  While 
humans *would in fact* model that thing, given theirs.  I am not arguing 
about fixed versus unfixed control processes but pointing out that the 
specific human control process is superior to AIXI-tl.

(For those of you on the list who are not aware that I am not an AI 
skeptic, this is not a Penrosian argument against the computational 
implementation of intelligence, it's an argument against the AIXI-tl 
Cartesian formalism for intelligence.)

You are "anthropomorphizing" AIXI in the sense that you expect AIXI to do 
what you would do given AIXI's raw capabilities, but it's possible to look 
at AIXI's control process and see that it does not, in fact, do that.

This is a critical class of problem for would-be implementors of 
Friendliness.  If all AIs, regardless of their foundations, did sort of 
what humans would do, given that AI's capabilities, the whole world would 
be a *lot* safer.

> Of course, humans are very different from AIXI-tl, because in humans
> there is a gradation from totally hard-wired to totally
> ephemeral/flexible, whereas in AIXI-tl there's a rigid dichotomy
> between the hard-wired control program and the ephemeral operating
> program.
>
> In this way Novamentes will be more like humans, but with the
> flexibility to change their hard-wired motivators as well, if they
> REALLY want to...

And what they do with that flexibility will be totally unlike what you 
would do in that situation, unless you understand the sensitive 
dependencies between a mind's foundations and how that mind behaves.

You expect AIXI to behave like Novamente, and you expect both to behave 
like a human mind.  You are mistaken with respect to both AIXI and 
Novamente, but I can only demonstrate it for AIXI.  (Please don't reply 
with a list of differences you perceive between AIXI/Novamente/humans; I 
know you perceive *some* differences.)

>> Lee Corbin can work out his entire policy in step (2), before step
>> (3) occurs, knowing that his synchronized other self - whichever one
>> he is - is doing the same.
>
> OK -- now, if AIXItl were starting out with the right program, it could
> do this too, because the program could reason "that other AIXItl is
> gonna do the same thing as me, so based on this knowledge, what should
> I do"

It *could* do this but it *doesn't* do this.  Its control process is such 
that it follows an iterative trajectory through chaos which is forbidden 
to arrive at a truthful solution, though it may converge to a stable 
attractor.

> But you seem to be assuming that
>
> a) the Lee Corbin starts out with a head full of knowledge achieved
> through experience

That is correct.  AIXI-tl is supposed to equal or surpass *any* specific 
tl-bounded program given enough time.  I could give Lee Corbin a computer 
implant.  I could put AIXI-tl up against a tl-bounded superintelligence. 
AIXI-tl is still supposed to win.  You are applying anthropomorphic 
reasoning ("a head full of knowledge achieved through experience") to a 
formally specified problem.

> b) the AIXItl starts out without a reasonable operating program, and
> has to learn everything from scratch during the experiment

That is not formally a problem if the experiment lasts long enough.  Also, 
please note that being armed with the capability to simulate 2^l programs 
tl-bounded to THE SIZE OF AN ENTIRE HUMAN MIND is, anthropomorphically 
speaking, one HELL of a capability.  This capability is supposed to equal 
or overwhelm Corbin's regardless of what "knowledge is stuffed into his head".

> What if you used, for the competition a Lee Corbin with a tabula rasa
> brain, an infant Lee Corbin.  It wouldn't perform very well, as it
> wouldn't even understand the competition.

Again: anthropomorphic reasoning about a formally specifiable problem. 
Lee Corbin is tl-bounded therefore the contest is fair.  If the contest 
goes on long enough AIXI-tl should win or, at worst, lose by a bounded amount.

> Of course, if you put a knowledgeable human up against a new baby
> AIXI-tl, the knowledgeable human can win an intelligence contest.  You
> don't ne

RE: [agi] Breaking AIXI-tl

2003-02-13 Thread Ben Goertzel

Eliezer,

I will print your message and read it more slowly tomorrow morning when my
brain is better rested.

But I can't resist some replies now, albeit on 4 hours of sleep ;)

> Because AIXI-tl is not an entity deliberately allocating computing power;
> its control process is fixed.  AIXI-tl will model a process that proves
> theorems about AIXI-tl only if that process is the best predictor of the
> environmental information seen so far.

Well... a human's control process is fixed too, in a way.  We cannot rewire
our brains, our biological motivators.  And a human will accurately model
other humans only if its fixed motivators have (directly or indirectly)
led it to do so...

Of course, humans are very different from AIXI-tl, because in humans there
is a gradation from totally hard-wired to totally ephemeral/flexible,
whereas
in AIXI-tl there's a rigid dichotomy between the hard-wired control program
and the ephemeral operating program.

In this way Novamentes will be more like humans, but with the flexibility to
change their hard-wired motivators as well, if they REALLY want to...


[snipped out description of problem scenario]

> Lee Corbin can work out his entire policy in step (2), before step (3)
> occurs, knowing that his synchronized other self - whichever one he is -
> is doing the same.

OK -- now, if AIXItl were starting out with the right program, it could do
this too, because the program could reason "that other AIXItl is gonna do
the same thing as me, so based on this knowledge, what should I do"

But you seem to be assuming that

a) the Lee Corbin starts out with a head full of knowledge achieved through
experience

b) the AIXItl starts out without a reasonable operating program, and has
to learn everything from scratch during the experiment

What if you used, for the competition a Lee Corbin with a tabula rasa brain,
an infant Lee Corbin.  It wouldn't perform very well, as it wouldn't
even understand the competition.

Of course, if you put a knowledgeable human up against a new baby AIXI-tl,
the knowledgeable human can win an intelligence contest.  You don't need
the Prisoner's Dilemma to prove this.  Just ask them both what 2+2 equals.
The baby AIXI-tl will have no way to know.

Now, if you give the AIXI-tl enough time and experience to learn about
Prisoners Dilemma situations -- or, to learn about selves and minds and
computer systems -- then it will evolve an operating program that knows
how to reason somewhat like a human does, with concepts like "that other
AIXI-tl is just like me, so it will think and act like I do."


> The major point is as follows:  AIXI-tl is unable to arrive at a valid
> predictive model of reality because the sequence of inputs it sees, on
> successive rounds, are being produced by AIXI-tl trying to model the
> inputs using tl-bounded programs, while in fact those inputs are really
> the outputs of the non-tl-bounded AIXI-tl.  If a tl-bounded program
> correctly predicts the inputs seen so far, it will be using some
> inaccurate model of the actual reality, since no tl-bounded program can
> model the actual computational process AIXI-tl uses to select outputs.

Yah, but Lee Corbin can't model (in perfect detail) the actual computational
process the other Lee Corbin uses to select outputs, either.  So what?


> Humans can use a naturalistic representation of a reality in which they
> are embedded, rather than being forced like AIXI-tl to reason about a
> separated environment; consequently humans are capable of rationally
> reasoning about correlations between their internal mental processes and
> other parts of reality, which is the key to the complex cooperation
> problem with your own clone - the realization that you can actually
> *decide* your clone's actions in step (2), if you make the right
> agreements with yourself and keep them.

I don't see why an AIXI-tl with a clever operating program coming into the
competition couldn't make the same realization that a human does.

So your argument is that a human baby mind exposed ONLY to prisoners' dilemma
interactions as its environment would somehow learn to "realize it can decide
its clone's actions", whereas a baby AIXI-tl mind exposed only to these
interactions cannot carry out this learning?

> (b)  This happens because of a hidden assumption built into the
> formalism,
> wherein AIXI devises a Cartesian model of a separated environmental
> theatre, rather than devising a model of a naturalistic reality that
> includes AIXI.

It seems to me this has to do with the nature of AIXI-tl's operating
program.

With the right operating program, AIXI-tl would model reality in a way that
included AIXI-tl.  It would do so only if this operating program were useful
to it...

For example, if you wrapped up AIXI-tl in a body with skin and actuators and
sensors, it would find that modeling the world as containing AIXI-tl was a
very useful strategy.  Just as baby humans find that modeling the world as
containing ba

Re: [agi] Breaking AIXI-tl

2003-02-13 Thread Eliezer S. Yudkowsky
Ben Goertzel wrote:
> Eliezer,
>
>> A (selfish) human upload can engage in complex cooperative strategies
>> with an exact (selfish) clone, and this ability is not accessible to
>> AIXI-tl, since AIXI-tl itself is not tl-bounded and therefore cannot
>> be simulated by AIXI-tl, nor does AIXI-tl have any means of
>> abstractly representing the concept "a copy of myself".  Similarly,
>> AIXI is not computable and therefore cannot be simulated by AIXI.
>> Thus both AIXI and AIXI-tl break down in dealing with a physical
>> environment that contains one or more copies of them.  You might say
>> that AIXI and AIXI-tl can both do anything except recognize
>> themselves in a mirror.
>
> I disagree with the bit about 'nor does AIXI-tl have any means of
> abstractly representing the concept "a copy of myself".'
>
> It seems to me that AIXI-tl is capable of running programs that contain
> such an abstract representation.  Why not?  If the parameters are
> right, it can run programs vastly more complex than a human brain
> upload...
>
> For example, an AIXI-tl can run a program that contains the AIXI-tl
> algorithm, as described in Hutter's paper, with t and l left as free
> variables.  This program can then carry out reasoning using predicate
> logic, about AIXI-tl in general, and about AIXI-tl for various values
> of t and l.
>
> Similarly, AIXI can run a program that contains a mathematical
> description of AIXI similar to the one in Hutter's paper.  This program
> can then prove theorems about AIXI using predicate logic.
>
> For instance, if AIXI were rewarded for proving math theorems about
> AGI, eventually it would presumably learn to prove theorems about AIXI,
> extending Hutter's theorems and so forth.

Yes, AIXI can indeed prove theorems about AIXI better than any human. 
AIXI-tl can prove theorems about AIXI-tl better than any tl-bounded human. 
 AIXI-tl can model AIXI-tl as well as any tl-bounded human.  AIXI-tl can 
model a tl-bounded human, say Lee Corbin, better than any tl-bounded 
human; given deterministic physics it's possible AIXI-tl can model Lee 
Corbin better than Lee Corbin (although I'm not quite as sure of this). 
But AIXI-tl can't model an AIXI-tl in the same way that a Corbin-tl can 
model a Corbin-tl.  See below.

>> The simplest case is the one-shot Prisoner's Dilemma against your own
>> exact clone.  It's pretty easy to formalize this challenge as a
>> computation that accepts either a human upload or an AIXI-tl.  This
>> obviously breaks the AIXI-tl formalism.  Does it break AIXI-tl?  This
>> question is more complex than you might think.  For simple problems,
>> there's a nonobvious way for AIXI-tl to stumble onto incorrect
>> hypotheses which imply cooperative strategies, such that these
>> hypotheses are stable under the further evidence then received.  I
>> would expect there to be classes of complex cooperative problems in
>> which the chaotic attractor AIXI-tl converges to is suboptimal, but I
>> have not proved it.  It is definitely true that the physical problem
>> breaks the AIXI formalism and that a human upload can
>> straightforwardly converge to optimal cooperative strategies based on
>> a model of reality which is more correct than any AIXI-tl is capable
>> of achieving.
>>
>> Ultimately AIXI's decision process breaks down in our physical
>> universe because AIXI models an environmental reality with which it
>> interacts, instead of modeling a naturalistic reality within which it
>> is embedded. It's one of two major formal differences between AIXI's
>> foundations and Novamente's.  Unfortunately there is a third
>> foundational difference between AIXI and a Friendly AI.
>
> I don't agree at all.
>
> In a Prisoner's Dilemma between two AIXI-tl's, why can't each one run a
>  program that:
>
> * uses an abstract mathematical representation of AIXI-tl, similar to
> the one given in the Hutter paper * use predicate logic to prove
> theorems about the behavior of the other AIXI-tl

Because AIXI-tl is not an entity deliberately allocating computing power; 
its control process is fixed.  AIXI-tl will model a process that proves 
theorems about AIXI-tl only if that process is the best predictor of the 
environmental information seen so far.

Let's say the primary AIXI-tl, the one whose performance we're tracking, 
is facing a complex cooperative problem.  Within each round, the challenge 
protocol is as follows.

1)  The Primary testee is cloned - that is, the two testees are 
resynchronized at the start of each new round.  This is why Lee Corbin is 
the human upload (i.e., to avoid moral issues).  We will assume that the 
Secondary testee, if a human upload, continues to attempt to maximize 
rational reward despite impending doom; again, this is why we're using Lee 
Corbin.

2)  Each party, the Primary and the Secondary (the Secondary being 
re-cloned on each round) are shown an identical map of the next 
cooperative complex problem.  For example, this might be a set of 
billiards

Re: [agi] Breaking AIXI-tl

2003-02-12 Thread Bill Hibbard
Hi Eliezer,

> An intuitively fair, physically realizable challenge, with important
> real-world analogues, formalizable as a computation which can be fed
> either a tl-bounded uploaded human or an AIXI-tl, for which the human
> enjoys greater success measured strictly by total reward over time, due to
> the superior strategy employed by that human as the result of rational
> reasoning of a type not accessible to AIXI-tl.
>
> Roughly speaking:
>
> A (selfish) human upload can engage in complex cooperative strategies with
> an exact (selfish) clone, and this ability is not accessible to AIXI-tl,
> since AIXI-tl itself is not tl-bounded and therefore cannot be simulated
> by AIXI-tl, nor does AIXI-tl have any means of abstractly representing the
> concept "a copy of myself".  Similarly, AIXI is not computable and
> therefore cannot be simulated by AIXI.  Thus both AIXI and AIXI-tl break
> down in dealing with a physical environment that contains one or more
> copies of them.  You might say that AIXI and AIXI-tl can both do anything
> except recognize themselves in a mirror.

Why do you require an AIXI or AIXI-tl to simulate itself, when
humans cannot? A human cannot know that another human is an
exact clone of itself. All humans or AIXIs can know is what
they observe. They cannot know that another mind is identical.

> The simplest case is the one-shot Prisoner's Dilemma against your own
> exact clone.  It's pretty easy to formalize this challenge as a
> computation that accepts either a human upload or an AIXI-tl.  This
> obviously breaks the AIXI-tl formalism.  Does it break AIXI-tl?  This
> question is more complex than you might think.  For simple problems,
> there's a nonobvious way for AIXI-tl to stumble onto incorrect hypotheses
> which imply cooperative strategies, such that these hypotheses are stable
> under the further evidence then received.  I would expect there to be
> classes of complex cooperative problems in which the chaotic attractor
> AIXI-tl converges to is suboptimal, but I have not proved it.  It is
> definitely true that the physical problem breaks the AIXI formalism and
> that a human upload can straightforwardly converge to optimal cooperative
> strategies based on a model of reality which is more correct than any
> AIXI-tl is capable of achieving.

Given that humans can only know what they observe, and
thus cannot know what is going on inside another mind,
humans are on the same footing as AIXIs in the Prisoner's
Dilemma. I suspect that two AIXIs or AIXI-tl's will do
well at the game, since a strategy with betrayal probably
needs a longer program than a strategy without betrayal,
and the AIXI will weight a shorter model of the other's
behavior more strongly.
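A back-of-the-envelope sketch of that weighting, in Python; the
2^-length prior is the usual Solomonoff-style weighting, and the
"description lengths" below are invented for illustration, not
measured complexities:

    # Shorter models of the other player's behaviour get
    # exponentially more prior weight under a 2^-length prior.

    def prior_weight(length_bits):
        return 2.0 ** -length_bits

    candidate_models = {                  # hypothetical lengths
        "always_cooperate": 12,           # no extra machinery
        "cooperate_then_betray": 19,      # must encode the betrayal
    }

    weights = {name: prior_weight(bits)
               for name, bits in candidate_models.items()}
    total = sum(weights.values())
    print({name: round(w / total, 3) for name, w in weights.items()})
    # {'always_cooperate': 0.992, 'cooperate_then_betray': 0.008}

Whether betrayal-free strategies really are shorter in the relevant
problem class is, of course, the open question here.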

> Ultimately AIXI's decision process breaks down in our physical universe
> because AIXI models an environmental reality with which it interacts,
> instead of modeling a naturalistic reality within which it is embedded.
> It's one of two major formal differences between AIXI's foundations and
> Novamente's.  Unfortunately there is a third foundational difference
> between AIXI and a Friendly AI.

I will grant you one thing: that since an AIXI cannot
exist and an AIXI-tl is too slow to be practical, using
them as a basis for discussing safe AGIs is a bit futile.

The other problem is that an AIXI's optimality is only as
valid as its assumed probability distribution over
universal Turing machine programs.
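For concreteness, the assumption in question is roughly the
universal mixture below (written loosely; Hutter's paper
states it with more care):

    M(x) \;=\; \sum_{p \,:\, U(p) = x*} 2^{-\ell(p)}

where U is a fixed universal monotone Turing machine, p ranges
over programs whose output begins with x, and \ell(p) is the
length of p in bits.  A different choice of U changes the
weights, though only up to a machine-dependent constant factor
overall, and that choice is exactly the assumption at issue.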

Cheers,
Bill
--
Bill Hibbard, SSEC, 1225 W. Dayton St., Madison, WI  53706
[EMAIL PROTECTED]  608-263-4427  fax: 608-263-6738
http://www.ssec.wisc.edu/~billh/vis.html




Re: [agi] Breaking AIXI-tl

2003-02-12 Thread Shane Legg
Eliezer S. Yudkowsky wrote:


Has the problem been thought up just in the sense of "What happens when 
two AIXIs meet?" or in the formalizable sense of "Here's a computational 
challenge C on which a tl-bounded human upload outperforms AIXI-tl?"

I don't know of anybody else considering "human upload" vs. AIXI.

Cheers
Shane



RE: [agi] Breaking AIXI-tl

2003-02-12 Thread Ben Goertzel

Eliezer,

> A (selfish) human upload can engage in complex cooperative
> strategies with
> an exact (selfish) clone, and this ability is not accessible to AIXI-tl,
> since AIXI-tl itself is not tl-bounded and therefore cannot be simulated
> by AIXI-tl, nor does AIXI-tl have any means of abstractly
> representing the
> concept "a copy of myself".  Similarly, AIXI is not computable and
> therefore cannot be simulated by AIXI.  Thus both AIXI and AIXI-tl break
> down in dealing with a physical environment that contains one or more
> copies of them.  You might say that AIXI and AIXI-tl can both do anything
> except recognize themselves in a mirror.

I disagree with the bit about 'nor does AIXI-tl have any means of abstractly
representing the concept "a copy of myself".'

It seems to me that AIXI-tl is capable of running programs that contain such
an abstract representation.  Why not?  If the parameters are right, it can
run programs vastly more complex than a human brain upload...

For example, an AIXI-tl can run a program that contains the AIXI-tl
algorithm, as described in Hutter's paper, with t and l left as free
variables.  This program can then carry out reasoning, using predicate logic,
about AIXI-tl in general and about AIXI-tl for various values of t and l.
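A minimal sketch of what such an abstract representation might look like (my
illustration only, with hypothetical names; the per-cycle figure is the usual
order-of-magnitude statement from Hutter's paper, ignoring bookkeeping):

    # An abstract, parameterised description of AIXI-tl: something a bounded
    # program can carry around and reason about without ever simulating one.

    from dataclasses import dataclass

    @dataclass(frozen=True)
    class AIXItlSpec:
        t: int   # time bound per candidate program, per cycle
        l: int   # length bound on candidate programs, in bits

        def cycle_work_upper_bound(self) -> int:
            """Roughly t * 2^l candidate-steps per interaction cycle."""
            return self.t * (2 ** self.l)

    # "Reasoning about AIXI-tl for various values of t and l" then amounts
    # to reasoning over this description rather than running it:
    small, big = AIXItlSpec(t=10**3, l=16), AIXItlSpec(t=10**6, l=32)
    assert big.cycle_work_upper_bound() > small.cycle_work_upper_bound()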

Similarly, AIXI can run a program that contains a mathematical description
of AIXI similar to the one in Hutter's paper.  This program can then prove
theorems about AIXI using predicate logic.

For instance, if AIXI were rewarded for proving math theorems about AGI,
eventually it would presumably learn to prove theorems about AIXI, extending
Hutter's theorems and so forth.

> The simplest case is the one-shot Prisoner's Dilemma against your own
> exact clone.  It's pretty easy to formalize this challenge as a
> computation that accepts either a human upload or an AIXI-tl.  This
> obviously breaks the AIXI-tl formalism.  Does it break AIXI-tl?  This
> question is more complex than you might think.  For simple problems,
> there's a nonobvious way for AIXI-tl to stumble onto incorrect hypotheses
> which imply cooperative strategies, such that these hypotheses are stable
> under the further evidence then received.  I would expect there to be
> classes of complex cooperative problems in which the chaotic attractor
> AIXI-tl converges to is suboptimal, but I have not proved it.  It is
> definitely true that the physical problem breaks the AIXI formalism and
> that a human upload can straightforwardly converge to optimal cooperative
> strategies based on a model of reality which is more correct than any
> AIXI-tl is capable of achieving.
>
> Ultimately AIXI's decision process breaks down in our physical universe
> because AIXI models an environmental reality with which it interacts,
> instead of modeling a naturalistic reality within which it is embedded.
> It's one of two major formal differences between AIXI's foundations and
> Novamente's.  Unfortunately there is a third foundational difference
> between AIXI and a Friendly AI.

I don't agree at all.

In a Prisoner's Dilemma between two AIXI-tl's, why can't each one run a
program that:

* uses an abstract mathematical representation of AIXI-tl, similar to the
one given in the Hutter paper
* uses predicate logic to prove theorems about the behavior of the other
AIXI-tl

How is this so different from what two humans do when reasoning about each
other's behavior?  A given human cannot contain within itself a detailed
model of its own clone; in practice, when a human reasons about the behavior
of its clone, it is going to use some abstract representation of that clone,
and do some precise or uncertain reasoning based on this abstract
representation.
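A minimal sketch of that kind of abstract clone-reasoning, in Python (my
illustration, not anything from Hutter's paper): the agent never simulates
its copy, it just uses the premise that an exact copy running the same
decision procedure on the same inputs returns the same move, and compares
the diagonal payoffs:

    # One-shot Prisoner's Dilemma against an exact clone, reasoned about
    # abstractly via symmetry rather than by simulating the clone.

    PAYOFF = {  # (my move, their move) -> my reward, standard PD ordering
        ("C", "C"): 3, ("C", "D"): 0,
        ("D", "C"): 5, ("D", "D"): 1,
    }

    def choose_against_exact_clone():
        # Symmetry premise: the clone's move equals mine, so only the
        # diagonal outcomes (C,C) and (D,D) are reachable.
        return max(("C", "D"), key=lambda move: PAYOFF[(move, move)])

    print(choose_against_exact_clone())  # 'C' -- cooperation wins the diagonal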

-- Ben G







Re: [agi] Breaking AIXI-tl

2003-02-12 Thread Eliezer S. Yudkowsky
Shane Legg wrote:


Eliezer,

Yes, this is a clever argument.  This problem with AIXI has been
thought up before but only appears, at least as far as I know, in
material that is currently unpublished.  I don't know if anybody
has analysed the problem in detail as yet... but it certainly is
a very interesting question to think about:

What happens when two super intelligent AIXIs meet?


"SI-AIXI" is redundant; all AIXIs are enormously far beyond 
superintelligent.  As for the problem, the obvious answer is that no 
matter what strange things happen, an AIXI^2 which performs Solomonoff^2 
induction, using the universal prior of strings output by first-order 
Oracle machines, will come up with the best possible strategy for handling 
it...

Has the problem been thought up just in the sense of "What happens when 
two AIXIs meet?" or in the formalizable sense of "Here's a computational 
challenge C on which a tl-bounded human upload outperforms AIXI-tl?"

--
Eliezer S. Yudkowsky  http://singinst.org/
Research Fellow, Singularity Institute for Artificial Intelligence



Re: [agi] Breaking AIXI-tl

2003-02-12 Thread Shane Legg

Eliezer,

Yes, this is a clever argument.  This problem with AIXI has been
thought up before but only appears, at least as far as I know, in
material that is currently unpublished.  I don't know if anybody
has analysed the problem in detail as yet... but it certainly is
a very interesting question to think about:

What happens when two super intelligent AIXIs meet?

I'll have to think about this for a while before I reply.

Also, you mentioned that it was in your opinion trivial to see that
an AIXI-type system would turn into an unfriendly AI.  I'm still
interested to see this argument spelled out, especially if you think
it's a relatively simple argument.

Cheers
Shane


Eliezer S. Yudkowsky wrote:

Okay, let's see, I promised:

An intuitively fair, physically realizable challenge, with important 
real-world analogues, formalizable as a computation which can be fed 
either a tl-bounded uploaded human or an AIXI-tl, for which the human 
enjoys greater success measured strictly by total reward over time, due 
to the superior strategy employed by that human as the result of 
rational reasoning of a type not accessible to AIXI-tl.

Roughly speaking:

A (selfish) human upload can engage in complex cooperative strategies 
with an exact (selfish) clone, and this ability is not accessible to 
AIXI-tl, since AIXI-tl itself is not tl-bounded and therefore cannot be 
simulated by AIXI-tl, nor does AIXI-tl have any means of abstractly 
representing the concept "a copy of myself".  Similarly, AIXI is not 
computable and therefore cannot be simulated by AIXI.  Thus both AIXI 
and AIXI-tl break down in dealing with a physical environment that 
contains one or more copies of them.  You might say that AIXI and 
AIXI-tl can both do anything except recognize themselves in a mirror.

The simplest case is the one-shot Prisoner's Dilemma against your own 
exact clone.  It's pretty easy to formalize this challenge as a 
computation that accepts either a human upload or an AIXI-tl.  This 
obviously breaks the AIXI-tl formalism.  Does it break AIXI-tl?  This 
question is more complex than you might think.  For simple problems, 
there's a nonobvious way for AIXI-tl to stumble onto incorrect 
hypotheses which imply cooperative strategies, such that these 
hypotheses are stable under the further evidence then received.  I would 
expect there to be classes of complex cooperative problems in which the 
chaotic attractor AIXI-tl converges to is suboptimal, but I have not 
proved it.  It is definitely true that the physical problem breaks the 
AIXI formalism and that a human upload can straightforwardly converge to 
optimal cooperative strategies based on a model of reality which is more 
correct than any AIXI-tl is capable of achieving.

Ultimately AIXI's decision process breaks down in our physical universe 
because AIXI models an environmental reality with which it interacts, 
instead of modeling a naturalistic reality within which it is embedded. 
It's one of two major formal differences between AIXI's foundations and 
Novamente's.  Unfortunately there is a third foundational difference 
between AIXI and a Friendly AI.





[agi] Breaking AIXI-tl

2003-02-12 Thread Eliezer S. Yudkowsky
Okay, let's see, I promised:

An intuitively fair, physically realizable challenge, with important 
real-world analogues, formalizable as a computation which can be fed 
either a tl-bounded uploaded human or an AIXI-tl, for which the human 
enjoys greater success measured strictly by total reward over time, due to 
the superior strategy employed by that human as the result of rational 
reasoning of a type not accessible to AIXI-tl.

Roughly speaking:

A (selfish) human upload can engage in complex cooperative strategies with 
an exact (selfish) clone, and this ability is not accessible to AIXI-tl, 
since AIXI-tl itself is not tl-bounded and therefore cannot be simulated 
by AIXI-tl, nor does AIXI-tl have any means of abstractly representing the 
concept "a copy of myself".  Similarly, AIXI is not computable and 
therefore cannot be simulated by AIXI.  Thus both AIXI and AIXI-tl break 
down in dealing with a physical environment that contains one or more 
copies of them.  You might say that AIXI and AIXI-tl can both do anything 
except recognize themselves in a mirror.

The simplest case is the one-shot Prisoner's Dilemma against your own 
exact clone.  It's pretty easy to formalize this challenge as a 
computation that accepts either a human upload or an AIXI-tl.  This 
obviously breaks the AIXI-tl formalism.  Does it break AIXI-tl?  This 
question is more complex than you might think.  For simple problems, 
there's a nonobvious way for AIXI-tl to stumble onto incorrect hypotheses 
which imply cooperative strategies, such that these hypotheses are stable 
under the further evidence then received.  I would expect there to be 
classes of complex cooperative problems in which the chaotic attractor 
AIXI-tl converges to is suboptimal, but I have not proved it.  It is 
definitely true that the physical problem breaks the AIXI formalism and 
that a human upload can straightforwardly converge to optimal cooperative 
strategies based on a model of reality which is more correct than any 
AIXI-tl is capable of achieving.
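A hedged sketch of that formalization, in Python; it assumes only that the
testee is a deterministic program exposing some decide(problem) interface (a
hypothetical name of mine), so either kind of agent plugs in the same way:

    # The challenge as a computation: clone the testee bit-for-bit, show both
    # copies the same one-shot Prisoner's Dilemma, and score the Primary.

    import copy

    PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
              ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

    def clone_prisoners_dilemma(testee):
        primary = copy.deepcopy(testee)
        secondary = copy.deepcopy(testee)      # the exact clone
        problem = {"game": "one-shot PD",
                   "opponent": "an exact copy of yourself"}
        a = primary.decide(problem)            # each side sees the same map
        b = secondary.decide(problem)
        return PAYOFF[(a, b)][0]               # reward of the Primary testee

    class AlwaysDefect:                        # illustrative testees
        def decide(self, problem): return "D"

    class SymmetryReasoner:
        def decide(self, problem): return "C"  # the diagonal-payoff answer

    print(clone_prisoners_dilemma(AlwaysDefect()),      # 1
          clone_prisoners_dilemma(SymmetryReasoner()))  # 3

A multi-round version would simply wrap this in a loop that re-clones the
Secondary at the start of each round.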

Ultimately AIXI's decision process breaks down in our physical universe 
because AIXI models an environmental reality with which it interacts, 
instead of modeling a naturalistic reality within which it is embedded. 
It's one of two major formal differences between AIXI's foundations and 
Novamente's.  Unfortunately there is a third foundational difference 
between AIXI and a Friendly AI.

--
Eliezer S. Yudkowsky  http://singinst.org/
Research Fellow, Singularity Institute for Artificial Intelligence
