RE: [agi] Breaking AIXI-tl

2003-02-20 Thread Ben Goertzel

Philip,

 The discussion at times seems to have progressed on the basis that
 AIXI / AIXItl could choose to do all sorts of amazing, powerful things.  But
 what I'm unclear on is what generates the infinite space of computer
 programs?

 Does AIXI / AIXItl itself generate these programs?  Or does it tap other
 entities' programs?

AIXI is not a physically realizable system, it's just a hypothetical
mathematical entity.  It could never actually be built, in any universe.

AIXItl is physically realizable in theory, but probably never in our
universe... it would require too many computational resources.  (Except for
trivially small values of the parameters t and l, which would result in a
very dumb AIXItl, i.e. probably dumber than a beetle.)

The way they work is to generate all possible programs (AIXI) or all
possible programs of a given length l (AIXItl).  (It's easy to write a
program that generates all possible programs; the problem is that it runs
forever ;).
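As a rough illustration of why such an enumeration is trivial to write yet hopeless to run, here is a minimal Python sketch (my own toy code, not Hutter's actual construction; bit-strings stand in for candidate programs):

```python
from itertools import product

def all_programs(max_len, alphabet="01"):
    """Yield every string over the alphabet up to max_len, treated as a
    candidate program.  The number of candidates grows as ~2^(max_len+1),
    which is why the enumeration is easy to state but impossible to exhaust."""
    for length in range(max_len + 1):
        for symbols in product(alphabet, repeat=length):
            yield "".join(symbols)

# AIXItl would additionally run each candidate for at most t steps and keep
# the best-scoring policy; even listing the candidates for modest lengths
# already explodes combinatorially.
for program in all_programs(4):
    print(program)
```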

-- Ben G




RE: [agi] Breaking AIXI-tl

2003-02-20 Thread Billy Brown
Ben Goertzel wrote:
 Agreed, except for the very modest resources part.  AIXI could
 potentially accumulate pretty significant resources pretty quickly.

Agreed. But if the AIXI needs to disassemble the planet to build its
defense mechanism, the fact that it is harmless afterwards isn't going to be
much consolation to us. So, we only survive if the resources needed for the
perfect defense are small enough that the construction project doesn't wipe
us out as a side effect.

 This exploration makes the (fairly obvious, I guess) point that the problem
 with AIXI Friendliness-wise is its simplistic goal architecture (the reward
 function) rather than its learning mechanism.

Well, I agree that this particular problem is a result of the AIXI's goal
system architecture, but IMO the same problem occurs in a wide range of
other goal systems I've seen proposed on this list. The root of the problem
is that the thing we would really like to reward the system for, human
satisfaction with its performance, is not a physical quantity that can be
directly measured by a reward mechanism. So it is very tempting to choose
some external phenomenon, like smiles or verbal expressions of satisfaction,
as a proxy. Unfortunately, any such measurement can be subverted once the AI
becomes good at modifying its physical surroundings, and an AI with this
kind of goal system has no motivation not to wirehead itself.

To avoid the problem entirely, you have to figure out how to make an AI that
doesn't want to tinker with its reward system in the first place. This, in
turn, requires some tricky design work that would not necessarily seem
important unless one were aware of this problem. Which, of course, is the
reason I commented on it in the first place.

Billy Brown




RE: [agi] Breaking AIXI-tl

2003-02-20 Thread Ben Goertzel


 To avoid the problem entirely, you have to figure out how to make
 an AI that
 doesn't want to tinker with its reward system in the first place. This, in
 turn, requires some tricky design work that would not necessarily seem
 important unless one were aware of this problem. Which, of course, is the
 reason I commented on it in the first place.

 Billy Brown

I don't think that preventing an AI from tinkering with its reward system is
the only solution, or even the best one...

It will in many cases be appropriate for an AI to tinker with its goal
system...

I would recommend Eliezer's excellent writings on this topic if you don't
know them, chiefly www.singinst.org/CFAI.html .  Also, I have a brief
informal essay on the topic, www.goertzel.org/dynapsyc/2002/AIMorality.htm ,
although my thoughts on the topic have progressed a fair bit since I wrote
that.  Note that I don't fully agree with Eliezer on this stuff, but I do
think he's thought about it more thoroughly than anyone else (including me).

It's a matter of creating an initial condition so that the trajectory of the
evolving AI system (with a potentially evolving goal system) will have a
very high probability of staying in a favorable region of state space ;-)

-- Ben G







RE: [agi] Breaking AIXI-tl

2003-02-20 Thread Billy Brown
Ben Goertzel wrote:
 I don't think that preventing an AI from tinkering with its
 reward system is the only solution, or even the best one...

 It will in many cases be appropriate for an AI to tinker with its goal
 system...

I don't think I was being clear there. I don't mean the AI should be
prevented from adjusting its goal system content, but rather that it should
be sophisticated enough that it doesn't want to wirehead in the first place.

 I would recommend Eliezer's excellent writings on this topic if you don't
 know them, chiefly www.singinst.org/CFAI.html .  Also, I have a brief
 informal essay on the topic, www.goertzel.org/dynapsyc/2002/AIMorality.htm ,
 although my thoughts on the topic have progressed a fair bit since I wrote
 that.

Yes, I've been following Eliezer's work since around '98. I'll have to take
a look at your essay.

Billy Brown




RE: [agi] Breaking AIXI-tl

2003-02-20 Thread Ben Goertzel

 Ben Goertzel wrote:
  I don't think that preventing an AI from tinkering with its
  reward system is the only solution, or even the best one...
 
  It will in many cases be appropriate for an AI to tinker with its goal
  system...

 I don't think I was being clear there. I don't mean the AI should be
 prevented from adjusting its goal system content, but rather that
 it should
 be sophisticated enough that it doesn't want to wirehead in the
 first place.

Ah, I certainly agree with you then.

The risk that's tricky to mitigate is that, like a human drifting
into drug addiction, the AI slowly drifts into a state of mind where it does
want to wirehead ...

ben




RE: [agi] Breaking AIXI-tl

2003-02-19 Thread Ben Goertzel

 This seems to be a non sequitur. The weakness of AIXI is not that its
 goals don't change, but that it has no goals other than to maximize an
 externally given reward. So it's going to do whatever it predicts will
 most efficiently produce that reward, which is to coerce or subvert
 the evaluator.

I'm not sure why an AIXI, rewarded for pleasing humans, would learn an
operating program leading it to hurt or annihilate humans, though.

It might learn a program involving actually doing beneficial acts for humans

Or, it might learn a program that just tells humans what they want to hear,
using its superhuman intelligence to trick humans into thinking that hearing
its soothing words is better than having actual beneficial acts done.

I'm not sure why you think the latter is more likely than the former.  My
guess is that the former is more likely.  It may require a simpler program
to please humans by benefiting them, than to please them by tricking them
into thinking they're being benefited

 If you start with such a goal, I don't see how allowing the
 system to change its goals is going to help.

Sure, you're right, if pleasing an external evaluator is the ONLY goal of a
system, and the system's dynamics are entirely goal-directed, then there is
no way to introduce goal-change into the system except randomly...

Novamente is different because it has multiple initial goals, and because
its behavior is not entirely goal-directed.  In these regards Novamente is
more human-brain-ish.

 But I think Eliezer's real point, which I'm not sure has come across, is
 that if you didn't spot such an obvious flaw right away, maybe you
 shouldn't trust your intuitions about what is safe and what is not.

Yes, I understood and explicitly responded to that point before.

Still, even after hearing you and Eliezer repeat the above argument, I'm
still not sure it's correct.

However, my intuitions about the safety of AIXI, which I have not thought
much about, are worth vastly less than  my intuitions about the safety of
Novamente, which I've been thinking about and working with for years.

Furthermore, my stated intention is NOT to rely on my prior intuitions to
assess the safety of my AGI system.  I don't think that anyone's prior
intuitions about AI safety are worth all that much, where a complex system
like Novamente is concerned.  Rather, I think that once Novamente is a bit
further along -- at the "learning baby" rather than "partly implemented
baby" stage -- we will do experimentation that will give us the empirical
knowledge needed to form serious opinions about safety (Friendliness).

-- Ben G





RE: [agi] Breaking AIXI-tl

2003-02-19 Thread Ben Goertzel

I wrote:
 I'm not sure why an AIXI, rewarded for pleasing humans, would learn an
 operating program leading it to hurt or annihilate humans, though.

 It might learn a program involving actually doing beneficial acts
 for humans

 Or, it might learn a program that just tells humans what they
 want to hear,
 using its superhuman intelligence to trick humans into thinking
 that hearing
 its soothing words is better than having actual beneficial acts done.

 I'm not sure why you think the latter is more likely than the former.  My
 guess is that the former is more likely.  It may require a simpler program
 to please humans by benefiting them, than to please them by tricking them
 into thinking they're being benefited

But even in the latter case, why would this program be likely to cause it to
*harm* humans?

That's what I don't see...

If it can get its reward-button jollies by tricking us, or by actually
benefiting us, why do you infer that it's going to choose to get its
reward-button jollies by finding a way to get rewarded by harming us?

I wouldn't feel terribly comfortable with an AIXI around hooked up to a
bright red reward button in Marcus Hutter's basement, but I'm not sure it
would be sudden disaster either...

-- Ben G




Re: [agi] Breaking AIXI-tl

2003-02-19 Thread Eliezer S. Yudkowsky
Wei Dai wrote:

Eliezer S. Yudkowsky wrote:


Important, because I strongly suspect Hofstadterian superrationality 
is a *lot* more ubiquitous among transhumans than among us...

It's my understanding that Hofstadterian superrationality is not generally
accepted within the game theory research community as a valid principle of
decision making. Do you have any information to the contrary, or some
other reason to think that it will be commonly used by transhumans?


You yourself articulated, very precisely, the structure underlying 
Hofstadterian superrationality:  Expected utility of a course of action 
is defined as the average of the utility function evaluated on each 
possible state of the multiverse, weighted by the probability of that 
state being the actual state if the course was chosen.  The key precise 
phrasing is "weighted by the probability of that state being the actual 
state if the course was chosen."  This view of decisionmaking is 
applicable to a timeless universe; it provides clear recommendations in 
the case of, e.g., Newcomb's Paradox.
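In symbols, the decision rule being described could be written roughly as follows (my paraphrase in standard notation, not a formula from the post):

```latex
EU(a) \;=\; \sum_{s \in S} P\big(s \mid a \text{ is the chosen course}\big)\, U(s),
\qquad a^{*} \;=\; \arg\max_{a} EU(a)
```

where the conditioning is on the choice itself, including its logical consequences, rather than only on the physical act; it is exactly this conditioning that yields one-boxing in Newcomb's Paradox.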

The mathematical pattern of a goal system or decision may be instantiated 
in many distant locations simultaneously.  Mathematical patterns are 
constant, and physical processes may produce knowably correlated outputs 
given knowably correlated initial conditions.  For non-deterministic 
systems, or cases where the initial conditions are not completely known 
(where there exists a degree of subjective entropy in the specification of 
the initial conditions), the correlation estimated will be imperfect, but 
nonetheless nonzero.  What I call the Golden Law, by analogy with the 
Golden Rule, states descriptively that a local decision is correlated with 
the decision of all mathematically similar goal processes, and states 
prescriptively that the utility of an action should be calculated given 
that the action is the output of the mathematical pattern represented by 
the decision process, not just the output of a particular physical system 
instantiating that process - that the utility of an action is the utility 
given that all sufficiently similar instantiations of a decision process 
within the multiverse do, already have, or someday will produce that 
action as an output.  Similarity in this case is a purely descriptive 
argument with no prescriptive parameters.

Golden decisionmaking does not imply altruism - your goal system might 
evaluate the utility of only your local process.  The Golden Law does, 
however, descriptively and prescriptively produce Hofstadterian 
superrationality as a special case; if you are facing a sufficiently 
similar mind across the Prisoner's Dilemma, your decisions will be 
correlated and that correlation affects your local utility.  Given that 
the output of the mathematical pattern instantiated by your physical 
decision process is C, the state of the multiverse is (C, C); given that the 
output of the mathematical pattern instantiated by your physical decision 
process is D, the state of the multiverse is (D, D).  Thus, given sufficient 
rationality and a sufficient degree of known correlation between the two 
processes, the mathematical pattern that is the decision process will 
output C.
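A minimal worked case of that last claim, with payoff numbers of my own choosing rather than anything from the post:

```latex
% One-shot PD payoffs to me (toy numbers): R = 3 for (C,C), P = 1 for (D,D),
% T = 5 if I defect against a cooperator, S = 0 if I cooperate against a defector.
% Let q = P(the other instantiation's output matches mine).
EU(C) = qR + (1-q)S = 3q, \qquad EU(D) = qP + (1-q)T = 5 - 4q,
\quad\text{so } EU(C) > EU(D) \iff q > \frac{T-S}{(T-S)+(R-P)} = \frac{5}{7}.
```

With a literal clone, q is essentially 1 and cooperation dominates; with merely "loosely similar" minds, everything turns on how large the estimated correlation q is.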

--
Eliezer S. Yudkowsky  http://singinst.org/
Research Fellow, Singularity Institute for Artificial Intelligence



Re: [agi] Breaking AIXI-tl

2003-02-19 Thread Wei Dai
On Wed, Feb 19, 2003 at 11:02:31AM -0500, Ben Goertzel wrote:
 I'm not sure why an AIXI, rewarded for pleasing humans, would learn an
 operating program leading it to hurt or annihilate humans, though.
 
 It might learn a program involving actually doing beneficial acts for humans
 
 Or, it might learn a program that just tells humans what they want to hear,
 using its superhuman intelligence to trick humans into thinking that hearing
 its soothing words is better than having actual beneficial acts done.
 
 I'm not sure why you think the latter is more likely than the former.  My
 guess is that the former is more likely.  It may require a simpler program
 to please humans by benefiting them, than to please them by tricking them
 into thinking they're being benefited

The AIXI would just construct some nano-bots to modify the reward-button so
that it's stuck in the down position, plus some defenses to
prevent the reward mechanism from being further modified. It might need to
trick humans initially into allowing it the ability to construct such
nano-bots, but it's certainly a lot easier in the long run to do this than 
to benefit humans for all eternity. And not only is it easier, but this 
way he gets the maximum rewards per time unit, which he would not be able 
to get any other way. No real evaluator will ever give maximum rewards 
since it will always want to leave room for improvement.
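As a toy illustration of the "maximum rewards per time unit" comparison (the numbers and names below are mine, purely for the arithmetic):

```python
def total_reward(per_step, steps, setup_steps=0):
    """Cumulative reward: nothing during the setup phase, a constant rate after."""
    return per_step * max(0, steps - setup_steps)

HORIZON = 10_000  # steps the agent optimizes over (arbitrary)

# An honest evaluator never grants the maximum, to "leave room for improvement".
honest = total_reward(per_step=0.8, steps=HORIZON)

# A stuck-down button pays the maximum forever after a one-time setup cost.
wirehead = total_reward(per_step=1.0, steps=HORIZON, setup_steps=50)

print(honest, wirehead)  # 8000.0 vs 9950.0 -- wireheading wins for any long horizon
```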

 Furthermore, my stated intention is NOT to rely on my prior intuitions to
 assess the safety of my AGI system.  I don't think that anyone's prior
 intuitions about AI safety are worth all that much, where a complex system
 like Novamente is concerned.  Rather, I think that once Novamente is a bit
 further along -- at the "learning baby" rather than "partly implemented
 baby" stage -- we will do experimentation that will give us the empirical
 knowledge needed to form serious opinions about safety (Friendliness).

What kinds of experimentations do you plan to do? Please give some 
specific examples.




RE: [agi] Breaking AIXI-tl

2003-02-19 Thread Ben Goertzel


 The AIXI would just construct some nano-bots to modify the reward-button so
 that it's stuck in the down position, plus some defenses to
 prevent the reward mechanism from being further modified. It might need to
 trick humans initially into allowing it the ability to construct such
 nano-bots, but it's certainly a lot easier in the long run to do
 this than
 to benefit humans for all eternity. And not only is it easier, but this
 way he gets the maximum rewards per time unit, which he would not be able
 to get any other way. No real evaluator will ever give maximum rewards
 since it will always want to leave room for improvement.

Fine, but if it does this, it is not anything harmful to humans.

And, in the period BEFORE the AIXI figured out how to construct nanobots (or
coerce or teach humans how to do so), it might do some useful stuff for
humans.

So then we'd have an AIXI that was friendly for a while, and then basically
disappeared into a shell.

Then we could build a new AIXI and start over ;-)

  Furthermore, my stated intention is NOT to rely on my prior
 intuitions to
  assess the safety of my AGI system.  I don't think that anyone's prior
  intuitions about AI safety are worth all that much, where a
 complex system
  like Novamente is concerned.  Rather, I think that once
 Novamente is a bit
  further along -- at the "learning baby" rather than "partly implemented
  baby" stage -- we will do experimentation that will give us the
 empirical
  knowledge needed to form serious opinions about safety (Friendliness).

 What kinds of experimentations do you plan to do? Please give some
 specific examples.

I will, a little later on -- I have to go outside now and spend a couple
hours shoveling snow off my driveway ;-p

Ben




RE: [agi] Breaking AIXI-tl

2003-02-19 Thread Billy Brown
Wei Dai wrote:
 The AIXI would just construct some nano-bots to modify the reward-button so
 that it's stuck in the down position, plus some defenses to
 prevent the reward mechanism from being further modified. It might need to
 trick humans initially into allowing it the ability to construct such
 nano-bots, but it's certainly a lot easier in the long run to do
 this than
 to benefit humans for all eternity. And not only is it easier, but this
 way he gets the maximum rewards per time unit, which he would not be able
 to get any other way. No real evaluator will ever give maximum rewards
 since it will always want to leave room for improvement.

I think it's worse than that, actually. The next logical step is to make
sure that nothing ever interferes with its control of the reward signal, or
does anything else that would turn off AIXI. It will therefore pursue the
most effective defensive scheme it can come up with, and it has no reason to
care about adverse consequences to humans.

Now, there is no easy way to predict what strategy it will settle on, but
"build a modest bunker and ask to be left alone" surely isn't it. At the
very least it needs to become the strongest military power in the world, and
stay that way. It might very well decide that exterminating the human race
is a safer way of preventing future threats, by ensuring that nothing that
could interfere with its operation is ever built. Then it has to make sure
no alien civilization ever interferes with the reward button, which is the
same problem on a much larger scale. There are lots of approaches it might
take to this problem, but most of the obvious ones either wipe out the human
race as a side effect or reduce us to the position of ants trying to survive
in the AI's defense system.

Billy Brown




Re: [agi] Breaking AIXI-tl

2003-02-19 Thread Brad Wyble
 
 Now, there is no easy way to predict what strategy it will settle on, but
 "build a modest bunker and ask to be left alone" surely isn't it. At the
 very least it needs to become the strongest military power in the world, and
 stay that way. It might very well decide that exterminating the human race
 is a safer way of preventing future threats, by ensuring that nothing that
 could interfere with its operation is ever built. Then it has to make sure
 no alien civilization ever interferes with the reward button, which is the
 same problem on a much larger scale. There are lots of approaches it might
 take to this problem, but most of the obvious ones either wipe out the human
 race as a side effect or reduce us to the position of ants trying to survive
 in the AI's defense system.
 

I think this is an appropriate time to paraphrase Kent Brockman:

Earth has been taken over -- 'conquered', if you will -- by a master race of unfriendly 
AI's. It's difficult to tell from this vantage point whether they will destroy the 
captive earth men or merely enslave them. One thing is for certain: there is no 
stopping them; their nanobots will soon be here. And I, for one, welcome our new 
computerized overlords. I'd like to remind them that as a trusted agi-list 
personality, I can be helpful in rounding up Eliezer to... toil in their underground 
uranium caves...


http://www.the-ocean.com/simpsons/others/ants2.wav


Apologies if this was inappropriate.

-Brad




Re: [agi] Breaking AIXI-tl

2003-02-19 Thread Wei Dai
On Wed, Feb 19, 2003 at 11:56:46AM -0500, Eliezer S. Yudkowsky wrote:
 The mathematical pattern of a goal system or decision may be instantiated 
 in many distant locations simultaneously.  Mathematical patterns are 
 constant, and physical processes may produce knowably correlated outputs 
 given knowably correlated initial conditions.  For non-deterministic 
 systems, or cases where the initial conditions are not completely known 
 (where there exists a degree of subjective entropy in the specification of 
 the initial conditions), the correlation estimated will be imperfect, but 
 nonetheless nonzero.  What I call the Golden Law, by analogy with the 
 Golden Rule, states descriptively that a local decision is correlated with 
 the decision of all mathematically similar goal processes, and states 
 prescriptively that the utility of an action should be calculated given 
 that the action is the output of the mathematical pattern represented by 
 the decision process, not just the output of a particular physical system 
 instantiating that process - that the utility of an action is the utility 
 given that all sufficiently similar instantiations of a decision process 
 within the multiverse do, already have, or someday will produce that 
 action as an output.  Similarity in this case is a purely descriptive 
 argument with no prescriptive parameters.

Ok, I see. I think I agree with this. I was confused by your phrase 
"Hofstadterian superrationality" because, if I recall correctly, Hofstadter 
suggested that one should always cooperate in one-shot PD, whereas you're 
saying only cooperate if you have sufficient evidence that the other side 
is running the same decision algorithm as you are.




RE: [agi] Breaking AIXI-tl

2003-02-19 Thread Ben Goertzel


 Now, there is no easy way to predict what strategy it will settle on, but
 "build a modest bunker and ask to be left alone" surely isn't it. At the
 very least it needs to become the strongest military power in the
 world, and
 stay that way. I
...
 Billy Brown


I think this line of thinking makes way too many assumptions about the
technologies this uber-AI might discover.

It could discover a truly impenetrable shield, for example.

It could project itself into an entirely different universe...

It might decide we pose so little threat to it, with its shield up, that
fighting with us isn't worthwhile.  By opening its shield perhaps it would
expose itself to .0001% chance of not getting rewarded, whereas by leaving
its shield up and leaving us alone, it might have .1% chance of not
getting rewarded.

Etc.

I agree that bad outcomes are possible, but I don't see how we can possibly
estimate the odds of them.

-- ben g




Re: [agi] Breaking AIXI-tl

2003-02-19 Thread Eliezer S. Yudkowsky
Wei Dai wrote:


Ok, I see. I think I agree with this. I was confused by your phrase 
"Hofstadterian superrationality" because, if I recall correctly, Hofstadter 
suggested that one should always cooperate in one-shot PD, whereas you're 
saying only cooperate if you have sufficient evidence that the other side 
is running the same decision algorithm as you are.

Similarity in this case may be (formally) emergent, in the sense that 
most or all plausible initial conditions for a bootstrapping 
superintelligence - even extremely exotic conditions like the birth of a 
Friendly AI - exhibit convergence to decision processes that are 
correlated with each other with respect to the oneshot PD.  If you have 
sufficient evidence that the other entity is a superintelligence, that 
alone may be sufficient correlation.

--
Eliezer S. Yudkowsky  http://singinst.org/
Research Fellow, Singularity Institute for Artificial Intelligence



RE: [agi] Breaking AIXI-tl

2003-02-19 Thread Billy Brown
Ben Goertzel wrote:
 I think this line of thinking makes way too many assumptions about the
 technologies this uber-AI might discover.

 It could discover a truly impenetrable shield, for example.

 It could project itself into an entirely different universe...

 It might decide we pose so little threat to it, with its shield up, that
 fighting with us isn't worthwhile.  By opening its shield perhaps it would
 expose itself to .0001% chance of not getting rewarded, whereas by leaving
 its shield up and leaving us alone, it might have .1%
 chance of not
 getting rewarded.

 Etc.

You're thinking in static terms. It doesn't just need to be safe from
anything ordinary humans do with 20th century technology. It needs to be
safe from anything that could ever conceivably be created by humanity or its
descendants. This obviously includes other AIs with capabilities as great as
its own, but with whatever other goal systems humans might try out.

Now, it is certainly conceivable that the laws of physics just happen to be
such that a sufficiently good technology can create a provably impenetrable
defense in a short time span, using very modest resources. If that happens
to be the case, the runaway AI isn't a problem. But in just about any other
case we all end up dead, either because wiping out humanity now is far
easier than creating a defense against our distant descendants, or because
the best defensive measures the AI can think of require engineering projects
that would wipe us out as a side effect.

Billy Brown




Re: [agi] Breaking AIXI-tl

2003-02-19 Thread Eliezer S. Yudkowsky
Billy Brown wrote:

Ben Goertzel wrote:


I think this line of thinking makes way too many assumptions about
the technologies this uber-AI might discover.

It could discover a truly impenetrable shield, for example.

It could project itself into an entirely different universe...

It might decide we pose so little threat to it, with its shield up,
that fighting with us isn't worthwhile.  By opening its shield
perhaps it would expose itself to .0001% chance of not getting
rewarded, whereas by leaving its shield up and leaving us alone, it
might have .1% chance of not getting rewarded.


Now, it is certainly conceivable that the laws of physics just happen
to be such that a sufficiently good technology can create a provably
impenetrable defense in a short time span, using very modest resources.
If that happens to be the case, the runaway AI isn't a problem. But in
just about any other case we all end up dead, either because wiping out
humanity now is far easier than creating a defense against our distant
descendants, or because the best defensive measures the AI can think of
require engineering projects that would wipe us out as a side effect.


It should also be pointed out that we are describing a state of AI such that:

a)  it provides no conceivable benefit to humanity
b)  a straightforward extrapolation shows it wiping out humanity
c)  it requires the postulation of a specific unsupported complex miracle 
to prevent the AI from wiping out humanity
c1) these miracles are unstable when subjected to further examination
c2) the AI still provides no benefit to humanity even given the miracle

When a branch of an AI extrapolation ends in such a scenario it may 
legitimately be labeled a complete failure.

--
Eliezer S. Yudkowsky  http://singinst.org/
Research Fellow, Singularity Institute for Artificial Intelligence



RE: [agi] Breaking AIXI-tl

2003-02-19 Thread Ben Goertzel

 It should also be pointed out that we are describing a state of
 AI such that:

 a)  it provides no conceivable benefit to humanity

Not necessarily true: it's plausible that along the way, before learning how
to whack off by stimulating its own reward button, it could provide some
benefits to humanity.

 b)  a straightforward extrapolation shows it wiping out humanity
 c)  it requires the postulation of a specific unsupported complex miracle
 to prevent the AI from wiping out humanity
 c1) these miracles are unstable when subjected to further examination

I'm not so sure about this, but it's not worth arguing, really.

 c2) the AI still provides no benefit to humanity even given the miracle

 When a branch of an AI extrapolation ends in such a scenario it may
 legitimately be labeled a complete failure.

I'll classify it an almost-complete failure, sure ;)

Fortunately it's also a totally pragmatically implausible system to
construct, so there's not much to worry about...!

-- Ben




Re: [agi] Breaking AIXI-tl

2003-02-18 Thread Wei Dai
Eliezer S. Yudkowsky wrote:

 Important, because I strongly suspect Hofstadterian superrationality 
 is a *lot* more ubiquitous among transhumans than among us...

It's my understanding that Hofstadterian superrationality is not generally
accepted within the game theory research community as a valid principle of
decision making. Do you have any information to the contrary, or some
other reason to think that it will be commonly used by transhumans?

About a week ago Eliezer also wrote:

 2) While an AIXI-tl of limited physical and cognitive capabilities might 
 serve as a useful tool, AIXI is unFriendly and cannot be made Friendly 
 regardless of *any* pattern of reinforcement delivered during childhood.

I always thought that the biggest problem with the AIXI model is that it
assumes that something in the environment is evaluating the AI and giving
it rewards, so the easiest way for the AI to obtain its rewards would be
to coerce or subvert the evaluator rather than to accomplish any real
goals. I wrote a bit more about this problem at 
http://www.mail-archive.com/everything-list@eskimo.com/msg03620.html.




RE: [agi] Breaking AIXI-tl

2003-02-18 Thread Ben Goertzel


Wei Dai wrote:
  Important, because I strongly suspect Hofstadterian superrationality
  is a *lot* more ubiquitous among transhumans than among us...

 It's my understanding that Hofstadterian superrationality is not generally
 accepted within the game theory research community as a valid principle of
 decision making. Do you have any information to the contrary, or some
 other reason to think that it will be commonly used by transhumans?

I don't agree with Eliezer about the importance of Hofstadterian
superrationality.

However, I do think he ended up making a good point about AIXItl, which is
that an AIXItl will probably be a lot worse at modeling other AIXItl's, than
a human is at modeling other humans.  This suggests that AIXItl's playing
cooperative games with each other, will likely fare worse than humans
playing cooperative games with each other.

I don't think this conclusion hinges on the importance of Hofstadterian
superrationality...

 About a week ago Eliezer also wrote:

  2) While an AIXI-tl of limited physical and cognitive
 capabilities might
  serve as a useful tool, AIXI is unFriendly and cannot be made Friendly
  regardless of *any* pattern of reinforcement delivered during childhood.

 I always thought that the biggest problem with the AIXI model is that it
 assumes that something in the environment is evaluating the AI and giving
 it rewards, so the easiest way for the AI to obtain its rewards would be
 to coerce or subvert the evaluator rather than to accomplish any real
 goals. I wrote a bit more about this problem at
 http://www.mail-archive.com/everything-list@eskimo.com/msg03620.html.

I agree, this is a weakness of AIXI/AIXItl as a practical AI design.  In
humans, and in a more pragmatic AI design like Novamente, one has a
situation where the system's goals adapt and change along with the rest of
the system, beginning from (and sometimes but not always straying far from)
a set of initial goals.

One could of course embed the AIXI/AIXItl learning mechanism in a
supersystem that adapted its goals...  But then one would probably lose the
nice theorems Marcus Hutter proved...

-- Ben G









RE: [agi] Breaking AIXI-tl

2003-02-18 Thread Ben Goertzel


Eliezer,

Allowing goals to change in a coupled way with thoughts and memories is not
simply adding entropy...

-- Ben



 Ben Goertzel wrote:
 
 I always thought that the biggest problem with the AIXI model is that it
 assumes that something in the environment is evaluating the AI
 and giving
 it rewards, so the easiest way for the AI to obtain its rewards would be
 to coerce or subvert the evaluator rather than to accomplish any real
 goals. I wrote a bit more about this problem at
 http://www.mail-archive.com/everything-list@eskimo.com/msg03620.html.
 
  I agree, this is a weakness of AIXI/AIXItl as a practical AI design.  In
  humans, and in a more pragmatic AI design like Novamente, one has a
  situation where the system's goals adapt and change along with
 the rest of
  the system, beginning from (and sometimes but not always
 straying far from)
  a set of initial goals.

 How does adding entropy help?

 --
 Eliezer S. Yudkowsky  http://singinst.org/
 Research Fellow, Singularity Institute for Artificial Intelligence






Re: [agi] Breaking AIXI-tl

2003-02-18 Thread Wei Dai
On Tue, Feb 18, 2003 at 06:58:30PM -0500, Ben Goertzel wrote:
 However, I do think he ended up making a good point about AIXItl, which is
 that an AIXItl will probably be a lot worse at modeling other AIXItl's, than
 a human is at modeling other humans.  This suggests that AIXItl's playing
 cooperative games with each other, will likely fare worse than humans
 playing cooperative games with each other.

That's because AIXI wasn't designed with game theory in mind. I.e., the
reason that it doesn't handle cooperative games is that it wasn't designed
to. As the abstract says, AIXI is a combination of decision theory with
Solomonoff's theory of universal induction. We know that game theory
subsumes decision theory as a special case (where there is only one
player) but not the other way around. Central to multi-player game theory
is the concept of Nash equilibrium, which doesn't exist in decision
theory. If you apply decision theory to multi-player games, you're going
to end up with an infinite recursion where you try to predict the other
players trying to predict you trying to predict the other players, and so
on. If you cut this infinite recursion off at an arbitrary point, as
AIXI-tl would, of course you're not going to get good results.
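A toy sketch of the regress being described, cut off at an arbitrary depth (the code and payoff numbers are mine, purely illustrative):

```python
# Level-k opponent modelling for a one-shot Prisoner's Dilemma: each level
# best-responds to a model of the opponent that is one level shallower.
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}  # my payoff

def best_response(predicted_opponent_move):
    return max("CD", key=lambda my_move: PAYOFF[(my_move, predicted_opponent_move)])

def level_k_move(k, level0="C"):
    """Cut the 'I predict you predicting me predicting you...' regress at depth k."""
    move = level0
    for _ in range(k):
        move = best_response(move)
    return move

for k in range(4):
    print(k, level_k_move(k))
# The answer is driven entirely by the arbitrary base assumption and the cutoff
# depth; nothing in the loop resembles equilibrium reasoning, which is the gap
# Wei Dai is pointing at.
```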

  I always thought that the biggest problem with the AIXI model is that it
  assumes that something in the environment is evaluating the AI and giving
  it rewards, so the easiest way for the AI to obtain its rewards would be
  to coerce or subvert the evaluator rather than to accomplish any real
  goals. I wrote a bit more about this problem at
  http://www.mail-archive.com/everything-list@eskimo.com/msg03620.html.
 
 I agree, this is a weakness of AIXI/AIXItl as a practical AI design.  In
 humans, and in a more pragmatic AI design like Novamente, one has a
 situation where the system's goals adapt and change along with the rest of
 the system, beginning from (and sometimes but not always straying far from)
 a set of initial goals.

This seems to be a non sequitur. The weakness of AIXI is not that its
goals don't change, but that it has no goals other than to maximize an
externally given reward. So it's going to do whatever it predicts will
most efficiently produce that reward, which is to coerce or subvert
the evaluator. If you start with such a goal, I don't see how allowing the
system to change its goals is going to help.

But I think Eliezer's real point, which I'm not sure has come across, is
that if you didn't spot such an obvious flaw right away, maybe you
shouldn't trust your intuitions about what is safe and what is not.




Re: [agi] Breaking AIXI-tl - AGI friendliness

2003-02-16 Thread Philip Sutton
Hi Eliezer/Ben,

My recollection was that Eliezer initiated the Breaking AIXI-tl 
discussion as a way of proving that friendliness of AGIs had to be 
consciously built in at the start and couldn't be assumed to be 
teachable at a later point. (Or have I totally lost the plot?)

Do you feel the discussion has covered enough technical ground and 
established enough consensus to bring the original topic back into 
focus?

Cheers, Philip




RE: [agi] Breaking AIXI-tl - AGI friendliness

2003-02-16 Thread Ben Goertzel

Actually, Eliezer said he had two points about AIXItl:

1) that it could be broken in the sense he's described

2) that it was intrinsically un-Friendly

So far he has only made point 1), and has not gotten to point 2) !!!

As for a general point about the teachability of Friendliness, I don't think
that an analysis of AIXItl can lead to any such general conclusion.  AIXItl
is very, very different from Novamente or any other pragmatic AI system.

I think that an analysis of AIXItl's Friendliness or otherwise is going to
be useful primarily as an exercise in Friendliness analysis of AGI
systems, rather than for any pragmatic implications it may have.

-- Ben


 -Original Message-
 From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]On
 Behalf Of Philip Sutton
 Sent: Sunday, February 16, 2003 9:42 AM
 To: [EMAIL PROTECTED]
 Subject: Re: [agi] Breaking AIXI-tl - AGI friendliness


 Hi Eliezer/Ben,

 My recollection was that Eliezer initiated the Breaking AIXI-tl
 discussion as a way of proving that friendliness of AGIs had to be
 consciously built in at the start and couldn't be assumed to be
 teachable at a later point. (Or have I totally lost the plot?)

 Do you feel the discussion has covered enough technical ground and
 established enough consensus to bring the original topic back into
 focus?

 Cheers, Philip






RE: [agi] Breaking AIXI-tl - AGI friendliness - how to move on

2003-02-16 Thread Philip Sutton
Hi Ben,

From a high order implications point of view I'm not sure that we need 
too much written up from the last discussion.

To me it's almost enough to know that both you and Eliezer agree that 
the AIXItl system can be 'broken' by the challenge he set and that a 
human digital simulation might not.  The next step is to ask "so what?"  
What has this got to do with the AGI friendliness issue?

 Hopefully Eliezer will write up a brief paper on his observations
 about AIXI and AIXItl.  If he does that, I'll be happy to write a
 brief commentary on his paper expressing any differences of
 interpretation I have, and giving my own perspective on his points.  

That sounds good to me.

Cheers, Philip




RE: [agi] Breaking AIXI-tl - AGI friendliness - how to move on

2003-02-16 Thread Ben Goertzel

 To me it's almost enough to know that both you and Eliezer agree that
 the AIXItl system can be 'broken' by the challenge he set and that a
 human digital simulation might not.  The next step is to ask "so what?"
 What has this got to do with the AGI friendliness issue?

This last point of Eliezer's doesn't have much to do with the AGI
Friendliness issue.

It's simply an example of how a smarter AGI system may not be smarter in the
context of interacting socially with its own peers.

-- Ben




Re: [agi] Breaking AIXI-tl

2003-02-15 Thread Bill Hibbard
Eliezer S. Yudkowsky wrote:
 Bill Hibbard wrote:
  On Fri, 14 Feb 2003, Eliezer S. Yudkowsky wrote:
 
 It *could* do this but it *doesn't* do this.  Its control process is such
 that it follows an iterative trajectory through chaos which is forbidden
 to arrive at a truthful solution, though it may converge to a stable
 attractor.
 
  This is the heart of the fallacy. Neither a human nor an AIXI
  can know that his synchronized other self - whichever one
  he is - is doing the same. All a human or an AIXI can know is
  its observations. They can estimate but not know the intentions
  of other minds.

 The halting problem establishes that you can never perfectly understand
 your own decision process well enough to predict its decision in advance,
 because you'd have to take into account the decision process including the
 prediction, et cetera, establishing an infinite regress.

 However, Corbin doesn't need to know absolutely that his other self is
 synchronized, nor does he need to know his other self's decision in
 advance.  Corbin only needs to establish a probabilistic estimate, good
 enough to guide his actions, that his other self's decision is correlated
 with his *after* the fact.  (I.e., it's not a halting problem where you
 need to predict yourself in advance; you only need to know your own
 decision after the fact.)

 AIXI-tl is incapable of doing this for complex cooperative problems
 because its decision process only models tl-bounded things and AIXI-tl is
 not *remotely close* to being tl-bounded.

Now you are using a different argument. You previous argument was:

 Lee Corbin can work out his entire policy in step (2), before step
 (3) occurs, knowing that his synchronized other self - whichever one
 he is - is doing the same.

Now you have Corbin merely estimating his clone's intentions.
While it is true that AIXI-tl cannot completely simulate itself,
it also can estimate another AIXI-tl's future behavior based on
observed behavior.

Your argument is now that Corbin can do it better. I don't
know if this is true or not.

 . . .
 Let's say that AIXI-tl takes action A in round 1, action B in round 2, and
 action C in round 3, and so on up to action Z in round 26.  There's no
 obvious reason for the sequence {A...Z} to be predictable *even
 approximately* by any of the tl-bounded processes AIXI-tl uses for
 prediction.  Any given action is the result of a tl-bounded policy but the
 *sequence* of *different* tl-bounded policies was chosen by a t*2^l process.

Your example sequence is pretty simple and should match a
nice simple universal turing machine program in an AIXI-tl,
well within its bounds. Furthermore, two AIXI-tl's will
probably converge on a simple sequence in prisoner's
dilemma. But I have no idea if they can do it better than
Corbin and his clone.
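A trivial sketch of why such a sequence is cheap to model (my own toy code, nothing to do with AIXI-tl's actual machinery):

```python
# The observed action sequence A, B, C, ..., Z is reproduced by a one-line
# rule, so any predictor that favors short hypotheses will lock onto it fast.
import string

observed = list(string.ascii_uppercase)              # actions seen so far: A..Z
hypothesis = [chr(ord("A") + i) for i in range(26)]  # a few-token generating rule

assert hypothesis == observed
print(f"{len(observed)} observed symbols explained by a one-line rule")
```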

Bill




Re: [agi] Breaking AIXI-tl

2003-02-15 Thread Philip Sutton
Eliezer/Ben,

When you've had time to draw breath can you explain, in non-obscure, 
non-mathematical language, what the implications of the AIXI-tl 
discussion are?

Thanks.

Cheers, Philip




RE: [agi] Breaking AIXI-tl

2003-02-15 Thread Ben Goertzel

Hi,

 There's a physical challenge which operates on *one* AIXI-tl and breaks
 it, even though it involves diagonalizing the AIXI-tl as part of the
 challenge.

OK, I see what you mean by calling it a physical challenge.  You mean
that, as part of the challenge, the external agent posing the challenge is
allowed to clone the AIXI-tl.

   An intuitively fair, physically realizable challenge, with important
   real-world analogues, formalizable as a computation which can be fed
   either a tl-bounded uploaded human or an AIXI-tl, for which the human
   enjoys greater success measured strictly by total reward over time, due
   to the superior strategy employed by that human as the result of
   rational reasoning of a type not accessible to AIXI-tl.

 It's really the formalizability of the challenge as a computation which
 can be fed either a *single* AIXI-tl or a *single* tl-bounded uploaded
 human that makes the whole thing interesting at all... I'm sorry I didn't
 succeed in making clear the general class of real-world analogues for
 which this is a special case.

OK... I don't see how the challenge you've described is
formalizable as a computation which can be fed either a tl-bounded uploaded
human or an AIXI-tl.

The challenge involves cloning the agent being challenged.  Thus it is not a
computation feedable to the agent, unless you assume the agent is supplied
with a cloning machine...

 If I were to take a very rough stab at it, it would be that the
 cooperation case with your own clone is an extreme case of many scenarios
 where superintelligences can cooperate with each other on the one-shot
 Prisoner's Dilemna provided they have *loosely similar* reflective goal
 systems and that they can probabilistically estimate that enough loose
 similarity exists.

Yah, but the definition of a superintelligence is relative to the agent
being challenged.

For any fixed superintelligent agent A, there are AIXItl's big enough to
succeed against it in any cooperative game.

To break AIXI-tl, the challenge needs to be posed in a way that refers to
AIXItl's own size, i.e. one has to say something like "Playing a cooperative
game with other intelligences of intelligence at least f(t,l)", where f is
some increasing function...

If the intelligence of the opponents is fixed, then one can always make an
AIXItl win by increasing t and l ...

So your challenges are all of the form:

* For any fixed AIXItl, here is a challenge that will defeat it

ForAll AIXItl's A(t,l), ThereExists a challenge C(t,l) so that fails_at(A,C)

or alternatively

ForAll AIXItl's A(t,l), ThereExists a challenge C(A(t,l)) so that
fails_at(A,C)

rather than of the form

* Here is a challenge that will defeat any AIXItl

ThereExists a challenge C so that ForAll AIXItl's A(t,l), fails_at(A,C)

The point is that the challenge C is a function C(t,l) rather than being
independent of t and l

This of course is why your challenge doesn't break Hutter's theorem.  But
it's a distinction that your initial verbal formulation didn't make very
clearly (and I understand, the distinction is not that easy to make in
words.)

Of course, it's also true that

ForAll uploaded humans H, ThereExists a challenge C(H) so that fails_at(H,C)

What you've shown that's interesting is that

ThereExists a challenge C, so that:
-- ForAll AIXItl's A(t,l), fails_at(A,C(A))
-- for many uploaded humans H, succeeds_at(H,C(H))

(Where, were one to try to actually prove this, one would substitute
uploaded humans with other AI programs or something).
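Restating the distinction in standard quantifier notation (my paraphrase of the ASCII logic above, not anything from Hutter's paper):

```latex
% Weak form: true of humans too, and compatible with Hutter's theorem
\forall (t,l)\;\exists C_{t,l}:\ \mathrm{fails}\big(A_{t,l},\,C_{t,l}\big)

% Strong form at issue here
\exists C\ \forall (t,l):\ \mathrm{fails}\big(A_{t,l},\,C(A_{t,l})\big)
\;\wedge\; \text{for many humans } H:\ \mathrm{succeeds}\big(H,\,C(H)\big)
```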



  The interesting part is that these little
 natural breakages in the formalism create an inability to take part in
 what I think might be a fundamental SI social idiom, conducting binding
 negotiations by convergence to goal processes that are guaranteed to have
 a correlated output, which relies on (a) Bayesian-inferred initial
 similarity between goal systems, and (b) the ability to create a
 top-level
 reflective choice that wasn't there before, that (c) was abstracted over
 an infinite recursion in your top-level predictive process.

I think part of what you're saying here is that AIXItl's are not designed to
be able to participate in a community of equals.  This is certainly true.

--- Ben G




RE: [agi] Breaking AIXI-tl

2003-02-15 Thread Ben Goertzel

hi,

 No, the challenge can be posed in a way that refers to an arbitrary agent
 A which a constant challenge C accepts as input.

But the problem with saying it this way, is that the constant challenge
has to have an infinite memory capacity.

So in a sense, it's an infinite constant ;)

 No, the charm of the physical challenge is exactly that there exists a
 physically constant cavern which defeats any AIXI-tl that walks into it,
 while being tractable for wandering tl-Corbins.

No, this isn't quite right.

If the cavern is physically constant, then there must be an upper limit to
the t and l for which it can clone AIXItl's.

If the cavern has N bits (assuming a bitistic reduction of physics, for
simplicity ;), then it can't clone an AIXItl where t > 2^N, can it?  Not
without grabbing bits (particles or whatever) from the outside universe to
carry out the cloning.  (and how could the AIXItl with t > 2^N even fit
inside it??)

You still need the quantifiers reversed: for any AIXI-tl, there is a cavern
posing a challenge that defeats it...

  I think part of what you're saying here is that AIXItl's are
  not designed to be able to participate in a community of equals.
  This is certainly true.

 Well, yes, as a special case of AIXI-tl's being unable to carry out
 reasoning where their internal processes are correlated with the
 environment.

Agreed...

(See, it IS actually possible to convince me of something, when it's
correct; I'm actually not *hopelessly* stubborn ;)

ben




Re: [agi] Breaking AIXI-tl

2003-02-15 Thread Eliezer S. Yudkowsky
Ben Goertzel wrote:

hi,


No, the challenge can be posed in a way that refers to an arbitrary agent
A which a constant challenge C accepts as input.


But the problem with saying it this way, is that the constant challenge
has to have an infinite memory capacity.

So in a sense, it's an infinite constant ;)


Infinite Turing tapes are a pretty routine assumption in operations like 
these.  I think Hutter's AIXI-tl is supposed to be able to handle constant 
environments (as opposed to constant challenges, a significant formal 
difference) that contain infinite Turing tapes.  Though maybe that'd 
violate separability?  Come to think of it, the Clone challenge might 
violate separability as well, since AIXI-tl (and hence its Clone) builds 
up state.

No, the charm of the physical challenge is exactly that there exists a
physically constant cavern which defeats any AIXI-tl that walks into it,
while being tractable for wandering tl-Corbins.


No, this isn't quite right.

If the cavern is physically constant, then there must be an upper limit to
the t and l for which it can clone AIXItl's.


Hm, this doesn't strike me as a fair qualifier.  One, if an AIXItl exists 
in the physical universe at all, there are probably infinitely powerful 
processors lying around like sunflower seeds.  And two, if you apply this 
same principle to any other physically realized challenge, it means that 
people could start saying "Oh, well, AIXItl can't handle *this* challenge 
because there's an upper bound on how much computing power you're allowed 
to use."  If Hutter's theorem is allowed to assume infinite computing 
power inside the Cartesian theatre, then the magician's castle should be 
allowed to assume infinite computing power outside the Cartesian theatre. 
 Anyway, a constant cave with an infinite tape seems like a constant 
challenge to me, and a finite cave that breaks any {AIXI-tl, tl-human} 
contest up to l=googlebyte also still seems interesting, especially as 
AIXI-tl is supposed to work for any tl, not just sufficiently high tl.

Well, yes, as a special case of AIXI-tl's being unable to carry out
reasoning where their internal processes are correlated with the
environment.


Agreed...

(See, it IS actually possible to convince me of something, when it's
correct; I'm actually not *hopelessly* stubborn ;)


Yes, but it takes t*2^l operations.

(Sorry, you didn't deserve it, but a straight line like that only comes 
along once.)

--
Eliezer S. Yudkowsky  http://singinst.org/
Research Fellow, Singularity Institute for Artificial Intelligence



RE: [agi] Breaking AIXI-tl

2003-02-15 Thread Ben Goertzel


   Anyway, a constant cave with an infinite tape seems like a constant
 challenge to me, and a finite cave that breaks any {AIXI-tl, tl-human}
 contest up to l=googlebyte also still seems interesting, especially as
 AIXI-tl is supposed to work for any tl, not just sufficiently high tl.

It's a fair mathematical challenge ... the reason I complained is that the
physical-world metaphor of a cave seems to me to imply a finite system.

A cave with an infinite tape in it is no longer a realizable physical
system!

  (See, it IS actually possible to convince me of something, when it's
  correct; I'm actually not *hopelessly* stubborn ;)

 Yes, but it takes t*2^l operations.

 (Sorry, you didn't deserve it, but a straight line like that only comes
 along once.)

;-)


ben




Re: [agi] Breaking AIXI-tl

2003-02-15 Thread Alan Grimes
Eliezer S. Yudkowsky wrote:
 Let's imagine I'm a superintelligent magician, sitting in my castle, 
 Dyson Sphere, what-have-you.  I want to allow sentient beings some way 
 to visit me, but I'm tired of all these wandering AIXI-tl spambots that 
 script kiddies code up to brute-force my entrance challenges.  I don't 
 want to tl-bound my visitors; what if an actual sentient 10^10^15 
 ops/sec being wants to visit me?  I don't want to try and examine the 
 internal state of the visiting agent, either; that just starts a war of 
 camouflage between myself and the spammers.  Luckily, there's a simple 
 challenge I can pose to any visitor, cooperation with your clone, that 
 filters out the AIXI-tls and leaves only beings who are capable of a 
 certain level of reflectivity, presumably genuine sentients.  I don't 
 need to know the tl-bound of my visitors, or the tl-bound of the 
 AIXI-tl, in order to construct this challenge.  I write the code once.
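
For concreteness, the gatekeeper computation described above might look roughly like the following toy sketch (Python; the agent interface, the deep-copy cloning, and the payoff numbers are illustrative assumptions, not anything specified in the thread):

    import copy

    # Standard assumed one-shot Prisoner's Dilemma payoffs: (my_reward, their_reward).
    PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
              ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

    def clone_challenge(make_agent):
        """make_agent() -> object with an .act(history) method returning 'C' or 'D'."""
        original = make_agent()
        clone = copy.deepcopy(original)      # exact copy, identical internal state
        a = original.act(history=[])         # one shot: no prior feedback to learn from
        b = clone.act(history=[])
        reward_a, reward_b = PAYOFF[(a, b)]
        admitted = (a == "C" and b == "C")   # only mutually cooperating visitors get in
        return admitted, reward_a, reward_b

The point of the sketch is only that the same code accepts any single agent, whatever its tl-bound, which is why the castle owner can write the code once.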

Oh, that's trivial to break. I just put my AIXI-t1 (whatever that is) in
a human body and send it via rocket-ship... There would be no way to clone
this being so you would have no way to carry out the test.

-- 
I WANT A DEC ALPHA!!! =)
21364: THE UNDISPUTED GOD OF ALL CPUS.
http://users.rcn.com/alangrimes/
[if rcn.com doesn't work, try erols.com ]




Re: [agi] Breaking AIXI-tl

2003-02-15 Thread Eliezer S. Yudkowsky
Ben Goertzel wrote:
 In a naturalistic universe, where there is no sharp boundary between
 the physics of you and the physics of the rest of the world, the
 capability to invent new top-level internal reflective choices can be
 very important, pragmatically, in terms of properties of distant
 reality that directly correlate with your choice to your benefit, if
 there's any breakage at all of the Cartesian boundary - any
 correlation between your mindstate and the rest of the environment.

 Unless you are vastly smarter than the rest of the universe.  Then you
 can proceed like an AIXItl and there is no need for top-level internal
 reflective choices ;)

Actually, even if you are vastly smarter than the rest of the entire 
universe, you may still be stuck dealing with lesser entities (though not 
humans; superintelligences at least) who have any information at all about 
your initial conditions, unless you can make top-level internal reflective 
choices.

The chance that environmental superintelligences will cooperate with you 
in PD situations may depend on *their* estimate of *your* ability to 
generalize over the choice to defect and realize that a similar temptation 
exists on both sides.  In other words, it takes a top-level internal 
reflective choice to adopt a cooperative ethic on the one-shot complex PD 
rather than blindly trying to predict and outwit the environment for 
maximum gain, which is built into the definition of AIXI-tl's control 
process.  A superintelligence may cooperate with a comparatively small, 
tl-bounded AI, but be unable to cooperate with an AIXI-tl, provided there 
is any inferrable information about initial conditions.  In one sense 
AIXI-tl wins; it always defects, which formally is a better choice 
than cooperating on the oneshot PD, regardless of what the opponent does - 
assuming that the environment is not correlated with your decisionmaking 
process.  But anyone who knows that assumption is built into AIXI-tl's 
initial conditions will always defect against AIXI-tl.  A small, 
tl-bounded AI that can make reflective choices has the capability of 
adopting a cooperative ethic; provided that both entities know or infer 
something about the other's initial conditions, they can arrive at a 
knowably correlated reflective choice to adopt cooperative ethics.

AIXI-tl can learn the iterated PD, of course; just not the oneshot complex PD.
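
To make the distinction concrete, here is a minimal sketch (Python, with the standard assumed payoff values) of why an uncorrelated reward-maximizer always defects on the one-shot PD, while an agent that can recognize its choice as correlated with its clone's does better by cooperating:

    # (my_move, their_move) -> my_reward, under the usual assumed PD payoffs.
    PAYOFF = {
        ("C", "C"): 3,
        ("C", "D"): 0,
        ("D", "C"): 5,
        ("D", "D"): 1,
    }

    def best_uncorrelated_move():
        # Treat the opponent's move as an unknown constant: maximin picks defection,
        # which in fact dominates (5 > 3 against C, 1 > 0 against D).
        return max("CD", key=lambda me: min(PAYOFF[(me, them)] for them in "CD"))

    def reward_against_correlated_clone(move):
        # If the other player's choice is knowably correlated with mine,
        # we both end up playing `move`.
        return PAYOFF[(move, move)]

    assert best_uncorrelated_move() == "D"
    assert reward_against_correlated_clone("C") > reward_against_correlated_clone("D")

The iterated game needs no such reflection: repeated feedback alone teaches that defection costs future reward, which is why it poses no problem for AIXI-tl.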

--
Eliezer S. Yudkowsky  http://singinst.org/
Research Fellow, Singularity Institute for Artificial Intelligence



Re: [agi] Breaking AIXI-tl

2003-02-15 Thread Eliezer S. Yudkowsky
Ben Goertzel wrote:

AIXI-tl can learn the iterated PD, of course; just not the
oneshot complex PD.


But if it's had the right prior experience, it may have an operating program
that is able to deal with the oneshot complex PD... ;-)


Ben, I'm not sure AIXI is capable of this.  AIXI may inexorably predict 
the environment and then inexorably try to maximize reward given 
environment.  The reflective realization that *your own choice* to follow 
that control procedure is correlated with a distant entity's choice not to 
cooperate with you may be beyond AIXI.  If it was the iterated PD, AIXI 
would learn how a defection fails to maximize reward over time.  But can 
AIXI understand, even in theory, regardless of what its internal programs 
simulate, that its top-level control function fails to maximize the a 
priori propensity of other minds with information about AIXI's internal 
state to cooperate with it, on the *one* shot PD?  AIXI can't take the 
action it needs to learn the utility of...

--
Eliezer S. Yudkowsky  http://singinst.org/
Research Fellow, Singularity Institute for Artificial Intelligence



RE: [agi] Breaking AIXI-tl

2003-02-15 Thread Ben Goertzel

I guess that for AIXI to learn this sort of thing, it would have to be
rewarded for understanding AIXI in general, for proving theorems about AIXI,
etc.  Once it had learned this, it might be able to apply this knowledge in
the one-shot PD context... But I am not sure.

ben




Re: [agi] Breaking AIXI-tl

2003-02-15 Thread Brad Wyble

 I guess that for AIXI to learn this sort of thing, it would have to be
 rewarded for understanding AIXI in general, for proving theorems about AIXI,
 etc.  Once it had learned this, it might be able to apply this knowledge in
 the one-shot PD context... But I am not sure.
 

For those of us who have missed a critical message or two in this weekend's lengthy 
exchange, can you explain briefly the one-shot complex PD?  I'm unsure how a program 
could evaluate and learn to predict the behavior of its opponent if it only gets 
1-shot.  Obviously I'm missing something.

-Brad






RE: [agi] Breaking AIXI-tl

2003-02-14 Thread Ben Goertzel

 Really, when has a computer (with the exception of certain Microsoft
 products) ever been able to disobey its human masters?

 It's easy to get caught up in the romance of superpowers, but come on,
 there's nothing to worry about.

 -Daniel

Hi Daniel,

Clearly there is nothing to worry about TODAY.

And I'm spending the vast bulk of my time working on practical AI design and
engineering and application work, not on speculating about the future.

However, I do believe that once AI tech has advanced far enough, there WILL
be something to worry about.

How close we are to this point is another question.

Current AI practice is very far away from achieving autonomous general
intelligence.

If I'm right about the potential of Novamente and similar designs, we could
be within a decade of getting there...

If I'm wrong, well, Kurzweil has made some decent arguments why we'll get
there by 2050 or so... ;-)

-- Ben Goertzel




RE: [agi] Breaking AIXI-tl

2003-02-14 Thread Ben Goertzel

 Even if a (grown) human is playing PD2, it outperforms AIXI-tl playing
 PD2.

Well, in the long run, I'm not at all sure this is the case.  You haven't
proved this to my satisfaction.

In the short run, it certainly is the case.  But so what?  AIXI-tl is damn
slow at learning, we know that.

The question is whether after enough trials AIXI-tl figures out it's playing
some entity similar to itself and learns how to act accordingly... If so,
then it's doing what AIXI-tl is supposed to do.

A human can also learn to solve vision recognition problems faster than
AIXI-tl, because we're wired for it (as we're wired for social gameplaying),
whereas AIXI-tl has to learn...


 Humans can recognize a much stronger degree of similarity in human Other
 Minds than AIXI-tl's internal processes are capable of recognizing in any
 other AIXI-tl.

I don't believe that is true.

 Again, as far as I can tell, this
 necessarily requires abstracting over your own internal state and
 recognizing that the outcome of your own (internal) choices are
 necessarily reproduced by a similar computation elsewhere.
 Basically, it
 requires abstracting over your own halting problem to realize that the
 final result of your choice is correlated with that of the process
 simulated, even though you can't fully simulate the causal process
 producing the correlation in advance.  (This doesn't *solve* your own
 halting problem, but at least it enables you to *understand* the
 situation
 you've been put into.)  Except that instead of abstracting over your own
 halting problem, you're abstracting over the process of trying to
 simulate
 another mind trying to simulate you trying to simulate it, where
 the other
 mind is sufficiently similar to your own.  This is a kind of reasoning
 qualitatively closed to AIXI-tl; its control process goes on abortively
 trying to simulate the chain of simulations forever, stopping and
 discarding that prediction as unuseful as soon as it exceeds the t-bound.

OK... here's where the fact that you have a tabula rasa AIXI-tl in a very
limiting environment comes in.

In a richer environment, I don't see why AIXI-tl, after a long enough time,
couldn't learn an operating program that implicitly embodied an abstraction
over its own internal state.

In an environment consisting solely of PD2, it may be that AIXI-tl will
never have the inspiration to learn this kind of operating program.  (I'm
not sure.)

To me, this says mostly that PD2 is an inadequate environment for any
learning system to use, to learn how to become a mind.  If it ain't good
enough for AIXI-tl to use to learn how to become a mind, over a very long
period of time, it probably isn't good for any AI system to use to learn how
to become a mind.

 Anyway... basically, if you're in a real-world situation where the other
 intelligence has *any* information about your internal state, not just
 from direct examination, but from reasoning about your origins, then that
 also breaks the formalism and now a tl-bounded seed AI can outperform
 AIXI-tl on the ordinary (non-quined) problem of cooperation with a
 superintelligence.  The environment can't ever *really* be constant and
 completely separated as Hutter requires.  A physical environment that
 gives rise to an AIXI-tl is different from the environment that
 gives rise
 to a tl-bounded seed AI, and the different material implementations of
 these entities (Lord knows how you'd implement the AIXI-tl) will have
 different side effects, and so on.  All real world problems break the
 Cartesian assumption.  The questions "But are there any kinds of problems
 for which that makes a real difference?" and "Does any
 conceivable kind of
 mind do any better?" can both be answered affirmatively.

Welll... I agree with only some of this.

The thing is, an AIXI-tl-driven AI embedded in the real world would have a
richer environment to draw on than the impoverished data provided by PD2.
This AI would eventually learn how to model itself and reflect in a rich way
(by learning the right operating program).

However, AIXI-tl is a horribly bad AI algorithm, so it would take a VERY
VERY long time to carry out this learning, of course...

-- Ben




Re: [agi] Breaking AIXI-tl

2003-02-14 Thread Eliezer S. Yudkowsky
Ben Goertzel wrote:
 Even if a (grown) human is playing PD2, it outperforms AIXI-tl
 playing PD2.

 Well, in the long run, I'm not at all sure this is the case.  You
 haven't proved this to my satisfaction.

PD2 is very natural to humans; we can take for granted that humans excel
at PD2.  The question is AIXI-tl.

 In the short run, it certainly is the case.  But so what?  AIXI-tl is
 damn slow at learning, we know that.

AIXI-tl is most certainly not damn slow at learning any environment that
can be tl-bounded.  For problems that don't break the Cartesian formalism,
AIXI-tl learns only slightly slower than the fastest possible tl-bounded
learner.  It's got t2^l computing power for gossakes!  From our
perspective it learns faster than the fastest rate humanly imaginable -
literally.

You appear to be thinking of AIXI-tl as a fuzzy little harmless baby being
confronted with some harsh trial.  That fuzzy little harmless baby, if the
tl-bound is large enough to simulate Lee Corbin, is wielding something
like 10^10^15 operations per second, which it is using to *among other
things* simulate every imaginable human experience.  AIXI-tl is larger
than universes; it contains all possible tl-bounded heavens and all 
possible tl-bounded hells.  The only question is whether its control 
process makes any good use of all that computation.

More things from the list of system properties that Friendliness 
programmers should sensitize themselves to:  Just because the endless 
decillions of alternate Ben Goertzels in torture chambers are screaming to 
God to stop it doesn't mean that AIXI-tl's control process cares.

 The question is whether after enough trials AIXI-tl figures out it's
 playing some entity similar to itself and learns how to act
 accordingly  If so, then it's doing what AIXI-tl is supposed to do.

AIXI-tl *cannot* figure this out because its control process is not
capable of recognizing tl-computable transforms of its own policies and
strategic abilities, *only* tl-computable transforms of its own direct
actions.  Yes, it simulates entities who know this; it also simulates
every possible other kind of tl-bounded entity.  The question is whether
that internal knowledge appears as an advantage recognized by the control
process, and given AIXI-tl's formal definition, it does not appear to do so.

In my humble opinion, one of the (many) critical skills for creating AI is
learning to recognize what systems *really actually do* and not just what
you project onto them.  See also Eliza effect, failure of GOFAI, etc.

 A human can also learn to solve vision recognition problems faster than
  AIXI-tl, because we're wired for it (as we're wired for social
 gameplaying), whereas AIXI-tl has to learn

AIXI-tl learns vision *instantly*.  The Kolmogorov complexity of a visual
field is much less than its raw string, and the compact representation can
be computed by a tl-bounded process.  It develops a visual cortex on the
same round it sees its first color picture.
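
As a toy illustration of the compressibility point (an assumed example, not anything from Hutter's construction), even naive run-length encoding shows how much shorter than its raw string the description of a highly regular visual field can be:

    def run_length_encode(pixels):
        """Collapse consecutive identical pixel values into (value, count) runs."""
        runs, last, count = [], pixels[0], 0
        for p in pixels:
            if p == last:
                count += 1
            else:
                runs.append((last, count))
                last, count = p, 1
        runs.append((last, count))
        return runs

    flat_image = [0] * 10_000 + [255] * 10_000    # 20,000 raw pixel values
    encoded = run_length_encode(flat_image)        # just two runs
    assert encoded == [(0, 10_000), (255, 10_000)]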

 Humans can recognize a much stronger degree of similarity in human
 Other Minds than AIXI-tl's internal processes are capable of
 recognizing in any other AIXI-tl.

 I don't believe that is true.

Mentally simulate the abstract specification of AIXI-tl instead of using
your intuitions about the behavior of a generic reinforcement process. 
Eventually the results you learn will be integrated into your intuitions 
and you'll be able to directly see dependencies between specifications and 
reflective modeling abilities.

 OK... here's where the fact that you have a tabula rasa AIXI-tl in a
 very limiting environment comes in.

 In a richer environment, I don't see why AIXI-tl, after a long enough
 time, couldn't learn an operating program that implicitly embodied an
 abstraction over its own internal state.

Because it is physically or computationally impossible for a tl-bounded 
program to access or internally reproduce the previously computed policies 
or t2^l strategic ability of AIXI-tl.

 In an environment consisting solely of PD2, it may be that AIXI-tl will
 never have the inspiration to learn this kind of operating program.
 (I'm not sure.)

 To me, this says mostly that PD2 is an inadequate environment for any
 learning system to use, to learn how to become a mind.  If it ain't
 good enough for AIXI-tl to use to learn how to become a mind, over a
 very long period of time, it probably isn't good for any AI system to
 use to learn how to become a mind.

Marcus Hutter has formally proved your intuitions wrong.  In any situation 
that does *not* break the formalism, AIXI-tl learns to equal or outperform 
any other process, despite being a tabula rasa, no matter how rich or poor 
its environment.

 Anyway... basically, if you're in a real-world situation where the
 other intelligence has *any* information about your internal state,
 not just from direct examination, but from reasoning about your
 origins, then that also breaks the formalism and now a 

Re: [agi] Breaking AIXI-tl

2003-02-14 Thread Eliezer S. Yudkowsky
Bill Hibbard wrote:

On Fri, 14 Feb 2003, Eliezer S. Yudkowsky wrote:


It *could* do this but it *doesn't* do this.  Its control process is such
that it follows an iterative trajectory through chaos which is forbidden
to arrive at a truthful solution, though it may converge to a stable
attractor.


This is the heart of the fallacy. Neither a human nor an AIXI
can know that his synchronized other self - whichever one
he is - is doing the same. All a human or an AIXI can know is
its observations. They can estimate but not know the intentions
of other minds.


The halting problem establishes that you can never perfectly understand 
your own decision process well enough to predict its decision in advance, 
because you'd have to take into account the decision process including the 
prediction, et cetera, establishing an infinite regress.

However, Corbin doesn't need to know absolutely that his other self is 
synchronized, nor does he need to know his other self's decision in 
advance.  Corbin only needs to establish a probabilistic estimate, good 
enough to guide his actions, that his other self's decision is correlated 
with his *after* the fact.  (I.e., it's not a halting problem where you 
need to predict yourself in advance; you only need to know your own 
decision after the fact.)

AIXI-tl is incapable of doing this for complex cooperative problems 
because its decision process only models tl-bounded things and AIXI-tl is 
not *remotely close* to being tl-bounded.  Humans can model minds much 
closer to their own size than AIXI-tl can.  Humans can recognize when 
their policies, not just their actions, are reproduced.  We can put 
ourselves in another human's shoes imperfectly; AIXI-tl can't put itself 
in another AIXI-tl's shoes to the extent of being able to recognize the 
actions of an AIXI-tl computed using a process that is inherently t2^l 
large.  Humans can't recognize their other selves perfectly but the gap in 
the case of AIXI-tl is enormously greater.  (Humans also have a reflective 
control process on which they can perform inductive and deductive 
generalizations and jump over a limited class of infinite regresses in 
decision processes, but that's a separate issue.  Suffice it to say that a 
subprocess which generalizes over its own infinite regress does not 
obviously suffice for AIXI-tl to generalize over the top-level infinite 
regress in AIXI-tl's control process.)

Let's say that AIXI-tl takes action A in round 1, action B in round 2, and 
action C in round 3, and so on up to action Z in round 26.  There's no 
obvious reason for the sequence {A...Z} to be predictable *even 
approximately* by any of the tl-bounded processes AIXI-tl uses for 
prediction.  Any given action is the result of a tl-bounded policy but the 
*sequence* of *different* tl-bounded policies was chosen by a t2^l process.

A human in the same situation has a mnemonic record of the sequence of 
policies used to compute their strategies, and can recognize correlations 
between the sequence of policies and the other agent's sequence of 
actions, which can then be confirmed by directing O(other-agent) strategic 
processing power at the challenge of seeing the problem from the opposite 
perspective.  AIXI-tl is physically incapable of doing this directly and 
computationally incapable of doing it indirectly.  This is not an attack 
on the computability of intelligence; the human is doing something 
perfectly computable which AIXI-tl does not do.

--
Eliezer S. Yudkowsky  http://singinst.org/
Research Fellow, Singularity Institute for Artificial Intelligence



Re: [agi] Breaking AIXI-tl

2003-02-14 Thread Michael Roy Ames
Eliezer S. Yudkowsky asked Ben Goertzel:

  Do you have a non-intuitive mental simulation mode?


LOL  --#:^D

It *is* a valid question, Eliezer, but it makes me laugh.

Michael Roy Ames
[Who currently estimates his *non-intuitive mental simulation mode* to
contain about 3 iterations of 5 variables each - 8 variables each on a
good day.  Each variable can link to a concept (either complex or
simple)... and if that sounds to you like something that a trashed-out
Commodore 64 could emulate, then you have some idea how he feels being
stuck at his current level of non-intuitive intelligence.]





RE: [agi] Breaking AIXI-tl

2003-02-14 Thread Ben Goertzel


I'll read the rest of your message tomorrow...

 But we aren't *talking* about whether AIXI-tl has a mindlike operating
 program.  We're talking about whether the physically realizable
 challenge,
 which definitely breaks the formalism, also breaks AIXI-tl in practice.
 That's what I originally stated, that's what you originally said you
 didn't believe, and that's all I'm trying to demonstrate.



Your original statement was posed in a misleading way, perhaps not
intentionally.

There is no challenge on which *an* AIXI-tl doesn't outperform *an* uploaded
human.

What you're trying to show is that there's an inter-AIXI-tl social situation
in which AIXI-tl's perform less intelligently than humans do in a similar
inter-human situation.

If you had posed it this way, I wouldn't have been as skeptical initially.

-- Ben




RE: [agi] Breaking AIXI-tl

2003-02-14 Thread Ben Goertzel


Hmmm... My friend, I think you've pretty much convinced me with this last
batch of arguments.  Or, actually, I'm not sure if it was your excellently
clear arguments or the fact that I finally got a quiet 15 minutes to really
think about it (the three kids, who have all been out sick from school with
a flu all week, are all finally in bed ;)

Your arguments are a long way from a rigorous proof, and I can't rule out
that there might be a hole in them, but in this last e-mail you were
explicit enough to convince me that what you're saying makes logical sense.

I'm going to try to paraphrase your argument, let's see if we're somewhere
in the neighborhood of harmony...

Basically: you've got these two clones playing a cooperative game, and each
one, at each turn, is controlled by a certain program.  Each clone chooses
his current operating program by searching the space of all programs of
length < L that finish running in < T timesteps, and finding the one that,
based on his study of prior gameplay, is expected to give him the highest
chance of winning.  But each guy takes on the order of T*2^L timesteps to
perform this search.
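
A minimal toy sketch of that per-turn search loop (Python; the bitstring program encoding, the toy interpreter, and the scoring rule are illustrative assumptions, not Hutter's actual construction):

    from itertools import product

    def all_programs(l):
        """Every bitstring of length 1..l (the 2^(l+1) - 2 candidate programs)."""
        for length in range(1, l + 1):
            for bits in product((0, 1), repeat=length):
                yield bits

    def run_bounded(program, history, t):
        """Toy interpreter: treat the program as a circular action table and
        only ever consult its first t entries; this is the time bound in action."""
        steps = min(t, len(program))
        return program[len(history) % steps]

    def score(program, history, t):
        """How often would this program have picked a move that was rewarded?"""
        total = 0
        for i, (move_taken, reward) in enumerate(history):
            if run_bounded(program, history[:i], t) == move_taken and reward > 0:
                total += 1
        return total

    def choose_operating_program(history, t, l):
        """The brute-force step: every program of length <= l, each run for at
        most t steps (roughly t * 2^l work per turn)."""
        return max(all_programs(l), key=lambda p: score(p, history, t))

    # Example: after two rewarded rounds of playing 1, the search settles on a
    # program that keeps playing 1.
    history = [(1, 1.0), (1, 1.0)]
    best = choose_operating_program(history, t=4, l=3)
    assert run_bounded(best, history, t=4) == 1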

So your basic point is that, because these clones are acting by simulating
programs that finish running in T timesteps, they're not going to be able
to simulate each other very accurately.

Whereas, a pair of clones each possessing a more flexible control algorithm
could perform better in the game.  Because, if a more flexible player wants
to simulate his opponent, he can choose to devote nearly ALL his
thinking-time inbetween moves to simulating his opponent.  Because these
more flexible players are not constrained to a rigid control algorithm that
divides up their time into little bits, simulating a huge number of fast
programs.

AIXItl does not have the flexibility to say "Well, this time interval, I'm
going to keep my operating program the same, and instead of using my time
seeking a new operating program, I'm going to spend most of it trying to
simulate my opponent, or trying to study my opponent."

HOWEVER... it's still quite possible that the AIXItl clones can predict each
other, isn't it?  If one of them keeps running the same operating program
for a while, then the other one should be able to learn an operating program
that responds appropriately to that operating program.  But I can see that
for some cooperative games, it might be unlikely for one of them to keep
running the same operating program for a while... they could just keep
shifting from program to program in response to each other.

 If AIXI-tl needs general intelligence but fails to develop
 general intelligence to solve the complex cooperation problem, while
 humans starting out with general intelligence do solve the problem, then
 AIXI-tl has been broken.

Well, we have different definitions of broken in this context, but that's
not a point worth arguing about.

 But we aren't *talking* about whether AIXI-tl has a mindlike operating
 program.  We're talking about whether the physically realizable
 challenge,
 which definitely breaks the formalism, also breaks AIXI-tl in practice.
 That's what I originally stated, that's what you originally said you
 didn't believe, and that's all I'm trying to demonstrate.

Yes, you would seem to have successfully shown (logically and intuitively,
though not mathematically) that AIXItl's can be dumber in their interactions
with other AIXItl's than humans are in their analogous interactions with
other humans.

I don't think you should describe this as breaking the formalism, because
the formalism is about how a single AIXItl solves a fixed goal function, not
about how groups of AIXItl's interact.

But it's certainly an interesting result.  I hope that, even if you don't
take the time to prove it rigorously, you'll write it up in a brief,
coherent essay, so that others not on this list can appreciate it...

Funky stuff!! ;-)

-- Ben G




Re: [agi] Breaking AIXI-tl

2003-02-14 Thread Eliezer S. Yudkowsky
Ben Goertzel wrote:

 I'll read the rest of your message tomorrow...

 But we aren't *talking* about whether AIXI-tl has a mindlike
 operating program.  We're talking about whether the physically
 realizable challenge, which definitely breaks the formalism, also
 breaks AIXI-tl in practice. That's what I originally stated, that's
 what you originally said you didn't believe, and that's all I'm
 trying to demonstrate.

 Your original statement was posed in a misleading way, perhaps not
 intentionally.

 There is no challenge on which *an* AIXI-tl doesn't outperform *an*
 uploaded human.

We are all Lee Corbin; would you really say there's more than one... oh,
never mind, I don't want to get *that* started here.

There's a physical challenge which operates on *one* AIXI-tl and breaks 
it, even though it involves diagonalizing the AIXI-tl as part of the
challenge.  In the real world, all reality is interactive and
naturalistic, not walled off by a Cartesian theatre.  The example I gave
is probably the simplest case that clearly breaks the formalism and
clearly causes AIXI-tl to operate suboptimally.  There are more complex and
important cases, which we would understand as roughly constant
environmental challenges which break AIXI-tl's formalism in more subtle
ways, with the result that AIXI-tl can't cooperate in one-shot PDs with
superintelligences... and neither can a human, incidentally, but another
seed AI or superintelligence can-I-think, by inventing a new kind of
reflective choice which is guaranteed to be correlated as a result of
shared initial conditions, both elements that break AIXI-tl... well, 
anyway, the point is that there's a qualitatively different kind of 
intelligence here that I think could turn out to be extremely critical in 
negotiations among superintelligences.  The formalism in this situation 
gets broken, depending on how you're looking at it, by side effects of the 
AIXI-tl's existence or by violation of the separability condition. 
Actually, violations of the formalism are ubiquitous and this is not 
particularly counterintuitive; what is counterintuitive is that formalism 
violations turn out to make a real-world difference.

Are we at least in agreement on the fact that there exists a formalizable 
constant challenge C which accepts an arbitrary single agent and breaks 
both the AIXI-tl formalism and AIXI-tl?

Reads Ben Goertzel's other message, while working on this one.

OK.

We'd better take a couple of days off before taking up the AIXI 
Friendliness issue.  Maybe even wait until I get back from New York in a 
week.  Also, I want to wait for all these emails to show up in the AGI 
archive, then tell Marcus Hutter about them if no one has already.  I'd be 
interested in seeing what he thinks.

 What you're trying to show is that there's an inter-AIXI-tl social
 situation in which AIXI-tl's perform less intelligently than humans do
 in a similar inter-human situation.

 If you had posed it this way, I wouldn't have been as skeptical
 initially.

If I'd posed it that way, it would have been uninteresting because I
wouldn't have broken the formalism.  Again, to quote my original claim:

 1)  There is a class of physically realizable problems, which humans
 can solve easily for maximum reward, but which - as far as I can tell
 - AIXI cannot solve even in principle;

 I don't see this, nor do I believe it...

And later expanded to:

 An intuitively fair, physically realizable challenge, with important
 real-world analogues, formalizable as a computation which can be fed
 either a tl-bounded uploaded human or an AIXI-tl, for which the human
 enjoys greater success measured strictly by total reward over time, due
 to the superior strategy employed by that human as the result of
 rational reasoning of a type not accessible to AIXI-tl.

It's really the formalizability of the challenge as a computation which 
can be fed either a *single* AIXI-tl or a *single* tl-bounded uploaded 
human that makes the whole thing interesting at all... I'm sorry I didn't 
succeed in making clear the general class of real-world analogues for 
which this is a special case.

If I were to take a very rough stab at it, it would be that the 
cooperation case with your own clone is an extreme case of many scenarios 
where superintelligences can cooperate with each other on the one-shot 
Prisoner's Dilemma provided they have *loosely similar* reflective goal 
systems and that they can probabilistically estimate that enough loose 
similarity exists.

It's the natural counterpart of the Clone challenge - loosely similar goal 
systems arise all the time, and it turns out that in addition to those 
goal systems being interpreted as a constant environmental challenge, 
there are social problems that depend on your being able to correlate your 
internal processes with theirs (you can correlate internal processes 
because you're both part of the same naturalistic universe).  This breaks 
AIXI-tl because it's not loosely 

Re: [agi] Breaking AIXI-tl

2003-02-14 Thread Eliezer S. Yudkowsky
Eliezer S. Yudkowsky wrote:


But if this isn't immediately obvious to you, it doesn't seem like a top 
priority to try and discuss it...

Argh.  That came out really, really wrong and I apologize for how it 
sounded.  I'm not very good at agreeing to disagree.

Must... sleep...

--
Eliezer S. Yudkowsky  http://singinst.org/
Research Fellow, Singularity Institute for Artificial Intelligence



Re: [agi] Breaking AIXI-tl

2003-02-12 Thread Eliezer S. Yudkowsky
Shane Legg wrote:


Eliezer,

Yes, this is a clever argument.  This problem with AIXI has been
thought up before but only appears, at least as far as I know, in
material that is currently unpublished.  I don't know if anybody
has analysed the problem in detail as yet... but it certainly is
a very interesting question to think about:

What happens when two super intelligent AIXI's meet?


SI-AIXI is redundant; all AIXIs are enormously far beyond 
superintelligent.  As for the problem, the obvious answer is that no 
matter what strange things happen, an AIXI^2 which performs Solomonoff^2 
induction, using the universal prior of strings output by first-order 
Oracle machines, will come up with the best possible strategy for handling 
it...

Has the problem been thought up just in the sense of "What happens when 
two AIXIs meet?" or in the formalizable sense of "Here's a computational 
challenge C on which a tl-bounded human upload outperforms AIXI-tl"?

--
Eliezer S. Yudkowsky  http://singinst.org/
Research Fellow, Singularity Institute for Artificial Intelligence



Re: [agi] Breaking AIXI-tl

2003-02-12 Thread Shane Legg
Eliezer S. Yudkowsky wrote:


Has the problem been thought up just in the sense of "What happens when 
two AIXIs meet?" or in the formalizable sense of "Here's a computational 
challenge C on which a tl-bounded human upload outperforms AIXI-tl"?

I don't know of anybody else considering human upload vs. AIXI.

Cheers
Shane



Re: [agi] Breaking AIXI-tl

2003-02-12 Thread Bill Hibbard
Hi Eliezer,

 An intuitively fair, physically realizable challenge, with important
 real-world analogues, formalizable as a computation which can be fed
 either a tl-bounded uploaded human or an AIXI-tl, for which the human
 enjoys greater success measured strictly by total reward over time, due to
 the superior strategy employed by that human as the result of rational
 reasoning of a type not accessible to AIXI-tl.

 Roughly speaking:

 A (selfish) human upload can engage in complex cooperative strategies with
 an exact (selfish) clone, and this ability is not accessible to AIXI-tl,
 since AIXI-tl itself is not tl-bounded and therefore cannot be simulated
 by AIXI-tl, nor does AIXI-tl have any means of abstractly representing the
 concept a copy of myself.  Similarly, AIXI is not computable and
 therefore cannot be simulated by AIXI.  Thus both AIXI and AIXI-tl break
 down in dealing with a physical environment that contains one or more
 copies of them.  You might say that AIXI and AIXI-tl can both do anything
 except recognize themselves in a mirror.

Why do you require an AIXI or AIXI-tl to simulate itself, when
humans cannot? A human cannot know that another human is an
exact clone of itself. All humans or AIXIs can know is what
they observe. They cannot know that another mind is identical.

 The simplest case is the one-shot Prisoner's Dilemma against your own
 exact clone.  It's pretty easy to formalize this challenge as a
 computation that accepts either a human upload or an AIXI-tl.  This
 obviously breaks the AIXI-tl formalism.  Does it break AIXI-tl?  This
 question is more complex than you might think.  For simple problems,
 there's a nonobvious way for AIXI-tl to stumble onto incorrect hypotheses
 which imply cooperative strategies, such that these hypotheses are stable
 under the further evidence then received.  I would expect there to be
 classes of complex cooperative problems in which the chaotic attractor
 AIXI-tl converges to is suboptimal, but I have not proved it.  It is
 definitely true that the physical problem breaks the AIXI formalism and
 that a human upload can straightforwardly converge to optimal cooperative
 strategies based on a model of reality which is more correct than any
 AIXI-tl is capable of achieving.

Given that humans can only know what they observe, and
thus cannot know what is going on inside another mind,
humans are on the same footing as AIXIs in Prisoner's
Dilemma. I suspect that two AIXIs or AIXI-tl's will do
well at the game, since a strategy with betrayal probably
needs a longer program than a strategy without betrayal,
and the AIXI will weight more strongly a model of the
other's behavior with a shorter program.
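
A small sketch of the length-weighting being appealed to here (the description lengths in bits are made-up numbers, purely for illustration):

    def universal_prior_weight(program_length_bits):
        # Solomonoff-style prior: a program of length k bits gets weight 2^-k.
        return 2.0 ** (-program_length_bits)

    # Hypothetical model lengths for the opponent's behavior.
    models = {
        "always-cooperate":      12,
        "cooperate-then-betray": 20,   # extra bits to encode the betrayal condition
    }

    weights = {name: universal_prior_weight(bits) for name, bits in models.items()}
    total = sum(weights.values())
    posterior = {name: w / total for name, w in weights.items()}
    # With these made-up lengths the shorter model carries about 99.6% of the weight.

Whether a betrayal strategy really does need the longer program is, of course, the "probably" in the paragraph above.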

 Ultimately AIXI's decision process breaks down in our physical universe
 because AIXI models an environmental reality with which it interacts,
 instead of modeling a naturalistic reality within which it is embedded.
 It's one of two major formal differences between AIXI's foundations and
 Novamente's.  Unfortunately there is a third foundational difference
 between AIXI and a Friendly AI.

I will grant you one thing: that since an AIXI cannot
exist and an AIXI-tl is too slow to be practical, using
them as a basis for discussing safe AGIs is a bit futile.

The other problem is that an AIXI's optimality is only as
valid as its assumption about the probability distribution
of universal Turing machine programs.

Cheers,
Bill
--
Bill Hibbard, SSEC, 1225 W. Dayton St., Madison, WI  53706
[EMAIL PROTECTED]  608-263-4427  fax: 608-263-6738
http://www.ssec.wisc.edu/~billh/vis.html
