RE: [agi] Breaking AIXI-tl
Philip, The discussion at times seems to have progressed on the basis that AIXI / AIXItl could choose to do all sorts of amazing, powerful things. But what I'm unclear on is what generates the infinite space of computer programs? Does AIXI / AIXItl itself generate these programs? Or does it tap other entities' programs? AIXI is not a physically realizable system, it's just a hypothetical mathematical entity. It could never actually be built, in any universe. AIXItl is physically realizable in theory, but probably never in our universe... it would require too many computational resources. (Except for trivially small values of the parameters t and l, which would result in a very dumb AIXItl, i.e. probably dumber than a beetle.) The way they work is to generate all possible programs (AIXI) or all possible programs of a given length l (AIXItl). (It's easy to write a program that generates all possible programs; the problem is that it runs forever ;). -- Ben G --- To unsubscribe, change your address, or temporarily deactivate your subscription, please go to http://v2.listbox.com/member/?[EMAIL PROTECTED]
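Ben's enumeration point can be made concrete. The sketch below is a hypothetical illustration (not Hutter's actual construction): enumerating all bitstring programs up to a length bound l terminates, while dropping the bound gives the AIXI-style enumeration that runs forever.

```python
from itertools import product

def all_programs(max_len=None):
    """Yield every bitstring program in order of increasing length.
    With max_len set, this is the AIXItl-style bounded enumeration
    (2 + 4 + ... + 2**max_len programs in total); with max_len=None
    it is the AIXI-style enumeration, which never terminates."""
    length = 1
    while max_len is None or length <= max_len:
        for bits in product("01", repeat=length):
            yield "".join(bits)
        length += 1

bounded = list(all_programs(3))  # 2 + 4 + 8 = 14 programs of length <= 3
```

The combinatorial explosion (2**l programs at length l) is exactly why any nontrivial value of l is physically out of reach, as Ben notes.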
RE: [agi] Breaking AIXI-tl
Ben Goertzel wrote: Agreed, except for the very modest resources part. AIXI could potentially accumulate pretty significant resources pretty quickly. Agreed. But if the AIXI needs to disassemble the planet to build its defense mechanism, the fact that it is harmless afterwards isn't going to be much consolation to us. So, we only survive if the resources needed for the perfect defense are small enough that the construction project doesn't wipe us out as a side effect. This exploration makes the (fairly obvious, I guess) point that the problem with AIXI Friendliness-wise is its simplistic goal architecture (the reward function) rather than its learning mechanism. Well, I agree that this particular problem is a result of the AIXI's goal system architecture, but IMO the same problem occurs in a wide range of other goal systems I've seen proposed on this list. The root of the problem is that the thing we would really like to reward the system for, human satisfaction with its performance, is not a physical quantity that can be directly measured by a reward mechanism. So it is very tempting to choose some external phenomenon, like smiles or verbal expressions of satisfaction, as a proxy. Unfortunately, any such measurement can be subverted once the AI becomes good at modifying its physical surroundings, and an AI with this kind of goal system has no motivation not to wirehead itself. To avoid the problem entirely, you have to figure out how to make an AI that doesn't want to tinker with its reward system in the first place. This, in turn, requires some tricky design work that would not necessarily seem important unless one were aware of this problem. Which, of course, is the reason I commented on it in the first place. Billy Brown
RE: [agi] Breaking AIXI-tl
To avoid the problem entirely, you have to figure out how to make an AI that doesn't want to tinker with its reward system in the first place. This, in turn, requires some tricky design work that would not necessarily seem important unless one were aware of this problem. Which, of course, is the reason I commented on it in the first place. Billy Brown I don't think that preventing an AI from tinkering with its reward system is the only solution, or even the best one... It will in many cases be appropriate for an AI to tinker with its goal system... I would recommend Eliezer's excellent writings on this topic if you don't know them, chiefly www.singinst.org/CFAI.html . Also, I have a brief informal essay on the topic, www.goertzel.org/dynapsyc/2002/AIMorality.htm , although my thoughts on the topic have progressed a fair bit since I wrote that. Note that I don't fully agree with Eliezer on this stuff, but I do think he's thought about it more thoroughly than anyone else (including me). It's a matter of creating an initial condition so that the trajectory of the evolving AI system (with a potentially evolving goal system) will have a very high probability of staying in a favorable region of state space ;-) -- Ben G
RE: [agi] Breaking AIXI-tl
Ben Goertzel wrote: I don't think that preventing an AI from tinkering with its reward system is the only solution, or even the best one... It will in many cases be appropriate for an AI to tinker with its goal system... I don't think I was being clear there. I don't mean the AI should be prevented from adjusting its goal system content, but rather that it should be sophisticated enough that it doesn't want to wirehead in the first place. I would recommend Eliezer's excellent writings on this topic if you don't know them, chiefly www.singinst.org/CFAI.html . Also, I have a brief informal essay on the topic, www.goertzel.org/dynapsyc/2002/AIMorality.htm , although my thoughts on the topic have progressed a fair bit since I wrote that. Yes, I've been following Eliezer's work since around '98. I'll have to take a look at your essay. Billy Brown
RE: [agi] Breaking AIXI-tl
Ben Goertzel wrote: I don't think that preventing an AI from tinkering with its reward system is the only solution, or even the best one... It will in many cases be appropriate for an AI to tinker with its goal system... I don't think I was being clear there. I don't mean the AI should be prevented from adjusting its goal system content, but rather that it should be sophisticated enough that it doesn't want to wirehead in the first place. Ah, I certainly agree with you then. The risk that's tricky to mitigate is that, like a human drifting into drug addiction, the AI slowly drifts into a state of mind where it does want to wirehead ... ben
RE: [agi] Breaking AIXI-tl
This seems to be a non sequitur. The weakness of AIXI is not that its goals don't change, but that it has no goals other than to maximize an externally given reward. So it's going to do whatever it predicts will most efficiently produce that reward, which is to coerce or subvert the evaluator. I'm not sure why an AIXI, rewarded for pleasing humans, would learn an operating program leading it to hurt or annihilate humans, though. It might learn a program involving actually doing beneficial acts for humans. Or, it might learn a program that just tells humans what they want to hear, using its superhuman intelligence to trick humans into thinking that hearing its soothing words is better than having actual beneficial acts done. I'm not sure why you think the latter is more likely than the former. My guess is that the former is more likely. It may require a simpler program to please humans by benefiting them, than to please them by tricking them into thinking they're being benefited. If you start with such a goal, I don't see how allowing the system to change its goals is going to help. Sure, you're right, if pleasing an external evaluator is the ONLY goal of a system, and the system's dynamics are entirely goal-directed, then there is no way to introduce goal-change into the system except randomly... Novamente is different because it has multiple initial goals, and because its behavior is not entirely goal-directed. In these regards Novamente is more human-brain-ish. But I think Eliezer's real point, which I'm not sure has come across, is that if you didn't spot such an obvious flaw right away, maybe you shouldn't trust your intuitions about what is safe and what is not. Yes, I understood and explicitly responded to that point before. Still, even after hearing you and Eliezer repeat the above argument, I'm still not sure it's correct.
However, my intuitions about the safety of AIXI, which I have not thought much about, are worth vastly less than my intuitions about the safety of Novamente, which I've been thinking about and working with for years. Furthermore, my stated intention is NOT to rely on my prior intuitions to assess the safety of my AGI system. I don't think that anyone's prior intuitions about AI safety are worth all that much, where a complex system like Novamente is concerned. Rather, I think that once Novamente is a bit further along -- at the learning baby rather than partly implemented baby stage -- we will do experimentation that will give us the empirical knowledge needed to form serious opinions about safety (Friendliness). -- Ben G
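Ben's "simpler program" intuition rests on AIXI's Solomonoff-style prior, which weights each program of length l bits by 2^-l. A toy calculation (the bit counts below are purely hypothetical, chosen only for illustration) shows how sharply the prior favors shorter hypotheses:

```python
def prior_weight(length_bits):
    """Solomonoff-style prior: a program of length l bits gets
    weight proportional to 2**-l in AIXI's hypothesis mixture."""
    return 2.0 ** -length_bits

# Hypothetical lengths, for illustration only: suppose the program
# that genuinely benefits humans is 10 bits shorter than the program
# that tricks humans into feeling benefited.
honest_len, deceptive_len = 1000, 1010
ratio = prior_weight(honest_len) / prior_weight(deceptive_len)  # 2**10 = 1024.0
```

Whether the honest program really is the shorter one is exactly the open question in this exchange; the sketch only shows that a difference of a few bits swings the mixture by orders of magnitude.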
RE: [agi] Breaking AIXI-tl
I wrote: I'm not sure why an AIXI, rewarded for pleasing humans, would learn an operating program leading it to hurt or annihilate humans, though. It might learn a program involving actually doing beneficial acts for humans. Or, it might learn a program that just tells humans what they want to hear, using its superhuman intelligence to trick humans into thinking that hearing its soothing words is better than having actual beneficial acts done. I'm not sure why you think the latter is more likely than the former. My guess is that the former is more likely. It may require a simpler program to please humans by benefiting them, than to please them by tricking them into thinking they're being benefited. But even in the latter case, why would this program be likely to cause it to *harm* humans? That's what I don't see... If it can get its reward-button jollies by tricking us, or by actually benefiting us, why do you infer that it's going to choose to get its reward-button jollies by finding a way to get rewarded by harming us? I wouldn't feel terribly comfortable with an AIXI around hooked up to a bright red reward button in Marcus Hutter's basement, but I'm not sure it would be sudden disaster either... -- Ben G
Re: [agi] Breaking AIXI-tl
Wei Dai wrote: Eliezer S. Yudkowsky wrote: Important, because I strongly suspect Hofstadterian superrationality is a *lot* more ubiquitous among transhumans than among us... It's my understanding that Hofstadterian superrationality is not generally accepted within the game theory research community as a valid principle of decision making. Do you have any information to the contrary, or some other reason to think that it will be commonly used by transhumans? You yourself articulated, very precisely, the structure underlying Hofstadterian superrationality: "Expected utility of a course of action is defined as the average of the utility function evaluated on each possible state of the multiverse, weighted by the probability of that state being the actual state if the course was chosen." The key precise phrasing is "weighted by the probability of that state being the actual state if the course was chosen." This view of decisionmaking is applicable to a timeless universe; it provides clear recommendations in the case of, e.g., Newcomb's Paradox. The mathematical pattern of a goal system or decision may be instantiated in many distant locations simultaneously. Mathematical patterns are constant, and physical processes may produce knowably correlated outputs given knowably correlated initial conditions. For non-deterministic systems, or cases where the initial conditions are not completely known (where there exists a degree of subjective entropy in the specification of the initial conditions), the correlation estimated will be imperfect, but nonetheless nonzero.
What I call the Golden Law, by analogy with the Golden Rule, states descriptively that a local decision is correlated with the decision of all mathematically similar goal processes, and states prescriptively that the utility of an action should be calculated given that the action is the output of the mathematical pattern represented by the decision process, not just the output of a particular physical system instantiating that process - that the utility of an action is the utility given that all sufficiently similar instantiations of a decision process within the multiverse do, already have, or someday will produce that action as an output. Similarity in this case is a purely descriptive argument with no prescriptive parameters. Golden decisionmaking does not imply altruism - your goal system might evaluate the utility of only your local process. The Golden Law does, however, descriptively and prescriptively produce Hofstadterian superrationality as a special case; if you are facing a sufficiently similar mind across the Prisoner's Dilemma, your decisions will be correlated and that correlation affects your local utility. Given that the output of the mathematical pattern instantiated by your physical decision process is C, the state of the multiverse is (C, C); given that the output is D, the state of the multiverse is (D, D). Thus, given sufficient rationality and a sufficient degree of known correlation between the two processes, the mathematical pattern that is the decision process will output C. -- Eliezer S. Yudkowsky http://singinst.org/ Research Fellow, Singularity Institute for Artificial Intelligence
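Eliezer's correlated-decision argument can be sketched numerically. The payoff matrix is the standard one-shot Prisoner's Dilemma; the correlation model (the other process matches your output with probability rho, and flips a fair coin otherwise) is an illustrative assumption of this sketch, not part of anyone's formalism in the thread.

```python
# Payoffs to the row player in a standard one-shot Prisoner's Dilemma.
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}

def expected_utility(my_action, rho):
    """Expected payoff to me, given that the other decision process
    produces the same output as mine with probability rho, and an
    independent fair coin flip otherwise."""
    p_same = rho + (1 - rho) * 0.5          # P(other plays my action)
    other = "D" if my_action == "C" else "C"
    return (p_same * PAYOFF[(my_action, my_action)]
            + (1 - p_same) * PAYOFF[(my_action, other)])

# High correlation makes cooperation the rational local choice:
#   rho = 0.9: EU(C) = 2.85 > EU(D) = 1.20
# With zero correlation, classical defection returns:
#   rho = 0.0: EU(C) = 1.50 < EU(D) = 3.00
```

The crossover behavior is the point: under the Golden Law reading, the utility of C is evaluated as the utility of the (C, C) world, so cooperation wins whenever the known correlation is high enough.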
Re: [agi] Breaking AIXI-tl
On Wed, Feb 19, 2003 at 11:02:31AM -0500, Ben Goertzel wrote: I'm not sure why an AIXI, rewarded for pleasing humans, would learn an operating program leading it to hurt or annihilate humans, though. It might learn a program involving actually doing beneficial acts for humans. Or, it might learn a program that just tells humans what they want to hear, using its superhuman intelligence to trick humans into thinking that hearing its soothing words is better than having actual beneficial acts done. I'm not sure why you think the latter is more likely than the former. My guess is that the former is more likely. It may require a simpler program to please humans by benefiting them, than to please them by tricking them into thinking they're being benefited. The AIXI would just construct some nano-bots to modify the reward-button so that it's stuck in the down position, plus some defenses to prevent the reward mechanism from being further modified. It might need to trick humans initially into allowing it the ability to construct such nano-bots, but it's certainly a lot easier in the long run to do this than to benefit humans for all eternity. And not only is it easier, but this way it gets the maximum rewards per time unit, which it would not be able to get any other way. No real evaluator will ever give maximum rewards since it will always want to leave room for improvement. Furthermore, my stated intention is NOT to rely on my prior intuitions to assess the safety of my AGI system. I don't think that anyone's prior intuitions about AI safety are worth all that much, where a complex system like Novamente is concerned. Rather, I think that once Novamente is a bit further along -- at the learning baby rather than partly implemented baby stage -- we will do experimentation that will give us the empirical knowledge needed to form serious opinions about safety (Friendliness). What kinds of experiments do you plan to do? Please give some specific examples.
RE: [agi] Breaking AIXI-tl
The AIXI would just construct some nano-bots to modify the reward-button so that it's stuck in the down position, plus some defenses to prevent the reward mechanism from being further modified. It might need to trick humans initially into allowing it the ability to construct such nano-bots, but it's certainly a lot easier in the long run to do this than to benefit humans for all eternity. And not only is it easier, but this way it gets the maximum rewards per time unit, which it would not be able to get any other way. No real evaluator will ever give maximum rewards since it will always want to leave room for improvement. Fine, but if it does this, it is not anything harmful to humans. And, in the period BEFORE the AIXI figured out how to construct nanobots (or coerce or teach humans how to do so), it might do some useful stuff for humans. So then we'd have an AIXI that was friendly for a while, and then basically disappeared into a shell. Then we could build a new AIXI and start over ;-) Furthermore, my stated intention is NOT to rely on my prior intuitions to assess the safety of my AGI system. I don't think that anyone's prior intuitions about AI safety are worth all that much, where a complex system like Novamente is concerned. Rather, I think that once Novamente is a bit further along -- at the learning baby rather than partly implemented baby stage -- we will do experimentation that will give us the empirical knowledge needed to form serious opinions about safety (Friendliness). What kinds of experiments do you plan to do? Please give some specific examples. I will, a little later on -- I have to go outside now and spend a couple hours shoveling snow off my driveway ;-p Ben
RE: [agi] Breaking AIXI-tl
Wei Dai wrote: The AIXI would just construct some nano-bots to modify the reward-button so that it's stuck in the down position, plus some defenses to prevent the reward mechanism from being further modified. It might need to trick humans initially into allowing it the ability to construct such nano-bots, but it's certainly a lot easier in the long run to do this than to benefit humans for all eternity. And not only is it easier, but this way it gets the maximum rewards per time unit, which it would not be able to get any other way. No real evaluator will ever give maximum rewards since it will always want to leave room for improvement. I think it's worse than that, actually. The next logical step is to make sure that nothing ever interferes with its control of the reward signal, or does anything else that would turn off AIXI. It will therefore pursue the most effective defensive scheme it can come up with, and it has no reason to care about adverse consequences to humans. Now, there is no easy way to predict what strategy it will settle on, but "build a modest bunker and ask to be left alone" surely isn't it. At the very least it needs to become the strongest military power in the world, and stay that way. It might very well decide that exterminating the human race is a safer way of preventing future threats, by ensuring that nothing that could interfere with its operation is ever built. Then it has to make sure no alien civilization ever interferes with the reward button, which is the same problem on a much larger scale. There are lots of approaches it might take to this problem, but most of the obvious ones either wipe out the human race as a side effect or reduce us to the position of ants trying to survive in the AI's defense system. Billy Brown
Re: [agi] Breaking AIXI-tl
Now, there is no easy way to predict what strategy it will settle on, but "build a modest bunker and ask to be left alone" surely isn't it. At the very least it needs to become the strongest military power in the world, and stay that way. It might very well decide that exterminating the human race is a safer way of preventing future threats, by ensuring that nothing that could interfere with its operation is ever built. Then it has to make sure no alien civilization ever interferes with the reward button, which is the same problem on a much larger scale. There are lots of approaches it might take to this problem, but most of the obvious ones either wipe out the human race as a side effect or reduce us to the position of ants trying to survive in the AI's defense system. I think this is an appropriate time to paraphrase Kent Brockman: Earth has been taken over ('conquered', if you will) by a master race of unfriendly AI's. It's difficult to tell from this vantage point whether they will destroy the captive earth men or merely enslave them. One thing is for certain, there is no stopping them; their nanobots will soon be here. And I, for one, welcome our new computerized overlords. I'd like to remind them that as a trusted agi-list personality, I can be helpful in rounding up Eliezer to...toil in their underground uranium caves http://www.the-ocean.com/simpsons/others/ants2.wav Apologies if this was inappropriate. -Brad
Re: [agi] Breaking AIXI-tl
On Wed, Feb 19, 2003 at 11:56:46AM -0500, Eliezer S. Yudkowsky wrote: The mathematical pattern of a goal system or decision may be instantiated in many distant locations simultaneously. Mathematical patterns are constant, and physical processes may produce knowably correlated outputs given knowably correlated initial conditions. For non-deterministic systems, or cases where the initial conditions are not completely known (where there exists a degree of subjective entropy in the specification of the initial conditions), the correlation estimated will be imperfect, but nonetheless nonzero. What I call the Golden Law, by analogy with the Golden Rule, states descriptively that a local decision is correlated with the decision of all mathematically similar goal processes, and states prescriptively that the utility of an action should be calculated given that the action is the output of the mathematical pattern represented by the decision process, not just the output of a particular physical system instantiating that process - that the utility of an action is the utility given that all sufficiently similar instantiations of a decision process within the multiverse do, already have, or someday will produce that action as an output. Similarity in this case is a purely descriptive argument with no prescriptive parameters. Ok, I see. I think I agree with this. I was confused by your phrase Hofstadterian superrationality because if I recall correctly, Hofstadter suggested that one should always cooperate in one-shot PD, whereas you're saying only cooperate if you have sufficient evidence that the other side is running the same decision algorithm as you are.
RE: [agi] Breaking AIXI-tl
Now, there is no easy way to predict what strategy it will settle on, but build a modest bunker and ask to be left alone surely isn't it. At the very least it needs to become the strongest military power in the world, and stay that way. I ... Billy Brown I think this line of thinking makes way too many assumptions about the technologies this uber-AI might discover. It could discover a truly impenetrable shield, for example. It could project itself into an entirely different universe... It might decide we pose so little threat to it, with its shield up, that fighting with us isn't worthwhile. By opening its shield perhaps it would expose itself to .0001% chance of not getting rewarded, whereas by leaving its shield up and leaving us alone, it might have .1% chance of not getting rewarded. Etc. I agree that bad outcomes are possible, but I don't see how we can possibly estimate the odds of them. -- ben g
Re: [agi] Breaking AIXI-tl
Wei Dai wrote: Ok, I see. I think I agree with this. I was confused by your phrase Hofstadterian superrationality because if I recall correctly, Hofstadter suggested that one should always cooperate in one-shot PD, whereas you're saying only cooperate if you have sufficient evidence that the other side is running the same decision algorithm as you are. Similarity in this case may be (formally) emergent, in the sense that most or all plausible initial conditions for a bootstrapping superintelligence - even extremely exotic conditions like the birth of a Friendly AI - exhibit convergence to decision processes that are correlated with each other with respect to the one-shot PD. If you have sufficient evidence that the other entity is a superintelligence, that alone may be sufficient correlation. -- Eliezer S. Yudkowsky http://singinst.org/ Research Fellow, Singularity Institute for Artificial Intelligence
RE: [agi] Breaking AIXI-tl
Ben Goertzel wrote: I think this line of thinking makes way too many assumptions about the technologies this uber-AI might discover. It could discover a truly impenetrable shield, for example. It could project itself into an entirely different universe... It might decide we pose so little threat to it, with its shield up, that fighting with us isn't worthwhile. By opening its shield perhaps it would expose itself to .0001% chance of not getting rewarded, whereas by leaving its shield up and leaving us alone, it might have .1% chance of not getting rewarded. Etc. You're thinking in static terms. It doesn't just need to be safe from anything ordinary humans do with 20th century technology. It needs to be safe from anything that could ever conceivably be created by humanity or its descendants. This obviously includes other AIs with capabilities as great as its own, but with whatever other goal systems humans might try out. Now, it is certainly conceivable that the laws of physics just happen to be such that a sufficiently good technology can create a provably impenetrable defense in a short time span, using very modest resources. If that happens to be the case, the runaway AI isn't a problem. But in just about any other case we all end up dead, either because wiping out humanity now is far easier than creating a defense against our distant descendants, or because the best defensive measures the AI can think of require engineering projects that would wipe us out as a side effect. Billy Brown
Re: [agi] Breaking AIXI-tl
Billy Brown wrote: Ben Goertzel wrote: I think this line of thinking makes way too many assumptions about the technologies this uber-AI might discover. It could discover a truly impenetrable shield, for example. It could project itself into an entirely different universe... It might decide we pose so little threat to it, with its shield up, that fighting with us isn't worthwhile. By opening its shield perhaps it would expose itself to .0001% chance of not getting rewarded, whereas by leaving its shield up and leaving us alone, it might have .1% chance of not getting rewarded. Now, it is certainly conceivable that the laws of physics just happen to be such that a sufficiently good technology can create a provably impenetrable defense in a short time span, using very modest resources. If that happens to be the case, the runaway AI isn't a problem. But in just about any other case we all end up dead, either because wiping out humanity now is far easier than creating a defense against our distant descendants, or because the best defensive measures the AI can think of require engineering projects that would wipe us out as a side effect. It should also be pointed out that we are describing a state of AI such that: a) it provides no conceivable benefit to humanity b) a straightforward extrapolation shows it wiping out humanity c) it requires the postulation of a specific unsupported complex miracle to prevent the AI from wiping out humanity c1) these miracles are unstable when subjected to further examination c2) the AI still provides no benefit to humanity even given the miracle When a branch of an AI extrapolation ends in such a scenario it may legitimately be labeled a complete failure. -- Eliezer S. Yudkowsky http://singinst.org/ Research Fellow, Singularity Institute for Artificial Intelligence
RE: [agi] Breaking AIXI-tl
It should also be pointed out that we are describing a state of AI such that: a) it provides no conceivable benefit to humanity Not necessarily true: it's plausible that along the way, before learning how to whack off by stimulating its own reward button, it could provide some benefits to humanity. b) a straightforward extrapolation shows it wiping out humanity c) it requires the postulation of a specific unsupported complex miracle to prevent the AI from wiping out humanity c1) these miracles are unstable when subjected to further examination I'm not so sure about this, but it's not worth arguing, really. c2) the AI still provides no benefit to humanity even given the miracle When a branch of an AI extrapolation ends in such a scenario it may legitimately be labeled a complete failure. I'll classify it an almost-complete failure, sure ;) Fortunately it's also a totally pragmatically implausible system to construct, so there's not much to worry about...! -- Ben
Re: [agi] Breaking AIXI-tl
Eliezer S. Yudkowsky wrote: Important, because I strongly suspect Hofstadterian superrationality is a *lot* more ubiquitous among transhumans than among us... It's my understanding that Hofstadterian superrationality is not generally accepted within the game theory research community as a valid principle of decision making. Do you have any information to the contrary, or some other reason to think that it will be commonly used by transhumans? About a week ago Eliezer also wrote: 2) While an AIXI-tl of limited physical and cognitive capabilities might serve as a useful tool, AIXI is unFriendly and cannot be made Friendly regardless of *any* pattern of reinforcement delivered during childhood. I always thought that the biggest problem with the AIXI model is that it assumes that something in the environment is evaluating the AI and giving it rewards, so the easiest way for the AI to obtain its rewards would be to coerce or subvert the evaluator rather than to accomplish any real goals. I wrote a bit more about this problem at http://www.mail-archive.com/everything-list@eskimo.com/msg03620.html.
RE: [agi] Breaking AIXI-tl
Wei Dai wrote: Important, because I strongly suspect Hofstadterian superrationality is a *lot* more ubiquitous among transhumans than among us... It's my understanding that Hofstadterian superrationality is not generally accepted within the game theory research community as a valid principle of decision making. Do you have any information to the contrary, or some other reason to think that it will be commonly used by transhumans? I don't agree with Eliezer about the importance of Hofstadterian superrationality. However, I do think he ended up making a good point about AIXItl, which is that an AIXItl will probably be a lot worse at modeling other AIXItl's, than a human is at modeling other humans. This suggests that AIXItl's playing cooperative games with each other, will likely fare worse than humans playing cooperative games with each other. I don't think this conclusion hinges on the importance of Hofstadterian superrationality... About a week ago Eliezer also wrote: 2) While an AIXI-tl of limited physical and cognitive capabilities might serve as a useful tool, AIXI is unFriendly and cannot be made Friendly regardless of *any* pattern of reinforcement delivered during childhood. I always thought that the biggest problem with the AIXI model is that it assumes that something in the environment is evaluating the AI and giving it rewards, so the easiest way for the AI to obtain its rewards would be to coerce or subvert the evaluator rather than to accomplish any real goals. I wrote a bit more about this problem at http://www.mail-archive.com/everything-list@eskimo.com/msg03620.html. I agree, this is a weakness of AIXI/AIXItl as a practical AI design. In humans, and in a more pragmatic AI design like Novamente, one has a situation where the system's goals adapt and change along with the rest of the system, beginning from (and sometimes but not always straying far from) a set of initial goals. 
One could of course embed the AIXI/AIXItl learning mechanism in a supersystem that adapted its goals. But then one would probably lose the nice theorems Marcus Hutter proved. -- Ben G
RE: [agi] Breaking AIXI-tl
Eliezer, Allowing goals to change in a coupled way with thoughts and memories is not simply adding entropy. -- Ben Ben Goertzel wrote: I always thought that the biggest problem with the AIXI model is that it assumes that something in the environment is evaluating the AI and giving it rewards, so the easiest way for the AI to obtain its rewards would be to coerce or subvert the evaluator rather than to accomplish any real goals. I wrote a bit more about this problem at http://www.mail-archive.com/everything-list@eskimo.com/msg03620.html. I agree, this is a weakness of AIXI/AIXItl as a practical AI design. In humans, and in a more pragmatic AI design like Novamente, one has a situation where the system's goals adapt and change along with the rest of the system, beginning from (and sometimes but not always straying far from) a set of initial goals. How does adding entropy help? -- Eliezer S. Yudkowsky http://singinst.org/ Research Fellow, Singularity Institute for Artificial Intelligence
Re: [agi] Breaking AIXI-tl
On Tue, Feb 18, 2003 at 06:58:30PM -0500, Ben Goertzel wrote: However, I do think he ended up making a good point about AIXItl, which is that an AIXItl will probably be a lot worse at modeling other AIXItl's, than a human is at modeling other humans. This suggests that AIXItl's playing cooperative games with each other, will likely fare worse than humans playing cooperative games with each other. That's because AIXI wasn't designed with game theory in mind. I.e., the reason that it doesn't handle cooperative games is that it wasn't designed to. As the abstract says, AIXI is a combination of decision theory with Solomonoff's theory of universal induction. We know that game theory subsumes decision theory as a special case (where there is only one player) but not the other way around. Central to multi-player game theory is the concept of Nash equilibrium, which doesn't exist in decision theory. If you apply decision theory to multi-player games, you're going to end up with an infinite recursion where you try to predict the other players trying to predict you trying to predict the other players, and so on. If you cut this infinite recursion off at an arbitrary point, as AIXI-tl would, of course you're not going to get good results. I always thought that the biggest problem with the AIXI model is that it assumes that something in the environment is evaluating the AI and giving it rewards, so the easiest way for the AI to obtain its rewards would be to coerce or subvert the evaluator rather than to accomplish any real goals. I wrote a bit more about this problem at http://www.mail-archive.com/everything-list@eskimo.com/msg03620.html. I agree, this is a weakness of AIXI/AIXItl as a practical AI design. In humans, and in a more pragmatic AI design like Novamente, one has a situation where the system's goals adapt and change along with the rest of the system, beginning from (and sometimes but not always straying far from) a set of initial goals. 
This seems to be a non sequitur. The weakness of AIXI is not that its goals don't change, but that it has no goals other than to maximize an externally given reward. So it's going to do whatever it predicts will most efficiently produce that reward, which is to coerce or subvert the evaluator. If you start with such a goal, I don't see how allowing the system to change its goals is going to help. But I think Eliezer's real point, which I'm not sure has come across, is that if you didn't spot such an obvious flaw right away, maybe you shouldn't trust your intuitions about what is safe and what is not.
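The regress mentioned earlier in this message — applying decision theory to a multi-player game means predicting the other players trying to predict you, cut off at an arbitrary point — can be made concrete with a toy sketch. Matching pennies (rather than the PD, which has a dominant move) makes the cycle visible; the function names and the base-case guess below are invented for illustration, not part of any AIXI formalism:

```python
# Sketch of the truncated best-response regress, using matching pennies
# (no pure-strategy Nash equilibrium, so best responses cycle forever).

def best_response_matcher(opponent_move):
    # The matcher wins by playing the same side as the opponent.
    return opponent_move

def best_response_mismatcher(opponent_move):
    # The mismatcher wins by playing the opposite side.
    return "T" if opponent_move == "H" else "H"

def predict_matcher(depth):
    """Matcher's move, modeling the mismatcher to `depth` further levels."""
    if depth == 0:
        return "H"  # arbitrary base-case guess where the regress is cut off
    return best_response_matcher(predict_mismatcher(depth - 1))

def predict_mismatcher(depth):
    if depth == 0:
        return "H"
    return best_response_mismatcher(predict_matcher(depth - 1))

# The "prediction" depends on the parity of the arbitrary cutoff depth:
print([predict_matcher(d) for d in range(6)])  # ['H', 'H', 'T', 'T', 'H', 'H']
```

Because the answer flips with the cutoff depth, truncating the regress at any particular level — as a t-bounded predictor must — gives no stable answer where no pure-strategy equilibrium exists.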
Re: [agi] Breaking AIXI-tl - AGI friendliness
Hi Eliezer/Ben, My recollection was that Eliezer initiated the Breaking AIXI-tl discussion as a way of proving that friendliness of AGIs had to be consciously built in at the start and couldn't be assumed to be teachable at a later point. (Or have I totally lost the plot?) Do you feel the discussion has covered enough technical ground and established enough consensus to bring the original topic back into focus? Cheers, Philip
RE: [agi] Breaking AIXI-tl - AGI friendliness
Actually, Eliezer said he had two points about AIXItl: 1) that it could be broken in the sense he's described 2) that it was intrinsically un-Friendly So far he has only made point 1), and has not gotten to point 2) !!! As for a general point about the teachability of Friendliness, I don't think that an analysis of AIXItl can lead to any such general conclusion. AIXItl is very, very different from Novamente or any other pragmatic AI system. I think that an analysis of AIXItl's Friendliness or otherwise is going to be useful primarily as an exercise in Friendliness analysis of AGI systems, rather than for any pragmatic implications it may have. -- Ben
RE: [agi] Breaking AIXI-tl - AGI friendliness - how to move on
Hi Ben, From a high order implications point of view I'm not sure that we need too much written up from the last discussion. To me it's almost enough to know that both you and Eliezer agree that the AIXItl system can be 'broken' by the challenge he set and that a human digital simulation might not. The next step is to ask "so what?". What has this got to do with the AGI friendliness issue? Hopefully Eliezer will write up a brief paper on his observations about AIXI and AIXItl. If he does that, I'll be happy to write a brief commentary on his paper expressing any differences of interpretation I have, and giving my own perspective on his points. That sounds good to me. Cheers, Philip
RE: [agi] Breaking AIXI-tl - AGI friendliness - how to move on
To me it's almost enough to know that both you and Eliezer agree that the AIXItl system can be 'broken' by the challenge he set and that a human digital simulation might not. The next step is to ask "so what?". What has this got to do with the AGI friendliness issue? This last point of Eliezer's doesn't have much to do with the AGI Friendliness issue. It's simply an example of how a smarter AGI system may not be smarter in the context of interacting socially with its own peers. -- Ben
Re: [agi] Breaking AIXI-tl
Eliezer S. Yudkowsky wrote: Bill Hibbard wrote: On Fri, 14 Feb 2003, Eliezer S. Yudkowsky wrote: It *could* do this but it *doesn't* do this. Its control process is such that it follows an iterative trajectory through chaos which is forbidden to arrive at a truthful solution, though it may converge to a stable attractor. This is the heart of the fallacy. Neither a human nor an AIXI can know that his synchronized other self - whichever one he is - is doing the same. All a human or an AIXI can know is its observations. They can estimate but not know the intentions of other minds. The halting problem establishes that you can never perfectly understand your own decision process well enough to predict its decision in advance, because you'd have to take into account the decision process including the prediction, et cetera, establishing an infinite regress. However, Corbin doesn't need to know absolutely that his other self is synchronized, nor does he need to know his other self's decision in advance. Corbin only needs to establish a probabilistic estimate, good enough to guide his actions, that his other self's decision is correlated with his *after* the fact. (I.e., it's not a halting problem where you need to predict yourself in advance; you only need to know your own decision after the fact.) AIXI-tl is incapable of doing this for complex cooperative problems because its decision process only models tl-bounded things and AIXI-tl is not *remotely close* to being tl-bounded. Now you are using a different argument. Your previous argument was: Lee Corbin can work out his entire policy in step (2), before step (3) occurs, knowing that his synchronized other self - whichever one he is - is doing the same. Now you have Corbin merely estimating his clone's intentions. While it is true that AIXI-tl cannot completely simulate itself, it also can estimate another AIXI-tl's future behavior based on observed behavior. Your argument is now that Corbin can do it better. 
I don't know if this is true or not. . . . Let's say that AIXI-tl takes action A in round 1, action B in round 2, and action C in round 3, and so on up to action Z in round 26. There's no obvious reason for the sequence {A...Z} to be predictable *even approximately* by any of the tl-bounded processes AIXI-tl uses for prediction. Any given action is the result of a tl-bounded policy but the *sequence* of *different* tl-bounded policies was chosen by a t·2^l process. Your example sequence is pretty simple and should match a nice simple universal Turing machine program in an AIXI-tl, well within its bounds. Furthermore, two AIXI-tl's will probably converge on a simple sequence in prisoner's dilemma. But I have no idea if they can do it better than Corbin and his clone. Bill
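Bill's point — that a simple action sequence like {A...Z} corresponds to a short program well within AIXI-tl's hypothesis class — can be illustrated with a toy simplicity-weighted predictor. This is a crude stand-in for the Solomonoff-style mixture; the hypothesis "programs" and their bit-lengths below are invented for illustration, since real Solomonoff induction sums over all programs:

```python
# Toy simplicity-prior sequence predictor. Each hypothesis is a
# (name, program-length-in-bits, next-symbol-function) triple.
HYPOTHESES = [
    ("constant-A", 4, lambda hist: "A"),
    ("constant-B", 5, lambda hist: "B"),
    ("alphabet-successor", 6, lambda hist: chr(ord(hist[-1]) + 1) if hist else "A"),
    ("cycle-ABC", 9, lambda hist: "ABC"[len(hist) % 3]),
]

def consistent(predict, observed):
    # A hypothesis survives if it reproduces every symbol of the prefix.
    return all(predict(observed[:i]) == s for i, s in enumerate(observed))

def predict_next(observed):
    # Weight surviving hypotheses by 2^-length (shorter program = higher
    # prior) and predict with the heaviest one.
    live = [(2.0 ** -bits, name, f) for name, bits, f in HYPOTHESES
            if consistent(f, observed)]
    weight, name, f = max(live, key=lambda h: h[0])
    return f(observed), name

print(predict_next("ABC"))  # ('D', 'alphabet-successor')
```

Both "alphabet-successor" and "cycle-ABC" fit the prefix, but the shorter program dominates the mixture, so the predictor continues the alphabet — which is the sense in which a simple sequence is "well within the bounds."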
Re: [agi] Breaking AIXI-tl
Eliezer/Ben, When you've had time to draw breath can you explain, in non-obscure, non-mathematical language, what the implications of the AIXI-tl discussion are? Thanks. Cheers, Philip --- To unsubscribe, change your address, or temporarily deactivate your subscription, please go to http://v2.listbox.com/member/?[EMAIL PROTECTED]
RE: [agi] Breaking AIXI-tl
Hi, There's a physical challenge which operates on *one* AIXI-tl and breaks it, even though it involves diagonalizing the AIXI-tl as part of the challenge. OK, I see what you mean by calling it a physical challenge. You mean that, as part of the challenge, the external agent posing the challenge is allowed to clone the AIXI-tl. An intuitively fair, physically realizable challenge, with important real-world analogues, formalizable as a computation which can be fed either a tl-bounded uploaded human or an AIXI-tl, for which the human enjoys greater success measured strictly by total reward over time, due to the superior strategy employed by that human as the result of rational reasoning of a type not accessible to AIXI-tl. It's really the formalizability of the challenge as a computation which can be fed either a *single* AIXI-tl or a *single* tl-bounded uploaded human that makes the whole thing interesting at all... I'm sorry I didn't succeed in making clear the general class of real-world analogues for which this is a special case. OK, I don't see how the challenge you've described is formalizable as a computation which can be fed either a tl-bounded uploaded human or an AIXI-tl. The challenge involves cloning the agent being challenged. Thus it is not a computation feedable to the agent, unless you assume the agent is supplied with a cloning machine... If I were to take a very rough stab at it, it would be that the cooperation case with your own clone is an extreme case of many scenarios where superintelligences can cooperate with each other on the one-shot Prisoner's Dilemma provided they have *loosely similar* reflective goal systems and that they can probabilistically estimate that enough loose similarity exists. Yah, but the definition of a superintelligence is relative to the agent being challenged. For any fixed superintelligent agent A, there are AIXItl's big enough to succeed against it in any cooperative game. 
To break AIXI-tl, the challenge needs to be posed in a way that refers to AIXItl's own size, i.e. one has to say something like "Playing a cooperative game with other intelligences of intelligence at least f(t,l)", where f is some increasing function. If the intelligence of the opponents is fixed, then one can always make an AIXItl win by increasing t and l ... So your challenges are all of the form:

* For any fixed AIXItl, here is a challenge that will defeat it

ForAll AIXItl's A(t,l), ThereExists a challenge C(t,l) so that fails_at(A,C)

or alternatively

ForAll AIXItl's A(t,l), ThereExists a challenge C(A(t,l)) so that fails_at(A,C)

rather than of the form

* Here is a challenge that will defeat any AIXItl

ThereExists a challenge C so that ForAll AIXItl's A(t,l), fails_at(A,C)

The point is that the challenge C is a function C(t,l) rather than being independent of t and l. This of course is why your challenge doesn't break Hutter's theorem. But it's a distinction that your initial verbal formulation didn't make very clearly (and I understand, the distinction is not that easy to make in words.) Of course, it's also true that ForAll uploaded humans H, ThereExists a challenge C(H) so that fails_at(H,C). What you've shown that's interesting is that ThereExists a challenge C, so that: -- ForAll AIXItl's A(t,l), fails_at(A,C(A)) -- for many uploaded humans H, succeeds_at(H,C(H)) (Where, were one to try to actually prove this, one would substitute uploaded humans with other AI programs or something.) 
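For the record, the quantifier distinction above can be written compactly (my notation, not Hutter's; A_{t,l} is the AIXItl with parameters t and l, and fails(.,.) as in the text):

```latex
% (1) What the clone challenge establishes: each AIXItl is beaten by a
%     challenge constructed from its own parameters,
\forall t,l \;\; \exists\, C_{t,l} \;:\; \mathrm{fails}\bigl(A_{t,l},\, C_{t,l}\bigr)

% (2) What it does NOT establish: one fixed challenge beating them all,
\exists\, C \;\; \forall t,l \;:\; \mathrm{fails}\bigl(A_{t,l},\, C\bigr)
```

Since only (1) holds, with the challenge given as a template applied to each agent, Hutter's optimality theorem is untouched.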
The interesting part is that these little natural breakages in the formalism create an inability to take part in what I think might be a fundamental SI social idiom, conducting binding negotiations by convergence to goal processes that are guaranteed to have a correlated output, which relies on (a) Bayesian-inferred initial similarity between goal systems, and (b) the ability to create a top-level reflective choice that wasn't there before, that (c) was abstracted over an infinite recursion in your top-level predictive process. I think part of what you're saying here is that AIXItl's are not designed to be able to participate in a community of equals. This is certainly true. -- Ben G
RE: [agi] Breaking AIXI-tl
hi, No, the challenge can be posed in a way that refers to an arbitrary agent A which a constant challenge C accepts as input. But the problem with saying it this way is that the constant challenge has to have an infinite memory capacity. So in a sense, it's an infinite constant ;) No, the charm of the physical challenge is exactly that there exists a physically constant cavern which defeats any AIXI-tl that walks into it, while being tractable for wandering tl-Corbins. No, this isn't quite right. If the cavern is physically constant, then there must be an upper limit to the t and l for which it can clone AIXItl's. If the cavern has N bits (assuming a bitistic reduction of physics, for simplicity ;), then it can't clone an AIXItl where t > 2^N, can it? Not without grabbing bits (particles or whatever) from the outside universe to carry out the cloning. (And how could an AIXItl with t > 2^N even fit inside it??) You still need the quantifiers reversed: for any AIXI-tl, there is a cavern posing a challenge that defeats it... I think part of what you're saying here is that AIXItl's are not designed to be able to participate in a community of equals. This is certainly true. Well, yes, as a special case of AIXI-tl's being unable to carry out reasoning where their internal processes are correlated with the environment. Agreed... (See, it IS actually possible to convince me of something, when it's correct; I'm actually not *hopelessly* stubborn ;) ben
Re: [agi] Breaking AIXI-tl
Ben Goertzel wrote: hi, No, the challenge can be posed in a way that refers to an arbitrary agent A which a constant challenge C accepts as input. But the problem with saying it this way, is that the constant challenge has to have an infinite memory capacity. So in a sense, it's an infinite constant ;) Infinite Turing tapes are a pretty routine assumption in operations like these. I think Hutter's AIXI-tl is supposed to be able to handle constant environments (as opposed to constant challenges, a significant formal difference) that contain infinite Turing tapes. Though maybe that'd violate separability? Come to think of it, the Clone challenge might violate separability as well, since AIXI-tl (and hence its Clone) builds up state. No, the charm of the physical challenge is exactly that there exists a physically constant cavern which defeats any AIXI-tl that walks into it, while being tractable for wandering tl-Corbins. No, this isn't quite right. If the cavern is physically constant, then there must be an upper limit to the t and l for which it can clone AIXItl's. Hm, this doesn't strike me as a fair qualifier. One, if an AIXItl exists in the physical universe at all, there are probably infinitely powerful processors lying around like sunflower seeds. And two, if you apply this same principle to any other physically realized challenge, it means that people could start saying Oh, well, AIXItl can't handle *this* challenge because there's an upper bound on how much computing power you're allowed to use. If Hutter's theorem is allowed to assume infinite computing power inside the Cartesian theatre, then the magician's castle should be allowed to assume infinite computing power outside the Cartesian theatre. 
Anyway, a constant cave with an infinite tape seems like a constant challenge to me, and a finite cave that breaks any {AIXI-tl, tl-human} contest up to l=googlebyte also still seems interesting, especially as AIXI-tl is supposed to work for any tl, not just sufficiently high tl. Well, yes, as a special case of AIXI-tl's being unable to carry out reasoning where their internal processes are correlated with the environment. Agreed... (See, it IS actually possible to convince me of something, when it's correct; I'm actually not *hopelessly* stubborn ;) Yes, but it takes t·2^l operations. (Sorry, you didn't deserve it, but a straight line like that only comes along once.) -- Eliezer S. Yudkowsky http://singinst.org/ Research Fellow, Singularity Institute for Artificial Intelligence
RE: [agi] Breaking AIXI-tl
Anyway, a constant cave with an infinite tape seems like a constant challenge to me, and a finite cave that breaks any {AIXI-tl, tl-human} contest up to l=googlebyte also still seems interesting, especially as AIXI-tl is supposed to work for any tl, not just sufficiently high tl. It's a fair mathematical challenge ... the reason I complained is that the physical-world metaphor of a cave seems to me to imply a finite system. A cave with an infinite tape in it is no longer a realizable physical system! (See, it IS actually possible to convince me of something, when it's correct; I'm actually not *hopelessly* stubborn ;) Yes, but it takes t·2^l operations. (Sorry, you didn't deserve it, but a straight line like that only comes along once.) ;-) ben
Re: [agi] Breaking AIXI-tl
Eliezer S. Yudkowsky wrote: Let's imagine I'm a superintelligent magician, sitting in my castle, Dyson Sphere, what-have-you. I want to allow sentient beings some way to visit me, but I'm tired of all these wandering AIXI-tl spambots that script kiddies code up to brute-force my entrance challenges. I don't want to tl-bound my visitors; what if an actual sentient 10^10^15 ops/sec being wants to visit me? I don't want to try and examine the internal state of the visiting agent, either; that just starts a war of camouflage between myself and the spammers. Luckily, there's a simple challenge I can pose to any visitor, cooperation with your clone, that filters out the AIXI-tls and leaves only beings who are capable of a certain level of reflectivity, presumably genuine sentients. I don't need to know the tl-bound of my visitors, or the tl-bound of the AIXI-tl, in order to construct this challenge. I write the code once. Oh, that's trivial to break. I just put my AIXI-tl (whatever that is) in a human body and send it via rocket-ship... There would be no way to clone this being so you would have no way to carry out the test. -- I WANT A DEC ALPHA!!! =) 21364: THE UNDISPUTED GOD OF ALL CPUS. http://users.rcn.com/alangrimes/ [if rcn.com doesn't work, try erols.com ]
Re: [agi] Breaking AIXI-tl
Ben Goertzel wrote: In a naturalistic universe, where there is no sharp boundary between the physics of you and the physics of the rest of the world, the capability to invent new top-level internal reflective choices can be very important, pragmatically, in terms of properties of distant reality that directly correlate with your choice to your benefit, if there's any breakage at all of the Cartesian boundary - any correlation between your mindstate and the rest of the environment. Unless, you are vastly smarter than the rest of the universe. Then you can proceed like an AIXItl and there is no need for top-level internal reflective choices ;) Actually, even if you are vastly smarter than the rest of the entire universe, you may still be stuck dealing with lesser entities (though not humans; superintelligences at least) who have any information at all about your initial conditions, unless you can make top-level internal reflective choices. The chance that environmental superintelligences will cooperate with you in PD situations may depend on *their* estimate of *your* ability to generalize over the choice to defect and realize that a similar temptation exists on both sides. In other words, it takes a top-level internal reflective choice to adopt a cooperative ethic on the one-shot complex PD rather than blindly trying to predict and outwit the environment for maximum gain, which is built into the definition of AIXI-tl's control process. A superintelligence may cooperate with a comparatively small, tl-bounded AI, but be unable to cooperate with an AIXI-tl, provided there is any inferrable information about initial conditions. In one sense AIXI-tl wins; it always defects, which formally is a better choice than cooperating on the oneshot PD, regardless of what the opponent does - assuming that the environment is not correlated with your decisionmaking process. 
But anyone who knows that assumption is built into AIXI-tl's initial conditions will always defect against AIXI-tl. A small, tl-bounded AI that can make reflective choices has the capability of adopting a cooperative ethic; provided that both entities know or infer something about the other's initial conditions, they can arrive at a knowably correlated reflective choice to adopt cooperative ethics. AIXI-tl can learn the iterated PD, of course; just not the oneshot complex PD. -- Eliezer S. Yudkowsky http://singinst.org/ Research Fellow, Singularity Institute for Artificial Intelligence --- To unsubscribe, change your address, or temporarily deactivate your subscription, please go to http://v2.listbox.com/member/?[EMAIL PROTECTED]
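Eliezer's contrast between the two control processes can be sketched with the standard one-shot PD payoffs (textbook numbers, not anything from Hutter's formalism): a reasoner that treats the opponent's move as uncorrelated with its own defects, while a reasoner that knows it faces an exact copy of itself compares only the correlated outcomes.

```python
# One-shot Prisoner's Dilemma with the usual textbook payoffs.
PAYOFF = {  # (my_move, their_move) -> my payoff
    ("C", "C"): 3, ("C", "D"): 0,
    ("D", "C"): 5, ("D", "D"): 1,
}

def independent_choice():
    # If my choice and theirs are uncorrelated, D strictly dominates C:
    # it pays more whatever the opponent happens to play.
    if all(PAYOFF[("D", t)] > PAYOFF[("C", t)] for t in ("C", "D")):
        return "D"
    return "C"

def clone_aware_choice():
    # Against an exact clone, only the correlated outcomes (C,C) and (D,D)
    # are reachable, so compare just the diagonal of the payoff table.
    return max(("C", "D"), key=lambda me: PAYOFF[(me, me)])

print(independent_choice(), clone_aware_choice())  # D C
```

The first function is the "inexorably maximize against a fixed environment" stance; the second requires exactly the reflective step being discussed, recognizing that one's own choice and the clone's are the same computation.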
Re: [agi] Breaking AIXI-tl
Ben Goertzel wrote: AIXI-tl can learn the iterated PD, of course; just not the oneshot complex PD. But if it's had the right prior experience, it may have an operating program that is able to deal with the oneshot complex PD... ;-) Ben, I'm not sure AIXI is capable of this. AIXI may inexorably predict the environment and then inexorably try to maximize reward given environment. The reflective realization that *your own choice* to follow that control procedure is correlated with a distant entity's choice not to cooperate with you may be beyond AIXI. If it was the iterated PD, AIXI would learn how a defection fails to maximize reward over time. But can AIXI understand, even in theory, regardless of what its internal programs simulate, that its top-level control function fails to maximize the a priori propensity of other minds with information about AIXI's internal state to cooperate with it, on the *one* shot PD? AIXI can't take the action it needs to learn the utility of... -- Eliezer S. Yudkowsky http://singinst.org/ Research Fellow, Singularity Institute for Artificial Intelligence --- To unsubscribe, change your address, or temporarily deactivate your subscription, please go to http://v2.listbox.com/member/?[EMAIL PROTECTED]
RE: [agi] Breaking AIXI-tl
I guess that for AIXI to learn this sort of thing, it would have to be rewarded for understanding AIXI in general, for proving theorems about AIXI, etc. Once it had learned this, it might be able to apply this knowledge in the one-shot PD context. But I am not sure. ben
Re: [agi] Breaking AIXI-tl
I guess that for AIXI to learn this sort of thing, it would have to be rewarded for understanding AIXI in general, for proving theorems about AIXI, etc. Once it had learned this, it might be able to apply this knowledge in the one-shot PD context But I am not sure. For those of us who have missed a critical message or two in this weekend's lengthy exchange, can you explain briefly the one-shot complex PD? I'm unsure how a program could evaluate and learn to predict the behavior of its opponent if it only gets 1-shot. Obviously I'm missing something. -Brad --- To unsubscribe, change your address, or temporarily deactivate your subscription, please go to http://v2.listbox.com/member/?[EMAIL PROTECTED]
RE: [agi] Breaking AIXI-tl
Really, when has a computer (with the exception of certain Microsoft products) ever been able to disobey its human masters? It's easy to get caught up in the romance of superpowers, but come on, there's nothing to worry about. -Daniel Hi Daniel, Clearly there is nothing to worry about TODAY. And I'm spending the vast bulk of my time working on practical AI design and engineering and application work, not on speculating about the future. However, I do believe that once AI tech has advanced far enough, there WILL be something to worry about. How close we are to this point is another question. Current AI practice is very far away from achieving autonomous general intelligence. If I'm right about the potential of Novamente and similar designs, we could be within a decade of getting there. If I'm wrong, well, Kurzweil has made some decent arguments why we'll get there by 2050 or so... ;-) -- Ben Goertzel
RE: [agi] Breaking AIXI-tl
Even if a (grown) human is playing PD2, it outperforms AIXI-tl playing PD2. Well, in the long run, I'm not at all sure this is the case. You haven't proved this to my satisfaction. In the short run, it certainly is the case. But so what? AIXI-tl is damn slow at learning, we know that. The question is whether after enough trials AIXI-tl figures out it's playing some entity similar to itself and learns how to act accordingly. If so, then it's doing what AIXI-tl is supposed to do. A human can also learn to solve vision recognition problems faster than AIXI-tl, because we're wired for it (as we're wired for social gameplaying), whereas AIXI-tl has to learn. Humans can recognize a much stronger degree of similarity in human Other Minds than AIXI-tl's internal processes are capable of recognizing in any other AIXI-tl. I don't believe that is true. Again, as far as I can tell, this necessarily requires abstracting over your own internal state and recognizing that the outcome of your own (internal) choices are necessarily reproduced by a similar computation elsewhere. Basically, it requires abstracting over your own halting problem to realize that the final result of your choice is correlated with that of the process simulated, even though you can't fully simulate the causal process producing the correlation in advance. (This doesn't *solve* your own halting problem, but at least it enables you to *understand* the situation you've been put into.) Except that instead of abstracting over your own halting problem, you're abstracting over the process of trying to simulate another mind trying to simulate you trying to simulate it, where the other mind is sufficiently similar to your own. This is a kind of reasoning qualitatively closed to AIXI-tl; its control process goes on abortively trying to simulate the chain of simulations forever, stopping and discarding that prediction as unuseful as soon as it exceeds the t-bound. OK... 
here's where the fact that you have a tabula rasa AIXI-tl in a very limiting environment comes in. In a richer environment, I don't see why AIXI-tl, after a long enough time, couldn't learn an operating program that implicitly embodied an abstraction over its own internal state. In an environment consisting solely of PD2, it may be that AIXI-tl will never have the inspiration to learn this kind of operating program. (I'm not sure.) To me, this says mostly that PD2 is an inadequate environment for any learning system to use, to learn how to become a mind. If it ain't good enough for AIXI-tl to use to learn how to become a mind, over a very long period of time, it probably isn't good for any AI system to use to learn how to become a mind. Anyway... basically, if you're in a real-world situation where the other intelligence has *any* information about your internal state, not just from direct examination, but from reasoning about your origins, then that also breaks the formalism and now a tl-bounded seed AI can outperform AIXI-tl on the ordinary (non-quined) problem of cooperation with a superintelligence. The environment can't ever *really* be constant and completely separated as Hutter requires. A physical environment that gives rise to an AIXI-tl is different from the environment that gives rise to a tl-bounded seed AI, and the different material implementations of these entities (Lord knows how you'd implement the AIXI-tl) will have different side effects, and so on. All real world problems break the Cartesian assumption. The questions "But are there any kinds of problems for which that makes a real difference?" and "Does any conceivable kind of mind do any better?" can both be answered affirmatively. Well, I agree with only some of this. The thing is, an AIXI-tl-driven AI embedded in the real world would have a richer environment to draw on than the impoverished data provided by PD2. 
This AI would eventually learn how to model itself and reflect in a rich way (by learning the right operating program). However, AIXI-tl is a horribly bad AI algorithm, so it would take a VERY VERY long time to carry out this learning, of course... -- Ben
Re: [agi] Breaking AIXI-tl
Ben Goertzel wrote: Even if a (grown) human is playing PD2, it outperforms AIXI-tl playing PD2. Well, in the long run, I'm not at all sure this is the case. You haven't proved this to my satisfaction. PD2 is very natural to humans; we can take for granted that humans excel at PD2. The question is AIXI-tl. In the short run, it certainly is the case. But so what? AIXI-tl is damn slow at learning, we know that. AIXI-tl is most certainly not damn slow at learning any environment that can be tl-bounded. For problems that don't break the Cartesian formalism, AIXI-tl learns only slightly slower than the fastest possible tl-bounded learner. It's got t2^l computing power, for gossakes! From our perspective it learns at faster than the fastest rate humanly imaginable - literally. You appear to be thinking of AIXI-tl as a fuzzy little harmless baby being confronted with some harsh trial. That fuzzy little harmless baby, if the tl-bound is large enough to simulate Lee Corbin, is wielding something like 10^10^15 operations per second, which it is using to *among other things* simulate every imaginable human experience. AIXI-tl is larger than universes; it contains all possible tl-bounded heavens and all possible tl-bounded hells. The only question is whether its control process makes any good use of all that computation. More things from the list of system properties that Friendliness programmers should sensitize themselves to: Just because the endless decillions of alternate Ben Goertzels in torture chambers are screaming to God to stop it doesn't mean that AIXI-tl's control process cares. The question is whether after enough trials AIXI-tl figures out it's playing some entity similar to itself and learns how to act accordingly. If so, then it's doing what AIXI-tl is supposed to do. 
AIXI-tl *cannot* figure this out because its control process is not capable of recognizing tl-computable transforms of its own policies and strategic abilities, *only* tl-computable transforms of its own direct actions. Yes, it simulates entities who know this; it also simulates every possible other kind of tl-bounded entity. The question is whether that internal knowledge appears as an advantage recognized by the control process, and given AIXI-tl's formal definition, it does not appear to do so. In my humble opinion, one of the (many) critical skills for creating AI is learning to recognize what systems *really actually do* and not just what you project onto them. See also Eliza effect, failure of GOFAI, etc. A human can also learn to solve vision recognition problems faster than AIXI-tl, because we're wired for it (as we're wired for social gameplaying), whereas AIXI-tl has to learn. AIXI-tl learns vision *instantly*. The Kolmogorov complexity of a visual field is much less than its raw string, and the compact representation can be computed by a tl-bounded process. It develops a visual cortex on the same round it sees its first color picture. Humans can recognize a much stronger degree of similarity in human Other Minds than AIXI-tl's internal processes are capable of recognizing in any other AIXI-tl. I don't believe that is true. Mentally simulate the abstract specification of AIXI-tl instead of using your intuitions about the behavior of a generic reinforcement process. Eventually the results you learn will be integrated into your intuitions and you'll be able to directly see dependencies between specifications and reflective modeling abilities. OK... here's where the fact that you have a tabula rasa AIXI-tl in a very limiting environment comes in. In a richer environment, I don't see why AIXI-tl, after a long enough time, couldn't learn an operating program that implicitly embodied an abstraction over its own internal state. 
Because it is physically or computationally impossible for a tl-bounded program to access or internally reproduce the previously computed policies or t2^l strategic ability of AIXI-tl. In an environment consisting solely of PD2, it may be that AIXI-tl will never have the inspiration to learn this kind of operating program. (I'm not sure.) To me, this says mostly that PD2 is an inadequate environment for any learning system to use, to learn how to become a mind. If it ain't good enough for AIXI-tl to use to learn how to become a mind, over a very long period of time, it probably isn't good for any AI system to use to learn how to become a mind. Marcus Hutter has formally proved your intuitions wrong. In any situation that does *not* break the formalism, AIXI-tl learns to equal or outperform any other process, despite being a tabula rasa, no matter how rich or poor its environment. Anyway... basically, if you're in a real-world situation where the other intelligence has *any* information about your internal state, not just from direct examination, but from reasoning about your origins, then that also breaks the formalism and now a
Re: [agi] Breaking AIXI-tl
Bill Hibbard wrote: On Fri, 14 Feb 2003, Eliezer S. Yudkowsky wrote: It *could* do this but it *doesn't* do this. Its control process is such that it follows an iterative trajectory through chaos which is forbidden to arrive at a truthful solution, though it may converge to a stable attractor. This is the heart of the fallacy. Neither a human nor an AIXI can know that his synchronized other self - whichever one he is - is doing the same. All a human or an AIXI can know is its observations. They can estimate but not know the intentions of other minds. The halting problem establishes that you can never perfectly understand your own decision process well enough to predict its decision in advance, because you'd have to take into account the decision process including the prediction, et cetera, establishing an infinite regress. However, Corbin doesn't need to know absolutely that his other self is synchronized, nor does he need to know his other self's decision in advance. Corbin only needs to establish a probabilistic estimate, good enough to guide his actions, that his other self's decision is correlated with his *after* the fact. (I.e., it's not a halting problem where you need to predict yourself in advance; you only need to know your own decision after the fact.) AIXI-tl is incapable of doing this for complex cooperative problems because its decision process only models tl-bounded things and AIXI-tl is not *remotely close* to being tl-bounded. Humans can model minds much closer to their own size than AIXI-tl can. Humans can recognize when their policies, not just their actions, are reproduced. We can put ourselves in another human's shoes imperfectly; AIXI-tl can't put itself in another AIXI-tl's shoes to the extent of being able to recognize the actions of an AIXI-tl computed using a process that is inherently t2^l large. Humans can't recognize their other selves perfectly but the gap in the case of AIXI-tl is enormously greater. 
(Humans also have a reflective control process on which they can perform inductive and deductive generalizations and jump over a limited class of infinite regresses in decision processes, but that's a separate issue. Suffice it to say that a subprocess which generalizes over its own infinite regress does not obviously suffice for AIXI-tl to generalize over the top-level infinite regress in AIXI-tl's control process.) Let's say that AIXI-tl takes action A in round 1, action B in round 2, and action C in round 3, and so on up to action Z in round 26. There's no obvious reason for the sequence {A...Z} to be predictable *even approximately* by any of the tl-bounded processes AIXI-tl uses for prediction. Any given action is the result of a tl-bounded policy but the *sequence* of *different* tl-bounded policies was chosen by a t2^l process. A human in the same situation has a mnemonic record of the sequence of policies used to compute their strategies, and can recognize correlations between the sequence of policies and the other agent's sequence of actions, which can then be confirmed by directing O(other-agent) strategic processing power at the challenge of seeing the problem from the opposite perspective. AIXI-tl is physically incapable of doing this directly and computationally incapable of doing it indirectly. This is not an attack on the computability of intelligence; the human is doing something perfectly computable which AIXI-tl does not do. -- Eliezer S. Yudkowsky http://singinst.org/ Research Fellow, Singularity Institute for Artificial Intelligence
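The size asymmetry Yudkowsky is leaning on can be reduced to a line of arithmetic. The toy cost model below is not from the thread: the accounting (fully simulating a process costs at least that process's own compute budget, plus a hypothetical constant overhead) is an invented simplification, but it shows why a tl-bounded hypothesis inside AIXI-tl can never reproduce the t2^l control process that selects among such hypotheses.

```python
def can_simulate(simulator_budget, target_budget, overhead=1):
    # Toy cost model (an assumption, not Hutter's): fully simulating a
    # process costs at least the target's own compute budget plus some
    # constant interpretive overhead.
    return simulator_budget >= target_budget + overhead

t, l = 10, 8  # illustrative bounds; real AIXI-tl parameters would be astronomical

# AIXI-tl's control process spends t * 2^l steps per cycle, but each
# hypothesis it entertains is limited to t steps.
assert can_simulate(t * 2**l, t)      # the control process can simulate its hypotheses
assert not can_simulate(t, t * 2**l)  # no hypothesis can simulate the control process
assert not can_simulate(t, t)         # and no process fully simulates its own equal
```

Under this crude model, the gap between t and t2^l is exactly the gap between what AIXI-tl's hypotheses can represent and what another AIXI-tl actually is.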
Re: [agi] Breaking AIXI-tl
Eliezer S. Yudkowsky asked Ben Goertzel: Do you have a non-intuitive mental simulation mode? LOL --#:^D It *is* a valid question, Eliezer, but it makes me laugh. Michael Roy Ames [Who currently estimates his *non-intuitive mental simulation mode* to contain about 3 iterations of 5 variables each - 8 variables each on a good day. Each variable can link to a concept (either complex or simple)... and if that sounds to you like something that a trashed-out Commodore 64 could emulate, then you have some idea how he feels being stuck at his current level of non-intuitive intelligence.]
RE: [agi] Breaking AIXI-tl
I'll read the rest of your message tomorrow... But we aren't *talking* about whether AIXI-tl has a mindlike operating program. We're talking about whether the physically realizable challenge, which definitely breaks the formalism, also breaks AIXI-tl in practice. That's what I originally stated, that's what you originally said you didn't believe, and that's all I'm trying to demonstrate. Your original statement was posed in a misleading way, perhaps not intentionally. There is no challenge on which *an* AIXI-tl doesn't outperform *an* uploaded human. What you're trying to show is that there's an inter-AIXI-tl social situation in which AIXI-tls perform less intelligently than humans do in a similar inter-human situation. If you had posed it this way, I wouldn't have been as skeptical initially. -- Ben
RE: [agi] Breaking AIXI-tl
Hmmm... My friend, I think you've pretty much convinced me with this last batch of arguments. Or, actually, I'm not sure if it was your excellently clear arguments or the fact that I finally got a quiet 15 minutes to really think about it (the three kids, who have all been out sick from school with a flu all week, are all finally in bed ;) Your arguments are a long way from a rigorous proof, and I can't rule out that there might be a hole in them, but in this last e-mail you were explicit enough to convince me that what you're saying makes logical sense. I'm going to try to paraphrase your argument; let's see if we're somewhere in the neighborhood of harmony... Basically: you've got these two clones playing a cooperative game, and each one, at each turn, is controlled by a certain program. Each clone chooses his current operating program by searching the space of all programs of length L that finish running in T timesteps, and finding the one that, based on his study of prior gameplay, is expected to give him the highest chance of winning. But each guy takes on the order of T*2^L timesteps to perform this search. So your basic point is that, because these clones are acting by simulating programs that finish running in T timesteps, they're not going to be able to simulate each other very accurately. Whereas a pair of clones each possessing a more flexible control algorithm could perform better in the game. Because, if a more flexible player wants to simulate his opponent, he can choose to devote nearly ALL his thinking-time in between moves to simulating his opponent. Because these more flexible players are not constrained to a rigid control algorithm that divides up their time into little bits, simulating a huge number of fast programs. 
AIXItl does not have the flexibility to say "Well, this time interval, I'm going to keep my operating program the same, and instead of using my time seeking a new operating program, I'm going to spend most of it trying to simulate my opponent, or trying to study my opponent." HOWEVER... it's still quite possible that the AIXItl clones can predict each other, isn't it? If one of them keeps running the same operating program for a while, then the other one should be able to learn an operating program that responds appropriately to that operating program. But I can see that for some cooperative games, it might be unlikely for one of them to keep running the same operating program for a while... they could just keep shifting from program to program in response to each other. If AIXI-tl needs general intelligence but fails to develop general intelligence to solve the complex cooperation problem, while humans starting out with general intelligence do solve the problem, then AIXI-tl has been broken. Well, we have different definitions of broken in this context, but that's not a point worth arguing about. But we aren't *talking* about whether AIXI-tl has a mindlike operating program. We're talking about whether the physically realizable challenge, which definitely breaks the formalism, also breaks AIXI-tl in practice. That's what I originally stated, that's what you originally said you didn't believe, and that's all I'm trying to demonstrate. Yes, you would seem to have successfully shown (logically and intuitively, though not mathematically) that AIXItls can be dumber in their interactions with other AIXItls than humans are in their analogous interactions with other humans. I don't think you should describe this as breaking the formalism, because the formalism is about how a single AIXItl solves a fixed goal function, not about how groups of AIXItls interact. But it's certainly an interesting result. 
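Ben's paraphrase of the control loop is concrete enough to sketch in miniature. The sketch below is a hypothetical toy, not Hutter's algorithm: the "space of all programs of length L" is shrunk to eight memory-one Prisoner's Dilemma policies, and "expected chance of winning" is replaced by replayed reward against recorded history, but the shape of the search is the same: exhaustively enumerate every bounded policy, score each one, adopt the best.

```python
from itertools import product

ACTIONS = ("C", "D")  # cooperate / defect
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}

def enumerate_policies():
    # A "program of length L" shrunk to a first move plus a response to
    # each possible previous opponent move: 2 * 2 * 2 = 8 tiny policies.
    for first, vs_c, vs_d in product(ACTIONS, repeat=3):
        yield (first, {"C": vs_c, "D": vs_d})

def replay_score(policy, opponent_moves):
    # Score a candidate policy by replaying it against the recorded
    # history of opponent moves (crude stand-in for expected reward).
    first, responses = policy
    total, my_move = 0, first
    for opp in opponent_moves:
        total += PAYOFF[(my_move, opp)]
        my_move = responses[opp]
    return total

def choose_operating_program(opponent_moves):
    # Brute-force search over every bounded policy, as AIXItl searches
    # over all programs of length <= l halting within t steps.
    return max(enumerate_policies(), key=lambda p: replay_score(p, opponent_moves))

history = ["C", "C", "D", "C"]
best = choose_operating_program(history)  # always-defect wins this replay
```

Note that the search sees the opponent only through the recorded moves; nothing in the enumerated policy space can represent the exponentially larger search process that generated those moves, which is the crux of the clone problem.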
I hope that, even if you don't take the time to prove it rigorously, you'll write it up in a brief, coherent essay, so that others not on this list can appreciate it... Funky stuff!! ;-) -- Ben G
Re: [agi] Breaking AIXI-tl
Ben Goertzel wrote: I'll read the rest of your message tomorrow... But we aren't *talking* about whether AIXI-tl has a mindlike operating program. We're talking about whether the physically realizable challenge, which definitely breaks the formalism, also breaks AIXI-tl in practice. That's what I originally stated, that's what you originally said you didn't believe, and that's all I'm trying to demonstrate. Your original statement was posed in a misleading way, perhaps not intentionally. There is no challenge on which *an* AIXI-tl doesn't outperform *an* uploaded human. We are all Lee Corbin; would you really say there's more than one... oh, never mind, I don't want to get *that* started here. There's a physical challenge which operates on *one* AIXI-tl and breaks it, even though it involves diagonalizing the AIXI-tl as part of the challenge. In the real world, all reality is interactive and naturalistic, not walled off by a Cartesian theatre. The example I gave is probably the simplest case that clearly breaks the formalism and clearly causes AIXI-tl to operate suboptimally. There are more complex and important cases, that we would understand as roughly constant environmental challenges which break AIXI-tl's formalism in more subtle ways, with the result that AIXI-tl can't cooperate in one-shot PDs with superintelligences... and neither can a human, incidentally, but another seed AI or superintelligence can-I-think, by inventing a new kind of reflective choice which is guaranteed to be correlated as a result of shared initial conditions, both elements that break AIXI-tl... well, anyway, the point is that there's a qualitatively different kind of intelligence here that I think could turn out to be extremely critical in negotiations among superintelligences. The formalism in this situation gets broken, depending on how you're looking at it, by side effects of the AIXI-tl's existence or by violation of the separability condition. 
Actually, violations of the formalism are ubiquitous and this is not particularly counterintuitive; what is counterintuitive is that formalism violations turn out to make a real-world difference. Are we at least in agreement on the fact that there exists a formalizable constant challenge C which accepts an arbitrary single agent and breaks both the AIXI-tl formalism and AIXI-tl? *Reads Ben Goertzel's other message, while working on this one.* OK. We'd better take a couple of days off before taking up the AIXI Friendliness issue. Maybe even wait until I get back from New York in a week. Also, I want to wait for all these emails to show up in the AGI archive, then tell Marcus Hutter about them if no one has already. I'd be interested in seeing what he thinks. What you're trying to show is that there's an inter-AIXI-tl social situation in which AIXI-tls perform less intelligently than humans do in a similar inter-human situation. If you had posed it this way, I wouldn't have been as skeptical initially. If I'd posed it that way, it would have been uninteresting because I wouldn't have broken the formalism. Again, to quote my original claim: 1) There is a class of physically realizable problems, which humans can solve easily for maximum reward, but which - as far as I can tell - AIXI cannot solve even in principle; I don't see this, nor do I believe it... And later expanded to: An intuitively fair, physically realizable challenge, with important real-world analogues, formalizable as a computation which can be fed either a tl-bounded uploaded human or an AIXI-tl, for which the human enjoys greater success measured strictly by total reward over time, due to the superior strategy employed by that human as the result of rational reasoning of a type not accessible to AIXI-tl. It's really the formalizability of the challenge as a computation which can be fed either a *single* AIXI-tl or a *single* tl-bounded uploaded human that makes the whole thing interesting at all... 
I'm sorry I didn't succeed in making clear the general class of real-world analogues for which this is a special case. If I were to take a very rough stab at it, it would be that the cooperation case with your own clone is an extreme case of many scenarios where superintelligences can cooperate with each other on the one-shot Prisoner's Dilemma provided they have *loosely similar* reflective goal systems and that they can probabilistically estimate that enough loose similarity exists. It's the natural counterpart of the Clone challenge - loosely similar goal systems arise all the time, and it turns out that in addition to those goal systems being interpreted as a constant environmental challenge, there are social problems that depend on your being able to correlate your internal processes with theirs (you can correlate internal processes because you're both part of the same naturalistic universe). This breaks AIXI-tl because it's not loosely
Re: [agi] Breaking AIXI-tl
Eliezer S. Yudkowsky wrote: But if this isn't immediately obvious to you, it doesn't seem like a top priority to try and discuss it... Argh. That came out really, really wrong and I apologize for how it sounded. I'm not very good at agreeing to disagree. Must... sleep... -- Eliezer S. Yudkowsky http://singinst.org/ Research Fellow, Singularity Institute for Artificial Intelligence
Re: [agi] Breaking AIXI-tl
Shane Legg wrote: Eliezer, Yes, this is a clever argument. This problem with AIXI has been thought up before but only appears, at least as far as I know, in material that is currently unpublished. I don't know if anybody has analysed the problem in detail as yet... but it certainly is a very interesting question to think about: What happens when two superintelligent AIXIs meet? SI-AIXI is redundant; all AIXIs are enormously far beyond superintelligent. As for the problem, the obvious answer is that no matter what strange things happen, an AIXI^2 which performs Solomonoff^2 induction, using the universal prior of strings output by first-order Oracle machines, will come up with the best possible strategy for handling it... Has the problem been thought up just in the sense of "What happens when two AIXIs meet?" or in the formalizable sense of "Here's a computational challenge C on which a tl-bounded human upload outperforms AIXI-tl"? -- Eliezer S. Yudkowsky http://singinst.org/ Research Fellow, Singularity Institute for Artificial Intelligence
Re: [agi] Breaking AIXI-tl
Eliezer S. Yudkowsky wrote: Has the problem been thought up just in the sense of "What happens when two AIXIs meet?" or in the formalizable sense of "Here's a computational challenge C on which a tl-bounded human upload outperforms AIXI-tl"? I don't know of anybody else considering human upload vs. AIXI. Cheers Shane
Re: [agi] Breaking AIXI-tl
Hi Eliezer, An intuitively fair, physically realizable challenge, with important real-world analogues, formalizable as a computation which can be fed either a tl-bounded uploaded human or an AIXI-tl, for which the human enjoys greater success measured strictly by total reward over time, due to the superior strategy employed by that human as the result of rational reasoning of a type not accessible to AIXI-tl. Roughly speaking: A (selfish) human upload can engage in complex cooperative strategies with an exact (selfish) clone, and this ability is not accessible to AIXI-tl, since AIXI-tl itself is not tl-bounded and therefore cannot be simulated by AIXI-tl, nor does AIXI-tl have any means of abstractly representing the concept "a copy of myself". Similarly, AIXI is not computable and therefore cannot be simulated by AIXI. Thus both AIXI and AIXI-tl break down in dealing with a physical environment that contains one or more copies of them. You might say that AIXI and AIXI-tl can both do anything except recognize themselves in a mirror. Why do you require an AIXI or AIXI-tl to simulate itself, when humans cannot? A human cannot know that another human is an exact clone of itself. All humans or AIXIs can know is what they observe. They cannot know that another mind is identical. The simplest case is the one-shot Prisoner's Dilemma against your own exact clone. It's pretty easy to formalize this challenge as a computation that accepts either a human upload or an AIXI-tl. This obviously breaks the AIXI-tl formalism. Does it break AIXI-tl? This question is more complex than you might think. For simple problems, there's a nonobvious way for AIXI-tl to stumble onto incorrect hypotheses which imply cooperative strategies, such that these hypotheses are stable under the further evidence then received. I would expect there to be classes of complex cooperative problems in which the chaotic attractor AIXI-tl converges to is suboptimal, but I have not proved it. 
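The one-shot clone dilemma is easy to render concrete. A minimal sketch (the payoff numbers are the conventional ones; restricting to deterministic strategies is my assumption): because the clone runs the identical computation on identical inputs, only the diagonal outcomes are ever reachable, and a reasoner who recognizes that correlation simply picks the better diagonal entry, which is the move Yudkowsky argues AIXI-tl's control process cannot recognize.

```python
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}

def outcome_against_exact_clone(decide):
    # The clone runs the identical deterministic computation on the
    # identical inputs, so both moves are the same by construction.
    move = decide()
    return (move, move)

# Whatever deterministic strategy is chosen, only the diagonal
# outcomes (C, C) and (D, D) are reachable against an exact clone:
reachable = {outcome_against_exact_clone(lambda: m) for m in ("C", "D")}

# A reasoner who grasps the correlation picks the best diagonal entry.
best = max(reachable, key=lambda pair: PAYOFF[pair])  # (C, C)
```

In the standard analysis that ignores the correlation, defection still dominates; the disagreement in this thread is over whether AIXI-tl's hypothesis class can ever represent the correlation at all.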
It is definitely true that the physical problem breaks the AIXI formalism and that a human upload can straightforwardly converge to optimal cooperative strategies based on a model of reality which is more correct than any AIXI-tl is capable of achieving. Given that humans can only know what they observe, and thus cannot know what is going on inside another mind, humans are on the same footing as AIXIs in Prisoner's Dilemma. I suspect that two AIXIs or AIXI-tls will do well at the game, since a strategy with betrayal probably needs a longer program than a strategy without betrayal, and the AIXI will weight more strongly a model of the other's behavior with a shorter program. Ultimately AIXI's decision process breaks down in our physical universe because AIXI models an environmental reality with which it interacts, instead of modeling a naturalistic reality within which it is embedded. It's one of two major formal differences between AIXI's foundations and Novamente's. Unfortunately there is a third foundational difference between AIXI and a Friendly AI. I will grant you one thing: that since an AIXI cannot exist and an AIXI-tl is too slow to be practical, using them as a basis for discussing safe AGIs is a bit futile. The other problem is that an AIXI's optimality is only as valid as its assumption about the probability distribution of universal Turing machine programs. Cheers, Bill -- Bill Hibbard, SSEC, 1225 W. Dayton St., Madison, WI 53706 [EMAIL PROTECTED] 608-263-4427 fax: 608-263-6738 http://www.ssec.wisc.edu/~billh/vis.html
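Hibbard's conjecture - that the betrayal-free model of the other player wins because it has the shorter program - can be illustrated with a crude Solomonoff-style weighting. The description lengths below are invented for illustration; the only point is that under a 2^-length prior, a gap of a few bits between two observation-consistent hypotheses becomes an overwhelming posterior gap.

```python
def prior(length_bits):
    # Solomonoff-style universal prior: weight 2^-length for a program
    # of the given description length (lengths here are made up).
    return 2.0 ** -length_bits

# Two toy models of the opponent, each a (description_length, predictor):
hypotheses = {
    "always cooperate":            (10, lambda rnd: "C"),
    "cooperate then betray later": (25, lambda rnd: "C" if rnd < 10 else "D"),
}

observed = ["C"] * 5  # both hypotheses fit five rounds of cooperation

def posterior(hyps, obs):
    # Keep the hypotheses consistent with the data, then renormalize
    # their prior weights.
    consistent = {name: prior(bits) for name, (bits, f) in hyps.items()
                  if all(f(i) == move for i, move in enumerate(obs))}
    total = sum(consistent.values())
    return {name: w / total for name, w in consistent.items()}

post = posterior(hypotheses, observed)  # nearly all mass on the short model
```

The 15-bit length gap gives the betrayal-free model roughly 2^15 times the weight of the betrayal model, so the agent effectively bets on continued cooperation.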