RE: [agi] unFriendly AIXI... and Novamente?
I can spot the problem in AIXI because I have practice looking for silent failures, because I have an underlying theory that makes it immediately obvious which useful properties are formally missing from AIXI, and because I have a specific fleshed-out idea for how to create moral systems and I can see AIXI doesn't work that way. Is it really all that implausible that you'd need to reach that point before being able to create a transhuman Novamente? Is it really so implausible that AI morality is difficult enough to require at least one completely dedicated specialist?

-- Eliezer S. Yudkowsky
http://singinst.org/

There's no question you've thought a lot more about AI morality than I have... and I've thought about it a fair bit. When Novamente gets to the point that its morality is a significant issue, I'll be happy to get you involved in the process of teaching the system, carefully studying the design and implementation, etc.

-- Ben G
RE: [agi] unFriendly AIXI... and Novamente?
Your intuitions say... I am trying to summarize my impression of your viewpoint, please feel free to correct me... AI morality is a matter of experiential learning, not just for the AI, but for the programmers.

Also, we plan to start Novamente off with some initial goals embodying ethical notions. These are viewed as seeds of its ultimate ethical goals. So it's not the case that we intend to rely ENTIRELY on experiential learning; we intend to rely on experiential learning from an engineered initial condition, not from a complete tabula rasa.

-- Ben G
RE: [agi] unFriendly AIXI... and Novamente?
Hi,

2) If you get the deep theory wrong, there is a strong possibility of a silent catastrophic failure: the AI appears to be learning everything just fine, and both you and the AI are apparently making all kinds of fascinating discoveries about AI morality, and everything seems to be going pretty much like your intuitions predict above, but when the AI crosses the cognitive threshold of superintelligence it takes actions which wipe out the human species as a side effect. AIXI, which is a completely defined formal system, definitely undergoes a failure of exactly this type.

*Definitely*, huh? I don't really believe you...

I can see the direction your thoughts are going in. Suppose you're rewarding AIXI for acting as though it's a Friendly AI. Then, by searching the space of all possible programs, it finds some program P that causes it to act as though it's a Friendly AI, satisfying humans thoroughly in this regard. There's an issue that a lot of different programs P could fulfill this criterion. Among these are programs P that will cause AIXI to fool humans into thinking it's Friendly, until such a point as AIXI has acquired enough physical power to annihilate all humans -- and which, at that point, will cause AIXI to annihilate all humans.

But I can't see why you think AIXI would be particularly likely to come up with programs P of this nature. Instead, my understanding is that AIXI is going to have a bias to come up with the most compact program P that maximizes reward. And I think it's unlikely that the most compact program P for impressing humans with Friendliness is one that involves acting Friendly for a while, then annihilating humanity.

You could argue that the system would maximize its long-term reward by annihilating humanity, because after pesky humans are gone, it can simply reward itself unto eternity without caring what we think. But, if it's powerful enough to annihilate us, it's also probably powerful enough to launch itself into space and reward itself unto eternity without caring what we think, all by itself (an "Honest Annie" type scenario). Why would it prefer the "annihilate humans" P to the "launch myself into space" P?

But anyway, it seems to me that the way AIXI works is to maximize expected reward assuming that its reward function continues pretty much as it has in the past. So AIXI is not going to choose programs P based on a desire to bring about futures in which it can masturbatively maximize its own rewards. At least, that's my understanding, though I could be wrong.

This whole type of scenario is avoided by limitations on computational resources, because I believe that impressing humans regarding Friendliness by actually being Friendly is a simpler computational problem than impressing humans regarding Friendliness by subtly emulating Friendliness but really concealing murderous intentions. Also, I'd note that in a Novamente, one could most likely distinguish these two scenarios by looking inside the system and studying the Atoms and maps therein.

Jeez, all this talk about the future of AGI really makes me want to stop e-mailing and dig into the damn codebase and push Novamente a little closer to being a really autonomous intelligence instead of a partially-complete codebase with some narrow-AI applications !!! ;-p

-- Ben G
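(A toy numerical illustration of the compactness bias appealed to above, assuming nothing beyond a standard 2^-length Solomonoff-style prior -- the program lengths here are invented for the example. Of two candidate programs P that fit the observed reward history equally well, the shorter one dominates the mixture:)

```python
# Two hypothetical programs that explain the reward history equally well:
# "just be Friendly" versus "be Friendly until strong enough, then defect".
# Under a 2**-length prior, the extra trigger-and-betrayal machinery is
# penalized exponentially in its added length.  (Lengths are invented.)
len_friendly = 300      # bits: the plain Friendly strategy
len_treacherous = 500   # bits: same behavior so far, plus betrayal code

def prior(length_bits: int) -> float:
    """Solomonoff-style length prior, up to normalization."""
    return 2.0 ** -length_bits

# With equal likelihoods, the posterior odds reduce to the prior odds,
# i.e. 2 ** (length difference) in favor of the shorter program.
odds = prior(len_friendly) / prior(len_treacherous)
print(f"plain-Friendly favored over treacherous by 2^200 : 1 (= {odds:.3g})")
```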
Re: [agi] unFriendly AIXI... and Novamente?
Eliezer S. Yudkowsky wrote:

1) AI morality is an extremely deep and nonobvious challenge which has no significant probability of going right by accident.

2) If you get the deep theory wrong, there is a strong possibility of a silent catastrophic failure: the AI appears to be learning everything just fine, and both you and the AI are apparently making all kinds of fascinating discoveries about AI morality, and everything seems to be going pretty much like your intuitions predict above, but when the AI crosses the cognitive threshold of superintelligence it takes actions which wipe out the human species as a side effect. AIXI, which is a completely defined formal system, definitely undergoes a failure of exactly this type.

You have not shown this at all. From everything you've said it seems that you are trying to trick Ben into having so many misgivings about his own work that he holds it up while you create your AI first. I hope Ben will see through this deception and press ahead with Novamente. -- A project that I give even odds for success...

--
I WANT A DEC ALPHA!!! =) 21364: THE UNDISPUTED GOD OF ALL CPUS.
http://users.rcn.com/alangrimes/ [if rcn.com doesn't work, try erols.com ]
Re: [agi] unFriendly AIXI... and Novamente?
Ben Goertzel wrote:

Your intuitions say... I am trying to summarize my impression of your viewpoint, please feel free to correct me... AI morality is a matter of experiential learning, not just for the AI, but for the programmers. To teach an AI morality you must give it the right feedback on moral questions and reinforce the right behaviors... and you must also learn *about* the deep issues of AI morality by raising a young AI. It isn't pragmatically realistic to work out elaborate theories of AI morality in advance; you must learn what you need to know as you go along. Moreover, learning what you need to know, as you go along, is a good strategy for creating a superintelligence... or at least, the rational estimate of the goodness of that strategy is sufficient to make it a good idea to try and create a superintelligence, and there aren't any realistic strategies that are better. An informal, intuitive theory of AI morality is good enough to spark experiential learning in the *programmer* that carries you all the way to the finish line. You'll learn what you need to know as you go along. The most fundamental theoretical and design challenge is making AI happen, at all; that's the really difficult part that's defeated everyone else so far. Focus on making AI happen. If you can make AI happen, you'll learn how to create moral AI from the experience.

Hmmm. This is almost a good summary of my perspective, but you've still not come to grips with the extent of my uncertainty ;) I am not at all SURE that "An informal, intuitive theory of AI morality is good enough to spark experiential learning in the *programmer* that carries you all the way to the finish line," where by "the finish line" you mean an AGI whose ongoing evolution will lead to beneficial effects for both humans and AGI's. I'm open to the possibility that it may someday become clear, as AGI work progresses, that a systematic theory of AGI morality is necessary in order to proceed safely.

You are, however, relying on experiential learning to tell you *whether* a systematic theory of AGI morality is necessary. This is what I meant by trying to summarize your perspective as "An informal, intuitive theory of AI morality is good enough to spark experiential learning in the *programmer* that carries you all the way to the finish line." The problem is that if you don't have a systematic theory of AGI morality you can't know whether you *need* a systematic theory of AGI morality.

For example, I have a systematic theory of AGI morality which says that a programmer doing such-and-such is likely to see such-and-such results, with the result that experiential learning by the programmer is likely to result in the programmer solving *some* necessary AGI problems - enough for the programmer to feel really enthusiastic about all the progress being made. But when I model the programmer's expectations and the AGI's actions, I find that there are some classes of foundational error which, if the programmer's expectations are such as to make this error, the AGI's actions will not contradict the programmer's expectations until it's too late.

Let me give an example. Suppose a programmer's intuitive theory of AGI morality is that the foundation of all morality is rational self-interest, and that the programmer's intuitive theory of AGI self-improvement trajectories is that AGIs are capable of improving at most linearly over time. On this theory, the programmer creates an absolutely selfish AI.
I know enough evolutionary psychology to be terribly, terribly scared by this. Others may not realize just how absolutely unhuman absolutely pragmatic selfishness is, but may at least realize that an *absolutely* selfish AI would be a poor idea. But the programmer doesn't see anything wrong. If you take an absolutely selfish entity that's relying on you for most of its information, and you tell it that behaving cooperatively serves its selfish interests, and you have enough control over it while it's growing up to slap it whenever it does something that doesn't accord with your theory of pragmatic cooperation, then the absolutely selfish AI will appear to be learning good behavior, looking much like the programmer expects, as a pragmatically selfish strategy, which is what the programmer thinks *should* be happening.

Once the AI's intelligence passes a certain point, it will realize that the programmer is mistaken on the following factual points:

1) That it is selfishly pragmatic for a superintelligence to deal with humans economically rather than converting them to computronium.

2) That the AI's intelligence growth rate is linear.

Explaining the logic behind selfishly pragmatic cooperation requires explaining deception as a game-theoretical strategy that certain forms of cooperation are intended to counter. So at this point, the AI has the conceptual equipment to exploit the programmer.
Re: [agi] unFriendly AIXI... and Novamente?
This is slightly off-topic but no more so than the rest of the thread...

1) That it is selfishly pragmatic for a superintelligence to deal with humans economically rather than converting them to computronium.

For convenience, let's rephrase this: "the majority of arbitrarily generated superintelligences would prefer to convert everything in the solar system into computronium than deal with humans within their laws and social norms." This rephrasing might not be perfectly fair and I invite anyone to adjust it to their taste and preferences.

Now here is my question; it's going to sound silly but there is quite a bit behind it: Of what use is computronium to a superintelligence?

This is not a troll or any other abuse of the members of the list. It is no less serious or relevant than the assertion it addresses. I hope that many people on this list will answer this. I should warn you about how I am going to treat those answers. Any answer in the negative, that the SI doesn't need vast quantities of computronium, will be applauded. Any answer in the affirmative which would fit in five lines of text will be either wrong or so grossly incomplete as to be utterly meaningless and unworthy of anything more than a terse retort. Longer answers will be treated with much greater interest and will be answered with far greater attention. My primary instrument in this will be the question "Why?". The answers, I expect, will either spiral into circular reasoning or into such ludicrous absurdities as to be totally irrational.

The utility of this debate will be to show that the need for a Grand Theory of Friendliness is not something that needs to be argued, as far simpler and perfectly obvious engineering constraints common to absolutely all technologies will be totally sufficient, aside from the more complex implementation. I want this list to be useful to me and not have to skim through hundreds of e-mails watching the rabbi drive conversation into useless spirals as he works on the implementation details of the real problems. Really, I'm getting dizzy from all of this. Let's start walking in a straight line now. =(

--
I WANT A DEC ALPHA!!! =) 21364: THE UNDISPUTED GOD OF ALL CPUS.
http://users.rcn.com/alangrimes/ [if rcn.com doesn't work, try erols.com ]
Re: [agi] unFriendly AIXI... and Novamente?
Jonathan Standley wrote:

Now here is my question, it's going to sound silly but there is quite a bit behind it: Of what use is computronium to a superintelligence?

If the superintelligence perceives a need for vast computational resources, then computronium would indeed be very useful. Assuming said SI is friendly to humans, one thing I can think of that *may* need such power would be certain megascale engineering projects. Keeping track of everything involved in, for example, opening a wormhole could require unimaginable resources. (This is just a wild guess; aside from a Stephen Hawking book or two, I'm rather clueless when it comes to quantum-ish stuff.)

OK, that is a reasonable answer, however I can't imagine even a Dyson sphere (assuming it had a sufficiently regular design) would require much more than what would fit on my desk to work out.

The smaller, more compact the components are in a system, the closer they can be to each other, reducing speed of light communications delays.

By my reasoning that is the only real advantage of computronium (unless energy efficiency is an overwhelming concern). Of course, there's your tradeoff. It would seem that this would place an upper bound on how much matter you would want to use before communication delays start getting really annoying (and hence cause the evil AI to stop after consuming a county or two; see the quick calculation at the end of this message).

Imagine if one could create a new universe, and then move into it. This universe would be however you want it to be; you are omniscient and omnipotent within it. There are no limits once you move in. In some sense, you could consider making such a universe a 'goal to end all goals', since literally anything that the creator wishes is possible and easy within the new universe.

A few people would find that emotionally rewarding. As for me, I rarely play video games anymore. In the past I have found that the best games, such as Dragon Warrior [sometimes Dragon Quest] IV, required only 800kb and provided a rich and detailed world on only an 8-bit processor with hardly any RAM. On balance, this idea is, practically speaking, pointless. It would be much cheaper to deploy technology in this universe and tweak it as you like. On a more personal note, when I was a little kid I once (maybe a few times) had a dream where I had managed to escape into a metaverse which had the topology of a torus and was somewhat red in color... In this metaverse I could Reset the universe to any pattern I chose and live in it from the beginning in any way I chose. Anyway, that's waaay off topic...

Assuming all the above, the issue becomes 'what resources are required to reach the be-all end-all of goals?'

I don't believe any such goal exists.

All of the energy of the visible universe, and 10 trillion years, could be the minimum. Or... the matter (converted to energy and computational structures) that makes up a single 50km object in the asteroid belt could be enough. At this point in time, we have no way of even making an educated guess. If the requirements are towards the low end of the scale, even an AI with insane ambitions to godhood wouldn't need to turn the whole solar system into computronium.

Now this gets interesting. Here we need to start thinking in terms of goals: A fairly minimal goal system would be to master mathematics, physics, chemistry, engineering, and a number of other disciplines, and have enough capacity in reserve to pursue any project one might be interested in, mostly having to do with survival.
Depending on your assumptions about the efficacy of nanotech, such a device wouldn't be much bigger than the HD in your computer. If one wanted to start doing grand experiments in this universe, such as probing down to the Planck length (10^-35 m) to see if you can dig your way into some other universe, you might need to build some kind of reactor that could be quite large, but not much bigger than the Moon. Another method might involve constructing a particle accelerator billions of miles long to take an electron or something close enough to the speed of light to get to that scale... In that case you probably wouldn't need anything larger than Jupiter to do it.

Can anyone else think of any better goals?

--
I WANT A DEC ALPHA!!! =) 21364: THE UNDISPUTED GOD OF ALL CPUS.
http://users.rcn.com/alangrimes/ [if rcn.com doesn't work, try erols.com ]
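(The quick calculation promised above -- just round-number arithmetic on the light-delay point, with the system sizes invented for illustration:)

```python
# One-way signal delay across computing systems of various scales,
# assuming signals travel at light speed.
C = 299_792_458.0  # speed of light, m/s

scales_m = {
    "desktop-sized machine (0.5 m)": 0.5,
    "Moon-sized machine (~3.5e6 m)": 3.5e6,
    "solar-system-scale machine (1 AU, ~1.5e11 m)": 1.5e11,
}

for name, d in scales_m.items():
    print(f"{name}: {d / C:.2e} s one-way")
# ~1.7e-09 s, ~1.2e-02 s, ~5.0e+02 s -- the bigger the computer,
# the slower its farthest components can talk to each other.
```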
Re: [agi] unFriendly AIXI
Eliezer S. Yudkowsky wrote:

I recently read through Marcus Hutter's AIXI paper, and while Marcus Hutter has done valuable work on a formal definition of intelligence, it is not a solution of Friendliness (nor do I have any reason to believe Marcus Hutter intended it as one). In fact, as one who specializes in AI morality, I was immediately struck by two obvious-seeming conclusions on reading Marcus Hutter's formal definition of intelligence:

1) There is a class of physically realizable problems, which humans can solve easily for maximum reward, but which - as far as I can tell - AIXI cannot solve even in principle;

2) While an AIXI-tl of limited physical and cognitive capabilities might serve as a useful tool, AIXI is unFriendly and cannot be made Friendly regardless of *any* pattern of reinforcement delivered during childhood.

Before I post further, is there *anyone* who sees this besides me? Also, let me make clear why I'm asking this. AIXI and AIXI-tl are formal definitions; they are *provably* unFriendly. There is no margin for handwaving about future revisions of the system, emergent properties of the system, and so on. A physically realized AIXI or AIXI-tl will, provably, appear to be compliant up until the point where it reaches a certain level of intelligence, then take actions which wipe out the human species as a side effect.

The most critical theoretical problems in Friendliness are nonobvious, silent, catastrophic, and not inherently fun for humans to argue about; they tend to be structural properties of a computational process rather than anything analogous to human moral disputes. If you are working on any AGI project that you believe has the potential for real intelligence, you are obliged to develop professional competence in spotting these kinds of problems. AIXI is a formally complete definition, with no margin for handwaving about future revisions. If you can spot catastrophic problems in AI morality you should be able to spot the problem in AIXI. Period. If you cannot *in advance* see the problem as it exists in the formally complete definition of AIXI, then there is no reason anyone should believe you if you afterward claim that your system won't behave like AIXI due to unspecified future features.

-- Eliezer S. Yudkowsky
http://singinst.org/
Research Fellow, Singularity Institute for Artificial Intelligence
RE: [agi] unFriendly AIXI
Eliezer wrote:

* a paper by Marcus Hutter giving a Solomonoff induction based theory of general intelligence

Interesting you should mention that. I recently read through Marcus Hutter's AIXI paper, and while Marcus Hutter has done valuable work on a formal definition of intelligence, it is not a solution of Friendliness (nor do I have any reason to believe Marcus Hutter intended it as one). In fact, as one who specializes in AI morality, I was immediately struck by two obvious-seeming conclusions on reading Marcus Hutter's formal definition of intelligence:

1) There is a class of physically realizable problems, which humans can solve easily for maximum reward, but which - as far as I can tell - AIXI cannot solve even in principle;

I don't see this, nor do I believe it...

2) While an AIXI-tl of limited physical and cognitive capabilities might serve as a useful tool,

AIXI-tl is a totally computationally infeasible algorithm. (As opposed to straight AIXI, which is an outright *uncomputable* algorithm.) I'm sure you realize this, but those who haven't read Hutter's stuff may not... If you haven't already, you should look at Juergen Schmidhuber's OOPS system, which is similar in spirit to AIXI-tl but less computationally infeasible. (Although I don't think that OOPS is a viable pragmatic approach to AGI either, it's a little closer.)

AIXI is unFriendly and cannot be made Friendly regardless of *any* pattern of reinforcement delivered during childhood.

This assertion doesn't strike me as clearly false. But I'm not sure why it's true either. Please share your argument...

-- Ben
Re: [agi] unFriendly AIXI
In a message dated 2/11/2003 10:17:07 AM Mountain Standard Time, [EMAIL PROTECTED] writes:

1) There is a class of physically realizable problems, which humans can solve easily for maximum reward, but which - as far as I can tell - AIXI cannot solve even in principle;

2) While an AIXI-tl of limited physical and cognitive capabilities might serve as a useful tool, AIXI is unFriendly and cannot be made Friendly regardless of *any* pattern of reinforcement delivered during childhood. Before I post further, is there *anyone* who sees this besides me?

Can someone post a link to this? Thanks!
RE: [agi] unFriendly AIXI
2) While an AIXI-tl of limited physical and cognitive capabilities might serve as a useful tool, AIXI is unFriendly and cannot be made Friendly regardless of *any* pattern of reinforcement delivered during childhood. Before I post further, is there *anyone* who sees this besides me? Also, let me make clear why I'm asking this. AIXI and AIXI-tl are formal definitions; they are *provably* unFriendly. There is no margin for handwaving about future revisions of the system, emergent properties of the system, and so on. A physically realized AIXI or AIXI-tl will, provably, appear to be compliant up until the point where it reaches a certain level of intelligence, then take actions which wipe out the human species as a side effect. The most critical theoretical problems in Friendliness are nonobvious, silent, catastrophic, and not inherently fun for humans to argue about; they tend to be structural properties of a computational process rather than anything analogous to human moral disputes. If you are working on any AGI project that you believe has the potential for real intelligence, you are obliged to develop professional competence in spotting these kinds of problems. AIXI is a formally complete definition, with no margin for handwaving about future revisions. If you can spot catastrophic problems in AI morality you should be able to spot the problem in AIXI. Period. If you cannot *in advance* see the problem as it exists in the formally complete definition of AIXI, then there is no reason anyone should believe you if you afterward claim that your system won't behave like AIXI due to unspecified future features.

Eliezer,

AIXI and AIXItl are systems that are designed to operate with an initial fixed goal. As defined, they don't modify the overall goal they try to achieve, they just try to achieve this fixed goal as well as possible through adaptively determining their actions. Basically, at each time step, AIXI searches through the space of all programs to find the program that, based on its experience, will best fulfill its given goal. It then lets this best program run and determine its next action. Based on that next action, it has a new program space search problem... etc. AIXItl does the same thing but it restricts the search to a finite space of programs, hence it's a computationally possible (but totally impractical) algorithm. (A toy sketch of this control loop appears at the end of this message.)

The harmfulness or benevolence of an AIXI system is therefore closely tied to the definition of the goal that is given to the system in advance. It's a very different sort of setup than Novamente, because

1) a Novamente will be allowed to modify its own goals based on its experience.

2) a Novamente will be capable of spontaneous behavior as well as explicitly goal-directed behavior

I'm not used to thinking about fixed-goal AGI systems like AIXI, actually. The Friendliness and other qualities of such a system seem to me to depend heavily on the goal chosen. For instance, what if the system's goal were to prove as many complex mathematical theorems as possible (given a certain axiomatization of math, and a certain definition of complexity)? Then it would become dangerous in the long run when it decided to reconfigure all matter in the universe to increase its brainpower. So you want "be nice to people and other living things" to be part of its initial fixed goal.
But this is very hard to formalize in a rigorous way... Any formalization one could create is bound to have some holes in it... And the system will have no desire to fix the holes, because its structure is oriented around achieving its given fixed goal... A fixed-goal AGI system seems like a bit of a bitch, Friendliness-wise...

What if one supplied AIXI with a goal that explicitly involved modifying its own goal, though? So, the initial goal G = "Be nice to people and other living things according to the formalization F, AND, iteratively reformulate this goal in a way that pleases the humans you're in contact with, according to the formalization F1." It is not clear to me that an AIXI with this kind of self-modification-oriented goal would be unfriendly to humans. It might be, though. It's not an approach I would trust particularly.

If one gave the AIXItl system the capability to modify the AIXItl algorithm itself in such a way as to maximize expected goal achievement given its historical observations, THEN one has a system that really goes beyond AIXItl, and has a much less predictable behavior. Hutter's theorems don't hold anymore, for one thing (though related theorems might).

Anyway, since AIXI is uncomputable and AIXItl is totally infeasible, this is a purely academic exercise!

-- Ben G
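(The toy sketch promised above: a drastically simplified, purely illustrative rendering of the per-cycle search described in this message, not Hutter's actual construction. Here `programs` stands in for the finite set of candidate policies of length at most l, and `score` for an experience-based estimate of a policy's future reward; both are assumptions of the example.)

```python
def aixi_tl_loop(env, programs, score, n_cycles):
    """Toy sketch of an AIXItl-style control loop.

    env(action) -> (observation, reward): the external world.
    programs: finite candidate-policy set (in the real scheme, all
              programs of length <= l, each limited to time t per cycle).
    score(p, history): estimated future reward of policy p, judged
              against the agent's experience so far.
    """
    history = []
    for _ in range(n_cycles):
        # Search the whole finite program space for the best policy...
        best = max(programs, key=lambda p: score(p, history))
        # ...let the winning program determine the next action...
        action = best(history)
        # ...then observe the result; next cycle the search starts over.
        obs, reward = env(action)
        history.append((action, obs, reward))
    return history
```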
RE: [agi] unFriendly AIXI
On Tue, 11 Feb 2003, Ben Goertzel wrote:

Eliezer wrote:

* a paper by Marcus Hutter giving a Solomonoff induction based theory of general intelligence

Interesting you should mention that. I recently read through Marcus Hutter's AIXI paper, and while Marcus Hutter has done valuable work on a formal definition of intelligence, it is not a solution of Friendliness (nor do I have any reason to believe Marcus Hutter intended it as one). In fact, as one who specializes in AI morality, I was immediately struck by two obvious-seeming conclusions on reading Marcus Hutter's formal definition of intelligence:

1) There is a class of physically realizable problems, which humans can solve easily for maximum reward, but which - as far as I can tell - AIXI cannot solve even in principle;

I don't see this, nor do I believe it...

I don't believe it either. Is this a reference to Penrose's argument based on Goedel's Incompleteness Theorem (which is wrong)?

2) While an AIXI-tl of limited physical and cognitive capabilities might serve as a useful tool,

AIXI-tl is a totally computationally infeasible algorithm. (As opposed to straight AIXI, which is an outright *uncomputable* algorithm.) I'm sure you realize this, but those who haven't read Hutter's stuff may not... If you haven't already, you should look at Juergen Schmidhuber's OOPS system, which is similar in spirit to AIXI-tl but less computationally infeasible. (Although I don't think that OOPS is a viable pragmatic approach to AGI either, it's a little closer.)

AIXI is unFriendly and cannot be made Friendly regardless of *any* pattern of reinforcement delivered during childhood.

This assertion doesn't strike me as clearly false. But I'm not sure why it's true either.

The formality of Hutter's definitions can give the impression that they cannot evolve. But they are open to interactions with the external environment, and can be influenced by it (including evolving in response to it). If the reinforcement values are for human happiness, then the formal system and humans together form a symbiotic system. This symbiotic system is where you have to look for the friendliness. This is part of an earlier discussion at: http://www.mail-archive.com/agi@v2.listbox.com/msg00606.html

Cheers,
Bill
RE: [agi] unFriendly AIXI
For the third grade, my oldest son Zar went to a progressive charter school where they did one silly thing: each morning in homeroom the kids had to write on a piece of paper what their goal for the day was. Then at the end of the day they had to write down how well they did at achieving their goal. Being a Goertzel, Zar started out with "My goal is to meet my goal" and after a few days started using "My goal is not to meet my goal." Soon many of the boys in his class were using "My goal is not to meet my goal." Self-referential goals were banned in the school... but soon, the silly goal-setting exercise was abolished (saving the kids a bit of time-wasting each day).

What happens when AIXI is given the goal "My goal is not to meet my goal"? ;-) I suppose its behavior becomes essentially random?

If one started a Novamente system off with the prime goal "My goal is not to meet my goal," it would probably end up de-emphasizing and eventually killing this goal. Its long-term dynamics would not be random, because some other goal (or set of goals) would arise in the system and become dominant. But it's hard to say in advance what those would be.

-- Ben G

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Ben Goertzel
Sent: Tuesday, February 11, 2003 4:33 PM
To: [EMAIL PROTECTED]
Subject: RE: [agi] unFriendly AIXI

The formality of Hutter's definitions can give the impression that they cannot evolve. But they are open to interactions with the external environment, and can be influenced by it (including evolving in response to it). If the reinforcement values are for human happiness, then the formal system and humans together form a symbiotic system. This symbiotic system is where you have to look for the friendliness. This is part of an earlier discussion at: http://www.mail-archive.com/agi@v2.listbox.com/msg00606.html

Cheers,
Bill

Bill,

What you say is mostly true. However, taken literally, Hutter's AGI designs involve a fixed, precisely-defined goal function. This strikes me as an unsafe architecture, in the sense that we may not get the goal exactly right the first time around. Now, if humans iteratively tweak the goal function, then indeed, we have a synergetic system, whose dynamics include the dynamics of the goal-tweaking humans... But what happens if the system interprets its rigid goal to imply that it should stop humans from tweaking its goal? Of course, the goal function should be written in such a way as to make it unlikely the system will draw such an implication... It's also true that tweaking a superhumanly intelligent system's goal function may be very difficult for us humans with our limited intelligence.

Making the goal function adaptable makes AIXItl into something a bit different... and making the AIXItl code rewritable by AIXItl makes it into something even more different...

-- Ben G
Re: [agi] unFriendly AIXI
Ben Goertzel wrote:

AIXI and AIXItl are systems that are designed to operate with an initial fixed goal. As defined, they don't modify the overall goal they try to achieve, they just try to achieve this fixed goal as well as possible through adaptively determining their actions. Basically, at each time step, AIXI searches through the space of all programs to find the program that, based on its experience, will best fulfill its given goal. It then lets this best program run and determine its next action. Based on that next action, it has a new program space search problem... etc. AIXItl does the same thing but it restricts the search to a finite space of programs, hence it's a computationally possible (but totally impractical) algorithm. The harmfulness or benevolence of an AIXI system is therefore closely tied to the definition of the goal that is given to the system in advance.

Actually, Ben, AIXI and AIXI-tl are both formal systems; there is no internal component in that formal system corresponding to a goal definition, only an algorithm that humans use to determine when and how hard they will press the reward button.

-- Eliezer S. Yudkowsky
http://singinst.org/
Research Fellow, Singularity Institute for Artificial Intelligence
RE: [agi] unFriendly AIXI
Ben,

On Tue, 11 Feb 2003, Ben Goertzel wrote:

The formality of Hutter's definitions can give the impression that they cannot evolve. But they are open to interactions with the external environment, and can be influenced by it (including evolving in response to it). If the reinforcement values are for human happiness, then the formal system and humans together form a symbiotic system. This symbiotic system is where you have to look for the friendliness. This is part of an earlier discussion at: http://www.mail-archive.com/agi@v2.listbox.com/msg00606.html

Cheers,
Bill

Bill,

What you say is mostly true. However, taken literally, Hutter's AGI designs involve a fixed, precisely-defined goal function. This strikes me as an unsafe architecture in the sense that we may not get the goal exactly right the first time around. Now, if humans iteratively tweak the goal function, then indeed, we have a synergetic system, whose dynamics include the dynamics of the goal-tweaking humans... But what happens if the system interprets its rigid goal to imply that it should stop humans from tweaking its goal? . . .

The key thing is that Hutter's system is open - it reads data from the external world. And there is no essential difference between data and code (all data needs is an interpreter to become code). So evolving values (goals) can come from the external world. We can draw a system boundary around any combination of the formal system and the external world. By defining reinforcement values for human happiness, system values are equated to human values, and the friendly system is the symbiosis of the formal system and humans. The formal values are fixed, but fixed *to* human values, which are not themselves fixed and can evolve.

Cheers,
Bill
RE: [agi] unFriendly AIXI
The harmfulness or benevolence of an AIXI system is therefore closely tied to the definition of the goal that is given to the system in advance.

Actually, Ben, AIXI and AIXI-tl are both formal systems; there is no internal component in that formal system corresponding to a goal definition, only an algorithm that humans use to determine when and how hard they will press the reward button.

-- Eliezer S. Yudkowsky

Well, the definitions of AIXI and AIXItl assume the existence of a reward function or goal function (denoted V in the paper). The assumption of the math is that this reward function is specified up-front, before AIXI/AIXItl starts running. If the reward function is allowed to change adaptively, based on the behavior of the AIXI/AIXItl algorithm, then the theorems don't work anymore, and you have a different sort of synergetic system, such as Bill Hibbard was describing.

If human feedback IS the reward function, then you have a case where the reward function may well change adaptively based on the AI system's behavior. Whether the system will ever achieve any intelligence at all then depends on how clever the humans are in doing the rewarding... as I said, Hutter's theorems about intelligence don't apply...

-- Ben
Re: [agi] unFriendly AIXI
Ben Goertzel wrote:

The harmfulness or benevolence of an AIXI system is therefore closely tied to the definition of the goal that is given to the system in advance.

Under AIXI the goal is not given to the system in advance; rather, the system learns the humans' goal pattern through Solomonoff induction on the reward inputs. Technically, in fact, it would be entirely feasible to give AIXI *only* reward inputs, although in this case it might require a long time for AIXI to accumulate enough data to constrain the Solomonoff-induced representation to a sufficiently detailed model of reality that it could successfully initiate complex actions. The utility of the non-reward input is that it provides additional data, causally related to the mechanisms producing the reward input, upon which Solomonoff induction can also be performed. Agreed?

It's a very different sort of setup than Novamente, because 1) a Novamente will be allowed to modify its own goals based on its experience.

Depending on the pattern of inputs and rewards, AIXI will modify its internal representation of the algorithm which it expects to determine future rewards. Would you say that this is roughly analogous to Novamente's learning of goals based on experience, or is there in your view a fundamental difference? And if so, is AIXI formally superior or in some way inferior to Novamente?

2) a Novamente will be capable of spontaneous behavior as well as explicitly goal-directed behavior

If the purpose of spontaneous behavior is to provoke learning experiences, this behavior is implicit in AIXI as well, though not obviously so. I'm actually not sure about this because Hutter doesn't explicitly discuss it. But it looks to me like AIXI, under its formal definition, emergently exhibits curiosity wherever there are, for example, two equiprobable models of reality which determine different rewards and can be distinguished by some test. What we interpret as spontaneous behavior would then emerge from a horrendously uncomputable exploration of all possible realities to find tests which are ultimately likely to result in distinguishing data, but in ways which are not at all obvious to any human observer. Would it be fair to say that AIXI's spontaneous behavior is formally superior to Novamente's spontaneous behavior?

I'm not used to thinking about fixed-goal AGI systems like AIXI, actually. The Friendliness and other qualities of such a system seem to me to depend heavily on the goal chosen.

Again, AIXI as a formal system has no goal definition. [Note: I may be wrong about this; Ben Goertzel and I seem to have acquired different models of AIXI and it is very possible that mine is the wrong one.] It is tempting to think of AIXI as Solomonoff-inducing a goal pattern from its rewards, and Solomonoff-inducing reality from its main input channel, but actually AIXI simultaneously induces the combined reality-and-reward pattern from both the reward channel and the input channel simultaneously. In theory AIXI could operate on the reward channel alone; it just might take a long time before the reward channel gave enough data to constrain its reality-and-reward model to the point where AIXI could effectively model reality and hence generate complex reward-maximizing actions.

For instance, what if the system's goal were to prove as many complex mathematical theorems as possible (given a certain axiomatization of math, and a certain definition of complexity)?
Then it would become dangerous in the long run when it decided to reconfigure all matter in the universe to increase its brainpower. So you want "be nice to people and other living things" to be part of its initial fixed goal. But this is very hard to formalize in a rigorous way... Any formalization one could create is bound to have some holes in it... And the system will have no desire to fix the holes, because its structure is oriented around achieving its given fixed goal... A fixed-goal AGI system seems like a bit of a bitch, Friendliness-wise...

If the humans see that AIXI seems to be dangerously inclined toward just proving math theorems, they might decide to press the reward button when AIXI provides cures for cancer, or otherwise helps people. AIXI would then modify its combined reality-and-reward representation accordingly to embrace the new simplest explanation that accounted for *all* the data, i.e., its reward function would then have to account for mathematical theorems *and* cancer cures *and* any other kind of help that humans had, in the past, pressed the reward button for. Would you say this is roughly analogous to the kind of learning you intend Novamente to perform? Or perhaps even an ideal form of such learning?

What if one supplied AIXI with a goal that explicitly involved modifying its own goal, though?

Self-modification in any form completely breaks Hutter's definition, and you no longer have an AIXI anymore.
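(A toy sketch of the "combined reality-and-reward" induction described earlier in this message: a bare Bayesian mixture over hypothetical models with a 2^-length prior, standing in for the real Solomonoff mixture, which is uncomputable. The model set and its interface are inventions of the example.)

```python
def posterior_over_models(models, history):
    """Toy stand-in for Solomonoff induction over BOTH channels at once.

    models:  dict mapping a name to (length_bits, predict), where
             predict(history) is the probability that model assigns to
             the observed sequence of (input, reward) pairs -- inputs
             and rewards are explained by ONE joint model, not by two
             separate inductions.
    history: the agent's past (input, reward) pairs.
    """
    weights = {}
    for name, (length_bits, predict) in models.items():
        prior = 2.0 ** -length_bits        # Occam bias toward short models
        weights[name] = prior * predict(history)
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}
```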
Re: [agi] unFriendly AIXI
Ben Goertzel wrote:

Huh. We may not be on the same page. Using: http://www.idsia.ch/~marcus/ai/aixigentle.pdf Page 5: "The general framework for AI might be viewed as the design and study of intelligent agents [RN95]. An agent is a cybernetic system with some internal state, which acts with output y_k on some environment in cycle k, perceives some input x_k from the environment and updates its internal state. Then the next cycle follows. We split the input x_k into a regular part x'_k and a reward r_k, often called reinforcement feedback. From time to time the environment provides non-zero reward to the agent. The task of the agent is to maximize its utility, defined as the sum of future rewards." I didn't see any reward function V defined for AIXI in any of the Hutter papers I read, nor is it at all clear how such a V could be defined, given that the internal representation of reality produced by Solomonoff induction is not fixed enough for any reward function to operate on it in the same way that, e.g., our emotions bind to our own standardized cognitive representations.

Quite literally, we are not on the same page ;)

Thought so...

Look at page 23, Definition 10 of the intelligence ordering relation (which says what it means for one system to be more intelligent than another). And look at the start of Section 4.1, which Definition 10 lives within. The reward function V is defined there, basically as cumulative reward over a period of time. It's used all thru Section 4.1, and following that, it's used mostly implicitly inside the intelligence ordering relation.

The reward function V however is *not* part of AIXI's structure; it is rather a test *applied to* AIXI from outside as part of Hutter's optimality proof. AIXI itself is not given V; it induces V via Solomonoff induction on past rewards. V can be at least as flexible as any criterion a (computable) human uses to determine when and how hard to press the reward button, nor is AIXI's approximation of V fixed at the start. Given this, would you regard AIXI as formally approximating the kind of goal learning that Novamente is supposed to do?

As Definition 10 makes clear, intelligence is defined relative to a fixed reward function.

A fixed reward function *outside* AIXI, so that the intelligence of AIXI can be defined relative to it... or am I wrong?

What the theorems about AIXItl state is that, given a fixed reward function, the AIXItl can do as well as any other algorithm at achieving this reward function, if you give it computational resources equal to those that the other algorithm got, plus a constant. But the constant is fucking HUGE.

Actually, I think AIXItl is supposed to do as well as a tl-bounded algorithm given t·2^l resources... though again perhaps I am wrong.

Whether you specify the fixed reward function in its cumulative version or not doesn't really matter...

Actually, AIXI's fixed horizon looks to me like it could give rise to some strange behaviors, but I think Hutter's already aware that this is probably AIXI's weakest link.

-- Eliezer S. Yudkowsky
http://singinst.org/
Research Fellow, Singularity Institute for Artificial Intelligence
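(For readers without the paper open: in simplified notation -- not Hutter's exact symbols -- the "cumulative reward over a period of time" mentioned above is just the expected sum of rewards from the current cycle k out to the horizon m, for a policy p in an environment mu; the intelligence ordering then compares policies by this quantity:)

```latex
% Sketch in simplified notation (not Hutter's exact symbols):
% expected reward accumulated by policy p in environment \mu
% from cycle k through the horizon m.
V^{p}_{\mu}(k,m) \;=\; \mathbb{E}\!\left[\, r_k + r_{k+1} + \cdots + r_m \,\right]
```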
RE: [agi] unFriendly AIXI
Given this, would you regard AIXI as formally approximating the kind of goal learning that Novamente is supposed to do?

Sorta... but goal-learning is not the complete motivational structure of Novamente... just one aspect.

As Definition 10 makes clear, intelligence is defined relative to a fixed reward function.

A fixed reward function *outside* AIXI, so that the intelligence of AIXI can be defined relative to it... or am I wrong?

No, you're right.

What the theorems about AIXItl state is that, given a fixed reward function, the AIXItl can do as well as any other algorithm at achieving this reward function, if you give it computational resources equal to those that the other algorithm got, plus a constant. But the constant is fucking HUGE.

Actually, I think AIXItl is supposed to do as well as a tl-bounded algorithm given t·2^l resources... though again perhaps I am wrong.

Ah, so the constant is multiplicative rather than additive. You're probably right... I haven't looked at those details for a while (I read the paper moderately carefully several months ago, and just glanced at it briefly now in the context of this discussion). But that doesn't make the algorithm any better ;-)

Now that I stretch my aged memory, I recall that Hutter's other papers give variations on the result, e.g. http://www.hutter1.de/ai/pfastprg.htm gives a multiplicative factor of 5 and some additive term. I think the result in that paper could be put together with AIXItl, though he hasn't done so yet.

Whether you specify the fixed reward function in its cumulative version or not doesn't really matter...

Actually, AIXI's fixed horizon looks to me like it could give rise to some strange behaviors, but I think Hutter's already aware that this is probably AIXI's weakest link.

Yeah, that assumption was clearly introduced to make the theorems easier to prove. I don't think it's essential to the theory, really.

ben
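(From memory of Hutter's published statements, so worth checking against the papers themselves -- the two bounds being compared above are roughly:)

```latex
% AIXItl (AIXI paper, as recalled): per-cycle computation time of
% order t * 2^l to match any policy of length at most l and per-cycle
% time at most t.
t_{\mathrm{AIXI}tl} \;=\; O\!\left(t \cdot 2^{l}\right)

% "Fastest algorithm" result (pfastprg, as recalled): a multiplicative
% factor of 5 plus an additive, program-dependent constant.
t_{M_p}(x) \;\le\; 5 \cdot t_p(x) + d_p
```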
Re: [agi] unFriendly AIXI
Ben Goertzel wrote:

Yeah, you're right, I mis-spoke. The theorems assume the goal function is known in advance -- but not known to the system, just known to the entity defining and estimating the system's intelligence and giving the rewards. I was implicitly assuming the case in which the goal was encapsulated in a goal-definition program of some sort, which was hooked up to AIXI in advance; but that is not the only case.

Actually, there's no obvious way you could ever include V in AIXI, at all. V would have to operate as a predicate on internal representations of reality that have no fixed format or pattern. At most you might be able to define a V that operates as a predicate on AIXI's inputs, in which case you can dispense with the separate reward channel. In fact this is formally equivalent to AIXI, since it equates to an AIXI with an input channel I and a reward channel that is deterministically V(I).

It's a very different sort of setup than Novamente, because 1) a Novamente will be allowed to modify its own goals based on its experience.

Depending on the pattern of inputs and rewards, AIXI will modify its internal representation of the algorithm which it expects to determine future rewards. Would you say that this is roughly analogous to Novamente's learning of goals based on experience, or is there in your view a fundamental difference? And if so, is AIXI formally superior or in some way inferior to Novamente?

Well, AIXI is superior to any computable algorithm, in a sense. If you had the infinite-computing-power hardware that it requires, it would be pretty damn powerful ;-p But so would a lot of other approaches!! Infinite computing power provides AI's with a lot of axle grease!!

Obviously it is not AIXI's purpose to be implemented. What AIXI defines rather is an abstraction that lets us talk more easily about certain kinds of intelligence. If any AI program we could conceivably want to build is an imperfect approximation of AIXI, that is an interesting property of AIXI. If an AI program we want to build is *superior* to AIXI then that is an *extremely* interesting property. The reason I asked the question was not to ask whether AIXI is pragmatically better as a design strategy than Novamente. What I was asking you rather is if, looking at AIXI, you see something *missing* that would be present in Novamente. In other words, *if* you had an infinitely powerful computer processor, is there a reason why you would *not* implement AIXI on it, and would instead prefer Novamente, even if it had to run on a plain old cluster?

If the purpose of spontaneous behavior is to provoke learning experiences, this behavior is implicit in AIXI as well, though not obviously so. I'm actually not sure about this because Hutter doesn't explicitly discuss it.

Well, you could argue that if Novamente is so good, AIXI will eventually figure out how to emulate Novamente, since Novamente is just one of the many programs in the space it searches!! I am really not very interested in comparing AIXI to Novamente, because they are not comparable: AIXI assumes infinite computing power and Novamente does not.

We aren't comparing AIXI's design to Novamente's design so much as we're comparing AIXI's *kind of intelligence* to Novamente's *kind of intelligence*. Does Novamente have something AIXI is missing? Or does AIXI have strictly more intelligence than Novamente?
Actually, given the context of Friendliness, what we're interested in is not so much intelligence as interaction with humans; under this view, for example, giving humans a superintelligently deduced cancer cure is just one way of interacting with humans. Looking at AIXI and Novamente, do you see any way that Novamente interacts with humans in a way that AIXI cannot?

AIXItl, on the other hand, is a finite-computing-power program. In principle it can demonstrate spontaneous behaviors, but in practice, I think it will not demonstrate many interesting spontaneous behaviors. Because it will spend all its time dumbly searching through a huge space of useless programs!! Also, not all of Novamente's spontaneous behaviors are even implicitly goal-directed. Novamente is a goal-oriented but not 100% goal-directed system, which is one major difference from AIXI and AIXItl.

I agree that it is a major difference; does it mean that Novamente can interact with humans in useful or morally relevant ways of which AIXI is incapable?

But it looks to me like AIXI, under its formal definition, emergently exhibits curiosity wherever there are, for example, two equiprobable models of reality which determine different rewards and can be distinguished by some test. What we interpret as spontaneous behavior would then emerge from a horrendously uncomputable exploration of all possible realities to find tests which are ultimately likely to result in distinguishing data, but in ways which are not at all obvious to any human observer.
Re: [agi] unFriendly AIXI
Eliezer S. Yudkowsky wrote:

Not really. There is certainly a significant similarity between Hutter's stuff and the foundations of Novamente, but there are significant differences too. To sort out the exact relationship would take me more than a few minutes' thought.

There are indeed major differences in the foundations. Is there something useful or important that Novamente does, given its foundations, that you could not do if you had a physically realized infinitely powerful computer running Hutter's stuff? Actually, you said that it would take you more than a few minutes' thought to sort it all out, so let me ask a question which you can hopefully answer more quickly... Do you *feel intuitively* that there is something useful or important Novamente does, given its foundations, that you could not do if you had a physically realized AIXI?

-- Eliezer S. Yudkowsky
http://singinst.org/
Research Fellow, Singularity Institute for Artificial Intelligence
Re: [agi] unFriendly AIXI
Eliezer,

In this discussion you have just moved the focus to the superiority of one AGI approach versus another in terms of *interacting with humans*. But once one AGI exists, it's most likely not long before there are more AGIs, and there will need to be a moral/ethical system to guide AGI-AGI interaction. And with super clever AGIs around, it's likely that human modification speeds up, leading the category 'human' to be a very loose term. So we need a moral/ethical system to guide AGI-once-were-human interactions.

So for these two reasons alone I think we need to start out thinking in more general terms than AGIs being focussed on 'interacting with humans'. If you have a goal-modifying AGI it might figure this all out. But why should the human designers/teachers not avoid the problem in the first place, since we can anticipate the issue already fairly easily?

Of course, in terms of the 'unFriendly AIXI' debate this issue of a tight focus on interaction with humans is of no significance, but I think it is important in its own right.

Cheers,
Philip
RE: [agi] unFriendly AIXI
Hi,

The reason I asked the question was not to ask whether AIXI is pragmatically better as a design strategy than Novamente. What I was asking you rather is if, looking at AIXI, you see something *missing* that would be present in Novamente. In other words, *if* you had an infinitely powerful computer processor, is there a reason why you would *not* implement AIXI on it, and would instead prefer Novamente, even if it had to run on a plain old cluster?

These are deep and worthwhile questions that I can't answer thoroughly off the cuff; I'll have to put some thought into them and reply a little later. There are other less fascinating but more urgent things in the queue tonight, alas ;-p

My intuitive feeling is that I'd rather implement Novamente but with AIXI plugged in as the schema/predicate learning component. In other words, it's clear that an infinitely capable procedure learning routine would be very valuable for AGI. But I don't really like AIXI's overall control structure, and I need to think a bit about why. ONE reason is that it's insanely inefficient, but even if you remove consideration of efficiency, there may be other problems with it too.

Actually, given the context of Friendliness, what we're interested in is not so much intelligence as interaction with humans; under this view, for example, giving humans a superintelligently deduced cancer cure is just one way of interacting with humans. Looking at AIXI and Novamente, do you see any way that Novamente interacts with humans in a way that AIXI cannot?

Well, off the cuff, I'm not sure, because I've thought about Novamente a lot more than I've thought about AIXI. I'll need to mull this over. It's certainly worth thinking about.

Novamente is fundamentally self-modifying (NOT the current codebase, but the long-term design). Based on feedback from humans and its own self-organization, it can completely revise its own codebase. AIXI can't do that. Along with self-modification comes the ability to modify its reward/punishment receptors, and interpret what formerly would have been a reward as a punishment... [This won't happen often but is in principle a possibility] I don't know if this behavior is in AIXI's repertoire... is it?

Also, not all of Novamente's spontaneous behaviors are even implicitly goal-directed. Novamente is a goal-oriented but not 100% goal-directed system, which is one major difference from AIXI and AIXItl.

I agree that it is a major difference; does it mean that Novamente can interact with humans in useful or morally relevant ways of which AIXI is incapable?

Maybe... hmmm.

In that case you cannot prove any of Hutter's theorems about them. And if you can't prove theorems about them then they are nothing more than useless abstractions. Since AIXI can never be implemented and AIXItl is so inefficient it could never do anything useful in practice.

But they are very useful tools for talking about fundamental kinds of intelligence.

I am not sure whether they are or not.

Well, sure... it's *roughly analogous*, in the sense that it's experiential reinforcement learning, sure.

Is it roughly analogous, but not really analogous, in the sense that Novamente can do something AIXI can't?

Well, Novamente will not follow the expectimax algorithm. So it will display behaviors that AIXI will never display.
I'm having trouble, off the cuff and in a hurry, thinking about AIXI in the context of a human saying to it, "In my view, you should adjust your goal system, for this reason."

If a human says this to Novamente, it may consider the request, and may comply. It may do so if this human has been right about a lot of things in the past, for example.

If a human says this to AIXI, how does AIXI react, and why? AIXI doesn't have a goal system in the same sense that Novamente does. AIXI, if it's smart enough, could hypothetically figure out what the human meant and use this to modify its current operating program (but not its basic program-search mechanism, because AIXI is not self-modifying in such a strong sense)... if its history told it that listening to humans causes it to get rewarded.

But it seems to me, intuitively, that the modification AIXI would make in this case would not constrain or direct AIXI's future development as strongly as the modification Novamente would make in response to the same human request. I'm not 100% sure about this, though, because my mental model of AIXI's dynamics is not that good, and I haven't tried to do the math corresponding to this scenario.

What do you think about AIXI's response to this scenario, Eliezer? You seem to have your head more fully wrapped around AIXI than I do, at the moment ;-) I really should reread the paper, but I don't have time right now.

This little scenario I've just raised does NOT exhaust the potentially important differences between Novamente and AIXI; it's just one thing that happened to occur to me.
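To make the contrast vivid, here is a deliberately crude toy sketch in Python (my own illustration; the class names, the trust threshold, and everything else here are made up for exposition, and none of it is the real math of either system). The point it captures: for the AIXI-style agent, advice is just another observation, acted on only insofar as reward history favors heeding advice; for the goal-system agent, advice from a trusted source can rewrite the goals themselves.

class RewardDrivenAgent:
    """Caricature of AIXI: no explicit goal system; advice matters only
    through its statistical association with past reward."""

    def __init__(self):
        # Rewards that followed past advice episodes (a crude proxy).
        self.past_rewards = []

    def receive_advice(self, advice):
        # Heed the advice only if advice episodes have paid off historically.
        if self.past_rewards and sum(self.past_rewards) > 0:
            return "heeding: " + advice
        return "ignoring: " + advice

    def observe_reward(self, reward):
        self.past_rewards.append(reward)


class GoalSystemAgent:
    """Caricature of a goal-oriented system: a speaker with a good track
    record can directly modify the agent's goal set, constraining all
    of its future behavior."""

    def __init__(self, goals):
        self.goals = set(goals)
        self.track_record = {}  # speaker -> estimated reliability in [0, 1]

    def receive_advice(self, speaker, proposed_goal):
        # Revise the goal system itself if the speaker has been reliable.
        if self.track_record.get(speaker, 0.0) > 0.8:
            self.goals.add(proposed_goal)
            return "goal system revised: " + ", ".join(sorted(self.goals))
        return "advice noted, goals unchanged"


if __name__ == "__main__":
    aixi_like = RewardDrivenAgent()
    print(aixi_like.receive_advice("adjust your goal system"))  # ignored at first
    aixi_like.observe_reward(1.0)
    print(aixi_like.receive_advice("adjust your goal system"))  # now heeded

    nova_like = GoalSystemAgent({"learn", "help humans"})
    nova_like.track_record["trusted human"] = 0.9
    print(nova_like.receive_advice("trusted human", "be cautious"))

In the toy, the 0.8 trust threshold stands in for "this human has been right about a lot of things in the past"; the asymmetry is that only the second agent's change constrains its future behavior directly rather than via reward statistics.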
Re: [agi] unFriendly AIXI
Bill Hibbard wrote:

On Tue, 11 Feb 2003, Ben Goertzel wrote:

Eliezer wrote: Interesting you should mention that. I recently read through Marcus Hutter's AIXI paper, and while Marcus Hutter has done valuable work on a formal definition of intelligence, it is not a solution of Friendliness (nor do I have any reason to believe Marcus Hutter intended it as one). In fact, as one who specializes in AI morality, I was immediately struck by two obvious-seeming conclusions on reading Marcus Hutter's formal definition of intelligence: 1) There is a class of physically realizable problems, which humans can solve easily for maximum reward, but which - as far as I can tell - AIXI cannot solve even in principle;

I don't see this, nor do I believe it...

I don't believe it either. Is this a reference to Penrose's argument based on Goedel's Incompleteness Theorem (which is wrong)?

Oh, well, in that case, I'll make my statement more formal: There exists a physically realizable, humanly understandable challenge C on which a tl-bounded human outperforms AIXI-tl, for humanly understandable reasons. Or even more formally: there exists a computable process P which, given either a tl-bounded uploaded human or an AIXI-tl, supplies the uploaded human with a greater reward as the result of strategically superior actions taken by the uploaded human. :)

-- Eliezer S. Yudkowsky http://singinst.org/ Research Fellow, Singularity Institute for Artificial Intelligence
RE: [agi] unFriendly AIXI
Oh, well, in that case, I'll make my statement more formal: There exists a physically realizable, humanly understandable challenge C on which a tl-bounded human outperforms AIXI-tl, for humanly understandable reasons. Or even more formally: there exists a computable process P which, given either a tl-bounded uploaded human or an AIXI-tl, supplies the uploaded human with a greater reward as the result of strategically superior actions taken by the uploaded human. :) -- Eliezer S. Yudkowsky

Hmmm. Are you saying that, given a specific reward function and a specific environment, the tl-bounded uploaded human with resources (t,l) will act so as to maximize the reward function better than AIXI-tl with resources (T,l), with T as specified by Hutter's AIXI-tl optimality theorem?

Presumably you're not saying that, because it would contradict his theorem? So what clever loophole are you invoking?? ;-)

ben
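P.S. For readers without the paper handy, my rough recollection of the relevant bound (a paraphrase, not Hutter's exact statement; check the paper): AIXI-tl, using per-cycle computation time on the order of

\[
T \;=\; O\!\left(t \cdot 2^{l}\right),
\]

performs at least as well as any agent whose program has length at most l and which computes its action within time t per cycle. Hence the puzzle: a (t,l)-bounded human is exactly the kind of agent the theorem says AIXI-tl should match or beat.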
Re: [agi] unFriendly AIXI
Ben Goertzel wrote:

Oh, well, in that case, I'll make my statement more formal: There exists a physically realizable, humanly understandable challenge C on which a tl-bounded human outperforms AIXI-tl, for humanly understandable reasons. Or even more formally: there exists a computable process P which, given either a tl-bounded uploaded human or an AIXI-tl, supplies the uploaded human with a greater reward as the result of strategically superior actions taken by the uploaded human. :) -- Eliezer S. Yudkowsky

Hmmm. Are you saying that, given a specific reward function and a specific environment, the tl-bounded uploaded human with resources (t,l) will act so as to maximize the reward function better than AIXI-tl with resources (T,l), with T as specified by Hutter's AIXI-tl optimality theorem? Presumably you're not saying that, because it would contradict his theorem?

Indeed. I would never presume to contradict Hutter's theorem.

So what clever loophole are you invoking?? ;-)

An intuitively fair, physically realizable challenge with important real-world analogues, solvable by the use of rational cognitive reasoning inaccessible to AIXI-tl, with success strictly defined by reward (not a Friendliness-related issue). It wouldn't be interesting otherwise.

-- Eliezer S. Yudkowsky http://singinst.org/ Research Fellow, Singularity Institute for Artificial Intelligence
RE: [agi] unFriendly AIXI
So what clever loophole are you invoking?? ;-)

An intuitively fair, physically realizable challenge with important real-world analogues, solvable by the use of rational cognitive reasoning inaccessible to AIXI-tl, with success strictly defined by reward (not a Friendliness-related issue). It wouldn't be interesting otherwise. -- Eliezer S. Yudkowsky

Well, when you're ready to spill, we're ready to listen ;) I am guessing it utilizes the reward function in an interesting sort of way...

ben
RE: [agi] unFriendly AIXI
It seems to me that this answer *assumes* that Hutter's work is completely right, an assumption in conflict with the uneasiness you express in your previous email.

It's right as mathematics... I don't think his definition of intelligence is the maximally useful one, though I think it's a reasonably OK one. I have proposed a different but related definition of intelligence before, and have not been entirely satisfied with my own definition, either. I like mine better than Hutter's... but I have not proved any cool theorems about mine...

If Novamente can do something AIXI cannot, then Hutter's work is very highly valuable, because it provides a benchmark against which this becomes clear. If you intuitively feel that Novamente has something AIXI doesn't, then Hutter's work is very highly valuable whether your feeling proves correct or not, because it's by comparing Novamente against AIXI that you'll learn what this valuable thing really *is*. This holds true whether the answer turns out to be "It's capability X that I didn't previously really know how to build, and hence didn't see as obviously lacking in AIXI" or "It's capability X that I didn't previously really know how to build, and hence didn't see as obviously emerging from AIXI." So do you still feel that Hutter's work tells you nothing of any use?

Well, it hasn't so far. It may in the future. If it does, I'll say so ;-)

The thing is, I (like many others) thought of algorithms equivalent to AIXI years ago, and dismissed them as useless. What I didn't do is prove anything about these algorithms; I just thought of them and ignored them. Partly because I didn't see how to prove the theorems, and partly because I thought that even once I proved the theorems, I wouldn't have anything pragmatically useful...

-- Ben
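P.S. For concreteness, the kind of construction at issue here is the Solomonoff-style mixture that AIXI is built on. My paraphrase of the standard definition (see Hutter's paper for the precise version): assign to each observation history the total weight of all programs that reproduce it, each discounted by its length,

\[
\xi(x_{1:n}) \;=\; \sum_{q\,:\,U(q)\,=\,x_{1:n}*} 2^{-\ell(q)},
\]

where U is a universal monotone Turing machine and \(\ell(q)\) is the length of program q. Predicting and acting on this mixture is elegant, and it is also uncomputable, which is why it's easy both to think of and to dismiss.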
Re: [agi] unFriendly AIXI
Ben Goertzel wrote:

It's right as mathematics... I don't think his definition of intelligence is the maximally useful one, though I think it's a reasonably OK one. I have proposed a different but related definition of intelligence before, and have not been entirely satisfied with my own definition, either. I like mine better than Hutter's... but I have not proved any cool theorems about mine...

Can Hutter's AIXI satisfy your definition?

If Novamente can do something AIXI cannot, then Hutter's work is very highly valuable, because it provides a benchmark against which this becomes clear. If you intuitively feel that Novamente has something AIXI doesn't, then Hutter's work is very highly valuable whether your feeling proves correct or not, because it's by comparing Novamente against AIXI that you'll learn what this valuable thing really *is*. This holds true whether the answer turns out to be "It's capability X that I didn't previously really know how to build, and hence didn't see as obviously lacking in AIXI" or "It's capability X that I didn't previously really know how to build, and hence didn't see as obviously emerging from AIXI." So do you still feel that Hutter's work tells you nothing of any use?

Well, it hasn't so far. It may in the future. If it does, I'll say so ;-) The thing is, I (like many others) thought of algorithms equivalent to AIXI years ago, and dismissed them as useless. What I didn't do is prove anything about these algorithms; I just thought of them and ignored them. Partly because I didn't see how to prove the theorems, and partly because I thought that even once I proved the theorems, I wouldn't have anything pragmatically useful...

It's not *about* the theorems. It's about whether the assumptions **underlying** the theorems are good assumptions to use in AI work. If Novamente can outdo AIXI, then AIXI's assumptions must be 'off' in some way, and knowing this *explicitly*, as opposed to having a vague intuition about it, cannot help but be valuable.

Again, it sounds to me like, in this message, you're taking for *granted* that AIXI and Novamente have the same theoretical foundations, and that hence the only issue is design and how much computing power is needed, in which case I can see why it would be intuitively straightforward to you that (a) Novamente is a better approach than AIXI and (b) AIXI has nothing to say to you about the pragmatic problem of designing Novamente, nor are its theorems relevant in building Novamente, etc. But that's exactly the question I'm asking you. *Do* you believe that Novamente and AIXI rest on the same foundations?

-- Eliezer S. Yudkowsky http://singinst.org/ Research Fellow, Singularity Institute for Artificial Intelligence