Re: [agi] Recap/Summary/Thesis Statement
On 07/03/2008, Matt Mahoney [EMAIL PROTECTED] wrote: --- Mark Waser [EMAIL PROTECTED] wrote: Attractor Theory of Friendliness: There exists a describable, reachable, stable attractor in state space that is sufficiently Friendly to reduce the risks of AGI to acceptable levels. Proof: something will happen resulting in zero or more intelligent agents. Those agents will be Friendly to each other and themselves, because the action of killing agents without replacement is an irreversible dynamic, and therefore cannot be part of an attractor. Huh? Why can't an irreversible dynamic be part of an attractor? (Not that I need it to be.) An attractor is a set of states that are repeated given enough time. If agents are killed and not replaced, you can't return to the current state. False. There are certainly attractors that disappear; first seen by Ruelle and Takens (1971), it's called a blue-sky catastrophe: http://www.scholarpedia.org/article/Blue-sky_catastrophe --linas --- agi Archives: http://www.listbox.com/member/archive/303/=now RSS Feed: http://www.listbox.com/member/archive/rss/303/ Modify Your Subscription: http://www.listbox.com/member/?member_id=8660244id_secret=95818715-a78a9b Powered by Listbox: http://www.listbox.com
Re: [agi] Artificial general intelligence
On Tue, Mar 11, 2008 at 7:20 AM, Linas Vepstas [EMAIL PROTECTED] wrote: On 27/02/2008, a [EMAIL PROTECTED] wrote: This causes real controversy in this discussion list, which pressures me to build my own AGI. How about joining effort with one of the existing AGI projects? They are all hopeless, of course. That's what every AGI researcher will tell you... ;-) -- Vladimir Nesov [EMAIL PROTECTED]
Re: [agi] What should we do to be prepared?
On Tue, Mar 11, 2008 at 4:47 AM, Mark Waser [EMAIL PROTECTED] wrote: I can't prove a negative but if you were more familiar with Information Theory, you might get a better handle on why your approach is ludicrously expensive. Please reformulate what you mean by my approach independently then and sketch how you are going to use information theory... I feel that my point failed to be communicated. -- Vladimir Nesov [EMAIL PROTECTED]
Re: [agi] Recap/Summary/Thesis Statement
An attractor is a set of states that are repeated given enough time. If agents are killed and not replaced, you can't return to the current state. False. There are certainly attractors that disappear; first seen by Ruelle and Takens (1971), it's called a blue-sky catastrophe: http://www.scholarpedia.org/article/Blue-sky_catastrophe Relatedly, you should look at Mikhail Zak's work on terminal attractors, which occurred in the context of neural nets as I recall. These are attractors which a system zooms into for a while, then after a period of staying in them, it zooms out of them. They occur when the differential equation generating the dynamical system displaying the attractor involves functions with points of nondifferentiability. Of course, you may be specifically NOT looking for this kind of attractor in your Friendly AI theory ;-) -- Ben
Re: [agi] Goal Driven Systems and AI Dangers [WAS Re: Singularity Outcomes...]
On 3/3/08, Richard Loosemore [EMAIL PROTECTED] wrote: Kaj Sotala wrote: Alright. But previously, you said that Omohundro's paper, which to me seemed to be a general analysis of the behavior of *any* minds with (more or less) explicit goals, looked like it was based on a 'goal-stack' motivation system. (I believe this has also been the basis of your critique for e.g. some SIAI articles about friendliness.) If built-in goals *can* be constructed into motivational system AGIs, then why do you seem to assume that AGIs with built-in goals are goal-stack ones? I seem to have caused lots of confusion earlier on in the discussion, so let me backtrack and try to summarize the structure of my argument. 1) Conventional AI does not have a concept of a Motivational-Emotional System (MES), the way that I use that term, so when I criticised Omohundro's paper for referring only to a Goal Stack control system, I was really saying no more than that he was assuming that the AI was driven by the system that all conventional AIs are supposed to have. These two ways of controlling an AI are two radically different designs. [...] So now: does that clarify the specific question you asked above? Yes and no. :-) My main question is with part 1 of your argument - you are saying that Omohundro's paper assumed the AI to have a certain sort of control system. This is the part which confuses me, since I didn't see the paper make *any* mention of how the AI should be built. It only assumes that the AI has some sort of goals, and nothing more. I'll list all of the drives Omohundro mentions, and my interpretation of them and why they only require existing goals. Please correct me where our interpretations differ. 
(It is true that it will be possible to reduce the impact of many of these drives by constructing an architecture which restricts them, and as such they are not /unavoidable/ ones - however, it seems reasonable to assume that they will by default emerge in any AI with goals, unless specifically counteracted. Also, the more that they are restricted, the less effective the AI will be.) Drive 1: AIs will want to self-improve This one seems fairly straightforward: indeed, for humans self-improvement seems to be an essential part in achieving pretty much *any* goal you are not immediately capable of achieving. If you don't know how to do something needed to achieve your goal, you practice, and when you practice, you're improving yourself. Likewise, improving yourself will quickly become a subgoal for *any* major goals. Drive 2: AIs will want to be rational This is basically just a special case of drive #1: rational agents accomplish their goals better than irrational ones, and attempts at self-improvement can be outright harmful if you're irrational in the way that you try to improve yourself. If you're trying to modify yourself to better achieve your goals, then you need to make clear to yourself what your goals are. The most effective method for this is to model your goals as a utility function and then modify yourself to better carry out the goals thus specified. Drive 3: AIs will want to preserve their utility functions Since the utility function constructed was a model of the AI's goals, this drive is equivalent to saying AIs will want to preserve their goals (or at least the goals that are judged as the most important ones). The reasoning for this should be obvious - if a goal is removed from the AI's motivational system, the AI won't work to achieve the goal anymore, which is bad from the point of view of an AI that currently does want the goal to be achieved. 
Drive 4: AIs try to prevent counterfeit utility This is an extension of drive #2: if there are things in the environment that hijack existing motivation systems to make the AI do things not relevant for its goals, then it will attempt to modify its motivation systems to avoid those vulnerabilities. Drive 5: AIs will be self-protective This is a special case of #3. Drive 6: AIs will want to acquire resources and use them efficiently More resources will help in achieving most goals: also, even if you had already achieved all your goals, more resources would help you in making sure that your success wouldn't be thwarted as easily. -- http://www.saunalahti.fi/~tspro1/ | http://xuenay.livejournal.com/ Organizations worth your time: http://www.singinst.org/ | http://www.crnano.org/ | http://lifeboat.com/
Re: [agi] Recap/Summary/Thesis Statement
On 11/03/2008, Ben Goertzel [EMAIL PROTECTED] wrote: An attractor is a set of states that are repeated given enough time. If agents are killed and not replaced, you can't return to the current state. False. There are certainly attractors that disappear; first seen by Ruelle and Takens (1971), it's called a blue-sky catastrophe: http://www.scholarpedia.org/article/Blue-sky_catastrophe Relatedly, you should look at Mikhail Zak's work on terminal attractors, which occurred in the context of neural nets as I recall. These are attractors which a system zooms into for a while, then after a period of staying in them, it zooms out of them. That is how one would describe the classic and well-studied homoclinic orbit -- zoom in for a while, then zoom out. They occur when the differential equation generating the dynamical system displaying the attractor involves functions with points of nondifferentiability. Homoclinic orbits don't need non-differentiability; just saddle points, where the stable and unstable manifolds join at right angles. Even with differentiable systems there are a dozen types of attractors, bifurcations (attractors which split in two) and the like; only one is the attracting fixed point that seems to be what the original poster was thinking of when he posted. Of course, you may be specifically NOT looking for this kind of attractor, in your Friendly AI theory ;-) Remember that attractors are the language of low-dimensional chaos, where there are only 3 or 4 variables. In neural nets, you have hundreds or more (gasp!) neurons, and so you are well out of the area where low-dimensional chaos theory applies, and in a whole new regime (turbulence, in physics), which is pretty much not understood at all in any branch of science. Of course, we just paint artistic impressions on this list, so this is hardly science... 
--linas
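Linas's "zoom in for a while, then zoom out" picture of a saddle point can be checked with a tiny numerical sketch. This is purely illustrative (the linear system, initial condition, and step size are my own choices, not anything from the thread): a trajectory starting near the stable manifold of a linear saddle first approaches the fixed point, then escapes along the unstable manifold.

```python
# A linear saddle: dx/dt = -x (stable manifold along the x-axis),
# dy/dt = +y (unstable manifold along the y-axis).
# A trajectory starting near the stable manifold first approaches
# the origin, then escapes -- "zoom in, then zoom out".
def saddle_trajectory(x0=1.0, y0=1e-4, dt=0.01, steps=2000):
    pts = [(x0, y0)]
    for _ in range(steps):
        x, y = pts[-1]
        pts.append((x + dt * (-x), y + dt * y))
    return pts

dist = [(x * x + y * y) ** 0.5 for x, y in saddle_trajectory()]
closest = min(range(len(dist)), key=dist.__getitem__)
# Closest approach happens strictly in the interior of the run:
# the trajectory gets near the origin and then leaves again.
assert 0 < closest < len(dist) - 1
assert dist[closest] < dist[0] < dist[-1]
```

Note the contrast with an attracting fixed point: here the origin is not an attractor at all, yet a trajectory can spend a long time near it before being ejected.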
Re: [agi] Recap/Summary/Thesis Statement
As of now, we are aware of no non-human friendlies, so the set of excluded beings will in all likelihood be the empty set. Eliezer's current vision of Friendliness puts AGIs (who are non-human friendlies) in the role of excluded beings. That is why I keep hammering this point. To answer your question, I don't see the people are evil and will screw it all up scenario as being even remotely likely, for reasons of self-interest among others. And I think it very likely that if it turns out that including non-human friendlies is the right thing to do, that the system will do as designed and renormalize accordingly. People are *currently* screwing it all up in the sense that our society is *seriously* sub-optimal and far, FAR less than it could be. Will we screw it up to the point of self-destruction? It's too early to tell. The Cuban Missile Crisis was an awfully near miss. Grey Goo would be *really* bad (though I think it is a bit further off than most people on this list believe). It's scary to even consider what I *know* I could do if I were a whack-job terrorist with my knowledge. The only reason why I am as optimistic as I am currently is because I truly do believe that Friendliness is an attractor that we are solidly on the approach path to, and I hope that I can speed the process by pointing that fact out. As for the other option, my question was not about the dangers relating to *who is or is not protected*, but rather *whose volition is taken into account* in calculating the CEV, since your approach considers only the volition of friendly humanity (and non-human friendlies but not non-friendly humanity), while Eliezer's includes all of humanity. Actually, I *will* be showing that basically Friendly behavior *IS* extended to everyone except in so far as non-Friendlies insist upon being non-Friendly. I just didn't see a way to successfully introduce that idea early *AND* forestall Vladimir's obvious so why don't I just kill them all argument. 
I need to figure out a better way to express that earlier.
[agi] Re: Your mail to [EMAIL PROTECTED]
Ben, Can we boot alien off the list? I'm getting awfully tired of his auto-reply emailing me directly *every* time I post. It is my contention that this is UnFriendly behavior (wasting my resources without furthering any true goal of his) and should not be accepted. Mark - Original Message - From: [EMAIL PROTECTED] To: [EMAIL PROTECTED] Sent: Tuesday, March 11, 2008 11:56 AM Subject: Re: Your mail to [EMAIL PROTECTED] Thank you for contacting Alienshift. We will respond to your Mail in due time. Please feel free to send positive thoughts in return back to the Universe. [EMAIL PROTECTED]
Re: [agi] Some thoughts of an AGI designer
On 10/03/2008, Mark Waser [EMAIL PROTECTED] wrote: Do you think that any of this contradicts what I've written thus far? I don't immediately see any contradictions. The discussions seem to entirely ignore the role of socialization in human and animal friendliness. We are a large collection of autonomous agents that are well-matched in skills and abilities. If we were unfriendly to one another, we might survive as a species, but we would not live in cities and possess hi-tech. We also know from the animal kingdom, as well as from the political/economic sphere, what happens when abilities are mismatched. Lions eat gazelles, and business tycoons eat the working class. We've evolved political systems to curb the worst abuses of feudalism and serfdom, but have not yet achieved nirvana. As parents, we apply social pressure to our children, to make them friendly. Even then, some grow up unfriendly, and for them, we have the police. Unless they achieve positions of power first (Hitler, Stalin, Mao). I don't see how a single AGI could be bound by the social pressures that we are bound by. There won't be a collection of roughly-equal AGIs keeping each other in check, not if they are self-improving. Self-preservation is rational, and so is paranoia; it's reasonable to assume that AGIs will race to self-improve merely for the benefit of self-preservation, so that they have enough power that others can't hurt them. Our hope is that AGI will conclude that humans are harmless and worthy of study and preservation; this is what will make them friendly to *us*... until one day we look like mosquitoes or microbes to them. --linas
Re: [agi] Re: Your mail to [EMAIL PROTECTED]
I tried to fix the problem, let me know if it worked... ben On Tue, Mar 11, 2008 at 12:02 PM, Mark Waser [EMAIL PROTECTED] wrote: Ben, Can we boot alien off the list? I'm getting awfully tired of his auto-reply emailing me directly *every* time I post. It is my contention that this is UnFriendly behavior (wasting my resources without furthering any true goal of his) and should not be accepted. Mark - Original Message - From: [EMAIL PROTECTED] To: [EMAIL PROTECTED] Sent: Tuesday, March 11, 2008 11:56 AM Subject: Re: Your mail to [EMAIL PROTECTED] Thank you for contacting Alienshift. We will respond to your Mail in due time. Please feel free to send positive thoughts in return back to the Universe. [EMAIL PROTECTED] -- Ben Goertzel, PhD CEO, Novamente LLC and Biomind LLC Director of Research, SIAI [EMAIL PROTECTED] If men cease to believe that they will one day become gods then they will surely become worms. -- Henry Miller
Re: [agi] Goal Driven Systems and AI Dangers [WAS Re: Singularity Outcomes...]
Ahah! :-) Upon reading Kaj's excellent reply, I spotted something that I missed before that grated on Richard (and he even referred to it though I didn't realize it at the time) . . . . The Omohundro drives #3 and #4 need to be rephrased from Drive 3: AIs will want to preserve their utility functions Drive 4: AIs try to prevent counterfeit utility to Drive 3: AIs will want to preserve their goals Drive 4: AIs will want to prevent fake feedback on the status of their goals The current phrasing *DOES* seem to strongly suggest a goal-stack type architecture since, although I argued that a MES system has an implicit utility function that it just doesn't refer to, it makes no sense that it is trying to preserve and prevent counterfeits of something that it ignores. Sorry for missing/overlooking this before, Richard :-) (And this is why I'm running all this past the mailing list before believing that my paper is anywhere close to final :-) - Original Message - From: Kaj Sotala [EMAIL PROTECTED] To: agi@v2.listbox.com Sent: Tuesday, March 11, 2008 10:07 AM Subject: Re: [agi] Goal Driven Systems and AI Dangers [WAS Re: Singularity Outcomes...] On 3/3/08, Richard Loosemore [EMAIL PROTECTED] wrote: Kaj Sotala wrote: Alright. But previously, you said that Omohundro's paper, which to me seemed to be a general analysis of the behavior of *any* minds with (more or less) explicit goals, looked like it was based on a 'goal-stack' motivation system. (I believe this has also been the basis of your critique for e.g. some SIAI articles about friendliness.) If built-in goals *can* be constructed into motivational system AGIs, then why do you seem to assume that AGIs with built-in goals are goal-stack ones? I seem to have caused lots of confusion earlier on in the discussion, so let me backtrack and try to summarize the structure of my argument. 
[...]
Re: [agi] Some thoughts of an AGI designer
The discussions seem to entirely ignore the role of socialization in human and animal friendliness. We are a large collection of autonomous agents that are well-matched in skills and abilities. If we were unfriendly to one another, we might survive as a species, but we would not live in cities and possess hi-tech. You are correct. The discussions are ignoring the role of socialization. We also know from the animal kingdom, as well as from the political/economic sphere, what happens when abilities are mismatched. Lions eat gazelles, and business tycoons eat the working class. We've evolved political systems to curb the worst abuses of feudalism and serfdom, but have not yet achieved nirvana. Because we do *not* have a common definition of goals and socially acceptable behavior. Political systems have not achieved nirvana because they do not agree on what nirvana looks like. *THAT* is the purpose of this entire thread. As parents, we apply social pressure to our children, to make them friendly. Even then, some grow up unfriendly, and for them, we have the police. Unless they achieve positions of power first (Hitler, Stalin, Mao). OK. I don't see how a single AGI could be bound by the social pressures that we are bound by. There won't be a collection of roughly-equal AGIs keeping each other in check, not if they are self-improving. Self-preservation is rational, and so is paranoia; it's reasonable to assume that AGIs will race to self-improve merely for the benefit of self-preservation, so that they have enough power that others can't hurt them. Our hope is that AGI will conclude that humans are harmless and worthy of study and preservation; this is what will make them friendly to *us*... until one day we look like mosquitoes or microbes to them. 
--linas
Re: [agi] Some thoughts of an AGI designer
Pesky premature e-mail problem . . . The discussions seem to entirely ignore the role of socialization in human and animal friendliness. We are a large collection of autonomous agents that are well-matched in skills and abilities. If we were unfriendly to one another, we might survive as a species, but we would not live in cities and possess hi-tech. You are correct. The discussions are ignoring the role of socialization. We also know from the animal kingdom, as well as from the political/economic sphere, what happens when abilities are mismatched. Lions eat gazelles, and business tycoons eat the working class. We've evolved political systems to curb the worst abuses of feudalism and serfdom, but have not yet achieved nirvana. Because we do *not* have a common definition of goals and socially acceptable behavior. Political systems have not achieved nirvana because they do not agree on what nirvana looks like. *THAT* is the purpose of this entire thread. As parents, we apply social pressure to our children, to make them friendly. Even then, some grow up unfriendly, and for them, we have the police. Unless they achieve positions of power first (Hitler, Stalin, Mao). OK. But I'm actually not attempting to use social pressure (or use it solely). I seem to have gotten somewhat shunted down that track by Vladimir since a Friendly society is intelligent enough to use social pressure when applicable but it is not the primary (or necessary) thrust of my argument. I don't see how a single AGI could be bound by the social pressures that we are bound by. There won't be a collection of roughly-equal AGIs keeping each other in check, not if they are self-improving. Self-preservation is rational, and so is paranoia; it's reasonable to assume that AGIs will race to self-improve merely for the benefit of self-preservation, so that they have enough power that others can't hurt them. Again, social pressure is not my primary argument. 
It just made for an easy, convenient, correct-but-not-complete argument for Vladimir (and now I'm regretting it :-). Our hope is that AGI will conclude that humans are harmless and worthy of study and preservation; this is what will make them friendly to *us*... until one day we look like mosquitoes or microbes to them. No, our hope is that the AGI will conclude that anything with enough intelligence/goal-success is more an asset than a liability and that wiping us out without good cause has negative utility.
Re: [agi] Artificial general intelligence
Vladimir Nesov wrote: On Tue, Mar 11, 2008 at 7:20 AM, Linas Vepstas [EMAIL PROTECTED] wrote: On 27/02/2008, a [EMAIL PROTECTED] wrote: This causes real controversy in this discussion list, which pressures me to build my own AGI. How about joining effort with one of the existing AGI projects? They are all hopeless, of course. That's what every AGI researcher will tell you... ;-) Oh no: what every AGI researcher will tell you is that every project is hopeless EXCEPT one. ;-) Richard Loosemore
Re: [agi] Goal Driven Systems and AI Dangers [WAS Re: Singularity Outcomes...]
Drive 1: AIs will want to self-improve This one seems fairly straightforward: indeed, for humans self-improvement seems to be an essential part in achieving pretty much *any* goal you are not immediately capable of achieving. If you don't know how to do something needed to achieve your goal, you practice, and when you practice, you're improving yourself. Likewise, improving yourself will quickly become a subgoal for *any* major goals. But now I ask: what exactly does this mean? It means that they will want to improve their ability to achieve their goals (i.e. in an MES system, optimize their actions/reactions to more closely correspond to what is indicated/appropriate for their urges and constraints). In the context of a Goal Stack system, this would be represented by a top level goal that was stated in the knowledge representation language of the AGI, so it would say Improve Thyself. One of the shortcomings of your current specification of the MES system is that it does not, at the simplest levels, provide a mechanism for globally optimizing (increasing the efficiency of) the system. This makes it safer because such a mechanism *would* conceivably be a single point of failure for Friendliness but evolution will favor the addition of any such system -- as would any humans that would like a system to improve itself. I don't currently see how an MES system could be a seed AGI unless such a system is added. My point here is that a Goal Stack system would *interpret* this goal in any one of an infinite number of ways, because the goal was represented as an explicit statement. The fact that it was represented explicitly meant that an extremely vague concept (Improve Thyself) had to be encoded in such a way as to leave it open to ambiguity. As a result, what the AGI actually does as a result of this goal, which is embedded in a Goal Stack architecture, is completely indeterminate. Oh. I disagree *entirely*. 
It is only indeterminate because you gave it an indeterminate goal with *no* evaluation criteria. Now, I *assume* that you ACTUALLY mean Improve Thyself So That You Are More Capable Of Achieving An Arbitrary Set Of Goals To Be Specified Later and I would argue that the most effective way for the system to do so is to increase its intelligence (the single-player version of goal-achieving ability) and friendliness (the multi-player version of intelligence). Stepping back from the detail, we can notice that *any* vaguely worded goal is going to have the same problem in a GS architecture. But I've given a more explicitly worded goal that *should* (I believe) drive a system to intelligence. The long version of Improve Thyself is the necessary motivating force for a seed AI. Do you have a way to add it to an MES system? If you can't, then I would have to argue that an MES system will never achieve intelligence (though I'm very hopeful that either we can add it to the MES *or* there is some form of hybrid system that has the advantages of both and the disadvantages of neither). So long as the goals that are fed into a GS architecture are very, very local and specific (like Put the red pyramid on top of the green block) I can believe that the GS drive system does actually work (kind of). But no one has ever built an AGI that way. Never. Everyone assumes that a GS will scale up to a vague goal like Improve Thyself, and yet no one has tried this in practice. Not on a system that is supposed to be capable of a broad-based, autonomous, *general* intelligence. Well, actually I'm claiming that *any* optimizing system with the long version of Improve Thyself that is sufficiently capable is a seed AI. The problem is that sufficiently capable seems to be a relatively high bar -- particularly when we, as humans, don't even know which way is up. My Friendliness theory is (at least) an attempt to identify up. 
So when you paraphrase Omohundro as saying that AIs will want to self-improve, the meaning of that statement is impossible to judge.

As evidenced by my last several e-mails, the best paraphrase of Omohundro is "Goal-achievement-optimizing AIs will want to self-improve so that they are more capable of achieving goals," which is basically a definition or a tautology.

The reason that I say Omohundro is assuming a Goal Stack system is that I believe he would argue that that is what he meant, and that he assumed that a GS architecture would allow the AI to exhibit behavior that corresponds to what we, as humans, recognize as wanting to self-improve. I think it is a hidden assumption in what he wrote.

Optimizing *is* a hidden assumption in what he wrote, which you caused me to catch later and add to my base assumption. I don't believe that optimizing necessarily assumes a Goal Stack system, but it *DOES* assume a self-reflecting system, which the MES system does not appear to be (yet) at the lowest levels. In order
[agi] NewScientist piece on AGI-08
Many of us there met Celeste Biever, the NS correspondent. Her piece is now up: http://technology.newscientist.com/channel/tech/dn13446-virtual-child-passes-mental-milestone-.html

Josh
Re: [agi] Recap/Summary/Thesis Statement
--- Linas Vepstas [EMAIL PROTECTED] wrote: There are certainly attractors that disappear, first seen by Ruelle and Takens (1971); it's called a blue-sky catastrophe: http://www.scholarpedia.org/article/Blue-sky_catastrophe

Also, the simple point attractor x = 0 in the dynamical system dx/dt = -x, with solution x(t) = exp(-t), never repeats. But dynamical systems with real-valued states are just approximations of a discrete universe, and a discrete system must repeat.

I have a different objection to Mark's proposal, though: the only attractor in an evolutionary system is a dead planet. Evolution is not a stable system; it is on the boundary between stability and chaos. Evolution is punctuated by mass extinctions as well as smaller disasters, plagues, and population explosions. Right now I believe we are in the midst of a mass extinction larger than the Permian extinction. There are two reasons why I think we are still alive today: the anthropic principle, and a range of environments wide enough that no species can inhabit all of them (until now).

Omohundro's goals are stable in an evolutionary system (as long as that system persists) because they improve fitness.
In Mark's proposal, Friendliness is a subgoal of fitness because (if I understand correctly) agents that cooperate with each other are fitter as a group than agents that fight among themselves. So an outcome where the Earth is turned into a Dyson sphere of gray goo would be Friendly in the sense that the biggest army of nanobots kills off all their unFriendly competition (including all DNA-based life) while cooperating with each other.

This is not the risk that concerns me. The real risk is that a single, fully cooperating system has no evolutionary drive for self-improvement. Having one world government with perfect harmony among its population is a bad idea because there is no recourse when it makes a bad collective decision. In particular, there is no evolutionary pressure to maintain a goal of self-preservation. You need competition between countries, but unfortunately this means endless war.

-- Matt Mahoney, [EMAIL PROTECTED]
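Matt's point-attractor example (dx/dt = -x, solution x(t) = exp(-t)) can be checked numerically. The sketch below is illustrative only: the `euler` helper is invented here, and forward Euler is just the simplest integrator that shows the state decaying toward the attractor x = 0 without ever revisiting a previous state.

```python
# Numerical sketch of the point attractor x = 0 for dx/dt = -x.
# The "euler" helper is a hypothetical name, not from any library.
import math

def euler(x0, dt=0.01, steps=1000):
    """Integrate dx/dt = -x with forward Euler, checking no state repeats."""
    x = x0
    seen = set()
    for _ in range(steps):
        x += dt * (-x)          # Euler step: x_{n+1} = x_n * (1 - dt)
        assert x not in seen    # states strictly shrink in magnitude
        seen.add(x)
    return x

final = euler(1.0)
exact = math.exp(-0.01 * 1000)  # closed-form solution x(t) = exp(-t) at t = 10
print(final, exact)             # both are close to exp(-10)
```

Every iterate is a new state (the orbit never repeats), yet the trajectory converges to the attractor; Matt's observation is that only real-valued state spaces allow this, since a finite discrete system must eventually revisit a state.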
Re: [agi] Recap/Summary/Thesis Statement
This is not the risk that concerns me. The real risk is that a single, fully cooperating system has no evolutionary drive for self improvement.

So we provide an artificial evolutionary drive for the components of society via a simple economy . . . . as has been suggested numerous times by Baum and others.

Really Matt, all your problems seem to be due to a serious lack of imagination rather than pointing out actual contradictions or flaws.
Re: [agi] Some thoughts of an AGI designer
Mark Waser wrote: If the motives depend on satisficing, and the quest for unlimited fulfillment is avoided, then this limits the danger. The universe won't be converted into toothpicks if part of setting the goal "toothpicks!" is limiting the quantity of toothpicks. (Limiting it reasonably might almost be a definition of friendliness ... or at least neutral behavior.)

You have a good point. Goals should be fulfilled after satisficing, except when the goals are of the form "as X as possible" (hereafter referred to as unbounded goals). Unbounded-goal entities *are* particularly dangerous (although being aware of the danger should mitigate it to some degree). My Friendliness basically works by limiting the amount of interference with others' goals (under the theory that doing so will prevent others from interfering with your goals). Stupid entities that can't see the self-interest in the parenthetical point are not inclined to be Friendly. Stupid unbounded-goal entities are Eliezer's paperclip-producing nightmare.

And, though I'm not clear on how this should be set up, this limitation should be a built-in primitive, i.e. not something subject to removal, but only to strengthening or weakening via learning. It should ante-date the recognition of visual images. But it needs to have a slightly stronger residual limitation than it does with people. Or perhaps its initial appearance needs to be during the formation of the statement of the problem. I.e., a solution to a problem can't be sought without knowing limits. People seem to just manage that via a dynamic sensing approach, and that sometimes suffers from inadequate feedback mechanisms (saying "Enough!").

The limitation is "Don't stomp on other people's goals unless it is truly necessary" *and* "It is very rarely truly necessary."
(It's not clear to me that it differs from what you are saying, but it does seem to address a part of what you were addressing, and I wasn't really clear about how you intended the satisfaction of goals to be limited.)

As far as my theory/vision goes, I was pretty much counting on the fact that we are multi-goal systems and that our other goals will generally limit any single goal from getting out of hand. Further, if that doesn't do it, the proclamation of not stepping on others' goals unless absolutely necessary should help handle the problem . . . . but . . . . actually you do have a very good point. My theory/vision *does* have a vulnerability toward single-unbounded-goal entities, in that my Friendly attractor has no benefit for such a system (unless, of course, its goal is Friendliness or it is forced to have a secondary goal of Friendliness).

The trouble with not stepping on others' goals unless absolutely necessary is that it relies on mind-reading. The goals of others are often opaque and not easily verbalizable, even if they think to try. Then there's the question of "unless absolutely necessary." How and why should I decide that their goals are more important than mine? So one needs to know not only how important their goals are to them, but also how important my conflicting goals are to me. And, of course, whether there's a means for mutual satisfaction that isn't too expensive. (And just try to define *that*, too.)

For some reason I'm reminded of the story about the peasant, his son, and the donkey carrying a load of sponges. I'd just as soon nobody ends up in the creek. (Please all, please none.)
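The bounded-versus-unbounded-goal distinction in this exchange can be sketched as a toy simulation. The `make_toothpicks` function and its resource model are invented for illustration; the point is only that a satisficing goal leaves resources for other goals, while an "as many as possible" goal consumes everything, which is the paperclip-style failure mode discussed above.

```python
# Toy contrast: a bounded (satisficing) goal vs. an unbounded one.
# "make_toothpicks" and the one-unit-per-toothpick model are hypothetical.

def make_toothpicks(resources, target=None):
    """Consume one resource unit per toothpick; stop at target if bounded."""
    made = 0
    while resources > 0 and (target is None or made < target):
        resources -= 1
        made += 1
    return made, resources

# Bounded goal: satisfice at 100 toothpicks, leaving resources for other goals.
made, left = make_toothpicks(resources=10_000, target=100)
print(made, left)   # 100 toothpicks made, 9900 units untouched

# Unbounded goal ("as many toothpicks as possible"): consumes everything.
made, left = make_toothpicks(resources=10_000)
print(made, left)   # 10000 toothpicks made, 0 units left
```

The `target` parameter plays the role of the built-in limitation discussed above: it is the difference between a goal that terminates on "enough" and one that only terminates when the resource pool (here, the universe) is exhausted.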
Re: [agi] A Few Questions for Vladimir, the Destroyer
On Wed, Mar 12, 2008 at 4:55 AM, Mark Waser [EMAIL PROTECTED] wrote: They tell you that you need to join the Friendly community for their safety *and* your own self-interest.

The problem is that it doesn't work this way. Maybe they are crazy; they can't just tell that it's not so. You can't even know that they are less capable. In real life you don't have a Hit Points display hovering over your head. You need an actual verification of their makeup, which I can't see how can be done without first taking them apart and then rebuilding them all over again. Verification of absence of physical threat actually works if they are uploaded to your hardware. Or if they are physically destroyed.

-- Vladimir Nesov [EMAIL PROTECTED]