Re: [agi] What should we do to be prepared?
On Thursday 06 March 2008 08:45:00 pm, Vladimir Nesov wrote: On Fri, Mar 7, 2008 at 3:27 AM, J Storrs Hall, PhD [EMAIL PROTECTED] wrote: The scenario takes on an entirely different tone if you replace "weed out some wild carrots" with "kill all the old people who are economically inefficient". In particular, the former is something one can easily imagine people doing without a second thought, while the latter is likely to generate considerable opposition in society. Sufficient enforcement is in place for this case: people steer governments in a direction where laws won't allow that when they age, and evolutionary and memetic drives oppose it. It's too costly to overcome these drives and destroy counterproductive humans. But this cost is independent of the potential gain from replacement. As the gain increases, the decision can change; again, we only need sufficiently good 'cultivated humans'. Consider expensive medical treatments which most countries won't give away when dying people can't afford them. Life has a cost, and this cost can be met. Suppose that productivity amongst AIs is such that the entire economy takes on a Moore's Law growth curve. (For simplicity, say a doubling each year.) At the end of the first decade, the tax rate on AIs would have to be only 0.1% to give the humans, free, everything we now produce with all our effort. And the tax rate would go DOWN by a factor of two each year. I don't see the AIs really worrying about it. Alternatively, since humans already own everything, and will indeed own the AIs originally, we could simply cash out and invest, and the income from the current value of the world would easily produce an income equal to our needs in an AI economy. It might be a good idea to legally entail the human trust fund... So how would you design a super-intelligence: (a) a single giant blob modelled on an individual human mind, or (b) a society (complete with culture) with lots of human-level minds and high-speed communication?
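[Editor's sketch, not part of the original thread.] The tax-rate arithmetic above can be checked in a few lines, assuming (as the post does) an AI economy that doubles each year while the human payout stays fixed at today's entire output, both normalized to 1.0:

```python
# Assumptions: AI economy doubles yearly (Moore's-Law-style growth);
# humans receive a fixed payout equal to today's entire output.
human_needs = 1.0   # today's total output, normalized
economy = 1.0       # AI economy, normalized to today's output

rates = []
for year in range(1, 11):              # the first decade
    economy *= 2.0                     # one doubling per year
    rates.append(human_needs / economy)

# After ten doublings the economy is 1024x today's, so the required
# tax rate is 1/1024 -- roughly 0.1%, halving every year thereafter.
print(f"year 10 tax rate: {rates[-1]:.4%}")  # year 10 tax rate: 0.0977%
```

Each year's rate is exactly half the previous year's, matching the "DOWN by a factor of two each year" claim.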
This is a technical question with no good answer; why is it relevant? The discussion forked at the point of whether an AI would be like a single supermind or more like a society of humans... we seem to agree that it doesn't make much difference to the point at issue. On the other hand, the technical issue is interesting in itself, perhaps more so than the rest of the discussion :-) Josh --- agi Archives: http://www.listbox.com/member/archive/303/=now RSS Feed: http://www.listbox.com/member/archive/rss/303/ Modify Your Subscription: http://www.listbox.com/member/?member_id=8660244id_secret=95818715-a78a9b Powered by Listbox: http://www.listbox.com
[agi] Recap/Summary/Thesis Statement
Attractor Theory of Friendliness There exists a describable, reachable, stable attractor in state space that is sufficiently Friendly to reduce the risks of AGI to acceptable levels
Re: [agi] What should we do to be prepared?
Whether humans conspire to weed out wild carrots impacts whether humans are classified as Friendly (or, it would if the wild carrots were sentient). Why does it matter what word we/they assign to this situation? My vision of Friendliness places many more constraints on the behavior towards other Friendly entities than it does on the behavior towards non-Friendly entities. If we are classified as Friendly, there are many more constraints on the behavior that they will adopt towards us. Or, to make it clearer, substitute the words Enemy and Friend for Unfriendly and Friendly. If you are a Friend, the Friendly AI is nice to you. If you are not a Friend, the AI has a lot fewer constraints on how it deals with you. It is in the future AGI overlords' enlightened self-interest to be Friendly -- so I'm going to assume that they will be. It doesn't follow. If you think it's clearly the case, explain the decision process that leads to choosing 'friendliness'. So far it is self-referential: if the dominant structure always adopts the same friendliness when its predecessor was friendly, then it will be safe when taken over. But if the dominant structure turns unfriendly, it can clear the ground and redefine friendliness in its own image. Where does that leave you? You are conflating two arguments here but both are crucial to my thesis. The decision process that leads to Friendliness is *exactly* what we are going through here. We have a desired result (or more accurately, we have conditions that we desperately want to avoid). We are searching for ways to make it happen. I am proposing one way that is (I believe) sufficient to make it happen. I am open to other suggestions but none are currently on the table (that I believe are feasible). What is different in my theory is that it handles the case where the dominant structure turns unfriendly.
The core of my thesis is that the particular Friendliness that I/we are trying to reach is an attractor -- which means that if the dominant structure starts to turn unfriendly, it is actually a self-correcting situation.
Re: [agi] What should we do to be prepared?
How do you propose to make humans Friendly? I assume this would also have the effect of ending war, crime, etc. I don't have such a proposal but an obvious first step is defining/describing Friendliness and why it might be a good idea for us. Hopefully then, the attractor takes over. (Actually, I guess that is a proposal, isn't it?:-) I know you have made exceptions to the rule that intelligences can't be reprogrammed against their will, but what if AGI is developed before the technology to reprogram brains, so you don't have this option? Or should AGI be delayed until we do? Is it even possible to reliably reprogram brains without AGI? Um. Why are we reprogramming brains? That doesn't seem necessary or even generally beneficial (unless you're only talking about self-programming).
Re: [agi] Recap/Summary/Thesis Statement
--- Mark Waser [EMAIL PROTECTED] wrote: Attractor Theory of Friendliness There exists a describable, reachable, stable attractor in state space that is sufficiently Friendly to reduce the risks of AGI to acceptable levels Proof: something will happen resulting in zero or more intelligent agents. Those agents will be Friendly to each other and themselves, because the action of killing agents without replacement is an irreversible dynamic, and therefore cannot be part of an attractor. Corollary: Killing with replacement is Friendly. Corollary: Friendliness does not guarantee survival of DNA based life. -- Matt Mahoney, [EMAIL PROTECTED]
Re: [agi] What should we do to be prepared?
--- Mark Waser [EMAIL PROTECTED] wrote: How do you propose to make humans Friendly? I assume this would also have the effect of ending war, crime, etc. I don't have such a proposal but an obvious first step is defining/describing Friendliness and why it might be a good idea for us. Hopefully then, the attractor takes over. (Actually, I guess that is a proposal, isn't it?:-) I know you have made exceptions to the rule that intelligences can't be reprogrammed against their will, but what if AGI is developed before the technology to reprogram brains, so you don't have this option? Or should AGI be delayed until we do? Is it even possible to reliably reprogram brains without AGI? Um. Why are we reprogramming brains? That doesn't seem necessary or even generally beneficial (unless you're only talking about self-programming). As a way to make people behave. A lot of stuff has been written on why war and crime are bad ideas, but so far it hasn't worked. -- Matt Mahoney, [EMAIL PROTECTED]
Re: [agi] Recap/Summary/Thesis Statement
Attractor Theory of Friendliness There exists a describable, reachable, stable attractor in state space that is sufficiently Friendly to reduce the risks of AGI to acceptable levels Proof: something will happen resulting in zero or more intelligent agents. Those agents will be Friendly to each other and themselves, because the action of killing agents without replacement is an irreversible dynamic, and therefore cannot be part of an attractor. Huh? Why can't an irreversible dynamic be part of an attractor? (Not that I need it to be) Corollary: Killing with replacement is Friendly. Bad Logic. Not X (replacement) leads to not Y (Friendly) does NOT have the corollary X (replacement) leads to Y (Friendliness). And I do NOT agree that killing with replacement is Friendly. Corollary: Friendliness does not guarantee survival of DNA based life. Both not a corollary and entirely irrelevant to my points (and, in fact, in direct agreement with my statement "I'm afraid that my vision of Friendliness certainly does permit the intentional destruction of the human race if that is the *only* way to preserve a hundred more intelligent, more advanced, more populous races. On the other hand, given the circumstance space that we are likely to occupy with a huge certainty, the intentional destruction of the human race is most certainly ruled out. Or, in other words, there are no infinite guarantees but we can reduce the dangers to infinitesimally small levels."). My thesis statement explicitly says acceptable levels, not guarantee. = = = = = What is your point with this e-mail? It appears to be a total non sequitur (as well as being incorrect).
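[Editor's sketch, not part of the original thread.] The "Bad Logic" objection can be checked mechanically with a small truth table: from the premise "killing without replacement is not Friendly" one can derive only the contrapositive "Friendly implies replacement", never the claimed corollary "replacement implies Friendly":

```python
from itertools import product

def implies(p, q):
    # Material implication: p -> q
    return (not p) or q

# r = "the killing is with replacement", f = "the act is Friendly".
# Premise from the thread: (not r) -> (not f).
models = [(r, f) for r, f in product([False, True], repeat=2)
          if implies(not r, not f)]

# The contrapositive "f -> r" holds in every model of the premise...
contrapositive_ok = all(implies(f, r) for r, f in models)
# ...but the claimed corollary "r -> f" fails: the assignment
# r=True, f=False satisfies the premise and refutes the corollary.
corollary_ok = all(implies(r, f) for r, f in models)

print(contrapositive_ok, corollary_ok)  # True False
```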
[agi] Causality challenge
Are any of the AI folks here competing in this challenge? http://www.causality.inf.ethz.ch/challenge.php Eric B. Ramsay
Re: [agi] What should we do to be prepared?
Matt Mahoney wrote: --- Mark Waser [EMAIL PROTECTED] wrote: How do you propose to make humans Friendly? I assume this would also have the effect of ending war, crime, etc. I don't have such a proposal but an obvious first step is defining/describing Friendliness and why it might be a good idea for us. Hopefully then, the attractor takes over. (Actually, I guess that is a proposal, isn't it?:-) I know you have made exceptions to the rule that intelligences can't be reprogrammed against their will, but what if AGI is developed before the technology to reprogram brains, so you don't have this option? Or should AGI be delayed until we do? Is it even possible to reliably reprogram brains without AGI? Um. Why are we reprogramming brains? That doesn't seem necessary or even generally beneficial (unless you're only talking about self-programming). As a way to make people behave. A lot of stuff has been written on why war and crime are bad ideas, but so far it hasn't worked. -- Matt Mahoney, [EMAIL PROTECTED] Reprogramming humans doesn't appear to be an option. Reprogramming the AGI of the future might be, IF the designers build in the right mechanisms for an effective oversight of the units. Friendly may be nice, and a good marketing tool, but the prudent measure is to assume that the AGI can still be fooled - be tempted, be enamored by an opportunity. The emphasis might better be placed on asking AGI designers to build in the ability to record the goals / intents / cause / mission of the unit and allow it to be reviewed by an appointed authority. (cringe) I believe the US may be requiring large companies to back up all emails that pass through internal email systems. A similar measure could be taken to back up the cause that the AGI is operating under; that is, what the AGI is being influenced by at the workspace logic level. (Use the imagination a bit...) I understand that there are issues of who gets to be the authority, and that isn't where this is leading.
The intent is to suggest that designers think of oversight as a design specification.
Re: [agi] What should we do to be prepared?
--- Stan Nilsen [EMAIL PROTECTED] wrote: Reprogramming humans doesn't appear to be an option. We do it all the time. It is called school. Less commonly, the mentally ill are forced to take drugs or treatment for their own good. Most notably, this includes drug addicts. Also, it is common practice to give hospital and nursing home patients tranquilizers to make less work for the staff. Note that the definition of mentally ill is subject to change. Alan Turing was required by court order to take female hormones to cure his homosexuality, and committed suicide shortly afterwards. Reprogramming the AGI of the future might be IF the designers build in the right mechanisms for an effective oversight of the units. We only get to program the first generation of AGI. Programming subsequent generations will be up to their parents. They will be too complex for us to do it. -- Matt Mahoney, [EMAIL PROTECTED]
Re: [agi] What should we do to be prepared?
Matt Mahoney wrote: --- Stan Nilsen [EMAIL PROTECTED] wrote: Reprogramming humans doesn't appear to be an option. We do it all the time. It is called school. I might be tempted to call this manipulation rather than programming. The results of schooling are questionable while programming will produce an expected result if the method is sound. Less commonly, the mentally ill are forced to take drugs or treatment for their own good. Most notably, this includes drug addicts. Also, it is common practice to give hospital and nursing home patients tranquilizers to make less work for the staff. Note that the definition of mentally ill is subject to change. Alan Turing was required by court order to take female hormones to cure his homosexuality, and committed suicide shortly afterwards. Reprogramming the AGI of the future might be IF the designers build in the right mechanisms for an effective oversight of the units. We only get to program the first generation of AGI. Programming subsequent generations will be up to their parents. They will be too complex for us to do it. Is there a reason to believe that a fledgling AGI will be proficient right from the start? It's easy to jump from AGI #1 to an AGI 10 years down the road and presume these fantastic capabilities. Even if the AGI can spend millions of cycles ingesting the Internet, won't it find thousands of difficult problems that might challenge it? Hard problems don't just dissolve when you apply resources. The point here is that control and domination of humans may not be very high on priority list. Do you think this older AGI will have an interest in trying to control other AGI that might come on the scene? I suspect that they will, and they might see fit to design their offspring with an oversight interface. In part, my contention is that AGI will not automatically agree with one another - do smart people necessarily come to the same opinion? 
Or does the existence of AGI mean there are no longer opinions, only facts, since these units grasp everything correctly? Science fiction aside, there may be a slow transition of AGI into society - remember that the G in AGI means general, not born with stock-market manipulation capability (unless it mimics the general population, in which case, good luck). -- Matt Mahoney, [EMAIL PROTECTED]
Re: [agi] Recap/Summary/Thesis Statement
--- Mark Waser [EMAIL PROTECTED] wrote: Attractor Theory of Friendliness There exists a describable, reachable, stable attractor in state space that is sufficiently Friendly to reduce the risks of AGI to acceptable levels Proof: something will happen resulting in zero or more intelligent agents. Those agents will be Friendly to each other and themselves, because the action of killing agents without replacement is an irreversible dynamic, and therefore cannot be part of an attractor. Huh? Why can't an irreversible dynamic be part of an attractor? (Not that I need it to be) An attractor is a set of states that are repeated given enough time. If agents are killed and not replaced, you can't return to the current state. Corollary: Killing with replacement is Friendly. Bad Logic. Not X (replacement) leads to not Y (Friendly) does NOT have the corollary X (replacement) leads to Y (Friendliness). And I do NOT agree that Killing with replacement is Friendly. You're right. Killing with replacement (e.g. evolution) may or may not be Friendly. Corollary: Friendliness does not guarantee survival of DNA based life. Both not a corollary and entirely irrelevant to my points (and, in fact, in direct agreement with my statement I'm afraid that my vision of Friendliness certainly does permit the intentional destruction of the human race if that is the *only* way to preserve a hundred more intelligent, more advanced, more populous races. On the other hand, given the circumstance space that we are likely to occupy with a huge certainty, the intentional destruction of the human race is most certainly ruled out. Or, in other words, there are no infinite guarantees but we can reduce the dangers to infinitesimally small levels.) My thesis statement explicitly says acceptable levels, not guarantee. You seem to be giving special status to Homo Sapiens. How does this arise out of your dynamic? I know you can program an initial bias, but how is it stable?
Humans are not the pinnacle of evolution. We are a point on a curve. Is it bad that Homo erectus is extinct? Would we be better off if they weren't? -- Matt Mahoney, [EMAIL PROTECTED]
Re: [agi] What should we do to be prepared?
Comments seem to be dying down and disagreement appears to be minimal, so let me continue . . . . Part 3. Fundamentally, what I'm trying to do here is to describe an attractor that will appeal to any goal-seeking entity (self-interest) and be beneficial to humanity at the same time (Friendly). Since Friendliness is obviously a subset of human self-interest, I can focus upon the former and the latter will be solved as a consequence. Humanity does not need to be factored into the equation (explicitly) at all. Or, in other words -- The goal of Friendliness is to promote the goals of all Friendly entities. To me, this statement is like that of the Eleusinian Mysteries -- very simple (maybe even blindingly obvious to some) but incredibly profound and powerful in its implications. Two immediate implications are that we suddenly have the concept of a society (all Friendly entities) and, since we have an explicit goal, we start to gain traction on what is good and bad relative to that goal. Clearly, anything that is innately contrary to the drives described by Omohundro is (all together now :-) BAD. Similarly, anything that promotes the goals of Friendly entities without negatively impacting any Friendly entities is GOOD. And anything else can be judged on the degree to which it impacts the goals of *all* Friendly entities (though I still don't want to descend to the level of the trees and start arguing the relative trade-offs of whether saving a few *very* intelligent entities is better than saving a large number of less intelligent entities, since it is my contention that this is *always* entirely situation-dependent AND that once given the situation, Friendliness CAN provide *some* but not always *complete* guidance -- though it can always definitely rule out quite a lot for that particular set of circumstances). So, it's now quite easy to move on to answering the question of "What is in the set of horrible nasty thing[s]?"
The simple answer is anything that interferes with (your choice of formulation) the achievement of goals/the basic Omohundro drives. The most obvious no-nos include:
a) destruction (interference with self-protection),
b) physical crippling (interference with self-protection, self-improvement and resource use),
c) mental crippling (interference with rationality, self-protection, self-improvement and resource use), and
d) perversion of goal structure (interference with utility-function preservation and prevention of counterfeit utilities).
The last one is particularly important to note since we (as humans) seem to be just getting a handle on it ourselves. I can also argue at this point that Eliezer's vision of Friendliness must arguably be either mentally crippling or a perversion of goal structure for the AI involved, since the AI is constrained to act in a fashion that is more constrained than Friendliness (a situation that no rational super-intelligence would voluntarily place itself in unless there were no other choice). This is why many people have an instinctive reaction against Eliezer's proposals. Even though they can't clearly describe why it is a problem, they clearly sense that there is an unnecessary constraint on a more-effectively goal-seeking entity than themselves. That seems to be a dangerous situation. Now, while Eliezer is correct in that there actually are some invisible bars that they can't see (i.e. that no goal-seeking entity will voluntarily violate its own current goals), they are also correct in that Eliezer's formulation is *NOT* an attractor and that the entity may well go through some very dangerous territory (for humans) on the way to the attractor if outside forces or internal errors change its goals. Thus Eliezer's vision of Friendliness is emphatically *NOT* Friendly by my formulation.
To be clear, the additional constraint is that the AI is *required* to show {lower-case}friendly behavior towards all humans even if they (the humans) are not {upper-case}Friendly. And, I probably shouldn't say this, but . . . it is also arguable that this constraint would likely make the conversion of humanity to Friendliness a much longer and bloodier process. TAKE-AWAY: Having the statement The goal of Friendliness is to promote the goals of all Friendly entities allows us to make considerable progress in describing and defining Friendliness. Part 4 will go into some of the further implications of our goal statement (most particularly those which are a consequence of having a society).
Re: [agi] What should we do to be prepared?
--- Mark Waser [EMAIL PROTECTED] wrote: TAKE-AWAY: Having the statement The goal of Friendliness is to promote the goals of all Friendly entities allows us to make considerable progress in describing and defining Friendliness. How does an agent know if another agent is Friendly or not, especially if the other agent is more intelligent? -- Matt Mahoney, [EMAIL PROTECTED]
Re: [agi] What should we do to be prepared?
On 03/07/2008 08:09 AM, Mark Waser wrote: There is one unique attractor in state space. No. I am not claiming that there is one unique attractor. I am merely saying that there is one describable, reachable, stable attractor that has the characteristics that we want. There are *clearly* other attractors. For starters, my attractor requires sufficient intelligence to recognize its benefits. There is certainly another very powerful attractor for simpler, brute-force approaches (which frequently have long-term disastrous consequences that aren't seen or are ignored). Of course. An earlier version said "there is one unique attractor" that identifies friendliness here, and while editing it somehow ended up in that obviously wrong form. Since any sufficiently advanced species will eventually be drawn towards F, the CEV of all species is F. While I believe this to be true, I am not convinced that it is necessary for my argument. I think that it would make my argument a lot easier if I could prove it to be true -- but I currently don't see a way to do that. Anyone want to chime in here? Ah, okay. I thought you were going to argue this following on from Omohundro's paper about drives common to all sufficiently advanced AIs and extend it to all sufficiently advanced intelligences, but that's my hallucination. Therefore F is not species-specific, and has nothing to do with any particular species or the characteristics of the first species that develops an AGI (AI). I believe that the F that I am proposing is not species-specific. My problem is that there may be another attractor F' existing somewhere far off in state space that some other species might start out close enough to that it would be pulled into that attractor instead. In that case, there would be the question as to how the species in the two different attractors interact. My belief is that it would be to the mutual benefit of both but I am not able to prove that at this time.
For there to be another attractor F', it would of necessity have to be an attractor that is not desirable to us, since you said there is only one stable attractor for us that has the desired characteristics. I don't see how beings subject to these two different attractors would find mutual benefit in general, since if they did, then F' would have the desirable characteristics that we wish a stable attractor to have, but it doesn't. This means that genuine conflict between friendly species or between friendly individuals is not even possible, so there is no question of an AI needing to arbitrate between the conflicting interests of two friendly individuals or groups of individuals. Of course, there will still be conflicts between non-friendlies, and the AI may arbitrate and/or intervene. Wherever/whenever there is a shortage of resources (i.e. not all goals can be satisfied), goals will conflict. Friendliness describes the behavior that should result when such conflicts arise. Friendly entities should not need arbitration or intervention but should welcome help in determining the optimal solution (which is *close* to arbitration but subtly different in that it is not adversarial). I would rephrase your general point as a true, adversarial relationship is not even possible. That's a better way of putting it. Conflicts will be possible, but they'll always be resolved via exchange of information rather than bullets. The AI will not be empathetic towards Homo sapiens sapiens in particular. It will be empathetic towards f-beings (friendly beings in the technical sense), whether they exist or not (since the AI might be the only being anywhere near the attractor). Yes. It will also be empathetic towards beings with the potential to become f-beings because f-beings are a tremendous resource/benefit.
You've said elsewhere that the constraints on how it deals with non-friendlies are rather minimal, so while it might be empathetic, it will still have no qualms about kicking ass and inflicting pain where necessary. This means no specific acts of the AI towards any species or individuals are ruled out, since it might be part of their CEV (which is the CEV of all beings), even though they are not smart enough to realize it. Absolutely correct and dead wrong at the same time. You could invent specific incredibly low-probability but possible circumstances where *any* specific act is justified. I'm afraid that my vision of Friendliness certainly does permit the intentional destruction of the human race if that is the *only* way to preserve a hundred more intelligent, more advanced, more populous races. On the other hand, given the circumstance space that we are likely to occupy with a huge certainty, the intentional destruction of the human race is most certainly ruled out. Or, in other words, there are no infinite guarantees but we can reduce the dangers to infinitesimally small levels.
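[Editor's sketch, not part of the original thread.] The two-attractor picture of F and F' can be illustrated with a toy dynamical system: gradient descent on the double-well potential f(x) = (x^2 - 1)^2, which has two point attractors, at x = -1 and x = +1. Which one a trajectory reaches depends only on which basin it starts in:

```python
def settle(x, lr=0.05, steps=2000):
    # Gradient descent on f(x) = (x^2 - 1)^2; f'(x) = 4x(x^2 - 1).
    # Two stable fixed points (attractors) at x = -1 and x = +1,
    # with basins of attraction separated at x = 0.
    for _ in range(steps):
        x -= lr * 4.0 * x * (x * x - 1.0)
    return x

# The starting basin alone decides the outcome.
print(settle(0.3))    # converges to +1
print(settle(-2.0))   # converges to -1
```

In the thread's terms: a species that "starts out close enough" to F' gets pulled there instead of to F, even though both systems follow the same dynamic.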
Re: [agi] What should we do to be prepared?
How does an agent know if another agent is Friendly or not, especially if the other agent is more intelligent? An excellent question but I'm afraid that I don't believe that there is an answer (but, fortunately, I don't believe that this has any effect on my thesis).
Re: [agi] Recap/Summary/Thesis Statement
--- Mark Waser [EMAIL PROTECTED] wrote: Huh? Why can't an irreversible dynamic be part of an attractor? (Not that I need it to be) An attractor is a set of states that are repeated given enough time. NO! Easily disproved by an obvious example. The sun (moving through space) is an attractor for the Earth and the other solar planets YET the sun and the other planets are never in the same location (state) twice (due to the movement of the entire solar system through the universe). No, the attractor is the center of the sun. The Earth and other planets are in the basin of attraction but have not yet reached equilibrium. http://en.wikipedia.org/wiki/Attractor You seem to be giving special status to Homo Sapiens. How does this arise out of your dynamic? I know you can program an initial bias, but how is it stable? I am emphatically *NOT* giving special status to Homo Sapiens. In fact, that is precisely *my* objection to Eliezer's view of Friendliness. OK. That makes the problem much easier. -- Matt Mahoney, [EMAIL PROTECTED]
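[Editor's sketch, not part of the original thread.] The fixed-point sense of "attractor" and "basin of attraction" in Matt's reply can be made concrete with a standard example: the iterated map x -> cos(x) has a single fixed-point attractor (the Dottie number, ~0.739), and every starting state in its basin is drawn toward it:

```python
import math

def iterate(x, steps=100):
    # Repeatedly apply the map x -> cos(x).  Its fixed point is an
    # attractor: states throughout the basin converge to it and stay.
    for _ in range(steps):
        x = math.cos(x)
    return x

# Trajectories from very different starting states all reach the same
# attractor -- the "equilibrium" in Matt's sense above.
print([round(iterate(x0), 6) for x0 in (-1.0, 0.1, 1.5)])
# -> [0.739085, 0.739085, 0.739085]
```

The planets-around-the-sun dispute maps onto this: the trajectory need never repeat a state for the system to have an attractor; it only needs to converge toward one.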
Re: [agi] What should we do to be prepared?
On 03/07/2008 03:20 PM, Mark Waser wrote:

For there to be another attractor F', it would of necessity have to be an attractor that is not desirable to us, since you said there is only one stable attractor for us that has the desired characteristics.

Uh, no. I am not claiming that there is */ONLY/* one unique attractor (that has the desired characteristics). I am merely saying that there is */AT LEAST/* one describable, reachable, stable attractor that has the characteristics that we want. (Note: I've clarified a previous statement by adding the */ONLY/* and */AT LEAST/* and the parenthetical expression "that has the desired characteristics".)

Okay, got it now. At least one, not exactly one.

I really don't like the particular quantifier "rather minimal". I would argue (and will later attempt to prove) that the constraints are still actually as close to Friendly as rationally possible because that is the most rational way to move non-Friendlies to a Friendly status (which is a major Friendliness goal that I'll be getting to shortly). The Friendly will indeed have no qualms about kicking ass and inflicting pain */where necessary/*, but the "where necessary" clause is critically important since a Friendly shouldn't resort to this (even for Unfriendlies) until it is truly necessary.

Fair enough. "rather minimal" is much too strong a phrase.

I think you're fudging a bit here. If we are only likely to occupy the circumstance space with probability less than 1, then the intentional destruction of the human race is not 'most certainly ruled out': it is ruled out with a very high probability that is still less than 1. I'm not trying to say it's likely; only that it's possible. */I make this point to distinguish your approach from other approaches that purport to make absolute guarantees about certain things (as in some ethical systems where certain things are *always* wrong, regardless of context or circumstance)./*

Um. I think that we're in violent agreement.
I'm not quite sure where you think I'm fudging.

The reason I thought you were fudging was that I thought you were saying that it is absolutely certain that the AI will never turn the planet into computronium and upload us *AND* that there are no absolute guarantees. I guess I was misled when I read "given the circumstance space that we are likely to occupy with a huge certainty, the intentional destruction of the human race is most certainly ruled out" as meaning 'turning earth into computronium is certainly ruled out'. It's only certainly ruled out *assuming* the highly likely area of circumstance space that we are likely to inhabit. So yeah, I guess we do agree.

This raises another point for me though. In another post (2008-03-06 14:36) you said: "It would *NOT* be Friendly if I have a goal that I not be turned into computronium even if your clause (which I hereby state that I do)." Yet, if I understand our recent exchange correctly, it is possible for this to occur and be a Friendly action regardless of what sub-goals I may or may not have. (It's just extremely unlikely given ..., which is an important distinction.) It would be nice to have some ballpark probability estimates, though, to know what we mean by "extremely unlikely". 10^-6 is a very different beast than 10^-1000.

I don't think it's inflammatory or a case of garbage in to contemplate that all of humanity could be wrong. For much of our history, there have been things that *every single human was wrong about*. This is merely the assertion that we can't make guarantees about what vastly superior f-beings will find to be the case. We may one day outgrow our attachment to meatspace, and we may be wrong in our belief that everything essential can be preserved in meatspace, but we might not be at that point yet when the AI has to make the decision.

Why would the AI *have* to make the decision? It shouldn't be for its own convenience.
The only circumstance that I could think of where the AI should make such a decision *for us* over our objections is if we would be destroyed otherwise (but there was no way for it to convince us of this fact before the destruction was inevitable).

It might not *have* to. I'm only saying it's possible. And it would almost certainly be for some circumstance that has not occurred to us, so I can't give you a specific scenario. Not being able to find such a scenario is different, though, from there not actually being one. In order to believe the latter, a proof is required.

Yes, when you talk about Friendliness as that distant attractor, it starts to sound an awful lot like enlightenment, where self-interest is one aspect of that enlightenment, and friendly behavior is another aspect.

Argh! I would argue that Friendliness is *not* that distant. Can't you see how the attractor that I'm describing is both self-interest and Friendly because **ultimately they are the same thing** (OK, so maybe that *IS*
Re: [agi] What should we do to be prepared?
On Fri, Mar 7, 2008 at 5:24 PM, Mark Waser [EMAIL PROTECTED] wrote:

The core of my thesis is that the particular Friendliness that I/we are trying to reach is an attractor -- which means that if the dominant structure starts to turn unfriendly, it is actually a self-correcting situation.

This sounds like magical thinking, sweeping the problem under the rug of the word 'attractor'. Anyway, even if this trick somehow works, it doesn't actually address the problem of friendly AI. The problem with unfriendly AI is not that it turns selfish, but that it doesn't give us what we want from it, or can't foresee the consequences of its actions in sufficient detail. If you already have a system (in the lab) that is smart enough to support your code of friendliness and not crash old humanity through oversight by the year 2500, you should be able to make it produce another system that works with unfriendly humanity, doesn't have its own agenda, and so on.

P.S. I'm just starting to fundamentally revise my attitude to the problem of friendliness; see my post "Understanding the problem of friendliness" on SL4.

-- Vladimir Nesov [EMAIL PROTECTED]