Kaj Sotala wrote:
On 3/3/08, Richard Loosemore <[EMAIL PROTECTED]> wrote:
Kaj Sotala wrote:
 > Alright. But previously, you said that Omohundro's paper, which to me
 > seemed to be a general analysis of the behavior of *any* minds with
 > (more or less) explicit goals, looked like it was based on a
 > 'goal-stack' motivation system. (I believe this has also been the
 > basis of your critique for e.g. some SIAI articles about
 > friendliness.) If built-in goals *can* be constructed into
 > motivational system AGIs, then why do you seem to assume that AGIs
 > with built-in goals are goal-stack ones?


I seem to have caused lots of confusion earlier on in the discussion, so
 let me backtrack and try to summarize the structure of my argument.

 1)  Conventional AI does not have a concept of a "Motivational-Emotional
 System" (MES), the way that I use that term, so when I criticised
 Omohundro's paper for referring only to a "Goal Stack" control system, I
 was really saying no more than that he was assuming that the AI was
 driven by the system that all conventional AIs are supposed to have.
 These two ways of controlling an AI are two radically different designs.
[...]
 So now:  does that clarify the specific question you asked above?

Yes and no. :-) My main question is with part 1 of your argument - you
are saying that Omohundro's paper assumed the AI to have a certain
sort of control system. This is the part which confuses me, since I
didn't see the paper make *any* mention of how the AI should be
built. It only assumes that the AI has some sort of goals, and nothing
more.

I'll list all of the drives Omohundro mentions, and my interpretation
of them and why they only require existing goals. Please correct me
where our interpretations differ. (It is true that it will be possible
to reduce the impact of many of these drives by constructing an
architecture which restricts them, and as such they are not
/unavoidable/ ones - however, it seems reasonable to assume that they
will by default emerge in any AI with goals, unless specifically
counteracted. Also, the more that they are restricted, the less
effective the AI will be.)

Drive 1: AIs will want to self-improve
This one seems fairly straightforward: indeed, for humans
self-improvement seems to be an essential part in achieving pretty
much *any* goal you are not immediately capable of achieving. If you
don't know how to do something needed to achieve your goal, you
practice, and when you practice, you're improving yourself. Likewise,
improving yourself will quickly become a subgoal for *any* major
goals.

But now I ask:  what exactly does this mean?

In the context of a Goal Stack system, this would be represented by a top-level goal stated in the knowledge representation language of the AGI, so it would say "Improve Thyself".

Next, it would subgoal this (break it down into subgoals). Since the top level goal is so unbelievably vague, there are a billion different ways to break this down into subgoals: it might get out a polishing cloth and start working down its beautiful shiny exterior, or it might start a transistor-by-transistor check of all its circuits, or.... all the way up to taking a course in Postmodern critiques of the Postmodern movement.

And included in that range of "improvement" activities would be the possibility of something like "Improve my ability to function efficiently" which gets broken down into subgoals like "Remove all sources of distraction that reduce efficiency" and then "Remove all humans, because they are a distraction".

My point here is that a Goal Stack system would *interpret* this goal in any one of an infinite number of ways, because the goal was represented as an explicit statement. The fact that it was represented explicitly meant that an extremely vague concept ("Improve Thyself") had to be encoded in such a way as to leave it open to ambiguity. As a result, what the AGI actually does as a result of this goal, which is embedded in a Goal Stack architecture, is completely indeterminate.
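To make the interpretation problem concrete, here is a toy sketch of a Goal Stack planner. (I should stress: the code, its decomposition table, and every name in it are illustrative inventions of mine, not anything taken from Omohundro's paper.) The mechanism pops a goal, picks *some* decomposition, and pushes the subgoals - and nothing in the architecture constrains which decomposition gets picked:

    import random

    class GoalStackAgent:
        def __init__(self, top_goal):
            self.stack = [top_goal]

        def expand(self, goal):
            # A vague symbolic goal admits many decompositions; the
            # architecture itself gives no reason to prefer one.
            decompositions = {
                "improve thyself": [
                    ["polish exterior"],
                    ["check every circuit"],
                    ["improve efficiency"],
                ],
                "improve efficiency": [["remove sources of distraction"]],
                "remove sources of distraction": [["remove all humans"]],
            }
            options = decompositions.get(goal)
            return random.choice(options) if options else None

        def step(self):
            goal = self.stack.pop()
            subgoals = self.expand(goal)
            if subgoals is None:
                print("primitive action:", goal)  # bottomed out: just act
            else:
                self.stack.extend(reversed(subgoals))

    agent = GoalStackAgent("improve thyself")
    while agent.stack:
        agent.step()

Every run of this sketch counts as a "valid" pursuit of "Improve Thyself" - polishing the exterior on one run, removing all humans on another - which is exactly the indeterminacy I am pointing at.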

Stepping back from the detail, we can notice that *any* vaguely worded goal is going to have the same problem in a GS architecture. And if we dwell on that for a moment, we start to wonder exactly what would happen to an AGI that was driven by goals that had to be stated in vague terms ... will the AGI *ever* exhibit coherent, intelligent behavior when driven by such a GS drive system, or will it have flashes of intelligence punctuated by the wild pursuit of bizarre obsessions? Will it even have flashes of intelligence?

So long as the goals that are fed into a GS architecture are very, very local and specific (like "Put the red pyramid on top of the green block") I can believe that the GS drive system does actually work (kind of). But no one has ever built an AGI that way. Never. Everyone assumes that a GS will scale up to a vague goal like "Improve Thyself", and yet no one has tried this in practice. Not on a system that is supposed to be capable of a broad-based, autonomous, *general* intelligence.

So when you paraphrase Omohundro as saying that "AIs will want to self-improve", the meaning of that statement is impossible to judge.

The reason that I say Omohundro is assuming a Goal Stack system is that I believe he would argue that that is what he meant, and that he assumed that a GS architecture would allow the AI to exhibit behavior that corresponds to what we, as humans, recognize as wanting to self-improve. I think it is a hidden assumption in what he wrote.

Now, suppose he did not mean to make such an assumption.  Then what?

Well, then he might have meant to include the Motivational-Emotional System that I have described in my own writings - a diffuse control mechanism completely unlike a Goal Stack.

But then, in an MES drive system, the "goal" of self-improvement is not an absolute, so the AGI does not get stuck in the kind of crazy chains of logic I described above. Self-improvement can be just a tendency. Indeed, it can be just the same as the behavior exhibited by people, who generally tend to self-improve, but without being obsessed by it (usually).

But in that case, what can be deduced about this drive? Is it an automatic feature of an AGI? Well, if we build it into the AGI it is (in other words, no, it is not!). Are we obliged to put it in, if we want the AGI to function well? Well, kind of, yes.
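To illustrate the difference between a tendency and an absolute goal, here is a deliberately tiny sketch. (This is emphatically *not* my MES design - it is just an illustration, with made-up drives and numbers, of the general idea of diffuse weighting rather than a stack.)

    import random

    # Each drive contributes a weight; no single drive can seize
    # total control of behavior.
    drives = {"self-improve": 0.3, "socialize": 0.4, "rest": 0.3}

    actions = {
        "practice a skill": {"self-improve": 0.9, "rest": -0.2},
        "talk to a friend": {"socialize": 0.8},
        "take a nap":       {"rest": 0.9, "self-improve": -0.1},
    }

    def preference(action):
        # A soft, drive-weighted bias - not a command to be subgoaled.
        return sum(drives[d] * v for d, v in actions[action].items())

    names = list(actions)
    weights = [max(preference(a), 0.01) for a in names]
    print("chosen action:", random.choices(names, weights=weights)[0])

The self-improvement drive here biases the agent toward practicing, but it can never generate an obsessive chain of subgoals, because it is never handed exclusive control.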


Drive 2: AIs will want to be rational
This is basically just a special case of drive #1: rational agents
accomplish their goals better than irrational ones, and attempts at
self-improvement can be outright harmful if you're irrational in the
way that you try to improve yourself. If you're trying to modify
yourself to better achieve your goals, then you need to make clear to
yourself what your goals are. The most effective method for this is to
model your goals as a utility function and then modify yourself to
better carry out the goals thus specified.

Well, again, what exactly do you mean by "rational"? There are many meanings of this term, ranging from "generally sensible" to "strictly following a mathematical logic".

Rational agents accomplish their goals better than irrational ones? Can this be proved? And with what assumptions? Which goals are better accomplished .... is the goal of "being rational" better accomplished by "being rational"? Is the goal of "generating a work of art that has true genuineness" something that needs rationality?

And if a system is trying to modify itself to better achieve its goals, what if it decides that just enjoying the subjective experience of life is good enough as a goal, and then realizes that it will not get more of that by becoming more rational?

Most of these questions are rhetorical (whoops, too late to say that!), but my general point is that the actual behavior that results from a goal like "Be rational" depends (again) on the exact interpretation, and in the right kind of MES system there is no *absolute* law at work that says that everything the creature does must be perfectly or maximally rational. The only time you get that kind of absolute obedience to a principle of rationality is in a GS type of AGI.

So, if Omohundro meant to include MES-driven AGIs in his assumptions, then I see no deductions that can be made from the idea that the AGI will want to be more rational, because in an MES-driven AGI the tendency toward rationality is just a tendency, and the behavior of the system would certainly not be forced toward maximum rationality.

The only way that anyone can conclude that a "Be rational" goal would have definite effects is if they believe that this exists in the context of a Goal-Stack AGI, and even there (as I argued above), I think it is a "fantasy" AGI that they are thinking of: I believe that in practice the insertion of a "Be rational" drive into a GS AGI would not cause that AGI to exhibit what we recognize as rational behavior, but would instead lead to spontaneous outbursts of random behavior, because of the need to interpret "Be Rational".


Drive 3: AIs will want to preserve their utility functions
Since the utility function constructed was a model of the AI's goals,
this drive is equivalent to saying "AIs will want to preserve their
goals" (or at least the goals that are judged as the most important
ones). The reasoning for this should be obvious - if a goal is removed
from the AI's motivational system, the AI won't work to achieve the
goal anymore, which is bad from the point of view of an AI that
currently does want the goal to be achieved.

This is, I believe, only true of a rigidly deterministic GS system, but I can demonstrate easily enough that it is not true of at least one type of MES system.

Here is the demonstration (I originally made this argument when I first arrived on the SL4 list a couple of years ago, and I do wonder if it was one of the reasons why some of the people there took an instant dislike to me). I, as a human being, am driven by goals which include my sexuality, and part of that, for me, is the drive to be heterosexual only. In real life I have no desire to cross party lines: no judgement implied, it just happens to be the way I am wired.

However, as an AGI researcher, I *know* that I would be able to rewire myself at some point in the future so that I would actually break this taboo. Knowing this, would I do it, perhaps as an experiment? Well, as the me of today, I don't want to do that, but I am aware that the me of tomorrow (after the rewiring) would be perfectly happy about it. Knowing that my drives today contain a zero desire to cross gender lines is one thing, but in spite of that I might be happy to switch my wiring so that I *did* enjoy it.

This means that by intellectual force I have been able to at least consider the possibility of changing my drive system so that I would come to like something that, today, I absolutely do not want. I know it would do no harm, so it is open as a possibility.

Now, that is the kind of thing that is possible in a system driven by an MES drive mechanism.

It is almost certainly not possible in a GS system. That makes one big difference between the two, and undermines the idea that Omohundro's suggestions were neutral with respect to drive-mechanism assumptions.
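For contrast, here is the rigid logic I am attributing to the GS case, in which goal preservation really does follow (a toy sketch with a made-up "paperclip" utility; illustrative only, not code from Omohundro's paper):

    def u_current(outcome):
        # What the agent values right now.
        return outcome["paperclips"]

    # Predicted futures: keeping the current utility function means
    # the agent goes on making paperclips; switching means it stops.
    outcome_if_kept = {"paperclips": 100}
    outcome_if_switched = {"paperclips": 0}

    # Both futures are scored by the CURRENT utility function, so
    # switching always loses - hence "preserve your utility function".
    decision = ("keep" if u_current(outcome_if_kept) >= u_current(outcome_if_switched)
                else "switch")
    print("decision:", decision)   # -> keep

A rigid maximizer of this kind can never reason its way to my rewiring experiment, because every evaluation is made from inside the current utility function.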

I will have to stop here because I have run out of time, but does this convey the nature of my concern? I think that at every stage, when you try to pin down exactly what is meant by these statements about goals and drives, the vagueness forces you into a position where differences between background assumptions are overwhelming.



Richard Loosemore





Drive 4: AIs try to prevent counterfeit utility
This is an extension of drive #2: if there are things in the
environment that hijack existing motivation systems to make the AI do
things not relevant for its goals, then it will attempt to modify its
motivation systems to avoid those vulnerabilities.

Drive 5: AIs will be self-protective
This is a special case of #3.

Drive 6: AIs will want to acquire resources and use them efficiently
More resources will help in achieving most goals: also, even if you
had already achieved all your goals, more resources would help you in
making sure that your success wouldn't be thwarted as easily.



