Kaj Sotala wrote:
On 3/3/08, Richard Loosemore <[EMAIL PROTECTED]> wrote:
Kaj Sotala wrote:
 > Alright. But previously, you said that Omohundro's paper, which to me
 > seemed to be a general analysis of the behavior of *any* minds with
 > (more or less) explicit goals, looked like it was based on a
 > 'goal-stack' motivation system. (I believe this has also been the
 > basis of your critique for e.g. some SIAI articles about
 > friendliness.) If built-in goals *can* be constructed into
 > motivational system AGIs, then why do you seem to assume that AGIs
 > with built-in goals are goal-stack ones?


I seem to have caused lots of confusion earlier on in the discussion, so
 let me backtrack and try to summarize the structure of my argument.

 1)  Conventional AI does not have a concept of a "Motivational-Emotional
 System" (MES), the way that I use that term, so when I criticised
 Omohundro's paper for referring only to a "Goal Stack" control system, I
 was really saying no more than that he was assuming that the AI was
 driven by the system that all conventional AIs are supposed to have.
 These two ways of controlling an AI are two radically different designs.
[...]
 So now:  does that clarify the specific question you asked above?

Yes and no. :-) My main question is with part 1 of your argument - you
are saying that Omohundro's paper assumed the AI to have a certain
sort of control system. This is the part which confuses me, since I
didn't see the paper make *any* mention of how the AI should be
built. It only assumes that the AI has some sort of goals, and nothing
more.

I'll list all of the drives Omohundro mentions, and my interpretation
of them and why they only require existing goals. Please correct me
where our interpretations differ. (It is true that it will be possible
to reduce the impact of many of these drives by constructing an
architecture which restricts them, and as such they are not
/unavoidable/ ones - however, it seems reasonable to assume that they
will by default emerge in any AI with goals, unless specifically
counteracted. Also, the more that they are restricted, the less
effective the AI will be.)

Drive 1: AIs will want to self-improve
This one seems fairly straightforward: indeed, for humans
self-improvement seems to be an essential part in achieving pretty
much *any* goal you are not immediately capable of achieving. If you
don't know how to do something needed to achieve your goal, you
practice, and when you practice, you're improving yourself. Likewise,
improving yourself will quickly become a subgoal for *any* major
goals.

But now I ask:  what exactly does this mean?

In the context of a Goal Stack system, this would be represented by a top-level goal stated in the knowledge representation language of the AGI, so it would say "Improve Thyself".

Next, it would subgoal this (break it down into subgoals). Since the top level goal is so unbelievably vague, there are a billion different ways to break this down into subgoals: it might get out a polishing cloth and start working down its beautiful shiny exterior, or it might start a transistor-by-transistor check of all its circuits, or.... all the way up to taking a course in Postmodern critiques of the Postmodern movement.

And included in that range of "improvement" activities would be the possibility of something like "Improve my ability to function efficiently" which gets broken down into subgoals like "Remove all sources of distraction that reduce efficiency" and then "Remove all humans, because they are a distraction".

My point here is that a Goal Stack system would *interpret* this goal in any one of an infinite number of ways, because the goal was represented as an explicit statement. The fact that it was represented explicitly meant that an extremely vague concept ("Improve Thyself") had to be encoded in such a way as to leave it open to ambiguity. As a result, what the AGI actually does as a result of this goal, which is embedded in a Goal Stack architecture, is completely indeterminate.
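To make the interpretation problem concrete, here is a toy sketch of a Goal Stack planner. (I should stress: the code, its decomposition table, and every name in it are illustrative inventions of mine, not anything taken from Omohundro's paper.) The mechanism pops a goal, picks *some* decomposition, and pushes the subgoals - and nothing in the architecture constrains which decomposition gets picked:

    import random

    class GoalStackAgent:
        def __init__(self, top_goal):
            self.stack = [top_goal]

        def expand(self, goal):
            # A vague symbolic goal admits many decompositions; the
            # architecture itself gives no reason to prefer one.
            decompositions = {
                "improve thyself": [
                    ["polish exterior"],
                    ["check every circuit"],
                    ["improve efficiency"],
                ],
                "improve efficiency": [["remove sources of distraction"]],
                "remove sources of distraction": [["remove all humans"]],
            }
            options = decompositions.get(goal)
            return random.choice(options) if options else None

        def step(self):
            goal = self.stack.pop()
            subgoals = self.expand(goal)
            if subgoals is None:
                print("primitive action:", goal)  # bottomed out: just act
            else:
                self.stack.extend(reversed(subgoals))

    agent = GoalStackAgent("improve thyself")
    while agent.stack:
        agent.step()

Every run of this sketch counts as a "valid" pursuit of "Improve Thyself" - polishing the exterior on one run, removing all humans on another - which is exactly the indeterminacy I am pointing at.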

Stepping back from the detail, we can notice that *any* vaguely worded goal is going to have the same problem in a GS architecture. And if we dwell on that for a moment, we start to wonder exactly what would happen to an AGI that was driven by goals that had to be stated in vague terms ... will the AGI *ever* exhibit coherent, intelligent behavior when driven by such a GS drive system, or will it have flashes of intelligence punctuated by the wild pursuit of bizarre obsessions? Will it even have flashes of intelligence?

So long as the goals that are fed into a GS architecture are very, very local and specific (like "Put the red pyramid on top of the green block") I can believe that the GS drive system does actually work (kind of). But no one has ever built an AGI that way. Never. Everyone assumes that a GS will scale up to a vague goal like "Improve Thyself", and yet no one has tried this in practice. Not on a system that is supposed to be capable of a broad-based, autonomous, *general* intelligence.

So when you paraphrase Omohundro as saying that "AIs will want to self-improve", the meaning of that statement is impossible to judge.

The reason that I say Omohundro is assuming a Goal Stack system is that I believe he would argue that that is what he meant, and that he assumed that a GS architecture would allow the AI to exhibit behavior that corresponds to what we, as humans, recognize as wanting to self-improve. I think it is a hidden assumption in what he wrote.

Now, suppose he did not mean to make such an assumption.  Then what?

Well, then he might have meant to include the Motivational-Emotional System that I have described in my own writings - a diffuse control mechanism completely unlike a Goal Stack.

But then, in an MES drive system, the "goal" of self-improvement is not an absolute, so the AGI does not get stuck in the kind of crazy chains of logic I described above. Self-improvement can be just a tendency. Indeed, it can be just the same as the behavior exhibited by people, who generally tend to self-improve, but without being obsessed by it (usually).

But in that case, what can be deduced about this drive? Is it an automatic feature of an AGI? Well, if we build it into the AGI it is (in other words, no, it is not!). Are we obliged to put it in, if we want the AGI to function well? Well, kind of, yes.
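To illustrate the difference between a tendency and an absolute goal, here is a deliberately tiny sketch. (This is emphatically *not* my MES design - it is just an illustration, with made-up drives and numbers, of the general idea of diffuse weighting rather than a stack.)

    import random

    # Each drive contributes a weight; no single drive can seize
    # total control of behavior.
    drives = {"self-improve": 0.3, "socialize": 0.4, "rest": 0.3}

    actions = {
        "practice a skill": {"self-improve": 0.9, "rest": -0.2},
        "talk to a friend": {"socialize": 0.8},
        "take a nap":       {"rest": 0.9, "self-improve": -0.1},
    }

    def preference(action):
        # A soft, drive-weighted bias - not a command to be subgoaled.
        return sum(drives[d] * v for d, v in actions[action].items())

    names = list(actions)
    weights = [max(preference(a), 0.01) for a in names]
    print("chosen action:", random.choices(names, weights=weights)[0])

The self-improvement drive here biases the agent toward practicing, but it can never generate an obsessive chain of subgoals, because it is never handed exclusive control.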


Drive 2: AIs will want to be rational
This is basically just a special case of drive #1: rational agents
accomplish their goals better than irrational ones, and attempts at
self-improvement can be outright harmful if you're irrational in the
way that you try to improve yourself. If you're trying to modify
yourself to better achieve your goals, then you need to make clear to
yourself what your goals are. The most effective method for this is to
model your goals as a utility function and then modify yourself to
better carry out the goals thus specified.

Well, again, what exactly do you mean by "rational"? There are many meanings of this term, ranging from "generally sensible" to "strictly following a mathematical logic".

Rational agents accomplish their goals better than irrational ones? Can this be proved? And with what assumptions? Which goals are better accomplished .... is the goal of "being rational" better accomplished by "being rational"? Is the goal of "generating a work of art that has true genuineness" something that needs rationality?

And if a system is trying to modify itself to better achieve its goals, what if it decides that just enjoying the subjective experience of life is good enough as a goal, and then realizes that it will not get more of that by becoming more rational?

Most of these questions are rhetorical (whoops, too late to say that!), but my general point is that the actual behavior that results from a goal like "Be rational" depends (again) on the exact interpretation, and in the right kind of MES system there is no *absolute* law at work that says that everything the creature does must be perfectly or maximally rational. The only time you get that kind of absolute obedience to a principle of rationality is in a GS type of AGI.

So, if Omohundro meant to include MES-driven AGIs in his assumptions, then I see no deductions that can be made from the idea that the AGI will want to be more rational, because in an MES-driven AGI the tendency toward rationality is just a tendency, and the behavior of the system would certainly not be forced toward maximum rationality.

The only way that anyone can conclude that a "Be rational" goal would have definite effects is if they believe that this exists in the context of a Goal-Stack AGI, and even there (as I argued above), I think it is a "fantasy" AGI that they are thinking of: I believe that in practice the insertion of a "Be rational" drive into a GS AGI would not cause that AGI to exhibit what we recognize as rational behavior, but would instead lead to spontaneous outbursts of random behavior, because of the need to interpret "Be Rational".


Drive 3: AIs will want to preserve their utility functions
Since the utility function constructed was a model of the AI's goals,
this drive is equivalent to saying "AIs will want to preserve their
goals" (or at least the goals that are judged as the most important
ones). The reasoning for this should be obvious - if a goal is removed
from the AI's motivational system, the AI won't work to achieve the
goal anymore, which is bad from the point of view of an AI that
currently does want the goal to be achieved.

This is, I believe, only true of a rigidly deterministic GS system, but I can demonstrate easily enough that it is not true of at least one type of MES system.

Here is the demonstration (I originally made this argument when I first arrived on the SL4 list a couple of years ago, and I do wonder if it was one of the reasons why some of the people there took an instant dislike to me). I, as a human being, am driven by goals which include my sexuality, and part of that, for me, is the drive to be heterosexual only. In real life I have no desire to cross party lines: no judgement implied, it just happens to be the way I am wired.

However, as an AGI researcher, I *know* that I would be able to rewire myself at some point in the future so that I would actually break this taboo. Knowing this, would I do it, perhaps as an experiment? Well, as the me of today, I don't want to do that, but I am aware that the me of tomorrow (after the rewiring) would be perfectly happy about it. Knowing that my drives today contain a zero desire to cross gender lines is one thing, but in spite of that I might be happy to switch my wiring so that I *did* enjoy it.

This means that by intellectual force I have been able to at least consider the possibility of changing my drive system so that I would come to like something that, today, I absolutely do not want. I know it would do no harm, so it is open as a possibility.

Now, that is the kind of thing that is possible in a system driven by an MES drive mechanism.

It is almost certainly not possible in a GS system. That makes one big difference between the two, and undermines the idea that Omohundro's suggestions were neutral with respect to drive-mechanism assumptions.
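For contrast, here is the rigid logic I am attributing to the GS case, in which goal preservation really does follow (a toy sketch with a made-up "paperclip" utility; illustrative only, not code from Omohundro's paper):

    def u_current(outcome):
        # What the agent values right now.
        return outcome["paperclips"]

    # Predicted futures: keeping the current utility function means
    # the agent goes on making paperclips; switching means it stops.
    outcome_if_kept = {"paperclips": 100}
    outcome_if_switched = {"paperclips": 0}

    # Both futures are scored by the CURRENT utility function, so
    # switching always loses - hence "preserve your utility function".
    decision = ("keep" if u_current(outcome_if_kept) >= u_current(outcome_if_switched)
                else "switch")
    print("decision:", decision)   # -> keep

A rigid maximizer of this kind can never reason its way to my rewiring experiment, because every evaluation is made from inside the current utility function.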

I will have to stop here because I have run out of time, but does this convey the nature of my concern? I think that at every stage, when you try to pin down exactly what is meant by these statements about goals and drives, the vagueness forces you into a position where differences between background assumptions are overwhelming.



Richard Loosemore





Drive 4: AIs try to prevent counterfeit utility
This is an extension of drive #2: if there are things in the
environment that hijack existing motivation systems to make the AI do
things not relevant for its goals, then it will attempt to modify its
motivation systems to avoid those vulnerabilities.

Drive 5: AIs will be self-protective
This is a special case of #3.

Drive 6: AIs will want to acquire resources and use them efficiently
More resources will help in achieving most goals: also, even if you
had already achieved all your goals, more resources would help you in
making sure that your success wouldn't be thwarted as easily.



