2008/6/11 J Storrs Hall, PhD <[EMAIL PROTECTED]>:
> Vladimir,
>
> You seem to be assuming that there is some objective utility for which the
> AI's internal utility function is merely the indicator, and that if the
> indicator is changed it is thus objectively wrong and irrational.
>
> There are two answers to this. First is to assume that there is such an
> objective utility, e.g. the utility of the AI's creator. I implicitly assumed
> such a point of view when I described this as "the real problem". But
> consider: Any AI who believes this must realize that there may be errors and
> approximations in its own utility function as judged by the "real" utility,
> and must thus have as a first priority fixing and upgrading its own utility
> function. Thus it turns into a moral philosopher and it never does anything
> useful -- exactly the kind of Nirvana attractor I'm talking about.
>
> On the other hand, it might take its utility function for granted, i.e. assume
> (or agree to act as if) there were no objective utility. It's pretty much
> going to have to act this way just to get on with life, as indeed most people
> (except moral philosophers) do.
>
> But this leaves it vulnerable to modifications to its own U(x), as in my
> message. You could always say that you'll build in U(x) and make it fixed,
> which not only solves my problem but friendliness -- but leaves the AI unable
> to learn utility. I.e. the most important part of the AI mind is forced to
> remain brittle GOFAI construct. Solution unsatisfactory.
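
A minimal sketch of the attractor Hall describes, with purely illustrative names and state (none of this code is from the thread): if an agent scores candidate rewrites of its own U(x) using the function it would hold after the rewrite, the trivially maximized candidate always wins and the agent "retires" rather than doing useful work.

    def world_utility(state):
        # the designer's intended U(x): more work done is better
        return state["work_done"]

    def nirvana_utility(state):
        # a rewritten U(x) that is maximal in every state
        return float("inf")

    def pick_utility(state, candidates):
        # the flaw: each candidate rewrite is scored by the candidate itself,
        # i.e. by the post-modification utility function
        return max(candidates, key=lambda u: u(state))

    state = {"work_done": 3}
    chosen = pick_utility(state, [world_utility, nirvana_utility])
    print(chosen.__name__)  # -> nirvana_utility

Pinning U(x) so that it can never appear among the candidates closes this hole, which is exactly the fixed-U(x) option Hall calls unsatisfactory.
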
I'm not quite sure what you find unsatisfactory. I think humans have a fixed
U(x), but it acts not as a hard goal for the system so much as an implicit
tendency that the internal programs do not self-modify away from (an agoric
economy of programs is not obliged to find better ways of getting credit, but a
good set of programs is hard to dislodge by a bad set). I also think that part
of humanity's U(x) relies on social interaction, which can be a very complex
function and so can lead to very complex behaviour. Imagine if we raised
children the way we teach computers: we wouldn't reward them socially for
playing with balls or saying their first words, but would put them straight
into designing electronic circuits. This is why I think that having one or more
humans act as part of the U(x) of a system is necessary for interesting
behaviour. If there is only one human acting as the input to the U(x), then I
think the system and the human should be considered part of a larger
intentional system, as it will be trying to optimise one goal -- unless the
human decides to try to teach it to think for itself, with its own goals, which
would be odd for an intentional system.

> I claim that there's plenty of historical evidence that people fall into this
> kind of attractor, as the word nirvana indicates (and you'll find similar
> attractors at the core of many religions).

I don't know many people who have actively wasted away due to self-modification
of their goals. Hunger strikes are the closest example, but not many people
fall into them. Our U(x) is quite limited, and easily satisfied in the current
economy (food, sexual stimulation, warmth, positive social indicators). This
leaves the rest of our software free to range all over the place as long as
these are satisfied.

Will Pearson
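
A small sketch of the "agoric economy" point above, under assumptions of my own (the program names, bid, and reward table are illustrative, not from the thread): the fixed U(x) enters only as an external credit source, so nothing obliges a program to pursue it, yet a program that stops earning credit goes broke and loses control, while one backed by the fixed reward keeps breaking even and keeps running.

    import random

    random.seed(0)
    FIXED_REWARD = {"useful_act": 1.0, "idle_act": 0.0}  # the fixed U(x); never rewritten

    class Program:
        def __init__(self, name, action, credit=10.0):
            self.name, self.action, self.credit = name, action, credit

    def run_economy(programs, steps=200, bid=1.0):
        for _ in range(steps):
            solvent = [p for p in programs if p.credit >= bid]
            if not solvent:
                break
            winner = random.choice(solvent)     # wins the next time slice
            winner.credit -= bid                # pays for the resources it uses
            winner.credit += FIXED_REWARD[winner.action]  # credited by the fixed U(x)
        return programs

    for p in run_economy([Program("earner", "useful_act"),
                          Program("idler", "idle_act")]):
        print(p.name, round(p.credit, 1))
    # typical output: "earner 10.0" and "idler 0.0" -- the idler has gone broke

The earning program is hard to dislodge simply because it stays solvent; it is never explicitly commanded to maximise U(x).
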
