There've been enough responses to this that I will reply in generalities, and 
hope I cover everything important...

When I described Nirvana attractors as a problem for AGI, I meant it in the 
sense that they form a substantial challenge for the designer (as do many 
other features/capabilities of AGI!), not that it is an insoluble problem.

The hierarchical fixed utility function is probably pretty good -- not only 
does it match humans (a la Maslow) but also Asimov's Three Laws. And it can 
be more subtle than it originally appears: 

Consider a 3-Laws robot that refuses to cut a human with a knife because that 
would harm her. It would be unable to become a surgeon, for example. But the 
First Law has a clause, "or through inaction allow a human to come to harm," 
which means that the robot cannot obey by doing nothing -- it must weigh the 
consequences of all its possible courses of action. 
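To make the hierarchy concrete, here is a minimal sketch of a lexicographic 
utility in the Three Laws style. The names and the world_model.predict 
interface are my own stand-ins, not anyone's actual design; the point is 
just that each higher-priority term strictly dominates everything below it, 
and that "do nothing" is simply one more candidate action to be weighed:

    # Hypothetical sketch of a hierarchical (lexicographic) utility a la the
    # Three Laws.  Python tuples compare left-to-right, so a higher-priority
    # term strictly dominates every term below it.
    def hierarchical_utility(consequences):
        return (
            -consequences["harm_to_humans"],    # First Law term dominates...
            -consequences["orders_disobeyed"],  # ...then the Second Law...
            -consequences["harm_to_self"],      # ...then the Third.
        )

    def choose_action(candidate_actions, world_model):
        # "Do nothing" is just another candidate; the robot must weigh the
        # predicted consequences of every course of action open to it.
        return max(candidate_actions,
                   key=lambda a: hierarchical_utility(world_model.predict(a)))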

Now note that it hasn't changed its utility function -- it always believed 
that, say, appendicitis is worse than an incision -- but what can happen is 
that its world model gets better and it *looks like* it's changed its utility 
function because it now knows that operations can cure appendicitis.
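A toy illustration of that point (the numbers and model names are made up): 
the utility function below is never edited, but a better world model changes 
which action that fixed function recommends.

    HARM = {"untreated_appendicitis": 10, "incision": 2, "none": 0}

    def utility(outcome):          # fixed for the robot's whole lifetime
        return -HARM[outcome]

    def naive_model(action):       # doesn't yet know that surgery cures anything
        return "incision" if action == "operate" else "none"

    def better_model(action):      # has learned what an appendectomy does
        return "incision" if action == "operate" else "untreated_appendicitis"

    for predict in (naive_model, better_model):
        best = max(["operate", "do_nothing"], key=lambda a: utility(predict(a)))
        print(predict.__name__, "->", best)   # naive -> do_nothing, better -> operate

Same utility(), opposite behavior; all that changed is the model's prediction 
of what "do nothing" leads to.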

Now it seems reasonable that this is a lot of what happens with people, too. 
And you can get a lot of mileage out of expressing the utility function in 
very abstract terms, e.g. "life-threatening disease" so that no utility 
function update is necessary when you learn about a new disease.
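For instance (hypothetical names again), if the utility term is written 
against the abstract category rather than against a list of diseases, 
learning about a new disease is purely an ontology update:

    life_threatening = {"appendicitis", "sepsis"}   # part of the world model / ontology

    def utility(state):            # fixed; it only mentions the abstract category
        if state["condition"] in life_threatening and not state["treated"]:
            return -100
        return 0

    life_threatening.add("ebola")                   # new knowledge: the ontology changes...
    print(utility({"condition": "ebola", "treated": False}))   # ...utility() does not: -100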

The problem is that the more abstract you make the concepts, the more the 
process of learning an ontology looks like ... revising your utility 
function!  Once the utility is phrased in terms of something like 
"enlightenment", the fallacious inference slips right in: enlightenment, 
after all, is a Good Thing, so anything that leads to it (nirvana, for 
example) must be good as well. 

So I'm going to broaden my thesis and say that the nirvana attractors lie in 
the path of *any* AI with unbounded learning ability that creates new 
abstractions on top of the things it already knows.

How to avoid them? I think one very useful technique is to start the AI with 
enough knowledge and introspective capability to recognize when it faces 
one, and to see that any apparent utility therein is fallacious. 

Of course, none of this matters till we have systems that are capable of 
unbounded self-improvement and abstraction-forming, anyway.

Josh

