> > It doesn't matter what I do with the question. It only matters what
> > an AGI does with it.
> 
> AGI doesn't do anything with the question, you do. You answer the
> question by implementing Friendly AI. FAI is the answer to the question.

The question is: how could one specify Friendliness in such a way that an AI 
will be guaranteed-Friendly? Is your answer to that really just "you build a 
Friendly AI"?  Why do I feel like a dog chasing my own tail?

I've been saying that Friendliness is impossible to implement because (1) it 
is a moving target, changing through time, precisely because (2) its 
definition depends on context (situational context, cultural context, etc). 
In other words, Friendliness is not something that can be hardwired. It can't 
be formalized, coded, designed, implemented, or proved. It is an invention of 
the collective psychology of humankind, and every bit as fuzzy as that 
sounds. At best, it can be approximated. 

I'll put a challenge out to demonstrate my claim. I challenge anyone who 
believes that Friendliness is attainable in principle to construct a scenario 
in which there is a clear right action that does not depend on cultural or 
situational context. If you say, "an AGI is alone in a room with a human; 
that AGI should not kill the human," I say: what if the human in the room 
has just killed a hundred people in cold blood, and will certainly kill 
more? OK, you up the ante: it's a child who hasn't killed anyone. I say: 
yet. The child is carrying an extremely deadly airborne pathogen. So you 
say, OK, fine, the child is healthy. I say: what if the child has asked the 
AI to assist in her suicide? Say the child's father has dishonored the 
family, and in this child's culture, whenever a father does a terrible 
thing, the family is expected to commit suicide. If this child does not 
commit suicide, it will bring even greater dishonor to the extended family, 
who will all be ritually massacred.

You see where I'm going, I hope. You can always construct increasingly 
elaborate scenarios based on nothing but human culture and the valuations 
that go with it. Friendliness *must* take these cultural considerations 
seriously, because that's what a particular culture's morality is based on. 
And if you accept this, you have to see that these valuations change through 
time, that they are essentially invented. From an objective standpoint, the 
best you can do is show that the morality of a particular culture lends 
stability to that collective. But cultural stability does not imply 
preservation of individual life, or human rights in general - they are 
separate concepts.

The only out is if there is such a thing as objective morality... if you can 
specify right from wrong without any reference to a particular set of cultural 
valuations. 

> > If you can't guarantee Friendliness, then self-modifying approaches to
> > AGI should just be abandoned. Do we agree on that?
> 
> More or less, but keeping in mind that "guarantee" doesn't need to be
> a formal proof of absolute certainty. If you can't show that a design
> implements Friendliness, you shouldn't implement it.

What does "guarantee" mean, if not absolute certainty?

Terren

-------------------------------------------
agi
Archives: https://www.listbox.com/member/archive/303/=now
RSS Feed: https://www.listbox.com/member/archive/rss/303/