Agreed, Tim, no sandbox environment can be sufficient for determining
benevolence.

Such an environment can only be a heuristic guide.

We will gather data about an AGI's benevolence from its behavior in the
sandbox, and from our knowledge of its internal state.  And we will make our
best judgment about the AGI's benevolence.  No guarantees.

I have seen no alternative suggestion, in extensive discussions on the SL4
list.

Eliezer has suggested that it is possible to create an AGI system with a
special Friendliness-friendly goal architecture, one that makes
"Friendliness to humans" a more probable outcome than it would be with
Novamente's more flexible goal architecture.  I have read his ideas
carefully and remain unconvinced.

-- Ben

> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]On
> Behalf Of Tim Barnard
> Sent: Thursday, January 10, 2002 1:06 PM
> To: [EMAIL PROTECTED]
> Subject: Re: [agi] Friendliness toward humans
>
>
> > 3) an intention to implement a careful "AGI sandbox" that we
> won't release
> > our AGI from until we're convinced it is genuinely benevolent
> >
> >
> > -- Ben
>
> Unfortunately, what one says and what one's intent is can be two
> completely
> different things. It's unlikely, to my mind, that the "sandbox"
> restriction
> will be a sufficient environment for determining benevolence.
> What would be
> sufficient? Will we require the AGI to take on some kind of physical form
> with which it can demonstrate to us over some period of time its
> benevolent
> characteristics? Hard to say.
>
> Tim
>
>
> -------
> To unsubscribe, change your address, or temporarily deactivate
> your subscription,
> please go to http://v2.listbox.com/member/?[EMAIL PROTECTED]
>
