Terren Suydam <[EMAIL PROTECTED]> was quoted to say:
>I've been saying that Friendliness is impossible to implement because 1)
>it's a moving target (as in, changes through time), since 2) its definition
>is dependent on context (situational context, cultural context, etc).

I think that Friendliness is doing what people want, and an AI would
be able to infer what people want using a process similar to how
people infer what each other want.  What people want depends on
context, but the process by which one could infer what people want
does not depend on context.  Different people want different things,
so the AI uses a weighted average of their utilities as its own
utility.
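The weighted-average idea can be sketched in a few lines. Everything here (the function name, the toy utility functions, the equal weights) is an illustrative assumption, not taken from the actual code at fungible.com:

```python
# Hypothetical sketch: combining per-person utilities into one AI utility
# by taking a weighted average. Names and the toy utility model are
# assumptions for illustration only.

def combined_utility(outcome, utilities, weights):
    """Weighted average of each person's utility for an outcome."""
    total_weight = sum(weights.values())
    return sum(weights[p] * utilities[p](outcome) for p in utilities) / total_weight

# Toy example: two people with opposite preferences over a scalar outcome x.
utilities = {
    "alice": lambda x: x,    # Alice prefers larger x
    "bob":   lambda x: -x,   # Bob prefers smaller x
}
weights = {"alice": 1.0, "bob": 1.0}

# With equal weights, the opposed preferences cancel:
print(combined_utility(3.0, utilities, weights))  # 0.0
```

Under equal weights the AI is indifferent between the two people's opposed wants; unequal weights would tilt its behavior toward whoever is weighted more heavily, which is exactly where the aggregation question gets hard.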

Surely you could land in Mongolia somewhere, watch people behave, and
estimate who wants what.  This problem is solvable, in principle.

>In other words, Friendliness is not something that can be
>hardwired. It can't be formalized, coded, ...

Great!  There's a decision procedure at
http://www.fungible.com/respect/code/index.html.  If you don't
understand it, read the paper at
http://www.fungible.com/respect/paper.html.  Tell me why it's wrong.

Well, actually, I know it's wrong, but the bugs appear fixable.  The
bugs are of the "oops" variety rather than "the concept is
meaningless" variety.  The known bugs are:

* It is subject to a common logical fallacy I call "The Novice
  Philosopher Problem".  See
  http://www.fungible.com/respect/paper.html#novice-philosopher-problem.
  My bogus misdefinition of the word "probability" should go away,
  since solving this issue will likely require moving to real
  probabilities.

* It takes a planning horizon as a parameter, and then at the end of
  the planning horizon it gives people what they want.  At that point
  the estimate of what people want takes into account the
  naturally-occurring, irregular human planning horizon.  So if
  (hypothetically) you're a crack addict who wants your fix right now,
  and the AI is planning a year at a time, you won't be happy with
  what it does for you.  The delayed gratification amounts to a
  spurious implicit claim that long-term planning is morally superior
  to short-term planning, so to minimize conflict with the
  short-sighted, the algorithm should always be run with one timestep
  as the planning horizon.
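The horizon issue above can be made concrete with a toy delay model. This is my own illustrative assumption about the failure mode (wants are credited only at the end of each planning window), not the actual mechanics of the code:

```python
# Hypothetical sketch of the planning-horizon bug: if wants are satisfied
# only at the end of each planning window of length `horizon`, an immediate
# want (the "crack addict" case) waits until the window closes. The delay
# model is an illustrative assumption, not the real decision procedure.

def gratification_delay(want_step, horizon):
    """Timesteps a person waits when a want arising at `want_step`
    is only satisfied at the end of its planning window."""
    window_end = ((want_step // horizon) + 1) * horizon
    return window_end - want_step

# A want at step 0 under a 365-step (one-year) horizon waits 365 steps:
print(gratification_delay(0, 365))  # 365
# Under a one-timestep horizon, it is satisfied at the very next step:
print(gratification_delay(0, 1))    # 1
```

In this toy model, a one-timestep horizon bounds everyone's wait at a single step, which is the rationale for always running the algorithm with the shortest possible horizon.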

* It's not very good at dealing with personal indebtedness.  For
  example, suppose the AI has been told to respect people enough that
  it won't consider stealing.  The present code would only go shopping
  if it could pick out the merchandise and then pay for it all within
  the planning horizon.  This doesn't interact well with the previous
  issue, where we should set the planning horizon to be very short.  I
  think the right fix here is to maintain a debt per person and then
  to define respect in terms of people getting what they are owed.
  The present scheme is like the desired scheme, except that in the
  present scheme the debt is always zero.
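One way the proposed fix could look is a per-person ledger whose outstanding balance feeds into a respect term. The class, method names, and penalty form are all hypothetical sketches of the idea, not the planned implementation:

```python
# Hypothetical sketch of a per-person debt ledger, assuming respect is
# redefined in terms of people eventually getting what they are owed.
# All names here are illustrative, not taken from the actual code.

class DebtLedger:
    def __init__(self):
        self.owed = {}  # person -> amount the AI currently owes them

    def incur(self, person, amount):
        """Record a debt, e.g. picking up merchandise before paying."""
        self.owed[person] = self.owed.get(person, 0.0) + amount

    def settle(self, person, amount):
        """Pay down a debt in a later timestep."""
        self.owed[person] = self.owed.get(person, 0.0) - amount

    def respect_penalty(self):
        """Outstanding debt reduces respect. The present scheme is the
        degenerate case where this is always zero."""
        return sum(max(d, 0.0) for d in self.owed.values())

ledger = DebtLedger()
ledger.incur("merchant", 10.0)   # take the merchandise this timestep
ledger.settle("merchant", 10.0)  # pay for it in a later timestep
print(ledger.respect_penalty())  # 0.0
```

The point of the ledger is that shopping no longer has to fit inside one planning window: the agent can carry a nonzero balance across timesteps as long as the penalty drives it back to zero.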

> It can't be ... designed, implemented ...

I agree that nobody has done that yet.

> It can't be ... proved.

Proving that the decision procedure would actually produce behavior I
like presents a logical puzzle.  The decision procedure itself is the
only formal description of what I like that I have available.  So what
is there to prove?  I wish I knew a better approach to this.

-- 
Tim Freeman               http://www.fungible.com           [EMAIL PROTECTED]


-------------------------------------------
agi
Archives: https://www.listbox.com/member/archive/303/=now