Possible plans considered would be projected forward and assigned a GoodValue; the planner then tries to maximize this score to find optimal paths:
GoodValue = a*alive + b*health + c*wealth + d*enjoyment + e*learning + f*friends + g*pastplans - h*time
where staying alive is paramount right now (a is the highest weight) and each other term contributes: health is staying healthy and undamaged; wealth is money and objects accumulated, minus the cost of an activity; enjoyment covers activities that an entity enjoys; learning is a metric for promoting exploration of new experiences; friends is a general metric for promoting people to like you and for keeping from harming people; pastplans is an indicator of repeated patterns of actions; and time subtracts the amount of time taken by the activity.
So any system would run best by choosing the plan with the highest GoodValue score.
This is only an initial GoodValue equation: its weights are modifiable, and new terms can be added as the AGI goes along.
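A rough sketch of how that weighted-sum scoring could look; the weight values and plan feature names here are illustrative assumptions, not a fixed design:

```python
# Hypothetical weights for the GoodValue terms; "alive" gets the
# highest weight since staying alive is paramount, and "time" is
# subtracted rather than added.
WEIGHTS = {
    "alive": 10.0,     # a
    "health": 5.0,     # b
    "wealth": 2.0,     # c
    "enjoyment": 1.5,  # d
    "learning": 1.0,   # e
    "friends": 1.0,    # f
    "pastplans": 0.5,  # g
    "time": 0.8,       # h (subtracted)
}

def good_value(plan):
    """Score a plan given as a dict whose keys match the equation's terms."""
    score = sum(WEIGHTS[k] * plan.get(k, 0.0)
                for k in WEIGHTS if k != "time")
    return score - WEIGHTS["time"] * plan.get("time", 0.0)

def best_plan(plans):
    """Choose the plan with the highest GoodValue."""
    return max(plans, key=good_value)
```

New terms can then be added just by extending the WEIGHTS dict as the AGI goes along.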
I take my inspiration in part from many old-style MUDs, where there is a fairly rich yet finite world and set of interactions. I think we should take something like this as our model, even though it would not be a full AGI, and then strap on a very advanced learning system that would allow the AGI to acquire any new information it needs about the world through interaction and direction by humans.
The GoodValue above is a measure of what action the AGI should take next.
James Ratcliff
[EMAIL PROTECTED] wrote:
How do you score any given AI System test run?
Dan Goe
----------------------------------------------------
From : James Ratcliff <[EMAIL PROTECTED]>
To : [email protected]
Subject : Re: [agi] Reward versus Punishment? .... Motivational system
Date : Mon, 12 Jun 2006 06:13:45 -0700 (PDT)
> Will,
> Right now I would think that a negative reward would be usable for
this aspect. I am using a positive/negative reward system right now for
the motivational/planning aspects of the AGI.
> So if the AGI is sitting at a desk considering a plan of action that
might hurt itself or another, that plan would get a negative rating, while
another, safer plan might get a higher rating.
> One possibility here as well is to add a small random value to each
score, so that even though a plan has a suboptimal value, it would still be
possible to take that route instead (maybe adding in the value of having a
new experience here as well).
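The random-value idea could be sketched like this; the noise range and the way the score function is passed in are my own assumptions for illustration:

```python
import random

def noisy_choice(plans, score, noise=0.5, seed=None):
    """Pick a plan by its score plus a small random bonus, so a
    slightly suboptimal plan can occasionally be chosen (exploration)."""
    rng = random.Random(seed)
    return max(plans, key=lambda p: score(p) + rng.uniform(0.0, noise))
```

With noise=0.0 this reduces to plain greedy selection of the best-scored plan.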
>
> One important thing we will need here is an entire KR (knowledge
representation) covering all the AGI's past actions, and a way to look back
over those actions, compare them to their expected outcomes, and see why
something is different. (Reflection)
> I.e., if the robot proposes to cross the road at a point, sees it as a
good plan, and does it, but nearly gets hit by a car, it needs to be able
to look back over its actions, determine that something was missing from
its equation, and either add it back in or ask a human for assistance, so
that in the future it can better handle this activity.
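A minimal sketch of that reflection step, assuming (my assumption, not a stated design) that each executed plan is logged with its expected and actual outcome so mismatches can be flagged for re-examination:

```python
# Record each executed plan alongside what was expected and what
# actually happened; "surprises" are the entries worth reflecting on.
def record(log, plan, expected, actual):
    log.append({"plan": plan, "expected": expected, "actual": actual})

def surprises(log):
    """Return past actions whose actual outcome differed from expectation."""
    return [entry for entry in log if entry["expected"] != entry["actual"]]
```

Each surprise is a candidate either for adjusting the scoring equation or for asking a human about.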
>
> James Ratcliff
> On Fri, 09 Jun 2006 19:13:19 -0500, [EMAIL PROTECTED] wrote:
> >
> > What about punishment?
>
> Currently I see it as the programs in control of outputting (and hence
> the ones to get reward), losing the control and the chance to get
> reinforcement. However experiment or better theory would be needed to
> determine whether this is sufficient or negative reward would be
> needed.
>
> Will
>
> -------
> To unsubscribe, change your address, or temporarily deactivate your
subscription,
> please go to http://v2.listbox.com/member/[EMAIL PROTECTED]
>
>
>
> Thank You
> James Ratcliff
> http://FallsTown.com - Local Wichita Falls Community Website
> http://Falazar.com - Personal Website
>
Thank You
James Ratcliff
http://falazar.com
