Hi breznak, I think this is a great approach - we need to consider the motivations that drive and control behavior. One suggestion is to use the term "homeostasis" instead of "osmosis" for internal regulatory functions. Hunger is one example of this: the brain and body attempt to keep an equilibrium that optimizes energy intake and output.
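To make the homeostasis idea concrete, here is a minimal sketch (all names and numbers are hypothetical, not taken from breznak's repo): the score is simply the deviation of the inner states from their set-points, and the agent tries to drive that deviation toward zero.

```python
def homeostatic_score(state, setpoints):
    """Total deviation of inner states from their set-points (lower = closer to equilibrium)."""
    return sum(abs(state[k] - setpoints[k]) for k in setpoints)

# Hypothetical inner states: normalized energy level and body temperature.
setpoints = {"energy": 0.75, "temperature": 37.0}
hungry = {"energy": 0.25, "temperature": 37.0}
fed = {"energy": 0.75, "temperature": 37.0}

print(homeostatic_score(hungry, setpoints))  # 0.5 (far from equilibrium)
print(homeostatic_score(fed, setpoints))     # 0.0 (at equilibrium)
```

Hunger, temperature regulation, etc. then all become the same mechanism: a score to be minimized.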
Tom Portegys

> This mail introduces my experiments with NuPIC on simulating behavior,
> emotions, goals and learning.
>
> It uses a utility encoder:
> https://github.com/breznak/nupic/tree/utility-encoder
> which I'd like to ask you to review, give opinions on, and consider for
> mainline.
> More than for practical uses, I hope this encoder could be an entry point
> to a field of some very interesting experiments with CLAs.
>
> The principle of the encoder is very simple: it performs a kind of
> post-processing of the original input, which is then added to the encoder's
> output as another field (score).
>
> A use case for the encoder is e.g. behavior modeling (which I'll show below).
> A typical example: use a vector encoder where two fields carry the meaning
> of (position X, position Y); at initialization, the encoder is passed a
> user-defined evaluation function, which accepts the input and produces a
> score for it. For this example, the score could be the Euclidean distance to a
> defined target (1,1). The resulting (post-)input would be "[x, y], score",
> which is converted to a bitmap as output.
>
> ===================
>
> The behavior and emotions experiments with NuPIC can be found in my
> https://github.com/breznak/ALife repo.
>
> 1/ Emotions:
> I went on to assume that basic emotions (low level, like hunger, pain,
> "feeling good") can be hardwired into the program, as they are in humans and
> animals, where they are encoded in levels of hormones (adrenalin, ...). Such
> emotions drive "osmosis", where the body wants to keep certain conditions,
> inner states - like feeling hungry, keeping a reasonable temperature, the
> "biological clock for mothers", ...
>
> This is modelled by the utility encoder (above).
>
> Emotions can be used to model higher-level goals as well. Here it loses
> the biological plausibility, but the use of utility still holds.
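The position example above can be sketched as follows. This is a standalone illustration of the scoring idea only; the actual encoder API in the utility-encoder branch may differ, and `make_eval_fn` is a made-up name.

```python
import math

def make_eval_fn(target):
    """Build the user-defined evaluation function passed to the encoder at init."""
    def score(inp):
        x, y = inp
        return math.dist((x, y), target)  # Euclidean distance to the target
    return score

evaluate = make_eval_fn(target=(1, 1))

# Post-processed input: the original fields plus the computed score field,
# which the encoder would then convert to a bitmap alongside x and y.
position = (4, 5)
post_input = [*position, evaluate(position)]
print(post_input)  # [4, 5, 5.0]
```

The encoder itself stays generic: all domain knowledge lives in the evaluation function the user supplies.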
> Such a case
> can be "will to reach a target position, get the highest profit in trades, etc."
>
> 2/ Actions' effects
> Another interesting use is where the creature is discovering its abilities
> (a young baby, a completely new environment [space], or an artificial limb
> ["vision" through a taste gadget for blind people]).
> A similar concept is used in Prolog programming/planning, where actions
> have their prerequisites and effects (e.g. the cranes & cars and
> monkey & banana & box examples).
> This nicely utilizes the concept of the SP (and TP) to learn the effects,
> requirements and changes of actions.
> An example can be: {"hungry", eat, chicken} -> inner-state hunger goes down ->
> high score!
> while {"full", eat, chicken} -> not much improvement in inner states -> medium
> score. And finally: {"extremely hungry", play violin, violin} -> lowers the
> food amount -> very low score.
>
> A stacked-up actions example: the sequence {no food, hungry, walk}
> followed by {have food, hungry, eat} has a high score, while the sequence
> {no food, hungry, eat}, {no food, hungry, walk} does not.
>
> 3/ Behavior
> This is the final stage; it combines the above + some sort of planning.
> It can be described as pursuing the main goal(s) while switching to more
> pressing sub-goals as needed. E.g. "Get from NYC to LA, avoid planes and don't
> die (hunger, being hit by cars, ...)".
>
> The utility map is quite hard to plot, because it actually changes with
> position-action-inner-states (and time).
>
> This is modelled by a behavior agent which perceives the world, keeps an
> inner representation of the explored states (memory - "5 gold on pos [1,5];
> troll on [8,8]"), and has a collection of its inner states (hunger, body
> temperature, oz in the car's gas tank, ...). This agent updates its utility map
> for each {state, inner state, action} taken (similar to reinforcement
> learning).
>
> Like I said, the agent creates a utility map as it consumes its resources
> and moves through the environment.
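The action-scoring idea in 2/ can be sketched like this (a toy illustration with made-up transition numbers, not code from the ALife repo): score each {inner state, action, object} triple by how much it improves the inner state, and record it in a utility map keyed by the triple, as the behavior agent in 3/ does.

```python
# Toy world model: how an action changes the "hunger" inner state
# (0.0 = completely full, 1.0 = starving). Numbers are arbitrary.
def apply_action(hunger, action, obj):
    if action == "eat" and obj == "chicken":
        return max(0.0, hunger - 0.75)  # eating reduces hunger a lot
    if action == "play":
        return hunger + 0.125           # playing burns energy, hunger rises
    return hunger

def score(hunger, action, obj):
    """Higher score = bigger improvement of the inner state."""
    return hunger - apply_action(hunger, action, obj)

# Utility map keyed by {inner state, action, object}, updated per action taken.
utility_map = {}
for hunger, action, obj in [(1.0, "eat", "chicken"),    # "extremely hungry" -> eat
                            (0.125, "eat", "chicken"),  # nearly "full" -> eat
                            (1.0, "play", "violin")]:   # "extremely hungry" -> play
    utility_map[(hunger, action, obj)] = score(hunger, action, obj)

print(utility_map)
# eat-while-starving scores 0.75, eat-while-full only 0.125,
# and play-while-starving scores -0.125 (it makes things worse).
```

The stacked-actions case then falls out naturally: a sequence is good if the sum of its per-step scores is high, which is why {walk to food, then eat} beats {eat with no food, then walk}.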
> Emotions allow shaping its direction
> toward (sub)goals. Progress is made by minimization (or maximization, it
> doesn't matter) of the utility function, following the gradient. Here, the
> "choose the best" step can be done either "artificially" (in a non-biological
> way), or there could be a higher-level region which takes the possible inputs
> and chooses one by the minimum score. (Out of interest, such a minimizing CLA
> would be a nice proof of concept.)
>
> I'd like to hear your further ideas, other examples, flaws in my plan, etc.
> etc. :)
>
> Cheers,
> breznak
>
> --
> Marek Otahal :o)
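The "artificial" choose-the-best step at the end of the quoted mail can be sketched as a plain argmin over candidate actions (hypothetical names and a made-up grid world; the minimizing-CLA variant would replace the `min` call with a region trained to pick the lowest score):

```python
def choose_action(candidates, utility):
    """Greedy gradient step: pick the candidate with the minimum utility score."""
    return min(candidates, key=utility)

# Utility here = squared distance to the goal position (1, 1) on a grid.
def utility(pos):
    x, y = pos
    return (x - 1) ** 2 + (y - 1) ** 2

# Candidate next positions from (4, 5): one step in each direction.
candidates = [(3, 5), (5, 5), (4, 4), (4, 6)]
print(choose_action(candidates, utility))  # (4, 4) - the step that descends the utility
```

Repeating this greedy step is exactly "following the gradient" of the utility map; sub-goal switching would amount to swapping in a different utility function mid-run.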
_______________________________________________
nupic mailing list
[email protected]
http://lists.numenta.org/mailman/listinfo/nupic_lists.numenta.org
