You say.... 

> Likewise, an artificial general 
> intelligence is not "a set of environment states S, a set of actions A, 
> and a set of scalar "rewards" in the Reals".)

What are the basic system modules needed to seed an AI? 

Dan Goe



----------------------------------------------------
From : Richard Loosemore <[EMAIL PROTECTED]>
To : [email protected]
Subject : Re: [agi] AGI bottlenecks
Date : Fri, 09 Jun 2006 10:25:18 -0400
> 
> James,
> 
> It is a little hard to know where to start, to be honest.  Do you have a 
> background in any particular area already, or are you pre-college?  If 
> the latter, and if you are interested in the field in a serious way, I 
> would recommend that you hunt down a good programme in cognitive science 
> (and if possible do software engineering as a minor).  After about three 
> or four years of that, you'll have a better idea of where the below 
> argument was coming from.  Even then, expect to have to argue the heck 
> out of your professors, only believe one tenth of everything they say, 
> and discover your own science as you go along, rather than be told what 
> the answers are.  A lot of the questions do not have answers yet.
> 
> All thinking systems do have a motivation system of some sort (what you 
> were talking about below as "rewards"), but people's ideas about the 
> design of that motivational system vary widely from the implicit and 
> confused to the detailed and convoluted (but not necessarily less 
> confused).  The existence of a motivational system was not the issue in 
> my post:  the issue was exactly *how* you design that motivation system.
> 
> Behaviorism (and reinforcement learning) was a suggestion that took a 
> diabolically simplistic view of how that motivation system is supposed 
> to work .... so simplistic that, in fact, it swept under the carpet all 
> the real issues.  What I was complaining about was a recent revival of 
> interest in the idea of reinforcement learning, in which people were 
> beginning to make the same stupid mistakes that were made 80 years ago, 
> without apparently being aware of what those stupid mistakes were.
> 
> (To give you an analogy that illustrates the problem:  imagine someone 
> waltzes into Detroit and says "It ain't so hard to beat these Japanese 
> car makers:  I mean, a car is just four wheels and a thing that pushes 
> them around.  I could build one of those in my garage and beat the pants 
> off Toyota in a couple of weeks!"   A car is not "four wheels and a 
> thing that pushes them around".  Likewise, an artificial general 
> intelligence is not "a set of environment states S, a set of actions A, 
> and a set of scalar "rewards" in the Reals".)
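> 
> (To make that point concrete: here is a minimal, purely illustrative 
> sketch in Python.  None of these names come from a real system; the 
> point is that the celebrated "formal model" fits in a dozen lines, 
> while every hard part is handed in from outside:
> 
>     import random
> 
>     def q_learning(states, actions, reward_fn, transition,
>                    episodes=100, alpha=0.1, gamma=0.9, epsilon=0.1):
>         # Tabular Q-learning over the given sets S and A.
>         Q = {(s, a): 0.0 for s in states for a in actions}
>         for _ in range(episodes):
>             s = random.choice(states)
>             for _ in range(50):
>                 # epsilon-greedy choice over the action set A
>                 if random.random() < epsilon:
>                     a = random.choice(actions)
>                 else:
>                     a = max(actions, key=lambda act: Q[(s, act)])
>                 s2 = transition(s, a)    # but who carved out the states S?
>                 r = reward_fn(s, a, s2)  # and who decides the rewards?
>                 Q[(s, a)] += alpha * (r + gamma *
>                     max(Q[(s2, b)] for b in actions) - Q[(s, a)])
>                 s = s2
>         return Q
> 
> Everything that would make the system intelligent, namely the choice of 
> states, actions, and rewards, arrives as an argument to the function; 
> it is not part of the formalism at all.)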
> 
> Watching history repeat itself is pretty damned annoying.
> 
> Richard Loosemore
> 
> 
> 
> 
> James Ratcliff wrote:
> > Richard,
> >   Can you explain the second part of this post differently, in other 
> > words?  I am very interested in this as a large part of an AI system.
> > I believe that in some fashion there needs to be a controlling 
> > algorithm that tells the AI that it is doing "Right", be it either an 
> > internal or an external human reward.  We receive these rewards in our 
> > daily lives, in our jobs, relationships, and such; whether we actually 
> > learn from them is to be debated, though.
> > 
> > James Ratcliff
> > 
> > Richard Loosemore <[EMAIL PROTECTED]> wrote:
> > 
> > 
> >     Will,
> > 
> >     Comments taken, but the direction of my critique may have gotten
> >     lost in
> >     the details:
> > 
> >     Suppose I proposed a solution to the problem of unifying quantum
> >     mechanics and gravity, and suppose I came out with a solution that
> >     said that the unified theory involved (a) a specific interface to
> >     quantum theory, which I spell out in great detail, and (b) ditto
> >     for an interface with geometrodynamics, and (c) a linkage
> >     component, to be specified.
> > 
> >     Physicists would laugh at this. What linkage component?! they
> >     would say. And what makes you *believe* that once you sorted out
> >     the linkage component, the two interfaces you just specified would
> >     play any role whatsoever in that linkage component? They would
> >     point out that my "linkage component" was the meat of the theory,
> >     and yet I had referred to it in such a way that it seemed as though
> >     it was just an extra, to be sorted out later.
> > 
> >     This is exactly what happened to Behaviorism, and the idea of
> >     Reinforcement Learning. The one difference was that they did not
> >     explicitly specify an equivalent of my (c) item above: it was for
> >     the cognitive psychologists to come along later and point out that
> >     Reinforcement Learning implicitly assumed that something in the
> >     brain would do the job of deciding when to give rewards, and the
> >     job of deciding what the patterns actually were .... and that that
> >     something was the part doing all the real work. In the case of all
> >     the experiments in the behaviorist literature, the experimenter
> >     substituted for those components, making them less than obvious.
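> > 
> >     (In code, hypothetically, the point looks like this; both
> >     interfaces are invented purely for illustration:
> > 
> >         def conditioning_trial(animal, experimenter):
> >             # The experimenter, not the formalism, recognizes the
> >             # relevant pattern and decides when the reward arrives.
> >             state = experimenter.classify_situation()
> >             response = animal.respond(state)
> >             if experimenter.is_target_behavior(response):
> >                 animal.receive(reward=1.0)
> > 
> >     Take the experimenter away, and both of those jobs fall to the
> >     system itself; building the part that does them *is* building the
> >     intelligence.)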
> > 
> >     Exactly the same critique bears on anyone who suggests that
> >     Reinforcement Learning could be the basis for an AGI. I do not
> >     believe there has, as yet, been any reply to that critique.
> > 
> >     Richard Loosemore
> > 
> > 
> > 
> > 
> > 
> >     William Pearson wrote:
> >      > On 01/06/06, Richard Loosemore wrote:
> >      >
> >      >> I had similar feelings about William Pearson's recent message
> >      >> about systems that use reinforcement learning:
> >      >>
> >      >> >
> >      >> > A reinforcement scenario, from wikipedia is defined as
> >      >> >
> >      >> > "Formally, the basic reinforcement learning model consists 
of: 
> >      >> >
> >      >> > 1. a set of environment states S;
> >      >> > 2. a set of actions A; and
> >      >> > 3. a set of scalar "rewards" in the Reals.
> >      >> > "
> >      >>
> >      >> Here is my standard response to Behaviorism (which is what the
> >      >> above reinforcement learning model actually is): who decides
> >      >> when the rewards should come, and who chooses what the relevant
> >      >> "states" and "actions" are?
> >      >
> >      > The rewards I don't deal with: I am interested in external
> >      > brain add-ons rather than autonomous systems, so the reward
> >      > system will be closely coupled to a human in some fashion.
> >      >
> >      > In the rest of the post I was trying to outline a system that
> >      > could alter what it considered actions and states (and bias,
> >      > learning algorithms, etc.). The RL definition was just there as
> >      > an example to work against.
> >      >
> >      >> If you find out what is doing *that* work, you have found your
> >      >> intelligent system. And it will probably turn out to be so
> >      >> enormously complex, relative to the reinforcement learning part
> >      >> shown above, that the above formalism (assuming it has not been
> >      >> discarded by then) will be almost irrelevant.
> >      >
> >      > The internals of the system will be enormously more complex
> >      > compared to the reinforcement part I described. But that won't
> >      > make it irrelevant. What goes on inside a PC is vastly more
> >      > complex than the system that governs the permissions of what
> >      > each *nix program can do. This doesn't mean the
> >      > permission-governing system is irrelevant.
> >      >
> >      > Like the permissions system in *nix, the reinforcement system is
> >      > only supposed to govern who is allowed to do what, not what
> >      > actually happens. Unlike the permission system, it is supposed
> >      > to get that from the effect of the programs on the environment.
> >      > Without it, both sorts of systems would be highly unstable.
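> >      >
> >      > As a purely hypothetical sketch of that arbiter role (invented
> >      > names, nothing more): the layer below only grants and revokes
> >      > the right to act, while the modules do the actual work.
> >      >
> >      >     class Arbiter:
> >      >         def __init__(self, modules):
> >      >             # standing of each module, like a permission table
> >      >             self.credit = {m: 1.0 for m in modules}
> >      >             self.active = None
> >      >         def act(self, observation):
> >      >             # grant control to the module in best standing
> >      >             self.active = max(self.credit, key=self.credit.get)
> >      >             return self.active.act(observation)
> >      >         def feedback(self, reward):
> >      >             # reward adjusts permissions, not behaviour itself;
> >      >             # a module that keeps failing loses the right to act
> >      >             self.credit[self.active] += reward
> >      >
> >      > Each module supplies its own act(); the arbiter never dictates
> >      > what a module does, only whether it gets to do it.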
> >      >
> >      > I see it as a necessity for complete modular flexibility. If
> >      > you get one of the bits that does the work wrong, or wrong for
> >      > the current environment, how do you allow it to change?
> >      >
> >      >> Just my deux centimes' worth.
> >      >>
> >      >
> >      > Appreciated.
> >      >
> >      >>
> >      >> On a more positive note, I do think it is possible for AGI
> >      >> researchers to work together within a common formalism. My
> >      >> presentation at the AGIRI workshop was about that, and when I
> >      >> get the paper version of the talk finalized I will post it
> >      >> somewhere.
> >      >>
> >      >
> >      > I'll be interested, but sceptical.
> >      >
> >      > Will
> > 
> > Thank You
> > James Ratcliff
> > http://FallsTown.com - Local Wichita Falls Community Website
> > http://Falazar.com - Personal Website
> 

