Re: [agi] AGI bottlenecks

Richard Loosemore Fri, 09 Jun 2006 07:27:03 -0700


James,

It is a little hard to know where to start, to be honest. Do you have a background in any particular area already, or are you pre-college? If the latter, and if you are interested in the field in a serious way, I would recommend that you hunt down a good programme in cognitive science (and if possible do software engineering as a minor). After about three or four years of that, you'll have a better idea of where the argument below was coming from. Even then, expect to have to argue the heck out of your professors, only believe one tenth of everything they say, and discover your own science as you go along, rather than be told what the answers are. A lot of the questions do not have answers yet.

All thinking systems do have a motivation system of some sort (what you were talking about below as "rewards"), but people's ideas about the design of that motivational system vary widely from the implicit and confused to the detailed and convoluted (but not necessarily less confused). The existence of a motivational system was not the issue in my post: the issue was exactly *how* you design that motivation system.

Behaviorism (and reinforcement learning) was a suggestion that took a diabolically simplistic view of how that motivation system is supposed to work .... so simplistic that, in fact, it swept under the carpet all the real issues. What I was complaining of was a recent revival of interest in the idea of reinforcement learning, in which people were beginning to make the same stupid mistakes that were made 80 years ago, without apparently being aware of what those stupid mistakes were.

(To give you an analogy that illustrates the problem: imagine someone waltzes into Detroit and says "It ain't so hard to beat these Japanese car makers: I mean, a car is just four wheels and a thing that pushes them around. I could build one of those in my garage and beat the pants off Toyota in a couple of weeks!" A car is not "four wheels and a thing that pushes them around". Likewise, an artificial general intelligence is not "a set of environment states S, a set of actions A, and a set of scalar "rewards" in the Reals".)
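
As a concrete illustration of how little the quoted three-part formalism specifies, here is a minimal sketch (not anything proposed in this thread) of tabular Q-learning, a standard reinforcement learning algorithm, on a hypothetical five-cell corridor. The corridor, the function names, and the parameter values are all assumptions made for illustration. Note that the learner receives the state set, the action set, and the reward function ready-made as arguments; it has no say in any of them.

```python
import random
from typing import Callable, Dict, List, Tuple

State = int
Action = str
QTable = Dict[Tuple[State, Action], float]


def q_learning(
    states: List[State],
    actions: List[Action],
    step: Callable[[State, Action], State],
    reward: Callable[[State, Action, State], float],
    is_terminal: Callable[[State], bool],
    episodes: int = 500,
    alpha: float = 0.5,
    gamma: float = 0.9,
    epsilon: float = 0.2,
    seed: int = 0,
) -> QTable:
    """Tabular Q-learning. S, A and the reward signal are all supplied by the caller."""
    rng = random.Random(seed)
    q: QTable = {(s, a): 0.0 for s in states for a in actions}
    for _ in range(episodes):
        s = states[0]
        for _ in range(100):  # cap on episode length
            if is_terminal(s):
                break
            if rng.random() < epsilon:
                a = rng.choice(actions)  # explore
            else:
                best = max(q[(s, b)] for b in actions)
                a = rng.choice([b for b in actions if q[(s, b)] == best])  # exploit, random tie-break
            s_next = step(s, a)
            r = reward(s, a, s_next)
            future = 0.0 if is_terminal(s_next) else max(q[(s_next, b)] for b in actions)
            q[(s, a)] += alpha * (r + gamma * future - q[(s, a)])
            s = s_next
    return q


def greedy_policy(q: QTable, states: List[State], actions: List[Action]) -> Dict[State, Action]:
    """Pick the highest-valued action in each state."""
    return {s: max(actions, key=lambda a: q[(s, a)]) for s in states}


# Everything below is hand-written by the designer (hypothetical toy corridor).
STATES = [0, 1, 2, 3, 4]       # someone decided that "position" is the state
ACTIONS = ["left", "right"]    # someone decided what counts as an action


def step(s: State, a: Action) -> State:
    return max(0, s - 1) if a == "left" else min(4, s + 1)


def reward(s: State, a: Action, s_next: State) -> float:
    return 1.0 if s_next == 4 else 0.0  # someone decided when a reward arrives


def is_terminal(s: State) -> bool:
    return s == 4


if __name__ == "__main__":
    q = q_learning(STATES, ACTIONS, step, reward, is_terminal)
    print(greedy_policy(q, STATES[:-1], ACTIONS))
```

In this toy the learned policy is simply "go right". Everything that determines that outcome (which cells count as states, which moves count as actions, and when a reward is paid) sits in the hand-written functions passed in, which is the part the critique says is doing the real work.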


Watching history repeat itself is pretty damned annoying.

Richard Loosemore




James Ratcliff wrote:

Richard,

Can you explain the second part of this post differently, in other words? I am very interested in this as a large part of an AI system. I believe that in some fashion there needs to be a controlling algorithm that tells the AI that it is doing "Right", be it an internal or an external human reward. We receive these rewards in our daily lives, in our jobs, relationships and such; whether we actually learn from these is to be debated, though.


James Ratcliff

*/Richard Loosemore <[EMAIL PROTECTED]>/* wrote:


    Will,

    Comments taken, but the direction of my critique may have gotten lost in
    the details:

    Suppose I proposed a solution to the problem of unifying quantum
    mechanics and gravity, and suppose I came out with a solution that said
    that the unified theory involved (a) a specific interface to quantum
    theory, which I spell out in great detail, and (b) ditto for an
    interface with geometrodynamics, and (c) a linkage component, to be
    specified.

    Physicists would laugh at this. What linkage component?! they would
    say. And what makes you *believe* that once you sorted out the linkage
    component, the two interfaces you just specified would play any role
    whatsoever in that linkage component? They would point out that my
    "linkage component" was the meat of the theory, and yet I had referred
    to it in such a way that it seemed as though it was just an extra, to be
    sorted out later.

    This is exactly what happened to Behaviorism, and the idea of
    Reinforcement Learning. The one difference was that they did not
    explicitly specify an equivalent of my (c) item above: it was for the
    cognitive psychologists to come along later and point out that
    Reinforcement Learning implicitly assumed that something in the brain
    would do the job of deciding when to give rewards, and the job of
    deciding what the patterns actually were .... and that that something
    was the part doing all the real work. In the case of all the
    experiments in the behaviorist literature, the experimenter substituted
    for those components, making them less than obvious.

    Exactly the same critique bears on anyone who suggests that
    Reinforcement Learning could be the basis for an AGI. I do not believe
    there is, even now, any reply to that critique.

    Richard Loosemore





    William Pearson wrote:
     > On 01/06/06, Richard Loosemore wrote:
     >
     >> I had similar feelings about William Pearson's recent message about
     >> systems that use reinforcement learning:
     >>
     >> >
     >> > A reinforcement scenario, from wikipedia is defined as
     >> >
     >> > "Formally, the basic reinforcement learning model consists of:
     >> >
     >> > 1. a set of environment states S;
     >> > 2. a set of actions A; and
     >> > 3. a set of scalar "rewards" in the Reals.
     >> > "
     >>
     >> Here is my standard response to Behaviorism (which is what the above
     >> reinforcement learning model actually is): Who decides when the rewards
     >> should come, and who chooses what are the relevant "states" and
     >> "actions"?
     >
     > The rewards I don't deal with; I am interested in external brain
     > add-ons rather than autonomous systems, so the reward system will be
     > closely coupled to a human in some fashion.
     >
     > In the rest of the post I was trying to outline a system that could alter
     > what it considered actions and states (and bias, learning algorithms
     > etc). The RL definition was just there as an example to work against.
     >
     >> If you find out what is doing *that* work, you have found your
     >> intelligent system. And it will probably turn out to be so enormously
     >> complex, relative to the reinforcement learning part shown above, that
     >> the above formalism (assuming it has not been discarded by then) will be
     >> almost irrelevant.
     >
     > The internals of the system will be enormously more complex compared
     > to the reinforcement part I described. But that won't make it
     > irrelevant. What goes on inside a PC is vastly more complex than the
     > system that governs the permissions of what each *nix program can do.
     > This doesn't mean the permission governing system is irrelevant.
     >
     > Like the permissions system in *nix, the reinforcement system is
     > only supposed to govern who is allowed to do what, not what actually
     > happens. Unlike the permission system, it is supposed to get that from
     > the effect of the programs on the environment. Without it both sorts
     > of systems would be highly unstable.
     >
     > I see it as a necessity for complete modular flexibility. If you get
     > one of the bits that does the work wrong, or wrong for the current
     > environment, how do you allow it to change?
     >
     >> Just my deux centimes' worth.
     >>
     >
     > Appreciated.
     >
     >>
     >> On a more positive note, I do think it is possible for AGI researchers
     >> to work together within a common formalism. My presentation at the
     >> AGIRI workshop was about that, and when I get the paper version of the
     >> talk finalized I will post it somewhere.
     >>
     >
     > I'll be interested, but sceptical.
     >
     > Will
     >
     > -------
     > To unsubscribe, change your address, or temporarily deactivate your
     > subscription, please go to
     > http://v2.listbox.com/member/[EMAIL PROTECTED]
     >
     >





Thank You
James Ratcliff
http://FallsTown.com - Local Wichita Falls Community Website
http://Falazar.com - Personal Website
