Steve and Ben,

We humans have a scoring system that supervises our learning, too. Our
"score" is the net level of satisfaction we experience at any given moment.
Contributions to this score include pleasure, pain, and
homeostasis-modulated signals such as hunger pangs, thirst, etc. There are
also less tangible contributions such as social status/approval,
self-sufficiency, power, morality, and self-determination. The more
tangible ones are mostly computed by hardwired mechanisms in the brain
which do not themselves learn or adapt. The less tangible ones typically
rely to some degree on the state of the world model constructed by the
areas of the brain which do learn and adapt, but the polarity (and probably
magnitude) of the contribution is hardwired.
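
To make that concrete, here is a minimal sketch of the kind of "score" I
mean. Everything in it is an assumption for illustration: the contribution
names, the weights, and the state keys are hypothetical, not measurements.
The only point is that each contribution has a fixed sign and size, while
some of the signals it reads come from a learned world model.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# All names and numbers below are hypothetical, chosen only for illustration.
State = Dict[str, float]

@dataclass(frozen=True)
class Contribution:
    name: str
    weight: float                     # hardwired polarity (sign) and magnitude
    signal: Callable[[State], float]  # reads body state or world-model state

CONTRIBUTIONS = (
    # More tangible: computed directly from body state.
    Contribution("pleasure", +1.0, lambda s: s["pleasure"]),
    Contribution("pain", -2.0, lambda s: s["pain"]),
    Contribution("hunger", -1.0, lambda s: s["hunger"]),
    # Less tangible: the signal comes from the learned world model,
    # but the weight applied to it is still fixed.
    Contribution("social_approval", +0.5, lambda s: s["believed_approval"]),
)

def score(state: State) -> float:
    """Sum of the fixed-weight contributions at one moment."""
    return sum(c.weight * c.signal(state) for c in CONTRIBUTIONS)

state = {"pleasure": 0.2, "pain": 0.0, "hunger": 0.6, "believed_approval": 0.8}
hurt = dict(state, pain=0.5)
print(score(state), score(hurt))  # the second value is lower: pain has negative polarity
```

Learning can change what the world model reports for "believed_approval",
but in this sketch nothing can change the weights themselves.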

The functionality of the brain quite plainly incorporates both supervised
and unsupervised learning. Our understanding and perceptual processing are
largely handled by unsupervised learning mechanisms which serve to
construct a (relatively) unbiased model for reality, whereas our attention
and behavior are largely handled by supervised learning mechanisms that
attempt to optimize the "score" I described above, making use of that
constructed world model to do so.

Looking at evolution, the designer of our minds, we can glean some evident
design constraints which corroborate this viewpoint. The brain isn't just
an extra part thrown in for no reason; otherwise, evolution would have
eliminated such a costly and useless burden. The brain serves to coordinate
behavior, ensuring that certain conditions are maintained and certain goals
are met which maximize the likelihood of survival and subsequent successful
reproduction. Each of the contributions to the "score" I described above
maps obviously to either an invariant that aids survival or a measure whose
optimization correlates with survival and/or reproduction. The unsupervised
learning mechanisms, on the other hand, serve to generate a model for
reality which aids the supervised learning mechanisms in more effectively
accomplishing homeostasis or optimization, and which also supports the
measurement of the less tangible factors in the "score". They do this by
reducing the dimensionality of the problem and extracting the components of
the sensory data stream that are most useful for prediction.
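
Here is a toy sketch of that division of labor, under loud assumptions: the
4-D "senses", the single hidden cause, the noise level, and the reward rule
are all made up, and I am standing in for the unsupervised part with plain
PCA (by power iteration) and for the score-driven part with an
epsilon-greedy value table. It is not a claim about how the brain or Deep
Mind's system actually does either job.

```python
import random

rng = random.Random(0)
DIRECTION = [0.5, -0.5, 0.5, 0.5]  # hypothetical hidden structure in the senses

def sense(latent):
    """A noisy 4-D observation driven by a single hidden cause (+1 or -1)."""
    return [latent * d + rng.gauss(0.0, 0.2) for d in DIRECTION]

def top_component(data, iters=50):
    """Unsupervised stage: first principal component by power iteration."""
    n, dim = len(data), len(data[0])
    mean = [sum(row[j] for row in data) / n for j in range(dim)]
    centered = [[row[j] - mean[j] for j in range(dim)] for row in data]
    cov = [[sum(r[i] * r[j] for r in centered) / n for j in range(dim)]
           for i in range(dim)]
    v = [1.0] * dim
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(dim)) for i in range(dim)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return mean, v

# Stage 1: build the "world model" from raw observations only; no score is seen.
data = [sense(rng.choice([-1, 1])) for _ in range(500)]
mean, v = top_component(data)

def bucket(obs):
    """Reduce a 4-D observation to one bit: the sign of its 1-D projection."""
    return int(sum((o - m) * c for o, m, c in zip(obs, mean, v)) > 0)

# Stage 2: score-driven learning on top of the reduced representation.
values = {(b, a): 0.0 for b in (0, 1) for a in (0, 1)}
counts = {key: 0 for key in values}

def greedy(b):
    return max((0, 1), key=lambda a: values[(b, a)])

for _ in range(2000):
    latent = rng.choice([-1, 1])
    b = bucket(sense(latent))
    a = rng.randrange(2) if rng.random() < 0.1 else greedy(b)
    reward = 1.0 if (a == 1) == (latent == 1) else 0.0  # the "score"
    counts[(b, a)] += 1
    values[(b, a)] += (reward - values[(b, a)]) / counts[(b, a)]

# Evaluate the learned greedy policy on fresh observations.
hits = 0
for _ in range(200):
    latent = rng.choice([-1, 1])
    hits += (greedy(bucket(sense(latent))) == 1) == (latent == 1)
accuracy = hits / 200
print(accuracy)
```

The sign of the principal component is arbitrary, which is fine here: the
score-driven stage learns which side of the projection goes with which
action. The unsupervised stage never needs to know what the score is.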

It seems a bit arbitrary to me to discount the accomplishments of Deep Mind
on the basis that it is "just" supervised learning that makes use of some
unsupervised learning. Yes, they need much beefier unsupervised learning to
build a comprehensive world model in support of the RL, but that
score-based "supervised" component is actually vital to intelligence.


On Wed, Oct 21, 2015 at 4:18 AM, Ben Goertzel <[email protected]> wrote:

>
> Well, most supervised deep learning algorithms these days involve some
> unsupervised pre-training of the network...
>
> But the crux of Deep Mind's video-game demonstration was RL where the game
> score was the "supervision" providing the utility function, yeah...
>
> I should add that Deep Mind is doing a huge amount of other stuff besides
> this game-focused stuff -- that just happens to be the aspect of their work
> that yields the funkiest demos...
>
> ben
>
> On Wed, Oct 21, 2015 at 5:15 PM, Steve Richfield <
> [email protected]> wrote:
>
>> Ben,
>>
>> Am I reading this right - that this is "just" a VERY good demo of
>> SUPERVISED learning - with no unsupervised components?
>>
>> Steve
>> =========
>>
>> On Tue, Oct 20, 2015 at 11:19 PM, Ben Goertzel <[email protected]> wrote:
>>
>>>
>>> On Wed, Oct 21, 2015 at 1:54 PM, Benjamin Kapp <[email protected]>
>>> wrote:
>>>
>>>> They are using reinforcement learning to train their system.  But one
>>>> of the problems with this is that it is dependent on a reward/punishment
>>>> system which for them is determined by game scores.  But in the real world
>>>> there is no game score.  Also in the game world game score is temporally
>>>> closely related to the actions the agent performs.  However in the real
>>>> world rewards and punishments may be delayed by a great deal of time (if
>>>> they are ever given).
>>>>
>>>
>>>
>>> Demis, Shane and the other Deep Mind folks are well aware of these
>>> issues, of course...
>>>
>>> A fallacy I commonly see is that people like to compare their own ideas,
>>> with other people's practical demonstrations...   Of course Deep Mind's
>>> practical demos, so far, embody only a small fraction of their ideas and
>>> understanding...
>>>
>>>
>>>>
>>>> Further, Demis says in his talk they assume humans gain knowledge from
>>>> experience, however the poverty of the stimulus argument proposed by
>>>> Chomsky demonstrates clearly that humans acquire language faster than is
>>>> possible given the limited stimuli they are exposed to.  As such (for some
>>>> kinds of knowledge of the world at least) it seems that human knowledge
>>>> acquisition is due in no small part to a priori instinctual knowledge,
>>>> something they do not seem to be representing in their system.
>>>>
>>>
>>>
>>> While I agree that humans have some inborn "inductive bias" as well as
>>> some specific hard-wired skills, the extent and nature of this bias and
>>> hard-wiring in the context of language is certainly not well-understood
>>> currently.   Derek Bickerton's writing on this topic has often been
>>> interesting....  I note that using a deep network with a specific
>>> architecture for language understanding is also a way of coding "inductive
>>> bias" into one's architecture....   It is not clear how specific are the
>>> inductive biases for language encoded into the human brain...  Old-style
>>> Chomskian "principles and parameters" ideas are clearly not correct in
>>> detail...
>>>
>>>
>>>>
>>>> Also even if Demis is up to speed on all the latest knowledge from the
>>>> domains of the mind sciences (which he likely is not), it wouldn't be the
>>>> case that he would know how the brain functions deterministically since
>>>> this is still outside the scope of human knowledge.  As such he and his
>>>> team can only guess at how the brain does intelligence.
>>>>
>>>
>>>
>>> Demis is a top-grade neuroscientist as well as an AGI guy, and many of
>>> the Deep Mind folks are deeply into cognitive neuroscience.  They know the
>>> latest research....   Which means they also know how incomplete that
>>> research is, yeah...
>>>
>>> -- Ben
>>> *AGI* | Archives <https://www.listbox.com/member/archive/303/=now>
>>> <https://www.listbox.com/member/archive/rss/303/10443978-6f4c28ac> |
>>> Modify <https://www.listbox.com/member/?&;> Your Subscription
>>> <http://www.listbox.com>
>>>
>>
>>
>>
>> --
>> Full employment can be had with the stroke of a pen. Simply institute a
>> six hour workday. That will easily create enough new jobs to bring back
>> full employment.
>>
>>
>
>
>
> --
> Ben Goertzel, PhD
> http://goertzel.org
>
> "The reasonable man adapts himself to the world: the unreasonable one
> persists in trying to adapt the world to himself. Therefore all progress
> depends on the unreasonable man." -- George Bernard Shaw
>


