To put it bluntly: isn't the concept of reinforcement learning (as typically
described) exactly the kind of simplistic AI model that AGI tries to
distance itself from? Included below is a relevant section of a mail I
previously sent to this mailing list. In short, I do not believe AGI is
possible based on one single reinforcement loop, so studying such loops in
isolation is of little use (for more detail, read my section about parallel
adaptation). Also, I do not know whether the document you showed gives any
clues about how to scale the method described, but scalability is what
matters most for any AGI learning algorithm.

I also do not agree that learning about unscalable AI methods is a good way
to start learning about AGI. When learning a more scalable methodology, it
is sometimes necessary to force oneself not to think in terms of unscalable
methods.

For example, when artists learn to draw portraits, they need to acquire a
drawing skill that scales to the whole face and the whole body they are
about to draw. It is not enough to get an ear and a nose right if they are
not in the correct relative positions! To achieve this, art teachers need to
force their students to stop thinking about the details, and there are
various methods for doing so. In model drawing, for example, there is often
a time limit that forces the student to cover the entire canvas very
quickly, leaving no time to care about details. Art students also use cruder
tools, such as charcoal, which gives little room for expressing detail. Only
when fully trained can the student lay out a complex scene abstractly, and
only then is it time to care about details!

So if anything, I feel reinforcement learning is the completely wrong thing
to start with for someone wanting to learn about AGI. Reading about systems
like Novamente (which I found very interesting, by the way; maybe I will
write some comments on it later) would be a much better start, I feel.

/Robert Wensman

...

*The principle of parallel adaptation*
As any programmer knows, premature optimisation is often a bad idea, but on
the other hand some optimisation principles are of such importance that it
is necessary to take them into consideration from the very outset. One such
principle is that of *parallel adaptation*. If we are to build an adaptive
system working on any sufficiently complex domain, adaptation may be
infeasible unless we can divide the feedback stream into smaller ones that
can be considered independently.

Assume, for example, that a certain system has a state of 32 bits that is to
be adapted based on some feedback stream. Assume also that there is very
little we can assume about the feedback when creating the system, so
basically we just have to test different states at random and analyse the
feedback stream. The time to adapt such a system could hence be on the order
of 2^32 trials. But if we can instead divide the feedback into two streams,
each giving feedback on 16 bits, then the time to adapt is suddenly on the
order of 2 * 2^16 = 2^17 trials. This is a performance gain of such
magnitude that we cannot ignore it, and if evolution produces anywhere near
optimal solutions, we can be almost certain that our own brain exploits this
principle somehow, if at all possible.
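The speed-up can be demonstrated in a few lines of Python (a toy
illustration of my own; the function names and the use of random sampling as
the "adaptation" are assumptions, not part of the original argument). A
16-bit state is adapted once with a single feedback stream over the whole
state, and once with two independent streams over the two 8-bit halves:

```python
import random

def search(n_bits, matches, rng):
    """Draw random n-bit states until matches(state) is true;
    return (state, number_of_trials). Expected trials ~ 2^n_bits."""
    trials = 0
    while True:
        trials += 1
        state = rng.getrandbits(n_bits)
        if matches(state):
            return state, trials

def monolithic_adapt(n_bits, target, rng):
    # One feedback stream over the whole state: on the order of 2^n trials.
    return search(n_bits, lambda s: s == target, rng)

def factored_adapt(n_bits, target, rng):
    # Two independent feedback streams, one per half:
    # on the order of 2 * 2^(n/2) trials.
    half = n_bits // 2
    mask = (1 << half) - 1
    lo, t_lo = search(half, lambda s: s == (target & mask), rng)
    hi, t_hi = search(half, lambda s: s == (target >> half), rng)
    return (hi << half) | lo, t_lo + t_hi

rng = random.Random(0)
target = rng.getrandbits(16)
_, t_whole = monolithic_adapt(16, target, rng)
_, t_split = factored_adapt(16, target, rng)
print(t_whole, t_split)  # the factored search typically needs far fewer trials
```

The same 2^n versus 2 * 2^(n/2) gap is what makes the 32-bit case in the
text go from 2^32 down to 2^17.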

As you might have noticed, my definition of AGI and the assumptions I make
about the world are influenced by this principle. For example, I find it
necessary for an AGI system to make a clear distinction between knowledge
about the world and ideas about how to act, partly because mixing them
could lead to an infeasible adaptation problem. Also, I think my assumption
(3) about the world, namely that the world consists of a number of objects
that can be understood independently, is a key point in any attempt to
build an effective AGI.

When I was a student I often met AI researchers and other students
interested in AI who believed it would be impossible to separate behaviour
from knowledge in a true AGI, and many held the belief that if we were to
build a true AGI, it would be impossible for us to understand its internal
structure anyway. Essentially, their approach points towards one single
feedback stream for the entire AGI, much like the simple neural networks
that were so popular in the media a couple of years ago. But there is a risk
that if we do not take parallelization of adaptation seriously enough, the
adaptation process for a large class of AGI designs becomes infeasible, and
nothing is going to happen in a million years!

When it comes to assumption 3 about the world, I have only roughly
envisioned some ways to utilize it. It seems that if memes are used to
describe the objects of the world, then we need some kind of mechanism that
tries to guess which objects are present in a given situation and which are
not, so that any adaptation of the world model is targeted at the right
objects. This is what I would call *spatial meme adaptation*.

To elaborate further on this idea, assume that the AGI keeps track of *n*
situations called *spatials*, and that for each such spatial there is a
*frequency function* that maps memes to a frequency: a real value between 0
and 1 that represents how likely different objects are to occur in this
specific situation, in effect defining the profile of a population.
Alternatively, each spatial could explicitly contain its own population of
memeplexes. For any given input the AGI would then perform the following
steps.

1. Choose a spatial.

2. Perform evolutionary adaptation of the chosen spatial.
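The two steps above can be sketched in Python. This is only a guess at one
possible realization: the meme names, the likelihood-based choice of
spatial, and the frequency update rule are all my own assumptions, since
the mail specifies nothing beyond the two steps.

```python
MEMES = ["ear", "nose", "tree", "car"]  # hypothetical meme inventory

class Spatial:
    """One tracked situation, with a frequency function over memes."""
    def __init__(self):
        # Frequency function: meme -> probability of occurring here.
        self.freq = {m: 0.5 for m in MEMES}

    def likelihood(self, observed):
        # How well this spatial's profile explains the observed memes.
        p = 1.0
        for m in MEMES:
            p *= self.freq[m] if m in observed else (1.0 - self.freq[m])
        return p

    def adapt(self, observed, rate=0.2):
        # Step 2: nudge the profile toward the observation.
        for m in MEMES:
            seen = 1.0 if m in observed else 0.0
            self.freq[m] += rate * (seen - self.freq[m])

def step(spatials, observed):
    # Step 1: choose the spatial most likely to contain these objects,
    # so adaptation is targeted at the right objects only.
    best = max(spatials, key=lambda s: s.likelihood(observed))
    best.adapt(observed)
    return best

spatials = [Spatial() for _ in range(3)]
for _ in range(50):
    step(spatials, {"ear", "nose"})   # a "face" situation recurs
face = step(spatials, {"ear", "nose"})
print(face.freq["ear"])  # driven toward 1.0
```

Because the same spatial keeps winning the likelihood comparison, the
recurring situation captures one spatial's profile while the others stay
free for different situations, which is the targeting effect described
above.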

There might be a lot of research threads to follow here: for example, is
spatial meme adaptation feasible at all? What are the different ways to
organize the spatials? And would it make sense to have recursive spatials?
(Judging from the nesting capability of human language, some recursive
mechanism in the brain is limited to 2-3 steps; could it be recursive
spatials?) It can be noted that the idea of recursive spatial meme
adaptation has structural similarities with Marvin Minsky's "Society of
Mind" idea, but focuses more on adaptation and world modelling than on
behavior.



...

2007/6/22, Lukasz Stafiniak <[EMAIL PROTECTED]>:

Obligatory reading:
http://www.cs.ualberta.ca/~sutton/book/ebook/the-book.html

Cheers.

-----
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?&;


