Chris Spencer wrote:
On 12/19/06, Douglas S. Blank <[EMAIL PROTECTED]> wrote:
Chris,
Yes, there is the beginning of some support in Conx for similar
techniques. You can at least use this as an example. Take a look at the
SigmaNetwork in pyrobot/brain/conx.py. It is an example of a type of
CRBP (complementary reinforcement backprop). It works like this:
[snip]
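For readers coming to the archive without the snipped explanation: the core CRBP idea (from Ackley and Littman's complementary reinforcement back-propagation) is that the network's outputs are stochastic binary units; when the sampled output is rewarded, you backprop toward that output, and when it fails, you backprop (with a smaller step) toward its complement. The sketch below is not the pyrobot/Conx code -- it is a minimal, library-free illustration on XOR, and every name, layer size, and learning rate in it is my own choice:

```python
# Minimal CRBP (complementary reinforcement backprop) sketch on XOR.
# Hypothetical illustration only -- not the actual Conx SigmaNetwork code.
import math
import random

random.seed(0)

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def crbp_target(sampled, reward):
    """The CRBP rule: reinforce the sampled output on success,
    its complement on failure."""
    return sampled if reward > 0 else 1 - sampled

N_HIDDEN = 8
# One hidden layer; each hidden row holds [w_x0, w_x1, bias].
w1 = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(N_HIDDEN)]
w2 = [random.uniform(-1, 1) for _ in range(N_HIDDEN + 1)]  # hidden weights + bias

def forward(x):
    h = [sigmoid(w[0] * x[0] + w[1] * x[1] + w[2]) for w in w1]
    p = sigmoid(sum(w2[i] * h[i] for i in range(N_HIDDEN)) + w2[-1])
    return h, p

PATTERNS = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

for epoch in range(5000):
    for x, answer in PATTERNS:
        h, p = forward(x)
        sampled = 1 if random.random() < p else 0    # stochastic binary output
        reward = 1.0 if sampled == answer else 0.0   # environment's evaluation
        t = crbp_target(sampled, reward)
        lr = 0.5 if reward > 0 else 0.2              # smaller step on failure
        # Ordinary backprop, but toward the CRBP target, not a supervised one.
        d_out = (t - p) * p * (1 - p)
        for i in range(N_HIDDEN):
            d_h = d_out * w2[i] * h[i] * (1 - h[i])
            w2[i] += lr * d_out * h[i]
            w1[i][0] += lr * d_h * x[0]
            w1[i][1] += lr * d_h * x[1]
            w1[i][2] += lr * d_h
        w2[-1] += lr * d_out

correct = sum(1 for x, answer in PATTERNS
              if (forward(x)[1] > 0.5) == bool(answer))
print("patterns solved:", correct, "of 4")
```

Because the reward here is simply "sampled output was right," the complement rule always ends up pointing at the true XOR answer, so the run usually solves all four patterns; training remains stochastic, though, since the targets depend on sampled outputs.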
Thanks for the great example. However, is it possible for SigmaNetwork
to train multiple distinct outputs? In your example, you have 11
output nodes attempting to approximate XOR. Suppose I wanted a network
that would approximate a policy function, requiring that each output
node represent a unique action. Am I right in thinking that
SigmaNetwork would be unable to perform this task, since it requires
that all the output nodes approximate a single target?
The way it is written, yes, you are right: the whole output layer
computes a single value. But you should be able to adapt the
SigmaNetwork source so that each subset of output nodes computes one
value; the output layer as a whole could then compute as many values as
you want (with overlapping subsets, too, if you wished).
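To make the subset idea concrete: one way to read a per-action policy out of such a layer is to partition the output activations into blocks and average each block into one value. This is only a sketch of the decoding step (the names `N_PER_GROUP`, `ACTIONS`, and `decode_policy` are mine, not Conx's):

```python
# Hypothetical sketch: one value per action from a partitioned output layer.
N_PER_GROUP = 4
ACTIONS = ["left", "right", "forward"]

def decode_policy(outputs):
    """Average each contiguous block of output activations into one
    value per action; overlapping subsets would work the same way."""
    assert len(outputs) == N_PER_GROUP * len(ACTIONS)
    values = {}
    for i, action in enumerate(ACTIONS):
        block = outputs[i * N_PER_GROUP:(i + 1) * N_PER_GROUP]
        values[action] = sum(block) / N_PER_GROUP
    return values

outputs = [0.9, 0.8, 0.7, 1.0,   # "left" block
           0.1, 0.2, 0.0, 0.1,   # "right" block
           0.5, 0.5, 0.4, 0.6]   # "forward" block
policy = decode_policy(outputs)
best = max(policy, key=policy.get)
print(best)  # -> left (averages: left 0.85, right 0.1, forward 0.5)
```

Using several units per value keeps the averaged estimate smoother than a single stochastic unit would be, which is the same reason the XOR example uses multiple output nodes for one target.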
Also, are you aware of any studies comparing CRBP to TD-Lambda? It
would be interesting to know which performs better.
I don't know of any, but I would be suspicious of some general statement
about one always performing better than the other.
-Doug
Regards,
Chris
_______________________________________________
Pyro-users mailing list
[email protected]
http://emergent.brynmawr.edu/mailman/listinfo/pyro-users