Chris S wrote:
Can a Conx NN currently be trained via reinforcement learning
techniques, such as TD(lambda)?

Chris,

Yes, Conx has the beginnings of support for similar techniques, and you can at least use it as an example. Take a look at the SigmaNetwork class in pyrobot/brain/conx.py. It is an example of a type of CRBP (complementary reinforcement backprop). It works like this:

1. Propagate the activations through the network.
2. Flip a coin for each output, using the output as a bias. For example, if the output is .75, then the flip will come up 1.0 with probability .75, and 0.0 otherwise.
3. Sum up those 1 and 0 values. Use those as "votes", where majority rules. I.e., if a majority have a value of 1, then the network says 1.
4. Compare that vote with the target (just one value in this example). If it matches, then train the net with the vector from step 2 as a target vector. If it doesn't match, then use the complement of the vector from step 2 as the target vector.
5. Set the error of the output layer to the difference between the target vector from step 4 and the output activations.
6. Do backprop as usual.
7. Repeat.
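
To make steps 2 through 5 concrete, here is a minimal sketch in plain Python. It is not the SigmaNetwork source; crbp_step is a hypothetical helper written only from the steps above, and it assumes the output activations lie in [0, 1] and the target is a single 0 or 1.

```python
import random

def crbp_step(outputs, target, rng=random):
    """Pick a CRBP training target for one pattern (hypothetical helper).

    outputs: output-layer activations, each assumed to be in [0, 1]
    target:  the desired scalar answer, 0 or 1
    Returns (flips, vote, train_target, error).
    """
    # Step 2: one biased coin flip per output unit; the activation is
    # the probability that the flip comes up 1.0.
    flips = [1.0 if rng.random() < o else 0.0 for o in outputs]
    # Step 3: majority vote over the flips (a tie counts as 0 here,
    # which cannot happen with an odd number of outputs).
    vote = 1 if sum(flips) > len(flips) / 2.0 else 0
    # Step 4: if the vote matches, train toward the flips;
    # otherwise train toward their complement.
    if vote == target:
        train_target = flips
    else:
        train_target = [1.0 - f for f in flips]
    # Step 5: output-layer error is the chosen target minus the activations.
    error = [t - o for t, o in zip(train_target, outputs)]
    return flips, vote, train_target, error

# Example: 11 output units that all currently read .75, desired answer 1.
rng = random.Random(0)
flips, vote, train_target, error = crbp_step([0.75] * 11, 1, rng)
print(vote, train_target)
```

The error vector returned here is what would be handed to the ordinary backprop pass in step 6.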

Here is a CRBP network for learning XOR with the SigmaNetwork:

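from pyrobot.brain.conx import SigmaNetwork
net = SigmaNetwork()
net.addLayers(2, 5, 11)   # 2 inputs, 5 hidden units, 11 output "voters"
net.setInputs( [[0, 0], [0, 1], [1, 0], [1, 1]] )
net.setTargets( [[0], [1], [1], [0]] )   # one target value per pattern
net.train()
# Turn learning off and step through the patterns to inspect the result
net.interactive = 1
net.learning = 0
net.sweep()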

Notice that the output layer has 11 nodes. You can have as many as you want, because it is the sum that is important. (The hidden layer can be any size too.) The network is free to create a self-organized output representation. See:

Ackley and Littman
http://www.cs.duke.edu/~mlittman/docs/nips-crbp.ps

for more information on CRBP.
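
As a rough illustration of why only the sum matters, the sketch below computes how likely the majority vote is to come out 1 for different output layer sizes. It assumes, purely for illustration, that every output unit has the same activation p and that the flips are independent; majority_probability is a hypothetical helper, not part of Conx.

```python
from math import comb

def majority_probability(n_units, p):
    """Probability that more than half of n_units independent flips are 1,
    when each flip comes up 1 with probability p (hypothetical helper)."""
    need = n_units // 2 + 1  # smallest count that is a strict majority
    return sum(
        comb(n_units, k) * p ** k * (1 - p) ** (n_units - k)
        for k in range(need, n_units + 1)
    )

# Compare a few odd layer sizes with every unit reading .75.
for n in (1, 3, 11):
    print(n, majority_probability(n, 0.75))
```

Under these assumptions, adding output units makes the vote follow the shared bias more reliably, which is one way to see that the layer size is a free choice.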

The SigmaNetwork shows how you can separate the backprop steps in order to get in there and set the target and error independently of the normal BP process.

Hope that helps,

-Doug

Regards,
Chris
_______________________________________________
Pyro-users mailing list
[email protected]
http://emergent.brynmawr.edu/mailman/listinfo/pyro-users


