I'm trying to use Conx to train a couple of networks to aid a board game engine, and I'd like some feedback on my strategy. The networks will be used in a minimax search to prune the state tree and to evaluate the player's position. Both networks will take the board as input, with one input unit for each position on the board. One network, the move suggester, will have an output unit for each action, indicating how much performing that action will increase the player's score. The second network, the score estimator, will output an estimate of each player's score.
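To make the setup concrete, here is a rough sketch of the data side of what I mean. This is not Conx API; all the names (board_to_vector, make_score_corpus, make_move_corpus, NUM_ACTIONS, etc.) are placeholders I made up for illustration:

```python
# Hypothetical sketch of corpus construction for the two networks.
# None of these names come from Conx; they are placeholders.

NUM_CELLS = 9        # e.g. a 3x3 board; one input unit per position
NUM_ACTIONS = 9      # one output unit per possible action

def board_to_vector(board):
    """Encode a board (list of 0 = empty, 1 = player, -1 = opponent)
    as a flat input vector, one value per board position."""
    return [float(cell) for cell in board]

def make_score_corpus(game_states, final_scores):
    """Score estimator: pair every board state seen during the game
    with the final scores computed after the game is over."""
    return [(board_to_vector(b), list(final_scores)) for b in game_states]

def make_move_corpus(game_states, moves_played, final_score):
    """Move suggester: set a target only on the output unit for the
    action actually played; mask the rest (None = no error signal)."""
    corpus = []
    for board, action in zip(game_states, moves_played):
        target = [None] * NUM_ACTIONS
        target[action] = final_score
        corpus.append((board_to_vector(board), target))
    return corpus
```

So a finished game would yield one [boardstate, finalscore] pair per move for the score estimator, and one masked-target pair per move for the move suggester.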
My question is: what is the best way to train these networks? My current strategy is to do nothing until the game is over, then use a static algorithm to reliably score the final game state. From that score I'll build a training corpus by iterating through each move of the game, producing training pairs of the form [boardstate, finalscore]. For the move suggester, I think I'd have to disable the output units for the actions that weren't selected. This method seems fairly primitive and naive, but I've never done temporal-difference learning with neural networks before. Is there a better way? Any suggestions are appreciated.

Regards,
Chris

_______________________________________________
Pyro-users mailing list
[email protected]
http://emergent.brynmawr.edu/mailman/listinfo/pyro-users
