Hello all, this is my first time posting to this group. Sorry that it's a
long one.

I've had some ideas on board evaluation functions, and am curious what you
think about this...

In this email, I'm going to refer to neural networks and temporal difference
as the template for learning a good function approximation, but this isn't
the core of what I'm considering (i.e., maybe any function approximator is
suitable).

Here's the idea... First, create a very *small* NN, and train it to the
minimax value of a board position using TD(0) or some derivation.

While training, determine the average output over each board position. Once
training is finished (MSE levels off), instead of using this NN as an actual
board evaluator, we simply make it the root in a binary decision tree. The
tree is split based on the NN's output compared to its average output.
(Question - Would this work to reduce entropy as in ID3/C4.5?)

Now, stop training this NN, and add 2 similar networks as child nodes in the
tree, train them in a similar fashion. Continue this process indefinitely,
growing the decision tree with more and more NNs at each step.

I imagine that each child NN should use not only the board as input, but
also, all of its parent's and ancestors' hidden and output neuron node
values.

The obvious idea is to create evaluators that are increasingly specialized
for certain types of board configurations. I think that NNs that are at a
depth of at least a couple levels should be highly pruned at their input
layer -- it shouldn't be important for each to know every board feature
(especially considering they already have board approximations and
abstracted hidden layer features available as input from parent nodes).

Training would be quite long for any tree of reasonable size, but the
possible benefits are obvious... (1) A lot of "automatic" information might
be encoded in the path through the tree - some board features might be
rendered unimportant at child nodes, allowing them to "focus" on the most
important features, and (2) while scaling the size of the weight parameter
space linearly, the runtime evaluation time only increases logarithmically.

Has anybody done anything like this before?

My final thought concerns how to make this idea useful for go and its very
large branching size...

We could make child nodes evaluate boards one move deeper than their parents
(both during training and runtime). For example, say we're evaluating a
board after move N with a tree of depth 3 (and 7 nodes), the root node
evaluates the board at move (N-2), one level deeper we evaluate the board
after move (N-1), and finally evaluate the current board at the leaf. Or
devote 2 or more levels of the tree for each move (but not the whole tree).
Also, maybe branching is not required at every level (maybe we only split
the board space after every other move, or every fourth move).

In this scenario, explicit information of the current move would be required
as input to each NN - maybe a 5x5 eye of the neighborhood of the move
(before and after), a vector corresponding to the board position where the
move was made, changes in number of stones, liberties, groups, etc. However,
heavy pruning of the inputs should still be done.

Now, evaluating every possible move from a certain board would be made much
faster. And move ordering heuristics would be "built in".

The hope is that successive levels of the tree would tend to localize
specialization of the child nodes to areas of the board where specialization
is needed.

I've started coding some of these ideas into an old TD/NN based Othello
program I wrote some time ago... Though I've wanted to, I haven't delved
into writing any Go programs since the simple task of scoring a game has
some complexity I don't currently have the time or motivation to consider...
Does anybody have suggestions on a good places to start - a good, fast
engine with correct counting, easy adaptation for an NN/TD based bot?

More importantly, what do you think about the idea of creating a decision
tree like this? Do you think it's a dead end? Is there a better way to split
the tree at each node (I've thought of several, but this one seemed the most
promising)?

Thanks!
- Trevor
_______________________________________________
computer-go mailing list
[email protected]
http://www.computer-go.org/mailman/listinfo/computer-go/

Reply via email to