I've noticed that Orego has done very poorly in the last couple of competitions. This is partly due to the improvements in others' programs, but I think the fact that Orego currently doesn't have a transposition table is crippling. Since I'm rewriting this stuff in Java, I'm thinking about how to handle this, and would like to offer up the following plan for all of you to poke holes in. I'm not sure if I'm reinventing the CrazyStone structure here, but I think this one is slightly different.

DATA STRUCTURE

The "tree" is a enormous, preallocated array of Node objects. (Because this array is also being used as a hash table, the size of the table should be a prime number.) This is really a directed acyclic graph (DAG) rather than a tree, because a Node may be a child of more than one parent.

Each Node contains the following information:

A+1 "pointers" (in some sense) to child Nodes, where A is the area of the board. (The extra pointer is for the pass move.) Some of these pointers (initially, all of them) may be the special value UNEXPLORED.

The number of runs through this Node so far.

The number black wins through this Node so far.

A Zobrist hash of the position represented by this Node. This must incorporate, in addition to the locations of stones on the board, the turn number (e.g., zero for the beginning of the game) and the number of consecutive passes.

The turn number, stored explicitly.

A forced leaf turn number, to be explained below.

(Note that, in contrast to the current Orego structure, we won't need tails or a free list for this one.)

GENERATING A MONTE CARLO RUN

I want to separate the process of generating a Monte Carlo run from the process of modifying the tree for two reasons. First, this will make multithreading easier. Second, I will be able to incorporate recorded games into the tree by pretending that the program had played them.

To generate a Monte Carlo run, I start at the root Node. Each move is chosen using UCT, based on UCB1-TUNED as described on p. 5 of the recent Gelly et al tech report on MoGo. As in the current Orego, this is done with a double-hashing scheme and in a way that always chooses an untried move if one exists.

There is one complication here. The number of runs through the children might exceed the number of runs through the parent because of transpositions. In Gelly's notation, this means Tj(n) may exceed n. I THINK I can ignore this, as it will just give "oversampled" moves unusually narrow confidence windows.

If we are not at a Node (because we're in an unexplored region of the search space), if the current Node has a forced leaf turn number greater than or equal to the root turn number, or if any child is found that is not at the correct turn number (because of a hash collision), the move is chosen randomly.

INCORPORATING A GAME INTO THE TREE

Again, we start at the root Node. We work down the tree, updating the run and black win count of each node encountered. If the child pointer we would be following is UNEXPLORED, we use the aforementioned Zobrist hash (modulo the table size) to find a Node to use as the child. This may be a Node that has already been initialized, a "fresh" Node (i.e., one that we can overwrite), or a Node that we can't overwrite.

If the child has the correct turn number, either we have been down this path before or we have found a transposition. In either case, continue down the tree.

If the child has a turn number that is lower than that of the root, it is an old node and can be overwritten. We reinitialize this one node, update its run and black win counts, and stop. Thus, every run adds at most one node to the tree.

If the offending child is not so old that it can be overwritten, we must leave it alone lest it mess up another part of the tree. At this point, the parent's forced leaf turn number is set to the child's turn number; all moves from that parent will be random until a later point in the game when it is safe to overwrite the offending child.


I welcome your input,

Peter Drake
Assistant Professor of Computer Science
Lewis & Clark College
http://www.lclark.edu/~drake/



_______________________________________________
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

Reply via email to