It's good that you posted this. I liked the paper. (And I understand proof number search a little better after reading it too.) But it describes the epsilon trick in terms of transposition tables which I don't use. I didn't see how it would apply to a broader range of applications or, specifically, to my own program. The text below explains it nicely. I will definitely try this. - Dave Hillis -----Original Message----- From: [EMAIL PROTECTED] To: computer-go@computer-go.org Sent: Sun, 14 Jan 2007 4:50 AM Subject: [computer-go] Fast UCT (epsilon trick)
Hi I've looked back into the article and I see that it is not trivial to understand how does the epsilon trick (ET) works. Therefore I will describe it here on the example of UCT. The bottleneck of UCT algorithm is choosing a move. Program has to iterate over all children evaluating their UCB value. We want to use of this loop rarely. One way to do it is to group playouts together. I.e. play 2 or more MC simulations per each tree traversal. But this is not good enough as it doesn't distribute the playouts over the tree and some parts will not get enough of them. The ET works as follows: With each node of the tree program keeps additional information: - child_to_visit - visits_to_go When we visit node, if node.visits_to_go > 0 then node->visits_to_go <- node.visits_to_go - 1 node = child_to_visit So we have predetermined strategy to go to particular child visits_to_go times. But if node.visis_to_go = 0 then // we have to set the strategy first node->child_to_visit = UCT_find_child (node) node->visits_to_go = Epsilon * node->visited_count // round up As we can see, when Epsilon is small or when a particular node is visited not too many times, UCT-ET is identical to UCT. But when node was visited more times (near the root, far from leaves), then UCT-ET may save some time and go to precalculated child. That's it. I hope You like it :) Ćukasz Lew PS. In Df-PN, good Epsilon was 1/4 I believe that here it will be much smaller. 1/64 maybe. Or even one may change the Epsilon equation to: node->visits_to_go = Epsilon * sqrt(node->visited_count) If someone will do some experiments, I'm eager to hear the results. PS2 float InvSqrt (float x) { float xhalf = 0.5f * x; int i = *(int*)&x; i = 0x5f3759df - (i>>1); x = *(float*)&i; x = x * (1.5f - xhalf * x * x); return x; } ;-) _______________________________________________ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/ ________________________________________________________________________ Check Out the new free AIM(R) Mail -- 2 GB of storage and industry-leading spam and email virus protection.
_______________________________________________ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/