It's good that you posted this. I liked the paper. (And I understand 
proof number search a little better after reading it too.) But it describes the 
epsilon trick in terms of transposition tables which I don't use. I didn't see 
how it would apply to a broader range of applications or, specifically, to my 
own program.
 
     The text below explains it nicely. I will definitely try this.
 
- Dave Hillis
 
 
-----Original Message-----
From: [EMAIL PROTECTED]
To: computer-go@computer-go.org
Sent: Sun, 14 Jan 2007 4:50 AM
Subject: [computer-go] Fast UCT (epsilon trick)


Hi 
I've looked back into the article and I see that it is not trivial to 
understand how does the epsilon trick (ET) works. Therefore I will 
describe it here on the example of UCT. 
 
The bottleneck of UCT algorithm is choosing a move. Program has to 
iterate over all children 
evaluating their UCB value. We want to use of this loop rarely. 
One way to do it is to group playouts together. I.e. play 2 or more MC 
simulations per each tree traversal. But this is not good enough as it 
doesn't distribute 
the playouts over the tree and some parts will not get enough of them. 
 
The ET works as follows: 
With each node of the tree program keeps additional information: 
- child_to_visit 
- visits_to_go 
 
When we visit node, 
if node.visits_to_go > 0 then 
 node->visits_to_go <- node.visits_to_go - 1 
 node = child_to_visit 
 
So we have predetermined strategy to go to particular child visits_to_go times. 
 
But 
if node.visis_to_go = 0 then // we have to set the strategy first 
 node->child_to_visit = UCT_find_child (node) 
 node->visits_to_go = Epsilon * node->visited_count // round up 
 
As we can see, when Epsilon is small or when a particular node is 
visited not too many times, UCT-ET is identical to UCT. 
 
But when node was visited more times (near the root, far from leaves), 
then UCT-ET may save some time and go to precalculated child. 
 
That's it. 
I hope You like it :) 
Ɓukasz Lew 
 
PS. 
 
In Df-PN, good Epsilon was 1/4 I believe that here it will be much 
smaller. 1/64 maybe. 
Or even one may change the Epsilon equation to: 
node->visits_to_go = Epsilon * sqrt(node->visited_count) 
 
If someone will do some experiments, I'm eager to hear the results. 
 
PS2 
 
float InvSqrt (float x) { 
  float xhalf = 0.5f * x; 
  int i = *(int*)&x; 
  i = 0x5f3759df - (i>>1); 
  x = *(float*)&i; 
  x = x * (1.5f - xhalf * x * x); 
  return x; 
} 
 
;-) 

_______________________________________________
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/
________________________________________________________________________
Check Out the new free AIM(R) Mail -- 2 GB of storage and industry-leading spam 
and email virus protection.
_______________________________________________
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/

Reply via email to