Hello all!
I would much appreciate if someone could explain me what exactly "upper
confidence bounds applied to trees" is.
I'm still not sure that UCT works good.
For example, I have a position with two possible moves (a1 and b1):
O
/ \
(a1) (b1)
First random game (using a1 as first move) returns 0, second (b1) returns 1:
O
/ \
(a1) (b1)
(0/1)(1/1)
Ok, let's try to explore b1 move (the program likes it, because uct of a1 = 0):
O
/ \
(a1) (b1)
(0/1)(1/1)
/ | \
a2 b2 c2
a2 returns 0, b2 and c2 returns 1:
O
/ \
(a1) (b1)
(0/1)(3/4)
/ | \
a2 b2 c2
0 1 1
So the program will explore b2 and c2 (starting from b1), but maybe a2 is the
only refuting move. Maybe a1 is better, but the program will continue to
explore b1, because utc of a1 is 0 and utc of b1 is more than 0.
_______________________________________________
Computer-go mailing list
[email protected]
http://dvandva.org/cgi-bin/mailman/listinfo/computer-go