[Computer-go] Monte Carlo (upper confidence bounds applied to trees)

Петр Смолов Tue, 26 Oct 2010 03:43:35 -0700

Hello all!
I would much appreciate if someone could explain me what exactly "upper 
confidence bounds applied to trees" is.


I'm still not sure that UCT works good.

For example, I have a position with two possible moves (a1 and b1):

    O
  /   \
(a1) (b1)


First random game (using a1 as first move) returns 0, second (b1) returns 1:

    O
  /   \
(a1) (b1)
(0/1)(1/1)


Ok, let's try to explore b1 move (the program likes it, because uct of a1 = 0):

    O
  /   \
(a1) (b1)
(0/1)(1/1)
     / | \
   a2 b2 c2


a2 returns 0, b2 and c2 returns 1:

    O
  /   \
(a1) (b1)
(0/1)(3/4)
     / | \
   a2 b2 c2
   0   1  1


So the program will explore b2 and c2 (starting from b1), but maybe a2 is the 
only refuting move. Maybe a1 is better, but the program will continue to 
explore b1, because utc of a1 is 0 and utc of b1 is more than 0.
_______________________________________________
Computer-go mailing list
[email protected]
http://dvandva.org/cgi-bin/mailman/listinfo/computer-go

[Computer-go] Monte Carlo (upper confidence bounds applied to trees)

Reply via email to