In Arpad Rimmel's PhD thesis
(http://tao.lri.fr/Papers/thesesTAO/RimmelPhD.pdf), Mogo's UCT formula (page
50) is really too complicated to try for me. Have anyone try that formula?
Aja
----- Original Message -----
From: "David Fotland" <[email protected]>
To: <[email protected]>
Sent: Tuesday, October 26, 2010 11:44 PM
Subject: Re: [Computer-go] Monte Carlo (upper confidence bounds appliedto
trees)
Erica and many Faces do not set c = 0, so there are many strong go
programs that do not set c to zero. Fuego and Mogo use c = 0. I'm not
sure about others.
David
-----Original Message-----
From: [email protected] [mailto:computer-go-
[email protected]] On Behalf Of Jason House
Sent: Tuesday, October 26, 2010 6:01 AM
To: ???? ??????; [email protected]
Subject: Re: [Computer-go] Monte Carlo (upper confidence bounds applied
to
trees)
It looks like your example is using average win rate instead of an upper
bound. Rather than M/N UCT uses M/N + c*log(N1+N2+N3+...)/sqrt(N)
where Ni is the total simulations of the ith move. That extra term means
moves with initially bad results will be revisited. Stronger go programs
add
in some other terms and set c=0, but that isn't UCT.
Sent from my iPhone
On Oct 26, 2010, at 6:43 AM, ???? ?????? <[email protected]> wrote:
> Hello all!
> I would much appreciate if someone could explain me what exactly "upper
confidence bounds applied to trees" is.
>
> I'm still not sure that UCT works good.
>
> For example, I have a position with two possible moves (a1 and b1):
>
> O
> / \
> (a1) (b1)
>
>
> First random game (using a1 as first move) returns 0, second (b1)
> returns
1:
>
> O
> / \
> (a1) (b1)
> (0/1)(1/1)
>
>
> Ok, let's try to explore b1 move (the program likes it, because uct of
> a1
= 0):
>
> O
> / \
> (a1) (b1)
> (0/1)(1/1)
> / | \
> a2 b2 c2
>
>
> a2 returns 0, b2 and c2 returns 1:
>
> O
> / \
> (a1) (b1)
> (0/1)(3/4)
> / | \
> a2 b2 c2
> 0 1 1
>
>
> So the program will explore b2 and c2 (starting from b1), but maybe a2
> is
the only refuting move. Maybe a1 is better, but the program will continue
to
explore b1, because utc of a1 is 0 and utc of b1 is more than 0.
> _______________________________________________
> Computer-go mailing list
> [email protected]
> http://dvandva.org/cgi-bin/mailman/listinfo/computer-go
_______________________________________________
Computer-go mailing list
[email protected]
http://dvandva.org/cgi-bin/mailman/listinfo/computer-go
_______________________________________________
Computer-go mailing list
[email protected]
http://dvandva.org/cgi-bin/mailman/listinfo/computer-go
_______________________________________________
Computer-go mailing list
[email protected]
http://dvandva.org/cgi-bin/mailman/listinfo/computer-go