Re: [Computer-go] Monte Carlo (upper confidence bounds applied to trees)

Aja Tue, 26 Oct 2010 12:30:45 -0700

In Arpad Rimmel's PhD thesis (http://tao.lri.fr/Papers/thesesTAO/RimmelPhD.pdf), Mogo's UCT formula (page 50) is really too complicated for me to try. Has anyone tried that formula?

Aja

----- Original Message -----
From: "David Fotland" <[email protected]>
To: <[email protected]>
Sent: Tuesday, October 26, 2010 11:44 PM
Subject: Re: [Computer-go] Monte Carlo (upper confidence bounds applied to trees)

Erica and Many Faces do not set c = 0, so there are strong go programs that do not set c to zero. Fuego and Mogo use c = 0. I'm not sure about others.


David

-----Original Message-----
From: [email protected] [mailto:computer-go-
[email protected]] On Behalf Of Jason House
Sent: Tuesday, October 26, 2010 6:01 AM
To: ???? ??????; [email protected]

Subject: Re: [Computer-go] Monte Carlo (upper confidence bounds applied to trees)

It looks like your example is using the average win rate instead of an upper
bound. Rather than M/N, UCT uses M/N + c*sqrt(log(N1+N2+N3+...)/N),
where Ni is the total number of simulations of the ith move. That extra term
means moves with initially bad results will be revisited. Stronger go programs
add in some other terms and set c=0, but that isn't UCT.
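
A minimal sketch of that selection rule, assuming the textbook UCB1 form
(win rate plus c*sqrt(ln(total)/n)). The function names, the default
c = sqrt(2), and the treatment of unvisited moves are illustrative
assumptions, not taken from any program mentioned in this thread.

```python
import math

def ucb1(wins, visits, total_visits, c=math.sqrt(2)):
    """Upper confidence bound for one move (assumed UCB1 form)."""
    if visits == 0:
        # Assumption: an unvisited move is always tried first.
        return math.inf
    return wins / visits + c * math.sqrt(math.log(total_visits) / visits)

def select_move(stats, c=math.sqrt(2)):
    """stats maps move name -> (wins, visits); return the move with the highest bound."""
    total = sum(visits for _, visits in stats.values())
    return max(stats, key=lambda m: ucb1(stats[m][0], stats[m][1], total, c))

# Position from the question: a1 has 0 wins in 1 game, b1 has 3 wins in 4 games.
stats = {"a1": (0, 1), "b1": (3, 4)}
print(select_move(stats, c=0.0))  # win rate only: b1
print(select_move(stats))         # with the exploration term: a1
```

With c = 0 the choice reduces to the plain win rate and b1 is picked forever.
With the exploration term, a1 scores about 1.79 against about 1.65 for b1,
so a1 gets another simulation even though its win rate is 0.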

Sent from my iPhone

On Oct 26, 2010, at 6:43 AM, ???? ?????? <[email protected]> wrote:

> Hello all!
> I would much appreciate it if someone could explain to me what exactly
> "upper confidence bounds applied to trees" is.
>
> I'm still not sure that UCT works well.
>
> For example, I have a position with two possible moves (a1 and b1):
>
>    O
>  /   \
> (a1) (b1)
>
>

> The first random game (using a1 as the first move) returns 0; the second (b1) returns 1:
>
>    O
>  /   \
> (a1) (b1)
> (0/1)(1/1)
>
>

> OK, let's explore the b1 move (the program likes it, because the UCT value of a1 is 0):
>
>    O
>  /   \
> (a1) (b1)
> (0/1)(1/1)
>     / | \
>   a2 b2 c2
>
>
> a2 returns 0; b2 and c2 return 1:
>
>    O
>  /   \
> (a1) (b1)
> (0/1)(3/4)
>     / | \
>   a2 b2 c2
>   0   1  1
>
>

> So the program will explore b2 and c2 (starting from b1), but maybe a2 is the
> only refuting move. Maybe a1 is better, but the program will continue to
> explore b1, because the UCT value of a1 is 0 and the UCT value of b1 is more than 0.
> _______________________________________________
> Computer-go mailing list
> [email protected]
> http://dvandva.org/cgi-bin/mailman/listinfo/computer-go
