Re: [Computer-go] UCT parameters and application to other games

Daniel Shawul Sat, 26 Mar 2011 05:13:31 -0700

Hello,

I am using Monte Carlo playouts for the UCT method. It can do about 10k/sec.
The UCT tree is expanded to a depth of d = 3 in a 5 sec search; from then
onwards a random playout (with no bias) is carried out. Actually it is a
'partial playout' which doesn't go to the end of the game, but rather up to
a depth of MAX_PLY = 96. If the game has ended earlier, then a win/draw/loss
is returned. Otherwise I forcefully end the game by using a deterministic
eval and assign a WDL. For 9x9 Go, actually, most of the random playouts end
before move 81.
For the alpha-beta searcher, I do classical evaluation. With heavy use of
reductions I can get a depth of 14 half plies, which seems to give it quite
an edge against the UCT version.
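
To make the partial playout described above concrete, here is a minimal
sketch in Python. It is not my engine's actual code: the callbacks
legal_moves, play, result and evaluate are hypothetical placeholders for
whatever the game implementation provides, and the sign convention (scores
from one fixed player's point of view) is an assumption.

```python
import random

MAX_PLY = 96  # playout cutoff mentioned above


def partial_playout(state, legal_moves, play, result, evaluate, rng=random):
    """Play uniformly random moves until the game ends or MAX_PLY is reached.

    Hypothetical callbacks (assumptions, not a real API):
      legal_moves(state) -> list of moves
      play(state, move)  -> new state
      result(state)      -> +1 / 0 / -1 if the game is over, else None
      evaluate(state)    -> deterministic score, positive = good for the
                            player whose point of view we score from
    Returns +1 (win), 0 (draw) or -1 (loss).
    """
    for _ in range(MAX_PLY):
        outcome = result(state)
        if outcome is not None:
            return outcome              # game ended before the cutoff
        moves = legal_moves(state)
        if not moves:
            break                       # no legal move: stop playing
        state = play(state, rng.choice(moves))  # unbiased random move

    outcome = result(state)
    if outcome is not None:
        return outcome
    # Cutoff reached: force a WDL from the sign of the deterministic eval.
    score = evaluate(state)
    return (score > 0) - (score < 0)
```

The only design decision here is that the eval is reduced to its sign, so
the tree only ever sees win/draw/loss values, as in a full playout.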


Is the depth of expansion for the UCT tree too low (d = 3 in a 5 sec
search)? Should I lower the UCTK parameter to 0.1 or so, which seems to give
me a depth of 7 at the start position of 9x9 Go? I am confident my
implementation is correct because it is working quite well in my checkers
program, contrary to my expectations.
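
For reference, this is roughly how UCTK enters the child selection. It is a
sketch that assumes the common form mean + UCTK * sqrt(ln(N) / n); the exact
formula differs between implementations (some put an extra constant under
the square root), and the function name uct_select is only for illustration.

```python
import math


def uct_select(children, parent_visits, uctk):
    """Return the index of the child with the highest UCT value.

    children is a list of (wins, visits) pairs. Assumed formula:
        wins / visits + uctk * sqrt(ln(parent_visits) / visits)
    Unvisited children are selected first.
    """
    best_index, best_value = -1, -math.inf
    for i, (wins, visits) in enumerate(children):
        if visits == 0:
            return i  # always try an unvisited child before anything else
        exploit = wins / visits
        explore = uctk * math.sqrt(math.log(parent_visits) / visits)
        if exploit + explore > best_value:
            best_index, best_value = i, exploit + explore
    return best_index


# Child 0 has the better mean, child 1 has few visits.
children = [(60, 100), (5, 10)]
print(uct_select(children, 110, 0.44))  # larger UCTK picks child 1
print(uct_select(children, 110, 0.1))   # smaller UCTK picks child 0
```

With a smaller UCTK the exploration term shrinks, so visits concentrate on
the currently best child and the tree grows deeper but narrower. That fits
the larger depth I see with UCTK = 0.1, but by itself it says nothing about
playing strength.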

thanks
Daniel

On Sat, Mar 26, 2011 at 7:54 AM, Erik van der Werf <[email protected]> wrote:

> It sounds like you're using a classical (deterministic) evaluation
> function.
> Try combining UCT with Monte Carlo evaluation.
>
> Erik
>
>
> On Sat, Mar 26, 2011 at 12:43 PM, Daniel Shawul <[email protected]> wrote:
> > Hello,
> > I am very new to UCT; I just implemented basic UCT for Go yesterday.
> > But so far with no success for Go, I think mostly because it does not
> > search very deep (depth = 3 on a 5 sec search with those values).
> > I am using the following values as UCT parameters:
> > UCTK = sqrt(1/5) = 0.44     UCTN = 10 (visits after which best move is
> > expanded)
> > Even if I lower UCTK down to 7 I get a maximum depth of d=7 at the start
> > position for a 5 sec search.
> > How deep a search should I tune these parameters for?
> > Before UCT, I had an alpha-beta searcher which sometimes plays on CGOS.
> > It reached a level of ~1500, and this engine seems to be too strong for
> > the UCT version.
> > It just gets outsearched in some tactical positions, and also in
> > evaluation I think.
> > For example, I have an evaluation term which gives big bonuses for
> > connected strings, which seems to give an edge in a lot of games. How do
> > you introduce such eval terms in UCT?
> > But for my checkers program, to my big surprise, UCT made a significant
> > impact. The regular alpha-beta searcher averages a depth of 25, but the
> > UCT version is, I think, equally strong from the games I saw. That was a
> > kind of surprise for me because I thought UCT would work better for
> > bushy trees and when the eval has a lot of strategy. It also reached
> > good depths, averaging 16 plies.
> > My checkers eval had only material in it, so I don't know if UCT is
> > bringing strategy (distant information) to the game which the other one
> > doesn't have. The games are not really played out to the end, but
> > rather to MAX_PLY = 96, after which the material is counted and a WDL
> > score is assigned (I call it partial playout).
> > Also the fact that captures are forced seems to help a lot because it
> > doesn't make too many mistakes.
> > I also found some positions where it encounters problems similar to
> > ladders in Go. But in the checkers case, these problems are still solved
> > correctly. The only problem is that it doesn't report correct-looking
> > winning rates.
> > For example, in a position with two kings where one of the kings is
> > chasing the other to the sides to mate it, but the losing king can draw
> > by making a series of correct moves to get itself to one of the safe
> > corners, the program displays winning rates of 0.01 (when it should
> > have been more like 0.5), but it still manages the draw!
> > thanks and apologies for the verbose email
> > Daniel
> > _______________________________________________
> > Computer-go mailing list
> > [email protected]
> > http://dvandva.org/cgi-bin/mailman/listinfo/computer-go
> >
