Ah ok, I misunderstood. Still something seems to be wrong. On the empty 9x9 board I think most programs with random/light playouts play in the order of 110 moves. ~81 moves seems quite low; in my experience you can only get such low numbers to work well if you have a lot of knowledge in your playouts. Did you check the quality of the evaluations/playouts?
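One mechanical reason playouts can stall near ~81 moves on 9x9 is a playout loop with no captures: each move consumes an empty point, so it can never exceed 81 moves, while playouts that recapture stones recycle points and run longer. A minimal sketch of such a capture-less light playout with only a crude one-point-eye test (the board representation and eye heuristic are hypothetical illustrations, not code from this thread):

```python
import random

SIZE = 9  # 9x9 board as discussed in the thread

def neighbors(p):
    x, y = p
    return [(x + dx, y + dy) for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))
            if 0 <= x + dx < SIZE and 0 <= y + dy < SIZE]

def is_own_eye(board, p, color):
    # Crude one-point-eye test: every neighbour already holds our colour.
    return all(board.get(q) == color for q in neighbors(p))

def light_playout(rng, max_ply=111):
    board = {}                     # point -> colour (+1 / -1)
    empties = [(x, y) for x in range(SIZE) for y in range(SIZE)]
    color, moves = 1, 0
    while moves < max_ply:
        rng.shuffle(empties)
        chosen = None
        for p in empties:
            if not is_own_eye(board, p, color):
                chosen = p
                break
        if chosen is None:
            break                  # only eye-filling moves left: stop
        board[chosen] = color      # no capture logic: points are never freed
        empties.remove(chosen)
        moves += 1
        color = -color
    return moves
```

Because nothing is ever removed from the board, `light_playout` is structurally capped at 81 moves regardless of `max_ply`; a real light playout with captures typically runs past that.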
If you want UCT to search deeper you need good priors and perhaps something like RAVE/AMAF.

Best,
Erik

On Sat, Mar 26, 2011 at 1:13 PM, Daniel Shawul <[email protected]> wrote:
> Hello,
> I am using Monte Carlo playouts for the UCT method. It can do about 10k/sec.
> The UCT tree is expanded to a depth of d = 3 in a 5 sec search; from then
> onwards a random playout (with no bias) is carried out. Actually it is a
> 'partial playout' which doesn't go to the end of the game, but rather up to
> a depth of MAX_PLY = 96. If the game has ended earlier, then a win/draw/loss
> is returned. Otherwise I forcefully end the game by using a deterministic
> eval and assign a WDL. For 9x9 go, most random playouts actually end before
> move 81.
> For the alpha-beta searcher, I do classical evaluation. With heavy use of
> reductions I can get a depth of 14 half-plies, which seems to give it quite
> an edge against the UCT version.
> Is the depth of expansion for the UCT tree too low (d = 3 in a 5 sec
> search)? Should I lower the UCTK parameter to 0.1 or so, which seems to
> give me a depth = 7 at the start position of a 9x9 go board? I am confident
> my implementation is correct because, contrary to my expectation, it is
> working quite well in my checkers program.
> thanks
> Daniel
>
> On Sat, Mar 26, 2011 at 7:54 AM, Erik van der Werf
> <[email protected]> wrote:
>>
>> It sounds like you're using a classical (deterministic) evaluation
>> function. Try combining UCT with Monte Carlo evaluation.
>>
>> Erik
>>
>>
>> On Sat, Mar 26, 2011 at 12:43 PM, Daniel Shawul <[email protected]> wrote:
>> > Hello,
>> > I am very new to UCT; I just implemented basic UCT for go yesterday.
>> > But with no success so far for go, I think mostly because it does not
>> > search very deep (depth = 3 on a 5 sec search with those values).
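The "partial playout" described above (truncate at MAX_PLY and fall back to a deterministic eval for the WDL score) can be sketched generically; the game-interface callbacks here are hypothetical placeholders, not code from the thread:

```python
import random

WIN, DRAW, LOSS = 1.0, 0.5, 0.0
MAX_PLY = 96  # truncation depth quoted in the thread

def partial_playout(state, legal_moves, play, terminal_result, static_eval, rng):
    """Unbiased random playout truncated at MAX_PLY.

    If the game ends earlier, its real result is returned; otherwise a
    deterministic evaluation decides the WDL score, as described above.
    """
    for _ in range(MAX_PLY):
        result = terminal_result(state)      # WIN/DRAW/LOSS or None
        if result is not None:
            return result
        state = play(state, rng.choice(legal_moves(state)))
    score = static_eval(state)               # deterministic fallback eval
    if score > 0:
        return WIN
    if score < 0:
        return LOSS
    return DRAW
```

The fallback eval only fires when the playout actually reaches MAX_PLY, so for 9x9 go (where most random playouts end before move 81, per the thread) it would rarely be used.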
>> > I am using the following values as UCT parameters:
>> > UCTK = sqrt(1/5) = 0.44, UCTN = 10 (visits after which the best move
>> > is expanded).
>> > Even if I lower UCTK down to 0.1, I get a maximum depth of d = 7 at
>> > the start position for a 5 sec search.
>> > For how deep a search should I tune these parameters?
>> > Before UCT, I had an alpha-beta searcher which sometimes plays on
>> > CGOS. It reached a level of ~1500, and this engine seems to be too
>> > strong for the UCT version. It just gets outsearched in some tactical
>> > positions, and also in evaluation I think.
>> > For example, I have an evaluation term which gives big bonuses for
>> > connected strings, which seems to give an edge in a lot of games. How
>> > do you introduce such eval terms in UCT?
>> > But for my checkers program, to my big surprise, UCT made a
>> > significant impact. The regular alpha-beta searcher averages a depth
>> > of 25, but the UCT version is, I think, equally strong judging from
>> > the games I saw. That was a kind of surprise for me because I thought
>> > UCT would work better for bushy trees and when the eval has a lot of
>> > strategy. It also reached good depths, averaging 16 plies.
>> > My checkers eval had only material in it, so I don't know if UCT is
>> > bringing strategy (distant information) to the game which the other
>> > one doesn't have. The games are not really played out to the end but
>> > rather to a MAX_PLY = 96, after which the material is counted and a
>> > WDL score is assigned (I call it a partial playout).
>> > Also the fact that captures are forced seems to help a lot, because it
>> > doesn't make too many mistakes.
>> > I also found some positions where it encounters problems similar to
>> > ladders in go. But in the checkers case, these problems are still
>> > solved correctly. The only problem is that it doesn't report
>> > correct-looking winning rates.
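For context, UCTK plays the role of the exploration constant in the usual UCB1 selection rule, and UCTN is a delayed-expansion threshold; a minimal sketch of both, with the node fields assumed rather than taken from either program in the thread:

```python
import math

UCTK = math.sqrt(1 / 5)  # ~0.447, the value quoted above
UCTN = 10                # visits before a node is expanded further

class Node:
    def __init__(self):
        self.visits = 0
        self.wins = 0.0      # summed playout scores backed up here
        self.children = []

def ucb_score(parent, child):
    if child.visits == 0:
        return float('inf')  # unvisited children are tried first
    mean = child.wins / child.visits
    explore = UCTK * math.sqrt(math.log(parent.visits) / child.visits)
    return mean + explore

def select_child(parent):
    # Descend into the child maximising mean value + exploration bonus.
    return max(parent.children, key=lambda c: ucb_score(parent, c))

def should_expand(node):
    # Delayed expansion: only grow the tree under nodes sampled >= UCTN times.
    return node.visits >= UCTN
```

Lowering UCTK shrinks the exploration term, so the search concentrates visits on the current best child and the principal variation gets deeper, at the cost of exploring alternatives less; that matches the depth = 7 observation above.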
>> > For example, in a position with two kings where one of the kings is
>> > chasing the other to the sides to mate it, but the losing king can
>> > draw by making a series of correct moves to get itself to one of the
>> > safe corners, the program displays winning rates of 0.01 (when it
>> > should have been more like 0.5), but it still manages the draw!
>> > thanks, and apologies for the verbose email
>> > Daniel
>> > _______________________________________________
>> > Computer-go mailing list
>> > [email protected]
>> > http://dvandva.org/cgi-bin/mailman/listinfo/computer-go
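One common cause of the 0.01 figure in a theoretically drawn position is scoring draws as losses when backing up playout results: the reported mean then converges toward the rare-win rate rather than 0.5, even though move selection still holds the draw. A quick numeric check of this hypothesis (the scoring scheme is an assumption, not something stated in the thread):

```python
# If 99% of playouts from the drawn position end in the draw and 1% in a
# win, a {win: 1, draw: 0, loss: 0} scoring reports ~0.01, while the usual
# {win: 1, draw: 0.5, loss: 0} scoring reports ~0.505.
playouts = ['draw'] * 99 + ['win']

def mean_score(results, draw_value):
    score = {'win': 1.0, 'draw': draw_value, 'loss': 0.0}
    return sum(score[r] for r in results) / len(results)

wl_only = mean_score(playouts, 0.0)     # draws counted as losses
with_draws = mean_score(playouts, 0.5)  # draws worth half a point
```

Either scheme ranks the drawing moves the same way (which would explain why the program still manages the draw); only the displayed winning rate differs.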
