Ah ok, I misunderstood. Still something seems to be wrong. On the empty 9x9 board I think most programs with random/light playouts play in the order of 110 moves. ~81 moves seems quite low; in my experience you can only get such low numbers to work well if you have a lot of knowledge in your playouts. Did you check the quality of the evaluations/playouts?
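One mechanical reason playouts can stall near ~81 moves on 9x9 is a playout loop with no captures: each move consumes an empty point, so it can never exceed 81 moves, while playouts that recapture stones recycle points and run longer. A minimal sketch of such a capture-less light playout with only a crude one-point-eye test (the board representation and eye heuristic are hypothetical illustrations, not code from this thread):

```python
import random

SIZE = 9  # 9x9 board as discussed in the thread

def neighbors(p):
    x, y = p
    return [(x + dx, y + dy) for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))
            if 0 <= x + dx < SIZE and 0 <= y + dy < SIZE]

def is_own_eye(board, p, color):
    # Crude one-point-eye test: every neighbour already holds our colour.
    return all(board.get(q) == color for q in neighbors(p))

def light_playout(rng, max_ply=111):
    board = {}                     # point -> colour (+1 / -1)
    empties = [(x, y) for x in range(SIZE) for y in range(SIZE)]
    color, moves = 1, 0
    while moves < max_ply:
        rng.shuffle(empties)
        chosen = None
        for p in empties:
            if not is_own_eye(board, p, color):
                chosen = p
                break
        if chosen is None:
            break                  # only eye-filling moves left: stop
        board[chosen] = color      # no capture logic: points are never freed
        empties.remove(chosen)
        moves += 1
        color = -color
    return moves
```

Because nothing is ever removed from the board, `light_playout` is structurally capped at 81 moves regardless of `max_ply`; a real light playout with captures typically runs past that.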
If you want UCT to search deeper you need good priors and perhaps something like RAVE/AMAF.

Best,
Erik

On Sat, Mar 26, 2011 at 1:13 PM, Daniel Shawul <[email protected]> wrote:
> Hello,
> I am using Monte Carlo playouts for the UCT method. It can do about 10k/sec.
> The UCT tree is expanded to a depth of d = 3 in a 5 sec search; from then
> onwards a random playout (with no bias) is carried out. Actually it is a
> 'partial playout' which doesn't go to the end of the game, but rather up to
> a depth of MAX_PLY = 96. If the game has ended earlier, then a win/draw/loss
> is returned. Otherwise I forcefully end the game by using a deterministic
> eval and assign a WDL. For 9x9 go, most random playouts actually end before
> move 81.
> For the alpha-beta searcher, I do classical evaluation. With heavy use of
> reductions I can get a depth of 14 half-plies, which seems to give it quite
> an edge against the UCT version.
> Is the depth of expansion for the UCT tree too low (d = 3 in a 5 sec
> search)? Should I lower the UCTK parameter to 0.1 or so, which seems to
> give me a depth = 7 at the start position of a 9x9 go board? I am confident
> my implementation is correct because, contrary to my expectation, it is
> working quite well in my checkers program.
> thanks
> Daniel
>
> On Sat, Mar 26, 2011 at 7:54 AM, Erik van der Werf
> <[email protected]> wrote:
>>
>> It sounds like you're using a classical (deterministic) evaluation
>> function. Try combining UCT with Monte Carlo evaluation.
>>
>> Erik
>>
>>
>> On Sat, Mar 26, 2011 at 12:43 PM, Daniel Shawul <[email protected]> wrote:
>> > Hello,
>> > I am very new to UCT; I just implemented basic UCT for go yesterday.
>> > But with no success so far for go, I think mostly because it does not
>> > search very deep (depth = 3 on a 5 sec search with those values).
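The "partial playout" described above (truncate at MAX_PLY and fall back to a deterministic eval for the WDL score) can be sketched generically; the game-interface callbacks here are hypothetical placeholders, not code from the thread:

```python
import random

WIN, DRAW, LOSS = 1.0, 0.5, 0.0
MAX_PLY = 96  # truncation depth quoted in the thread

def partial_playout(state, legal_moves, play, terminal_result, static_eval, rng):
    """Unbiased random playout truncated at MAX_PLY.

    If the game ends earlier, its real result is returned; otherwise a
    deterministic evaluation decides the WDL score, as described above.
    """
    for _ in range(MAX_PLY):
        result = terminal_result(state)      # WIN/DRAW/LOSS or None
        if result is not None:
            return result
        state = play(state, rng.choice(legal_moves(state)))
    score = static_eval(state)               # deterministic fallback eval
    if score > 0:
        return WIN
    if score < 0:
        return LOSS
    return DRAW
```

The fallback eval only fires when the playout actually reaches MAX_PLY, so for 9x9 go (where most random playouts end before move 81, per the thread) it would rarely be used.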
>> > I am using the following values as UCT parameters:
>> > UCTK = sqrt(1/5) = 0.44, UCTN = 10 (visits after which the best move
>> > is expanded).
>> > Even if I lower UCTK down to 0.1, I get a maximum depth of d = 7 at
>> > the start position for a 5 sec search.
>> > For how deep a search should I tune these parameters?
>> > Before UCT, I had an alpha-beta searcher which sometimes plays on
>> > CGOS. It reached a level of ~1500, and this engine seems to be too
>> > strong for the UCT version. It just gets outsearched in some tactical
>> > positions, and also in evaluation I think.
>> > For example, I have an evaluation term which gives big bonuses for
>> > connected strings, which seems to give an edge in a lot of games. How
>> > do you introduce such eval terms in UCT?
>> > But for my checkers program, to my big surprise, UCT made a
>> > significant impact. The regular alpha-beta searcher averages a depth
>> > of 25, but the UCT version is, I think, equally strong judging from
>> > the games I saw. That was a kind of surprise for me because I thought
>> > UCT would work better for bushy trees and when the eval has a lot of
>> > strategy. It also reached good depths, averaging 16 plies.
>> > My checkers eval had only material in it, so I don't know if UCT is
>> > bringing strategy (distant information) to the game which the other
>> > one doesn't have. The games are not really played out to the end but
>> > rather to a MAX_PLY = 96, after which the material is counted and a
>> > WDL score is assigned (I call it a partial playout).
>> > Also the fact that captures are forced seems to help a lot, because it
>> > doesn't make too many mistakes.
>> > I also found some positions where it encounters problems similar to
>> > ladders in go. But in the checkers case, these problems are still
>> > solved correctly. The only problem is that it doesn't report
>> > correct-looking winning rates.
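For context, UCTK plays the role of the exploration constant in the usual UCB1 selection rule, and UCTN is a delayed-expansion threshold; a minimal sketch of both, with the node fields assumed rather than taken from either program in the thread:

```python
import math

UCTK = math.sqrt(1 / 5)  # ~0.447, the value quoted above
UCTN = 10                # visits before a node is expanded further

class Node:
    def __init__(self):
        self.visits = 0
        self.wins = 0.0      # summed playout scores backed up here
        self.children = []

def ucb_score(parent, child):
    if child.visits == 0:
        return float('inf')  # unvisited children are tried first
    mean = child.wins / child.visits
    explore = UCTK * math.sqrt(math.log(parent.visits) / child.visits)
    return mean + explore

def select_child(parent):
    # Descend into the child maximising mean value + exploration bonus.
    return max(parent.children, key=lambda c: ucb_score(parent, c))

def should_expand(node):
    # Delayed expansion: only grow the tree under nodes sampled >= UCTN times.
    return node.visits >= UCTN
```

Lowering UCTK shrinks the exploration term, so the search concentrates visits on the current best child and the principal variation gets deeper, at the cost of exploring alternatives less; that matches the depth = 7 observation above.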
>> > For example, in a position with two kings where one of the kings is
>> > chasing the other to the sides to mate it, but the losing king can
>> > draw by making a series of correct moves to get itself to one of the
>> > safe corners, the program displays winning rates of 0.01 (when it
>> > should have been more like 0.5), but it still manages the draw!
>> > thanks, and apologies for the verbose email
>> > Daniel
>> > _______________________________________________
>> > Computer-go mailing list
>> > [email protected]
>> > http://dvandva.org/cgi-bin/mailman/listinfo/computer-go
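One common cause of the 0.01 figure in a theoretically drawn position is scoring draws as losses when backing up playout results: the reported mean then converges toward the rare-win rate rather than 0.5, even though move selection still holds the draw. A quick numeric check of this hypothesis (the scoring scheme is an assumption, not something stated in the thread):

```python
# If 99% of playouts from the drawn position end in the draw and 1% in a
# win, a {win: 1, draw: 0, loss: 0} scoring reports ~0.01, while the usual
# {win: 1, draw: 0.5, loss: 0} scoring reports ~0.505.
playouts = ['draw'] * 99 + ['win']

def mean_score(results, draw_value):
    score = {'win': 1.0, 'draw': draw_value, 'loss': 0.0}
    return sum(score[r] for r in results) / len(results)

wl_only = mean_score(playouts, 0.0)     # draws counted as losses
with_draws = mean_score(playouts, 0.5)  # draws worth half a point
```

Either scheme ranks the drawing moves the same way (which would explain why the program still manages the draw); only the displayed winning rate differs.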
