Hello,

I am using Monte Carlo playouts for the UCT method. It can do about 10k playouts/sec. The UCT tree is expanded to a depth of d = 3 in a 5 sec search; from then onwards a random playout (with no bias) is carried out. Actually it is a 'partial playout', which doesn't go to the end of the game but only up to a depth of MAX_PLY = 96. If the game has ended earlier, a win/draw/loss is returned. Otherwise I forcefully end the game by using a deterministic eval and assign a WDL. For 9x9 Go, most of the random playouts actually end before move 81.

For the alpha-beta searcher, I do classical evaluation. With heavy use of reductions I can get a depth of 14 half-plies, which seems to give it quite an edge against the UCT version.
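The partial playout described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the `game` interface, the `Nim` toy game and its `evaluate` rule are hypothetical stand-ins, not the engine from this post. Only MAX_PLY = 96 and the "random moves, then deterministic eval mapped to a WDL" idea come from the message.

```python
import random

MAX_PLY = 96  # cutoff depth quoted in the post


def partial_playout(state, game, max_ply=MAX_PLY, rng=random):
    """Play uniformly random moves for at most max_ply plies.

    Returns +1 / 0 / -1 (win / draw / loss) from the point of view of
    the player to move in the starting state.
    """
    root_player = game.to_move(state)
    for _ in range(max_ply):
        if game.is_terminal(state):
            break
        state = game.apply(state, rng.choice(game.legal_moves(state)))
    if game.is_terminal(state):
        return game.result(state, root_player)
    # Game not finished at the cutoff: force a result from a static eval.
    score = game.evaluate(state, root_player)
    return (score > 0) - (score < 0)


class Nim:
    """Hypothetical toy game: take 1-3 stones, taking the last stone wins.

    A state is (stones_left, player_to_move) with players 0 and 1.
    """

    def to_move(self, state):
        return state[1]

    def legal_moves(self, state):
        return [n for n in (1, 2, 3) if n <= state[0]]

    def apply(self, state, move):
        return (state[0] - move, 1 - state[1])

    def is_terminal(self, state):
        return state[0] == 0

    def result(self, state, player):
        winner = 1 - state[1]  # the player who just moved took the last stone
        return 1 if winner == player else -1

    def evaluate(self, state, player):
        # Placeholder static eval: the side to move wins iff stones % 4 != 0.
        mover_wins = state[0] % 4 != 0
        return 1 if mover_wins == (state[1] == player) else -1


if __name__ == "__main__":
    results = [partial_playout((21, 0), Nim(), max_ply=6) for _ in range(1000)]
    print("mean WDL score:", sum(results) / len(results))
```

Averaging many such results gives the node value that UCT backs up. The quality of the cutoff evaluation matters more the shorter MAX_PLY is relative to the typical game length.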
Is the depth of expansion for the UCT tree too low (d = 3 in a 5 sec search)? Should I lower the UCTK parameter to 0.1 or so? That seems to give me a depth of 7 at the start position of 9x9 Go. I am confident my implementation is correct because it is working quite well in my checkers program, despite my expectations.

thanks
Daniel

On Sat, Mar 26, 2011 at 7:54 AM, Erik van der Werf <[email protected]> wrote:
> It sounds like you're using a classical (deterministic) evaluation
> function.
> Try combining UCT with Monte Carlo evaluation.
>
> Erik
>
>
> On Sat, Mar 26, 2011 at 12:43 PM, Daniel Shawul <[email protected]> wrote:
> > Hello,
> > I am very new to UCT, just implemented basic UCT for Go yesterday.
> > But with no success so far for Go, I think mostly because it does not
> > search very deep (depth = 3 on a 5 sec search with those values).
> > I am using the following values as UCT parameters:
> > UCTK = sqrt(1/5) ≈ 0.45
> > UCTN = 10 (visits after which the best move is expanded)
> > Even if I lower UCTK down to 7 I get a maximum depth of d=7 at the start
> > position for a 5 sec search.
> > For how deep a search should I tune these parameters?
> > Before UCT, I had an alpha-beta searcher which sometimes plays on CGOS.
> > It reached a level of ~1500, and this engine seems to be too strong for
> > the UCT version.
> > It just gets outsearched in some tactical positions and also in
> > evaluation, I think.
> > For example, I have an evaluation term which gives big bonuses for
> > connected strings, which seems to give an edge in a lot of games.
> > How do you introduce such eval terms in UCT?
> > But for my checkers program, to my big surprise, UCT made a significant
> > impact. The regular alpha-beta searcher averages a depth of 25, but the
> > UCT version is, I think, equally strong from the games I saw. That was a
> > kind of surprise for me because I thought UCT would work better for
> > bushy trees and when the eval has a lot of strategy. It also reached
> > good depths, averaging 16 plies.
> > My checkers eval had only material in it, so I don't know if UCT is
> > bringing strategy (distant information) to the game which the other one
> > doesn't have. The games are not really played out to the end, but rather
> > to MAX_PLY = 96, after which the material is counted and a WDL score is
> > assigned (I call it a partial playout).
> > Also, the fact that captures are forced seems to help a lot because it
> > doesn't make too many mistakes.
> > I also found some positions where it encounters problems similar to
> > ladders in Go. But in the checkers case, these problems are still solved
> > correctly. The only problem is that it doesn't report correct-looking
> > winning rates.
> > For example, take a position with two kings where one of the kings is
> > chasing the other to the sides to mate it, but the losing king can draw
> > by making a series of correct moves to get itself to one of the safe
> > corners. The program displays winning rates of 0.01 (when it should have
> > been more like 0.5), but it still manages the draw!
> > thanks and apologies for the verbose email
> > Daniel
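For readers following the UCTK / UCTN discussion in this thread, here is a rough sketch of how the two parameters usually enter a UCT search. It assumes the common UCB1 form (mean + UCTK * sqrt(ln(parent visits) / child visits)) and a visit-count threshold for expansion; the exact formula used in the poster's engine is not given in the thread, and `Node`, `uct_value`, `select_child` and `should_expand` are hypothetical names.

```python
import math

UCTK = math.sqrt(1 / 5)  # exploration constant from the post, about 0.45
UCTN = 10                # visits before a leaf is expanded, from the post


class Node:
    """Minimal tree node; wins are counted from the parent's point of view."""

    def __init__(self, move=None):
        self.move = move
        self.visits = 0
        self.wins = 0.0
        self.children = []


def uct_value(parent, child, k=UCTK):
    """UCB1 score: average result plus an exploration bonus scaled by k."""
    if child.visits == 0:
        return math.inf  # always try unvisited children first
    mean = child.wins / child.visits
    return mean + k * math.sqrt(math.log(parent.visits) / child.visits)


def select_child(node, k=UCTK):
    """Descend to the child with the highest UCB1 score."""
    return max(node.children, key=lambda c: uct_value(node, c, k))


def should_expand(node, n=UCTN):
    """Expand a leaf only once it has collected at least n visits."""
    return not node.children and node.visits >= n


if __name__ == "__main__":
    parent = Node()
    parent.visits = 100
    a, b = Node("a"), Node("b")
    a.visits, a.wins = 90, 54.0  # well-explored move, 60% wins
    b.visits, b.wins = 10, 5.0   # rarely tried move, 50% wins
    parent.children = [a, b]
    print(select_child(parent, k=UCTK).move)  # larger k favours exploring "b"
    print(select_child(parent, k=0.1).move)   # smaller k keeps exploiting "a"
```

Under this formula a smaller UCTK concentrates visits on the current best line, which is consistent with the deeper tree reported for UCTK = 0.1. A smaller UCTN likewise deepens the tree, at the cost of expanding nodes on noisier statistics.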
_______________________________________________
Computer-go mailing list
[email protected]
http://dvandva.org/cgi-bin/mailman/listinfo/computer-go
