There are some pure-UCT engines with light playouts available for comparison. Someone is likely willing to run their bot on CGOS for comparison, but offline tests will give you meaningful results much more quickly.
Sent from my iPhone

On Mar 26, 2011, at 11:49 AM, Daniel Shawul <[email protected]> wrote:

> I am using 400 now and I almost never get intermediate cuts, i.e. it plays out
> to the end. But still the shallow tactical search depth (depth = 2) is not
> helping it. Is that really what I should expect at the start position? No
> matter what heavy playout scheme I use, if it can't get deeper I don't see how
> it is going to beat the alpha-beta version. Please, I need an answer on this.
> I still have to add code to avoid filling eyes and to bias the playouts.
>
> Here is the log for the start position: pps = playouts per second, visits =
> total playouts, nodes = total nodes in the dynamic tree. Do you see anything
> suspicious about it? (Columns: move, winning rate, wins, visits.)
>
> [st = 2813ms, mt = 59250ms, moves_left 40]
> Tree : nodes 59341 depth 2 pps 9747 visits 27419
> @@@@ 0.41 58 143
> g5 0.42 76 181
> h5 0.45 146 325
> i5 0.36 26 72
> a6 0.44 122 275
> b6 0.46 164 361
> c6 0.48 471 972
> d6 0.47 283 601
> e6 0.48 430 891
> f6 0.47 292 618
> g6 0.49 584 1195
> h6 0.47 289 612
> i6 0.40 53 131
> a7 0.43 81 191
> b7 0.35 21 62
> c7 0.48 328 689
> d7 0.46 162 356
> e7 0.48 390 812
> f7 0.41 57 139
> g7 0.46 177 388
> h7 0.36 25 70
> i7 0.40 43 110
> a8 0.35 22 63
> b8 0.36 25 70
> c8 0.49 528 1085
> d8 0.00 0 8
> e8 0.47 306 646
> f8 0.39 44 112
> g8 0.44 116 263
> h8 0.44 106 242
> i8 0.42 75 178
> a9 0.36 25 70
> b9 0.38 31 84
> c9 0.14 2 14
> d9 0.39 39 102
> e9 0.40 50 124
> f9 0.38 32 85
> g9 0.42 79 186
> h9 0.39 43 109
> i9 0.21 4 21
> e5 0.48 412 856
> d5 0.46 172 378
> c5 0.49 708 1438 -> Best move played winning rate = 0.49
> visits = 1438 wins = 708
> b5 0.38 32 86
> a5 0.46 188 410
> i4 0.38 32 85
> h4 0.31 13 43
> g4 0.48 444 919
> f4 0.48 452 936
> e4 0.49 642 1309
> d4 0.48 396 824
> c4 0.34 19 57
> b4 0.46 163 358
> a4 0.46 188 409
> i3 0.46 207 447
> h3 0.29 11 39
> g3 0.43 86 200
> f3 0.41 56 138
> e3 0.47 280 595
> d3 0.47 283 600
> c3 0.46 188 409
> b3 0.47 231 497
> a3 0.46 167 367
> i2 0.37 28 76
> h2 0.44 104 239
> g2 0.45 155 342
> f2 0.39 39 102
> e2 0.27 9 33
> d2 0.25 6 26
> c2 0.43 89 207
> b2 0.43 94 217
> a2 0.43 81 191
> i1 0.41 54 133
> h1 0.14 2 14
> g1 0.42 63 153
> f1 0.35 21 62
> e1 0.46 187 407
> d1 0.48 314 662
> c1 0.46 174 381
> b1 0.36 25 70
> a1 0.32 15 48
>
> On Sat, Mar 26, 2011 at 11:33 AM, Erik van der Werf <[email protected]> wrote:
> I agree, if you use a hard limit it should be much higher (probably something
> like twice the board surface is OK).
>
> 110 moves is just the observed average playout length for the empty 9x9 board.
> With smarter playouts that average tends to become lower, but the distribution
> may still have a long tail.
>
> Erik
>
> On Sat, Mar 26, 2011 at 3:36 PM, Rémi Coulom <[email protected]> wrote:
> > I'd recommend more than 110. Maybe 200 is better. In Crazy Stone I use no
> > limit, and test for superko.
> >
> > Rémi
> >
> > On 26 mars 2011, at 15:32, Daniel Shawul wrote:
> >
> >> Sorry, 81 moves was a bad estimate on my part; I am actually using 96. I
> >> will change that to 110 or more and see what effect it has. I will also
> >> take Rémi's suggestion to bias the move-selection process. For the
> >> alpha-beta program I have a decent move-ordering algorithm and qsearch;
> >> I guess I can borrow some of that.
> >>
> >> In the meantime, I found a paper using UCT for Chinese checkers and other
> >> games,
> >> http://www.google.com/url?sa=t&source=web&cd=7&ved=0CD4QFjAG&url=http%3A%2F%2Fweb.cs.du.edu%2F~sturtevant%2Fpapers%2Fmpuct_icga.pdf&rct=j&q=UCT%20for%20checkers&ei=mPiNTcqdBsSO0QG20-nWCw&usg=AFQjCNGFgMMMG8xMtawvx-3rQtwXPhfWxQ&cad=rja
> >> and also some fun Java programs using UCT for checkers. It seems UCT is
> >> indeed competitive in checkers; I must say I didn't expect that at all.
> >> I think the forced nature of captures helps to improve the tactical
> >> awareness of the MC simulations. Is that so?
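To make the playout-length discussion above concrete: a playout loop with a hard move cap (Erik's "twice the board surface") and a whole-board-repetition (superko) test, roughly as Rémi describes, might look like the sketch below. Everything here, including the `ToyPosition` stand-in for a real Go board, is illustrative and is not code from any engine in this thread.

```python
import random

MAX_PLAYOUT_MOVES = 2 * 9 * 9   # "twice the board surface" for 9x9

class ToyPosition:
    """Minimal stand-in for a board: a running total; game ends at >= 10."""
    def __init__(self, total=0):
        self.total = total
    def legal_moves(self):
        return [] if self.total >= 10 else [1, 2, 3]
    def play(self, m):
        return ToyPosition(self.total + m)
    def hash(self):
        return self.total
    def score(self):
        # Arbitrary toy scoring: 1.0 for a "win", 0.0 for a "loss".
        return 1.0 if self.total % 2 == 0 else 0.0

def random_playout(pos, rng):
    seen = {pos.hash()}                    # positions seen, for superko
    for _ in range(MAX_PLAYOUT_MOVES):
        moves = pos.legal_moves()
        if not moves:                      # terminal: game over
            break
        pos = pos.play(rng.choice(moves))
        h = pos.hash()
        if h in seen:                      # superko: repeated position
            break
        seen.add(h)
    return pos.score()                     # terminal or capped: score it

result = random_playout(ToyPosition(), random.Random(42))
```

The cap only matters when playouts cycle or drag on; a real engine would also score a capped playout with territory counting rather than the toy rule used here.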
> >> On Sat, Mar 26, 2011 at 8:52 AM, Erik van der Werf <[email protected]> wrote:
> >> Ah, OK, I misunderstood.
> >>
> >> Still, something seems to be wrong. On the empty 9x9 board I think most
> >> programs with random/light playouts play on the order of 110 moves.
> >> ~81 moves seems quite low; in my experience you can only get such low
> >> numbers to work well if you have a lot of knowledge in your playouts.
> >> Did you check the quality of the evaluations/playouts?
> >>
> >> If you want UCT to search deeper you need good priors and perhaps
> >> something like RAVE/AMAF.
> >>
> >> Best,
> >> Erik
> >>
> >> On Sat, Mar 26, 2011 at 1:13 PM, Daniel Shawul <[email protected]> wrote:
> >> > Hello,
> >> > I am using Monte Carlo playouts for the UCT method. It can do about
> >> > 10k/sec. The UCT tree is expanded to a depth of d = 3 in a 5-second
> >> > search; from then onwards a random playout (with no bias) is carried
> >> > out. Actually it is a 'partial playout' which doesn't go to the end
> >> > of the game, but rather up to a depth of MAX_PLY = 96. If the game
> >> > has ended earlier, a win/draw/loss is returned; otherwise I forcefully
> >> > end the game with a deterministic eval and assign a WDL. For 9x9 Go,
> >> > most random playouts actually end before move 81.
> >> > For the alpha-beta searcher I do classical evaluation. With heavy use
> >> > of reductions I can get a depth of 14 half-plies, which seems to give
> >> > it quite an edge over the UCT version.
> >> > Is the depth of expansion for the UCT tree too low (d = 3 in a
> >> > 5-second search)? Should I lower the UCTK parameter to 0.1 or so,
> >> > which seems to give me depth = 7 at the start position of 9x9 Go?
> >> > I am confident my implementation is correct, because it is working
> >> > quite well in my checkers program, against my expectation.
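Daniel mentions he still has to add code to avoid filling eyes, and Erik's point about knowledge in the playouts points the same way. A common minimal filter is the single-point "true eye" test: never play on an empty point all of whose neighbours are friendly stones or the board edge (a full true-eye test also checks diagonals, which this sketch omits). The board layout and all names below are illustrative, not from any engine discussed here.

```python
EMPTY, BLACK, WHITE, BORDER = 0, 1, 2, 3
N = 9
W = N + 2                       # padded row width: border ring around the board

def neighbours(p):
    return (p - W, p - 1, p + 1, p + W)

def is_single_point_eye(board, p, colour):
    """True if p is empty and every neighbour is `colour` or the border.
    (Diagonal checks for false eyes are omitted in this sketch.)"""
    if board[p] != EMPTY:
        return False
    return all(board[q] in (colour, BORDER) for q in neighbours(p))

# Demo: a padded 9x9 board with a black eye in a corner.
board = [BORDER] * (W * W)
for r in range(1, N + 1):
    for c in range(1, N + 1):
        board[r * W + c] = EMPTY
p = 1 * W + 1                   # a corner point of the playable area
for q in neighbours(p):
    if board[q] == EMPTY:
        board[q] = BLACK        # surround p with black stones
```

In the playout move generator one would simply skip any move for which `is_single_point_eye` is true for the side to move; that alone stops random playouts from destroying their own live groups.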
> >> > thanks
> >> > Daniel
> >> >
> >> > On Sat, Mar 26, 2011 at 7:54 AM, Erik van der Werf <[email protected]> wrote:
> >> >>
> >> >> It sounds like you're using a classical (deterministic) evaluation
> >> >> function. Try combining UCT with Monte Carlo evaluation.
> >> >>
> >> >> Erik
> >> >>
> >> >> On Sat, Mar 26, 2011 at 12:43 PM, Daniel Shawul <[email protected]> wrote:
> >> >> > Hello,
> >> >> > I am very new to UCT; I just implemented basic UCT for Go
> >> >> > yesterday, but with no success so far, I think mostly because it
> >> >> > doesn't search very deep (depth = 3 on a 5-second search with
> >> >> > these values). I am using the following UCT parameters:
> >> >> > UCTK = sqrt(1/5) = 0.44, UCTN = 10 (visits after which the best
> >> >> > move is expanded).
> >> >> > Even if I lower UCTK down to 7 I get a maximum depth of d = 7 at
> >> >> > the start position for a 5-second search. For how deep a search
> >> >> > should I tune these parameters?
> >> >> > Before UCT, I had an alpha-beta searcher which sometimes plays on
> >> >> > CGOS. It reached a level of ~1500, and this engine seems to be too
> >> >> > strong for the UCT version. It just gets outsearched in some
> >> >> > tactical positions, and also in evaluation, I think.
> >> >> > For example, I have an evaluation term which gives big bonuses for
> >> >> > connected strings, which seems to give an edge in a lot of games.
> >> >> > How do you introduce such eval terms in UCT?
> >> >> > But for my checkers program, to my big surprise, UCT made a
> >> >> > significant impact. The regular alpha-beta searcher averages
> >> >> > depth = 25, but the UCT version, I think, is equally strong from
> >> >> > the games I saw.
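For reference, the UCTK parameter discussed here is the exploration constant in the standard UCB1 selection rule, winrate + UCTK * sqrt(ln(parent_visits) / child_visits). A small generic sketch (these are not Daniel's names or code) shows why lowering UCTK deepens the tree: with a small constant the current best child keeps getting revisited, so its subtree grows deep instead of the root growing wide.

```python
import math

def ucb1(child_wins, child_visits, parent_visits, uctk):
    if child_visits == 0:
        return float("inf")              # try unvisited children first
    winrate = child_wins / child_visits
    return winrate + uctk * math.sqrt(math.log(parent_visits) / child_visits)

def select_child(children, parent_visits, uctk):
    """children: list of (wins, visits) pairs; returns index of best child."""
    scores = [ucb1(w, v, parent_visits, uctk) for w, v in children]
    return max(range(len(children)), key=scores.__getitem__)

# Demo: a well-explored 0.60 child vs. a lightly visited 0.50 child.
children = [(60, 100), (5, 10)]
best_explore = select_child(children, parent_visits=110, uctk=0.44)  # explores
best_exploit = select_child(children, parent_visits=110, uctk=0.1)   # exploits
```

With UCTK = 0.44 the exploration bonus on the 10-visit child outweighs its lower winrate, so it is selected; with UCTK = 0.1 the leader keeps the visits, which is exactly the behaviour that makes the principal variation deeper at the cost of breadth.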
> >> >> > That was a kind of surprise for me, because I thought UCT would
> >> >> > work better for bushy trees and when the eval has a lot of
> >> >> > strategy. It also reached good depths, averaging 16 plies.
> >> >> > My checkers eval had only material in it, so I don't know whether
> >> >> > UCT is bringing strategy (distant information) to the game which
> >> >> > the other one doesn't have. The games are not really played out to
> >> >> > the end, but rather to MAX_PLY = 96, after which the material is
> >> >> > counted and a WDL score is assigned (I call it a partial playout).
> >> >> > Also, the fact that captures are forced seems to help a lot,
> >> >> > because it doesn't make too many mistakes.
> >> >> > I also found some positions where it encounters problems similar
> >> >> > to ladders in Go, but in the checkers case these problems are
> >> >> > still solved correctly. The only problem is that it doesn't report
> >> >> > correct-looking winning rates. For example, in a position with two
> >> >> > kings where one king chases the other to the side to mate it, but
> >> >> > the losing king can draw by making a series of correct moves to
> >> >> > reach one of the safe corners, the program displays winning rates
> >> >> > of 0.01 (when it should be more like 0.5) but still manages the
> >> >> > draw!
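One plausible explanation for the 0.01 winning rate in a position the program actually holds as a draw is that draws are being backed up as losses; the usual convention is win = 1, draw = 0.5, loss = 0, under which a reliably held draw reports roughly 0.5. This is speculation about Daniel's engine, and the sketch below is purely illustrative of that convention.

```python
WIN, DRAW, LOSS = 1, 0, -1          # illustrative WDL codes

def playout_value(wdl, draws_as_half=True):
    """Map a WDL playout result to the value backed up the tree."""
    if wdl == WIN:
        return 1.0
    if wdl == DRAW:
        return 0.5 if draws_as_half else 0.0
    return 0.0

def mean_value(results, draws_as_half=True):
    vals = [playout_value(r, draws_as_half) for r in results]
    return sum(vals) / len(vals)

# Demo: 99 held draws and one lucky win.  Scoring draws as 0.5 reports
# ~0.5, as expected; scoring them as losses collapses the rate to 0.01,
# matching the symptom described above.
results = [DRAW] * 99 + [WIN]
rate_half = mean_value(results)                      # ~0.505
rate_zero = mean_value(results, draws_as_half=False) # 0.01
```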
> >> >> > thanks, and apologies for the verbose email
> >> >> > Daniel
_______________________________________________
Computer-go mailing list
[email protected]
http://dvandva.org/cgi-bin/mailman/listinfo/computer-go
