You allow eye-filling in the playouts? Or anywhere? It should never be allowed. Top priority.
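
To make that concrete, here is a rough sketch of the usual one-point-eye test, in Python for brevity. It assumes a 9x9 board stored as a list of lists; the names (is_eye_like, playout_candidates) are made up for illustration and are not from any particular engine. The rule used is the common heuristic: every orthogonal neighbour is a friendly stone, with at most one enemy stone on the diagonals (none if the point is on the edge or in a corner).

```python
EMPTY, BLACK, WHITE = 0, 1, 2
SIZE = 9  # 9x9 board, as in the thread

def on_board(r, c):
    return 0 <= r < SIZE and 0 <= c < SIZE

def is_eye_like(board, r, c, color):
    """True if (r, c) is an empty point that looks like a one-point eye of `color`."""
    if board[r][c] != EMPTY:
        return False
    # Every on-board orthogonal neighbour must be a friendly stone.
    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        nr, nc = r + dr, c + dc
        if on_board(nr, nc) and board[nr][nc] != color:
            return False
    # Count enemy stones and off-board points on the diagonals.
    opponent = BLACK if color == WHITE else WHITE
    enemy_diag = 0
    off_board_diag = 0
    for dr, dc in ((1, 1), (1, -1), (-1, 1), (-1, -1)):
        nr, nc = r + dr, c + dc
        if not on_board(nr, nc):
            off_board_diag += 1
        elif board[nr][nc] == opponent:
            enemy_diag += 1
    # Edge or corner: no enemy diagonal allowed. Interior: at most one.
    if off_board_diag > 0:
        return enemy_diag == 0
    return enemy_diag <= 1

def playout_candidates(board, color):
    """Empty points `color` may try in a playout, skipping its own eyes.
    Legality (suicide, ko) is NOT checked here; that is left to the caller."""
    return [(r, c)
            for r in range(SIZE)
            for c in range(SIZE)
            if board[r][c] == EMPTY and not is_eye_like(board, r, c, color)]
```

The playout would pick its random move from playout_candidates() instead of from all empty points, and pass when the list is empty. The same filter can be applied when expanding tree nodes.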
On Sat, Mar 26, 2011 at 11:49 AM, Daniel Shawul <[email protected]> wrote:
> I am using 400 now and I almost never get intermediate cuts, i.e. it
> plays out to the end.
> But still the shallow tactical search depth (depth = 2) is not helping it.
> Is that really what I should expect to get at the start position? I mean,
> no matter what heavy playout scheme I use, if it can't get deeper, I don't
> see how it is going to beat it.
> Please, I need an answer on this. I still have to add code to avoid
> filling eyes and bias the playouts.
>
> Here is the log file for the start position. pps = playouts per second,
> visits = total playouts, nodes = total nodes in the dynamic tree.
> Do you see anything suspicious about it?
>
> [st = 2813ms, mt = 59250ms , moves_left 40]
> Tree : nodes 59341 depth 2 pps 9747 visits 27419
> @@@@ 0.41 58 143
> g5 0.42 76 181
> h5 0.45 146 325
> i5 0.36 26 72
> a6 0.44 122 275
> b6 0.46 164 361
> c6 0.48 471 972
> d6 0.47 283 601
> e6 0.48 430 891
> f6 0.47 292 618
> g6 0.49 584 1195
> h6 0.47 289 612
> i6 0.40 53 131
> a7 0.43 81 191
> b7 0.35 21 62
> c7 0.48 328 689
> d7 0.46 162 356
> e7 0.48 390 812
> f7 0.41 57 139
> g7 0.46 177 388
> h7 0.36 25 70
> i7 0.40 43 110
> a8 0.35 22 63
> b8 0.36 25 70
> c8 0.49 528 1085
> d8 0.00 0 8
> e8 0.47 306 646
> f8 0.39 44 112
> g8 0.44 116 263
> h8 0.44 106 242
> i8 0.42 75 178
> a9 0.36 25 70
> b9 0.38 31 84
> c9 0.14 2 14
> d9 0.39 39 102
> e9 0.40 50 124
> f9 0.38 32 85
> g9 0.42 79 186
> h9 0.39 43 109
> i9 0.21 4 21
> e5 0.48 412 856
> d5 0.46 172 378
> c5 0.49 708 1438 -> Best move played winning rate = 0.49 visits = 1438 wins = 708
> b5 0.38 32 86
> a5 0.46 188 410
> i4 0.38 32 85
> h4 0.31 13 43
> g4 0.48 444 919
> f4 0.48 452 936
> e4 0.49 642 1309
> d4 0.48 396 824
> c4 0.34 19 57
> b4 0.46 163 358
> a4 0.46 188 409
> i3 0.46 207 447
> h3 0.29 11 39
> g3 0.43 86 200
> f3 0.41 56 138
> e3 0.47 280 595
> d3 0.47 283 600
> c3 0.46 188 409
> b3 0.47 231 497
> a3 0.46 167 367
> i2 0.37 28 76
> h2 0.44 104 239
> g2 0.45 155 342
> f2 0.39 39 102
> e2 0.27 9 33
> d2 0.25 6 26
> c2 0.43 89 207
> b2 0.43 94 217
> a2 0.43 81 191
> i1 0.41 54 133
> h1 0.14 2 14
> g1 0.42 63 153
> f1 0.35 21 62
> e1 0.46 187 407
> d1 0.48 314 662
> c1 0.46 174 381
> b1 0.36 25 70
> a1 0.32 15 48
>
> On Sat, Mar 26, 2011 at 11:33 AM, Erik van der Werf
> <[email protected]> wrote:
>>
>> I agree, if you use a hard limit it should be much higher (probably
>> something like twice the board surface is ok).
>>
>> 110 moves is just an observation of the average playout length for the
>> empty 9x9 board. With smarter playouts that average tends to become
>> lower, but the distribution may still have a long tail.
>>
>> Erik
>>
>>
>> On Sat, Mar 26, 2011 at 3:36 PM, Rémi Coulom <[email protected]> wrote:
>> > I'd recommend more than 110. Maybe 200 is better. In Crazy Stone, I use
>> > no limit, and test for superko.
>> >
>> > Rémi
>> >
>> > On 26 March 2011, at 15:32, Daniel Shawul wrote:
>> >
>> >> Sorry, 81 moves was a bad estimate by me. I am actually using 96 moves.
>> >> I will change that to 110 or above moves and see what effect it has.
>> >> Also I would take Remi's suggestion, i.e. to bias the move selection
>> >> process.
>> >> For the alpha-beta program, I have a decent move ordering algorithm
>> >> and qsearch. I guess I can borrow some from that.
>> >>
>> >> In the meantime, I found a paper using UCT for Chinese checkers and
>> >> other games
>> >> http://www.google.com/url?sa=t&source=web&cd=7&ved=0CD4QFjAG&url=http%3A%2F%2Fweb.cs.du.edu%2F~sturtevant%2Fpapers%2Fmpuct_icga.pdf&rct=j&q=UCT%20for%20checkers&ei=mPiNTcqdBsSO0QG20-nWCw&usg=AFQjCNGFgMMMG8xMtawvx-3rQtwXPhfWxQ&cad=rja,
>> >> and also some fun Java programs using UCT for checkers. It seems UCT
>> >> is indeed competitive in checkers.
>> >> I must say I didn't expect that at all. I think the forced nature of
>> >> captures helps to improve tactical awareness of the MC simulations.
>> >> Is that so?
>> >>
>> >> On Sat, Mar 26, 2011 at 8:52 AM, Erik van der Werf
>> >> <[email protected]> wrote:
>> >> Ah ok, I misunderstood.
>> >>
>> >> Still something seems to be wrong. On the empty 9x9 board I think most
>> >> programs with random/light playouts play in the order of 110 moves.
>> >> ~81 moves seems quite low; in my experience you can only get such low
>> >> numbers to work well if you have a lot of knowledge in your playouts.
>> >> Did you check the quality of the evaluations/playouts?
>> >>
>> >> If you want UCT to search deeper you need good priors and perhaps
>> >> something like rave/amaf.
>> >>
>> >> Best,
>> >> Erik
>> >>
>> >>
>> >> On Sat, Mar 26, 2011 at 1:13 PM, Daniel Shawul <[email protected]>
>> >> wrote:
>> >> > Hello,
>> >> > I am using Monte Carlo playouts for the UCT method. It can do about
>> >> > 10k/sec.
>> >> > The UCT tree is expanded to a depth of d = 3 in a 5 sec search; from
>> >> > then onwards a random playout (with no bias) is carried out.
>> >> > Actually it is a 'partial playout' which doesn't go to the end of
>> >> > the game, but rather up to a depth of MAX_PLY=96.
>> >> > If the game has ended earlier, then a win/draw/loss is returned.
>> >> > Otherwise I forcefully end the game by using a deterministic eval
>> >> > and assign a WDL. For 9x9 go actually most of the random playouts
>> >> > end before move 81.
>> >> > For the alpha-beta searcher, I do classical evaluation. With heavy
>> >> > use of reductions I can get a depth of 14 half plies, which seems
>> >> > to give it quite an edge against the UCT version.
>> >> > Is the depth of expansion for the UCT tree too low? (d = 3 in a
>> >> > 5 sec search). Should I lower the UCTK parameter to 0.1 or so, which
>> >> > seems to give me a depth = 7 at the start position of a 9x9 go?
>> >> > I am confident my implementation is correct because it is working
>> >> > quite well in my checkers program despite my expectation.
>> >> > thanks
>> >> > Daniel
>> >> >
>> >> > On Sat, Mar 26, 2011 at 7:54 AM, Erik van der Werf
>> >> > <[email protected]> wrote:
>> >> >>
>> >> >> It sounds like you're using a classical (deterministic) evaluation
>> >> >> function.
>> >> >> Try combining UCT with Monte Carlo evaluation.
>> >> >>
>> >> >> Erik
>> >> >>
>> >> >>
>> >> >> On Sat, Mar 26, 2011 at 12:43 PM, Daniel Shawul <[email protected]>
>> >> >> wrote:
>> >> >> > Hello,
>> >> >> > I am very new to UCT, just implemented basic UCT for go yesterday.
>> >> >> > But with no success so far for go, I think mostly because it does
>> >> >> > not search very deep (depth = 3 on a 5 sec search with those
>> >> >> > values).
>> >> >> > I am using the following values as UCT parameters:
>> >> >> > UCTK = sqrt(1/5) = 0.44
>> >> >> > UCTN = 10 (visits after which the best move is expanded)
>> >> >> > Even if I lower UCTK down to 7 I get a maximum depth of d=7 at the
>> >> >> > start position for a 5 sec search.
>> >> >> > How deep a search should I tune these parameters for?
>> >> >> > Before UCT, I have an alpha-beta searcher which sometimes plays on
>> >> >> > CGOS.
>> >> >> > It reached a level of ~1500, and this engine seems to be too
>> >> >> > strong for the UCT version.
>> >> >> > It just gets outsearched in some tactical positions and also in
>> >> >> > evaluation, I think.
>> >> >> > For example, I have an evaluation term which gives big bonuses for
>> >> >> > connected strings, which seems to give an edge in a lot of games.
>> >> >> > How do you introduce such eval terms in UCT?
>> >> >> > But for my checkers program, to my big surprise, UCT made a
>> >> >> > significant impact.
>> >> >> > The regular alpha-beta searcher averages a depth=25 but the UCT
>> >> >> > version I think is equally strong from the games I saw. That was
>> >> >> > a kind of surprise for me because I thought UCT would work better
>> >> >> > for bushy trees and when the eval has a lot of strategy. It also
>> >> >> > reached good depths, averaging 16 plies.
>> >> >> > My checkers eval had only material in it, so I don't know if UCT
>> >> >> > is bringing strategy (distant information) to the game which the
>> >> >> > other one doesn't have. The games are not really played out to
>> >> >> > the end but rather to a MAX_PLY = 96, after which the material is
>> >> >> > counted and a WDL score is assigned (I call it partial playout).
>> >> >> > Also the fact that captures are forced seems to help a lot
>> >> >> > because it doesn't make too many mistakes.
>> >> >> > I also found some positions where it encounters similar problems
>> >> >> > as ladders in go. But in the checkers case, these problems are
>> >> >> > still solved correctly. The only problem is that it doesn't
>> >> >> > report correct-looking winning rates.
>> >> >> > For example, in a position with two kings where one of the kings
>> >> >> > is chasing the other to the sides to mate it, but the losing king
>> >> >> > can draw by making a series of correct moves to get itself to one
>> >> >> > of the safe corners, the program displays winning rates of 0.01
>> >> >> > (when it should have been more like 0.5) but it still manages the
>> >> >> > draw!
>> >> >> > thanks and apologies for the verbose email
>> >> >> > Daniel
>> >> >> > _______________________________________________
>> >> >> > Computer-go mailing list
>> >> >> > [email protected]
>> >> >> > http://dvandva.org/cgi-bin/mailman/listinfo/computer-go
