Hello Alexander,

Your idea is interesting. If I understand correctly, it is similar to
Fuego's "weight RAVE updates"; see
http://fuego.sourceforge.net/fuego-doc-1.1/smartgame-doc/classSgUctSearch.html#a158df0720f5b23d7a2cb7381c1355214

    Weight RAVE updates. Gives more weight to moves that are closer to
    the position for which the RAVE statistics are stored. The weighting
    function is linearly decreasing from 2 to 0 with the move number
    from the position for which the RAVE statistics are stored to the
    end of the simulated game.

Regards,
Aja

2013/3/31 Alexander Kozlovsky <[email protected]>

> Hi!
> I have an idea how to improve RAVE, but it is still rough.
> So, I want to describe it here in the hope that it leads to some
> interesting discussion. I hope my not-so-good English
> allows me to describe the idea adequately.
>
> If you don't want to read all the details, you can first scroll down
> to "Use cases" to read when this improvement may be useful.
>
>
> --- Current RAVE statistics implementation ---
>
> Let's say we want to accumulate RAVE statistics. Let's say
> we have four arrays for this: black_rave_total[], black_rave_wins[],
> white_rave_total[], white_rave_wins[].
>
> We increment black_rave_total[intersection] += 1
> if, during the last simulation, black was the first to play on this
> intersection.
>
> We increment black_rave_wins[intersection] += 1
> if, during the last simulation, black was the first to play on this
> intersection, and the simulation result is "black win".
>
> --- New proposal ---
>
> What if we add two arrays: black_rave_win_move_sum[]
> and white_rave_win_move_sum[].
> These arrays will accumulate the sum of move numbers for simulations
> where black was the first to play on the intersection and black won.
>
> Concrete example:
>
> Let's say we have already done ten simulations for the current node.
> In six simulations, black was the first to play on the B4 intersection.
> In three of these simulations black won.
>
> In the first of these three simulations, the move number when black
> played on B4 was 20 (this move number is counted from the
> start of the random simulation, not from the start of the game).
>
> In the second of the three simulations, the move number for B4 was 25.
> In the third simulation where black played on B4 and won,
> the move number was 72.
>
> In this case, black_rave_win_move_sum for B4 will be
> 20 + 25 + 72 = 117
>
> This number allows us to calculate the average move number
> for B4 when the simulation result was successful for black:
> 117 / 3 = 39.
>
> I denote this as black_rave_avg_win_move_num:
> black_rave_avg_win_move_num[pos] =
>     black_rave_win_move_sum[pos] / black_rave_wins[pos]
>
> In current RAVE, we use the winrate to determine the "best" move:
> black_rave_winrate[pos] = black_rave_wins[pos] / black_rave_total[pos]
>
> I propose to use a "weighted winrate" instead:
> black_weighted_rave_winrate[pos] =
>     black_rave_winrate[pos] / black_rave_avg_win_move_num[pos]
>
> In the current example, the winrate for B4 is 3/6 = 0.5 and the
> weighted winrate will be (3/6) / (117/3) = 0.0128205
>
> The weighted winrate will be bigger for successful moves which must be
> played earlier during the simulation. Good endgame moves will have a
> low weighted winrate.
>
>
> --- Use cases ---
>
> 1) Let's say we have two moves with a good RAVE winrate: E5 and A4.
> A4 has the bigger winrate, because A4 is inside safe territory, and each
> successful simulation contains A4. E5 is critical, and must be played
> very early for the result to be successful. Each simulation with E5 also
> contains A4, but some simulations without E5 were also successful
> because of dumb opponent play during the simulation.
>
> So, A4 has the bigger RAVE winrate. But E5 has the bigger
> weighted winrate, because A4 can be played at any time during
> the simulation, and E5 must be played early, or it will be useless.
>
> Using the weighted RAVE winrate, we can determine that
> E5 is more important than A4, despite the fact that A4 has the bigger
> RAVE winrate.
>
> 2) Let's say black must make three moves during the simulation
> in order to win: B2, B3, C3, exactly in this order. Without
> these moves black cannot win the simulation.
>
> All of these moves have the same winrate, because the
> simulation is successful for black only if all three moves
> are played during the simulation.
>
> So, if we use the simple RAVE winrate, we can have problems
> determining the correct move order.
>
> But B2 has a bigger weighted winrate than B3 and C3
> (and B3 has a bigger weighted winrate than C3), because
> in all successful simulations B2 was played before B3,
> and hence the average move number for B2 is strictly less
> than the average move number for B3 and C3. So, when using
> the weighted winrate, we can determine the correct move order.
>
>
> What do you think, am I missing something?
>
> _______________________________________________
> Computer-go mailing list
> [email protected]
> http://dvandva.org/cgi-bin/mailman/listinfo/computer-go
>
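
To make the arithmetic in the quoted proposal concrete, here is a
minimal Python sketch of the two weightings discussed in this thread.
The function names and signatures are hypothetical, chosen only for
illustration. linear_rave_weight follows the wording of the Fuego
documentation quoted at the top (a linear fall from 2 to 0) and is
not taken from Fuego's source; weighted_winrate follows the formula
in the quoted proposal and assumes move numbers start at 1.

```python
def linear_rave_weight(offset, remaining):
    """Weight of a RAVE update, per the quoted Fuego description.

    offset:    number of moves between the node's position and the move
    remaining: number of moves from the node's position to the end of
               the simulated game
    Falls linearly from 2 (offset == 0) to 0 (offset == remaining).
    """
    if remaining <= 0:
        return 2.0  # assumption: no moves left, so treat the move as "closest"
    return 2.0 * (1.0 - offset / remaining)


def weighted_winrate(wins, total, win_move_sum):
    """Proposed statistic: RAVE winrate divided by the average move
    number at which the move was played in winning simulations."""
    if total == 0 or wins == 0 or win_move_sum == 0:
        return 0.0  # assumption: no data yet means no preference
    winrate = wins / total
    avg_win_move_num = win_move_sum / wins
    return winrate / avg_win_move_num


# B4 from the quoted example: 6 simulations, 3 wins, moves 20, 25 and 72.
b4 = weighted_winrate(wins=3, total=6, win_move_sum=20 + 25 + 72)
print(round(b4, 7))  # (3/6) / (117/3)
```

The first function scales each individual update by how far the move
is from the node, while the second keeps the ordinary counts and
rescales the final winrate by one average. Both favour moves played
early in the simulation, which is the similarity pointed out above.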
