Re: [computer-go] More UCT / Monte-Carlo questions (Effect of rave)

Erik van der Werf Thu, 07 Feb 2008 01:39:09 -0800

Hmm, sounds like I should experiment some more. Thanks!

Erik



On Thu, Feb 7, 2008 at 1:11 AM, Hideki Kato <[EMAIL PROTECTED]> wrote:
> Hi Erik,
>
>  My program is closely based on MoGo's report and the paper.  Yes, I
>  used an FPU of 1.15.
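
For readers unfamiliar with the term: with first-play urgency (FPU), an unvisited move gets a fixed urgency instead of the infinite one that forces every move to be tried once. Below is a minimal sketch of that selection rule, not GGMC's actual code. The value 1.15 is the one quoted above; the function names, the `(move, value_sum, visits)` tuple layout, and the plain UCB1 exploration term are assumptions made for illustration.

```python
import math

FPU = 1.15  # first-play urgency quoted in this thread


def ucb1(mean, visits, parent_visits):
    # Plain UCB1 urgency; the exploration constant is an assumption.
    return mean + math.sqrt(2.0 * math.log(parent_visits) / visits)


def select_child(children, parent_visits, fpu=FPU):
    """Return the move with the highest urgency.

    children: list of (move, value_sum, visits) tuples (hypothetical layout).
    Unvisited moves get the fixed urgency `fpu` instead of infinity.
    """
    def urgency(child):
        _, value_sum, visits = child
        if visits == 0:
            return fpu
        return ucb1(value_sum / visits, visits, parent_visits)

    return max(children, key=urgency)[0]
```

Passing `fpu=math.inf` gives the behaviour of always selecting unvisited moves first, which is the alternative asked about further down in the thread.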
>
>  -Hideki
>
>  Erik van der Werf: <[EMAIL PROTECTED]>:
>
>
> >Hi Hideki,
>  >
>  >Your results look similar to those of MoGo as reported in their ICML
>  >paper. When you ran this experiment, did you use anything like FPU or
>  >progressive widening, or did you use Levente's original design which
>  >always selects unvisited moves first?
>  >
>  >Regards,
>  >Erik
>  >
>  >
>  >On Wed, Feb 6, 2008 at 3:42 PM, Hideki Kato <[EMAIL PROTECTED]> wrote:
>  >> I found some data.  GGMC Go v2r6, against GNU Go 3.7.10 level 10, 9x9,
>  >>  komi 7.5, 3000 playouts/move, 2000 games match:
>  >>
>  >>  Without RAVE:   winning rate was 23.1 +- 0.9% (-209 +- 9 ELO)
>  >>  With RAVE:      winning rate was 65.3 +- 1.1% (+110 +- 8 ELO)
>  >>
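
The figures above are consistent with the usual logistic Elo model and a binomial standard error over 2000 games. The sketch below reproduces them under that assumption; the function names are illustrative, and the thread does not say how the numbers (in particular the Elo error bars) were actually computed.

```python
import math


def elo_from_win_rate(p):
    """Elo difference implied by win rate p under the logistic Elo model."""
    return -400.0 * math.log10(1.0 / p - 1.0)


def win_rate_std_error(p, games):
    """Binomial standard error of a win rate measured over `games` games."""
    return math.sqrt(p * (1.0 - p) / games)


for label, p in (("without RAVE", 0.231), ("with RAVE", 0.653)):
    elo = round(elo_from_win_rate(p))
    err = round(100.0 * win_rate_std_error(p, 2000), 1)
    print(f"{label}: {elo:+d} Elo, win rate +- {err}%")
```

This yields -209 and +110 Elo, with win-rate errors of 0.9% and 1.1%, matching the two lines quoted above.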
>  >>  Though this includes some other improvements, most come from RAVE.
>  >>  Unlike MoGo, my best 'K' was 1000.
>  >>
>  >>  Following is my implementation of RAVE for GGMC v2r6.
>  >>  1) Each playout returns the score and all moves played, with their
>  >>  colors.
>  >>  2) While back-propagating the value (digitized score), compute the
>  >>  mean and the variance according to UCB1, and do the same for RAVE
>  >>  separately.  For RAVE, the values of all (legal) moves in a node,
>  >>  except the one played, are updated.
>  >>  3) In the computation of values for RAVE, the point is that three
>  >>  colors appear (as someone, I think GCP, mentioned before).
>  >>  If the players' colors aren't the same then skip.  Count the value as
>  >>  is or negate it (1 - score, for me), depending on the color of the
>  >>  player at the position and the color for the score.
>  >>  4) Before back-propagating the value of each playout, I set up a color
>  >>  table for all intersections of the board (initialized with EMPTY),
>  >>  for speed-up.  That is, fill the table (table[move] = color) by
>  >>  tracing the moves and the colors returned by the playout forward
>  >>  (from the leaf node to the end of the game).  Then, by tracing the
>  >>  path from the root to the leaf node, clear table[move] (table[move] =
>  >>  EMPTY), in order to avoid duplicate counting with UCB1.
>  >>  5) While descending the tree, merge the values that come from UCB1
>  >>  and RAVE with 'K' according to the formula in the paper.
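
The five steps above can be sketched roughly as follows. This is a simplified reading of the description, not GGMC's source: the `Node` class, the function names, and the dict used as the color table are all hypothetical. It assumes the skip rule in step 3 compares the color of the playout move with the color of the player to move at the node, it tracks only means (no variance, no exploration term), and it assumes the merge formula in the paper is beta = sqrt(K / (3n + K)), with n the parent's visit count.

```python
import math

BLACK, WHITE = "B", "W"


class Node:
    def __init__(self, to_move):
        self.to_move = to_move   # color of the player to move at this node
        self.children = {}       # move -> Node
        self.visits = 0
        self.value_sum = 0.0     # UCB1 statistics, from the parent's view
        self.rave_visits = 0
        self.rave_sum = 0.0      # RAVE statistics, from the parent's view


def backpropagate(path, playout_moves, black_score):
    """path: [(node, move played from that node)] from the root to the leaf.
    playout_moves: [(move, color)] played after the leaf (step 1).
    black_score: 1.0 if Black won the playout, else 0.0.
    """
    # Step 4: color table filled from the playout; a missing key means EMPTY.
    table = {}
    for move, color in playout_moves:
        table[move] = color
    # Clear moves on the tree path so they are not counted twice.
    for _, move in path:
        table.pop(move, None)

    path[0][0].visits += 1  # the root has no parent to count its visit
    for node, played in path:
        # Step 3: score seen by the player to move here (negation is 1 - score).
        score = black_score if node.to_move == BLACK else 1.0 - black_score
        for move, child in node.children.items():
            if move == played:
                child.visits += 1            # step 2: ordinary UCB1 update
                child.value_sum += score
            elif table.get(move) == node.to_move:
                child.rave_visits += 1       # step 2: RAVE update
                child.rave_sum += score


def merged_value(child, parent_visits, k=1000.0):
    """Step 5: blend RAVE and UCB1 means (formula assumed, see lead-in)."""
    beta = math.sqrt(k / (3.0 * parent_visits + k))
    uct = child.value_sum / child.visits if child.visits else 0.0
    rave = child.rave_sum / child.rave_visits if child.rave_visits else 0.0
    return beta * rave + (1.0 - beta) * uct
```

With this form of beta, the RAVE and UCB1 means get equal weight once the parent has K visits, so K = 1000 keeps RAVE influential for longer than a small K would.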
>  >>
>  >>  #Though I'm writing this by reading my source code, this description
>  >>  may include some errors.
>  >>
>  >>  Hope this helps,
>  >>
>  >> Hideki
>  >>
>  >>  Gian-Carlo Pascutto: <[EMAIL PROTECTED]>:
>  >>
>  >>  >> I also implemented RAVE in Mango. There were a few points of
>  >>  >> improvement (around 60 Elo points with gnugo as reference), but not
>  >>  >> as much as in the paper of Gelly and Silver :( (around 250 Elo points
>  >>  >> if I remember well)
>  >>  >>
>  >>  >> It might be that the effect of RAVE depends a lot on the simulation
>  >>  >> strategy. Indeed, sometimes my RAVE was playing very good moves but
>  >>  >> also very bad ones.
>  >>  >
>  >>  >I don't think the simulation strategy is the key.
>  >>  >
>  >>  >I suspect the improvement is largest when you don't do progressive
>  >>  >widening.
>  >>  >
>  >>  >Nevertheless it would be quite interesting to see the implementation
>  >>  >details of ggmc's RAVE. RAVE performance is quite dependent on exact
>  >>  >implementation and parameters.
>  >>  --
>  >>
>  >> [EMAIL PROTECTED] (Kato)
>  >>  _______________________________________________
>  >>
>  >>
>  >> computer-go mailing list
>  >>  [email protected]
>  >>  http://www.computer-go.org/mailman/listinfo/computer-go/
>  >>
