On Thu, Oct 28, 2010 at 10:15:46AM +0200, Olivier Teytaud wrote:
> > From: Olivier Teytaud <[email protected]>
> > After the program plays all simulations, which move should it choose?
> >
> > (Wins/Visits) + SQRT(ln(...))
> > or
> > (Wins+Draw/2)/Visits + SQRT(ln(...))
> >
>
> Neither of these two formulas :-)
> These formulas are for choosing moves to be simulated. For turn-based games,
> when all simulations are finished, we should choose
>
> move = argmax_m number_of_simulations(m)
>
> or something like that (you can introduce a bias built from the success
> rate...).
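
The distinction quoted above, between the score used while searching and the
rule used for the final move, can be sketched roughly as follows. This is only
an illustration under assumptions: the names MoveStats, ucb_score and
most_visited, the constant c, and the sample numbers are all made up, and the
exploration term uses the usual UCB1 form because the quoted formula elides it.

```python
import math
from dataclasses import dataclass


@dataclass
class MoveStats:
    """Hypothetical per-move counters gathered during the search."""
    wins: int
    draws: int
    visits: int


def ucb_score(stats, total_visits, c=1.0):
    """Score used *during* the search to pick the next move to simulate.

    Draws count as half a win. The exploration term is the usual UCB1 form,
    which is an assumption: the quoted formula elides it.
    """
    if stats.visits == 0:
        return math.inf  # unvisited moves get tried first
    mean = (stats.wins + stats.draws / 2) / stats.visits
    return mean + c * math.sqrt(math.log(total_visits) / stats.visits)


def most_visited(table):
    """Final choice *after* the search: argmax_m number_of_simulations(m)."""
    return max(table, key=lambda m: table[m].visits)


# Made-up statistics for two candidate moves.
table = {
    "a": MoveStats(wins=60, draws=10, visits=100),
    "b": MoveStats(wins=9, draws=0, visits=10),
}
total = sum(s.visits for s in table.values())
next_to_simulate = max(table, key=lambda m: ucb_score(table[m], total))
final_move = most_visited(table)
```

With these numbers the search would still want to simulate "b" next, since it
has few visits and a high success rate, while the final choice by visit count
is "a".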

I have been thinking that it might make sense to take the move with the
highest *lower* confidence bound and play that move. In a way, that would be
the move we have least reason to believe will be a blunder.

But I am just a programmer, not a mathematician, so what do I know. If I had
more time and energy, I'd run many experiments. But at the moment, all I can
afford is to follow this list...

- Heikki

--
Heikki Levanto   "In Murphy We Turst"     heikki (at) lsd (dot) dk

_______________________________________________
Computer-go mailing list
[email protected]
http://dvandva.org/cgi-bin/mailman/listinfo/computer-go
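
A minimal sketch of the lower-confidence-bound idea from the message above.
The message does not say which bound to use, so the form below (win rate minus
a UCB1-style margin) is an assumption, as are the names lower_bound and
pick_by_lower_bound, the constant c, and the sample numbers.

```python
import math


def lower_bound(wins, visits, total_visits, c=1.0):
    """Win rate minus a UCB1-style margin (assumed form, not from the thread)."""
    if visits == 0:
        return -math.inf  # nothing known, so never chosen as the final move
    return wins / visits - c * math.sqrt(math.log(total_visits) / visits)


def pick_by_lower_bound(stats, c=1.0):
    """stats maps move -> (wins, visits); return the move with the highest lower bound."""
    total = sum(visits for _, visits in stats.values())
    return max(stats, key=lambda m: lower_bound(*stats[m], total, c))


# Made-up statistics: "b" has the better win rate but far fewer simulations.
stats = {"a": (60, 100), "b": (9, 10), "c": (0, 0)}
choice = pick_by_lower_bound(stats)
```

Here "a" is picked even though "b" has the higher raw win rate, because the
margin for "b" is much wider after only 10 simulations. A larger c makes the
choice more conservative, and c = 0 reduces it to picking the best win rate.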
