RE: [SPAM] Re: [SPAM] Re: [computer-go] Progressive widening vs unpruning
I tried this yesterday with K=10 and it seemed to make Many Faces weaker (84.2% +- 2.3 vs 81.6% +- 1.7); not 95% confidence, but likely weaker. This is 19x19 vs gnugo, with Many Faces using 8K playouts per move, 1000 games without the change and 2000 games with it. I have the UCT exploration term, so perhaps with exploration this idea doesn't work. Or perhaps the K I tried is too large.

David

From: computer-go-boun...@computer-go.org [mailto:computer-go-boun...@computer-go.org] On Behalf Of Olivier Teytaud
Sent: Saturday, October 03, 2009 1:28 AM
To: computer-go
Subject: Re: [SPAM] Re: [SPAM] Re: [computer-go] Progressive widening vs unpruning

>> 4) regularized success rate (nbWins + K) / (nbSims + 2K) (the original progressive bias is simpler than that)
>
> I'm not sure what you mean here. Can you explain a bit more?

Sorry for being unclear; I hope I'll do better below. Instead of just the number of wins divided by the number of simulations, we use (number of wins + K) divided by (number of simulations + 2K). This is similar to the even-game heuristic previously cited; it avoids a 0% success rate for a move tested just once.

If you apply UCT with a constant of zero in front of the sqrt(log(N)/N_i) term, then such a regularization is necessary for showing consistency of UCT for two-player games; and even with non-zero exploration terms, I guess this kind of regularization avoids the program spending a very long time without looking at a move just because of a few bad first simulations. This kind of detail is a bit boring, but I think K > 0 is much better in many cases...
well, maybe not for other implementations, depending on the other terms you have - our formula is so long now that I'm not able to write it in closed form :-)

By the way, K = 0 is in my humble opinion a very good setting if you want to check that UCT with a positive constant has a good effect in your code - I feel that the UCT exploration term is great when K = 0, just because of the bad-first-simulation effect: with K = 0 and without an exploration term, just losing the first few simulations can lead to the very bad situation in which a move is never tested anymore.

Best regards,
Olivier

___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/
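The regularized score discussed in the exchange above can be sketched as follows (Python; the choice K = 2 and the exploration constant c are illustrative assumptions, not values from the thread):

```python
import math

def regularized_uct_score(wins, sims, parent_sims, K=2, c=0.3):
    """Score a child move: regularized success rate (wins + K) / (sims + 2K)
    plus the usual UCT exploration term c * sqrt(log(N) / N_i).
    With K > 0 the score stays well above zero even for a move whose
    first (and only) simulation was a loss."""
    rate = (wins + K) / (sims + 2 * K)      # regularized success rate
    if sims > 0:
        rate += c * math.sqrt(math.log(parent_sims) / sims)
    return rate
```

With c = 0 and K = 0, a move that loses its first simulation scores 0 and may never be revisited; with K = 2 the same move still scores (0 + 2) / (1 + 4) = 0.4, which is exactly the pathology the regularization avoids.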
Re: [SPAM] Re: [computer-go] Progressive widening vs unpruning
> What's your general approach? My understanding from your previous posts is that it's something like:

Your understanding is right. By the way, all the current strong programs are really very similar... Perhaps Fuego has something different in 19x19 (no big database of patterns?). I'm not at all sure of this conjecture on Fuego.

Best regards,
Olivier

___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [SPAM] Re: [computer-go] Progressive widening vs unpruning
On Oct 2, 2009, at 2:24 PM, Olivier Teytaud olivier.teyt...@lri.fr wrote:

> 4) regularized success rate (nbWins + K) / (nbSims + 2K) (the original progressive bias is simpler than that)

I'm not sure what you mean here. Can you explain a bit more?

___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/
RE: [SPAM] Re: [computer-go] Progressive widening vs unpruning
Look for the graph I posted a few weeks ago. Most things tried make it worse. Some make it a little better, and every now and then there is a big jump.

David

> I'm wondering, are these tunings about squeezing out single-percent increases with very narrow confidence bounds, or something that gives an immediately noticeable 10% boost when applied? I'm curious about how the top bots improve - whether it's by accumulating many tiny increases or by long quests for sudden boosts.
>
> -- Petr "Pasky" Baudis
> A lot of people have my books on their bookshelves. That's the problem, they need to read them. -- Don Knuth

___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [SPAM] Re: [computer-go] Progressive widening vs unpruning
> I guess I'm not really appreciating the difference between node value prior and progressive bias - adding a fixed small number of wins or a diminishing heuristic value seems very similar to me in practice. Is the difference noticeable?

It just means that the weight of the prior does not necessarily decrease linearly. For example, for Rave, you might prefer a weight which depends on both the number of Rave simulations and the number of real simulations. The formula designed by Sylvain for Rave, for example, can't be recovered without this generalization.

For the current mogo, as we have several terms (one for the empirical success rate, one for Rave, one for patterns, one for expert rules...), some of them used at different places in the code (there is a prior expressed in terms of Rave simulations, and a prior expressed as real simulations), it would be complicated to come back to something much simpler... but perhaps it's possible. Such a clarification might be useful; I have no clear idea of the impact of Rave values in mogo now, in particular for long time settings, and it's not so easy to clarify this point - too many dependencies between the many terms we have.

I think someone pointed out a long time ago on this mailing list that initializing the prior in terms of Rave simulations was far less efficient than initializing it in terms of real simulations, at least if you have classical Rave formulas - at least, we had an improvement when adding a prior in the real simulations, but we also had an improvement when adding one more term, which is not linear. Sorry for forgetting who :-(

Olivier

___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/
RE: [SPAM] Re: [computer-go] Progressive widening vs unpruning
Are you sure you are not over-tuning to some opponent, or to self-play?

I have a very simple formula: win rate plus Rave (with the simpler beta formula), plus the simple upper bound term. I bias the Rave simulations only. It seems to work pretty well. Your description sounds pretty complex.

David

From: computer-go-boun...@computer-go.org [mailto:computer-go-boun...@computer-go.org] On Behalf Of Olivier Teytaud
Sent: Tuesday, September 29, 2009 1:26 PM
To: computer-go
Subject: Re: [SPAM] Re: [computer-go] Progressive widening vs unpruning

> It just means that the weight of the prior does not necessarily decrease linearly. [...]

___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/