Re: [computer-go] Progressive widening vs unpruning
I tried this yesterday with K=10 and it seemed to make Many Faces weaker (84.2% +/- 2.3 vs 81.6% +/- 1.7); not 95% confidence, but likely weaker. This is 19x19 vs gnugo with Many Faces using 8K playouts per move, 1000 games without and 2000 games with the change.

I have the UCT exploration term enabled, so perhaps this idea doesn't work in combination with exploration. Or perhaps the K I tried is too large. If I understood correctly, Olivier described it (*) as being the most important term when using a large number of simulations. How about trying 8 million playouts instead of 8 thousand... which brings up the question of how to reliably evaluate changes when using a high number of playouts.

Darren

*: Olivier wrote:
> What we call progressive unpruning is termed progressive bias by Rémi Coulom. It is the use of a score which is a linear combination of
> 1) expert knowledge
> 2) patterns (in 19x19)
> 3) rave values
> 4) regularized success rate (nbWins + K) / (nbSims + 2K)
> (the original progressive bias is simpler than that)
> For small numbers of simulations, 1) and 2) are the most important; 3) becomes important later; and 4) is, later, the most important term.

--
Darren Cook, Software Researcher/Developer
http://dcook.org/gobet/ (Shodan Go Bet - who will win?)
http://dcook.org/mlsn/ (Multilingual open source semantic network)
http://dcook.org/work/ (About me and my work)
http://dcook.org/blogs.html (My blogs and articles)

___
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/
RE: [computer-go] Progressive widening vs unpruning
What value of K is used with many simulations? It seems that with millions of simulations, a small value for K will make little difference.

David

-----Original Message-----
From: computer-go-boun...@computer-go.org [mailto:computer-go-boun...@computer-go.org] On Behalf Of Darren Cook
Sent: Tuesday, October 06, 2009 12:06 AM
To: computer-go
Subject: Re: [computer-go] Progressive widening vs unpruning

[...]
RE: [SPAM] Re: [SPAM] Re: [computer-go] Progressive widening vs unpruning
> I tried this yesterday with K=10 and it seemed to make Many Faces weaker (84.2% +/- 2.3 vs 81.6% +/- 1.7); not 95% confidence, but likely weaker. This is 19x19 vs gnugo with Many Faces using 8K playouts per move, 1000 games without and 2000 games with the change. I have the UCT exploration term, so perhaps with exploration this idea doesn't work. Or perhaps the K I tried is too large.

David

-----Original Message-----
From: computer-go-boun...@computer-go.org [mailto:computer-go-boun...@computer-go.org] On Behalf Of Olivier Teytaud
Sent: Saturday, October 03, 2009 1:28 AM
To: computer-go
Subject: Re: [SPAM] Re: [SPAM] Re: [computer-go] Progressive widening vs unpruning

>> 4) regularized success rate (nbWins + K) / (nbSims + 2K) (the original progressive bias is simpler than that)
>
> I'm not sure what you mean here. Can you explain a bit more?

Sorry for being unclear, I hope I'll do better below. Instead of just the number of wins divided by the number of simulations, we use the number of wins + K divided by the number of simulations + 2K. This is similar to the even-game heuristic previously cited; it avoids a 0% success rate for a move tested just once.

If you apply UCT with constant zero in front of the sqrt(log(N)/N_i) term, then such a regularization is necessary for showing consistency of UCT for two-player games; and even with non-zero exploration terms, I guess this kind of regularization keeps the program from spending a very long time without looking at a move just because of a few bad first simulations.

This kind of detail is a bit boring, but I think K>0 is much better in many cases... well, maybe not for other implementations, depending on the other terms you have; our formula is so long now I'm not able to write it in closed form :-)

By the way, K>0 is in my humble opinion a very good idea if you want to check that UCT with a positive constant has a good effect in your code. I feel that UCT looks great when K=0 just because of the bad-first-simulation effect: with K=0 and without an exploration term, just losing the first few simulations can lead to the very bad situation in which a move is never tested anymore.

Best regards,
Olivier
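To make the regularization concrete, here is a minimal Python sketch. The default K, the exploration constant c, and the way the two terms are summed are illustrative assumptions, not MoGo's actual formula.

```python
import math

def regularized_winrate(wins, sims, K=10):
    # Olivier's regularization: (nbWins + K) / (nbSims + 2K).
    # With K > 0, a move that lost its only simulation scores
    # K / (1 + 2K), close to 0.5 rather than 0, so one bad first
    # playout does not starve it of future visits.
    return (wins + K) / (sims + 2 * K)

def uct_urgency(wins, sims, parent_sims, K=10, c=0.5):
    # Hypothetical urgency: regularized win rate plus a standard
    # UCT exploration term (the sqrt(log(N)/N_i) term from the thread).
    explore = c * math.sqrt(math.log(parent_sims + 1) / (sims + 1))
    return regularized_winrate(wins, sims, K) + explore
```

With K=0 and no exploration (c=0), a move that loses its first simulation gets urgency 0 and is never revisited under greedy selection; either term repairs that, which is exactly the failure mode Olivier describes.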
Re: [SPAM] Re: [computer-go] Progressive widening vs unpruning
> What's your general approach? My understanding from your previous posts is that it's something like:

Your understanding is right. By the way, all the current strong programs are really very similar... Perhaps Fuego has something different in 19x19 (no big database of patterns?). I'm not at all sure of this conjecture on Fuego.

Best regards,
Olivier
Re: [SPAM] Re: [computer-go] Progressive widening vs unpruning
On Oct 2, 2009, at 2:24 PM, Olivier Teytaud olivier.teyt...@lri.fr wrote:

> 4) regularized success rate (nbWins + K) / (nbSims + 2K) (the original progressive bias is simpler than that)

I'm not sure what you mean here. Can you explain a bit more?
Re: [computer-go] Progressive widening vs unpruning
David Fotland wrote:
> Well I have no idea how much I gained from this. It might be weaker than what everyone else is doing, since it seems I didn't implement this as it's been described recently. My progressive widening only uses Rave values. It's very simple. Others seem to have much more complex schemes. But I don't think I implement Rave like others do either.

Your implementation must be very different from mine. Actually I don't use progressive widening (or unpruning) at all. It's a mystery to me why others say it does work.

--
Yamato
Re: [computer-go] Progressive widening vs unpruning
David Fotland wrote:
> Well I have no idea how much I gained from this. It might be weaker than what everyone else is doing, since it seems I didn't implement this as it's been described recently. My progressive widening only uses Rave values. It's very simple. Others seem to have much more complex schemes. But I don't think I implement Rave like others do either.

Hi David,

Earlier in the thread you said: "I start with one move, and slowly add moves to the pool of moves to be considered, ... Many Faces is a pretty good heuristic for selecting good moves, so if you are just using RAVE to do move ordering you might need to widen faster." But above you say: "My progressive widening only uses Rave values."?

Yamato was asking why it works at all. My understanding of the progressive widening idea was to use some pattern/expert knowledge information to guide the early MCTS, to stop it wasting lots of playouts on silly moves, and/or to make it choose more solid (better shape) moves when lots of moves have similar winning percentages [1]. Is that how you see it helping too?

Darren

[1]: Also in this thread, David wrote: "I think it's more that Many Faces values moves that have good long-term consequences that the search can't find, so among moves with similar win rates, it will choose the ones Many Faces prefers."
RE: [computer-go] Progressive widening vs unpruning
So far on 9x9 go, Many Faces doesn't seem to make a huge difference. On 19x19 it makes a huge difference. I ran test games overnight against Gnugo. With Many Faces turned on, my engine wins 85%. With Many Faces turned off, my engine wins 7%. Both results are unexpected. Since most of my tuning is on 19x19, the combination of Many Faces and UCT is probably badly mistuned for 9x9.

David

-----Original Message-----
From: computer-go-boun...@computer-go.org [mailto:computer-go-boun...@computer-go.org] On Behalf Of David Fotland
Sent: Wednesday, September 30, 2009 10:58 PM
To: 'computer-go'
Subject: RE: [computer-go] Progressive widening vs unpruning

I'm trying an experiment. I took the Many Faces code completely out of the engine, and put it on 9x9 cgos as mfgo-none-1c. It's faster without Many Faces, but it's just a basic uct engine with medium playouts. This should tell us how much benefit I get from the Many Faces knowledge.

David

> I recall that you credited the use of Many Faces rules with a massive improvement against GnuGo, so the technique is certainly empirically justified. But I am wondering how it achieves its results. That is, what do you think the difference is, compared to standard unpruning?
RE: [computer-go] Progressive widening vs unpruning
I use rave to decide which moves to promote into the pool of moves being evaluated by the UCT formula. I use Many Faces to bias the rave values. I haven't tried other approaches, so I can't say how or why my approach works well. Certainly it doesn't scale very well to huge numbers of playouts, probably because after some time the initial Many Faces bias becomes irrelevant.

-----Original Message-----
From: computer-go-boun...@computer-go.org [mailto:computer-go-boun...@computer-go.org] On Behalf Of Darren Cook
Sent: Thursday, October 01, 2009 5:39 AM
To: computer-go
Subject: Re: [computer-go] Progressive widening vs unpruning

[...]
RE: [computer-go] Progressive widening vs unpruning
What's your general approach? My understanding from your previous posts is that it's something like:

- UCT search using Silver's beta formula and UCB1 with win-rate and Rave for choosing a child. (I use basic UCT with win-rate and Rave, and the original MOGO beta formula.)
- UCT search is biased with variable-sized patterns extracted from strong games using something similar to Remi's algorithm. I don't know if you bias the initial rave values, or bias the initial win rate, or add a third heuristic term to the UCT formula. (I bias the rave value, initially with a simple heuristic, and later with Many Faces' knowledge.)
- I don't know how you accumulate the rave value. (I just use all moves that lead to a win for either side, in the whole game.)
- You apply UCT to every legal move. (I apply UCT to a pool of moves that starts with one move and gradually increases to 30 moves. I use rave to choose which moves to promote to the pool.)
- You have powerful playouts with tactical search in the playouts. How many playouts per second do you do per CPU on 19x19? (I have fairly simple playouts: local responses for 1- and 2-liberty neighbors, without local search, 3x3 local patterns, and don't fill eyes or seki or make stupid self-atari.)

Does this sound about right?

> Your implementation must be very different from mine. Actually I don't use progressive widening (or unpruning) at all. It's a mystery to me why others say it does work.
>
> --
> Yamato
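David's pool mechanism (grow from one move toward 30, promoting by RAVE value) might look roughly like the sketch below; the function name and data shapes are illustrative assumptions, not Many Faces code.

```python
def promote_moves(pool, unpooled, rave_value, allowed):
    # Grow the UCT candidate pool up to 'allowed' moves, always
    # promoting the unpooled move with the best RAVE value.
    while len(pool) < allowed and unpooled:
        best = max(unpooled, key=rave_value)
        unpooled.remove(best)
        pool.append(best)
    return pool
```

The rest of the search then applies the UCT formula only to moves in the pool.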
RE: [computer-go] Progressive widening vs unpruning
I was not at all surprised by this result. My thinking goes like this. On 9x9 the global situation is everything that matters, and precomputed information is not as important as searching effectively is. Good 9x9 games are often very sharp fights where the next move often violates good shapes known from 19x19 games. On 19x19 deep global search is impossible, and good local shape is more often more important than the global situation (or perhaps local shape is more often good enough to indicate the correct move), so heuristic precomputed knowledge is essential to make search efficient. In Valkyria I also learned that things I do to improve 19x19 have no effect on 9x9, or sometimes make it weaker.

-Magnus

Quoting David Fotland fotl...@smart-games.com:

> So far on 9x9 go, Many Faces doesn't seem to make a huge difference. On 19x19 it makes a huge difference. I ran test games overnight against Gnugo. With Many Faces turned on, my engine wins 85%. With Many Faces turned off, my engine wins 7%. Both results are unexpected. Since most of my tuning is on 19x19, the combination of Many Faces and UCT is probably badly mistuned for 9x9.
>
> David
Re: [computer-go] Progressive widening vs unpruning
For 9x9 games, when I added progressive widening to AntiGo (before I added RAVE), it was low-hanging fruit. I used my old program Antbot9x9 for the move ranking and got a very nice strength increase for very little effort. Then, with a bit of tweaking, I got more improvement. RAVE, on the other hand, gives me a small improvement, and it comes at the cost of having to fuss with it endlessly. With no opening table and 5000 playouts per move, AntiGo wins against gnugo3.7.10 in the neighborhood of 60 to 70%, depending on what is enabled. I optimized my playout policy before I added RAVE. They seem to interact poorly.

The following tables are from running UCT for black to move on an empty 9x9 board and showing the RAVE scores at the root.

Here is the result when using light playouts.

RAVE
 1 |  33  43  46  38  40  41  28  33  41
 2 |  37  44  37  35  43  42  37  42  40
 3 |  42  44  41  43  45  48  44  44  39
 4 |  35  45  46  45  43  42  48  45  35
 5 |  36  42  43  45  45  44  40  45  36
 6 |  40  42  40  43  45  43  45  43  37
 7 |  41  47  45  38  42  38  39  40  35
 8 |  32  45  44  45  46  47  45  40  38
 9 |  38  40  38  27  40  38  42  30  37

It looks noisy, but perfectly reasonable. With more playouts, it looks nicer.

Here is the result when using my heavy playouts.

RAVE
 1 |  53  44  45  44  45  47  46  43  54
 2 |  43  50  49  48  49  49  50  50  44
 3 |  46  49  49  50  49  49  48  49  46
 4 |  47  48  48  47  49  47  48  49  47
 5 |  45  49  47  49  49  50  47  49  47
 6 |  46  48  48  49  48  48  49  49  45
 7 |  47  50  49  50  49  48  49  50  45
 8 |  47  49  50  50  48  48  49  51  44
 9 |  54  45  47  45  46  46  46  46  54

It looks like garbage! In fact, it's surprising that it could help at all. Progressive widening saves the day: while moves in the corner have higher RAVE scores, they have very low rank.

Here is the result when using my heavy playouts and CRAVE.

CRAVE
 1 |  35  37  37  28  40  35  40  32  25
 2 |  38  35  44  39  38  49  39  42  34
 3 |  33  38  48  48  51  49  49  39  36
 4 |  41  41  48  50  50  47  49  41  37
 5 |  35  41  51  48  50  47  50  43  31
 6 |  39  40  51  50  50  47  51  46  41
 7 |  36  36  51  51  45  52  50  38  37
 8 |  37  40  45  41  47  37  43  39  32
 9 |  21  27  31  36  38  35  32  35  28

Maybe CRAVE (and perhaps progressive widening on 9x9) seems more useful to me because it compensates for problems other programs don't share.

- Dave Hillis
Re: [computer-go] Progressive widening vs unpruning
I made some more tables with 50K playouts, in case that is easier to look at.

LIGHT PLAYOUTS, RAVE
 1 |  36  39  38  40  39  40  40  40  37
 2 |  38  40  42  42  42  42  42  40  40
 3 |  40  42  42  43  43  42  43  42  39
 4 |  39  42  44  43  44  44  43  42  41
 5 |  38  41  42  44  45  45  42  41  40
 6 |  40  41  42  43  45  43  44  42  39
 7 |  41  42  41  43  43  43  42  41  40
 8 |  39  41  41  43  42  40  41  40  41
 9 |  37  40  40  39  40  40  39  40  37

NO CONTEXT, RAVE
 1 |  50  43  44  44  45  44  44  43  50
 2 |  43  49  48  48  48  48  48  49  43
 3 |  44  48  47  47  47  47  47  48  44
 4 |  44  48  47  47  47  47  47  48  44
 5 |  44  49  47  47  48  47  47  48  44
 6 |  44  48  47  47  47  47  47  48  44
 7 |  44  48  47  47  47  47  46  48  44
 8 |  43  49  48  48  48  48  48  49  43
 9 |  50  43  44  44  44  44  44  43  50

CONTEXT, CRAVE
 1 |  31  36  35  38  35  33  38  35  29
 2 |  36  38  40  40  39  41  40  38  34
 3 |  36  40  47  47  46  47  47  42  35
 4 |  34  40  46  47  46  47  47  40  37
 5 |  35  40  45  47  47  47  45  40  39
 6 |  35  42  46  46  47  47  47  39  38
 7 |  35  40  47  46  43  46  47  41  36
 8 |  36  37  40  42  41  41  41  39  34
 9 |  32  34  34  35  37  36  36  35  31

- Dave Hillis

-----Original Message-----
From: dhillism...@netscape.net
To: computer-go@computer-go.org
Sent: Thu, Oct 1, 2009 4:50 pm
Subject: Re: [computer-go] Progressive widening vs unpruning

[...]
RE: [SPAM] Re: [computer-go] Progressive widening vs unpruning
Look for the graph I posted a few weeks ago. Most things tried make it worse. Some make it a little better, and every now and then there is a big jump.

David

> I'm wondering, are these tunings about squeezing single-percent increases with very narrow confidence bounds, or something that gives an immediately noticeable 10% boost when applied? I'm curious about how the top bots improve, if it's accumulating many tiny increases or long quests for sudden boosts.
>
> --
> Petr "Pasky" Baudis
> A lot of people have my books on their bookshelves. That's the problem, they need to read them. -- Don Knuth
RE: [computer-go] Progressive widening vs unpruning
> Well I have no idea how much I gained from this. It might be weaker than what everyone else is doing, since it seems I didn't implement this as it's been described recently. My progressive widening only uses Rave values. It's very simple. Others seem to have much more complex schemes. But I don't think I implement Rave like others do either.

It is an interesting answer. You don't use the combined values, nor static evaluations, for progressive widening. So what do you use? It sounds like a Many Faces secret. Could you tell us how much Elo rating you gained by this invention?

--
Yamato
RE: [computer-go] Progressive widening vs unpruning
I'm trying an experiment. I took the Many Faces code completely out of the engine, and put it on 9x9 cgos as mfgo-none-1c. It's faster without Many Faces, but it's just a basic uct engine with medium playouts. This should tell us how much benefit I get from the Many Faces knowledge.

David

> I recall that you credited the use of Many Faces rules with a massive improvement against GnuGo, so the technique is certainly empirically justified. But I am wondering how it achieves its results. That is, what do you think the difference is, compared to standard unpruning?
RE: [computer-go] Progressive widening vs unpruning
I start with one move, and slowly add moves to the pool of moves to be considered, peaking at considering 30 moves. My current schedule looks like:

visits:  0  2  5  9  15  24  38  59  90  100 ... 2142
moves:   1  2  3  4   5   6   7   8   9   10 ...   20

I didn't find that strength is very sensitive to the schedule. Many Faces is a pretty good heuristic for selecting good moves, so if you are just using RAVE to do move ordering you might need to widen faster.

David

-----Original Message-----
From: computer-go-boun...@computer-go.org [mailto:computer-go-boun...@computer-go.org] On Behalf Of Petr Baudis
Sent: Tuesday, September 29, 2009 7:07 AM
To: computer-go@computer-go.org
Subject: [computer-go] Progressive widening vs unpruning

Hi!

I'm a little unclear on this, so I'd like to make sure I'm not missing any important technique: are "progressive widening" and "progressive unpruning" synonymous? I have looked both into the pMCTS and the CrazyStone papers, and it seems that widening differs from unpruning in that a certain number of simulations is first made before limiting the number of searches. Which of the variants is commonly used? What speed of widening works best for you?

Thanks,

--
Petr "Pasky" Baudis
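A sketch of that schedule as code: the listed thresholds are from the message above, while the 30-move cap and the geometric extrapolation ratio past 100 visits are assumptions chosen only to illustrate the shape of the curve.

```python
# Visit-count thresholds David lists for growing the pool to 10 moves.
SCHEDULE = [0, 2, 5, 9, 15, 24, 38, 59, 90, 100]

def pool_size(visits, cap=30):
    """Number of moves the UCT formula is allowed to consider at a node."""
    n = sum(1 for t in SCHEDULE if visits >= t)
    # Hypothetical geometric continuation past the listed thresholds
    # (a ratio of ~1.35 roughly reproduces "... 2142 -> 20 moves").
    t = SCHEDULE[-1]
    while n < cap and visits >= t * 1.35:
        t *= 1.35
        n += 1
    return min(n, cap)
```

The point of the roughly geometric spacing is that each extra candidate move costs exponentially more visits, so deep nodes stay narrow while heavily visited nodes widen out.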
Re: [computer-go] Progressive widening vs unpruning
I'm not sure whether they meant different things when they were first coined, but maybe that doesn't matter, and there are two different approaches that should be distinguished somehow. When a node has been visited the required number of times:

1) Use patterns, heuristics, ownership maps from the earlier playouts through it, etc. to calculate a score for each move. Then rank them from prettier to uglier. At this node, only allow moves within the top N prettiest moves, where N grows slowly with the number of subsequent playouts through the node.

2) Calculate the score, as in 1). Initialize the win/loss record for each move *as if* many more prior playouts had already gone through this node. Prettier moves are initialized as if they had a higher proportion of prior wins. Typically, it is the win/loss record for RAVE moves that gets this prior adjustment.

They can be combined too. If the calculations involved are expensive, then it is important to hold off until several playouts have gone through the node. Most nodes don't get visited more than once or twice. If they use something like ownership maps, then, naturally, one has to wait until some data has been gathered.

- Dave Hillis

-----Original Message-----
From: computer-go-boun...@computer-go.org [mailto:computer-go-boun...@computer-go.org] On Behalf Of Petr Baudis
Sent: Tuesday, September 29, 2009 7:07 AM
To: computer-go@computer-go.org
Subject: [computer-go] Progressive widening vs unpruning

[...]
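The two schemes can be sketched in a few lines; the score function, pool sizes, and prior weight below are hypothetical placeholders, not any program's actual values.

```python
def widening_candidates(moves, score, n_allowed):
    # Scheme 1: rank moves "prettier to uglier" and let the search
    # consider only the top n_allowed; n_allowed grows with visits.
    return sorted(moves, key=score, reverse=True)[:n_allowed]

def init_with_prior(moves, score, prior_sims=50):
    # Scheme 2: seed each move's (typically RAVE) record as if
    # prior_sims earlier playouts had occurred, with prettier moves
    # given a proportionally higher share of wins.
    return {m: {"wins": score(m) * prior_sims, "sims": prior_sims}
            for m in moves}
```

Scheme 1 changes which children exist for selection; scheme 2 leaves all children available but biases their statistics, with the bias washing out as real simulations accumulate.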
Re: [computer-go] Progressive widening vs unpruning
dhillism...@netscape.net wrote:
> I'm not sure whether they meant different things when they were first coined, but maybe that doesn't matter, and there are two different approaches that should be distinguished somehow. When a node has been visited the required number of times:
>
> 1) Use patterns, heuristics, ownership maps from the earlier playouts through it, etc. to calculate a score for each move. Then rank them from prettier to uglier. At this node, only allow moves within the top N prettiest moves, where N grows slowly with the number of subsequent playouts through the node.

This is called progressive unpruning or widening. Guillaume and I published the idea independently, and gave it two different names.

> 2) Calculate the score, as in 1). Initialize the win/loss record for each move *as if* many more prior playouts had already gone through this node. Prettier moves are initialized as if they had a higher proportion of prior wins. Typically, it is the win/loss record for RAVE moves that gets this prior adjustment.

This was first proposed by Guillaume, and called progressive bias.

Rémi
Re: [computer-go] Progressive widening vs unpruning
I guess I'm not really appreciating the difference between node value prior and progressive bias - adding a fixed small number of wins or diminishing heuristic value seems very similar to me in practice. Is the difference noticeable? On Tue, Sep 29, 2009 at 08:25:56AM -0700, David Fotland wrote: I start with one move, and slowly add moves to the pool of moves to be considered, peaking at considering 30 moves. My current schedule looks like: visits0 2 5 9 15 24 38 59 90100 ... 2142 moves 1 2 3 4 5 6 7 8 9 10 ...20 Thanks! On Tue, Sep 29, 2009 at 08:49:22PM +0200, Olivier Teytaud wrote: I don't know to which extent my terminology is commonly used, but it seems to be close to the distinction by Dave (but not exactly equal). For me I use progressive widening when we add moves, progressively, to the pool of moves which are to be considered; whereas I use progressive unpruning when a score is computed for all moves, with a weight depending on the number of simulations; typically, I consider the Rave formula in GellySilver as a progressive unpruning (with, for prior value, the Rave value). Ah, I see; so progressive widening is what David described, modifying the process of choosing node with highest urgency, while what you describe as progressive unpruning is actually just what RAVE and priors are, adding extra terms to the urgency calculation, with diminishing value as the simulation count raises. * with progressive unpruning, if your prior is bad, you might have situations in which the score of the only good move is 0, whereas all bad moves have values going to 0 but 0 and therefore the good move is never simulated (this can, however, easily be patched by a lower bound on the value given by the prior) I found the even game prior (3/6) effective for this problem. 
Progressive unpruning, if you use the same terminology as me, has the advantage that the number of moves to be considered is adapted depending on the score; if you have three reasonable moves and 300 bad moves, with progressive widening you'll eventually visit all moves, whereas with progressive unpruning you will stay on the three reasonable moves as long as they provide a better score than the prior score for the 300 bad moves.

--
Petr "Pasky" Baudis
A lot of people have my books on their bookshelves. That's the problem, they need to read them. -- Don Knuth
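The behaviour described here falls out of an urgency whose prior weight decays with real simulations. A minimal sketch; the constant w (the prior's "equivalent experience") is invented:

```python
def unpruning_urgency(wins, sims, prior, w=50):
    # Progressive-unpruning style urgency: blend the prior score with
    # the observed win rate, the prior counting as w virtual playouts.
    # The 300 bad moves keep their low, prior-dominated value, so the
    # search stays on the few reasonable moves for as long as those
    # score above the bad moves' prior.
    return (wins + w * prior) / (sims + w)
```

An unvisited bad move with prior 0.3 sits at urgency 0.3 indefinitely; a good move running at 60/100 real playouts scores above it and keeps being simulated.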
Re: [SPAM] Re: [computer-go] Progressive widening vs unpruning
I guess I'm not really appreciating the difference between node value prior and progressive bias - adding a fixed small number of wins or a diminishing heuristic value seems very similar to me in practice. Is the difference noticeable?

It just means that the weight of the prior does not necessarily decrease linearly. For example, for Rave, you might prefer a weight which depends on both the number of Rave simulations and the number of real simulations. The formula designed by Sylvain for Rave, for example, can't be recovered without this generalization.

For the current mogo, as we have several terms (one for the empirical success rate, one for Rave, one for patterns, one for expert rules...), some of them used at different places in the code (there is a prior expressed in terms of Rave simulations, and a prior expressed as real simulations), it would be complicated to come back to something much simpler ... but perhaps it's possible. Such a clarification might be useful; I have no clear idea of the impact of Rave values now in mogo, in particular for long time settings, and it's not so easy to clarify this point - too many dependencies between the many terms we have.

I think someone pointed out a long time ago on this mailing list that initializing the prior in terms of Rave simulations was far less efficient than initializing the prior in terms of real simulations, at least if you have classical Rave formulas - at least, we had an improvement when adding a prior in the real simulations, but we had also an improvement when adding one more term, which is not linear. Sorry for forgetting who :-(

Olivier
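For concreteness, a blending weight that depends on both the Rave count and the real-simulation count, as Olivier describes, can look like the minimum-MSE schedule from David Silver's RAVE work. This is a sketch under that assumption, not mogo's actual formula; b (the RAVE bias constant) is a tuning parameter:

```python
def rave_blend(q_mc, n_mc, q_rave, n_rave, b=0.025):
    # beta depends on both n_rave and n_mc, so no schedule that decays
    # linearly in the real-simulation count alone can reproduce it.
    if n_rave == 0:
        return q_mc
    beta = n_rave / (n_rave + n_mc + n_rave * n_mc * b * b)
    return beta * q_rave + (1 - beta) * q_mc
```

With no real simulations the value is pure Rave; as real simulations accumulate, beta falls and the empirical win rate takes over.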
RE: [SPAM] Re: [computer-go] Progressive widening vs unpruning
Are you sure you are not over-tuning to some opponent, or to self-play? I have a very simple formula: win rate plus Rave (with the simpler beta formula), plus the simple upper bound term. I bias the Rave simulations only. It seems to work pretty well. Your description sounds pretty complex.

David
___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
RE: [computer-go] Progressive widening vs unpruning
This sounds like progressive widening, but it could still be progressive unpruning, depending on implementation choices.

I do both. I have a small pool of moves that are active, and I also bias the initial Rave values. My current schedule looks like:

To be sure that I understand, MF orders the moves using static analysis, and then the ordering is further modified by RAVE observations? So when Many Faces accumulates Schedule(N) trials, it will restrict its attention to the N highest ranked moves according to the combined Static + UCT/RAVE? Or does MF restrict the choice to the highest N by static eval, and then order the top N by UCT/RAVE?

No, neither. Now I'm thinking that maybe I'm doing something different from what I thought was described in the papers. I didn't look at them carefully. I just took the name progressive widening, and invented something that seems to work well.

If you are just using RAVE to do move ordering you might need to widen faster.

I recall that you credited the use of Many Faces' rules with a massive improvement against GnuGo, so the technique is certainly empirically justified. But I am wondering how it achieves its results. That is, what do you think the difference is, compared to standard unpruning? There is a rule that I live by, which is GG >> SS. This rule (really a universal law, when you get right down to it) states that a Good Guess is much better than a Short Search.

Agreed. That's why I always prefer to add knowledge rather than tinker with search parameters.

So is the benefit that MF avoids wasting trials on moves that were just lucky in early trials, but probably will not stand up?

I think it's more that Many Faces values moves that have good long-term consequences that the search can't find, so among moves with similar win rates, it will choose the ones Many Faces prefers. Or sometimes I think that UCT/MC is filling in the holes where Many Faces' knowledge is incorrect or brittle.
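A guess at the hybrid David describes: a widening pool ordered by Many Faces' static analysis, plus RAVE counts seeded from the same evaluation. All names and constants here are hypothetical, not Many Faces' actual code:

```python
def init_node(moves, static_score, rave_stats, pool_size=1, n0=50):
    # Order moves by static analysis; activate a small initial pool
    # (widened later on a visit schedule) and seed every move's RAVE
    # record with n0 virtual simulations whose win share follows the
    # static score. pool_size and n0 are made-up constants.
    ranked = sorted(moves, key=static_score, reverse=True)
    for m in moves:
        rave_stats[m] = (static_score(m) * n0, n0)
    return ranked[:pool_size]
```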
I am also wondering whether you could achieve the same effect by using pure progressive unpruning, but with a heavier weight (e.g., 100 trials) for Many Faces' opinion.