On Mon, Jun 20, 2011 at 4:49 PM, David Fotland <[email protected]>wrote:
> Also, this rating is calculated by self-play, so it overstates the true > rating. The asymptotic rating is achievable with today’s hardware, but > fuego on that many cores (my assumption) is not as strong as a professional > go player. > Yes, I completely agree. Self-testing (actually I think this is more about computer/computer testing) is subject to what I call "rating expansion" which actually is a good thing for measuring small improvements. Don > **** > > ** ** > > David**** > > ** ** > > *From:* [email protected] [mailto: > [email protected]] *On Behalf Of *Andy > *Sent:* Monday, June 20, 2011 1:35 PM > *To:* [email protected] > *Subject:* Re: [Computer-go] Thoughts about bugs and scalability**** > > ** ** > > It's more than 100 vanilla Elo points per rank.**** > > ** ** > > The KGS rating algorithm sets 1 rank as:**** > > ~139 for 30k-5k (70% win rate expected in even games between 5k and 6k)** > ** > > ~226 for 2d+ (79% win rate expected in even games between 2d and 3d)* > *** > > (Sliding scale between 5k and 2d.)**** > > ** ** > > EGF uses a similar system to slide the winning percentages**** > > 80% win rate expected between 6d and 7d**** > > 64% win rate expected between 29k and 30k **** > > ** ** > > On Mon, Jun 20, 2011 at 3:20 PM, Don Dailey <[email protected]> wrote:* > *** > > Richard,**** > > ** ** > > You said the performance asymptote is far from perfect play, but your > projection in the paper is an ELO of 4154 ABOVE the level of Fuego playing > at 1 minute per game. **** > > **** > > It has been estimated that 100 ELO is about 1 Dan at the low kyu high dan > levels and even less at higher levels. This implies by your own > calculation that there are at LEAST 41 ranks between the 1 minute Feugo > player and asymptote that you estimated. **** > > ** ** > > That "back of the envelope" calculation is crude I admit but it seems to > indicate that an ELO rating of 4154 compared to Fuego at 1 minute is FAR > above the best player. **** > > ** ** > > So what is the basis of your statement that the asymptote is "far from > perfect play?"**** > > ** ** > > I must have missed this in the paper as I read it quickly, but I did not > see any claims in the paper about this, which I considered a good things > since we don't know where perfect play is. **** > > ** ** > > Don**** > > ** ** > > ** ** > > ** ** > > The implication of that and your own calculation is that there are 41 ranks > between the 1 minute Fuego and the asymptote. **** > > ** ** > > On Mon, Jun 20, 2011 at 4:03 PM, Richard B Segal <[email protected]> > wrote:**** > > Hi All, > > As my scalability paper has come up several times in the last few weeks I > thought it would be useful to give a brief overview of the results relevant > to the current conversation and provide additional insights. > > The first set of experiments analyzed the scalability of Fuego on a single > core machine without multithreading. The experiments consisted of 1,000 > self-play games where one player received N minutes for all moves, and the > second player receiving 2*N minutes for all moves. The resulting scaling > graph looks qualitatively similar to the recent scaling graph posted for > Pachi. The important observation is that the ELO gained between each > successive doubling decreases as search time increases. If we assume this > trend continues, then the ELO gained for each doubling will converge to zero > forcing absolute performance to level off. Ideally, this asymptote would not > occur until perfect play is reached. Unfortunately, for Fuego, fitting an > exponential curve to the data suggests a performance asymptote that is far > from perfect play. > > This disappointing result raises lots of questions and caveats. The results > are specific to Fuego. I suspect all programs will display a similar shaped > scaling curve and asymptote. However, better designed programs should > asymptote to better performance. The best programs may asymptote to perfect > play. The results are for self-play with all its inherent limitations. > However, I suspect self-play will overestimate true scaling, so the results > are still probably an upper bound on Fuego's true performance. Sufficient > memory was available and did not affect the results. All results are for > 19x19 play. > > The level of play that Fuego appears to asymptote is well below expert play > and may even be below the performance of the top programs such as Zen and > Erica. When Fuego is run our 32 core / 128 thread competition machine, the > above analysis suggests Fuego is within 500 ELO of its maximum possible play > quality. My (almost random) guess is that on competition hardware Fuego is > about 200 to 300 elo weaker than Zen and Erica on competition hardware. > > The big question is why this asymptote exists at such a weak level of play. > Unfortunately, I do not have a definitive answer to this question. Some > important observations. I agree with much of the earlier discussion that any > well written MCTS implementation will eventually converge on optimal play. I > believe I can prove that any MCTS implementation that meets the following > criteria: > > 1) Eventually includes all possible legal move sequences in the tree. > 2) Correctly scores terminal positions > 3) Selects the highest scoring node an infinite number of times > 4) And never lets a node's score drop to zero unless its proven a loss. > > will eventually achieve optimal play. The above applies to traditional UCT, > greedy versions of UCT without an exploration term, and most RAVE variants. > Note, I suspect there is some confusion about how quickly the above > converges. The convergence of MCTS is much worse than linear in the size of > the search space. What happens when a move looks good for one million trials > and then a refuting response is found? MCTS may need to revisit the refuting > node millions of times to reduce the score of the parent move to a low > enough value for an alternative move to be preferred. MCTS has been shown to > have a worst case convergence of 2^(2^(2^....) where the number of > exponentiations is linear in the tree depth* > > My guess as to why the asymptote exists at such a weak level of play is a > combination of problems with Fuego and the scalability of MCTS as I describe > above. The scaling of MCTS is so poor that it may be impossible from simple > curve fitting to distinguish an asymptote and an extremely slowly improving > curve. I suspect the reason that Fuego hits its asymptote so early in the > play quality curve is due to limits in its design. Programs that can get > more value out of a fixed sized tree can achieve higher levels of play > before the awful scaling of MCTS overtakes and prevents further practical > improvement. > > - Rich > > > * P. Coquelin and R. Munos. Bandit Algorithms for Tree Search. UAI 2007. > > > > [email protected] wrote on 06/20/2011 02:23:48 PM: > > > [image removed] > > > > Re: [Computer-go] Thoughts about bugs and scalability > > > > René van de Veerdonk > > > > to: > > > > computer-go > > > > 06/20/2011 02:24 PM > > > > Sent by: > > > > [email protected] > > > > Please respond to computer-go**** > > > > > > I am surprised you were asked to pay for the paper, I was not asked > > to pay for the download (that's why I felt okay to provide the link). > > > > From the conclusion: > > "We first analyzed the single-threaded scaling of Fuego and found > > that there is an upper bound on the play-quality improvements that > > can come from additional search. This in itself limits the potential > > advantage of Blue Gene/P as its total processing power in a one-hour > > timed game easily exceeds the computation needed to maximize Fuego’s > > overall play quality." > > > > René > > > > On Mon, Jun 20, 2011 at 10:35 AM, Don Dailey <[email protected]> > wrote: > > First of all, I never purchase papers of unknown quality. If I > > know a paper is great or that it comes highly recommended by someone > > I trust I will purchase it although the idea of purchasing papers in > > general is distasteful to me. > > > > The scalability issue as you summarize has to do with the number of > > cores - I am not surprised that more cores bring less improvement > > and that is true in ALL games. That's not the issue I am talking > > about here although it's certain a legitimate practical issue. > > > > To better understand the issue, forget about the specific machine > > it's on and substitute the phrase "scale with time." > > > > There is no scaling issue with reasonably written programs that can > > be tested with current hardware. Showing that it's difficult to > > parallelize an algorithm that is basically serial does not prove > > anything here. > > > > I have explained why people believe MCTS may have limits and why > > they are wrong. All such "evidence" that we have hit some wall is > > anecdotal. Unless of course you know of someone who has run a few > > thousands games at 1 day per move on a single core machine (or > equivalent.) > > > > Don > > > > > > On Mon, Jun 20, 2011 at 1:12 PM, René van de Veerdonk < > > [email protected]> wrote: > > On Mon, Jun 20, 2011 at 8:27 AM, Don Dailey <[email protected]> > wrote: > > On Mon, Jun 20, 2011 at 10:09 AM, terry mcintyre < > [email protected] > > > wrote: > > Any particular instance of a program will probably fail to scale - > > especially against humans who share the lessons of experience. > > > > That is complete nonsense. How are you backing this up? What > > proof do you have that computers don't play better on better > > hardware? Why are the top programs being run on clusters and > > multi-core computers? Are the authors just complete idiots? > > > > Every bit of evidence we have says they are scaling very well > > against humans. That has also been our experience in game after > > game, not just in Go. > > > > I apologize for being so harsh on this, but you are too smart to be > > saying such dumb things. > > > > Please read (from Darren Cook): > > > > Richard Segal (who operates Blue Fuego) has a paper on the upper limit > > for scaling: > > http://www.springerlink.com/content/b8p81h40129116kl/ > > (Sorry, I couldn't find an author's download link for the paper; Richard > > is on the Fuego list but I'm not sure he is even a lurker here.) > > > > Another paper that mentions the topic (also Fuego specific) is here: > > http://webdocs.cs.ualberta.ca/~mmueller/ps/fuego-TCIAIG.pdf > > > > The short conclusion of these papers is that it was not worth it to > > scale Fuego to a IBM/BlueGene machine (original purpose of the > > work), as it didn't get any better anymore, i.e., performance didn't > > (appreciably) scale beyond 512 or so cores. Obviously, that's still > > a big improvement from a desktop pc, but far from the capacity of > > the supercomputer available. > > > > Zen also has a similar issue according to its co-author Kideko Kato. > > That's evidence enough for me to indicate that there indeed is an > > issue with the current breed of programs. > > > > It appears that current specific implementations have a scaling > > ceiling. I expect, as expressed by multiple people here, that once > > that ceiling gets close enough to become a limitation, people will > > change their algorithms accordingly and scaling will continue. But > > it will only happen once you can comfortably test the program in > > that resource regime. > > > > René > > > > _______________________________________________ > > Computer-go mailing list > > [email protected] > > http://dvandva.org/cgi-bin/mailman/listinfo/computer-go > > > > > > _______________________________________________ > > Computer-go mailing list > > [email protected] > > http://dvandva.org/cgi-bin/mailman/listinfo/computer-go > > _______________________________________________ > > Computer-go mailing list > > [email protected] > > http://dvandva.org/cgi-bin/mailman/listinfo/computer-go**** > > > _______________________________________________ > Computer-go mailing list > [email protected] > http://dvandva.org/cgi-bin/mailman/listinfo/computer-go**** > > ** ** > > > _______________________________________________ > Computer-go mailing list > [email protected] > http://dvandva.org/cgi-bin/mailman/listinfo/computer-go**** > > ** ** > > _______________________________________________ > Computer-go mailing list > [email protected] > http://dvandva.org/cgi-bin/mailman/listinfo/computer-go >
_______________________________________________ Computer-go mailing list [email protected] http://dvandva.org/cgi-bin/mailman/listinfo/computer-go
