On Wed, 2009-01-14 at 20:44 -0200, Mark Boon wrote:
> On Jan 14, 2009, at 3:40 PM, Don Dailey wrote:
>
> > However, the best thing to do is to ignore that page and go to the
> > "Bayes Rated" link, which is updated every day. This is the total
> > performance rating over all time of any player on CGOS. Everything
> > is rated together, even if you have only played 1 or 2 games, but I
> > do not display players with fewer than 20 games.
>
> I see, I wasn't aware of that page. Doesn't it make sense then to
> occasionally calibrate the ratings of the active programs with their
> rating on this list?
>
> There seems to be quite a difference between the rating of many
> programs on that list compared to what the CGOS server shows while
> playing, often a difference of 100 points or more. I'm particularly
> surprised by the high rating of the MCTS-2K.v814 program after 349
> games. Somehow an ELO of 1753 seems way too high for a program with
> just 2,000 semi-light playouts, compared to a rating of 1340 for the
> same program with 2,000 light playouts. Of course I introduced a few
> improvements, but my own tests didn't indicate such a big difference.
> Conversely, my tests show a 90% win-rate by the 32K version against
> the 2K version. But the 32K version is only rated 2014 after some 600
> games. According to your table I'd expect a 400 point difference
> between the two.
>
> Is it possible that after 350 games a program's 'bayeselo' rating is
> off by 100-150 ELO points? Isn't that extremely unlikely, bordering on
> the impossible?

Just look at the confidence interval posted on that page. You can be
95% confident (I think that's how it's configured) that the true rating
lies within the posted interval around the posted rating. For 350 games
that interval is well over 100 ELO (perhaps 150 ELO wide), but of course
it depends on who you play. The plus and minus columns in the table
heading give you that uncertainty.

With incremental ratings the situation is worse: the rating behaves
sort of like a random walk by a drunk with a long, flimsy rubber band
attached to him to make sure he doesn't go TOO far astray.

- Don

>
> Mark
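
To get a feel for the numbers in this exchange, here is a rough sketch in Python. It is not how bayeselo computes its intervals. It assumes the standard logistic Elo curve, treats every game as an independent win or loss (no draws) against opponents whose ratings are known exactly, and uses a normal approximation to the binomial. The function names are made up for this illustration.

```python
import math


def elo_diff_from_score(p):
    """Elo difference implied by an expected score p (0 < p < 1),
    using the standard logistic Elo curve."""
    return 400.0 * math.log10(p / (1.0 - p))


def elo_interval(wins, games, z=1.96):
    """Approximate confidence interval (low, high), in Elo relative to
    the opposition, from a plain win count.

    Assumption: normal approximation to the binomial, opponents'
    ratings treated as exact. z=1.96 corresponds to roughly 95%.
    """
    p = wins / games
    se = math.sqrt(p * (1.0 - p) / games)
    # Clamp so the logistic conversion stays defined at the extremes.
    p_low = max(p - z * se, 1e-9)
    p_high = min(p + z * se, 1.0 - 1e-9)
    return elo_diff_from_score(p_low), elo_diff_from_score(p_high)


if __name__ == "__main__":
    # A 90% win rate, as in the 32K-vs-2K test mentioned above.
    print("90%% score -> %.0f Elo" % elo_diff_from_score(0.9))

    # 350 games with an even score against equally rated opponents.
    low, high = elo_interval(175, 350)
    print("350 games at 50%%: %+.1f .. %+.1f Elo" % (low, high))
```

Under these idealized assumptions a 90% score works out to a bit under 400 Elo, and 350 even games give roughly 37 Elo either side. Treat that as a floor rather than an estimate: the intervals on the Bayes Rated page can be wider, because the opponents' ratings are uncertain too and lopsided pairings carry less information per game.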
