>> if you think about it, it simply *cannot* be accurate.
>
> Is that logic or is credulity speaking?

credulity, since i have no information about the specifics of how these two groups are being measured, other than that they're not being measured in the same way.

to be clear, i'm talking about comparing these two kinds of rankings:

* humans playing against each other in real life in tournaments organized by, say, FIDE.
* computers playing against one another and against people on a computer server.

two assumptions that i'm making here:

the first kind of ranking is what i assume people use when they talk about famously strong (top few in the world) chess-playing humans. when someone mentions kramnik's ELO rating, i'm guessing they don't mean kramnik as found on some chess server somewhere.

the second kind of ranking is what i assume is the only well-measured kind of ranking available for computer players.

however, even if computers were allowed to play in FIDE-governed tournaments and to earn all of their ELO points that way, i still don't think that their ELO ratings would accurately predict their winrates against the top players unless the top players were playing against them with the same frequency that they play against other humans.

additionally, for the reasons i mentioned in my last posting, i think that a very small pool of computer-only players can generate a fairly skewed ELO ranking, because it is measuring a different thing: namely, how well-ordered the set of computer players is.

a final reason why we shouldn't expect ELO to be comparable between the two groups is that when two groups are far apart in strength, it becomes very difficult to estimate winrates between them. imagine if the only way to get a go ranking on a go server was for the server to estimate the percentage of the time that you would win against someone 20 stones stronger. it would assign you a default minimum ranking, and the player 20 stones stronger would slowly accrete ELO from you and your ilk.
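
to put rough numbers on why winrates are hard to estimate across a large gap, here is a minimal sketch. it assumes the standard logistic ELO expectation with a 400-point scale (a given server or federation may use a different table), counts a draw as half a win, and the function names are my own, purely for illustration.

```python
import math


def expected_score(rating_a, rating_b):
    """Logistic ELO expectation for player a against player b (assumed formula)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def rating_gap(winrate):
    """Invert the formula: the rating gap implied by an observed winrate."""
    return 400.0 * math.log10(winrate / (1.0 - winrate))


def games_per_lost_point(gap):
    """Average number of games before the stronger side is expected to drop one full point."""
    return 1.0 / (1.0 - expected_score(gap, 0.0))


if __name__ == "__main__":
    # As the gap grows, the expected score saturates near 1, so many
    # games are needed before a single informative result shows up.
    for gap in (200, 400, 800):
        print(gap, round(expected_score(gap, 0.0), 4), round(games_per_lost_point(gap), 1))

    # Near the top of the scale, a tiny change in observed winrate
    # corresponds to a very large change in implied rating gap.
    print(round(rating_gap(0.99)), round(rating_gap(0.999)))
```

under this formula, the difference between winning 99% and 99.9% of the time is worth roughly 400 rating points, so with only a handful of games between the two groups the implied gap is barely constrained.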

something less dramatic but similar is (presumably) occurring with computer players right now. if its winrate against humans is effectively extremely high, even the weakest computer player with that property will have an ELO higher than the highest-ranked human it has played. if these programs are close to well-ordered, and if their mutual winrates are lopsided, then the ELO calculation for each of them should rapidly push the top program very far above the top human, even if the top human has never played the top program. the ELO will get sucked from whoever is willing to play them, but magnified because of the stratification of winrates that (presumably) occurs.

it would be relatively easy to see to what extent this effect matters if anyone knows the pairwise winrates of the top (unmodified) computer players over a long enough time period.

s.
_______________________________________________
Computer-go mailing list
[email protected]
http://dvandva.org/cgi-bin/mailman/listinfo/computer-go
