On Sun, Oct 31, 2010 at 9:35 PM, steve uurtamo <[email protected]> wrote:
> >> if you think about it, it simply *cannot* be accurate.
> >
> > Is that logic or is credulity speaking?
>
> credulity, since i have no information about the specifics of how
> these two groups are being measured other than to say that they're not
> being measured in the same way.
>
> to be clear, i'm talking about comparing these two kinds of rankings:
>
> * humans playing against each other in real life in tournaments
> organized by, say, FIDE.
>
> * computers playing against one another and against people on a computer
> server.
>
> two assumptions that i'm making here:
>
> the first kind of ranking is what i assume people use when they talk
> about famously (top few in the world) strong chess playing humans.
> when someone mentions kramnik's ELO rating, i'm guessing they don't
> mean kramnik as to be found on some chess server somewhere. the second
> kind of ranking is what i assume is the only well-measured kind of
> ranking available for computer players.
>
> however, even if computers were allowed to play in FIDE-governed
> tournaments and to achieve all of their ELO points that way, i still
> don't think that their ELO ratings would accurately predict their
> winrates against the top players if the top players weren't all
> playing against them with the same frequency that they play against
> humans. additionally, for the reasons i mentioned in my last posting,
> i think that a very small pool of computer-only players can generate a
> fairly skewed ELO ranking because it is measuring a different thing --
> namely, how well-ordered the set of computer players is.
>
> a final reason why we shouldn't expect ELO to be comparable between
> the two groups is that when comparing two far-from-comparable groups
> using ELO, it becomes very difficult to estimate winrates:
>
> imagine if the only way to get a go ranking on a go server was for the
> server to estimate the percentage of the time that you would win
> against someone 20 stones stronger. it would assign you a default
> minimum ranking and the player 20 stones stronger would slowly accrete
> ELO from you and your ilk. something less dramatic but similar is
> (presumably) occurring with computer players right now. if the winrate
> is effectively extremely high against humans, the lowest-strength
> computer player with that property will have an ELO higher than the
> highest-ranked human it has played. if these programs are close to
> well-ordered, and if their mutual winrates are high, then the ELO
> calculation for each of them should rapidly push the top program very
> far above the top human, even if the top human has never played the
> top program. the ELO will get sucked from whomever is willing to play
> them, but magnified because of the stratification of winrates that
> (presumably) occurs.
>
> it would be relatively easy to see to what extent this effect matters
> if anyone knows the pairwise winrates of the top (unmodified) computer
> players over a long enough time period.
>

If you simply observe that computers always win the matches, and almost
all the games, against the top humans, you know that computers are way
above humans. I don't want to quibble over the exact ELO difference; my
only point is that it's not close. All matches now are played with some
kind of odds involved; otherwise it's not even close or interesting.

It seems to me that you are not willing to accept that computers are
really that much stronger, and you are trying to explain it away with
speculation.

>
> s.
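
For anyone who wants to put numbers on the "hard to estimate winrates" point, here is a minimal sketch of the standard logistic Elo expected-score formula, E = 1 / (1 + 10^((R_b - R_a) / 400)), and its inverse. It assumes the usual 400-point scale; the function names are my own for illustration and are not taken from FIDE's or any server's actual rating code.

```python
import math


def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of player A against player B (standard logistic Elo model)."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def rating_gap(score: float) -> float:
    """Rating gap implied by an observed score fraction, for 0 < score < 1."""
    return 400.0 * math.log10(score / (1.0 - score))


if __name__ == "__main__":
    # Expected score flattens out quickly as the rating gap grows.
    for gap in (100, 200, 400, 800):
        print(f"gap {gap:4d}: expected score {expected_score(gap, 0):.3f}")

    # Near-certain results: a tiny change in score implies a large change in gap.
    for score in (0.90, 0.99, 0.999):
        print(f"score {score}: implied gap {rating_gap(score):.0f}")
```

Under this model a 400-point gap already corresponds to an expected score of roughly 91%, and the difference between scoring 99% and 99.9% is about another 400 points. So once one side wins nearly every game, the exact rating difference is very sensitive to a handful of results, even though the sign of the difference is not in doubt.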
_______________________________________________
Computer-go mailing list
[email protected]
http://dvandva.org/cgi-bin/mailman/listinfo/computer-go
