Don Dailey wrote:
Another example I found is the impressive Valkyria program. Version
2.7 won 92% of its games, more than even the top-rated greenpeep0.5.1.

However, the average rating of Valkyria's opponents was only 1722, which is a big gap. So Valkyria is rated only 2222, compared to greenpeep's 2621, despite the fact that it wins a larger share of its games!
Of course, 2222 is still impressive, and Valkyria is number 17 out of all
the programs that have played over 200 games (there are over 200 of them).
- Don
Hi,

I would like to add a note to this discussion to explain that the computed rating is not a function of winning rate and average opponent rating. Consider this simple example:

player A:
1 win and 1 loss against a player of rating 1500
1 win against a player of rating 500
1 loss against a player of rating 4500
-> 50% against an average of 2000

player B:
1 win and 1 loss against a player of rating 2000
1 win against a player of rating 1000
1 loss against a player of rating 3000
-> 50% against an average of 2000

Although they have the same average opponent rating and the same winning rate, player A's rating should be much lower than player B's: the win against the much weaker player and the loss against the much stronger one are nearly foregone conclusions, so A's record is effectively 1-1 against a 1500 player, while B's is effectively 1-1 against a 2000 player.
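
To make this concrete, here is a small Python sketch (my own illustration, not bayeselo itself) that fits the standard logistic Elo model to each player's four results by maximum likelihood. The function names and search bounds are just assumptions for the example; running it gives roughly 1500 for player A and 2000 for player B.

import math

def win_prob(r, opp):
    # Standard logistic Elo model: probability that a player rated r
    # beats an opponent rated opp.
    return 1.0 / (1.0 + 10.0 ** ((opp - r) / 400.0))

def log_likelihood(r, results):
    # results is a list of (opponent_rating, score), score 1 = win, 0 = loss.
    total = 0.0
    for opp, score in results:
        p = win_prob(r, opp)
        total += math.log(p) if score == 1 else math.log(1.0 - p)
    return total

def mle_rating(results, lo=0.0, hi=4000.0, iters=100):
    # The log-likelihood is concave in r, so a ternary search on [lo, hi]
    # converges to the maximum-likelihood rating.
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if log_likelihood(m1, results) < log_likelihood(m2, results):
            lo = m1
        else:
            hi = m2
    return (lo + hi) / 2.0

player_a = [(1500, 1), (1500, 0), (500, 1), (4500, 0)]
player_b = [(2000, 1), (2000, 0), (1000, 1), (3000, 0)]

print("player A: %.0f" % mle_rating(player_a))  # about 1500
print("player B: %.0f" % mle_rating(player_b))  # about 2000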

Maybe this was already clear to Don, but his message sounds a little as if it were possible to estimate a rating from winning rate and average opponent rating alone. It is not.

Some rating algorithms try to do it anyway (EloStat and the rating system of the French Chess Federation, for example), but they are badly flawed. Real-life examples where they fail can be found on bayeselo's page:
http://remi.coulom.free.fr/Bayesian-Elo/
(look for "average of ratings" in the page)

Rémi
