Hi Don, All,
The 9x9 server has been down a couple of times recently when I went to
start up the standard engine pack
(http://computergo.wikispaces.com/CGOS+Standard+Engine+Packs).
I took the chance to put the ratings of all the different instances of
each engine into a spreadsheet, to see how much spread there was in the
ratings. To my surprise, the spread varied a lot from engine to engine.
The smallest standard deviation was 34 (GnuGo3.8MC100) and the largest
was 105 (Go169).
Clearly there is a big difference here. Is there a way for the rating
system to reflect how reliable each engine's rating is? I thought maybe
an engine with a low standard deviation could have a low K, and an
engine with a high standard deviation could have a higher K. This would
also help with engines that learn: while their rating is still changing
they would have a high SD and so a high K. When they reach their
ceiling, the rating would stabilise and K would go down.
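
To make the idea concrete, here is a rough Python sketch of what I mean.
It is only an illustration under my own assumptions, not how CGOS
actually computes ratings: the names adaptive_k and update_rating, the
K range (8 to 32) and the SD range (30 to 110) are all made up for the
example. Only the expected-score formula is the textbook Elo one.

```python
import statistics


def expected_score(rating, opponent_rating):
    """Textbook Elo expected score of a player against one opponent."""
    return 1.0 / (1.0 + 10 ** ((opponent_rating - rating) / 400.0))


def adaptive_k(recent_ratings, k_min=8.0, k_max=32.0, sd_low=30.0, sd_high=110.0):
    """Hypothetical rule: map the SD of recent ratings linearly onto [k_min, k_max].

    All four bounds are illustrative assumptions, not CGOS settings.
    """
    if len(recent_ratings) < 2:
        # Too little history to estimate a spread, so stay responsive.
        return k_max
    sd = statistics.stdev(recent_ratings)
    frac = (sd - sd_low) / (sd_high - sd_low)
    frac = min(1.0, max(0.0, frac))  # clamp to [0, 1]
    return k_min + frac * (k_max - k_min)


def update_rating(rating, opponent_rating, score, k):
    """One Elo update; score is 1 for a win, 0.5 for a draw, 0 for a loss."""
    return rating + k * (score - expected_score(rating, opponent_rating))
```

A stable engine (small SD in its recent ratings) would get K near the
low end and move slowly, while a learning engine would get K near the
high end until its rating settles.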
I've uploaded a couple of screenshots of the graphs and the spreadsheet
with all the raw data in it to the computergo wiki here:
http://computergo.wikispaces.com/CGOS+Rating+System
Here's the big picture: mean deviation from the mean across all
instances of all engines is ~44. I've pasted the table in-line below,
but I'm not sure how it will come out once Thunderbird has
auto-converted this mail to plain text format for the list.

Engine        Std Dev   Max    Min    Mean
Go169         104.68    1232   947    1091.25
Amigo1.7      72.18     1192   1027   1108.71
Gnu3.8MC100   33.91     1516   1423   1453
Aya6.34       54.6      1774   1621   1705.2
Gnu3.8MC1K    42.11     1807   1670   1743.43
Gnu3.8MC10    43.02     1824   1696   1758.86
Gnu3.8L1      48.53     1857   1711   1772.63
Gnu3.8L10     44.57     1849   1713   1777.13

Apologies if the formatting is bad. It is probably best to look at the
wiki anyway, as it has nice pictures =)
What do you think about the idea of an adjustable weighting depending on
the consistency of an engine? I don't really understand Elo, so maybe
you will tell me it already takes this into account!
Raffles
PS: does anybody know why the 9x9 server has been going down? The
19x19 server is still up.