I would imagine that "practical significance" depends on the absolute level. Between two beginners, a 23% difference means little, as it can be overtaken by a day's study. Between top professionals, it probably means the difference between a legendary 9p who wins many top-title tournaments and a 9p who never wins a top title in his life.

Mark

On Mon, Nov 26, 2012 at 4:13 AM, Don Dailey <dailey....@gmail.com> wrote:
>
> On Mon, Nov 26, 2012 at 4:05 AM, "Ingo Althöfer" <3-hirn-ver...@gmx.de>
> wrote:
>>
>> One general comment:
>>
>> Ratings are not transitive. For instance,
>> A1 may score 25 % against B,
>> and A2 may score 22 % against B.
>> Then it cannot be concluded that A1 will score more than 50 %
>> in a direct duel with A2.
>>
>> It is rather easy to construct triples of "semi-simple" agents A, B, C
>> for some "normal" game where
>> A scores 95+ percent against B,
>> B scores 95+ percent against C,
>> C scores 95+ percent against A.
>
> Hi Ingo,
>
> The Elo system, which tries to model game-playing skill mathematically,
> makes some assumptions that are not completely true but are
> approximations to reality. One assumption made by the Elo system is
> that skill IS transitive. It works quite well because in practice human
> skill and program skill are nearly transitive. So it has proven to be a
> very good model indeed.
>
> As you say, it is not difficult to artificially construct classes of
> players who do not have transitive relationships between each other.
> One very simple way to do this is to take 3 equal players and give them
> each a different opening book such that the book will get them quickly
> into losing or winning situations against each other. You can create
> your own "rocks/paper/scissors" non-transitive relationship this way.
>
> You can also do it with the playing algorithm; it's a bit more difficult
> but certainly possible. You give one program a serious weakness that one
> of the other 2 can easily exploit but that the other program cannot, so
> each program has a unique exploitable weakness that only one of the
> other 2 programs can exploit.
>
> Don
>
>>
>> Ingo.
>>
>> -------- Original Message --------
>> > Date: Sun, 25 Nov 2012 17:03:33 -0800
>> > From: Leandro Marcolino <soria...@usc.edu>
>> > To: computer-go@dvandva.org
>> > Subject: [Computer-go] Practical significance?
>>
>> > Hello all!..
>> >
>> > I am currently doing research on Computer Go. I can't tell the
>> > details about it yet, but I will post them here after (if) my paper
>> > is accepted...
>> >
>> > In my research I compare many systems (An), playing against a fixed
>> > strong adversary (B). So A1 would have a percentage of victory x1
>> > against B, while A2 would have a percentage of victory x2, etc...
>> > Then I compare the percentages of victories, and for most cases I can
>> > show that one system is better than another with 95% confidence.
>> > However, my adviser is asking me about not only the STATISTICAL
>> > significance of the results, but also their PRACTICAL significance.
>> > I mean, if one system is, for example, only 1% better than another,
>> > with 99% confidence, the result would have statistical significance,
>> > but wouldn't really matter in a practical sense.
>> >
>> > In my case, the difference between the systems can range from about
>> > 4% to about 23%. That doesn't seem to be enough to argue that one
>> > system would be one handicap stone better than another. But what
>> > would be the minimum difference for me to argue that one system is
>> > significantly better than another, in a practical sense? (or they
>> > are not, in the end?..) Would calculating Elo ratings help me in
>> > answering this question?
>> >
>> > I think it gets even more complex if we think that, let's say,
>> > changing the percentage of victory from 95% to 100% seems to be much
>> > more significant (in a practical sense) than changing from 30% to
>> > 35%, even though the difference between the two systems is still
>> > only 5%.
>> > In my case, I am dealing with percentages of victories that range
>> > from around 30% to around 53%.
>> >
>> > What do you guys think?..
>> >
>> > Thanks for your help!..
>> >
>> > Regards,
>> > Leandro
>> _______________________________________________
>> Computer-go mailing list
>> Computer-go@dvandva.org
>> http://dvandva.org/cgi-bin/mailman/listinfo/computer-go
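
The Elo question raised in the thread can be sketched numerically. The snippet below is only an illustration, not anything posted by the participants: it assumes the standard logistic Elo model with a 400-point scale and no draws, and the function name elo_diff is made up for this example. It converts a win rate against the fixed opponent B into an implied rating gap, which shows why the same 5-point change in win rate is worth more rating near the extremes than near the middle. Ingo's caveat still applies: gaps inferred from games against a single opponent B need not predict the direct result between A1 and A2.

```python
import math


def elo_diff(win_rate: float) -> float:
    """Rating gap (in Elo points) implied by a win rate.

    Assumes the standard logistic Elo model with a 400-point scale and
    no draws. The name and interface are illustrative only.
    """
    if not 0.0 < win_rate < 1.0:
        # 0% and 100% imply an infinite gap, so they cannot be converted.
        raise ValueError("win_rate must be strictly between 0 and 1")
    return 400.0 * math.log10(win_rate / (1.0 - win_rate))


# Compare equal-looking changes in win rate against the same opponent B.
for low, high in [(0.30, 0.35), (0.30, 0.53), (0.90, 0.95)]:
    gain = elo_diff(high) - elo_diff(low)
    print(f"{low:.0%} -> {high:.0%} against B: about {gain:.0f} Elo")
```

Under these assumptions, moving from 30% to 35% corresponds to far fewer rating points than moving from 90% to 95%, and a 100% score cannot be converted at all, which is one way to express the intuition that percentage differences are not comparable across the scale.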