Massimiliano Maini <[email protected]> wrote:

> Why don't we show the % instead of the JSD ? It's much more reasonable.

The trouble with this is that the percentages don't mean what you think they mean.

In the bgonline thread, some people got the misimpression that the points I was making were philosophical ones, and that I was arguing as a Bayesian. Before I go any further, let me state clearly that the points I am about to make are *strictly from the point of view of classical hypothesis testing*. I am *not* going to argue here that a Bayesian approach is better. Instead, I am just going to clear up some common misconceptions about what confidence intervals mean.

> Notice that the percentage shown aside the top play is the
> "confidence" we have in it being better than the 2nd best play.

This is not correct. It is an extremely common misconception. The percentage is the probability that we would see the results that we in fact see (or even more skewed results), *under the assumption that the plays are equal*. This is *not* the same as the *confidence we have that the first play is better than the second play*.

I will state this again because it is so counterintuitive. We would like to think that "5%" is the probability of some event occurring in the real world. But *it's not*. 5% is the probability that, in the strange and implausible world where *the two plays are equal*, something as skewed as what we see (or something even more skewed) would occur. It is tempting, *but wrong*, to twist this statement around into something like, "There is a 5% probability that the lower-ranked play is better." THIS IS WRONG.

Given that it's wrong to say this in the case of just two plays, it follows that describing the multivariate tail probability as "the probability that the third-ranked play is the best" (in the case of more than two plays) *is also wrong*, for the same reason. I strongly believe that GNU Backgammon should not say things that are just plain wrong, and should not perpetuate common statistical misconceptions.
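
To make the definition above concrete, here is a minimal sketch in Python. It is not GNU Backgammon code. It assumes a one-sided normal tail for converting a j.s.d. value into a percentage, and the standard errors, function names, and trial count are all hypothetical choices made for illustration. The simulation builds the "strange world" in which both plays have exactly the same true equity and counts how often the observed gap is at least as skewed as the threshold.

```python
import math
import random


def tail_probability(jsd):
    """One-sided normal tail P(Z >= jsd), computed under the assumption
    that the two plays are truly equal (an illustrative convention)."""
    return 0.5 * math.erfc(jsd / math.sqrt(2.0))


def simulate_null(jsd_threshold, trials=20000, seed=1):
    """Fraction of simulated rollout pairs whose observed gap reaches the
    threshold when both plays have the SAME true equity."""
    rng = random.Random(seed)
    sd_a, sd_b = 0.010, 0.015          # hypothetical rollout standard errors
    joint_sd = math.hypot(sd_a, sd_b)  # joint standard deviation of the gap
    hits = 0
    for _ in range(trials):
        est_a = rng.gauss(0.0, sd_a)   # estimated equity of play A (true value 0)
        est_b = rng.gauss(0.0, sd_b)   # estimated equity of play B (true value 0)
        if (est_a - est_b) / joint_sd >= jsd_threshold:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    print("analytic tail :", tail_probability(1.645))
    print("simulated null:", simulate_null(1.645))
```

Both numbers should come out near 5%. Note what the simulation conditions on: the plays are equal in every trial, so the lower-ranked play is never actually better. The 5% describes how often equal plays produce a gap this large, and says nothing about how likely either play is to be the better one.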

Now, I happen to believe that percentages are more intuitive than j.s.d. numbers, and I am in favor of reporting things as percentages rather than as j.s.d. numbers. However, the percentages should *not* be incorrectly described as "probabilities that this play is the best."

*If* one insists on having GNU Backgammon issue claims of the form, "the probability that this play is the best is X%," *then* one should adopt a Bayesian standpoint. But I promised to speak strictly from the point of view of classical hypothesis testing, so I will say only that statements of the form "the probability that this play is the best is X%" are simply *impossible* from this viewpoint. The multivariate tail probability, for example, tells you only the probability that some strange event will occur *under the assumption that the equities are equal to the estimated equities*. This is *not* the same as *the probability that the true equities are different from their estimated values*.

If you don't believe that what I am saying here is as clear-cut as I am claiming it is, then check with a statistician. And when I say "statistician," I don't just mean a scientist who uses statistics on a regular basis. I recently learned of a study where 70 academic psychologists were quizzed on what confidence intervals meant, and only 3 out of the 70 got it right. (Oakes, Statistical inference: A commentary for the social and behavioral sciences, Wiley, 1986.)

Tim

_______________________________________________
Bug-gnubg mailing list
[email protected]
http://lists.gnu.org/mailman/listinfo/bug-gnubg
