Standard error may be informative, but statistical significance matters more to me. I think it is clear from those numbers that gnubg is better than Jelly, but nothing more.
On 7/6/07, Achim Mueller <[EMAIL PROTECTED]> wrote:
* Joseph Heled <[EMAIL PROTECTED]> [070705 12:41]:
> Even if gnubg wins a match only 49.5%, in a set of 1000 matches there
> is more than a 5% chance that gnubg wins 519 of them. That is what a
> 95% (one-sided) confidence interval means.

I guess I got it now. I was probably misled by an article by Chuck Bower
in GOL where he described std.err as sqrt(a*b/1000) and sigma as
(a-b)/std.err (which is probably wrong). I just had a chat with Douglas
Zare. He says that sigma should be 0.5(a-b)/std.err. In other words,
don't take the difference between the two bots, but between one bot and
the mean (which is 500 here). If that's correct - I'm not sure and a bit
puzzled now - we get the following numbers:

gnu/jelly      5.88   100.0%
gnu/snowie     0.69    51.0%
gnu/bgb        1.20    77.0%
bgb/jelly      2.15    96.8%
bgb/snowie     0.25    19.7%
snowie/jelly   4.56   100.0%

This definitely makes more sense. Sorry for the confusion.

Ciao

Achim
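For anyone who wants to check the arithmetic, here is a minimal Python sketch of the calculation as described above: std.err = sqrt(a*b/n) for a wins against b wins out of n = a + b matches, and sigma = 0.5*(a - b)/std.err. The percentage column in the table is consistent with the two-sided confidence level 2*Phi(sigma) - 1, which is what the sketch reports. The win counts in the example are hypothetical, since the raw match results are not quoted in this message.

    from math import sqrt, erf

    def significance(a, b):
        """For a wins vs. b wins out of n = a + b matches, return
        (sigma, confidence), where

        std_err    = sqrt(a * b / n)
        sigma      = 0.5 * (a - b) / std_err, i.e. the deviation of a
                     from the mean n / 2 in units of the standard error
        confidence = 2 * Phi(sigma) - 1, the two-sided confidence that
                     the difference between the bots is real
        """
        n = a + b
        std_err = sqrt(a * b / n)
        sigma = 0.5 * (a - b) / std_err
        phi = 0.5 * (1.0 + erf(sigma / sqrt(2.0)))  # standard normal CDF
        return sigma, 2.0 * phi - 1.0

    if __name__ == "__main__":
        # Hypothetical counts out of 1000 matches, for illustration only.
        sigma, conf = significance(560, 440)
        print("sigma = %.2f, confidence = %.1f%%" % (sigma, 100 * conf))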
_______________________________________________
Bug-gnubg mailing list
[email protected]
http://lists.gnu.org/mailman/listinfo/bug-gnubg
