[Computer-go] Understanding statistics for benchmarking

2015-11-03 Thread Urban Hafner
So, I’m currently running 200 games against GnuGo to see if a change to my program made a difference. But I now wonder if that’s enough games as I ran the same benchmark with the same code (but a different compiler version) and received different results: 85.5% wins (171 games of 200) the first

Re: [Computer-go] Understanding statistics for benchmarking

2015-11-03 Thread Rémi Coulom
The intervals given by gogui are the standard deviation, not the usual 95% confidence intervals. For 95% confidence intervals, you have to multiply the standard deviation by two. And you still have the 5% chance of not being inside the interval, so you can still get the occasional
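As a rough illustration of the point above, here is a minimal Python sketch that computes the standard deviation of a win rate and an approximate 95% interval from it. It assumes the usual binomial (normal-approximation) formula; the function name `win_rate_interval` is made up for this example and is not part of GoGui.

```python
import math

def win_rate_interval(wins, games, z=2.0):
    """Return (win rate, standard deviation, (low, high)).

    Assumes independent games and the normal approximation to the
    binomial; z=2.0 gives roughly a 95% interval.
    """
    p = wins / games
    sd = math.sqrt(p * (1.0 - p) / games)  # standard deviation of the estimate
    return p, sd, (p - z * sd, p + z * sd)

# The 171 wins out of 200 games mentioned in the first message.
p, sd, (low, high) = win_rate_interval(171, 200)
print(f"win rate {p:.3f}, sd {sd:.3f}, ~95% interval [{low:.3f}, {high:.3f}]")
```

For 171 wins in 200 games the standard deviation comes out at about 2.5 percentage points, so the 95% interval is about 5 points either side of 85.5%.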

Re: [Computer-go] Understanding statistics for benchmarking

2015-11-03 Thread Urban Hafner
Thank you Remi! So the 85.5% +/- 2.5 reported by GoGui would be 85.5% +/- 5 for 95% and 85.5% +/- 7.5. Correct? And thanks for the table. I think that’s good enough for now. I’ve now figured out how to calculate the std. deviation myself (it is easy) and with those two tools together I can now

Re: [Computer-go] Understanding statistics for benchmarking

2015-11-03 Thread Urban Hafner
On Tue, Nov 3, 2015 at 2:22 PM, Petr Baudis wrote:
> (The situation is a bit dire if you have limited computing resources.
> I admit that sometimes I didn't follow the above myself in less formal
> exploratory experiments, but at least I tried to look only
> "infrequently", e.g.

Re: [Computer-go] Understanding statistics for benchmarking

2015-11-03 Thread Peter Drake
Here's Orego's Java code for this: It involves a "two-tailed test for difference of proportions". I usually run 500-1000 games in each condition. (The exact number depends on the hardware available at the time.)

On Tue, Nov 3, 2015 at 5:50 AM, Urban Hafner wrote:
>
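The Orego code itself is not reproduced in this excerpt. For readers who want to see what a two-tailed test for a difference of proportions looks like, here is a Python sketch of the standard pooled two-proportion z-test. It is not Orego's implementation; the function name and the second set of results (160 of 200) are hypothetical, chosen only for illustration.

```python
import math

def two_proportion_z_test(wins_a, games_a, wins_b, games_b):
    """Two-tailed z-test for a difference between two win rates.

    Assumes independent games and the normal approximation.
    Returns (z, p_value).
    """
    p_a = wins_a / games_a
    p_b = wins_b / games_b
    # Pooled win rate under the null hypothesis that both are equal.
    pooled = (wins_a + wins_b) / (games_a + games_b)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / games_a + 1.0 / games_b))
    if se == 0.0:
        return 0.0, 1.0  # all wins or all losses in both runs: no evidence
    z = (p_a - p_b) / se
    # Two-tailed p-value: 2 * (1 - Phi(|z|)) == erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value

# 171/200 is from the thread; 160/200 is a made-up second run.
z, p_value = two_proportion_z_test(171, 200, 160, 200)
print(f"z = {z:.2f}, two-tailed p = {p_value:.3f}")
```

With these numbers the difference is not significant at the 5% level, which fits the advice to run several hundred games per condition.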

Re: [Computer-go] Understanding statistics for benchmarking

2015-11-03 Thread Petr Baudis
On Tue, Nov 03, 2015 at 09:46:00AM +0100, Rémi Coulom wrote:
> The intervals given by gogui are the standard deviation, not the usual 95%
> confidence intervals.
>
> For 95% confidence intervals, you have to multiply the standard deviation by
> two.
>
> And you still have the 5% chance of not

Re: [Computer-go] Understanding statistics for benchmarking

2015-11-03 Thread Kahn Jonas
On Tue, 3 Nov 2015, Urban Hafner wrote:
> Thank you Remi! So the 85.5% +/- 2.5 reported by GoGui would be 85.5% +/- 5 for 95% and 85.5% +/- 7.5. Correct?

Correct. But the intervals do not have to be non-overlapping for the difference to be significant. You may divide those intervals by $\sqrt{2}$ before testing
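One way to read the $\sqrt{2}$ remark, sketched under the assumption that the two runs are independent and have roughly the same standard deviation $\sigma$ (for example, the same number of games and similar win rates); the symbols $\hat p_1$, $\hat p_2$ and $\sigma$ are introduced here for illustration:

```latex
\begin{align*}
\operatorname{Var}(\hat p_1 - \hat p_2) &= \sigma^2 + \sigma^2 = 2\sigma^2
  && \text{(independent runs)} \\
\operatorname{sd}(\hat p_1 - \hat p_2) &= \sqrt{2}\,\sigma \\
|\hat p_1 - \hat p_2| &> 2\sqrt{2}\,\sigma \approx 2.83\,\sigma
  && \text{(significant at roughly 95\%)} \\
|\hat p_1 - \hat p_2| &> 4\,\sigma
  && \text{(what non-overlap of the $\pm 2\sigma$ intervals demands)}
\end{align*}
```

Shrinking each interval from $\pm 2\sigma$ to $\pm 2\sigma/\sqrt{2} = \pm\sqrt{2}\,\sigma$ makes "the intervals do not overlap" equivalent to $|\hat p_1 - \hat p_2| > 2\sqrt{2}\,\sigma$, which is the actual significance condition. Requiring the full $\pm 2\sigma$ intervals to be disjoint is stricter than necessary.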

Re: [Computer-go] Understanding statistics for benchmarking

2015-11-03 Thread Urban Hafner
Yes, I noticed that too. But luckily that's the one thing I didn't even consider doing. Running the same number of games feels like the most natural thing to do anyway.

Sent from my iPhone

> On 03.11.2015 at 14:22, Petr Baudis wrote:
>
>> On Tue, Nov 03, 2015 at