In message <[email protected]>, Darren Cook <[email protected]>
writes
> How many games do two programs need to play to be able to say with 95%
> confidence that a new feature/bug fix has given a 50 ELO improvement?
The answer must depend on the improvement the fix has in fact produced.
If it was 200 ELO the answer may be quite small (though it will also
depend on the strength of the opponent). If it was 51 ELO it will be
enormous.
I wonder if the question you meant to ask is:
How many games do two programs need to play to be able to say with 95%
confidence that a new feature/bug fix, which in fact has given a 50 ELO
improvement, is indeed an improvement?
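
For that second form of the question, a rough estimate can be sketched
as below. This is only a back-of-envelope calculation under assumptions
of my own, not a definitive answer: the usual logistic Elo formula, no
draws, independent games, a normal approximation to the binomial, and a
one-sided test sized so that the expected result only just reaches
significance (so about half the time you would need more games). The
names expected_score and games_needed are made up for illustration.

```python
import math
from statistics import NormalDist


def expected_score(elo_diff):
    """Expected score per game under the standard logistic Elo model."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))


def games_needed(true_elo, threshold_elo=0.0, confidence=0.95):
    """Approximate games needed to show a gain above threshold_elo.

    Assumptions (mine, for illustration): no draws, independent games,
    normal approximation, one-sided test. The count is the point where
    the *expected* win rate just reaches the critical value, i.e. the
    test has only about 50% power at this sample size.
    """
    p_true = expected_score(true_elo)       # win rate actually achieved
    p_null = expected_score(threshold_elo)  # win rate at the threshold
    if p_true <= p_null:
        raise ValueError("true_elo must exceed threshold_elo")
    z = NormalDist().inv_cdf(confidence)    # one-sided critical value
    variance = p_null * (1.0 - p_null)      # per-game variance under null
    return math.ceil(z * z * variance / (p_true - p_null) ** 2)


if __name__ == "__main__":
    # "Is a true 50 Elo gain an improvement at all?" at 95% and 99%.
    print(games_needed(50), games_needed(50, confidence=0.99))
    # A true 200 Elo gain needs far fewer games.
    print(games_needed(200))
    # The original form: a true 51 Elo gain tested against 50 Elo.
    print(games_needed(51, threshold_elo=50))
```

With these assumptions a true 50 Elo gain needs on the order of a
hundred-odd games at 95%, a 200 Elo gain needs only a handful (where the
normal approximation is itself shaky), and separating 51 from 50 needs
hundreds of thousands.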
Nick
> What about 200 ELO? What about 99% confidence? I'm sure there must be a
> straightforward equation for this, but google doesn't understand what I
> am asking it, and my own statistics knowledge is letting me down.
>
> TIA,
>
> Darren
--
Nick Wedd [email protected]
_______________________________________________
Computer-go mailing list
[email protected]
http://dvandva.org/cgi-bin/mailman/listinfo/computer-go