Hello Edward,

Sunday, January 8, 2006, 11:07:46 PM, you wrote:

> Let's consider this important aspect of benchmarking more carefully.
> So there is an interesting question: how large must a difference be
> in order to claim that some fs really wins on this statistic? Is
> there any guarantee you won't get, say, 0.05 and 0.02 after the next
> run? Sorry, but I didn't find any answer in Justin's notes; NOTE5
> (Tests Performed) says that questionable tests were re-run, but it
> seems we need some real research here instead of a re-run.

Exactly. By the way, Justin writes that he ran each test only 3 times
and calculated the average of these 3. In statistics this is a very
small sample; we would need at least 30 or so. If the results had a
large variance, they should be treated with exponential smoothing, and
then we could proceed with the calculations.

Also, it would be nice to have data from these exact tests collected
regularly, to test for regressions and see the trend.
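For illustration, here is a minimal Python sketch of what I mean: report
the spread of the runs (not just the average), and apply simple
exponential smoothing. The timing values are made up for the example:

```python
from statistics import mean, stdev

# Hypothetical timings (seconds) from repeated runs of one benchmark.
runs = [0.05, 0.02, 0.04, 0.06, 0.03, 0.05, 0.04]

# With only a handful of runs, the spread matters as much as the mean.
print(f"mean={mean(runs):.3f}  stdev={stdev(runs):.3f}  n={len(runs)}")

def exp_smooth(values, alpha=0.3):
    """Simple exponential smoothing: s_t = alpha*x_t + (1-alpha)*s_(t-1)."""
    s = values[0]
    out = [s]
    for x in values[1:]:
        s = alpha * x + (1 - alpha) * s
        out.append(s)
    return out

print([round(v, 4) for v in exp_smooth(runs)])
```

With a smoothing factor alpha around 0.2-0.3, single outlier runs pull the
smoothed series much less than they pull a 3-run average.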
> etc..) 10 times. We will obtain for the same statistic X a set of
> different (because of errors) values x1, x2, ..., x10. Suppose that
> X has a normal distribution (any objections?)

Well, for serious reasoning a normality test should be applied first:
Kolmogorov-Smirnov, Shapiro-Wilk, or any other test that checks for a
normal distribution. If the data pass, we can go ahead with all the
calculations. If not... well, I'm not enough of a statistics guru to
say :-)

Regards,
Maciej
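P.S. A rough sketch of how such a check could look in plain Python: a
one-sample Kolmogorov-Smirnov statistic against a normal distribution
fitted to simulated run times, compared with an approximate 5% critical
value. (Strictly speaking, when the mean and variance are estimated from
the same data, the proper cutoff is the smaller Lilliefors one; this is
only an illustration.)

```python
import math
import random

def normal_cdf(x, mu, sigma):
    # Normal CDF via the error function.
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def ks_statistic(samples, mu, sigma):
    """One-sample Kolmogorov-Smirnov statistic against N(mu, sigma^2)."""
    xs = sorted(samples)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        cdf = normal_cdf(x, mu, sigma)
        # The empirical CDF jumps from i/n to (i+1)/n at x.
        d = max(d, abs(cdf - i / n), abs(cdf - (i + 1) / n))
    return d

random.seed(42)
data = [random.gauss(0.05, 0.01) for _ in range(30)]  # simulated timings
mu = sum(data) / len(data)
sigma = math.sqrt(sum((x - mu) ** 2 for x in data) / (len(data) - 1))

d = ks_statistic(data, mu, sigma)
# Approximate alpha=0.05 critical value for a fully specified
# distribution; the Lilliefors cutoff for estimated parameters is lower.
critical = 1.36 / math.sqrt(len(data))
print(f"D={d:.4f}  critical~{critical:.4f}  looks normal: {d < critical}")
```

With 30 genuinely Gaussian samples the statistic stays well below the
cutoff; with 3 samples no such test has any power at all, which is
another argument for more runs.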