Assuming that the algorithm does in fact produce a uniform distribution, each number is the probability that the observed distribution would differ from the expected distribution (equal numbers in each of the 15 bins) by at least as much as it did, purely due to chance. In other words, for the 2nd test there is a 12.05% probability that a deviation from uniform at least as large as the one observed would arise by chance alone. If you had chosen a "standard" 5% significance threshold, you would not reject the null hypothesis that the numbers are uniformly distributed in any of your 5 experiments.
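For anyone who wants to reproduce this kind of number outside J, here is a minimal Python sketch of the same experiment. It is not the code from the thread. The function names (`chi2_sf_even_df`, `rejection_sample`, `chisquare_statistic`), the seed, and the reduced sample size of 100,000 are my own choices for illustration. The tail probability uses the closed form that holds only for an even number of degrees of freedom, which covers the 14 used here.

```python
import math
import random


def chi2_sf_even_df(x, df):
    """Upper-tail probability P(X >= x) for a chi-square variable.

    Valid only for even df = 2m, where the tail has the closed form
    exp(-x/2) * sum_{i=0}^{m-1} (x/2)^i / i!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires a positive even df")
    half = x / 2.0
    term = 1.0   # (x/2)^i / i!, starting at i = 0
    total = 0.0
    for i in range(df // 2):
        total += term
        term *= half / (i + 1)
    return math.exp(-half) * total


def rejection_sample(y, y1, n, rng):
    """Draw n integers in [0, y) by drawing from [0, y1) and discarding t >= y."""
    out = []
    while len(out) < n:
        t = rng.randrange(y1)
        if t < y:
            out.append(t)
    return out


def chisquare_statistic(z, bins, width):
    """Chi-square statistic of z against equal expected counts in each bin."""
    observed = [0] * bins
    for v in z:
        observed[v // width] += 1
    expected = len(z) / bins
    return sum((o - expected) ** 2 / expected for o in observed)


rng = random.Random(2011)                      # arbitrary seed, for repeatability
z = rejection_sample(12345, 20000, 100_000, rng)
stat = chisquare_statistic(z, 15, 823)         # 12345 = 15 * 823
p = chi2_sf_even_df(stat, 14)                  # 15 bins -> 14 degrees of freedom
print(stat, p)
```

Running this with several different seeds should give p values scattered over the whole interval from 0 to 1, which is what a uniform generator is expected to produce.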
Obviously if you did one experiment and got 0.01 (a 1% probability that a deviation that large could have occurred purely by chance), it is still possible that the algorithm produces a uniform distribution and you just got unlucky! I'd be suspicious if the numbers were all very similar.

On Sun, May 22, 2011 at 3:30 AM, Roger Hui <[email protected]> wrote:
> Thanks for your reply. To paraphrase Ewart Shaw,
> my lack of depth in statistics is only ameliorated by
> the narrowness of my understanding. Regarding the
> last result:
>
>>    -.14&chisqcdf f i.5
>> 0.262214 0.120479 0.124215 0.292923 0.68203
>
> What are the numbers saying? Do they indicate that the
> underlying distribution is nearly uniform or far from uniform?
> Is it a good or bad thing (or irrelevant) that the 5 numbers
> are quite different from each other?
>
> In "school" I learned that the Kolmogorov-Smirnov test
> http://en.wikipedia.org/wiki/Kolmogorov%E2%80%93Smirnov_test
> is good for this kind of thing. What do you think of
> this test?
>
>
>
> ----- Original Message -----
> From: John Randall <[email protected]>
> Date: Saturday, May 21, 2011 7:52
> Subject: Re: [Jprogramming] roll
> To: Programming forum <[email protected]>
>
>> The process described should yield uniform random numbers
>> between 0 and y-1.
>>
>> Following up on the tests in the original posting, we can put the
>> numbers into equal sized bins and then calculate a chi-square
>> statistic. If this is not significant, we can be somewhat
>> assured that the numbers are random.
>>
>>    f=:3 : 0"0
>> y=. 12345
>> t=. ?2e6$20000
>> z=. 1e6 {. (12345>t)#t
>> NB. 12345=15*823
>> bin=.<.@% &823
>> observed=.(bin z) #/. z
>> expected=.(#z)%15
>> chisquare=.+/(*:observed-expected)%expected
>> )
>>
>>    require '~addons/stats/base/distribution.ijs'
>>
>>    -.14&chisqcdf f i.5
>> 0.262214 0.120479 0.124215 0.292923 0.68203
>>
>> Best wishes,
>>
>> John
>>
>>
>>
>> Roger Hui wrote:
>> > I have found a bug in ?y where y is an extended
>> > precision integer. The result should never be y,
>> > but:
>> >
>> >    t=: ? 1e6 $ 12345x
>> >    +/t=12345
>> > 89
>> >
>> > The last result should of course be 0.
>> >
>> > I know how this can be fixed, but I would like to
>> > verify that the algorithm for ?y is correct with the
>> > mathematicians in this forum.
>> >
>> > For y an extended precision number, ?y is computed
>> > by t=.?y1 where y1 is a little bigger than y,
>> > repeating until t is less than y. (The same y1
>> > is used for a given y.) Assuming that ?m produces
>> > uniform random numbers between 0 and m-1, does
>> > this process give uniform random numbers between
>> > 0 and y-1?
>> >
>> > For example, suppose we want to compute z=:?1e6$y=:12345.
>> >
>> >    y=: 12345
>> >    t=: ?2e6$20000
>> >    +/12345>t
>> > 1233158
>> >    z=: 1e6 {. (12345>t)#t
>> >
>> >    NB. various tests
>> >    (+/z)%#z
>> > 6170.3
>> >    12344%2
>> > 6172
>> >
>> >    (c%y) , (+/c>:z)%#z [ c=: ? y
>> > 0.993277 0.993333
>> >    (c%y) , (+/c>:z)%#z [ c=: ? y
>> > 0.43872 0.438943
>> >    (c%y) , (+/c>:z)%#z [ c=: ? y
>> > 0.593925 0.594519
>> >    (c%y) , (+/c>:z)%#z [ c=: ? y
>> > 0.315674 0.315579
>
> ----------------------------------------------------------------------
> For information about J forums see http://www.jsoftware.com/forums.htm
