Our null hypothesis is that the numbers are evenly distributed across bins, and our alternative hypothesis that they are not.
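In symbols, and only as a sketch: assuming 15 equal-width bins as in the example quoted below, write p_i for the probability of bin i, O_i for the observed count in bin i, E for the expected count per bin, and n for the total number of values. These names are notation introduced here for illustration; they do not appear in the code. The hypotheses and the quantities used in the test are then:

```latex
\begin{align*}
H_0 &: p_0 = p_1 = \cdots = p_{14} = \tfrac{1}{15}, \\
H_1 &: p_i \neq \tfrac{1}{15} \text{ for at least one } i, \\
E &= \frac{n}{15}, \qquad
y = \sum_{i=0}^{14} \frac{(O_i - E)^2}{E}, \\
\text{P-value} &= P(X > y), \qquad X \sim \chi^2_{14}.
\end{align*}
```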
We form the chi-square statistic y to measure the difference between our observed distribution and our expected distribution. It is zero for an exact match and large for a mismatch.

Now fix a significance level alpha (say alpha=0.01). We calculate the P-value -.14&chisqcdf y (14=15-1 is the number of degrees of freedom, one of the parameters of the distribution). If X is a chi-square random variable with the correct number of degrees of freedom, the P-value is simply P(X>y). If the P-value is smaller than alpha, we reject the null hypothesis and conclude that the distribution is not uniform.

We can never legitimately conclude that the distribution is uniform, but performing a large number of trials and never rejecting the null hypothesis gives one confidence.

In the example given, what is important is that none of the P-values is very small. Considerable variation among them is to be expected: if the null hypothesis holds, the P-value is itself approximately uniformly distributed between 0 and 1.

I have never used the Kolmogorov-Smirnov test, but I imagine it could do something similar.

Best wishes,

John

Roger Hui wrote:
> Thanks for your reply. To paraphrase Ewart Shaw,
> my lack of depth in statistics is only ameliorated by
> the narrowness of my understanding. Regarding the
> last result:
>
>>    -.14&chisqcdf f i.5
>> 0.262214 0.120479 0.124215 0.292923 0.68203
>
> What are the numbers saying? Do they indicate that the
> underlying distribution is nearly uniform or far from uniform?
> Is it a good or bad thing (or irrelevant) that the 5 numbers
> are quite different from each other?
>
> In "school" I learned that the Kolmogorov-Smirnov test
> http://en.wikipedia.org/wiki/Kolmogorov%E2%80%93Smirnov_test
> is good for this kind of thing. What do you think of
> this test?
>
>
>
> ----- Original Message -----
> From: John Randall <[email protected]>
> Date: Saturday, May 21, 2011 7:52
> Subject: Re: [Jprogramming] roll
> To: Programming forum <[email protected]>
>
>> The process described should yield uniform random numbers
>> between 0 and y-1.
>>
>> Following up on the tests in the original posting, we can put the
>> numbers into equal sized bins and then calculate a chi-square
>> statistic. If this is not significant, we can be somewhat
>> assured that the numbers are random.
>>
>> f=:3 : 0"0
>> y=. 12345
>> t=. ?2e6$20000
>> z=. 1e6 {. (12345>t)#t
>> NB. 12345=15*823
>> bin=.<.@% &823
>> observed=.(bin z) #/. z
>> expected=.(#z)%15
>> chisquare=.+/(*:observed-expected)%expected
>> )
>>
>> require '~addons/stats/base/distribution.ijs'
>>
>>    -.14&chisqcdf f i.5
>> 0.262214 0.120479 0.124215 0.292923 0.68203
>>
>> Best wishes,
>>
>> John
>>
>>
>>
>> Roger Hui wrote:
>> > I have found a bug in ?y where y is an extended
>> > precision integer. The result should never be y,
>> > but:
>> >
>> >    t=: ? 1e6 $ 12345x
>> >    +/t=12345
>> > 89
>> >
>> > The last result should of course be 0.
>> >
>> > I know how this can be fixed, but I would like to
>> > verify that the algorithm for ?y is correct with the
>> > mathematicians in this forum.
>> >
>> > For y an extended precision number, ?y is computed
>> > by t=.?y1 where y1 is a little bigger than y,
>> > repeating until t is less than y. (The same y1
>> > is used for a given y.) Assuming that ?m produces
>> > uniform random numbers between 0 and m-1, does
>> > this process give uniform random numbers between
>> > 0 and y-1?
>> >
>> > For example, suppose we want to compute z=:?1e6$y=:12345.
>> >
>> >    y=: 12345
>> >    t=: ?2e6$20000
>> >    +/12345>t
>> > 1233158
>> >    z=: 1e6 {. (12345>t)#t
>> >
>> >    NB. various tests
>> >    (+/z)%#z
>> > 6170.3
>> >    12344%2
>> > 6172
>> >
>> >    (c%y) , (+/c>:z)%#z [ c=: ? y
>> > 0.993277 0.993333
>> >    (c%y) , (+/c>:z)%#z [ c=: ? y
>> > 0.43872 0.438943
>> >    (c%y) , (+/c>:z)%#z [ c=: ? y
>> > 0.593925 0.594519
>> >    (c%y) , (+/c>:z)%#z [ c=: ? y
>> > 0.315674 0.315579
>
> ----------------------------------------------------------------------
> For information about J forums see http://www.jsoftware.com/forums.htm
