Assuming that the algorithm does in fact produce a uniform
distribution, the numbers are p-values: each is the probability of
observing a deviation from the expected distribution (equal counts in
each of the 15 bins) at least as large as the one seen, purely by
chance. In other words, for the 2nd test there is a 12.05% probability
that a deviation at least this large from uniform could have arisen by
chance alone. If you had chosen a "standard" 5% significance threshold,
then you would not reject the null hypothesis that the numbers are
uniformly distributed in any of your 5 experiments.
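For even degrees of freedom (15 bins give 15-1 = 14 df) the chi-square
tail probability has a closed form, so the p-values above can be checked
without a stats library. A small sketch in Python for illustration (the
function name is my own; this is the standard Poisson-sum identity for
even df, not J's chisqcdf itself):

```python
import math

def chi2_sf(x, df):
    """Survival function P(X > x) for a chi-square variable with EVEN df.
    For df = 2k this equals exp(-x/2) * sum_{i<k} (x/2)^i / i!
    (a Poisson tail sum); the tests above use df = 14, which is even."""
    assert df % 2 == 0
    k = df // 2
    term = 1.0   # (x/2)^0 / 0!
    total = 0.0
    for i in range(k):
        total += term
        term *= (x / 2) / (i + 1)
    return math.exp(-x / 2) * total

# The 5% critical value for 14 df is about 23.685: a chi-square
# statistic below it gives p > 0.05, so uniformity is not rejected.
```

A statistic of exactly 23.685 should give a p-value of about 0.05, which
is a quick sanity check on the function.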

Obviously, if you did one experiment and got 0.01 (a 1% probability
that a deviation this large could have occurred purely by chance), it
is still possible that the algorithm produces a uniform distribution
and you just got unlucky! I'd be suspicious if the p-values were all
very similar.
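The procedure under discussion (draw from a slightly larger range and
retry until the draw falls below y) is plain rejection sampling, which
does preserve uniformity: every value below y is accepted with the same
probability. A Python sketch, with hypothetical names standing in for
J's internals:

```python
import random

def roll(y, y1=None):
    """Uniform integer in [0, y-1] by rejection from [0, y1-1].
    y1 is any convenient bound a little bigger than y (a hypothetical
    stand-in for the internal bound J chooses for a given y)."""
    if y1 is None:
        y1 = y + (-y) % 16          # arbitrary example bound, y1 >= y
    while True:
        t = random.randrange(y1)    # assumed uniform on [0, y1-1]
        if t < y:
            return t                # accepted draws are uniform on [0, y-1]

random.seed(0)
sample = [roll(12345, 20000) for _ in range(10000)]
# every accepted value is strictly below y, so ?y can never equal y
```

Since each accepted value is strictly below y, a correct implementation
can never return y itself, which is exactly the bug reported below.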

On Sun, May 22, 2011 at 3:30 AM, Roger Hui <[email protected]> wrote:
> Thanks for your reply.  To paraphrase Ewart Shaw,
> my lack of depth in statistics is only ameliorated by
> the narrowness of my understanding.  Regarding the
> last result:
>
>>    -.14&chisqcdf f i.5
>> 0.262214 0.120479 0.124215 0.292923 0.68203
>
> What are the numbers saying?  Do they indicate that the
> underlying distribution is nearly uniform or far from uniform?
> Is it a good or bad thing (or irrelevant) that the 5 numbers
> are quite different from each other?
>
> In "school" I learned that the Kolmogorov-Smirnov test
> http://en.wikipedia.org/wiki/Kolmogorov%E2%80%93Smirnov_test
> is good for this kind of thing.  What do you think of
> this test?
>
>
>
> ----- Original Message -----
> From: John Randall <[email protected]>
> Date: Saturday, May 21, 2011 7:52
> Subject: Re: [Jprogramming] roll
> To: Programming forum <[email protected]>
>
>> The process described should yield uniform random numbers
>> between 0
>> and y-1.
>>
>> Following up on the tests in the original posting, we can put the
>> numbers into equal sized bins and then calculate a chi-square
>> statistic.  If this is not significant, we can be somewhat
>> assured that the numbers are random.
>>
>> f=: 3 : 0"0             NB. rank 0: runs once per atom of the argument
>> y=. 12345               NB. the argument itself is ignored
>> t=. ?2e6$20000
>> z=. 1e6 {. (12345>t)#t  NB. rejection: keep draws below 12345
>> NB. 12345=15*823, so 15 equal-sized bins of width 823
>> bin=. <.@%&823
>> observed=. (bin z) #/. z
>> expected=. (#z)%15
>> chisquare=. +/(*:observed-expected)%expected
>> )
>>
>> require '~addons/stats/base/distribution.ijs'
>>
>>    -.14&chisqcdf f i.5
>> 0.262214 0.120479 0.124215 0.292923 0.68203
>>
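For readers not fluent in J, the quoted verb can be mirrored in Python.
This is a sketch, not the original code: the function name is mine, and
the sample sizes are scaled down from 2e6/1e6 to keep it quick. The 15
bins of width 823 come from 12345 = 15 * 823.

```python
import random

def chisquare_stat(z, nbins=15, width=823):
    """Chi-square statistic comparing z to a uniform distribution
    over nbins equal-sized bins (12345 = 15 * 823, so width 823)."""
    observed = [0] * nbins
    for v in z:
        observed[v // width] += 1
    expected = len(z) / nbins
    return sum((o - expected) ** 2 / expected for o in observed)

# Rejection-sampled draws, as in the original posting (smaller sizes):
random.seed(0)
t = [random.randrange(20000) for _ in range(200_000)]
z = [v for v in t if v < 12345][:100_000]
stat = chisquare_stat(z)
# compare stat against the chi-square distribution with 14 degrees of
# freedom: values near 14 are typical, very large values are suspicious
```

A perfectly flat sample (each residue 0..12344 exactly once) gives a
statistic of 0, which is a useful sanity check on the binning.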
>> Best wishes,
>>
>> John
>>
>>
>>
>> Roger Hui wrote:
>> > I have found a bug in ?y where y is an extended
>> > precision integer.  The result should never be y,
>> > but:
>> >
>> >    t=: ? 1e6 $ 12345x
>> >    +/t=12345
>> > 89
>> >
>> > The last result should of course be 0.
>> >
>> > I know how this can be fixed, but I would like to
>> > verify that the algorithm for ?y is correct with the
>> > mathematicians in this forum.
>> >
>> > For y an extended precision number, ?y is computed
>> > by t=.?y1 where y1 is a little bigger than y,
>> > repeating until t is less than y.  (The same y1
>> > is used for a given y.)  Assuming that ?m produces
>> > uniform random numbers between 0 and m-1, does
>> > this process give uniform random numbers between
>> > 0 and y-1?
>> >
>> > For example, suppose we want to compute z=:?1e6$y=:12345.
>> >
>> >    y=: 12345
>> >    t=: ?2e6$20000
>> >    +/12345>t
>> > 1233158
>> >    z=: 1e6 {. (12345>t)#t
>> >
>> >    NB. various tests
>> >    (+/z)%#z
>> > 6170.3
>> >    12344%2
>> > 6172
>> >
>> >    (c%y) , (+/c>:z)%#z [ c=: ? y
>> > 0.993277 0.993333
>> >    (c%y) , (+/c>:z)%#z [ c=: ? y
>> > 0.43872 0.438943
>> >    (c%y) , (+/c>:z)%#z [ c=: ? y
>> > 0.593925 0.594519
>> >    (c%y) , (+/c>:z)%#z [ c=: ? y
>> > 0.315674 0.315579
>
> ----------------------------------------------------------------------
> For information about J forums see http://www.jsoftware.com/forums.htm
>