[ 
https://issues.apache.org/jira/browse/MATH-1313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15091236#comment-15091236
 ] 

Phil Steitz commented on MATH-1313:
-----------------------------------

The current test is definitely weak and the proposal above is better, but it 
could be made better still by looking not just at the mean, but at the 
distribution itself. I suspect the test was implemented before we had 
ChiSquare or Kolmogorov-Smirnov tests available, either of which could be used 
to evaluate conformity of the distribution. The most direct would be to do a 
one-sample KS test with a Uniform distribution instance as the 
RealDistribution. If the p-value returned is small, say less than 0.001, you 
fail the test. Alternatively, you could set up equal-sized bins and do a 
ChiSquare test with expected bin counts all equal to the sample size / number 
of bins (this is what RealDistributionAbstractTest#testSampling does).
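
As a rough illustration of the KS approach, here is a minimal, self-contained 
sketch that uses only the JDK. The class and method names (UniformKsCheck, 
ksStatistic, approximatePValue) are hypothetical, java.util.Random stands in 
for the generator under test, and the p-value is only the first term of the 
asymptotic Kolmogorov series, which is reasonable for small p-values but not 
exact. In the actual unit test the library's own KS test would normally 
replace the hand-rolled parts.

```java
import java.util.Arrays;
import java.util.Random;

public class UniformKsCheck {

    /** One-sample KS statistic D_n against the U(0, 1) CDF, F(x) = x on [0, 1]. */
    public static double ksStatistic(double[] data) {
        final double[] x = data.clone();
        Arrays.sort(x);
        final int n = x.length;
        double d = 0;
        for (int i = 0; i < n; i++) {
            // Uniform CDF, clamped so that out-of-range values are handled too.
            final double cdf = Math.min(1.0, Math.max(0.0, x[i]));
            // Empirical CDF jumps from i/n to (i+1)/n at x[i]; check both sides.
            final double above = (i + 1.0) / n - cdf;
            final double below = cdf - (double) i / n;
            d = Math.max(d, Math.max(above, below));
        }
        return d;
    }

    /**
     * Approximate p-value: first term of the asymptotic Kolmogorov series,
     * 2 * exp(-2 * n * d^2), capped at 1. An approximation, not an exact value.
     */
    public static double approximatePValue(double d, int n) {
        final double t = Math.sqrt(n) * d;
        return Math.min(1.0, 2.0 * Math.exp(-2.0 * t * t));
    }

    public static void main(String[] args) {
        // Assumption: java.util.Random is only a stand-in for the generator under test.
        final Random generator = new Random(42L);
        final int n = 10000;
        final double[] sample = new double[n];
        for (int i = 0; i < n; i++) {
            sample[i] = generator.nextDouble();
        }
        final double d = ksStatistic(sample);
        final double p = approximatePValue(d, n);
        // In a unit test, this is where one would fail if p < 0.001.
        System.out.println("D = " + d + ", approximate p-value = " + p
                           + ", reject uniformity: " + (p < 0.001));
    }
}
```

The 0.001 threshold is the one suggested above; with it, a correct generator 
would be expected to fail the check about once in a thousand runs.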

> Wrong tolerance in some unit tests of "RandomGeneratorAbstractTest"
> -------------------------------------------------------------------
>
>                 Key: MATH-1313
>                 URL: https://issues.apache.org/jira/browse/MATH-1313
>             Project: Commons Math
>          Issue Type: Bug
>            Reporter: Gilles
>            Assignee: Gilles
>            Priority: Minor
>              Labels: unit-test
>             Fix For: 4.0
>
>
> I doubt that the mean check in the unit test below is ever going to trigger 
> an assertion failure...
> {noformat}
>     @Test
>     public void testDoubleDirect() {
>         SummaryStatistics sample = new SummaryStatistics();
>         final int N = 10000;
>         for (int i = 0; i < N; ++i) {
>             sample.addValue(generator.nextDouble());
>         }
>         Assert.assertEquals("Note: This test will fail randomly about 1 in 100 times.",
>                 0.5, sample.getMean(), FastMath.sqrt(N/12.0) * 2.576);
>         Assert.assertEquals(1.0 / (2.0 * FastMath.sqrt(3.0)),
>                      sample.getStandardDeviation(), 0.01);
>     }
> {noformat}
> And similar in "testFloatDirect()".
> I propose the following replacement:
> {noformat}
>     @Test
>     public void testDoubleDirect() {
>         SummaryStatistics sample = new SummaryStatistics();
>         final int N = 100000;
>         for (int i = 0; i < N; ++i) {
>             sample.addValue(generator.nextDouble());
>         }
>         assertUniformInUnitInterval(sample, 0.99);
>     }
> {noformat}
> where "assertUniformInUnitInterval" is defined as:
>    {noformat}
>     /**
>      * Check that the sample follows a uniform distribution on the {@code [0, 1)} interval.
>      *
>      * @param sample Data summary.
>      * @param confidenceIntervalLevel Confidence level. Must be in {@code (0, 1)} interval.
>      */
>     private void assertUniformInUnitInterval(SummaryStatistics sample,
>                                              double confidenceIntervalLevel) {
>         final int numSamples = (int) sample.getN();
>         final double mean = sample.getMean();
>         final double stddev = sample.getStandardDeviation() / FastMath.sqrt(numSamples);
>         final TDistribution t = new TDistribution(numSamples - 1);
>         final double criticalValue = t.inverseCumulativeProbability(1 - 0.5 * (1 - confidenceIntervalLevel));
>         final double tol = stddev * criticalValue;
>         Assert.assertEquals("mean=" + mean + " tol=" + tol +
>                             " (note: This test will fail randomly about " +
>                             (100 * (1 - confidenceIntervalLevel)) + " in 100 times).",
>                             0.5, mean, tol);
>         Assert.assertEquals(FastMath.sqrt(1d / 12), sample.getStandardDeviation(), 0.01);
>     }
> {noformat}
> Please correct if this new test is not what was intended.
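
To make the concern in the quoted description concrete, the two tolerances 
can be evaluated directly. The sketch below is illustrative only: the class 
and method names are hypothetical, and the second method assumes the normal 
approximation with the same z = 2.576 as the original test, rather than the 
t quantile and sample standard deviation used in the proposed replacement 
(for N this large the two are nearly identical).

```java
public class ToleranceCheck {

    /** Tolerance as written in the original test: sqrt(N / 12) * z. */
    public static double originalTolerance(int n, double z) {
        return Math.sqrt(n / 12.0) * z;
    }

    /**
     * Half-width of the normal-approximation interval for the mean of n
     * U(0, 1) samples: z * sqrt(1 / (12 * n)), i.e. z times the standard
     * error of the mean.
     */
    public static double standardErrorTolerance(int n, double z) {
        return z * Math.sqrt(1.0 / (12.0 * n));
    }

    public static void main(String[] args) {
        final int n = 10000;
        final double z = 2.576; // two-sided 99% normal quantile, as in the original test
        System.out.println("original tolerance:       " + originalTolerance(n, z));
        System.out.println("standard-error tolerance: " + standardErrorTolerance(n, z));
    }
}
```

For N = 10000 the original tolerance comes out at roughly 74, while the mean 
of values in [0, 1) can never be more than 0.5 away from 0.5, so that 
assertion cannot fail. The standard-error version is roughly 0.0074.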



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
