[jira] [Commented] (MATH-1313) Wrong tolerance in some unit tests of "RandomGeneratorAbstractTest"

Phil Steitz (JIRA) Sun, 10 Jan 2016 21:02:06 -0800

    [ 
https://issues.apache.org/jira/browse/MATH-1313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15091445#comment-15091445
 ]


Phil Steitz commented on MATH-1313:
-----------------------------------

The test *should* be changed to do something else.  The "improvement" 
effectively performs a t-test for equality of the mean to the expected mean and 
a non-statistical test of the standard deviation.  It does not test anything 
other than systematic bias of the mean and variance close (no actual 
statistical test) to expected.  The use of a t-test is questionable since we 
know (hope, actually) that the distribution being sampled is *not* Gaussian, 
but that is not the point.  The point is we should be testing that the values 
are in fact uniformly distributed, which the above does little to confirm.  I 
see now that testNextFloat correctly does a chisquare test.  I  recommend that 
we drop this test and the one for floats and create one for doubles that does 
what testNextFloat does, which actually does confirm uniformity.  Now that we 
have KS tests, it might actually be even better to use that for the double 
test, which would give us two kinds of actual statistical tests of distribution 
conformance.

> Wrong tolerance in some unit tests of "RandomGeneratorAbstractTest"
> -------------------------------------------------------------------
>
>                 Key: MATH-1313
>                 URL: https://issues.apache.org/jira/browse/MATH-1313
>             Project: Commons Math
>          Issue Type: Bug
>            Reporter: Gilles
>            Assignee: Gilles
>            Priority: Minor
>              Labels: unit-test
>             Fix For: 4.0
>
>
> I doubt that the mean check in the unit test below is ever going to trigger 
> an assertion failure...
> {noformat}
>     @Test
>     public void testDoubleDirect() {
>         SummaryStatistics sample = new SummaryStatistics();
>         final int N = 10000;
>         for (int i = 0; i < N; ++i) {
>             sample.addValue(generator.nextDouble());
>         }
>         Assert.assertEquals("Note: This test will fail randomly about 1 in 
> 100 times.",
>                 0.5, sample.getMean(), FastMath.sqrt(N/12.0) * 2.576);
>         Assert.assertEquals(1.0 / (2.0 * FastMath.sqrt(3.0)),
>                      sample.getStandardDeviation(), 0.01);
>     }
> {noformat}
> And similar in "testFloatDirect()".
> I propose the following replacement:
> {noformat}
>     @Test
>     public void testDoubleDirect() {
>         SummaryStatistics sample = new SummaryStatistics();
>         final int N = 100000;
>         for (int i = 0; i < N; ++i) {
>             sample.addValue(generator.nextDouble());
>         }
>         assertUniformInUnitInterval(sample, 0.99);
>     }
> {noformat}
> where "assertUniformInUnitInterval" is defined as:
>    {noformat}
>     /**                                                                       
>                                                                               
>                                
>      * Check that the sample follows a uniform distribution on the {@code [0, 
> 1)} interval.                                                                 
>                                
>      *                                                                        
>                                                                               
>                                
>      * @param sample Data summary.                                            
>                                                                               
>                                
>      * @param confidenceIntervalLevel Confidence level. Must be in {@code (0, 
> 1)} interval.                                                                 
>                                
>      */
>     private void assertUniformInUnitInterval(SummaryStatistics sample,
>                                              double confidenceIntervalLevel) {
>         final int numSamples = (int) sample.getN();
>         final double mean = sample.getMean();
>         final double stddev = sample.getStandardDeviation() / 
> FastMath.sqrt(numSamples);
>         final TDistribution t = new TDistribution(numSamples - 1);
>         final double criticalValue = t.inverseCumulativeProbability(1 - 0.5 * 
> (1 - confidenceIntervalLevel));
>         final double tol = stddev * criticalValue;
>         Assert.assertEquals("mean=" + mean + " tol=" + tol + " (note: This 
> test will fail randomly about " +
>                             (100 * (1 - confidenceIntervalLevel)) + " in 100 
> times).",
>                             0.5, mean, tol);
>         Assert.assertEquals(FastMath.sqrt(1d / 12), 
> sample.getStandardDeviation(), 0.01);
>     }
> {noformat}
> Please correct if this new test is not what was intended.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (MATH-1313) Wrong tolerance in some unit tests of "RandomGeneratorAbstractTest"

Reply via email to