[ https://issues.apache.org/jira/browse/MATH-1313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15091445#comment-15091445 ]
Phil Steitz commented on MATH-1313: ----------------------------------- The test *should* be changed to do something else. The "improvement" effectively performs a t-test for equality of the mean to the expected mean and a non-statistical test of the standard deviation. It does not test anything other than systematic bias of the mean and variance close (no actual statistical test) to expected. The use of a t-test is questionable since we know (hope, actually) that the distribution being sampled is *not* Gaussian, but that is not the point. The point is we should be testing that the values are in fact uniformly distributed, which the above does little to confirm. I see now that testNextFloat correctly does a chisquare test. I recommend that we drop this test and the one for floats and create one for doubles that does what testNextFloat does, which actually does confirm uniformity. Now that we have KS tests, it might actually be even better to use that for the double test, which would give us two kinds of actual statistical tests of distribution conformance. > Wrong tolerance in some unit tests of "RandomGeneratorAbstractTest" > ------------------------------------------------------------------- > > Key: MATH-1313 > URL: https://issues.apache.org/jira/browse/MATH-1313 > Project: Commons Math > Issue Type: Bug > Reporter: Gilles > Assignee: Gilles > Priority: Minor > Labels: unit-test > Fix For: 4.0 > > > I doubt that the mean check in the unit test below is ever going to trigger > an assertion failure... > {noformat} > @Test > public void testDoubleDirect() { > SummaryStatistics sample = new SummaryStatistics(); > final int N = 10000; > for (int i = 0; i < N; ++i) { > sample.addValue(generator.nextDouble()); > } > Assert.assertEquals("Note: This test will fail randomly about 1 in > 100 times.", > 0.5, sample.getMean(), FastMath.sqrt(N/12.0) * 2.576); > Assert.assertEquals(1.0 / (2.0 * FastMath.sqrt(3.0)), > sample.getStandardDeviation(), 0.01); > } > {noformat} > And similar in "testFloatDirect()". > I propose the following replacement: > {noformat} > @Test > public void testDoubleDirect() { > SummaryStatistics sample = new SummaryStatistics(); > final int N = 100000; > for (int i = 0; i < N; ++i) { > sample.addValue(generator.nextDouble()); > } > assertUniformInUnitInterval(sample, 0.99); > } > {noformat} > where "assertUniformInUnitInterval" is defined as: > {noformat} > /** > > > * Check that the sample follows a uniform distribution on the {@code [0, > 1)} interval. > > * > > > * @param sample Data summary. > > > * @param confidenceIntervalLevel Confidence level. Must be in {@code (0, > 1)} interval. > > */ > private void assertUniformInUnitInterval(SummaryStatistics sample, > double confidenceIntervalLevel) { > final int numSamples = (int) sample.getN(); > final double mean = sample.getMean(); > final double stddev = sample.getStandardDeviation() / > FastMath.sqrt(numSamples); > final TDistribution t = new TDistribution(numSamples - 1); > final double criticalValue = t.inverseCumulativeProbability(1 - 0.5 * > (1 - confidenceIntervalLevel)); > final double tol = stddev * criticalValue; > Assert.assertEquals("mean=" + mean + " tol=" + tol + " (note: This > test will fail randomly about " + > (100 * (1 - confidenceIntervalLevel)) + " in 100 > times).", > 0.5, mean, tol); > Assert.assertEquals(FastMath.sqrt(1d / 12), > sample.getStandardDeviation(), 0.01); > } > {noformat} > Please correct if this new test is not what was intended. -- This message was sent by Atlassian JIRA (v6.3.4#6332)