On 9/17/07, William J Rust <[EMAIL PROTECTED]> wrote:
> I'm working on a climate simulation program that takes monthly averages
> and generates daily readings that are assumed to be normally
> distributed. The following program creates 10 sets of 100,000 random
> deviates with mean 10 and SD 5. It then applies a t test (results below)
> to ensure that the generated numbers are good enough. As the results
> show, they aren't. I'm wondering a) I am doing something wrong or b) is
> there something wrong with the stats routines?

There are a couple of problems here. First, while your inversion method
should generate approximately normally distributed values, it is better
to use the JDK-supplied method for this, java.util.Random.nextGaussian()
(much faster and a better algorithm). There is a wrapped version of this
provided in org.apache.commons.math.random.RandomDataImpl. To use that:

    import org.apache.commons.math.random.RandomData;
    import org.apache.commons.math.random.RandomDataImpl;

    RandomData randomData = new RandomDataImpl();
    ...
    arry[idx] = randomData.nextGaussian(10, 5);

Second, I don't understand what you are expecting from the t-test.
TestUtils.tTest(mu, array) returns the p-value associated with a
two-tailed test with the null hypothesis that the values in the array
come from a distribution with mean = mu. So small p-values, say less
than 0.01, would indicate that the mean appears to differ significantly
from 10. When the true mean really is 10, this should happen roughly one
in every 100 times. Differences as large as what you observed on your
first run (p = 0.343) should happen about 34 out of every 100 times, etc.
The values reported below do not look surprising to me. They do not
support rejecting the null hypothesis that the mean is what it is
supposed to be, which is a good thing.

To test normality of the deviates, you should apply a normality test to
the deviates themselves, e.g. a Kolmogorov-Smirnov test. Commons Math
does not currently include normality tests (patches welcome :). To do
this, you would need to dump the generated arrays to a file and then do
the test with R or some other package that includes normality tests.
Unless I am missing something, I don't think a t-test is going to give
you the information that you need to verify that the generated values
are normally distributed.

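To make the "roughly one in every 100 times" point concrete, here is a
minimal sketch that uses only the JDK (no Commons Math). It repeatedly
draws samples whose true mean really is 10 and counts how often the
one-sample t statistic exceeds 2.5758 in absolute value. That number is
the two-tailed 1% critical value of the standard normal, which I am
using as an approximation to "p < 0.01" since the samples are large.
The class name PValueFrequency, the trial count, the sample size and the
seed are my own choices for illustration; none of them come from your
program.

```java
import java.util.Random;

class PValueFrequency {
    // Two-tailed 1% critical value of the standard normal distribution;
    // used here as a large-sample stand-in for the t critical value.
    static final double Z_CRIT_01 = 2.5758;

    /** One-sample t statistic for the null hypothesis mean == mu. */
    public static double tStatistic(double mu, double[] x) {
        double sum = 0;
        for (double v : x) {
            sum += v;
        }
        double mean = sum / x.length;
        double ss = 0;
        for (double v : x) {
            ss += (v - mean) * (v - mean);
        }
        double sd = Math.sqrt(ss / (x.length - 1)); // sample standard deviation
        return (mean - mu) / (sd / Math.sqrt(x.length));
    }

    /** Fraction of trials in which the test would reject at the 1% level. */
    public static double rejectionRate(int trials, int n, long seed) {
        Random ran = new Random(seed);
        double[] x = new double[n];
        int rejected = 0;
        for (int t = 0; t < trials; t++) {
            for (int i = 0; i < n; i++) {
                x[i] = 10 + 5 * ran.nextGaussian(); // mean 10, SD 5
            }
            if (Math.abs(tStatistic(10, x)) > Z_CRIT_01) {
                rejected++;
            }
        }
        return (double) rejected / trials;
    }

    public static void main(String[] args) {
        System.out.println("fraction rejected at 1% level: "
                + rejectionRate(2000, 1000, 1L));
    }
}
```

The printed fraction should land near 0.01, which is just another way of
saying that an occasional small p-value is expected even when the
generator is fine.
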
Another thing that you could do is to examine the empirical distribution
of the generated values: lay a grid over the range, count how many
values fall into each cell, and compare these counts to what you would
expect under the hypothesis of normality (in effect a chi-square
goodness-of-fit check; the K-S test makes the analogous comparison on
the cumulative distribution). You can use
org.apache.commons.math.random.EmpiricalDistribution to bin the
generated data and get bin counts.

If you do find that normality tests fail on the generated values using
either your inversion method or the RandomDataImpl.nextGaussian method,
please open a Jira ticket
(http://commons.apache.org/math/issue-tracking.html) including the R
script or output from the package that you used for testing. Thanks!

hth,

Phil

> Thanks,
>
> wjr
>
> package usda.weru.cligen2;
>
> import org.apache.commons.math.MathException;
>
> /**
>  *
>  * @author wjr
>  */
> public class TestNormal {
>
>     static org.apache.commons.math.distribution.NormalDistributionImpl nd =
>             new org.apache.commons.math.distribution.NormalDistributionImpl(10, 5);
>
>     public static void main(String[] args) {
>         double[] arry = new double[100000];
>         java.util.Random ran = new java.util.Random(1l);
>
>         for (int jdx = 0; jdx < 10; jdx++) {
>             for (int idx = 0; idx < arry.length; idx++) {
>                 try {
>                     arry[idx] =
>                             nd.inverseCumulativeProbability(ran.nextDouble());
>                 } catch (MathException ex) {
>                     ex.printStackTrace();
>                 }
>             }
>             try {
>                 System.out.println("ttest " +
>                         org.apache.commons.math.stat.inference.TestUtils.tTest(10,arry));
>             } catch (IllegalArgumentException ex) {
>                 ex.printStackTrace();
>             } catch (MathException ex) {
>                 ex.printStackTrace();
>             }
>         }
>     }
> }
>
> Output:
>
> run-single:
> ttest 0.3433300114960922
> ttest 0.1431930575825282
> ttest 0.12336027805916228
> ttest 0.49478850669361796
> ttest 0.9216887341410063
> ttest 0.9937228334312525
> ttest 0.13669784550400177
> ttest 0.9646134537758599
> ttest 0.9965741269090211
> ttest 0.03815948891784959
> BUILD SUCCESSFUL (total time: 20 seconds)

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
