Matthew,

You may find the following documents useful if your venture into
environmental statistics is serious.
First, the '92 EPA Addendum on GW statistics--links at
http://www.epa.gov/correctiveaction/resource/guidance/sitechar/gwstats/gwstats.htm

The second is Helsel's book at the USGS:
http://pubs.usgs.gov/twri/twri4a3/

Both documents have good discussions of normality tests for GW data,
including probability plot correlation coefficients and variations in
the (x) plotting position--Blom, Cunnane, etc. Helsel is a good read:
1.) his writing is so clear, 2.) he gets into nonparametric approaches
in so many areas of GW stats, and 3.) the typography is nice--the book
is just a pleasant experience all around.

Just be advised this is only the beginning...

Oh, yes. It ain't safe to just dabble with environmental (contaminant)
data--it is too messy. Go whole hog or pass it up.

Best regards,
Michael Grant
(works for the competition :O))

--- Peter Dalgaard <[EMAIL PROTECTED]> wrote:

> <[EMAIL PROTECTED]> writes:
>
> > R Users:
> >
> > My question is probably more about elementary statistics than the
> > mechanics of using R, but I've been dabbling in R (version 2.2.0) and
> > used it recently to test some data.
> >
> > I have a relatively small set of observations (n = 12) of arsenic
> > concentrations in background groundwater and wanted to test my
> > assumption of normality. I used the Shapiro-Wilk test (by calling
> > shapiro.test() in R) and I'm not sure how to interpret the output.
> > Here's the input/output from the R console:
> >
> > >As = c(13, 17, 23, 9.5, 20, 15, 11, 17, 21, 14, 22, 13)
> > >shapiro.test(As)
> >
> >         Shapiro-Wilk normality test
> >
> > data:  As
> > W = 0.9513, p-value = 0.6555
> >
> > How do I interpret this? I understand, from poking around the internet,
> > that the higher the W statistic the "more normal" the data.
> >
> > What is the null hypothesis - that the data is normally distributed?
>
> Yup.
>
> > What does the p-value tell me? 65.55% chance of what - getting
> > W-statistic greater than or equal to 0.9513 (I picked this up from the
> > Dalgaard book, Introductory Statistics with R, but it's not really
> > sinking in with respect to how it applies to a Shapiro-Wilk test).?
>
> *Smaller* or equal - W=1.0 is the "perfect fit". The W statistic is
> pretty much the Pearson correlation applied to the curve drawn by
> qqnorm(). (The exact definition of what goes on the x axis differs
> slightly, I believe.)
>
> A low p-value would indicate that the W is too extreme to be explained
> by chance variation - i.e. evidence against normal distribution.
> In the present case you have no evidence against normal distribution
> (beware that this is not evidence _for_ normality).
>
> (Personally, I'm not too happy about these normality tests. They tend
> to lack power in small samples and in large samples they often reject
> distributions which are perfectly adequate for normal-theory
> analysis. Learning to evaluate a QQ plot seems a better idea.)
>
> > The method description - retrieved using ?shapiro.test() - is a bit
> > light on details.
>
> There are references therein, though...
>
> --
>    O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
>   c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
>  (*) \(*) -- University of Copenhagen   Denmark      Ph:  (+45) 35327918
> ~~~~~~~~~~ - ([EMAIL PROTECTED])        FAX: (+45) 35327907
>
> ______________________________________________
> [email protected] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
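
P.S. For anyone who wants to poke at the probability plot correlation
coefficient idea mentioned above, here is a minimal sketch in R using the
arsenic data from the original post. The helper name ppcc is made up for
illustration, and pairing a = 3/8 with Blom and a = 2/5 with Cunnane in
ppoints() is my reading of the usual plotting-position conventions, not
something lifted from the EPA or USGS documents. Treat it as a starting
point, not a recipe.

```r
# Arsenic concentrations from the original post
As <- c(13, 17, 23, 9.5, 20, 15, 11, 17, 21, 14, 22, 13)

# Shapiro-Wilk test, as in the original question
sw <- shapiro.test(As)

# Probability plot correlation coefficient (PPCC): the Pearson correlation
# between the ordered data and normal quantiles at the plotting positions.
# ppoints(n, a) returns (i - a) / (n + 1 - 2 * a) for i = 1..n.
# 'ppcc' is a hypothetical helper written for this sketch.
ppcc <- function(x, a) {
  cor(sort(x), qnorm(ppoints(length(x), a)))
}

r_blom    <- ppcc(As, 3/8)  # Blom plotting position (assumed a = 3/8)
r_cunnane <- ppcc(As, 2/5)  # Cunnane plotting position (assumed a = 2/5)

# The same correlation computed from what qqnorm() would draw,
# without actually opening a plot
qq   <- qqnorm(As, plot.it = FALSE)
r_qq <- cor(qq$x, qq$y)

# Compare W with the squared correlations
print(c(W            = unname(sw$statistic),
        blom_sq      = r_blom^2,
        cunnane_sq   = r_cunnane^2,
        qqnorm_sq    = r_qq^2))
```

The squared correlations should land in the same neighbourhood as the W
from shapiro.test(), which is the sense in which W measures how straight
the qqnorm() plot is. Running qqnorm(As) followed by qqline(As) gives the
picture to go with the numbers.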
