|
Dear Isobel,
Thanks. You can identify exactly where the problem
lies. -:)
Yes, we should make it clear whether we are
"describing" our samples or "estimating" the population. Unfortunately, in many
cases, they are mixed.
In the literature (especially in the environmental
sciences, with which I am more familiar), we always see tables
showing "median", "geometric mean", "robust mean", etc. In fact, almost all
papers based on field surveys have to provide a table of the
conventional descriptive statistics. Very few people dare to use "average" or
"standard deviation" because of non-normality, a caution perhaps strongly
influenced by the development of geostatistics.
Now, my question is: Are the authors "describing" their samples or "estimating" the
population? If the answer is the first, what is the use of describing
spatially correlated samples? If the answer is
the second, should we ask all of them to use geostatistics?!?!
Another question of my own: I am now trying to
calculate "moving window statistics", or "neighbourhood statistics", from the
samples inside a moving window. Should the neighbourhood statistics be regarded
as "descriptions" or "estimations"?
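As a concrete reading of the question above, a minimal sketch of neighbourhood statistics in Python with NumPy follows. It treats the window as a circle of fixed radius around each sample and simply summarises the values that fall inside it, so in Isobel's terms it is closer to description than to estimation. The function name `moving_window_stats`, the circular window, the radius and the five sample values are all assumptions made for illustration; none of them come from the thread.

```python
import numpy as np

def moving_window_stats(xy, values, radius):
    """Count, mean and median of the samples within `radius` of each sample."""
    xy = np.asarray(xy, dtype=float)
    values = np.asarray(values, dtype=float)
    # Pairwise Euclidean distances between all sample locations.
    diff = xy[:, None, :] - xy[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Boolean membership matrix; each sample is inside its own window.
    inside = dist <= radius
    counts = inside.sum(axis=1)
    means = np.array([values[row].mean() for row in inside])
    medians = np.array([np.median(values[row]) for row in inside])
    return counts, means, medians

# Hypothetical data: five samples along a line, two well-separated groups.
xy = [(0, 0), (1, 0), (2, 0), (10, 0), (11, 0)]
values = [1.0, 2.0, 6.0, 10.0, 20.0]
counts, means, medians = moving_window_stats(xy, values, radius=1.5)
print(counts, means, medians)
```

No spatial weights are applied here: every sample inside the window counts equally, which is exactly the situation in which clustering and spatial correlation are ignored.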
The "optimal weighted average" of "ordinary
kriging" may not really answer the question of "mean". Can the value of a
"block kriging" (perhaps on the centroid?) be regarded as the estimated
mean value of a study area?
Cheers,
Chaosheng
----- Original Message -----
Sent: Friday, May 26, 2006 7:19 PM
Subject: AI-GEOSTATS: Effects of spatial autocorrelation on descriptive statistics
Chaosheng
If you are only describing your samples, such concepts as random and
independent are irrelevant. They apply to the use of your sample statistics to
estimate population parameters. If all you want to do is describe your
samples, you can calculate any statistics you like.
However, you talk about "normality" and "outliers". These concepts depend
on the notion of a population from which the samples were drawn. If you are
trying to estimate the parameters of that population, then dependence and
non-randomness are as important as potential outliers and the shape of the
population.
The "optimal weighted average" is usually known as "ordinary kriging"
provided there is no significant trend. ;-)
Isobel
http://www.kriging.com
Chaosheng Zhang <[EMAIL PROTECTED]> wrote:
Dear Isobel,
Thanks for the helpful reply. In fact, I have been waiting
for a reply from you. -:)
I think you have answered the questions fairly well.
However, I want to move a step forward or perhaps
backward.
A question "forward": What are the methods to calculate the
"optimal weighted average"? Are they widely accepted/used/cited?
A question "backward": Do we really need to care whether the data
are spatially correlated when we calculate descriptive statistics,
even though we are aware of the issue? Do results calculated from only
the non-correlated samples (e.g., the sill in a variogram) really reflect the
"true" values of the statistics? Generally we only care about outliers
and non-normality. In the spatial context, we care about sampling
clusters.
Otherwise, we still have to use conventional
statistics.
Best regards,
Chaosheng
----- Original Message -----
From: Isobel Clark
To: Chaosheng Zhang
Cc: [email protected]
Sent: Thursday, May 25, 2006 3:59 PM
Subject: AI-GEOSTATS: Re: Effects of spatial autocorrelation on descriptive statistics
Chaosheng
Some thoughts in
response to your questions:
1: "Spatially correlated data provide
redundant information for the calculation of mean"
I would not say
"redundant". Even if information is correlated, the correlation is not
perfect (=1) which would be "redundant". If the data is spatially
correlated, the correlations should be included in the choice of weight
for each sample and in the calculation of the 'standard error'
and confidence levels. An optimal weighted average of spatially
correlated data will always give a better answer than a smaller subset of
non-correlated data.
As an example, you might try kriging a large
block with a set of (internal) samples spaced at the range of influence
and then repeat the exercise with a handful of samples between these
'independent' ones.
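The claim in point 1 can be checked numerically with a small sketch. It is not the block kriging exercise suggested above; it swaps in the simpler best linear unbiased estimate of a constant mean (generalised least squares, sometimes called kriging of the mean) on a one-dimensional transect. The exponential covariance model, the sill of 1, the practical range of 10 and the sample spacings are assumptions chosen for illustration.

```python
import numpy as np

def exp_cov(h, sill=1.0, practical_range=10.0):
    """Exponential covariance; correlation drops to about 5% at the practical range."""
    return sill * np.exp(-3.0 * np.abs(h) / practical_range)

def optimal_mean(x):
    """Weights and estimation variance of the optimal weighted average
    for a constant mean: w = C^-1 1 / (1' C^-1 1), var = 1 / (1' C^-1 1)."""
    x = np.asarray(x, dtype=float)
    C = exp_cov(x[:, None] - x[None, :])
    ones = np.ones(len(x))
    c_inv_ones = np.linalg.solve(C, ones)
    denom = ones @ c_inv_ones
    return c_inv_ones / denom, 1.0 / denom

def equal_weight_variance(x):
    """Variance of the plain arithmetic mean under the same covariance model."""
    x = np.asarray(x, dtype=float)
    C = exp_cov(x[:, None] - x[None, :])
    return C.sum() / len(x) ** 2

# Hypothetical layouts: 5 'independent' samples spaced at the range,
# and the same transect with extra correlated samples in between.
sparse = np.linspace(0.0, 40.0, 5)
dense = np.linspace(0.0, 40.0, 17)

w_sparse, var_sparse = optimal_mean(sparse)
w_dense, var_dense = optimal_mean(dense)
naive_dense = equal_weight_variance(dense)

print("variance, sparse optimal mean:", var_sparse)
print("variance, dense optimal mean: ", var_dense)
print("variance, dense equal weights:", naive_dense)
```

Because the sparse locations are a subset of the dense ones, the optimal weights on the dense set can never do worse: setting the extra weights to zero would simply recover the sparse estimate.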
2: "In the presence of spatially correlated data,
would a dispersion variance . be the proper calculation for the measure
of variance?"
The obvious answer is "yes and no". If by dispersion
variance you mean the standard calculation of variance:
Sum(g_i - gbar)^2/(n-1), often calculated as
{Sum(g_i^2) - n*gbar^2}/(n-1),
where g_i represents each sample value and gbar the
arithmetic mean of all samples, then no, it is not
appropriate.
The proper calculation for dispersion variance of a
spatially correlated data set includes all the cross-covariances, not
just the squares of sample values. It also requires a better estimate of
the population than gbar (see 1 above). If you are looking for
descriptive statistics, then the dispersion variance can be calculated
using the 'middle term' from the full estimation variance -- the
gamma-bar(S_i,S_j) term.
In practice, the most appropriate (and
probably simplest) estimate of the 'population' dispersion variance in
the presence of spatially correlated data is the total sill on the
semi-variogram model. This is, theoretically, the dispersion variance as
calculated from samples which are
non-correlated.
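To make the link between the classical variance, the gamma-bar(S_i,S_j) term and the sill concrete, here is a hedged sketch. It relies on the identity that the n-1 sample variance equals the sum of squared differences over all ordered pairs divided by 2n(n-1), so its expected value is n/(n-1) times the average semi-variogram value over all sample pairs (zero-distance pairs included). The exponential semi-variogram, the sill of 1, the practical range of 10 and the two sample layouts are assumptions for illustration only.

```python
import numpy as np

def exp_variogram(h, sill=1.0, practical_range=10.0):
    """Exponential semi-variogram with no nugget (assumed model)."""
    return sill * (1.0 - np.exp(-3.0 * np.abs(h) / practical_range))

def expected_sample_variance(x):
    """Expected value of the classical n-1 variance for samples at locations x."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    gamma = exp_variogram(x[:, None] - x[None, :])
    gamma_bar = gamma.mean()  # average over all n*n pairs, diagonal zeros included
    return n / (n - 1) * gamma_bar

# Check the pairwise identity on arbitrary numbers.
g = np.array([2.0, 3.0, 5.0, 11.0])
n = len(g)
pairwise = ((g[:, None] - g[None, :]) ** 2).sum() / (2 * n * (n - 1))
classic = g.var(ddof=1)

# Hypothetical layouts: tightly clustered versus spread far beyond the range.
clustered = np.linspace(0.0, 5.0, 11)
spread = np.linspace(0.0, 200.0, 11)
e_clustered = expected_sample_variance(clustered)
e_spread = expected_sample_variance(spread)

print("pairwise vs classic variance:", pairwise, classic)
print("expected variance, clustered:", e_clustered)
print("expected variance, spread:   ", e_spread)
```

Under these assumptions the clustered layout gives an expected variance well below the sill, while samples spaced far beyond the range approach it, which is consistent with taking the total sill as the variance of non-correlated samples.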
Isobel
Chaosheng Zhang <[EMAIL PROTECTED]> wrote:
Dear All,
I'm looking for answers to effects
of spatial autocorrelation on conventional descriptive statistics. More
specifically, any comments on the following statements?
1. "Spatially correlated data provide redundant information for
the calculation of mean";
2. "In the presence of spatially
correlated data, would a dispersion variance . be the proper calculation
for the measure of variance?"
Best regards,
Chaosheng Zhang
------------------
Dr. Chaosheng Zhang
Lecturer in GIS
Department of Geography
National University of Ireland, Galway
IRELAND
Tel: +353-91-492375
Fax: +353-91-495505
E-mail: [EMAIL PROTECTED]
Web1: www.nuigalway.ie/geography/zhang.html
Web2: www.nuigalway.ie/geography/gis
+ To post a message to the list, send it to [email protected]
+ To unsubscribe, send email to majordomo@jrc.it with no subject and "unsubscribe ai-geostats" in the message body. DO NOT SEND Subscribe/Unsubscribe requests to the list
+ As a general service to list users, please remember to post a summary of any useful responses to your questions.
+ Support to the forum can be found at http://www.ai-geostats.org/
|