Re: AI-GEOSTATS: Effects of spatial autocorrelation on descriptive statistics

Chaosheng Zhang Sat, 27 May 2006 02:45:19 -0700

Dear Isobel,

Thanks. You have identified exactly where the problem lies. -:)

Yes, we should make it clear whether we are "describing" our samples or "estimating" the population. Unfortunately, in many cases, they are mixed.

In the literature (especially in environmental sciences, with which I am more familiar), we always see tables showing "median", "geometric mean", "robust mean", etc. In fact, almost all papers based on field surveys have to provide a table of conventional descriptive statistics. Very few people dare to use "average" or "standard deviation" because of non-normality, perhaps strongly influenced by the development of geostatistics.

Now, my question is: are the authors "describing" their samples or "estimating" the population? If the former, what is the use of describing spatially correlated samples? If the latter, should we ask all of them to use geostatistics?!?!

Another question of my own: I am now trying to calculate "moving window statistics" or "neighbourhood statistics". Should the neighbourhood statistics be regarded as "descriptions" or "estimations"?

The "optimal weighted average" of "ordinary kriging" may not really answer the question of "mean". Can the value of a "block kriging" (perhaps on the centroid?) be regarded as the estimated mean value of a study area?

Cheers,

Chaosheng

----- Original Message -----

From: Isobel Clark

To: Chaosheng Zhang

Cc: [email protected]

Sent: Friday, May 26, 2006 7:19 PM

Subject: AI-GEOSTATS: Effects of spatial autocorrelation on descriptive statistics

Chaosheng

If you are only describing your samples, such concepts as random and independent are irrelevant. They apply to the use of your sample statistics to estimate population parameters. If all you want to do is describe your samples, you can calculate any statistics you like.

However, you talk about "normality" and "outliers". These concepts depend on the notion of a population from which the samples were drawn. If you are trying to estimate the parameters of that population, then dependence and non-randomness are as important as potential outliers and the shape of the population.

The "optimal weighted average" is usually known as "ordinary kriging" provided there is no significant trend. ;-)

Isobel

http://www.kriging.com

Chaosheng Zhang <[EMAIL PROTECTED]> wrote:

Dear Isobel,

Thanks for the helpful reply. In fact, I have been waiting for a reply from
you. -:)

I think the questions are fairly well answered by you. However, I want to
move a step forward or perhaps backward.

A question "forward": What are the methods to calculate the "optimal
weighted average"? Are they widely accepted/used/cited?

A question "backward": Do we really need to care whether the data are
spatially correlated when we calculate descriptive statistics, even
though we are aware of the issue? Do results calculated from only the
non-correlated samples (e.g., the sill in a variogram) really reflect the
"true" values of the statistics? Generally we only care about outliers and
non-normality. In the spatial context, we care about sampling clusters.

Otherwise, we still have to use conventional statistics.

Best regards,

Chaosheng

----- Original Message -----
From: Isobel Clark
To: Chaosheng Zhang
Cc: [email protected]
Sent: Thursday, May 25, 2006 3:59 PM
Subject: AI-GEOSTATS: Re: Effects of spatial autocorrelation on descriptive
statistics

Chaosheng

Some thoughts in response to your questions:

1: "Spatially correlated data provide redundant information for the
calculation of mean"

I would not say "redundant". Even if information is correlated, the
correlation is not perfect (=1), which is what "redundant" would require.
If the data are spatially correlated, the correlations should be included
in the choice of weight for each sample and in the calculation of the
'standard error' and confidence levels. An optimal weighted average of
spatially correlated data will always give a better answer than a smaller
subset of non-correlated data.

As an example, you might try kriging a large block with a set of (internal)
samples spaced at the range of influence and then repeat the exercise with a
handful of samples between these 'independent' ones.
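
A minimal sketch of the comparison suggested above, under stated
assumptions. It does not krige a block. Instead it uses the best linear
unbiased estimate of a constant mean, with weights C^-1 1 / (1' C^-1 1)
and estimation variance 1 / (1' C^-1 1), which is a simpler stand-in for
an "optimal weighted average". The 1-D transect, the spherical covariance
model, the sill of 1 and the range of 10 are all made up for illustration.

```python
def spherical_cov(h, sill=1.0, rng=10.0):
    """Spherical covariance: equals the sill at h = 0, zero at or beyond the range."""
    if h >= rng:
        return 0.0
    r = h / rng
    return sill * (1.0 - 1.5 * r + 0.5 * r ** 3)

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def mean_weights(xs, sill=1.0, rng=10.0):
    """Return (weights, estimation variance) for the mean of samples at xs."""
    cov = [[spherical_cov(abs(a - b), sill, rng) for b in xs] for a in xs]
    z = solve(cov, [1.0] * len(xs))   # z = C^-1 1
    total = sum(z)                    # 1' C^-1 1
    return [v / total for v in z], 1.0 / total

# Hypothetical transect: one sample every 2.5 units from 0 to 40.
all_x = [2.5 * i for i in range(17)]
# Subset spaced at the range of influence, so mutually uncorrelated.
indep_x = all_x[::4]

w_all, var_all = mean_weights(all_x)
w_ind, var_ind = mean_weights(indep_x)
print(f"all 17 samples     : variance of the mean estimate = {var_all:.4f}")
print(f"5 'independent' set: variance of the mean estimate = {var_ind:.4f}")
```

Because the five-sample estimate is just one particular weighting of the
17 samples (zero weight on the extra ones), the optimally weighted
estimate from all 17 cannot have a larger variance under the assumed model.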

2: "In the presence of spatially correlated data, would a dispersion
variance . be the proper calculation for the measure of variance?"

The obvious answer is "yes and no". If by dispersion variance you mean the
standard calculation of variance:

Sum((g_i - gbar)^2)/(n-1) often calculated as

{Sum(g_i^2) - n*gbar^2}/(n-1)

where g_i represents each sample value and gbar the arithmetic mean of all
samples, then No, it is not appropriate.
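
As a quick arithmetic check, the sketch below computes the conventional
sample variance both from its definition and from the shortcut form
{Sum(g_i^2) - n*gbar^2}/(n-1). The sample values are made up, and the
function names are only illustrative. Neither version accounts for
spatial correlation; this is just the standard calculation being
discussed.

```python
import statistics

def variance_definition(g):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(g)
    gbar = sum(g) / n
    return sum((gi - gbar) ** 2 for gi in g) / (n - 1)

def variance_shortcut(g):
    """Shortcut form: (sum of squares - n * mean^2) / (n - 1)."""
    n = len(g)
    gbar = sum(g) / n
    return (sum(gi * gi for gi in g) - n * gbar ** 2) / (n - 1)

g = [2.0, 4.0, 4.0, 5.0, 7.0, 8.0]  # made-up sample values
v_def = variance_definition(g)
v_short = variance_shortcut(g)
print(v_def, v_short, statistics.variance(g))
```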

The proper calculation for dispersion variance of a spatially correlated
data set includes all the cross-covariances, not just the squares of sample
values. It also requires a better estimate of the population than gbar (see
1 above). If you are looking for descriptive statistics, then the dispersion
variance can be calculated using the 'middle term' from the full estimation
variance -- the gamma-bar(S_i,S_j) term.
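
A minimal sketch of the gamma-bar(S_i,S_j) term, read here as the average
semi-variogram value over all pairs of sample locations, including each
sample paired with itself (which contributes zero). The spherical model,
the sill of 1, the range of 10 and both sets of coordinates are
assumptions chosen only for illustration.

```python
import math

def spherical_gamma(h, sill=1.0, rng=10.0):
    """Spherical semi-variogram with no nugget effect."""
    if h >= rng:
        return sill
    r = h / rng
    return sill * (1.5 * r - 0.5 * r ** 3)

def gamma_bar(points, sill=1.0, rng=10.0):
    """Average semi-variogram value over all n * n pairs of sample locations."""
    n = len(points)
    total = sum(spherical_gamma(math.dist(p, q), sill, rng)
                for p in points for q in points)
    return total / (n * n)

# Hypothetical layouts: four samples clustered together, and four samples
# spaced beyond the range of influence.
clustered = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
spread = [(0.0, 0.0), (20.0, 0.0), (0.0, 20.0), (20.0, 20.0)]

gb_clustered = gamma_bar(clustered)
gb_spread = gamma_bar(spread)
print(f"clustered samples: gamma-bar = {gb_clustered:.4f}")
print(f"spread samples   : gamma-bar = {gb_spread:.4f}")
```

With samples spaced beyond the range, every off-diagonal pair contributes
the full sill, so gamma-bar is (n-1)/n times the sill. Clustered samples
give a much smaller value, reflecting how little of the population
variability they span under the assumed model.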

In practice, the most appropriate (and probably simplest) estimate of the
'population' dispersion variance in the presence of spatially correlated
data is the total sill on the semi-variogram model. This is, theoretically,
the dispersion variance as calculated from samples which are non-correlated.

Isobel

Chaosheng Zhang <[EMAIL PROTECTED]> wrote:
AI-GEOSTATS
Move of the list to [EMAIL PROTECTED]

Dear All,

I'm looking for answers to effects of spatial autocorrelation on
conventional descriptive statistics. More specifically, any comments on the
following statements?

1. "Spatially correlated data provide redundant information for the
calculation of mean";

2. "In the presence of spatially correlated data, would a dispersion
variance . be the proper calculation for the measure of variance?"

Best regards,

Chaosheng Zhang
------------------
Dr. Chaosheng Zhang
Lecturer in GIS
Department of Geography
National University of Ireland, Galway
IRELAND
Tel: +353-91-492375
Fax: +353-91-495505
E-mail: [EMAIL PROTECTED]
Web1: www.nuigalway.ie/geography/zhang.html
Web2: www.nuigalway.ie/geography/gis

