AI-GEOSTATS: summary: spatial stats for small data sets

Raechel Waters Tue, 07 Aug 2001 01:18:11 -0700
Dear All,

My sincere apologies that it has taken me so long to post the responses to a 
query I sent earlier this year.

The content of my posting and the replies I received are listed below.

Thank you once again to those who responded; your input has been extremely 
helpful.

Raechel Waters

Original posting:
Dear All,

I am new to spatial statistics and am posting this query in the hope that 
someone may be able to point me in the right direction.  I am a biological 
oceanographer and am interested in statistically describing the 
2-dimensional distribution of biological particles, primarily to define the 
'patchiness' of the distributions.  I have two data sets which consist of 
7x7 and 9x9 point arrays, providing 49 and 81 samples respectively.  Are 
there meaningful spatial statistics that can be applied to such small data 
sets?

Thank you in advance,

Raechel Waters

Replies:
1)
If your data  are actual point locations, as opposed to aggregations by 
grid cells, you can use my CrimeStat program.  It has a variety of spatial 
statistics plus documentation on the use of them.  Even though it was 
designed for crime analysis, many of these statistics may be appropriate 
for your purposes.  The program is distributed by the Crime Mapping 
Research Center of the U.S. National Institute of Justice.  You can find 
the program at either

         http://www.icpsr.umich.edu/NACJD/crimestat.html
or

         http://www.ojp.usdoj.gov/cmrc/tools/welcome.html

Ned Levine, PhD


2)
There are many routes you can take. The first thing you need to do is think 
long and hard about what you mean by patchiness.  A lot of biologists use 
"patch" in a structural sense, i.e., they define a patch as something they 
can measure.  Li and Reynolds have a paper from the mid 90's about 
structural vs. functional spatial heterogeneity (Li, H. and J. F. Reynolds. 
1995. On the quantification of spatial heterogeneity. Oikos 73: 280-284.) 
that offers some excellent guidelines. Often a patch is 
defined as an area with discrete boundaries that is internally homogeneous 
and differs from the outside.  I think the best definition of patch is an 
area beyond the perimeter of which there is no biological effect on the 
species or system of interest.  You could have, for example, levels of some 
chemical radiating out from a spill where, below a certain concentration, 
say 0.005 ppm, there is no effect on plankton.  So, the spatial interface 
between 0.005 and 0.006 ppm is the patch boundary.

Indicator kriging (or just plain mapping) can allow you to make maps of 
biologically important levels of whatever it is you're measuring.  There is 
a lot written on indicator kriging.
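
As an illustration of the first step behind indicator kriging, here is a 
minimal Python sketch of indicator coding at a threshold. The grid values 
are made up, and the function name indicator_transform is hypothetical; 
it only shows the coding step (1 if at or below the threshold, else 0), 
not the kriging itself.

```python
def indicator_transform(grid, threshold):
    """Code each value as 1 if it is at or below the threshold, else 0."""
    return [[1 if v <= threshold else 0 for v in row] for row in grid]

# Hypothetical 3x3 grid of concentrations in ppm (made-up values).
grid = [
    [0.002, 0.004, 0.007],
    [0.003, 0.006, 0.009],
    [0.001, 0.005, 0.008],
]

# Indicator map for the 0.005 ppm threshold used in the example above.
indicators = indicator_transform(grid, 0.005)

# Proportion of sample points at or below the threshold.
n_points = sum(len(row) for row in grid)
proportion_below = sum(map(sum, indicators)) / n_points
```

The indicator map could then be plotted directly, or its variogram 
computed, once per biologically important threshold.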

If you want to calculate landscape statistics on your data, you have a few 
options:  using "fragstats" or, for arcview gis, "patch analyst".  Patch 
analyst is really slick, and is well worth using.  You can interpolate 
between your sample points somehow, or draw discrete polygons around what 
you consider a patch, and then bring the data into arcview with spatial 
analyst (maybe one version of patch analyst works without spatial analyst) 
and generate about 3 gajillion landscape metrics.  The PA manual is 
worthwhile having, because it gives you the equations for a lot of 
landscape metrics you might consider using.  Leap2 is another program like 
fragstats/patch analyst for Windows NT.  It works well, once you get it 
running.

Good luck!  I wouldn't yield to the temptation to use variography to 
describe the "patchiness".  I'm currently working on a paper to decide how 
much about landscape pattern can be said using variography.

Andrew

3)
Try a multiresponse permutation procedure (mrpp).  It is particularly 
appropriate for small samples like you describe, is distribution-free, and 
the software is readily available.  Search the web for BLOSSOM and 
mrpp.  It is freely downloadable.  Contact me if you have any questions 
about it.
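
To give a feel for how a permutation procedure of this kind works, here is 
a rough Python sketch. It is not the BLOSSOM implementation. It assumes 
Euclidean distance, group weights n_i/N, and at least two members per 
group; the function names and the station coordinates are hypothetical.

```python
import itertools
import math
import random

def mrpp_delta(points, labels):
    """Weighted mean of the within-group average pairwise distances."""
    groups = {}
    for point, label in zip(points, labels):
        groups.setdefault(label, []).append(point)
    delta = 0.0
    for members in groups.values():
        pairs = list(itertools.combinations(members, 2))
        mean_dist = sum(math.dist(a, b) for a, b in pairs) / len(pairs)
        delta += len(members) / len(points) * mean_dist  # weight n_i / N
    return delta

def mrpp_test(points, labels, n_perm=999, seed=0):
    """Return (observed delta, permutation p-value).

    The p-value is the share of label shufflings whose delta is at least
    as small as the observed one (the observed labelling is counted too).
    """
    rng = random.Random(seed)
    observed = mrpp_delta(points, labels)
    shuffled = list(labels)
    hits = 1
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if mrpp_delta(points, shuffled) <= observed + 1e-12:
            hits += 1
    return observed, hits / (n_perm + 1)

# Made-up station coordinates: group "a" and group "b" sit in separate corners.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (5, 5), (5, 6), (6, 5), (6, 6)]
labels = ["a", "a", "a", "a", "b", "b", "b", "b"]
observed, p_value = mrpp_test(points, labels)
```

A small p-value says the groups are more tightly clustered than a random 
assignment of labels to the same stations would produce.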

Wayne Thogmartin

4)
7x7 and 9x9 are a bit sparse, but randomization techniques could eke more
out.
However, before metrics can be selected, two questions must be asked: what is
the spatial extent of your point data relative to the expected patchiness?
And what is the temporal resolution of the data relative to the processes
thought to be forming the patchiness?
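
One randomization technique that suits a small regular grid is a
permutation test on a spatial autocorrelation statistic. The Python sketch
below uses Moran's I with rook (edge-sharing) neighbours as one possible
choice; the statistic, the neighbour definition, the function names and
the 7x7 example grid are all assumptions made for illustration.

```python
import random

def morans_i(grid):
    """Moran's I on a regular grid with rook (edge-sharing) neighbours."""
    n_rows, n_cols = len(grid), len(grid[0])
    values = [v for row in grid for v in row]
    mean = sum(values) / len(values)
    dev = [[v - mean for v in row] for row in grid]
    cross, weight_sum = 0.0, 0
    for r in range(n_rows):
        for c in range(n_cols):
            for dr, dc in ((0, 1), (1, 0), (0, -1), (-1, 0)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < n_rows and 0 <= cc < n_cols:
                    cross += dev[r][c] * dev[rr][cc]
                    weight_sum += 1
    sum_sq = sum(d * d for row in dev for d in row)
    return (len(values) / weight_sum) * (cross / sum_sq)

def permutation_p_value(grid, n_perm=999, seed=0):
    """Shuffle values over grid positions; return (observed I, p-value)."""
    rng = random.Random(seed)
    n_cols = len(grid[0])
    observed = morans_i(grid)
    values = [v for row in grid for v in row]
    hits = 1  # the observed arrangement counts as one permutation
    for _ in range(n_perm):
        rng.shuffle(values)
        shuffled = [values[i:i + n_cols] for i in range(0, len(values), n_cols)]
        if morans_i(shuffled) >= observed:
            hits += 1
    return observed, hits / (n_perm + 1)

# Made-up 7x7 grid: a 3x3 patch of high counts in one corner, zeros elsewhere.
grid = [[10 if r < 3 and c < 3 else 0 for c in range(7)] for r in range(7)]
observed, p_value = permutation_p_value(grid)
```

The test only asks whether the observed arrangement is more clustered
than a random reshuffling of the same 49 values, so it needs no
distributional assumptions.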
tchau,
geoff
=+=+=+=+=+=+=+=+=+=+=+=+=+=
Geoffrey M. Henebry, Ph.D.
CALMIT (Center for Advanced Land Management Information Technologies)
113 Nebraska Hall
University of Nebraska
Lincoln, NE 68588-0517 USA
1-402-472-6158 (-4608 fax)
[EMAIL PROTECTED]

5)
The answer to your question is a bit
chicken-and-egg-ish.

If your data is well behaved (simple distribution,
pretty continuous) then you can get meaningful results
from very few samples (probably not less than 20 or
so!!)

We have examples in the book with data sets of 27 and
up. The 27 one is no good for geostatistics but this
has more to do with the fact that the samples are 1km
apart when the range of influence is probably about
125 metres. The main tutorial set in the old book
(available free at
http://uk.geocities.com/drisobelclark/practica.html)
which we now call "Page 95" has 50 samples very
inefficiently placed which still yield good results
for interpretation and estimation purposes. Even more
so as a basis for simulation.

So, I would say, go ahead and try it but look at your
distribution before you go to geostatistics. Small
data sets will give much better results if Normal
(Gaussian) or normalised or transformed in some other
way.
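
One common way of transforming to normality is a rank-based normal score
transform. The Python sketch below is only an illustration under stated
assumptions: the plotting position (rank + 0.5)/n is one convention among
several, ties are broken by input order, and the function name and sample
values are made up.

```python
from statistics import NormalDist

def normal_scores(values):
    """Replace each value by the standard normal quantile of its rank."""
    n = len(values)
    # Indices sorted by value; ties keep their input order (stable sort).
    order = sorted(range(n), key=lambda i: values[i])
    standard_normal = NormalDist()
    scores = [0.0] * n
    for rank, index in enumerate(order):
        scores[index] = standard_normal.inv_cdf((rank + 0.5) / n)
    return scores

# Made-up, strongly skewed counts.
counts = [3, 100, 1, 7, 2]
scores = normal_scores(counts)
```

The transformed values keep the ranking of the original data but no
longer let a single extreme count dominate.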

If I can be of any more help, please let me know
Isobel Clark


6)
May I suggest that rather than asking whether you can use spatial 
statistical methods you begin with asking what kind of information would 
you like to extract or confirm from your data sets. You do mention
"patchiness", which intuitively is a pretty clear idea but maybe not so 
clear when it comes to choosing a statistical tool.

One question: is your data "point data" or "non-point data"? If non-point, 
what are the supports? All the same?

Perhaps you want to simply try descriptive statistics first. For example, a 
bar chart in 3-d would give you a visual picture and perhaps a quick idea 
about patchiness. Try aggregating on subgroupings and look at
the variance within vs the variance between. Taking a uniform 
distribution (over the interval determined by the minimal and maximal data 
values) as the null, do a Chi-square test of goodness of fit. Although this 
does not capture the spatial context, it can give you some information.  
Compute and plot sample variograms; the range of the sample variogram is 
some indication of the presence or absence of patchiness. You might also
want to look at sample indicator variograms. Try Ripley's K test. I assume 
that you are looking for information from your data set, so try various 
things to see what they tell you.
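
For a regular grid such as the 7x7 or 9x9 arrays, a sample variogram can 
be computed directly from pairs along rows and columns. The Python sketch 
below assumes unit grid spacing and pools both directions into one 
variogram; the function name and the example grid are hypothetical.

```python
def grid_variogram(grid, max_lag):
    """Sample semivariance per lag, pooling row and column pairs.

    gamma(h) = sum of squared differences / (2 * number of pairs),
    with lags measured in grid steps.
    """
    n_rows, n_cols = len(grid), len(grid[0])
    gamma = {}
    for lag in range(1, max_lag + 1):
        sum_sq, pairs = 0.0, 0
        for r in range(n_rows):
            for c in range(n_cols):
                if c + lag < n_cols:  # pair along the row
                    sum_sq += (grid[r][c] - grid[r][c + lag]) ** 2
                    pairs += 1
                if r + lag < n_rows:  # pair along the column
                    sum_sq += (grid[r][c] - grid[r + lag][c]) ** 2
                    pairs += 1
        gamma[lag] = sum_sq / (2 * pairs)
    return gamma

# Made-up 4x4 grid with a steady gradient from left to right.
grid = [[c for c in range(4)] for _ in range(4)]
gamma = grid_variogram(grid, max_lag=2)
```

With so few points, keep max_lag to about half the grid width, since 
longer lags rest on very few pairs.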

Donald E. Myers
Department of Mathematics
University of Arizona
Tucson, AZ 85721
http://www.u.arizona.edu/~donaldm


