On Sun, 15 Feb 2009, Valerio Bartolino wrote:
Dear list,
My objective is to identify hotspot areas from a model prediction
over a high-resolution grid. After calculating a spatial weights object I
easily applied the localmoran function from the spdep package. The
meaning of the p-values returned by the localmoran function is not
really clear to me, nor is how much I can rely on them in terms of
statistical significance. For instance, can I use these p-values instead
of using a randomization approach? I would be glad for any clarification.
Yes, you can use the p-values - they are based on the same analytical
randomisation approach as that for global Moran's I - see the references.
This approach adjusts for the possible divergence of the observed data
from normality with respect to kurtosis, but the p-values are tainted by
multiple comparisons.
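A minimal sketch of what adjusting for multiple comparisons could look
like, using only p.adjust() from base R. The p-values here are simulated
stand-ins (an assumption for illustration); in practice they would be the
p-value column of the localmoran() result. spdep also provides
p.adjustSP(), which adjusts with respect to the number of neighbours
rather than the total number of observations.

```r
set.seed(1)

# Hypothetical stand-in for the p-value column of a localmoran() result:
# 95 "null" locations plus 5 locations with very small p-values.
p_raw <- c(runif(95), runif(5, min = 0, max = 0.001))

# Family-wise error control (conservative) and false discovery rate control
p_bonf <- p.adjust(p_raw, method = "bonferroni")
p_holm <- p.adjust(p_raw, method = "holm")
p_fdr  <- p.adjust(p_raw, method = "fdr")

# Number of locations flagged at the 0.05 level under each choice
n_flagged <- sapply(
  list(raw = p_raw, bonferroni = p_bonf, holm = p_holm, fdr = p_fdr),
  function(p) sum(p < 0.05)
)
print(n_flagged)
```

The unadjusted count will generally be the largest, which is the sense in
which the raw local p-values overstate the evidence for hotspots.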
By randomisation, also below, you seem to mean permutation bootstrapping
(or Monte Carlo, or Hope-type test). Note that if you permute over all the
data, you are not actually doing what you think that you are doing,
because only the (small) set of neighbour values should be used for
permutation, not all observations. The approaches may be equivalent if you
know definitely that your model of the data (mean model and covariance
model) is fully specified: there are no missing variables, all the
variables have the correct functional forms, and there are no omitted
global spatial processes. This is a very strong assumption, especially
given the typical model of y ~ 1 (just the mean) used in Moran and local
Moran tests.
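For contrast with permuting over all the data, here is a rough sketch of
conditional permutation as it is usually described for local statistics:
the value at location i is held fixed and only the values placed in its
neighbour set are redrawn. This is one reading of the point above, not
necessarily exactly what is meant. It uses base R only; the grid, the
hand-built rook neighbours and all object names are hypothetical, and the
statistic is the standard row-standardised local Moran's I written out by
hand rather than a call to localmoran().

```r
set.seed(42)

# Hypothetical toy data: values on a 6 x 6 grid, stored row by row
nr <- 6; nc <- 6; n <- nr * nc
z <- rnorm(n)

# Rook neighbours of cell i, built by hand so that no package is needed
rook_neighbours <- function(i, nr, nc) {
  r <- (i - 1) %/% nc + 1
  k <- (i - 1) %% nc + 1
  cand <- rbind(c(r - 1, k), c(r + 1, k), c(r, k - 1), c(r, k + 1))
  ok <- cand[, 1] >= 1 & cand[, 1] <= nr & cand[, 2] >= 1 & cand[, 2] <= nc
  (cand[ok, 1] - 1) * nc + cand[ok, 2]
}
nbs <- lapply(seq_len(n), rook_neighbours, nr = nr, nc = nc)

# Local Moran's I_i with row-standardised weights:
# I_i = (z_i - mean(z)) / s2 * mean of centred neighbour values
zc <- z - mean(z)
s2 <- sum(zc^2) / n
local_I <- function(i, nb_values) (zc[i] / s2) * mean(nb_values)
I_obs <- vapply(seq_len(n), function(i) local_I(i, zc[nbs[[i]]]), numeric(1))

# Conditional permutation: keep z_i fixed, fill its neighbour set with a
# random draw (without replacement) from the other n - 1 values
nsim <- 499
I_sim <- matrix(NA_real_, nrow = nsim, ncol = n)
for (i in seq_len(n)) {
  others <- zc[-i]
  k <- length(nbs[[i]])
  for (s in seq_len(nsim)) {
    I_sim[s, i] <- local_I(i, sample(others, k))
  }
}

# One-sided pseudo p-values for positive local autocorrelation
p_sim <- (colSums(sweep(I_sim, 2, I_obs, ">=")) + 1) / (nsim + 1)
```

These pseudo p-values are still subject to the multiple-comparisons
problem, and the smallest value they can take is 1 / (nsim + 1).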
Instead, it may be safer to do parametric bootstrapping, drawing from the
actual distribution of observations for the small neighbour set - this
also permits other approaches to be examined. See Waller & Gotway (2004)
p. 239 for a discussion. In fact, you can actually use localmoran.sad()
for a Saddlepoint approximation, or localmoran.exact() for the exact test,
which are typically similar to the analytical randomisation approach for
much of the range of the statistic, but perform much better where
discrimination is needed, and are pretty fast, so speed is not an issue.
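
A hedged sketch of calling the Saddlepoint version on toy data. The 5 x 5
grid, the simulated values and the object names are assumptions for
illustration; the function takes a fitted lm object and an nb object
rather than a listw object, so the mean-only model is fitted first.
localmoran.exact() is, as far as I recall, called with the same leading
arguments. The block is wrapped in a check so that it does nothing if
spdep is not installed.

```r
if (requireNamespace("spdep", quietly = TRUE)) {
  library(spdep)
  set.seed(7)

  # Hypothetical toy data: rook neighbours on a 5 x 5 grid
  nb <- cell2nb(5, 5)
  z <- rnorm(25)

  # Mean-only model, i.e. the y ~ 1 case
  lm0 <- lm(z ~ 1)

  # Saddlepoint approximation for every location
  sad <- localmoran.sad(lm0, select = seq_along(z), nb = nb, style = "W")

  # One row per selected location
  sad_df <- as.data.frame(sad)
  print(head(sad_df))
}
```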
This expands on Danlin Yu's helpful comments; I share his concerns about
using unadjusted p-values.
If you want to look at the hotspot literature more closely, see Chapter 7
in Waller & Gotway, and perhaps review implementations of relevant methods
in the DCluster package.
Hope this helps,
Roger
Moreover, I also want to calculate statistical significance through a
randomization approach (commonly used with Moran's I statistic). The
idea behind the randomization is rather simple, and the coding doesn't
seem too difficult, but the identified hotspots appear larger and more
disaggregated than those identified from the p-values provided
by the localmoran function at a similar significance level.
Did I make a mistake in the following code I wrote for the permutation?
Thanks for any advice, explanation or comment you may have.
Valerio Bartolino
###########################################
library(spdep)
locMoranI.perm <- function(x, R, listw, ...) {
  # x: vector of values on which to calculate the local Moran's I statistic
  # R: number of permutations
  # listw, ...: arguments passed on to the localmoran function
  # Returns an R x length(x) matrix of local Moran's I values, one row
  # per permutation of x over all locations
  mat <- matrix(data = NA, nrow = R, ncol = length(x))
  for (i in seq_len(R)) {
    perm <- sample(x, replace = FALSE)
    I.locmor <- localmoran(perm, listw, ...)
    mat[i, ] <- I.locmor[, 1]  # column 1 holds the local Moran's I values
  }
  return(mat)
}
# I used this new function as follows
# (z is the vector of model predictions, nbw the spatial weights object):
nsim <- 1000
I.perm <- locMoranI.perm(z, R = nsim, listw = nbw)
MorI <- localmoran(z, listw = nbw)
# select, for instance, a 0.01 pseudo-significance level:
# the 0.99 quantile of the permuted values at each location
p.perm <- apply(I.perm, 2, quantile, probs = 0.99)
## because Moran's I identifies spatial clustering,
## high and low hotspots do not have distinct I values
## make a vector to flag hotspots that are both significant and high
hot <- ifelse(p.perm - MorI[, 1] < 0 & z > mean(z), 1, 0)
_______________________________________________
R-sig-Geo mailing list
[email protected]
https://stat.ethz.ch/mailman/listinfo/r-sig-geo
--
Roger Bivand
Economic Geography Section, Department of Economics, Norwegian School of
Economics and Business Administration, Helleveien 30, N-5045 Bergen,
Norway. voice: +47 55 95 93 55; fax +47 55 95 95 43
e-mail: [email protected]