On Sun, 15 Feb 2009, Valerio Bartolino wrote:

Dear list,
My objective is to identify hotspot areas from a model prediction
over a high resolution grid. After calculating a spatial weights object I
easily applied the localmoran function from the spdep library. The
meaning of the p-values returned by the localmoran function is not
really clear to me, nor how much I can rely on them in terms of
statistical significance. For instance, can I use these p-values instead
of using a randomization approach? I would be glad for any clarification.

Yes, you can use the p-values - they are based on the same analytical randomisation approach as that for global Moran's I - see the references. This approach adjusts for the possible divergence of the observed data from normality with respect to kurtosis, but the p-values are tainted by multiple comparisons.
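To make the multiple-comparisons point concrete: with n locations there are n tests, so some unadjusted p-values fall below any threshold by chance alone. The following is a minimal sketch using base R's p.adjust(); the vector p is simulated purely for illustration and stands in for the p-value column returned by localmoran(), and the two adjustment methods shown are examples rather than recommendations.

```r
# Simulated stand-in for the localmoran() p-value column (hypothetical data:
# 1000 locations with no real local clustering, so p is uniform on [0, 1])
set.seed(1)
p <- runif(1000)

# Unadjusted: some locations are flagged at the 0.01 level by chance alone
n_raw <- sum(p < 0.01)

# Adjusted for the 1000 simultaneous tests
p_bonf <- p.adjust(p, method = "bonferroni")  # family-wise error control
p_fdr  <- p.adjust(p, method = "fdr")         # false discovery rate control
n_bonf <- sum(p_bonf < 0.01)
n_fdr  <- sum(p_fdr < 0.01)

print(c(raw = n_raw, bonferroni = n_bonf, fdr = n_fdr))
```

spdep also provides p.adjustSP(), which bases the adjustment on the number of neighbours of each location (plus one) rather than on the total number of locations.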

By randomisation, also below, you seem to mean permutation bootstrapping (or Monte Carlo, or Hope-type test). Note that if you permute over all the data, you are not actually doing what you think you are doing, because only the (small) set of neighbour values should be used for permutation, not all observations. The approaches may be equivalent if you know definitely that your model of the data (mean model and covariance model) is fully specified: there are no missing variables, all the variables have the correct functional forms, and there are no omitted global spatial processes. This is a very strong assumption, especially given the typical model of y ~ 1 (just the mean) used in Moran and local Moran tests.
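As an illustration of permuting only the neighbour set, here is a rough base-R sketch of the conventional conditional permutation scheme: the value at location i stays in place, and only the values placed at its neighbours are redrawn from the remaining n - 1 observations. This is one reading of the point above, not spdep code; the function names local_I and cond_perm_p, the neighbour list nb_list and the toy data are all hypothetical, the weights are assumed row-standardised, and every location is assumed to have at least one neighbour.

```r
# Local Moran's I for location i with row-standardised weights:
# I_i = (z_i / m2) * mean(z_j over neighbours j of i), with z centred
# and m2 = sum(z^2) / n
local_I <- function(zi, z_nb, m2) (zi / m2) * mean(z_nb)

# Conditional permutation: x_i stays in place, neighbour values are drawn
# without replacement from the other n - 1 observations
cond_perm_p <- function(x, nb_list, R = 999) {
  n  <- length(x)
  z  <- x - mean(x)
  m2 <- sum(z^2) / n
  vapply(seq_len(n), function(i) {
    k      <- length(nb_list[[i]])
    obs    <- local_I(z[i], z[nb_list[[i]]], m2)
    others <- z[-i]
    sim    <- replicate(R, local_I(z[i], sample(others, k), m2))
    # one-sided pseudo p-value for positive local autocorrelation
    (sum(sim >= obs) + 1) / (R + 1)
  }, numeric(1))
}

# Toy example: 20 locations on a line, each neighbouring the adjacent ones
set.seed(42)
n <- 20
nb_list <- lapply(seq_len(n),
                  function(i) setdiff(c(i - 1, i + 1), c(0, n + 1)))
x <- c(rnorm(15), rnorm(5, mean = 5))   # high values clustered at one end
p_cond <- cond_perm_p(x, nb_list, R = 199)
print(round(p_cond, 3))
```

These pseudo p-values are still subject to the multiple-comparisons problem, and the sketch does nothing about a misspecified mean model.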

Instead, it may be safer to do parametric bootstrapping, drawing from the actual distribution of observations for the small neighbour set - this also permits other approaches to be examined. See Waller & Gotway (2004) p. 239 for a discussion. In fact, you can actually use localmoran.sad() for a Saddlepoint approximation, or localmoran.exact() for the exact test, which are typically similar to the analytical randomisation approach for much of the range of the statistic, but perform much better where discrimination is needed, and are pretty fast, so speed is not an issue.
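A hedged sketch of how the two functions might be called. Both take a fitted lm object rather than a raw vector, so the mean-only model is written as lm(z ~ 1). The 5 x 5 grid and the simulated z are hypothetical toy data; with a listw object such as nbw, the neighbour list should be available as nbw$neighbours. Argument defaults have changed between spdep versions, so check the help pages of the installed version.

```r
library(spdep)

# Hypothetical toy data: a 5 x 5 grid with rook neighbours
set.seed(1)
nb <- cell2nb(5, 5)
z  <- rnorm(25)

# Both functions take a fitted lm object; z ~ 1 is the mean-only model
fit <- lm(z ~ 1)

sad   <- localmoran.sad(fit, select = 1:25, nb = nb, style = "W",
                        alternative = "greater")
exact <- localmoran.exact(fit, select = 1:25, nb = nb, style = "W",
                          alternative = "greater")

# as.data.frame() methods collect the per-location results
sad_df   <- as.data.frame(sad)
exact_df <- as.data.frame(exact)
print(head(sad_df))
print(head(exact_df))
```

Because the model is passed explicitly, the same calls work with a richer mean model, for example one that includes covariates, in place of z ~ 1.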

This expands on Danlin Yu's helpful comments; I share his concerns about using unadjusted p-values.

If you want to look at the hotspot literature more closely, see Chapter 7 in Waller & Gotway, and perhaps review implementations of relevant methods in the DCluster package.

Hope this helps,

Roger


Moreover, I also want to calculate statistical significance through a
randomization approach (commonly used with Moran's I statistic). The
idea behind the randomization is rather simple, and the coding doesn't
seem too difficult either, but the identified hotspots appear larger and
more disaggregated than those identified from the p-values provided
by the localmoran function at a similar significance level.

Did I make a mistake in the following code I wrote for the permutation?
Thanks for any advice, explanation or comment you may have.

Valerio Bartolino

###########################################
library(spdep)

locMoranI.perm <- function(x, R, listw, ...){

# x is a vector of the values on which to calculate the local Moran's I statistic
# R is the number of permutations
# listw, ... are the arguments passed on to the localmoran function

        mat <- matrix(data=NA, nrow=R, ncol=length(x))
        for(i in seq_len(R)){
                # permute all values of x over all locations
                perm <- sample(x, replace=FALSE)
                # keep only the local I values (first column)
                mat[i,] <- localmoran(perm, listw, ...)[,1]
        }

        return(mat)
}

# I used this new function as follows:
nsim <- 1000
I.perm <- locMoranI.perm(z, R=nsim, listw=nbw)

MorI <- localmoran(z, listw=nbw)

# select for instance a 0.01 pseudo-significance level
p.perm <- apply(I.perm, 2, quantile, probs=0.99)

## because local Moran's I identifies clustering of similar values,
## clusters of high values and of low values both give positive I values;
## make a vector flagging significant clusters of high values
hot <- ifelse(p.perm-MorI[,1]<0 & z>mean(z),1,0)

_______________________________________________
R-sig-Geo mailing list
[email protected]
https://stat.ethz.ch/mailman/listinfo/r-sig-geo


--
Roger Bivand
Economic Geography Section, Department of Economics, Norwegian School of
Economics and Business Administration, Helleveien 30, N-5045 Bergen,
Norway. voice: +47 55 95 93 55; fax +47 55 95 95 43
e-mail: [email protected]
