-----Roger Bivand <[EMAIL PROTECTED]> wrote: -----
On Wed, 18 Oct 2006 [EMAIL PROTECTED] wrote:
> I have a 1000 x 1000 grid of categorical values (nine of these) and want
> to compute and plot a correlogram. Function cell2nb() will take a while
> it appears but if that succeeds then can methods of sp.correlogram() be
> used on categorical data? What are "style" options in sp.correlogram()?
>
Sorry, could you say how a correlogram might be constructed for
categorical values (eg. land cover)? Wouldn't join counts be a more
natural choice? joincoint.multi() in spdep has a Jtot value of total
different category joins, so using that with different lags of a cell2nb()
neighbour list might work. However, the 1M cell grid is pretty large for
cell2nb(), using dnearneigh() on cell centres may be faster and scale
better, (and other possibilities should exist) and nblag() will be
definitely sub-optimal in this setting. For join counts, the "B" weights
style is the obvious one to chose. Note that joincoint.multi() is not
coded in C.
This would need trying on a small subset and scaling up - I think that
alternative routes to constructing the lagged neighbour lists would be
preferable.
Roger
I wrote a short function to compute proportion agreement between classes at different lags, R code below. Because it's a brute force O(n^2) algorithm, it takes way too long in R for datasets with more than some 10's of points. So I wrote a C version that took about 20 hours for 1M points on a 2GHz linux computer. Subsampling from the 1000x1000 image reduced time considerably, naturally.
I haven't looked at Cliff and Ord or other texts on these kinds of statistics for a long time and I don't know whether this simple proportion agreement measure is closely related to a multinomial joincount or not. And so I don't know what the test for non-randomness is or what a variance measure is, etc. It does seem to portray the spatial pattern in agreement though.
isotaxogram <- function (pts, nlags=10, maxdist)
# Calculate proportion agreement of categorical point data
# at successive lags
#
# pts matrix or data frame with columns/names
# "x", "y", "val"
# nlags number of lags in the output function
# maxdist maximum distance over which to compute the function;
# needs to be specified
{
agree <- num <- lag <- rep (0, nlags)
for (i in 1:(nrow(pts)-1))
{
x1 <- pts[i, "x"]
y1 <- pts[i, "y"]
for (j in (i+1):nrow(pts))
{
x2 <- pts[j, "x"]
y2 <- pts[j, "y"]
d <- sqrt ((x2-x1)^2 + (y2-y1)^2)
d <- round (d / maxdist * nlags)
a <- as.integer (pts[i, "val"] == pts[j, "val"])
if (d %in% 1:nlags)
{
agree[d] <- agree[d] + a
num[d] <- num[d] + 1
}
}
}
for (k in 1:nlags)
{
lag[k] <- k * (maxdist / nlags)
if (num[k] != 0) agree[k] <- agree[k] / num[k]
}
return (list (agree=agree, lag=lag, num=num))
}
_______________________________________________ R-sig-Geo mailing list [email protected] https://stat.ethz.ch/mailman/listinfo/r-sig-geo
