Hello Everyone, I have one question on the spatial autoregression tests on primary variables or residuals of linear regression models of a dataset that has multiple observations per region. Here I use columbus dataset and moran.test() on a primary variable CRIME as illustration.
I modified the dataset to have only 7 regions, not 49 regions in original dataset, and each region has 7 observations. I used poly2nb() and nb2listw() to create neighbors and weights list of each region. But after examining the list carefully, I found that two observations within the same region have weight 1. This looks not right. I think poly2nb function treats each observation (a row in the dataset) as a region. Then I modified the neighbors and weights list to have weight 0 for two observations that are in the same region. The moran.test results from original listw object (listw0) and modified listw object (listw) have different Z-scores and p-values. The which moran.test is correct, or are both incorrect? > library(dplyr) > library(spdep) > > # read dataset and modify it: only 7 regions, 7 observations per region. > columbus <- st_read(system.file("shapes/columbus.shp", package="spData")[1], > quiet=TRUE) %>% + mutate(regionId = rep(LETTERS[1:7], each = 7), + geometry = rep(.$geometry[1:7], each = 7)) > columbus %>% as.data.frame() %>% select(regionId,geometry) %>% unique() %>% > dim() [1] 7 2 > > # neighbor list and weights > nblist <- poly2nb(as(columbus, "Spatial")) > listw0 <- listw <- nb2listw(nblist, zero.policy=T, style="B") > > # 1.Moran I test on primary variable CRIME using original weights list > (m.test <- moran.test(columbus$CRIME, listw0, + randomisation=FALSE, alternative="greater")) Moran I test under normality data: columbus$CRIME weights: listw0 Moran I statistic standard deviate = 2.6121, p-value = 0.004499 alternative hypothesis: greater sample estimates: Moran I statistic Expectation Variance 0.0497371251 -0.0208333333 0.0007298841 > > # change weights in listw: two observations within the same region have > weight 0. > columbus$ID = seq(nrow(columbus)) > for (i in seq_along(listw$neighbours)){ + listw$neighbours[[i]] = setdiff(listw$neighbours[[i]], subset(as.data.frame(columbus), regionId == as.data.frame(columbus)[i, "regionId"])$ID) + listw$weights[[i]] = rep(1, length(listw$neighbours[[i]])) + + if (length(listw$neighbours[[i]]) == 0) { + listw$neighbours[[i]] = 0L + listw$weights[i] = list(NULL) + } + } > > # 2.Moran I test on primary variable CRIME using modified weights list > (m.test <- moran.test(columbus$CRIME, listw, + randomisation=FALSE, alternative="greater")) Moran I test under normality data: columbus$CRIME weights: listw n reduced by no-neighbour observations Moran I statistic standard deviate = 1.1726, p-value = 0.1205 alternative hypothesis: greater sample estimates: Moran I statistic Expectation Variance 0.014619659 -0.024390244 0.001106761 > > # the neighbors and weights of the first observation > ## original listw > card(listw0$neighbours)[[1]] [1] 20 > listw0$neighbours[[1]] [1] 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 > listw0$weights[[1]] [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 > > ## modified listw > card(listw$neighbours)[[1]] [1] 14 > listw$neighbours[[1]] [1] 8 9 10 11 12 13 14 15 16 17 18 19 20 21 > listw$weights[[1]] [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 > > # overview of listw0 and listw > listw0 Characteristics of weights list object: Neighbour list object: Number of regions: 49 Number of nonzero links: 1078 Percentage nonzero weights: 44.89796 Average number of links: 22 2 disjoint connected subgraphs Weights style: B Weights constants summary: n nn S0 S1 S2 B 49 2401 1078 2156 110544 > listw Characteristics of weights list object: Neighbour list object: Number of regions: 49 Number of nonzero links: 784 Percentage nonzero weights: 32.65306 Average number of links: 16 7 regions with no links: 43 44 45 46 47 48 49 8 disjoint connected subgraphs Weights style: B Weights constants summary: n nn S0 S1 S2 B 42 1764 784 1568 65856 Thank you in advance and sorry for the inconvenience. Hongyun Wang Sr. Data Scientist � Contracted Bayer AG hongyun.wang....@bayer.com ________________________________ The information contained in this e-mail is for the excl...{{dropped:14}}
_______________________________________________ R-sig-Geo mailing list R-sig-Geo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-geo