[R-sig-Geo] Adjusting for HAC in splm models
Dear community, I am trying to interpret and report the results of my spatial panel analysis, but I am running into issues while attempting to accommodate for spatial and serial autocorrelation and clustering in the errors. What implemented solutions currently exist in R for heteroskedasticity and autocorrelation correction (HAC) and/or standard error clustering for spatial panel models? Beyond the robust that could be run with non-spatial plm, fixest, or lfe packages, I have not been able to find an analogous HAC procedure for spatial panel models. The typical lmtest::coeftest() with sandwich vcov argument do not work directly on splm objects. A typical error on application of lmtest:coeftest(splm.obj, .vcov = vcovHAC(splm.obj)) is this: Error in UseMethod("estfun") : no applicable method for 'estfun' applied to an object of class "c('splm_ML', 'splm', 'splm_ML')" However, the splm documentation says that splm::vcov.splm() function is able to extract vcov matrix for interoperability with lmtest functions. How would I go about doing it? I consulted a fairly recent paper on software for spatial panel analysis by Bivand, Millo & Piras (2021), but there is little in the way of discussion on how to adjust the standard errors for spatial and serial autocorrelation and clustering. I am also aware of Conley spatial errors (Conley (1999)), implemented in fixest package for example, but these tend to require latitude and longitude as inputs (which is fine, if my inputs were points). But my dataset consists of polygons of various sizes where reducing them to a centroid to compute a distance matrix becomes theoretically untenable. I would prefer to stick with adjacency/contiguity weights. I am running R version 4.2.2 (splm v.1.6.3). Below is a simple reproducible example with similar outcomes: library(plm) library(spdep) library(splm) library(sandwich) library(lmtest) # Read in the data and spatial weights data("RiceFarms") data("riceww") # Convert into panel data frame and convert spatial weights into list form RiceFarms <- pdata.frame(RiceFarms, index = c("id", "time")) listw <- mat2listw(riceww) # Pre-compute lag for variables for use in Durbin specifications RiceFarms$slag.pesticide <- slag(RiceFarms$pesticide, listw = listw) RiceFarms$slag.wage <- slag(RiceFarms$wage, listw = listw) # PLM model plm.mod <- plm(goutput ~ pesticide + wage,data = RiceFarms, model = "within", effect = "twoways", index = c("id", "time")) # SDEM model sdem.mod <- spml(goutput ~ pesticide + wage + slag.pesticide + slag.wage, data = RiceFarms, listw = listw, spatial.error = "b", lag = FALSE, index = c("id", "time"), model = "within", effect = "twoways") # Apply robust SE corrections lmtest::coeftest(plm.mod, vcov. = vcovHC(plm.mod, method = "arellano", type = "HC0")) # Neither works with splm object lmtest::coeftest(sdem.mod, vcov. = vcovHC(sdem.mod, method = "arellano", type = "HC0")) lmtest::coeftest(sdem.mod, vcov. = sandwich::vcovHAC(sdem.mod)) A follow-up question is how to deal with HAC standard error corrections for the impacts, given that spatial panel impacts function are not directly implemented in splm as of version 1.6.3, although the coefficients can be computed as per the previous posts on the subject: https://stat.ethz.ch/pipermail/r-sig-geo/2016-April/024326.html https://stat.ethz.ch/pipermail/r-sig-geo/2019-July/027513.html Any help would be much appreciated. Best, Denys Dukhovnov ___ R-sig-Geo mailing list R-sig-Geo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-geo
[R-sig-Geo] Impacts for panel SDM and SDEM models in splm (revisited)
Dear community, Almost 7 years ago there was a question on panel SDM model impacts, as run following the models in splm package. https://stat.ethz.ch/pipermail/r-sig-geo/2016-April/024326.html The answer given by Roger Bivand then was that there is no built-in function for inference of SDM and SDEM impacts available and instead the impacts could be derived by following a stepwise process using dense weight matrices multiplying the inv(I - rho*W) matrix and the respective model coefficients, differently for Durbin and non-Durbin covariates. I would like to follow up with 2 more questions. As I did not find any, I am assuming there is still no built-in package function implementation for panel spatial Durbin and Durbin error impacts: 1) Would this solution work the same way for panel SDEM specification (that is, for the Durbin, local spillover effect only)? 2) How could one generate the p-values for the direct, indirect, and total impacts using this method, as given by spdep::impacts() output for SAR and SLX models? Would one need to run some form of simulations? Thank you very much. Best, Denys Dukhovnov ___ R-sig-Geo mailing list R-sig-Geo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-geo
Re: [R-sig-Geo] Error in splm random effects ML model
ag, errors = errors, cl = cl, > ...) > 1: spml(formula = mx.standard ~ I.score, data = analysis.subset, > listw = listw.wgts, effect = "individual", model = "random", > lag = FALSE, spatial.error = "b", index = c("Group", "Year")) > > > > My sessionInfo: > > R version 4.2.2 (2022-10-31 ucrt) > Platform: x86_64-w64-mingw32/x64 (64-bit) > Running under: Windows 10 x64 (build 19044) > > > Matrix products: default > > > locale: > [1] LC_COLLATE=English_United States.utf8 LC_CTYPE=English_United > States.utf8 LC_MONETARY=English_United States.utf8 > [4] LC_NUMERIC=C LC_TIME=English_United States.utf8 > > > > attached base packages: > [1] stats graphics grDevices utils datasets methods base > > > other attached packages: > [1] splm_1.6-2 plm_2.6-2 spdep_1.2-7 spData_2.2.1 sp_1.6-0 > forcats_1.0.0 stringr_1.5.0 dplyr_1.1.0 purrr_1.0.1 > [10] readr_2.1.4 tidyr_1.3.0 tibble_3.1.8 ggplot2_3.4.1 > tidyverse_1.3.2 sf_1.0-9 > > > loaded via a namespace (and not attached): > [1] nlme_3.1-160 fs_1.6.1 spatialreg_1.2-6 > lubridate_1.9.2 httr_1.4.4 tools_4.2.2 backports_1.4.1 > > [8] utf8_1.2.3 R6_2.5.1 KernSmooth_2.23-20 DBI_1.1.3 > colorspace_2.1-0 withr_2.5.0 tidyselect_1.2.0 > [15] compiler_4.2.2 cli_3.6.0 rvest_1.0.3 expm_0.999-7 > xml2_1.3.3 sandwich_3.0-2 scales_1.2.1 > [22] lmtest_0.9-40 classInt_0.4-8 proxy_0.4-27 > digest_0.6.31 pkgconfig_2.0.3 dbplyr_2.3.0 collapse_1.9.2 > > [29] ibdreg_0.3.8 rlang_1.0.6 readxl_1.4.2 > rstudioapi_0.14 generics_0.1.3 zoo_1.8-11 jsonlite_1.8.4 > > [36] googlesheets4_1.0.1 magrittr_2.0.3 Formula_1.2-4 s2_1.1.2 > dotCall64_1.0-2 Matrix_1.5-3 Rcpp_1.0.10 > [43] munsell_0.5.0 fansi_1.0.4 lifecycle_1.0.3 > stringi_1.7.8 MASS_7.3-58.1 grid_4.2.2 parallel_4.2.2 > > [50] bdsmatrix_1.3-6 crayon_1.5.2 deldir_1.0-6 > lattice_0.20-45 haven_2.5.1 splines_4.2.2 hms_1.1.2 > > [57] pillar_1.8.1 boot_1.3-28 LearnBayes_2.15.1 wk_0.7.1 > reprex_2.0.2 glue_1.6.2 modelr_0.1.10 > [64] vctrs_0.5.2 spam_2.9-1 tzdb_0.3.0 Rdpack_2.4 > miscTools_0.6-26 cellranger_1.1.0 gtable_0.3.1 > [71] assertthat_0.2.1 rbibutils_2.2.13 broom_1.0.3 e1071_1.7-13 > coda_0.19-4 class_7.3-20 googledrive_2.0.0 > [78] gargle_1.3.0 units_0.8-1 maxLik_1.5-2 > timechange_0.2.0 ellipsis_0.3.2 > > > > I tried running the same code on Linux x64 platform as well, with much the > same end result. > R version 4.2.2 Patched (2022-11-10 r83330) > Platform: x86_64-pc-linux-gnu (64-bit) > Running under: Ubuntu 18.04.6 LTS > > > Here is the link to the minimally reproducible example (code and the actual > data): > https://eur02.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.dropbox.com%2Fsh%2Fs680xpvfdxowuor%2FAAAH5yF4JtrzWUgdbvQIF6LBa%3Fdl%3D0=05%7C01%7Croger.bivand%40nhh.no%7C7b2114f83ed64bb06aad08db0f85b68f%7C33a15b2f849941998d56f20b5aa91af2%7C0%7C0%7C638120839199952772%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C=QX%2BWBZkxcnNypTVT%2FmJTZJC1aUmDluawNH3LGOVosTE%3D=0 > > > Thank you for your help. > > > Denys Dukhovnov > > > > On Wednesday, February 15, 2023 at 11:12:31 AM EST, Roger Bivand > wrote: > > > > > > On Wed, 15 Feb 2023, Denys Dukhovnov via R-sig-Geo wrote: > >> Dear community, >> >> I am trying to run a spatial panel random effects SEM model using splm >> package on a balanced spatial panel of N = 2472 and T = 4, and weights >> are based on k = 6 nearest neighbors. I encounter this error, which I >> can't find anything about in the documentation or online. >> >> spml(formula = mx.standard ~ I.score, >> data = analysis.subset, >> listw = listw.wgts, >> effect = "individual", >> model = "random", >> lag = FALSE, >> spatial.error = "b", >> index = c("Group", "Year
Re: [R-sig-Geo] Error in splm random effects ML model
link to the minimally reproducible example (code and the actual data): https://www.dropbox.com/sh/s680xpvfdxowuor/AAAH5yF4JtrzWUgdbvQIF6LBa?dl=0 Thank you for your help. Denys Dukhovnov On Wednesday, February 15, 2023 at 11:12:31 AM EST, Roger Bivand wrote: On Wed, 15 Feb 2023, Denys Dukhovnov via R-sig-Geo wrote: > Dear community, > > I am trying to run a spatial panel random effects SEM model using splm > package on a balanced spatial panel of N = 2472 and T = 4, and weights > are based on k = 6 nearest neighbors. I encounter this error, which I > can't find anything about in the documentation or online. > > spml(formula = mx.standard ~ I.score, > data = analysis.subset, > listw = listw.wgts, > effect = "individual", > model = "random", > lag = FALSE, > spatial.error = "b", > index = c("Group", "Year")) > > Error in .C64("aplsb1", SIGNATURE = c(SS$signature, SS$signature, > "double", : NAs in argument 7 and 'NAOK = FALSE' (dotCall64) Please at least provide the output of traceback() after the error (it does not crash, it error-exits correctly). Also provide the output of sessionInfo() - versions of underlying packages may matter (I'm thinking of dotCall64 and packages it uses). .C64() is not called by splm functions directly. Does the error persist on a fully updated system on the same OS? What happens on a different platform if available? Ideally, provide a fully reproducible minimal example using built-in data, or provide your data and the code used to create analysis.subset and listw.wgts, so that the running the code creating the error can be reproduced exactly. The error message reports NAs, are the data complete? Roger > > Unlike the same specification but with fixed effects (with model = > "within" argument), the one above takes a long while to run until it > crashes. As far as I can tell, there are no memory issues. > > An alternative spatial random effects using spgm() command runs OK, as > do non-spatial fixed and random effects models using plm(). For the one > above I could use spgm(), but I would prefer to keep all my models in > the maximum likelihood estimation for consistency. > > Any clarifications as to the nature of the error are welcome. > > Thank you. > > > Denys Dukhovnov > > ___ > R-sig-Geo mailing list > R-sig-Geo@r-project.org > https://eur02.safelinks.protection.outlook.com/?url=https%3A%2F%2Fstat.ethz.ch%2Fmailman%2Flistinfo%2Fr-sig-geo=05%7C01%7CRoger.Bivand%40nhh.no%7Cd4a3c3a836f44c598f9608db0f6250a2%7C33a15b2f849941998d56f20b5aa91af2%7C0%7C0%7C638120687160554379%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C=Oa5KOUxlzjHxBzBfAd12W9WrTRVBynr1bYdJlST%2F17g%3D=0 > -- Roger Bivand Emeritus Professor Department of Economics, Norwegian School of Economics, Postboks 3490 Ytre Sandviken, 5045 Bergen, Norway. e-mail: roger.biv...@nhh.no https://orcid.org/-0003-2392-6140 https://scholar.google.no/citations?user=AWeghB0J=en ___ R-sig-Geo mailing list R-sig-Geo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-geo
[R-sig-Geo] Error in splm random effects ML model
Dear community, I am trying to run a spatial panel random effects SEM model using splm package on a balanced spatial panel of N = 2472 and T = 4, and weights are based on k = 6 nearest neighbors. I encounter this error, which I can't find anything about in the documentation or online. spml(formula = mx.standard ~ I.score, data = analysis.subset, listw = listw.wgts, effect = "individual", model = "random", lag = FALSE, spatial.error = "b", index = c("Group", "Year")) Error in .C64("aplsb1", SIGNATURE = c(SS$signature, SS$signature, "double", : NAs in argument 7 and 'NAOK = FALSE' (dotCall64) Unlike the same specification but with fixed effects (with model = "within" argument), the one above takes a long while to run until it crashes. As far as I can tell, there are no memory issues. An alternative spatial random effects using spgm() command runs OK, as do non-spatial fixed and random effects models using plm(). For the one above I could use spgm(), but I would prefer to keep all my models in the maximum likelihood estimation for consistency. Any clarifications as to the nature of the error are welcome. Thank you. Denys Dukhovnov ___ R-sig-Geo mailing list R-sig-Geo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-geo
[R-sig-Geo] Fitting hierarchical spatial panel data models
Hello, I am aiming to run an SLX or and SDEM local spillover type spatial model and I am looking for advice on existing methods and their implementations. I have a panel data with about 30 years of observation and nested spatial structure (such as US counties within US states). I would ideally like to account for both the hierarchical structure of the data as well as the time component. I understand that hierarchical spatial econometric models are well established (for example, as per Lacombe & McIntyre (2016)), but I struggle to find any prior reference that combines the nested/hierarchical structure with spatial panel data with a longitudinal component. I am curious if someone could point me to where I might want to look for a solution and any existing software implementation in R. Alternatively, ignoring the temporal component, is it possible to specify a multi-level model within the splm R package for SDEM type process? Many thanks in advance. Best, Denys ___ R-sig-Geo mailing list R-sig-Geo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-geo
[R-sig-Geo] Error in impacts() after lagsarlm (spdep package)
I posted this question on StackOverflow a while back but received no answer. I have 9,150 polygons in my data set. I was trying to run a spatial autoregressive model (SAR) in spdep to test spatial dependence of my outcome variable. After running the model, I wanted to examine the direct/indirect impacts, but encountered an error that seems to have something to do with the length of neighbors in the weights matrix not being equal to n. I tried running the very same equation as SLX model (Spatial Lag X), and impacts() worked fine, even though there were some polygons in my set that had no neighbors. > # Defining queen contiguity neighbors for polyset and storing the matrix as > list > q.nbrs <- poly2nb(polyset) > listweights <- nb2listw(q.nbrs, zero.policy = TRUE) > # Defining the model > model.equation <- TIME ~ A + B + C > # Run SAR model > reg <- lagsarlm(model.equation, data = polyset, listw = listweights, > zero.policy = TRUE) > # Run impacts() to show direct/indirect impacts > impacts(reg, listw = listweights, zero.policy = TRUE) > Error in intImpacts(rho = rho, beta = beta, P = P, n = n, mu = mu, Sigma = > Sigma, : length(listweights$neighbours) == n is not TRUE What am I doing wrong? I am running Windows 10 machine with R 3.5.3 with the most up-to-date spdep package, if it helps. Thank you very much. Regards, Denys Dukhovnov ___ R-sig-Geo mailing list R-sig-Geo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-geo
[R-sig-Geo] More efficient raster extraction
Hello, I am trying to extract 300 meter raster cell values over Census block polygons for each of the US states. Because 300m cells are much larger than typical urban blocks, I had to specify "small" option (as shown below) so that no polygon will remain "blank" without a corresponding raster value(s). Blocks shapefile for District Columbia with about 6,500 blocks, for example, being the smallest state, takes 15 minutes to complete, but New York with almost 350,000 block polygons would not complete even in 4 days. On a separate occasion I also tried using parallel processing (foreach), but this only appears to slow down the extract function, rather than speeding it up. I clipped the main raster to the extent of each state, but this doesn't help (not much, anyway). In this case the vector data are huge, so I would expect any improvement in efficiency to come from reducing its burden on the process. Please share any advice as to how I can make the extract function faster and more efficient. Thank you very much! Faster method: Without parallel processing: BlockIndex <- extract(Raster_300m, Blocks, df=TRUE, small=TRUE) Indefinitely slow: With parallel processing (8 registered cores): BlockIndex <- foreach (i=1:length(Blocks$GEOID10), .combine='c', .packages="raster") %dopar% { extract(Raster_300m, Blocks[i], df=TRUE, small=TRUE)} Regards, Denys Dukhovnov [[alternative HTML version deleted]] ___ R-sig-Geo mailing list R-sig-Geo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-geo
[R-sig-Geo] Help needed with extraction of raster statistics by polygons
Hello all! I need to spatially calculate 30-meter grid raster mean over US Census blocks in North East region (approx. 1.9 mln street blocks/polygons). I came up with the script that runs in parallel, but it still takes days to complete, even when the i7 CPU is used up to the max. My question: is there any other way to improve the processing speed given the set-up below (I included the backbone of the script to save space). I am new to R, so any help will be much appreciated. Thanks a lot! Regards, Denys library(sp) library(rgdal) library(raster) library(foreach) library(spatial.tools) Grid30m - raster(raster.tif)# loading main raster into R NEfips - c(09, 10, 11, ) # list of state FIPS codes, whose street block polygons are to be processed ShapePath - T:\\... # path to block shapefiles sfQuickInit(cpus=8) # setting up and registering parallel processing back-end for (x in NEfips) { State - paste(block_, x , sep=) Blocks - readOGR(dsn=ShapePath, layer=State[1], stringsAsFactors=F) # loading the block polygons BlocksReproject - spTransform(Blocks, crs(Grid30m)) GridMean - foreach (i=1:length(BlocksReproject$GEOID10), .combine='c', .packages=raster) %dopar% { # Parallel processing loop for extracting mean raster value per street block polygon extract(Grid30m, BlocksReproject[i], fun=mean) } write.table(...) # Generating output containing the mean raster statistics for each block polygon. } [[alternative HTML version deleted]] ___ R-sig-Geo mailing list R-sig-Geo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-geo