[R-sig-Geo] Adjusting for HAC in splm models

2023-11-21 Thread Denys Dukhovnov via R-sig-Geo
Dear community,

I am trying to interpret and report the results of my spatial panel analysis, 
but I am running into issues while attempting to accommodate for spatial and 
serial autocorrelation and clustering in the errors.

What implemented solutions currently exist in R for heteroskedasticity and 
autocorrelation correction (HAC) and/or standard error clustering for spatial 
panel models? Beyond the robust that could be run with non-spatial plm, fixest, 
or lfe packages, I have not been able to find an analogous HAC procedure for 
spatial panel models. The typical lmtest::coeftest() with sandwich vcov 
argument do not work directly on splm objects. 

A typical error on application of lmtest:coeftest(splm.obj, .vcov = 
vcovHAC(splm.obj)) is this:
Error in UseMethod("estfun") : no applicable method for 'estfun' applied to an 
object of class "c('splm_ML', 'splm', 'splm_ML')"

However, the splm documentation says that splm::vcov.splm() function is able to 
extract vcov matrix for interoperability with lmtest functions. How would I go 
about doing it?

I consulted a fairly recent paper on software for spatial panel analysis by 
Bivand, Millo & Piras (2021), but there is little in the way of discussion on 
how to adjust the standard errors for spatial and serial autocorrelation and 
clustering.

I am also aware of Conley spatial errors (Conley (1999)), implemented in fixest 
package for example, but these tend to require latitude and longitude as inputs 
(which is fine, if my inputs were points). But my dataset consists of polygons 
of various sizes where reducing them to a centroid to compute a distance matrix 
becomes theoretically untenable. I would prefer to stick with 
adjacency/contiguity weights.

I am running R version 4.2.2 (splm v.1.6.3). Below is a simple reproducible 
example with similar outcomes:

library(plm)
library(spdep)
library(splm)
library(sandwich)
library(lmtest)

# Read in the data and spatial weights
data("RiceFarms")
data("riceww")

# Convert into panel data frame and convert spatial weights into list form
RiceFarms <- pdata.frame(RiceFarms, index = c("id", "time"))
listw <- mat2listw(riceww)

# Pre-compute lag for variables for use in Durbin specifications
RiceFarms$slag.pesticide <- slag(RiceFarms$pesticide, listw = listw)
RiceFarms$slag.wage <- slag(RiceFarms$wage, listw = listw)

# PLM model
plm.mod <- plm(goutput ~ pesticide + wage,data = RiceFarms, 
               model = "within", 
               effect = "twoways", 
               index = c("id", "time"))

# SDEM model
sdem.mod <- spml(goutput ~ pesticide + wage + slag.pesticide + slag.wage,
                 data = RiceFarms, 
                 listw = listw,
                 spatial.error = "b",
                 lag = FALSE,
                 index = c("id", "time"), 
                 model = "within", 
                 effect = "twoways")

# Apply robust SE corrections
lmtest::coeftest(plm.mod, vcov. = vcovHC(plm.mod, method = "arellano", type = 
"HC0"))

# Neither works with splm object
lmtest::coeftest(sdem.mod, vcov. = vcovHC(sdem.mod, method = "arellano", type = 
"HC0"))
lmtest::coeftest(sdem.mod, vcov. = sandwich::vcovHAC(sdem.mod))


A follow-up question is how to deal with HAC standard error corrections for the 
impacts, given that spatial panel impacts function are not directly implemented 
in splm as of version 1.6.3, although the coefficients can be computed as per 
the previous posts on the subject:

https://stat.ethz.ch/pipermail/r-sig-geo/2016-April/024326.html
https://stat.ethz.ch/pipermail/r-sig-geo/2019-July/027513.html

Any help would be much appreciated.


Best,
Denys Dukhovnov

___
R-sig-Geo mailing list
R-sig-Geo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-geo


[R-sig-Geo] Impacts for panel SDM and SDEM models in splm (revisited)

2023-03-07 Thread Denys Dukhovnov via R-sig-Geo
Dear community,

Almost 7 years ago there was a question on panel SDM model impacts, as run 
following the models in splm package.

https://stat.ethz.ch/pipermail/r-sig-geo/2016-April/024326.html

The answer given by Roger Bivand then was that there is no built-in function 
for inference of SDM and SDEM impacts available and instead the impacts could 
be derived by following a stepwise process using dense weight matrices 
multiplying the inv(I - rho*W) matrix and the respective model coefficients, 
differently for Durbin and non-Durbin covariates.

I would like to follow up with 2 more questions. As I did not find any, I am 
assuming there is still no built-in package function implementation for panel 
spatial Durbin and Durbin error impacts:
1) Would this solution work the same way for panel SDEM specification (that is, 
for the Durbin, local spillover effect only)?
2) How could one generate the p-values for the direct, indirect, and total 
impacts using this method, as given by spdep::impacts() output for SAR and SLX 
models? Would one need to run some form of simulations?

Thank you very much.

Best,
Denys Dukhovnov

___
R-sig-Geo mailing list
R-sig-Geo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-geo


Re: [R-sig-Geo] Error in splm random effects ML model

2023-02-16 Thread Denys Dukhovnov via R-sig-Geo
ag, errors = errors, cl = cl,
>        ...)
> 1: spml(formula = mx.standard ~ I.score, data = analysis.subset,
>        listw = listw.wgts, effect = "individual", model = "random",
>        lag = FALSE, spatial.error = "b", index = c("Group", "Year"))
>
>
>
> My sessionInfo:
>
> R version 4.2.2 (2022-10-31 ucrt)
> Platform: x86_64-w64-mingw32/x64 (64-bit)
> Running under: Windows 10 x64 (build 19044)
>
>
> Matrix products: default
>
>
> locale:
> [1] LC_COLLATE=English_United States.utf8  LC_CTYPE=English_United 
> States.utf8    LC_MONETARY=English_United States.utf8
> [4] LC_NUMERIC=C                           LC_TIME=English_United States.utf8 
>    
>
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base     
>
>
> other attached packages:
>  [1] splm_1.6-2      plm_2.6-2       spdep_1.2-7     spData_2.2.1    sp_1.6-0 
>        forcats_1.0.0   stringr_1.5.0   dplyr_1.1.0     purrr_1.0.1    
> [10] readr_2.1.4     tidyr_1.3.0     tibble_3.1.8    ggplot2_3.4.1   
> tidyverse_1.3.2 sf_1.0-9       
>
>
> loaded via a namespace (and not attached):
>  [1] nlme_3.1-160        fs_1.6.1            spatialreg_1.2-6    
> lubridate_1.9.2     httr_1.4.4          tools_4.2.2         backports_1.4.1   
>  
>  [8] utf8_1.2.3          R6_2.5.1            KernSmooth_2.23-20  DBI_1.1.3    
>        colorspace_2.1-0    withr_2.5.0         tidyselect_1.2.0   
> [15] compiler_4.2.2      cli_3.6.0           rvest_1.0.3         expm_0.999-7 
>        xml2_1.3.3          sandwich_3.0-2      scales_1.2.1       
> [22] lmtest_0.9-40       classInt_0.4-8      proxy_0.4-27        
> digest_0.6.31       pkgconfig_2.0.3     dbplyr_2.3.0        collapse_1.9.2    
>  
> [29] ibdreg_0.3.8        rlang_1.0.6         readxl_1.4.2        
> rstudioapi_0.14     generics_0.1.3      zoo_1.8-11          jsonlite_1.8.4    
>  
> [36] googlesheets4_1.0.1 magrittr_2.0.3      Formula_1.2-4       s2_1.1.2     
>        dotCall64_1.0-2     Matrix_1.5-3        Rcpp_1.0.10        
> [43] munsell_0.5.0       fansi_1.0.4         lifecycle_1.0.3     
> stringi_1.7.8       MASS_7.3-58.1       grid_4.2.2          parallel_4.2.2    
>  
> [50] bdsmatrix_1.3-6     crayon_1.5.2        deldir_1.0-6        
> lattice_0.20-45     haven_2.5.1         splines_4.2.2       hms_1.1.2         
>  
> [57] pillar_1.8.1        boot_1.3-28         LearnBayes_2.15.1   wk_0.7.1     
>        reprex_2.0.2        glue_1.6.2          modelr_0.1.10      
> [64] vctrs_0.5.2         spam_2.9-1          tzdb_0.3.0          Rdpack_2.4   
>        miscTools_0.6-26    cellranger_1.1.0    gtable_0.3.1       
> [71] assertthat_0.2.1    rbibutils_2.2.13    broom_1.0.3         e1071_1.7-13 
>        coda_0.19-4         class_7.3-20        googledrive_2.0.0  
> [78] gargle_1.3.0        units_0.8-1         maxLik_1.5-2        
> timechange_0.2.0    ellipsis_0.3.2     
>
>
>
> I tried running the same code on Linux x64 platform as well, with much the 
> same end result.
> R version 4.2.2 Patched (2022-11-10 r83330)
> Platform: x86_64-pc-linux-gnu (64-bit)
> Running under: Ubuntu 18.04.6 LTS
>
>
> Here is the link to the minimally reproducible example (code and the actual 
> data):
> https://eur02.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.dropbox.com%2Fsh%2Fs680xpvfdxowuor%2FAAAH5yF4JtrzWUgdbvQIF6LBa%3Fdl%3D0=05%7C01%7Croger.bivand%40nhh.no%7C7b2114f83ed64bb06aad08db0f85b68f%7C33a15b2f849941998d56f20b5aa91af2%7C0%7C0%7C638120839199952772%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C=QX%2BWBZkxcnNypTVT%2FmJTZJC1aUmDluawNH3LGOVosTE%3D=0
>
>
> Thank you for your help.
>
>
> Denys Dukhovnov
>
>
>
> On Wednesday, February 15, 2023 at 11:12:31 AM EST, Roger Bivand 
>  wrote:
>
>
>
>
>
> On Wed, 15 Feb 2023, Denys Dukhovnov via R-sig-Geo wrote:
>
>> Dear community,
>>
>> I am trying to run a spatial panel random effects SEM model using splm
>> package on a balanced spatial panel of N = 2472 and T = 4, and weights
>> are based on k = 6 nearest neighbors. I encounter this error, which I
>> can't find anything about in the documentation or online.
>>
>> spml(formula = mx.standard ~ I.score, 
>>                          data = analysis.subset,
>>                          listw = listw.wgts,
>>                          effect = "individual",
>>                          model = "random", 
>>                          lag = FALSE,
>>                          spatial.error = "b",
>>                          index = c("Group", "Year

Re: [R-sig-Geo] Error in splm random effects ML model

2023-02-15 Thread Denys Dukhovnov via R-sig-Geo
link to the minimally reproducible example (code and the actual 
data):
https://www.dropbox.com/sh/s680xpvfdxowuor/AAAH5yF4JtrzWUgdbvQIF6LBa?dl=0


Thank you for your help.


Denys Dukhovnov



On Wednesday, February 15, 2023 at 11:12:31 AM EST, Roger Bivand 
 wrote: 





On Wed, 15 Feb 2023, Denys Dukhovnov via R-sig-Geo wrote:

> Dear community,
>
> I am trying to run a spatial panel random effects SEM model using splm 
> package on a balanced spatial panel of N = 2472 and T = 4, and weights 
> are based on k = 6 nearest neighbors. I encounter this error, which I 
> can't find anything about in the documentation or online.
>
> spml(formula = mx.standard ~ I.score, 
>                          data = analysis.subset,
>                          listw = listw.wgts,
>                          effect = "individual",
>                          model = "random", 
>                          lag = FALSE,
>                          spatial.error = "b",
>                          index = c("Group", "Year"))
>
> Error in .C64("aplsb1", SIGNATURE = c(SS$signature, SS$signature, 
> "double", : NAs in argument 7 and 'NAOK = FALSE' (dotCall64)

Please at least provide the output of traceback() after the error (it does 
not crash, it error-exits correctly). Also provide the output of 
sessionInfo() - versions of underlying packages may matter (I'm thinking 
of dotCall64 and packages it uses). .C64() is not called by splm functions 
directly.

Does the error persist on a fully updated system on the same OS?

What happens on a different platform if available?

Ideally, provide a fully reproducible minimal example using built-in data, 
or provide your data and the code used to create analysis.subset and 
listw.wgts, so that the running the code creating the error can be 
reproduced exactly. The error message reports NAs, are the data complete?

Roger


>
> Unlike the same specification but with fixed effects (with model = 
> "within" argument), the one above takes a long while to run until it 
> crashes. As far as I can tell, there are no memory issues.
>
> An alternative spatial random effects using spgm() command runs OK, as 
> do non-spatial fixed and random effects models using plm(). For the one 
> above I could use spgm(), but I would prefer to keep all my models in 
> the maximum likelihood estimation for consistency.
>
> Any clarifications as to the nature of the error are welcome. 
>
> Thank you.
>
>
> Denys Dukhovnov
>
> ___
> R-sig-Geo mailing list
> R-sig-Geo@r-project.org
> https://eur02.safelinks.protection.outlook.com/?url=https%3A%2F%2Fstat.ethz.ch%2Fmailman%2Flistinfo%2Fr-sig-geo=05%7C01%7CRoger.Bivand%40nhh.no%7Cd4a3c3a836f44c598f9608db0f6250a2%7C33a15b2f849941998d56f20b5aa91af2%7C0%7C0%7C638120687160554379%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C=Oa5KOUxlzjHxBzBfAd12W9WrTRVBynr1bYdJlST%2F17g%3D=0
>

-- 
Roger Bivand
Emeritus Professor
Department of Economics, Norwegian School of Economics,
Postboks 3490 Ytre Sandviken, 5045 Bergen, Norway.
e-mail: roger.biv...@nhh.no
https://orcid.org/-0003-2392-6140
https://scholar.google.no/citations?user=AWeghB0J=en

___
R-sig-Geo mailing list
R-sig-Geo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-geo


[R-sig-Geo] Error in splm random effects ML model

2023-02-15 Thread Denys Dukhovnov via R-sig-Geo
Dear community,

I am trying to run a spatial panel random effects SEM model using splm package 
on a balanced spatial panel of N = 2472 and T = 4, and weights are based on k = 
6 nearest neighbors. I encounter this error, which I can't find anything about 
in the documentation or online.

spml(formula = mx.standard ~ I.score, 
                         data = analysis.subset,
                         listw = listw.wgts,
                         effect = "individual",
                         model = "random", 
                         lag = FALSE,
                         spatial.error = "b",
                         index = c("Group", "Year"))

Error in .C64("aplsb1", SIGNATURE = c(SS$signature, SS$signature, "double",  : 
  NAs in argument 7 and 'NAOK = FALSE' (dotCall64)

Unlike the same specification but with fixed effects (with model = "within" 
argument), the one above takes a long while to run until it crashes. As far as 
I can tell, there are no memory issues.

An alternative spatial random effects using spgm() command runs OK, as do 
non-spatial fixed and random effects models using plm(). For the one above I 
could use spgm(), but I would prefer to keep all my models in the maximum 
likelihood estimation for consistency.

Any clarifications as to the nature of the error are welcome. 

Thank you.


Denys Dukhovnov

___
R-sig-Geo mailing list
R-sig-Geo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-geo


[R-sig-Geo] Fitting hierarchical spatial panel data models

2022-10-01 Thread Denys Dukhovnov via R-sig-Geo
Hello,

I am aiming to run an SLX or and SDEM local spillover type spatial model and I 
am looking for advice on existing methods and their implementations. 

I have a panel data with about 30 years of observation and nested spatial 
structure (such as US counties within US states). I would ideally like to 
account for both the hierarchical structure of the data as well as the time 
component. 

I understand that hierarchical spatial econometric models are well established 
(for example, as per Lacombe & McIntyre (2016)), but I struggle to find any 
prior reference that combines the nested/hierarchical structure with spatial 
panel data with a longitudinal component.

I am curious if someone could point me to where I might want to look for a 
solution and any existing software implementation in R. Alternatively, ignoring 
the temporal component, is it possible to specify a multi-level model within 
the splm R package for SDEM type process? 


Many thanks in advance.

Best,
Denys

___
R-sig-Geo mailing list
R-sig-Geo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-geo


[R-sig-Geo] Error in impacts() after lagsarlm (spdep package)

2019-11-13 Thread Denys Dukhovnov via R-sig-Geo
I posted this question on StackOverflow a while back but received no answer. 

I have 9,150 polygons in my data set. I was trying to run a spatial 
autoregressive model (SAR) in spdep to test spatial dependence of my outcome 
variable. After running the model, I wanted to examine the direct/indirect 
impacts, but encountered an error that seems to have something to do with the 
length of neighbors in the weights matrix not being equal to n. I tried running 
the very same equation as SLX model (Spatial Lag X), and impacts() worked fine, 
even though there were some polygons in my set that had no neighbors. 

> # Defining queen contiguity neighbors for polyset and storing the matrix as 
> list
> q.nbrs <- poly2nb(polyset) 
> listweights <- nb2listw(q.nbrs, zero.policy = TRUE)

> # Defining the model
> model.equation <- TIME ~ A + B + C

> # Run SAR model
> reg <- lagsarlm(model.equation, data = polyset, listw = listweights, 
> zero.policy = TRUE)

> # Run impacts() to show direct/indirect impacts
> impacts(reg, listw = listweights, zero.policy = TRUE)

> Error in intImpacts(rho = rho, beta = beta, P = P, n = n, mu = mu, Sigma = 
> Sigma,  : 
    length(listweights$neighbours) == n is not TRUE

What am I doing wrong? I am running Windows 10 machine with R 3.5.3 with the 
most up-to-date spdep package, if it helps.

Thank you very much.

Regards,
Denys Dukhovnov

___
R-sig-Geo mailing list
R-sig-Geo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-geo


[R-sig-Geo] More efficient raster extraction

2015-09-14 Thread Denys Dukhovnov via R-sig-Geo
Hello,
I am trying to extract 300 meter raster cell values over Census block polygons 
for each of the US states. Because 300m cells are much larger than typical 
urban blocks, I had to specify "small" option (as shown below) so that no 
polygon will remain "blank" without a corresponding raster value(s). Blocks 
shapefile for District Columbia with about 6,500 blocks, for example, being the 
smallest state, takes 15 minutes to complete, but New York with almost 350,000 
block polygons would not complete even in 4 days. On a separate occasion I also 
tried using parallel processing (foreach), but this only appears to slow down 
the extract function, rather than speeding it up. I clipped the main raster to 
the extent of each state, but this doesn't help (not much, anyway). In this 
case the vector data are huge, so I would expect any improvement in efficiency 
to come from reducing its burden on the process.

 Please share any advice as to how I can make the extract function faster and 
more efficient. Thank you very much!


Faster method: Without parallel processing:

 BlockIndex <- extract(Raster_300m, Blocks, df=TRUE, small=TRUE)

Indefinitely slow: With parallel processing (8 registered cores):

BlockIndex <- foreach (i=1:length(Blocks$GEOID10), .combine='c', 
.packages="raster") %dopar% {
        extract(Raster_300m, Blocks[i], df=TRUE, small=TRUE)}

Regards, 

Denys Dukhovnov

[[alternative HTML version deleted]]

___
R-sig-Geo mailing list
R-sig-Geo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-geo


[R-sig-Geo] Help needed with extraction of raster statistics by polygons

2015-08-25 Thread Denys Dukhovnov via R-sig-Geo
Hello all! 

I need to spatially calculate 30-meter grid raster mean over US Census blocks 
in North East region (approx. 1.9 mln street blocks/polygons). I came up with 
the script that runs in parallel, but it still takes days to complete, even 
when the i7 CPU is used up to the max. My question: is there any other way to 
improve the processing speed given the set-up below (I included the backbone of 
the script to save space). I am new to R, so any help will be much appreciated. 
Thanks a lot!

Regards,
Denys


library(sp)
library(rgdal)
library(raster)
library(foreach)
library(spatial.tools)

Grid30m - raster(raster.tif)# loading main raster into R
NEfips - c(09, 10, 11, )   # list of state FIPS codes, whose street 
block polygons are to be processed
ShapePath - T:\\...      # path to block shapefiles 

sfQuickInit(cpus=8)     # setting up and registering parallel 
processing back-end

for (x in NEfips) {
    State - paste(block_, x , sep=)
    Blocks - readOGR(dsn=ShapePath, layer=State[1], stringsAsFactors=F)    # 
loading the block polygons
    BlocksReproject - spTransform(Blocks, crs(Grid30m))
    
    GridMean - foreach (i=1:length(BlocksReproject$GEOID10), .combine='c', 
.packages=raster) %dopar% {    # Parallel processing loop for extracting 
mean raster value per street block polygon
           extract(Grid30m, BlocksReproject[i], fun=mean)
    }
    write.table(...)    # Generating output containing the mean raster 
statistics for each block polygon.
}

[[alternative HTML version deleted]]

___
R-sig-Geo mailing list
R-sig-Geo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-geo