Thanks, Josiah and Connor!

We are moving the discussion to: https://github.com/gpiras/sphet/issues/17 for 
those who can participate there - if someone with comments prefers not to write 
there, please continue to follow up here.

Roger

--
Roger Bivand
Emeritus Professor
Norwegian School of Economics
Postboks 3490 Ytre Sandviken, 5045 Bergen, Norway
roger.biv...@nhh.no

________________________________________
From: Josiah Parry <josiah.pa...@gmail.com>
Sent: 05 November 2023 18:01
To: Roger Bivand
Cc: r-sig-geo@r-project.org
Subject: Re: [R-sig-Geo] spdep: new zero.policy attribute

My take is generally that less is more. In the case of an isolated node, I 
think the best thing to do is to return NA rather than 0. For example consider 
administrative boundaries where the lagged variable is median household income. 
If we returned a 0 in that case, we'd likely be introducing quite the outlier!

My preference would be to _always_ default to NA lagged values when no 
neighbors are present. Additionally, it would be quite nice to instruct users 
on how to impute these values with other lagged variables. Say we have an 
isolated node, but we don't want an NA value. We can impute that missing value 
with the spatial lag of the variable using a different neighborhood 
construction, e.g. using KNN with k = 3 to ensure that the node always has 
values to lag.
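
A minimal sketch of the point above, in plain Python rather than spdep code. It 
assumes neighbour sets are stored as lists of indices (an empty list for an 
isolated node), uses a row-standardised lag (the mean of the neighbours' 
values), and lets None play the role of NA. The helper names lag and knn and 
the toy income figures are hypothetical, chosen only for illustration.

```python
from math import dist


def lag(values, neighbours, empty=None):
    """Row-standardised spatial lag: mean of each node's neighbour values.

    `empty` is what an isolated node receives: None (standing in for NA) or 0.0.
    """
    out = []
    for nbrs in neighbours:
        if nbrs:
            out.append(sum(values[j] for j in nbrs) / len(nbrs))
        else:
            out.append(empty)
    return out


def knn(coords, k):
    """Indices of the k nearest other points for every point (hypothetical helper)."""
    result = []
    for i, p in enumerate(coords):
        others = sorted((j for j in range(len(coords)) if j != i),
                        key=lambda j: dist(p, coords[j]))
        result.append(others[:k])
    return result


# Toy data: node 3 shares no boundary with the others, so it is an island.
income = [52.0, 48.0, 61.0, 55.0]   # made-up median household income, thousands
coords = [(0, 0), (1, 0), (0, 1), (5, 5)]
contiguity = [[1, 2], [0, 2], [0, 1], []]

lag_na = lag(income, contiguity)                # island gets None
lag_zero = lag(income, contiguity, empty=0.0)   # island gets 0.0, far below the rest

# Fill only the missing lags from a k = 3 nearest-neighbour lag.
lag_knn = lag(income, knn(coords, 3))
lag_imputed = [k if c is None else c for c, k in zip(lag_na, lag_knn)]
```

With these numbers the contiguity lags of the connected nodes lie between 50 
and 57, so a lag of 0 for the island would be an outlier, while the k = 3 lag 
gives it the mean of the other three values.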

Here is an example gist imputing a k = 3 lag: 
https://gist.github.com/JosiahParry/eb7878fc375fb931ddd6675a2c591a2b

Another option could be to use the focal feature's value as the lag itself. If 
a location has no neighborhood, could we argue that the location is its own 
neighborhood?

I'm not too familiar with the models to comment on them, though. I do see this 
more as a pre-processing / data imputation issue. I suspect that's more of a 
machine learning paradigm though!


On Sun, Nov 5, 2023 at 10:31 AM Roger Bivand <roger.biv...@nhh.no> wrote:
And a question: in nb2listw() and similar functions creating spatial weights 
listw objects, would it be sensible to guess that the presence of no-neighbour 
observations in the input nb neighbour object implies the choice of a spatially 
lagged value of zero (zero.policy=TRUE), lx = Wx, rather than NA 
(zero.policy=FALSE)?

That is, use zero.policy=any(card(nb) == 0L) by default, rather than 
zero.policy=NULL followed by a lookup of the spdep option, which is set to 
FALSE on package load but can be changed by the user?
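
The proposed default can be sketched as follows. This is an illustration in 
plain Python, not spdep code: it assumes neighbour sets are lists of indices 
with an empty list for a no-neighbour observation, and the function names are 
hypothetical stand-ins for the R expression any(card(nb) == 0L).

```python
def card(nb):
    """Number of neighbours of each observation."""
    return [len(nbrs) for nbrs in nb]


def default_zero_policy(nb):
    """Proposed default: True only if at least one observation has no neighbours."""
    return any(c == 0 for c in card(nb))


connected = [[1], [0, 2], [1]]   # every observation has a neighbour
with_island = [[1], [0], []]     # the third observation is isolated

policy_connected = default_zero_policy(connected)
policy_island = default_zero_policy(with_island)
```

Under this rule the choice is inferred from the data instead of being stated by 
the analyst.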

Would this be taking helpfulness too far, given that the analyst is creating 
the neighbour object and presumably should take responsibility for the choices 
made?

Context: polygons not sharing boundaries with other polygons do exist 
legitimately in data sources, but setting spatially lagged values to zero for 
those polygons is quite an invasive imputation. It may be better to oblige the 
user to make the choice when the spatial weights listw object is created.

Little is known about the problem; for a recent treatment for CAR models see 
https://arxiv.org/abs/1705.04854, published as 
https://doi.org/10.1016/j.sste.2018.04.002, where: "The specification of a CAR 
model on a disconnected graph is undefined ... [t]here are essentially two 
types of disconnected graphs: first, a graph containing an island (a singleton 
node with no neighbours), second, a graph split in different sub-graphs (each 
of them being a connected graph)".

This question concerns the former, singleton, case, but adding sub-graph counts 
(if greater than unity) to summary.nb and print.nb addresses the second. Very 
possibly, functions creating nb neighbour objects should themselves report that 
an output object (graph) is not connected; bigDM, CARBayes, CARBayesST, 
geostan, spatialreg and stampr do call spdep::n.comp.nb themselves to check the 
subgraph count.
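
Counting subgraphs amounts to labelling the connected components of the 
neighbour graph. The sketch below does this with a breadth-first search in 
plain Python. It is not the implementation of spdep::n.comp.nb; it assumes 
symmetric neighbour lists of indices, and the name n_components is 
hypothetical.

```python
from collections import deque


def n_components(neighbours):
    """Label connected subgraphs by breadth-first search.

    Returns (count, labels); an island forms a subgraph of size one.
    Assumes symmetric neighbour lists (if j is in i's list, i is in j's).
    """
    labels = [None] * len(neighbours)
    count = 0
    for start in range(len(neighbours)):
        if labels[start] is not None:
            continue  # already reached from an earlier start node
        labels[start] = count
        queue = deque([start])
        while queue:
            i = queue.popleft()
            for j in neighbours[i]:
                if labels[j] is None:
                    labels[j] = count
                    queue.append(j)
        count += 1
    return count, labels


# Two connected pairs plus one island: three subgraphs in total.
nb = [[1], [0], [3], [2], []]
count, labels = n_components(nb)
is_connected = count == 1
```

A count greater than one covers both cases in the quotation: islands and larger 
disconnected sub-graphs.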

Interested in feedback,

Roger

--
Roger Bivand
Emeritus Professor
Norwegian School of Economics
Postboks 3490 Ytre Sandviken, 5045 Bergen, Norway
roger.biv...@nhh.no

________________________________________
From: R-sig-Geo <r-sig-geo-boun...@r-project.org> on behalf of Roger Bivand 
<roger.biv...@nhh.no>
Sent: 04 November 2023 18:53
To: r-sig-geo@r-project.org
Subject: [R-sig-Geo] spdep: new zero.policy attribute

In forthcoming spdep 1.3-1, spatial weight listw objects get a new zero.policy 
attribute. The attribute is added as objects are created to record the status 
of the zero.policy argument in the function creating the object, see: 
https://github.com/r-spatial/spdep/commit/e159de922c61713529a4075b0dfc2966eb8f9ad6

Reverse dependency checks only show problems from over-eager unit testing in 
SpatialFeatureExperiment, a Bioconductor package, but other workflows may be 
impacted. The new attribute is used in tests for spatial autocorrelation to set 
the zero.policy argument in those tests (the arguments were zero.policy=NULL 
and are now zero.policy=attr(listw, "zero.policy"), where listw is the spatial 
weights object passed to the test function).
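
The pattern can be sketched in plain Python, as an illustration only and not as 
spdep code: the weights object records the zero-policy setting at creation, and 
a downstream function falls back to that recorded value unless the caller 
overrides it. The names Listw, make_listw and autocorrelation_test are 
hypothetical, and a None sentinel stands in for R's ability to default one 
argument to an attribute of another.

```python
from dataclasses import dataclass


@dataclass
class Listw:
    neighbours: list
    zero_policy: bool  # recorded when the weights object is created


def make_listw(neighbours, zero_policy=False):
    """Create a weights object and store the zero-policy choice on it."""
    return Listw(neighbours, zero_policy)


def autocorrelation_test(x, listw, zero_policy=None):
    """Stand-in for a test function; returns only the resolved setting."""
    if zero_policy is None:
        zero_policy = listw.zero_policy  # default taken from the object
    return zero_policy


w = make_listw([[1], [0], []], zero_policy=True)
inherited = autocorrelation_test([1.0, 2.0, 3.0], w)
overridden = autocorrelation_test([1.0, 2.0, 3.0], w, zero_policy=False)
```

The choice is then made once, when the weights are built, and does not need to 
be repeated in every later call.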

This will be extended to spatialreg and friends if nobody reports negative 
impacts here soon. I'll wait before releasing 1.3-1 for a few days to see if 
any feedback is forthcoming.

Hope this long-overdue change is helpful,

Roger

--
Roger Bivand
Emeritus Professor
Norwegian School of Economics
Postboks 3490 Ytre Sandviken, 5045 Bergen, Norway
roger.biv...@nhh.no
_______________________________________________
R-sig-Geo mailing list
R-sig-Geo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-geo
