On Tue, 3 Dec 2019, Chanda Chiseni wrote:

Hi Roger

Thank you for your very helpful feedback. I was indeed treating my point
data as polygons and did not impose a distance threshold. Essentially, as
you stated, many observations had many neighbors. I have since tried
K-nearest neighbors with a restriction of k=4. However, this is still
taking a bit long.

library(spdep)  # provides knearneigh()

# Increase the memory limit (Windows only)
memory.limit(size = 80000)

# Define the data
censusdata <- CensusFinal_Analysis_R1

# Create a matrix of coordinates
sp_point <- cbind(censusdata$X, censusdata$Y)
colnames(sp_point) <- c("Long", "Lat")
head(sp_point)

# Find the 4 nearest neighbours (longlat = TRUE: Great Circle distances)
censusdata.4nn <- knearneigh(sp_point, k = 4, longlat = TRUE)

Don't use geographical coordinates. Project first; K-nearest neighbours then uses RANN, which is fast (Euclidean rather than Great Circle distances).
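
A minimal sketch of the "project first" advice, assuming the sf package is available alongside spdep. The data frame pts is a hypothetical stand-in for the census coordinates, and EPSG:32735 (UTM zone 35S) is only an assumed example of a projected CRS; a CRS suited to the actual study area should be used instead.

```r
library(sf)
library(spdep)

# Hypothetical stand-in for the census coordinates (degrees of longitude/latitude).
set.seed(1)
pts <- data.frame(X = runif(1000, 22, 33), Y = runif(1000, -18, -8))

# Declare the coordinates as WGS84 (EPSG:4326), then project to a metric CRS.
# EPSG:32735 is an assumption for illustration, not taken from the thread.
pts_sf <- st_as_sf(pts, coords = c("X", "Y"), crs = 4326)
pts_proj <- st_transform(pts_sf, 32735)

# Pass the planar coordinates as a plain matrix and leave longlat unset,
# so that Euclidean rather than Great Circle distances are used.
xy <- st_coordinates(pts_proj)
knn4 <- knearneigh(xy, k = 4)

# Convert to an nb object and then to row-standardised weights.
nb4 <- knn2nb(knn4)
lw4 <- nb2listw(nb4, style = "W")
```

The only change relative to the code above is that the coordinates are projected before knearneigh() is called and longlat = TRUE is dropped.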

Roger


I get stuck at the stage where I try to create the K nearest neighbors; the
operation is quite slow. Am I still doing something wrong?


Kind Regards,

Michael Chanda Chiseni

Phd Candidate

Department of Economic History

Lund University

Visiting address: Alfa 1, Scheelevägen 15 B, 22363 Lund



*Africa is not poor, it is poorly managed (Ellen Johnson-Sirleaf ). *






On Mon, Dec 2, 2019 at 1:00 PM Roger Bivand <roger.biv...@nhh.no> wrote:

On Mon, 2 Dec 2019, Chanda Chiseni wrote:

I am currently working with census data covering about 758 000
individuals. I am trying to create a spatial weights matrix using the X-Y
coordinates of their place of birth. However, I am running into problems
when I try to create the nb type weights matrix using poly2nb: R takes a
very long time and eventually crashes. I have increased R's memory size
to about 80000 but this is still not working.

Please provide the (shortened) code used. poly2nb() is used for polygons,
not points. If you were using distances between points, you may have used
a distance threshold such that many observations have many neighbours.
Also ask yourself whether this is not a multi-level problem, in that
spatial interactions perhaps occur between aggregates of observations, not
the observations themselves.
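
One way to read the multi-level suggestion is to define neighbours between distinct places of birth rather than between individuals. The sketch below uses made-up data (20 individuals, 6 places, projected coordinates in metres), and every object name in it is hypothetical; it is not the approach settled on in the thread.

```r
library(spdep)

# Hypothetical data: 20 individuals born in only 6 distinct places.
set.seed(42)
places <- cbind(x = runif(6, 0, 1e5), y = runif(6, 0, 1e5))
place_of <- sample(rep_len(1:6, 20))   # place index for each individual
indiv <- places[place_of, ]

# Collapse to unique locations and record which one each individual maps to.
key <- paste(indiv[, "x"], indiv[, "y"])
first <- !duplicated(key)
uniq <- indiv[first, , drop = FALSE]
place_id <- match(key, key[first])

# Neighbours are defined between places, not between individuals.
place_nb <- knn2nb(knearneigh(uniq, k = 2))

# Number of individuals per place, e.g. for use as a group size.
n_per_place <- table(place_id)
```

Collapsing to unique locations also avoids having many individuals with identical coordinates, for which the k nearest neighbours are not uniquely defined.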


Is there a way I can get around this problem? If anyone has any ideas on
how I can create a spatial weights matrix for such a large data set, please
help.

An nb object (like a listw object) is just a list of length n, so a
neighbour object with 800K observations and 4 neighbours each only takes
about 13MB, and the listw takes 38MB. What you can use them for may be
another problem, and much of the data may actually simply be noise not signal.
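
The memory footprint can be checked directly with object.size(). The sketch below uses an assumed n of 10000 random planar points so that it runs quickly; it makes no claim about the exact sizes for the real data.

```r
library(spdep)

set.seed(1)
n <- 10000                        # assumed size, kept small for a quick check
xy <- cbind(runif(n), runif(n))   # hypothetical planar coordinates

nb <- knn2nb(knearneigh(xy, k = 4))
lw <- nb2listw(nb, style = "W")

# Both objects are lists of length n, so for fixed k their size
# grows roughly linearly with the number of observations.
length(nb)
format(object.size(nb), units = "MB")
format(object.size(lw), units = "MB")
```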

Roger




_______________________________________________
R-sig-Geo mailing list
R-sig-Geo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-geo


--
Roger Bivand
Department of Economics, Norwegian School of Economics,
Helleveien 30, N-5045 Bergen, Norway.
voice: +47 55 95 93 55; e-mail: roger.biv...@nhh.no
https://orcid.org/0000-0003-2392-6140
https://scholar.google.no/citations?user=AWeghB0AAAAJ&hl=en
