Re: [R-sig-Geo] Comparing distance among point pattern events

2020-01-09 Thread ASANTOS via R-sig-Geo
Dear R-Sig-Geo Members,

I have a hypothetical situation with three point patterns (A, B and C), and my 
question is: which point pattern (B or C) is closer to A?

For this problem, I made a simple example:

library(spatstat)
set.seed(2023)
A <- rpoispp(30) ## First event
B <- rpoispp(30) ## Second event
C <- rThomas(10,0.02,5) ## Third event with Thomas cluster process
plot(A, pch=16)
plot(B, col="red", add=TRUE)
plot(C, col="blue", add=TRUE)

First, I take the distances between pairs of events:

ABd<-crossdist(A, B)
ACd<-crossdist(A, C)

mean(ABd)
# 0.4846027
mean(ACd)
# 0.5848766



# test the hypothesis that mean(ABd) equals mean(ACd) (code courtesy of Sarah Goslee)

nperm <- 999

permout <- data.frame(ABd = rep(NA, nperm), ACd = rep(NA, nperm))

# create framework for a random assignment of B and C to the existing points

BC <- superimpose(B, C)
B.len <- npoints(B)
C.len <- npoints(C)
B.sampvect <- c(rep(TRUE, B.len), rep(FALSE, C.len))

set.seed(2023)
for(i in seq_len(nperm)) {
     B.sampvect <- sample(B.sampvect)
     B.perm <- BC[B.sampvect]
     C.perm <- BC[!B.sampvect]

     permout[i, ] <- c(mean(crossdist(A, B.perm)), mean(crossdist(A, C.perm)))
}


boxplot(permout$ABd - permout$ACd)
points(1, mean(ABd) - mean(ACd), col="red")

table(abs(mean(ABd) - mean(ACd)) >= abs(permout$ABd - permout$ACd))
#TRUE
# 999

sum(abs(mean(ABd) - mean(ACd)) >= abs(permout$ABd - permout$ACd)) / nperm
# [1] 1


The difference between ABd and ACd is distinguishable from that obtained by a 
random resampling of B and C, so B (mean distance 0.4846027) is closer to A 
than C (0.5848766).
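
A note on reading the fraction printed above: it counts the permutations in which the observed absolute difference is at least as large as the permuted one, so a value near 1 means the observed difference is unusually large. A conventional two-sided permutation p-value counts in the other direction. A minimal base-R sketch follows, assuming a hypothetical helper perm_pvalue() (not part of spatstat) and toy numbers rather than the values from this thread:

```r
# Hypothetical helper: two-sided permutation p-value.
# obs  = observed difference of means
# perm = vector of differences obtained under random relabelling
perm_pvalue <- function(obs, perm) {
  perm <- perm[!is.na(perm)]
  # The observed value counts as one of the arrangements,
  # hence the +1 in numerator and denominator.
  (1 + sum(abs(perm) >= abs(obs))) / (length(perm) + 1)
}

# Toy values, for illustration only
obs_diff  <- 0.5
perm_diff <- c(-0.1, 0.2, 0.05, -0.3)
perm_pvalue(obs_diff, perm_diff)
# 0.2
```

With the objects from the example above, the call would be perm_pvalue(mean(ABd) - mean(ACd), permout$ABd - permout$ACd).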


But now I compare the mean nearest-neighbour distance, i.e. the minimum 
distance from each point of A to the other pattern:

marks(A)<-as.factor("A")
marks(B)<-as.factor("B")
marks(C)<-as.factor("C")

# distance to nearest neighbour A to B
nnda <- nncross(A,B, by=marks(A,B))

# mean nearest neighbour distances
mean(nnda[,1])
#[1] 0.09847543

# distance to nearest neighbour A to C
nndb <- nncross(A,C, by=marks(A,C))

# mean nearest neighbour distances
mean(nndb[,1])
#[1] 0.151127

# test again the hypothesis that ABd is equal to ACd

nperm <- 999

permout <- data.frame(ABd = rep(NA, nperm), ACd = rep(NA, nperm))

# create framework for a random assignment of B and C to the existing points

BC <- superimpose(B, C)
B.len <- npoints(B)
C.len <- npoints(C)
B.sampvect <- c(rep(TRUE, B.len), rep(FALSE, C.len))

set.seed(2023)
for(i in seq_len(nperm)) {
 B.sampvect <- sample(B.sampvect)
 B.perm <- BC[B.sampvect]
 C.perm <- BC[!B.sampvect]
 ab<-nncross(A, B.perm)
 ac<-nncross(A, C.perm)

 permout[i, ] <- c(mean(ab[,1]), mean(ac[,1]))
}


boxplot(permout$ABd - permout$ACd)
points(1, mean(nnda[,1]) - mean(nndb[,1]), col="red")

table(abs(mean(nnda[,1]) - mean(nndb[,1])) >= abs(permout$ABd - permout$ACd))
#FALSE  TRUE
#   91   908

sum(abs(mean(nnda[,1]) - mean(nndb[,1])) >= abs(permout$ABd - permout$ACd)) / nperm
#[1] 0.9089089


Now I reach the same conclusion, i.e. the mean nearest-neighbour distance from 
A to B (0.09847543) is smaller than from A to C (0.151127), but it is not so 
clear to me which is the better approach: comparing the crossdist() and 
nncross() results, which one gives a good answer to my question?
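
To make the two summaries concrete, here is a minimal base-R sketch using plain coordinate matrices instead of spatstat objects (the names a, b and cross_d are made up for illustration). The mean of the full cross-distance matrix averages every A-to-B pair, including the many distant ones, while the nearest-neighbour summary keeps only the closest B point for each A point:

```r
set.seed(1)
# Toy coordinates in the unit square, standing in for patterns A and B
a <- cbind(x = runif(30), y = runif(30))
b <- cbind(x = runif(30), y = runif(30))

# Cross-distance matrix: rows are points of a, columns are points of b
# (the quantity crossdist(A, B) returns for ppp objects)
cross_d <- sqrt(outer(a[, "x"], b[, "x"], "-")^2 +
                outer(a[, "y"], b[, "y"], "-")^2)

# Average over all A-B pairs
mean_all <- mean(cross_d)

# Average of the row minima: for each point of a, the distance to its
# nearest point of b (the distance column of nncross(A, B))
mean_nn <- mean(apply(cross_d, 1, min))

c(all_pairs = mean_all, nearest = mean_nn)
```

The two numbers answer different questions (overall separation versus local proximity), so which one is appropriate depends on what "closer" should mean in the application.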

Any conceptual tips?

Thanks in advance,

-- 
Alexandre dos Santos
Geotechnologies and Spatial Statistics applied to Forest Entomology
Instituto Federal de Mato Grosso (IFMT) - Campus Caceres
Caixa Postal 244 (PO Box)
Avenida dos Ramires, s/n - Distrito Industrial
Caceres - MT - CEP 78.200-000 (ZIP code)
Phone: (+55) 65 99686-6970 / (+55) 65 3221-2674
Lattes CV: http://lattes.cnpq.br/1360403201088680
OrcID: orcid.org/-0001-8232-6722
ResearchGate: www.researchgate.net/profile/Alexandre_Santos10
Publons: https://publons.com/researcher/3085587/alexandre-dos-santos/
--


Re: [R-sig-Geo] Comparing distance among point pattern events

2019-11-22 Thread Sarah Goslee
Hi,

Great question, and clear example.

The first problem: you wrote
ACd<-pairdist(A) instead of ACd <- pairdist(AC)

BUT

pairdist() is the wrong function: it returns the distances between ALL
pairs of points, A to A and C to C as well as A to C, so its mean mixes
within-pattern and between-pattern distances.

You need crossdist() instead.
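
A minimal base-R sketch of that difference, using toy coordinate matrices rather than ppp objects (a, c_pts, all_d and cross are hypothetical names for this illustration):

```r
set.seed(1)
a     <- cbind(runif(5), runif(5))   # 5 toy points standing in for A
c_pts <- cbind(runif(4), runif(4))   # 4 toy points standing in for C

# All pairwise distances in the combined pattern: a 9 x 9 matrix that
# contains A-A, C-C and A-C blocks plus a zero diagonal
# (analogous to pairdist(superimpose(A, C)))
all_d <- as.matrix(dist(rbind(a, c_pts)))

# Only the A-to-C block: a 5 x 4 matrix (analogous to crossdist(A, C))
cross <- all_d[1:5, 6:9]

dim(all_d)
dim(cross)
mean(all_d)   # includes within-pattern distances and the zero diagonal
mean(cross)   # A-to-C distances only
```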

The most flexible approach is to roll your own permutation test. That
will work even if B and C are different sizes, etc. If you specify the
problem more exactly, there are probably parametric tests, but I like
permutation tests.


library(spatstat)
set.seed(2019)
A <- rpoispp(100) ## First event
B <- rpoispp(50) ## Second event
C <- rpoispp(50) ## Third event
plot(A, pch=16)
plot(B, col="red", add=TRUE)
plot(C, col="blue", add=TRUE)

ABd<-crossdist(A, B)
ACd<-crossdist(A, C)

mean(ABd)
# 0.5168865
mean(ACd)
# 0.5070118


# test the hypothesis that ABd is equal to ACd

nperm <- 999

permout <- data.frame(ABd = rep(NA, nperm), ACd = rep(NA, nperm))

# create framework for a random assignment of B and C to the existing points

BC <- superimpose(B, C)
B.len <- npoints(B)
C.len <- npoints(C)
B.sampvect <- c(rep(TRUE, B.len), rep(FALSE, C.len))

set.seed(2019)
for(i in seq_len(nperm)) {
    B.sampvect <- sample(B.sampvect)
    B.perm <- BC[B.sampvect]
    C.perm <- BC[!B.sampvect]

    permout[i, ] <- c(mean(crossdist(A, B.perm)), mean(crossdist(A, C.perm)))
}


boxplot(permout$ABd - permout$ACd)
points(1, mean(ABd) - mean(ACd), col="red")

table(abs(mean(ABd) - mean(ACd)) >= abs(permout$ABd - permout$ACd))
# FALSE  TRUE
#  573   426

sum(abs(mean(ABd) - mean(ACd)) >= abs(permout$ABd - permout$ACd)) / nperm
# 0.4264264

The difference between ACd and ABd is indistinguishable from that
obtained by a random resampling of B and C.


Sarah

On Fri, Nov 22, 2019 at 8:26 AM ASANTOS via R-sig-Geo wrote:
>
> Dear R-Sig-Geo Members,
>
> I have the hypothetical point process situation:
>
> library(spatstat)
> set.seed(2019)
> A <- rpoispp(100) ## First event
> B <- rpoispp(50) ## Second event
> C <- rpoispp(50) ## Third event
> plot(A, pch=16)
> plot(B, col="red", add=T)
> plot(C, col="blue", add=T)
>
> I'd like to know an adequate spatial approach for comparing whether, on
> average, event B or C is closer to A. For this, I tried:
>
> AB<-superimpose(A,B)
> ABd<-pairdist(AB)
> AC<-superimpose(A,C)
> ACd<-pairdist(A)
> mean(ABd)
> #[1] 0.5112954
> mean(ACd)
> #[1] 0.5035042
>
> With this naive approach, I concluded that event C is closer to A
> than B. Is this enough for a final conclusion, or is a more robust
> analysis possible?
>
> Thanks in advance,
>
> Alexandre
>

-- 
Sarah Goslee (she/her)
http://www.numberwright.com

___
R-sig-Geo mailing list
R-sig-Geo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-geo