Hi,

I am interested in calculating multiple statistics based on skater{spdep}
results for a SpatialPointsDataFrame, and I was wondering if someone could
help me verify that what I have done is correct (Q1).

My objective is to evaluate the performance of the clustering across
skater() runs that use different parameters. Specifically, I am not sure
how to measure the within-group similarity; I believe the other statistics
are defined correctly.

Also, can someone provide more details on the objects "not.prune" and
"candidates" (Q2)?

Q1 ------------------------------
These are the statistics that I would like to calculate:
res1 <- skater() # Example of skater object

# The total dissimilarity in the full graph (ssto)
sst <- res1$ssto

# The within-group similarity
sse <- sum(res1$ssw)/max(res1$groups)

# R2
R2 <- (sst-sse)/sst

# AIC, AICc
# AIC = n*log(SSD/n) + 2*cov_count
# AICc = AIC + 2*cov_count*(cov_count+1)/(n-cov_count-1)
cov_count <- 1 # Number of covariates considered by skater and provided in data
n_count <- nrow(shape2) # Node count
# Parenthesized to match the formula above: log(SSD/n), not log(SSD)/n
aic <- n_count * log(sst / n_count) + 2.0 * cov_count
aicc <- aic + 2.0 * cov_count * (cov_count + 1.0) / (n_count - cov_count - 1.0)

# Calinski-Harabasz pseudo F-statistic
nc <- max(res1$groups)
n <- nrow(shape2)
fstat <- (R2 / (nc - 1)) / ((1 - R2) / (n - nc))

# Review
print(c(aic, aicc, fstat, R2))
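
As a cross-check, here is a minimal base-R sketch of how the total and within-group quantities could be computed directly from the data and a group vector. It assumes classical sums of squared deviations from the (group) means, which may not match the dissimilarity that skater() uses internally. The names x, groups, ss and cluster_stats are hypothetical, and the toy data are made up for illustration only.

```r
# Toy data (made up): 6 observations, 1 covariate, 2 hypothetical groups.
x <- matrix(c(1, 2, 3, 10, 11, 12), ncol = 1)
groups <- c(1, 1, 1, 2, 2, 2)

# Sum of squared deviations of the rows of m from their column means.
ss <- function(m) sum(sweep(m, 2, colMeans(m))^2)

# Hypothetical helper: total SS, pooled within-group SS, R2 and the
# Calinski-Harabasz pseudo F-statistic for a given grouping.
cluster_stats <- function(x, groups) {
  x <- as.matrix(x)
  n <- nrow(x)
  nc <- length(unique(groups))
  sst <- ss(x)
  # Within-group SS: sum over groups of the SS around each group's own mean.
  ssw <- sum(vapply(split(seq_len(n), groups),
                    function(idx) ss(x[idx, , drop = FALSE]),
                    numeric(1)))
  r2 <- (sst - ssw) / sst
  fstat <- (r2 / (nc - 1)) / ((1 - r2) / (n - nc))
  c(sst = sst, ssw = ssw, R2 = r2, pseudoF = fstat)
}

stats <- cluster_stats(x, groups)
print(stats)
```

In this sketch the within-group term is the pooled sum over groups and is not divided by the number of groups, so that R2 = (sst - ssw)/sst is the usual between-group share of the total.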

Q2 ------------------------------
Define "not.prune" and "candidates"

For example, is "candidates" a list of cluster groups that are statistically
significant, while "not.prune" is a list of nodes that did not get assigned
to a group? I have not been able to locate enough documentation on these
objects, and I am not sure how to interpret them.

Thank you for your assistance,
Mike


_______________________________________________
R-sig-Geo mailing list
R-sig-Geo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-geo
