Hi, I am interested in calculating multiple statistics based on skater{spdep} results for a SpatialPointsDataFrame, and I was wondering if someone could help me verify that what I have done is correct (Q1).
My objective is to evaluate the clustering performance of several skater() runs that use different parameters. Specifically, I am not sure how to measure the within-group similarity; I believe the other statistics are defined correctly. Also, can someone provide more details on the objects "not.prune" and "candidates" (Q2)?

Q1 ------------------------------

These are the statistics that I would like to calculate:

res1 <- skater() # Example of skater object

# The sum of the between-group dissimilarity
sst <- res1$ssto

# The within-group similarity
sse <- sum(res1$ssw)/max(res1$groups)

# R2
R2 <- (sst - sse)/sst

# AIC, AICc
# AIC = n*log(SSD/n) + 2*cov_count
# AICc = AIC + 2*cov_count*(cov_count+1)/(n-cov_count-1)
cov_count <- 1 # Number of covariates considered by skater and provided in data
n_count <- nrow(shape2) # Node count
aic <- n_count * log(sst/n_count) + 2.0 * cov_count
aicc <- aic + 2.0 * cov_count * (cov_count + 1.0)/(n_count - cov_count - 1.0)

# Calinski-Harabasz pseudo F-statistic
nc <- max(res1$groups)
n <- nrow(shape2)
fstat <- (R2 / (nc - 1)) / ((1 - R2) / (n - nc))

# Review
print(c(aic, aicc, fstat, R2))

Q2 ------------------------------

Define "not.prune" and "candidates".

For example, are candidates a list of cluster groups that are statistically significant, while not.prune is a list of nodes that did not get assigned to a group? I have not been able to locate enough documentation on these objects and I am not sure how to interpret them.

Thank you for your assistance,
Mike

_______________________________________________
R-sig-Geo mailing list
R-sig-Geo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-geo
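
For reference on Q1, here is a minimal base-R sketch of the statistics I have in mind, computed on a made-up grouping rather than on a skater result. The data, the group labels, and the variable names are all hypothetical. It assumes that the within-group quantity is the sum of squared deviations from each group's own mean, and that the AIC uses that within-group sum rather than the total; whether the skater components match these definitions is exactly what I am unsure about.

```r
# Toy data: six observations of one covariate with a made-up group label.
# These values are illustrative only; they are not skater() output.
x <- c(1, 2, 3, 10, 11, 12)
groups <- c(1, 1, 1, 2, 2, 2)

n <- length(x)                 # number of observations (nodes)
nc <- length(unique(groups))   # number of groups

# Total sum of squares around the overall mean
sst <- sum((x - mean(x))^2)

# Within-group sum of squares: deviations from each group's own mean
sse <- sum(tapply(x, groups, function(v) sum((v - mean(v))^2)))

# Share of the total dissimilarity explained by the grouping
R2 <- (sst - sse) / sst

# Calinski-Harabasz pseudo F-statistic
fstat <- (R2 / (nc - 1)) / ((1 - R2) / (n - nc))

# AIC and AICc, assuming the within-group sum of squares plays the role of SSD
cov_count <- 1
aic <- n * log(sse / n) + 2 * cov_count
aicc <- aic + 2 * cov_count * (cov_count + 1) / (n - cov_count - 1)

print(c(sst = sst, sse = sse, R2 = R2, fstat = fstat, aic = aic, aicc = aicc))
```

With definitions like these, the total stays fixed for a given data set, so only the within-group term and the number of groups would change between runs.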