I have a sf object with 2 columns, named fclass and geometry. I am interested in the fclass column. The fclass has several unique values. I convert this sf object to spatraster, I resample it and I perform a linear regression where I use another spatraster, called ntl, as my response (lm(ntl ~ road)).
My goal is to find which unique values give the largest r-squared in the lm model using stepwise linear regression whith backward elimination. My initial try was: library(pacman) pacman::p_load(terra, sf, dplyr) ntl <- rast("path/ntl.tif") # response v <- st_read("path/road17.shp") # sf object # baseline r2 vterra <- vect(v) ref <- rast("path/pop.tif") # get ext and pixel size r <- rast(v, res = res(ref), ext = ext(ref)) x <- rasterizeGeom(vterra, r, "length", "m") x_res <- resample(x, ntl, "average") s <- c(ntl, x_res) names(s) <- c("ntl", "road") m <- lm(ntl ~ road, s, na.action = na.exclude) baseline <- sqrt(summary(m)$adj.r.squared) orig_baseline <- sqrt(summary(m)$adj.r.squared) classes <- unique(v$fclass) inclasses <- unique(v$fclass) i <- 1 while (i <= length(classes)) { class <- classes[i] print(paste0("current class ", class)) print(paste0("orig baseline: ", orig_baseline, " - baseline: ", baseline)) print(classes) v_filtered <- v[v$fclass != class, ] vterra <- vect(v_filtered) r <- rast(v, res = res(ref), ext = ext(ref)) x <- rasterizeGeom(vterra, r, "length", "m") x_res <- resample(x, ntl, "average") s <- c(ntl, x_res) names(s) <- c("ntl", "road") m <- lm(ntl ~ road, s, na.action = na.exclude) class_r2 <- sqrt(summary(m)$adj.r.squared) if (class_r2 > baseline) { classes <- classes[-i] baseline <- class_r2 } else { print("class_r2 is less than baseline") print(paste0("class_r2: ", class_r2, " - baseline: ", baseline)) i <- i + 1 } } but it's not efficient (it takes ~ 5 minutes). I have found the MASS package which has the lmStepAIC function but I am not sure how to select the unique values from the fclass column to use as predictiors. head(v, 6) Simple feature collection with 6 features and 1 field Geometry type: MULTILINESTRING Dimension: XY Bounding box: xmin: 598675.9 ymin: 7111459 xmax: 609432.8 ymax: 7118729 Projected CRS: WGS 84 / UTM zone 35S fclass geometry 1 secondary MULTILINESTRING ((598675.9 ... 2 secondary MULTILINESTRING ((600641.7 ... 3 residential MULTILINESTRING ((601734.8 ... 4 residential MULTILINESTRING ((601163.9 ... 5 residential MULTILINESTRING ((601104.2 ... 6 motorway_link MULTILINESTRING ((609432.8 ... The unique values in the fclass column: unique(v$fclass) [1] "secondary" "residential" "motorway_link" "service" "primary" "unclassified" "motorway" [8] "tertiary" "trunk" "primary_link" "footway" "track" "secondary_link" "unknown" [15] "living_street" "pedestrian" "path" "bridleway" "steps" "trunk_link" "track_grade1" [22] "track_grade3" "track_grade5" "cycleway" "track_grade4" "tertiary_lin The dataset can be downloaded from my GoogleDrive <https://drive.google.com/drive/folders/1a0Khhej4koA0uhYTrQmqmMg02qfRWbjt?usp=drive_link> . Thank you. -- Tziokas Nikolaos Cartographer Tel:(+44)07561120302 LinkedIn <http://linkedin.com/in/nikolaos-tziokas-896081130> [[alternative HTML version deleted]] _______________________________________________ R-sig-Geo mailing list R-sig-Geo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-geo