Hello, all,

I'm trying to process the names of the variables in the US Census database that I'm retrieving with tidycensus. My end goal is to produce nicely formatted tables with natural labels.
The labels as downloaded from the US Census look like this:

library(tidyverse)
library(tidycensus)

## Get the P1 table for block group 3 in census tract 2711.01:
bg3_race <- get_decennial(
  geography = "block group",
  state = "MD",
  county = "Baltimore city",
  table = "P1",
  cache_table = TRUE,
  year = "2020",
  sumfile = "pl") %>%
  filter(substr(GEOID, 6, 12) == "2711013")

## Load the names and labels of the variables:
pl_vars <- load_variables(year = "2020", dataset = "pl", cache = TRUE)

## Join the labels to the variables, and drop the zero counts
bg3_race_sum <- bg3_race %>%
  left_join(pl_vars, by = c("variable" = "name")) %>%
  filter(value > 0) %>%
  select(c(GEOID, value, label))

head(bg3_race_sum$label)
[1] " !!Total:"
[2] " !!Total:!!Population of one race:"
[3] " !!Total:!!Population of one race:!!White alone"
[4] " !!Total:!!Population of one race:!!Black or African American alone"
[5] " !!Total:!!Population of one race:!!American Indian and Alaska Native alone"
[6] " !!Total:!!Population of one race:!!Asian alone"

I think my algorithm for the labels is:

1. keep everything after the last "!!", up to and including the last character
2. for everything remaining, replace each "!!.*:" group with a single space.

This turns the head() output into:

"Total:"
" Population of one race:"
"  White alone"
"  Black or African American alone"
"  American Indian and Alaska Native alone"
"  Asian alone"

[the indentation may not be clearly visible if not rendered in a monospaced font]

I think that I need lapply here, but I'm not sure of that, or of what to do next. I can split the label using str_split(label, pattern = "!!") to get a vector of strings, but I don't know how to work on the last string and all the rest of the strings separately.

Thank you for any suggestions to nudge me along towards a workable solution.
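For concreteness, here is a minimal base R sketch of the two steps described above. It assumes every label has at least one non-blank piece, and that the indent should be one space per level below "Total:". The name format_label() is hypothetical and not part of tidycensus; the sample labels are copied from the head() output.

```r
# Sample labels copied from the head() output in this message.
labels <- c(
  " !!Total:",
  " !!Total:!!Population of one race:",
  " !!Total:!!Population of one race:!!White alone",
  " !!Total:!!Population of one race:!!Black or African American alone"
)

# format_label() is a hypothetical helper, not a tidycensus function.
format_label <- function(label) {
  # Split each label on the literal "!!" separator (fixed = TRUE: no regex).
  parts <- strsplit(label, "!!", fixed = TRUE)
  vapply(parts, function(p) {
    # Drop the blank piece that precedes the first "!!".
    p <- p[nzchar(trimws(p))]
    # One space of indent for each level above the last piece.
    indent <- strrep(" ", length(p) - 1)
    # Keep only the last piece, prefixed by the indent.
    paste0(indent, p[length(p)])
  }, character(1))
}

format_label(labels)
```

Because the helper is vectorised over its input, it could be applied to the joined table with something like bg3_race_sum %>% mutate(label = format_label(label)), with no explicit lapply over rows.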
-Kevin