Here is one way to fix the data:

# First note that "value" is a factor so we need to convert it to character
> str(zp)
'data.frame':   20 obs. of  2 variables:
 $ variable: Factor w/ 5 levels "ZP.1","ZP.3",..: 1 1 1 1 2 2 2 2 3 3 ...
 $ value   : Factor w/ 19 levels "<0.030","<1.2",..: 3 4 2 1 7 8 6 5 12 11 ...
> zp$value <- as.character(zp$value)
> str(zp)
'data.frame':   20 obs. of  2 variables:
 $ variable: Factor w/ 5 levels "ZP.1","ZP.3",..: 1 1 1 1 2 2 2 2 3 3 ...
 $ value   : chr  "1160" "27.3" "<1.2" "<0.030" ...
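As an aside on why the conversion goes through character first: calling as.numeric() directly on a factor returns the internal integer level codes, not the printed values. A minimal sketch using a made-up three-element factor (the vector `f` is hypothetical and not part of the zp data):

```r
# Hypothetical toy factor, not taken from zp
f <- factor(c("1160", "27.3", "<1.2"))

# Direct conversion yields the internal level codes (a permutation of 1:3
# whose order depends on how the locale sorts the labels)
codes <- as.numeric(f)

# Going through character keeps the printed labels; the "<1.2" entry cannot
# be parsed and becomes NA (normally with a coercion warning), which is why
# the "<" has to be handled before the final numeric conversion
vals <- suppressWarnings(as.numeric(as.character(f)))
```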
# Next we need to see which values are preceded by "<", and record that in
# a new variable, "note"
> zp$note <- ifelse(grepl("<", zp$value), "Limit", "Measured")
# Finally we strip the "<" off and convert "value" to numeric
> zp$value <- as.numeric(gsub("<", "", zp$value))
> str(zp)
'data.frame':   20 obs. of  3 variables:
 $ variable: Factor w/ 5 levels "ZP.1","ZP.3",..: 1 1 1 1 2 2 2 2 3 3 ...
 $ value   : num  1160 27.3 1.2 0.03 1870 45.7 0.85 0.025 695 31.9 ...
 $ note    : chr  "Measured" "Measured" "Limit" "Limit" ...
> head(zp)
  variable   value     note
1     ZP.1 1160.00 Measured
2     ZP.1   27.30 Measured
3     ZP.1    1.20    Limit
4     ZP.1    0.03    Limit
5     ZP.3 1870.00 Measured
6     ZP.3   45.70 Measured

-------------------------------------
David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77840-4352

-----Original Message-----
From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Sam Albers
Sent: Monday, January 26, 2015 12:41 PM
To: r-help@r-project.org
Subject: [R] Working with < and > is data sets

Hello,

I am having some trouble figuring out how to deal with data that has some
observations that are detection limits and others that are integers denoted
by greater and less than symbols. Ideally I would like a column that has the
data as numbers then another column with values "Measured" or "Limit" or
something like that. Data and further clarification below.

##Data
zp<-structure(list(variable = structure(c(1L, 1L, 1L, 1L, 2L, 2L,
2L, 2L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 5L, 5L, 5L, 5L),
.Label = c("ZP.1", "ZP.3", "ZP.5", "ZP.7", "ZP.9"), class = "factor"),
value = structure(c(3L, 4L, 2L, 1L, 7L, 8L, 6L, 5L, 12L, 11L, 10L,
9L, 15L, 16L, 14L, 13L, 19L, 18L, 17L, 9L),
.Label = c("<0.030", "<1.2", "1160", "27.3", "<0.025", "<0.85",
"1870", "45.7", "<0.0020", "<0.050", "31.9", "695", "<0.0060",
"<0.20", "311", "8.84", "<0.090", "12", "646"), class = "factor")),
.Names = c("variable", "value"), row.names = c(NA, -20L),
class = "data.frame")

## As expected converting everything to numeric results in a slew of NA values
zp$valuefactor<-as.numeric(as.character(zp$value))

## At this point I am unsure how to proceed.
zp

### So I am just wondering how folks deal with this type of data. Any advice
would be much appreciated as I am looking for something that will reliably
work on a large data set.

Thanks in advance!

Sam

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
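
For a larger data set, the steps in the reply (convert to character, flag the "<" entries, strip the symbol, convert to numeric) could be wrapped in one function. The following is only a sketch: the name `split_censored`, the handling of ">" and the "Above range" label are assumptions made for illustration and do not come from the thread, whose data contains only "<" entries.

```r
# Hypothetical helper (not from the thread): split readings such as
# "<0.030" or ">500" into a numeric value and a descriptive note
split_censored <- function(x) {
  x <- trimws(as.character(x))   # accepts factors as well as characters

  # Classify each entry by its leading symbol; "Above range" is an
  # assumed label for ">" entries
  note <- ifelse(grepl("^<", x), "Limit",
          ifelse(grepl("^>", x), "Above range", "Measured"))
  note[is.na(x)] <- NA           # keep missing inputs missing

  # Drop one leading "<" or ">" (plus any spaces after it), then convert
  value <- as.numeric(sub("^[<>][[:space:]]*", "", x))

  data.frame(value = value, note = note, stringsAsFactors = FALSE)
}

# Small made-up input to show the shape of the result
demo <- split_censored(factor(c("1160", "<1.2", ">500", "27.3")))
```

Applied to the original (unconverted) data, something like `cbind(zp["variable"], split_censored(zp$value))` should give the same "value" and "note" columns as the step-by-step version. Anchoring the pattern with "^" means only a leading symbol is treated as a limit marker, so any other malformed entry still surfaces as NA with a warning instead of being silently altered.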