One option: use paste() to build the whole list(...) call as a single character string, say var.list, then use parse(text = var.list) to turn var.list into an expression that you can eval() inside the data.table call.
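A minimal sketch of that idea, in case it helps. buildJ is a hypothetical helper name, toy is made-up data, and I'm assuming every variable gets the same length(unique()) summary. The data.table step is guarded so the rest still runs if the package isn't installed.

```r
## Hypothetical helper: build the j expression from a character vector
## of column names, e.g. c("dx", "rx", "clinic").
buildJ <- function(vars) {
  ## one "name = length(unique(name))" piece per variable
  pieces <- paste0(vars, " = length(unique(", vars, "))")
  ## glue the pieces into the text of a single list(...) call
  var.list <- paste0("list(", paste(pieces, collapse = ", "), ")")
  ## parse() returns an expression vector; [[1]] extracts the call itself
  parse(text = var.list)[[1]]
}

jExpr <- buildJ(c("dx", "rx", "clinic"))

## Sanity check in base R: evaluate the call against a small data frame.
toy <- data.frame(dx = c(1, 1, 2), rx = c(5, 6, 7), clinic = c(3, 3, 3))
res <- eval(jExpr, toy)

## The same call used as j in data.table (only if the package is available).
if (requireNamespace("data.table", quietly = TRUE)) {
  library(data.table)
  DT <- as.data.table(cbind(id = c(1, 1, 2), toy))
  summaryDT <- DT[, eval(jExpr), by = "id"]
  print(summaryDT)
}
```

Inside a function like uniqueSummary3, the same pattern would presumably be DT[, eval(jExpr), by = key], with jExpr built from the vars argument. Naming the list elements also means the result columns come back as dx, rx, clinic rather than V1, V2, V3.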
On 28 September 2011 10:30, Erik Iverson <[email protected]> wrote:
> Hello,
>
> Thank you for providing the data.table package, I think it will be
> very useful to me going forward. I have a question about passing
> around expressions, and have come up with an example to show what I'm
> after.
>
> library(data.table)
> ## test data
> N <- 500000
> set.seed(100)
> testData <- data.frame(id = c(sample(1:10000, N, replace = TRUE)),
>                        clinic = c(sample(1:10, N, replace = TRUE)),
>                        dx = c(sample(1:200, N, replace = TRUE)),
>                        rx = c(sample(1:1000, N, replace = TRUE)))
>
> ## want to know mean number of dx per ID
> mean(tapply(testData$dx, testData$id,
>             function(x) length(unique(x)))) ## 44.2212
>
> ## in my real use case, I want to run this with different 'by'
> ## variables, so let's write a function and try to use data.table,
> ## call the function uniqueSummary1
>
> uniqueSummary1 <- function(df, key) {
>   DT <- data.table(df)
>   key(DT) <- key
>
>   summaryDT <- DT[, list(length(unique(dx)),
>                          length(unique(rx))), by = key]
>
>   mean(summaryDT[,list(V1, V2)])
>
> }
>
> ## agrees with tapply
> uniqueSummary1(df = testData, key = c("id"))
>
> ## The above works great, but isn't general, since in my real use
> ## case, I won't know dx and rx are the variables of interest. I want
> ## to be able to pass them in as arguments. This is exactly what FAQ
> ## 1.6 is, so let's use that solution to define uniqueSummary2
>
> uniqueSummary2 <- function(df, key, vars) {
>   DT <- data.table(df)
>   key(DT) <- key
>
>   sList <- substitute(vars)
>   summaryDT <- DT[, eval(sList), by = key]
>   ncols <- ncol(summaryDT)
>
>   mean(summaryDT[,(ncols-length(sList) + 2):ncols, with = FALSE])
> }
>
> uniqueSummary2(df = testData, key = c("id"),
>                vars = list(length(unique(dx)),
>                            length(unique(rx)),
>                            length(unique(clinic))))
>
> ## uniqueSummary2 is better, but relies on me repeating the
> ## "length(unique())" bit several times. Ideally, I'd just like to
> ## pass in a list of QUOTED vars to summarize, like the following
> ## hypothetical call to my yet-unwritten uniqueSummary3 function:
>
> uniqueSummary3(df = testData, key = c("id"),
>                vars = c("dx", "rx", "clinic"))
>
> I assume I can somehow construct the expression for the j index inside
> my function, based on the 'vars' character vector, but am stuck on
> how. Any ideas?
>
> Thanks so much,
> Erik
> _______________________________________________
> datatable-help mailing list
> [email protected]
> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help
