Re: [R] User-defined functions in dplyr

2015-11-02 Thread Axel Urbiz
Actually, the results are not the same. Looks like in the code below (see "using dplyr"), the function create_bins2 is not being applied separately to each "group_by" variable. That is surprising to me, or I'm misunderstanding dplyr.
### Create some data
set.seed(4)
df <- data.frame(pred =

Re: [R] User-defined functions in dplyr

2015-11-02 Thread William Dunlap
dplyr::mutate does not collapse factor variables well. They seem to get their levels from the levels computed for the first group, and mutate does not check whether later groups have different levels.
> data.frame(group=rep(c("A","B","C"),each=2), value=rep(c("X","Y","Z"),3:1)) %>% dplyr::group_by(group)
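To make the hazard described above concrete, here is a small base R sketch. It does not reproduce dplyr's internals; it only assumes that a combiner reuses the first group's levels with each group's integer codes, which is one way the reported symptom could arise. The names `per_group`, `codes`, `wrong`, and `right` are made up for illustration.

```r
# Same toy data as in the message above, kept as character columns.
d <- data.frame(group = rep(c("A", "B", "C"), each = 2),
                value = rep(c("X", "Y", "Z"), 3:1),
                stringsAsFactors = FALSE)

# Build one factor per group; each group ends up with its own level set.
per_group <- lapply(split(d$value, d$group), factor)
lapply(per_group, levels)

# Assumed failure mode: keep every group's integer codes but decode
# them all with the levels of the first group only.
codes <- unlist(lapply(per_group, as.integer), use.names = FALSE)
wrong <- levels(per_group[[1]])[codes]

# Safer: convert each piece to character before combining, then
# rebuild a single factor over the union of the labels.
right <- unlist(lapply(per_group, as.character), use.names = FALSE)
combined <- factor(right)
```

Comparing `wrong` with `right` shows that codes from later groups no longer point at the labels they were created with, which is why converting to character before combining is the more defensive choice.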

Re: [R] User-defined functions in dplyr

2015-11-02 Thread Axel Urbiz
Nice example of the issue, Bill. Thank you. Is this a known issue? Are there plans to fix it? Thanks again, Axel.
> On Nov 2, 2015, at 8:58 PM, William Dunlap wrote:
>
> dplyr::mutate does not collapse factor variables well. They seem to get
> their levels from the levels
>

Re: [R] User-defined functions in dplyr

2015-10-30 Thread Axel Urbiz
So in this case, "create_bins" returns a vector and I still get the same error.
create_bins <- function(x, nBins) {
  Breaks <- unique(quantile(x$pred, probs = seq(0, 1, 1/nBins)))
  bin <- cut(x$pred, breaks = Breaks, include.lowest = TRUE)
  bin
}
### Using dplyr (fails)
nBins = 10
by_group

Re: [R] User-defined functions in dplyr

2015-10-30 Thread William Dunlap
The error message is not very helpful and the stack trace is pretty inscrutable as well:
> dplyr::group_by(df, models) %>% dplyr::summarize(create_bins)
Error: not a vector
> traceback()
14: stop(list(message = "not a vector", call = NULL, cppstack = NULL))
13: .Call("dplyr_summarise_impl", PACKAGE

Re: [R] User-defined functions in dplyr

2015-10-30 Thread William Dunlap
dplyr::mutate is probably what you want instead of dplyr::summarize:
create_bins3 <- function (xpred, nBins) {
  Breaks <- unique(quantile(xpred, probs = seq(0, 1, 1/nBins)))
  bin <- cut(xpred, breaks = Breaks, include.lowest = TRUE)
  bin
}
dplyr::group_by(df, models) %>%
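For comparison, the same per-group binning can be sketched in base R, which avoids depending on how any particular dplyr version combines factors. This is only an illustration: the group labels "m1" and "m2" are assumptions (the original labels are cut off in the archive), and returning the bins as character rather than factor is a deliberate choice of this sketch, not part of the suggestion above.

```r
# Hypothetical data in the spirit of the thread; the labels "m1" and
# "m2" are assumed because the original ones are truncated.
set.seed(4)
df <- data.frame(pred = rnorm(100),
                 models = gl(2, 50, 100, labels = c("m1", "m2")))

# Vector-in, vector-out binning function, as in the message above.
create_bins3 <- function(xpred, nBins) {
  Breaks <- unique(quantile(xpred, probs = seq(0, 1, 1 / nBins)))
  cut(xpred, breaks = Breaks, include.lowest = TRUE)
}

nBins <- 10

# Split the predictions by group, bin each piece on its own quantiles,
# and convert to character so that pieces with different factor levels
# can be recombined without mixing up labels.
pieces <- split(df$pred, df$models)
binned <- lapply(pieces, function(p) as.character(create_bins3(p, nBins)))

# Put the binned values back in the original row order.
df$bin <- unsplit(binned, df$models)
```

Each group gets its own decile breaks, so the interval labels differ between groups; `table(df$models, df$bin)` is a quick way to confirm that every row received a bin from its own group.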

Re: [R] User-defined functions in dplyr

2015-10-29 Thread Jeff Newmiller
You are jumping the gun (your other email did get through) and you are posting using HTML (which does not come through on the list). Some time (re)reading the Posting Guide mentioned at the bottom of all emails on this list seems to be in order. The error is actually quite clear. You should

[R] User-defined functions in dplyr

2015-10-29 Thread Axel Urbiz
Hello, Sorry, resending this question as the prior one was not sent properly. I’m using the plyr package below to add a variable named "bin" to my original data frame "df" with the user-defined function "create_bins". I'd like to get similar results using dplyr instead, but I am failing to do so.

[R] User-defined functions in dplyr

2015-10-29 Thread Axel Urbiz
Hello, I’m using the plyr package to add a variable named "bin" to my original data frame "df" with a user-defined function "create_bins". I'd like to get similar results using dplyr instead, but I am failing to do so.
set.seed(4)
df <- data.frame(pred = rnorm(100), models = gl(2, 50, 100, labels =