dplyr::mutate is probably what you want instead of dplyr::summarize:

create_bins3 <- function (xpred, nBins)
{
    Breaks <- unique(quantile(xpred, probs = seq(0, 1, 1/nBins)))
    bin <- cut(xpred, breaks = Breaks, include.lowest = TRUE)
    bin
}
dplyr::group_by(df, models) %>% dplyr::mutate(Bin=create_bins3(pred,nBins))
#Source: local data frame [100 x 3]
#Groups: models [2]
#
#         pred models             Bin
#        (dbl) (fctr)          (fctr)
#1   0.2167549 model1   (0.167,0.577]
#2  -0.5424926 model1 (-0.869,-0.481]
...
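For anyone following along, here is a minimal self-contained sketch of the distinction, reusing the data and function from this thread. It assumes dplyr is attached so that %>% is available; the result names binned and per_group are made up for illustration and do not appear in the original posts.

```r
library(dplyr)

# Same example data as in the original question
set.seed(4)
df <- data.frame(pred = rnorm(100),
                 models = gl(2, 50, 100, labels = c("model1", "model2")))
nBins <- 10

# Takes a numeric vector and returns a factor of the same length
create_bins3 <- function(xpred, nBins) {
  Breaks <- unique(quantile(xpred, probs = seq(0, 1, 1/nBins)))
  cut(xpred, breaks = Breaks, include.lowest = TRUE)
}

# mutate(): the function returns one value per input row,
# so the result keeps all rows of df and gains a Bin column
binned <- df %>%
  group_by(models) %>%
  mutate(Bin = create_bins3(pred, nBins)) %>%
  ungroup()

# summarize(): each expression returns a single value per group,
# so the result has one row per level of models
per_group <- df %>%
  group_by(models) %>%
  summarize(n = n(), mean_pred = mean(pred))
```

Here binned should have the same number of rows as df, while per_group should have one row for each of the two models.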
Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Fri, Oct 30, 2015 at 9:06 AM, William Dunlap <wdun...@tibco.com> wrote:

> The error message is not very helpful and the stack trace is pretty
> inscrutable as well
>    > dplyr::group_by(df, models) %>% dplyr::summarize(create_bins)
>    Error: not a vector
>    > traceback()
>    14: stop(list(message = "not a vector", call = NULL, cppstack = NULL))
>    13: .Call("dplyr_summarise_impl", PACKAGE = "dplyr", df, dots)
>    12: summarise_impl(.data, dots)
>    11: summarise_.tbl_df(.data, .dots = lazyeval::lazy_dots(...))
>    10: summarise_(.data, .dots = lazyeval::lazy_dots(...))
>    9: dplyr::summarize(., create_bins)
>    8: function_list[[k]](value)
>    7: withVisible(function_list[[k]](value))
>    6: freduce(value, `_function_list`)
>    5: `_fseq`(`_lhs`)
>    4: eval(expr, envir, enclos)
>    3: eval(quote(`_fseq`(`_lhs`)), env, env)
>    2: withVisible(eval(quote(`_fseq`(`_lhs`)), env, env))
>    1: dplyr::group_by(df, models) %>% dplyr::summarize(create_bins)
>
> It does not mean that your function, create_bins, does not return a vector --
> the sum function gives the same result. help(summarize,package="dplyr")
> says:
>    ...: Name-value pairs of summary functions like ‘min()’, ‘mean()’,
>         ‘max()’ etc.
> It apparently means calls to summary functions, not summary functions
> themselves. The examples in the help file show the proper usage.
>
> Use a call to your function and you will see it works better
>    > dplyr::group_by(df, models) %>% dplyr::summarize(create_bins(pred,nBins))
>    Error: $ operator is invalid for atomic vectors
> The traceback again is not very useful, because the call information was
> stripped by dplyr (by the call=NULL in the call to stop()):
>    > traceback()
>    14: stop(list(message = "$ operator is invalid for atomic vectors",
>            call = NULL, cppstack = NULL))
>    13: .Call("dplyr_summarise_impl", PACKAGE = "dplyr", df, dots)
> However it is clear that the fault is in your function, which is expecting a
> data.frame x with a column called pred but gets pred itself. Change x to xpred
> in the argument list and x$pred to xpred in the body of the function.
>
> You will run into more problems because your function returns a vector
> the length of its input but summarize expects a summary function - one
> that returns a scalar for any size vector input.
>
> Bill Dunlap
> TIBCO Software
> wdunlap tibco.com
>
> On Fri, Oct 30, 2015 at 4:04 AM, Axel Urbiz <axel.ur...@gmail.com> wrote:
>
>> So in this case, "create_bins" returns a vector and I still get the same
>> error.
>>
>> create_bins <- function(x, nBins)
>> {
>>   Breaks <- unique(quantile(x$pred, probs = seq(0, 1, 1/nBins)))
>>   bin <- cut(x$pred, breaks = Breaks, include.lowest = TRUE)
>>   bin
>> }
>>
>> ### Using dplyr (fails)
>> nBins = 10
>> by_group <- dplyr::group_by(df, models)
>> res_dplyr <- dplyr::summarize(by_group, create_bins, nBins)
>> Error: not a vector
>>
>> On Thu, Oct 29, 2015 at 8:28 PM, Jeff Newmiller <jdnew...@dcn.davis.ca.us>
>> wrote:
>>
>> > You are jumping the gun (your other email did get through) and you are
>> > posting using HTML (which does not come through on the list). Some time
>> > (re)reading the Posting Guide mentioned at the bottom of all emails on this
>> > list seems to be in order.
>> >
>> > The error is actually quite clear.
>> > You should return a vector from your function, not a data frame.
>> > ---------------------------------------------------------------------------
>> > Jeff Newmiller                        The     .....       .....  Go Live...
>> > DCN:<jdnew...@dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
>> >                                       Live:   OO#.. Dead: OO#..  Playing
>> > Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
>> > /Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k
>> > ---------------------------------------------------------------------------
>> > Sent from my phone. Please excuse my brevity.
>> >
>> > On October 29, 2015 4:55:19 PM MST, Axel Urbiz <axel.ur...@gmail.com> wrote:
>> > >Hello,
>> > >
>> > >Sorry, resending this question as the prior was not sent properly.
>> > >
>> > >I’m using the plyr package below to add a variable named "bin" to my
>> > >original data frame "df" with the user-defined function "create_bins". I'd
>> > >like to get similar results using dplyr instead, but failing to do so.
>> > >
>> > >set.seed(4)
>> > >df <- data.frame(pred = rnorm(100), models = gl(2, 50, 100, labels =
>> > >c("model1", "model2")))
>> > >
>> > >### Using plyr (works fine)
>> > >create_bins <- function(x, nBins)
>> > >{
>> > >  Breaks <- unique(quantile(x$pred, probs = seq(0, 1, 1/nBins)))
>> > >  dfB <- data.frame(pred = x$pred,
>> > >                    bin = cut(x$pred, breaks = Breaks, include.lowest = TRUE))
>> > >  dfB
>> > >}
>> > >
>> > >nBins = 10
>> > >res_plyr <- plyr::ddply(df, plyr::.(models), create_bins, nBins)
>> > >head(res_plyr)
>> > >
>> > >### Using dplyr (fails)
>> > >
>> > >by_group <- dplyr::group_by(df, models)
>> > >res_dplyr <- dplyr::summarize(by_group, create_bins, nBins)
>> > >Error: not a vector
>> > >
>> > >Any help would be much appreciated.
>> > >
>> > >Best,
>> > >Axel.
>> > >
>> > >        [[alternative HTML version deleted]]
>> > >
>> > >______________________________________________
>> > >R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> > >https://stat.ethz.ch/mailman/listinfo/r-help
>> > >PLEASE do read the posting guide
>> > >http://www.R-project.org/posting-guide.html
>> > >and provide commented, minimal, self-contained, reproducible code.