> On 06 Nov 2015, at 00:59 , Axel Urbiz <axel.ur...@gmail.com> wrote:
> 
> Hello, 
> 
> Is there a way to avoid the warning below in dplyr? I’m performing an 
> operation within groups, and the warning says that the factors created from 
> each group do not have the same levels, and so it coerces the factor to 
> character. I’m using this inside a package I’m developing. I’d appreciate 
> your recommendation on how to handle this.

Well, what did you intend? If you cut according to quantiles, the levels of the 
result will reflect the value of the quantiles, as in

> y <- runif(10)
> cut(y, quantile(y,c(0,.25,.5,.75, 1)), include.lowest=T)
 [1] (0.65,0.765]  [0.108,0.281] [0.108,0.281] (0.65,0.765]  (0.281,0.528]
 [6] [0.108,0.281] (0.528,0.65]  (0.281,0.528] (0.65,0.765]  (0.528,0.65] 
Levels: [0.108,0.281] (0.281,0.528] (0.528,0.65] (0.65,0.765]

If you do it in different groups, the quantiles will differ, hence the factor 
levels too. Concatenating the resulting factors will get you in trouble.
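
To make the point concrete, here is a minimal sketch in base R, using the same simulated data as the quoted example. The helper name cut_by_quartile is made up for illustration; it cuts each group at its own quartiles and then compares the level sets.

```r
set.seed(4)
df <- data.frame(pred = rnorm(100),
                 models = gl(2, 50, 100, labels = c("model1", "model2")))

# Hypothetical helper: cut a vector at its own quartiles
cut_by_quartile <- function(x) {
  cut(x, quantile(x, c(0, .25, .5, .75, 1)), include.lowest = TRUE)
}

# Apply it separately within each group
per_group <- lapply(split(df$pred, df$models), cut_by_quartile)

# The interval labels are built from each group's own quantiles
lapply(per_group, levels)

# Compare the two level sets directly
same_levels <- identical(levels(per_group$model1), levels(per_group$model2))
same_levels
```

Since the labels encode the break points, the two groups end up with different level sets, which is the situation the rbind_all() warning is reporting.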

If you don't mind losing the information about what the quantile intervals are, 
you could consider standardizing the levels with something like levels(bin$bin) 
<- 1:nBins.
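
One possible way to apply that suggestion is sketched below. The function name create_bins_std is hypothetical, and the groups are combined with base R (split/lapply/rbind) rather than dplyr so that the example does not depend on a particular dplyr version. Because the original function passes the quantiles through unique(), tied quantiles could leave fewer than nBins intervals, so the sketch relabels by nlevels() and then declares the full 1..nBins level set.

```r
set.seed(4)
df <- data.frame(pred = rnorm(100),
                 models = gl(2, 50, 100, labels = c("model1", "model2")))

# Hypothetical variant of create_bins() with bin numbers as levels
create_bins_std <- function(pred, nBins) {
  Breaks <- unique(quantile(pred, probs = seq(0, 1, 1/nBins)))
  bin <- cut(pred, breaks = Breaks, include.lowest = TRUE)
  # Replace the interval labels by 1, 2, ...; ties among the quantiles
  # can leave fewer than nBins intervals, hence nlevels(bin)
  levels(bin) <- seq_len(nlevels(bin))
  # Declare the same full level set in every group
  bin <- factor(bin, levels = seq_len(nBins))
  data.frame(pred = pred, bin = bin)
}

# Bin within each group, then stack the pieces
res <- do.call(rbind, lapply(split(df, df$models), function(d) {
  cbind(models = d$models, create_bins_std(d$pred, 10))
}))

levels(res$bin)
table(res$models, res$bin)
```

With identical levels in every group, the pieces can be concatenated without coercion to character. If plain integer bin codes are enough, cut(pred, breaks = Breaks, include.lowest = TRUE, labels = FALSE) avoids factors altogether.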

-pd

> 
> library(dplyr)
> 
> set.seed(4)
> df <- data.frame(pred = rnorm(100), models = gl(2, 50, 100, labels = 
> c("model1", "model2")))
> 
> create_bins <- function (pred, nBins) {
>  Breaks <- unique(quantile(pred, probs = seq(0, 1, 1/nBins)))
>  bin <- data.frame(pred = pred, bin = cut(pred, breaks = Breaks, 
> include.lowest = TRUE))
>  bin
> }
> 
> res_dplyr <- df %>% group_by(models) %>% do(create_bins(.$pred, 10))
> Warning message:
>  In rbind_all(out[[1]]) : Unequal factor levels: coercing to character
> 
> Thank you,
> Axel.
> 
> ______________________________________________
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd....@cbs.dk  Priv: pda...@gmail.com
