I have a data frame "small" which has 5,000 rows and contains data for several tickers every month, as below:

    monthend_n ticker wgtdiff    ret interval      b1       b2       b3      b4      b5    b6
1     19990228     AA  0.7172  -2.58  0.33896 -0.5868 -0.24784  0.09112 0.43008 0.76904 1.108
2     19990228   AAPL -0.0828 -15.48  0.33896 -0.5868 -0.24784  0.09112 0.43008 0.76904 1.108
3     19990228   ABCW  0.0966  -7.36  0.33896 -0.5868 -0.24784  0.09112 0.43008 0.76904 1.108
...
705   19990331     AA  0.1932    1.7  0.31602 -0.7641 -0.44808 -0.13206 0.18396 0.49998 0.816
706   19990331   AAPL   0.033   3.23  0.31602 -0.7641 -0.44808 -0.13206 0.18396 0.49998 0.816
707   19990331    ABF   0.154 -20.51  0.31602 -0.7641 -0.44808 -0.13206 0.18396 0.49998 0.816
708   19990331    ABI   0.286   8.33  0.31602 -0.7641 -0.44808 -0.13206 0.18396 0.49998 0.816
etc.

Variables b1 through b6 are break points that I want to use in the "cut" function, and they vary each month according to the distribution of the variable "wgtdiff" during that month.

To handle this I wrote a function as below:

    cutfunc <- function(df) {
      vec <- df$wgtdiff
      # need to apply unique() as the break points within each month are the
      # same for all tickers (b1-b6 values are the same within each month)
      breaks <- c(unique(df$b1), unique(df$b2), unique(df$b3),
                  unique(df$b4), unique(df$b5), unique(df$b6))
      bin <- cut(vec, breaks, labels = FALSE)
      bin
    }

Then I tried:

    temp4 <- ddply(small, .(monthend_n), summarize, bins = cutfunc(small))

I was expecting to get back a data frame with 5,000 rows with bin assignments for each ticker, and if there are 6 break points the bin numbers should range from 1 to 5. However, instead I get a data frame with 40,000 rows and bin numbers ranging from 1 to 40, as below:

     monthend_n bins
1      19990228   40
2      19990228   17
3      19990228   22
...
5000   19990228   17
5001   19990331   40
5002   19990331   17
5003   19990331   22
etc.

It seems ddply doesn't pass monthly pieces of the data frame "small" into my "cutfunc" in the way I expect.

Any guidance is appreciated.
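
To make the result I am after concrete, here is a minimal base-R sketch of the per-month binning. It is only an illustration under assumptions: the toy data frame reuses the seven rows shown above, "bin_month" is a hypothetical helper name, and split()/unsplit() stand in for ddply, so this is not necessarily how plyr should be used.

```r
# Toy stand-in for "small": the seven rows shown above (ret and interval omitted).
small <- data.frame(
  monthend_n = c(rep(19990228, 3), rep(19990331, 4)),
  ticker     = c("AA", "AAPL", "ABCW", "AA", "AAPL", "ABF", "ABI"),
  wgtdiff    = c(0.7172, -0.0828, 0.0966, 0.1932, 0.033, 0.154, 0.286),
  b1 = c(rep(-0.5868,  3), rep(-0.7641,  4)),
  b2 = c(rep(-0.24784, 3), rep(-0.44808, 4)),
  b3 = c(rep( 0.09112, 3), rep(-0.13206, 4)),
  b4 = c(rep( 0.43008, 3), rep( 0.18396, 4)),
  b5 = c(rep( 0.76904, 3), rep( 0.49998, 4)),
  b6 = c(rep( 1.108,   3), rep( 0.816,   4)),
  stringsAsFactors = FALSE
)

# Hypothetical helper: bins one month's rows using that month's break points.
# It only ever looks at the piece it is given, never at the full data frame.
bin_month <- function(piece) {
  # b1-b6 are constant within a month, so the first row supplies the breaks
  breaks <- unlist(piece[1, paste0("b", 1:6)])
  cut(piece$wgtdiff, breaks, labels = FALSE)
}

# Split by month, bin each piece, and put the results back in row order.
pieces    <- split(small, small$monthend_n)
small$bin <- unsplit(lapply(pieces, bin_month), small$monthend_n)

print(small[, c("monthend_n", "ticker", "wgtdiff", "bin")])
# bin column for the seven rows: 4 2 3 4 3 3 4
```

Here each piece handed to bin_month holds a single month, so the result has one row per ticker and month, with bins between 1 and 5.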

Thanks