Thank you! I did not know about the split and unsplit functions. It looks like a very powerful and useful combination to master.
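
For anyone following the thread, here is a minimal sketch of how the split/unsplit combination might be applied to the decile problem. It is only an illustration under stated assumptions: the simulated data frame, the seed, and the helper name decile_by_group are made up for this example (continuous rnorm data is used so that the quantile breaks are unique), while decile.fn is the function from the earlier message quoted below.

```r
# Illustrative data (hypothetical): five dates with 100 rows each, with the
# rows deliberately shuffled so they are NOT ordered by date.
set.seed(1)
df <- data.frame(date = rep(1:5, each = 100),
                 A = rnorm(500, mean = 100, sd = 10),
                 B = rnorm(500, mean = 50, sd = 7),
                 C = rnorm(500, mean = 30, sd = 5))
df <- df[sample(nrow(df)), ]

# Decile function from earlier in the thread.
decile.fn <- function(x, nbreaks = 10) {
  br <- quantile(x, seq(0, 1, length.out = nbreaks + 1), na.rm = TRUE)
  br[1] <- -Inf
  cut(x, br, labels = FALSE)
}

# Hypothetical helper: split a vector by group, apply fn to each piece,
# and let unsplit() put the results back in the original row order.
decile_by_group <- function(v, g, fn = decile.fn, ...) {
  unsplit(lapply(split(v, g), fn, ...), g)
}

# Add one "bucket" column per property, computed within each date.
for (col in c("A", "B", "C")) {
  df[[paste0("bucket", col)]] <- decile_by_group(df[[col]], df$date)
}

head(df)
```

Because unsplit() reverses split(), the buckets line up with the original rows even when the data frame is not sorted by date, so no reordering step should be needed. Something like ave(df$A, df$date, FUN = decile.fn) should give the same values in a single call.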

Regards, Adai


On Thu, 2006-02-23 at 07:28 +0100, Peter Dalgaard wrote:
> "maneesh deshpande" <[EMAIL PROTECTED]> writes:
>
> > Hi Adai,
> >
> > I think your solution only works if the rows of the data frame are
> > ordered by "date" and the ordering function is the same one used to
> > order the levels of factor(df$date)?
> > It turns out (as I implied in my question) that my data is indeed
> > organized in this manner, so my current problem is solved.
> > In the general case, I suppose, one could always order the data frame
> > by date before proceeding?
> >
> > Thanks,
> >
> > Maneesh
>
> You might prefer to look at split/unsplit/split<-, i.e. the z-scores
> by group line:
>
>    z <- unsplit(lapply(split(x, g), scale), g)
>
> with "scale" suitably replaced. Presumably (meaning: I didn't quite
> read your code closely enough)
>
>    z <- unsplit(lapply(split(x, g), bucket, 10), g)
>
> could do it.
>
> > >From: Adaikalavan Ramasamy <[EMAIL PROTECTED]>
> > >Reply-To: [EMAIL PROTECTED]
> > >To: maneesh deshpande <[EMAIL PROTECTED]>
> > >CC: [email protected]
> > >Subject: Re: [R] Ranking within factor subgroups
> > >Date: Wed, 22 Feb 2006 03:44:45 +0000
> > >
> > >It might help to give a simple reproducible example in the future.
> > >For example,
> > >
> > >  df <- cbind.data.frame( date=rep( 1:5, each=100 ), A=rpois(500, 100),
> > >                          B=rpois(500, 50), C=rpois(500, 30) )
> > >
> > >might generate something like
> > >
> > >      date   A  B  C
> > >  1      1  93 51 32
> > >  2      1  95 51 30
> > >  3      1 102 59 28
> > >  4      1 105 52 32
> > >  5      1 105 53 26
> > >  6      1  99 59 37
> > >  ...    . ... .. ..
> > >  495    5 100 57 19
> > >  496    5  96 47 44
> > >  497    5 111 56 35
> > >  498    5 105 49 23
> > >  499    5 105 61 30
> > >  500    5  92 53 32
> > >
> > >Here is my proposed solution. Can you double-check it against your
> > >existing functions to see whether the results agree?
> > >
> > >  decile.fn <- function(x, nbreaks=10){
> > >    br    <- quantile( x, seq(0, 1, len=nbreaks+1), na.rm=TRUE )
> > >    br[1] <- -Inf
> > >    return( cut(x, br, labels=FALSE) )
> > >  }
> > >
> > >  out <- apply( df[ ,c("A", "B", "C")], 2,
> > >                function(v) unlist( tapply( v, df$date, decile.fn ) ) )
> > >
> > >  rownames(out) <- rownames(df)
> > >  out <- cbind(df$date, out)
> > >
> > >Regards, Adai
> > >
> > >
> > >On Tue, 2006-02-21 at 21:44 -0500, maneesh deshpande wrote:
> > > > Hi,
> > > >
> > > > I have a dataframe, x, of the following form:
> > > >
> > > > Date      Symbol   A   B   C
> > > > 20041201  ABC     10  12  15
> > > > 20041201  DEF      9   5   4
> > > > ...
> > > > 20050101  ABC      5   3   1
> > > > 20050101  GHM     12   4   2
> > > > ....
> > > >
> > > > Here A, B, C are properties of a set of symbols recorded for a
> > > > given date. I want to decile the symbols for each date and
> > > > property and create another set of columns "bucketA", "bucketB",
> > > > "bucketC" containing the decile rank for each symbol. The
> > > > following non-vectorized code does what I want:
> > > >
> > > > bucket <- function(data, nBuckets) {
> > > >     q <- quantile(data, seq(0, 1, len=nBuckets+1), na.rm=TRUE)
> > > >     q[1] <- q[1] - 0.1  # need to do this to ensure there are no extra NAs
> > > >     cut(data, q, include.lowest=TRUE, labels=FALSE)
> > > > }
> > > >
> > > > calcDeciles <- function(x, colNames) {
> > > >     nBuckets <- 10
> > > >     dates <- unique(x$Date)
> > > >     for (date in dates) {
> > > >         iVec <- x$Date == date
> > > >         xx <- x[iVec,]
> > > >         for (colName in colNames) {
> > > >             data <- xx[,colName]
> > > >             bColName <- paste("bucket", colName, sep="")
> > > >             x[iVec,bColName] <- bucket(data, nBuckets)
> > > >         }
> > > >     }
> > > >     x
> > > > }
> > > >
> > > > x <- calcDeciles(x, c("A","B","C"))
> > > >
> > > >
> > > > I was wondering if it is possible to vectorize the above function
> > > > to make it more efficient.
> > > > I tried
> > > >
> > > > rlist <- tapply(x$A, x$Date, bucket, 10)
> > > >
> > > > but I am not sure how to assign the contents of "rlist" to their
> > > > appropriate slots in the original dataframe.
> > > >
> > > > Thanks,
> > > >
> > > > Maneesh
> > > >
> > > > ______________________________________________
> > > > [email protected] mailing list
> > > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > > PLEASE do read the posting guide!
> > > > http://www.R-project.org/posting-guide.html
