Dear R users,

I have a data frame with a few thousand rows and several hundred
numeric columns (plus a date column). For each row (day), I want to
assign +1 or -1 (the sign of the value) to the X values that are
largest in absolute value, 0 to all the other values, and save the
result in a separate data frame.
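
To make the rule concrete, here is a one-row illustration with X = 2.
The numbers and the names x and X are made up for this example only:

```r
# One made-up row with four values; keep the X = 2 largest in absolute value
x <- c(a = 0.5, b = -3, c = 1.2, d = -0.1)
X <- 2
# rank(-abs(x)) gives rank 1 to the largest absolute value
out <- ifelse(rank(-abs(x)) <= X, sign(x), 0)
out
#  a  b  c  d
#  0 -1  1  0
```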

I have a working solution (below), but I find it rather slow. Is
there anything I could do to speed it up? (The code is
CPU-bound; Pentium 4 @ 2.4 GHz, 512 MB RAM, Win XP, R 2.0.0.)

Thank you,
b.


#all is the original data frame (date + a number of columns)
#set up the output data frame
DailyTopN <- data.frame(all[1,1],matrix(ncol=ncol(all)-1))
names(DailyTopN) <- names(all)
top <- 20
for (i in 1:1000)       #the rows to be processed
        {
        #data frame row as vector
        onerow <- na.omit(as.matrix(all[i,][2:ncol(all)])[1, ])
        #select the 'top' highest absolute values
        #(rank() gives rank 1 to the smallest value, so rank the negated values)
        r <- rank(-abs(onerow),ties.method="random")
        selected <- names(r)[r <= top]
        #set +/-1 for the highest absolute values, 0 for the others
        DailyTopN[i,selected] <- 1 * sign(all[i,selected])
        DailyTopN[i,1] <- all[i,1]      #add the date
        }
DailyTopN[is.na(DailyTopN)] <- 0
rownames(DailyTopN) <- 1:nrow(DailyTopN)
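
One direction that might help is to convert the numeric columns to a
matrix once and process its rows, since indexing a data frame row by
row inside a loop tends to be expensive. The following is only a sketch
on made-up data and has not been timed against the real data; the names
toy, m, topn.row, res and DailyTopN2 are hypothetical and introduced
just for this example:

```r
# Toy stand-in for 'all' (made-up data): a date column plus numeric columns
set.seed(1)
n.days <- 50
n.cols <- 30
toy <- data.frame(date = as.Date("2004-01-01") + 0:(n.days - 1),
                  matrix(rnorm(n.days * n.cols), nrow = n.days))
toy[3, 5] <- NA                 # one missing value, to exercise the NA handling
top <- 20

# Convert the numeric columns to a matrix once, outside any loop
m <- as.matrix(toy[, -1])

# For one row: +/-1 for the 'top' largest absolute values, 0 elsewhere.
# NAs are skipped (as na.omit does in the loop version) and end up as 0.
topn.row <- function(x, top) {
    out <- numeric(length(x))
    ok <- which(!is.na(x))
    r <- rank(-abs(x[ok]), ties.method = "random")   # rank 1 = largest
    sel <- ok[r <= top]
    out[sel] <- sign(x[sel])
    out
}

# apply() returns one column per input row, hence the transpose
res <- t(apply(m, 1, topn.row, top = top))
colnames(res) <- colnames(m)

# Put the date column back in front
DailyTopN2 <- data.frame(toy[1], res)
names(DailyTopN2) <- names(toy)
```

With this layout the result is built in one step instead of being filled
in row by row, and the final NA-to-0 replacement is no longer needed.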

______________________________________________
[EMAIL PROTECTED] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html