Hi,
I have a very large data set (approx. 100,000 rows).
The data comes from around 10,000 groups with about 10 entries per group.
The values are in one column, the group ID is an integer in the second column.
I want to normalize the values by group:
for (g in unique(groups)) {
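For reference, here is a complete version of that loop, reconstructed from the truncated snippet (the variable names `values` and `groups` are assumed; the original message does not show them). It is correct, but looping over 10,000 groups in R is slow:

```r
# Reconstructed loop approach (assumed variable names): for each group,
# divide its values by that group's sum. Toy data for illustration.
values <- c(1, 3, 2, 2)
groups <- c(1, 1, 2, 2)

normalized <- numeric(length(values))
for (g in unique(groups)) {
  idx <- groups == g
  normalized[idx] <- values[idx] / sum(values[idx])
}
```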
Not tested but should work:
sums <- tapply(x, group, sum)
sums.ext <- sums[match(group, names(sums))]
normalized <- x / sums.ext
It may be that the tapply is just as slow as your loop though, I'm not sure.
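A quick demonstration of the approach above on toy data (the names `x` and `group` follow the snippet; the values here are made up): `tapply()` produces one sum per group, and `match()` expands those sums back to one entry per row so the division is vectorized.

```r
# tapply gives one sum per group; match() expands it to row length
x <- c(1, 3, 2, 2)
group <- c("a", "a", "b", "b")

sums <- tapply(x, group, sum)                # named vector: a = 4, b = 4
sums.ext <- sums[match(group, names(sums))]  # one group sum per row of x
normalized <- x / sums.ext
```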
HTH,
Peter
On Thu, Nov 29, 2012 at 10:55 AM, Noah Silverman noahsilver...@ucla.edu wrote:
Yes, type in:
?by
for example:
data <- data.frame(fac = factor(c("A", "A", "B", "B")), vec = 1:4)
by(data$vec,data$fac, FUN=sum)
Best,
Mikołaj Hnatiuk
2012/11/29 Noah Silverman noahsilver...@ucla.edu
Hi,
I have a very large data set (approx. 100,000 rows).
The data comes from around 10,000 groups
Hello,
If you want one value per group, use tapply(); if you want one value per
value of x, use ave():
tapply(x, group, FUN = function(.x) .x/sum(.x))
ave(x, group, FUN = function(.x) .x/sum(.x))
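The difference matters here: `ave()` returns a vector the same length as `x`, so it gives the per-row normalization directly, with no expansion step. A short check on toy data (values assumed for illustration):

```r
# ave() applies FUN within each group and returns the result
# aligned with the original rows, same length as x
x <- c(1, 3, 2, 2)
group <- c(1, 1, 2, 2)

norm <- ave(x, group, FUN = function(.x) .x / sum(.x))
```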
Hope this helps,
Rui Barradas
On 29-11-2012 18:55, Noah Silverman wrote:
Hi,
I have a very
try the 'data.table' package. Takes about 0.1 seconds to normalize the data.
x <- data.frame(id = sample(1, 10, TRUE), value = runif(10))
require(data.table)
Loading required package: data.table
data.table 1.8.2 For help type: help(data.table)
system.time({
+ x <- data.table(x)
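The console transcript above is cut off, so here is a complete sketch of the data.table approach (column names `id`/`value` follow the example data frame; the data sizes are toy values, not the original 100,000-row timing):

```r
# Group-wise normalization with data.table: := adds a column by
# reference, computed within each id group
library(data.table)

x <- data.frame(id = sample(1:100, 1000, TRUE), value = runif(1000))
x <- data.table(x)
x[, norm := value / sum(value), by = id]
```

Each group's `norm` column now sums to 1, and the operation is vectorized per group rather than looped in R.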
Close, but not quite what I need.
That very nicely gives me sums by group.
I need to take each value of X and divide it by the sum of the group it belongs
to.
With your example, I have 100,000 values of X and only 10,000 groups. The by command
gives me 10,000 sums. I still have to loop over all of the groups to do the division.
On 29-11-2012, at 19:55, Noah Silverman wrote:
Hi,
I have a very large data set (approx. 100,000 rows).
The data comes from around 10,000 groups with about 10 entries per group.
The values are in one column, the group ID is an integer in the second column.
I want to normalize the