Re: [R] Help with efficient double sum of max (X_i, Y_i) (X Y vectors)
On Thu, 1 Feb 2007, Ravi Varadhan wrote: Jeff, Here is something which is a little faster: sum1 - sum(outer(x, x, FUN=pmax)) sum3 - sum(outer(x, y, FUN=pmax)) This is the sort of problem where profiling can be useful. My experience with pmax() is that it is surprisingly slow, presumably because it handles recycling and NAs In the example I profiled (an MCMC calculation) it was measurably faster to use function(x,y) {i- xy; x[i]-y[i]; x} -thomas Best, Ravi. --- Ravi Varadhan, Ph.D. Assistant Professor, The Center on Aging and Health Division of Geriatric Medicine and Gerontology Johns Hopkins University Ph: (410) 502-2619 Fax: (410) 614-9625 Email: [EMAIL PROTECTED] Webpage: http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Jeffrey Racine Sent: Thursday, February 01, 2007 1:18 PM To: r-help@stat.math.ethz.ch Subject: [R] Help with efficient double sum of max (X_i, Y_i) (X Y vectors) Greetings. For R gurus this may be a no brainer, but I could not find pointers to efficient computation of this beast in past help files. Background - I wish to implement a Cramer-von Mises type test statistic which involves double sums of max(X_i,Y_j) where X and Y are vectors of differing length. I am currently using ifelse pointwise in a vector, but have a nagging suspicion that there is a more efficient way to do this. Basically, I require three sums: sum1: \sum_i\sum_j max(X_i,X_j) sum2: \sum_i\sum_j max(Y_i,Y_j) sum3: \sum_i\sum_j max(X_i,Y_j) Here is my current implementation - any pointers to more efficient computation greatly appreciated. nx - length(x) ny - length(y) sum1 - 0 sum3 - 0 for(i in 1:nx) { sum1 - sum1 + sum(ifelse(x[i]x,x[i],x)) sum3 - sum3 + sum(ifelse(x[i]y,x[i],y)) } sum2 - 0 sum4 - sum3 # symmetric and identical for(i in 1:ny) { sum2 - sum2 + sum(ifelse(y[i]y,y[i],y)) } Thanks in advance for your help. -- Jeff -- Professor J. S. Racine Phone: (905) 525 9140 x 23825 Department of EconomicsFAX:(905) 521-8232 McMaster Universitye-mail: [EMAIL PROTECTED] 1280 Main St. W.,Hamilton, URL: http://www.economics.mcmaster.ca/racine/ Ontario, Canada. L8S 4M4 `The generation of random numbers is too important to be left to chance' __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Thomas Lumley Assoc. Professor, Biostatistics [EMAIL PROTECTED] University of Washington, Seattle __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Help with efficient double sum of max (X_i, Y_i) (X Y vectors)
Thomas, you are absolutely correct. Here are some results comparing Jeff's original implementation, my suggestion using outer and pmax, and your clever trick using fast.pmax. Your fast.pmax function is more than 3 times faster than pmax. Thanks for this wonderful insight. Best, Ravi. x - rnorm(2000) y - runif(500) nx - length(x) ny - length(y) sum1 - 0 sum3 - 0 # Here is the straightforward way system.time( + for(i in 1:nx) { + sum1 - sum1 + sum(ifelse(x[i]x,x[i],x)) + sum3 - sum3 + sum(ifelse(x[i]y,x[i],y)) + } + ) [1] 3.83 0.00 3.83 NA NA # Here is a faster way using outer and pmax system.time({ + sum11 - sum(outer(x,x,FUN=pmax)) + sum33 - sum(outer(x,y,FUN=pmax)) + }) [1] 2.55 0.48 3.04 NA NA # Here is an even faster method using Tom Lumley's suggestion: fast.pmax - function(x,y) {i- xy; x[i]-y[i]; x} system.time({ + sum111 - sum(outer(x,x,FUN=fast.pmax)) + sum333 - sum(outer(x,y,FUN=fast.pmax)) + }) [1] 0.78 0.08 0.86 NA NA all.equal(sum1,sum11,sum111) [1] TRUE all.equal(sum3,sum33,sum333) [1] TRUE --- Ravi Varadhan, Ph.D. Assistant Professor, The Center on Aging and Health Division of Geriatric Medicine and Gerontology Johns Hopkins University Ph: (410) 502-2619 Fax: (410) 614-9625 Email: [EMAIL PROTECTED] Webpage: http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html -Original Message- From: Thomas Lumley [mailto:[EMAIL PROTECTED] Sent: Friday, February 02, 2007 9:06 AM To: Ravi Varadhan Cc: [EMAIL PROTECTED]; r-help@stat.math.ethz.ch Subject: Re: [R] Help with efficient double sum of max (X_i, Y_i) (X Y vectors) On Thu, 1 Feb 2007, Ravi Varadhan wrote: Jeff, Here is something which is a little faster: sum1 - sum(outer(x, x, FUN=pmax)) sum3 - sum(outer(x, y, FUN=pmax)) This is the sort of problem where profiling can be useful. My experience with pmax() is that it is surprisingly slow, presumably because it handles recycling and NAs In the example I profiled (an MCMC calculation) it was measurably faster to use function(x,y) {i- xy; x[i]-y[i]; x} -thomas Best, Ravi. --- Ravi Varadhan, Ph.D. Assistant Professor, The Center on Aging and Health Division of Geriatric Medicine and Gerontology Johns Hopkins University Ph: (410) 502-2619 Fax: (410) 614-9625 Email: [EMAIL PROTECTED] Webpage: http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Jeffrey Racine Sent: Thursday, February 01, 2007 1:18 PM To: r-help@stat.math.ethz.ch Subject: [R] Help with efficient double sum of max (X_i, Y_i) (X Y vectors) Greetings. For R gurus this may be a no brainer, but I could not find pointers to efficient computation of this beast in past help files. Background - I wish to implement a Cramer-von Mises type test statistic which involves double sums of max(X_i,Y_j) where X and Y are vectors of differing length. I am currently using ifelse pointwise in a vector, but have a nagging suspicion that there is a more efficient way to do this. Basically, I require three sums: sum1: \sum_i\sum_j max(X_i,X_j) sum2: \sum_i\sum_j max(Y_i,Y_j) sum3: \sum_i\sum_j max(X_i,Y_j) Here is my current implementation - any pointers to more efficient computation greatly appreciated. nx - length(x) ny - length(y) sum1 - 0 sum3 - 0 for(i in 1:nx) { sum1 - sum1 + sum(ifelse(x[i]x,x[i],x)) sum3 - sum3 + sum(ifelse(x[i]y,x[i],y)) } sum2 - 0 sum4 - sum3 # symmetric and identical for(i in 1:ny) { sum2 - sum2 + sum(ifelse(y[i]y,y[i],y)) } Thanks in advance for your help. -- Jeff -- Professor J. S. Racine Phone: (905) 525 9140 x 23825 Department of EconomicsFAX:(905) 521-8232 McMaster Universitye-mail: [EMAIL PROTECTED] 1280 Main St. W.,Hamilton, URL: http://www.economics.mcmaster.ca/racine/ Ontario, Canada. L8S 4M4 `The generation of random numbers is too important to be left to chance' __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Thomas
[R] Help with efficient double sum of max (X_i, Y_i) (X Y vectors)
Greetings. For R gurus this may be a no brainer, but I could not find pointers to efficient computation of this beast in past help files. Background - I wish to implement a Cramer-von Mises type test statistic which involves double sums of max(X_i,Y_j) where X and Y are vectors of differing length. I am currently using ifelse pointwise in a vector, but have a nagging suspicion that there is a more efficient way to do this. Basically, I require three sums: sum1: \sum_i\sum_j max(X_i,X_j) sum2: \sum_i\sum_j max(Y_i,Y_j) sum3: \sum_i\sum_j max(X_i,Y_j) Here is my current implementation - any pointers to more efficient computation greatly appreciated. nx - length(x) ny - length(y) sum1 - 0 sum3 - 0 for(i in 1:nx) { sum1 - sum1 + sum(ifelse(x[i]x,x[i],x)) sum3 - sum3 + sum(ifelse(x[i]y,x[i],y)) } sum2 - 0 sum4 - sum3 # symmetric and identical for(i in 1:ny) { sum2 - sum2 + sum(ifelse(y[i]y,y[i],y)) } Thanks in advance for your help. -- Jeff -- Professor J. S. Racine Phone: (905) 525 9140 x 23825 Department of EconomicsFAX:(905) 521-8232 McMaster Universitye-mail: [EMAIL PROTECTED] 1280 Main St. W.,Hamilton, URL: http://www.economics.mcmaster.ca/racine/ Ontario, Canada. L8S 4M4 `The generation of random numbers is too important to be left to chance' __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Help with efficient double sum of max (X_i, Y_i) (X Y vectors)
Jeff, Here is something which is a little faster: sum1 - sum(outer(x, x, FUN=pmax)) sum3 - sum(outer(x, y, FUN=pmax)) Best, Ravi. --- Ravi Varadhan, Ph.D. Assistant Professor, The Center on Aging and Health Division of Geriatric Medicine and Gerontology Johns Hopkins University Ph: (410) 502-2619 Fax: (410) 614-9625 Email: [EMAIL PROTECTED] Webpage: http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Jeffrey Racine Sent: Thursday, February 01, 2007 1:18 PM To: r-help@stat.math.ethz.ch Subject: [R] Help with efficient double sum of max (X_i, Y_i) (X Y vectors) Greetings. For R gurus this may be a no brainer, but I could not find pointers to efficient computation of this beast in past help files. Background - I wish to implement a Cramer-von Mises type test statistic which involves double sums of max(X_i,Y_j) where X and Y are vectors of differing length. I am currently using ifelse pointwise in a vector, but have a nagging suspicion that there is a more efficient way to do this. Basically, I require three sums: sum1: \sum_i\sum_j max(X_i,X_j) sum2: \sum_i\sum_j max(Y_i,Y_j) sum3: \sum_i\sum_j max(X_i,Y_j) Here is my current implementation - any pointers to more efficient computation greatly appreciated. nx - length(x) ny - length(y) sum1 - 0 sum3 - 0 for(i in 1:nx) { sum1 - sum1 + sum(ifelse(x[i]x,x[i],x)) sum3 - sum3 + sum(ifelse(x[i]y,x[i],y)) } sum2 - 0 sum4 - sum3 # symmetric and identical for(i in 1:ny) { sum2 - sum2 + sum(ifelse(y[i]y,y[i],y)) } Thanks in advance for your help. -- Jeff -- Professor J. S. Racine Phone: (905) 525 9140 x 23825 Department of EconomicsFAX:(905) 521-8232 McMaster Universitye-mail: [EMAIL PROTECTED] 1280 Main St. W.,Hamilton, URL: http://www.economics.mcmaster.ca/racine/ Ontario, Canada. L8S 4M4 `The generation of random numbers is too important to be left to chance' __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Help with efficient double sum of max (X_i, Y_i) (X Y vectors)
Well, a reproducible example would be nice =) not tested: x = rnorm(10) y = rnorm(20) mymax - function(t1, t2) apply(cbind(t1, t2), 1, max) sum(outer(x, y, mymax)) is this sth like what you need? b On Feb 1, 2007, at 1:18 PM, Jeffrey Racine wrote: Greetings. For R gurus this may be a no brainer, but I could not find pointers to efficient computation of this beast in past help files. Background - I wish to implement a Cramer-von Mises type test statistic which involves double sums of max(X_i,Y_j) where X and Y are vectors of differing length. I am currently using ifelse pointwise in a vector, but have a nagging suspicion that there is a more efficient way to do this. Basically, I require three sums: sum1: \sum_i\sum_j max(X_i,X_j) sum2: \sum_i\sum_j max(Y_i,Y_j) sum3: \sum_i\sum_j max(X_i,Y_j) Here is my current implementation - any pointers to more efficient computation greatly appreciated. nx - length(x) ny - length(y) sum1 - 0 sum3 - 0 for(i in 1:nx) { sum1 - sum1 + sum(ifelse(x[i]x,x[i],x)) sum3 - sum3 + sum(ifelse(x[i]y,x[i],y)) } sum2 - 0 sum4 - sum3 # symmetric and identical for(i in 1:ny) { sum2 - sum2 + sum(ifelse(y[i]y,y[i],y)) } Thanks in advance for your help. -- Jeff -- Professor J. S. Racine Phone: (905) 525 9140 x 23825 Department of EconomicsFAX:(905) 521-8232 McMaster Universitye-mail: [EMAIL PROTECTED] 1280 Main St. W.,Hamilton, URL: http://www.economics.mcmaster.ca/racine/ Ontario, Canada. L8S 4M4 `The generation of random numbers is too important to be left to chance' __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting- guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Help with efficient double sum of max (X_i, Y_i) (X Y vectors)
Jeff, you can do sum1: \sum_i\sum_j max(X_i,X_j) sum2: \sum_i\sum_j max(Y_i,Y_j) sum(x * (2 * rank(x) - 1)) sum3: \sum_i\sum_j max(X_i,Y_j) sum(outer(x, y, pmax)) Probably, the latter can be speeded up even more... Z __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.