Re: [R] speed up this algorithm (apply-fuction / 4D array)
here's another one - which is easier to generalize: x - array(rnorm(50 * 50 * 50 * 91, 0, 2), dim=c(50, 50, 50, 91)) y - x [,,,1:90] # decide yourself what to do with slice 91, but # 91 is not divisible by 3 system.time ({ dim (y) - c (50, 50, 50, 3, 90 %/% 3) y - aperm (y, c (4, 1:3, 5)) v2 - colMeans (y) }) User System verstrichen 0.320.080.40 (my computer is a bit slower than Bill's:) system.time (v1 - f1 (x)) User System verstrichen 0.360 0.030 0.396 Claudia Am 05.10.2011 20:24, schrieb William Dunlap: I corrected your code a bit and put it into a function, f0, to make testing easier. I also made a small dataset to make testing easier. Then I made a new function f1 which does what f0 does in a vectorized manner: x- array(rnorm(50 * 50 * 50 * 91, 0, 2), dim=c(50, 50, 50, 91)) xsmall- array(log(seq_len(2 * 2 * 2 * 91)), dim=c(2, 2, 2, 91)) f0- function(x) { data_reduced- array(0, dim=c(dim(x)[1:3], trunc(dim(x)[4]/3))) reduce- seq(1, dim(x)[4]-1, by=3) for( i in 1:length(reduce) ) { data_reduced[ , , , i]- apply(x[ , , , reduce[i] : (reduce[i]+2) ], 1:3, mean) } data_reduced } f1- function(x) { reduce- seq(1, dim(x)[4]-1, by=3) data_reduced- (x[, , , reduce] + x[, , , reduce+1] + x[, , , reduce+2]) / 3 data_reduced } The results were: system.time(v1- f1(x)) user system elapsed 0.280 0.040 0.323 system.time(v0- f0(x)) user system elapsed 73.760 0.060 73.867 all.equal(v0, v1) [1] TRUE I thought apply would already vectorize, rather than loop over every coordinate. No, you have that backwards. Use *apply functions when you cannot figure out how to vectorize. Bill Dunlap Spotfire, TIBCO Software wdunlap tibco.com -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Martin Batholdy Sent: Wednesday, October 05, 2011 10:40 AM To: R Help Subject: [R] speed up this algorithm (apply-fuction / 4D array) Hi, I have this sample-code (see above) and I was wondering wether it is possible to speed things up. What this code does is the following: x is 4D array (you can imagine it as x, y, z-coordinates and a time-coordinate). So x contains 50x50x50 data-arrays for 91 time-points. Now I want to reduce the 91 time-points. I want to merge three consecutive time points to one time-points by calculating the mean of this three time-points for every x,y,z coordinate. The reduce-sequence defines which time-points should get merged. And the apply-function in the for-loop calculates the mean of the three 3D-Arrays and puts them into a new 4D array (data_reduced). The problem is that even in this example it takes really long. I thought apply would already vectorize, rather than loop over every coordinate. But for my actual data-set it takes a really long time ... So I would be really grateful for any suggestions how to speed this up. x- array(rnorm(50 * 50 * 50 * 90, 0, 2), dim=c(50, 50, 50, 91)) data_reduced- array(0, dim=c(50, 50, 50, 90/3)) reduce- seq(1,90, 3) for( i in 1:length(reduce) ) { data_reduced[ , , , i]-apply(x[ , , , reduce[i] : (reduce[i]+3) ], 1:3, mean) } __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Claudia Beleites Spectroscopy/Imaging Institute of Photonic Technology Albert-Einstein-Str. 9 07745 Jena Germany email: claudia.belei...@ipht-jena.de phone: +49 3641 206-133 fax: +49 2641 206-399 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] speed up this algorithm (apply-fuction / 4D array)
Hi, I have this sample-code (see above) and I was wondering wether it is possible to speed things up. What this code does is the following: x is 4D array (you can imagine it as x, y, z-coordinates and a time-coordinate). So x contains 50x50x50 data-arrays for 91 time-points. Now I want to reduce the 91 time-points. I want to merge three consecutive time points to one time-points by calculating the mean of this three time-points for every x,y,z coordinate. The reduce-sequence defines which time-points should get merged. And the apply-function in the for-loop calculates the mean of the three 3D-Arrays and puts them into a new 4D array (data_reduced). The problem is that even in this example it takes really long. I thought apply would already vectorize, rather than loop over every coordinate. But for my actual data-set it takes a really long time … So I would be really grateful for any suggestions how to speed this up. x - array(rnorm(50 * 50 * 50 * 90, 0, 2), dim=c(50, 50, 50, 91)) data_reduced - array(0, dim=c(50, 50, 50, 90/3)) reduce - seq(1,90, 3) for( i in 1:length(reduce) ) { data_reduced[ , , , i]-apply(x[ , , , reduce[i] : (reduce[i]+3) ], 1:3, mean) } __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] speed up this algorithm (apply-fuction / 4D array)
I corrected your code a bit and put it into a function, f0, to make testing easier. I also made a small dataset to make testing easier. Then I made a new function f1 which does what f0 does in a vectorized manner: x - array(rnorm(50 * 50 * 50 * 91, 0, 2), dim=c(50, 50, 50, 91)) xsmall - array(log(seq_len(2 * 2 * 2 * 91)), dim=c(2, 2, 2, 91)) f0 - function(x) { data_reduced - array(0, dim=c(dim(x)[1:3], trunc(dim(x)[4]/3))) reduce - seq(1, dim(x)[4]-1, by=3) for( i in 1:length(reduce) ) { data_reduced[ , , , i] - apply(x[ , , , reduce[i] : (reduce[i]+2) ], 1:3, mean) } data_reduced } f1 - function(x) { reduce - seq(1, dim(x)[4]-1, by=3) data_reduced - (x[, , , reduce] + x[, , , reduce+1] + x[, , , reduce+2]) / 3 data_reduced } The results were: system.time(v1 - f1(x)) user system elapsed 0.280 0.040 0.323 system.time(v0 - f0(x)) user system elapsed 73.760 0.060 73.867 all.equal(v0, v1) [1] TRUE I thought apply would already vectorize, rather than loop over every coordinate. No, you have that backwards. Use *apply functions when you cannot figure out how to vectorize. Bill Dunlap Spotfire, TIBCO Software wdunlap tibco.com -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Martin Batholdy Sent: Wednesday, October 05, 2011 10:40 AM To: R Help Subject: [R] speed up this algorithm (apply-fuction / 4D array) Hi, I have this sample-code (see above) and I was wondering wether it is possible to speed things up. What this code does is the following: x is 4D array (you can imagine it as x, y, z-coordinates and a time-coordinate). So x contains 50x50x50 data-arrays for 91 time-points. Now I want to reduce the 91 time-points. I want to merge three consecutive time points to one time-points by calculating the mean of this three time-points for every x,y,z coordinate. The reduce-sequence defines which time-points should get merged. And the apply-function in the for-loop calculates the mean of the three 3D-Arrays and puts them into a new 4D array (data_reduced). The problem is that even in this example it takes really long. I thought apply would already vectorize, rather than loop over every coordinate. But for my actual data-set it takes a really long time ... So I would be really grateful for any suggestions how to speed this up. x - array(rnorm(50 * 50 * 50 * 90, 0, 2), dim=c(50, 50, 50, 91)) data_reduced - array(0, dim=c(50, 50, 50, 90/3)) reduce - seq(1,90, 3) for( i in 1:length(reduce) ) { data_reduced[ , , , i]-apply(x[ , , , reduce[i] : (reduce[i]+3) ], 1:3, mean) } __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.