-----Original Message-----
From: David Winsemius [mailto:dwinsem...@comcast.net]
Sent: Thursday, December 8, 2016 4:59 PM
To: John P. Nolan <jpno...@american.edu>
Cc: Charles C. Berry <R-devel@r-project.org>
Subject: Re: [Rd] wish list: generalized apply

> On Dec 8, 2016, at 12:09 PM, John P. Nolan <jpno...@american.edu> wrote:
>
> Dear All,
>
> I regularly want to "apply" some function to an array in a way that the
> arguments to the user function depend on the index on which the apply is
> working. A simple example is:
>
> A <- array( runif(160), dim=c(5,4,8) )
> x <- matrix( runif(32), nrow=4, ncol=8 )
> b <- runif(8)
> f1 <- function( A, x, b ) { sum( A %*% x ) + b }
> result <- rep(0.0,8)
> for (i in 1:8) { result[i] <- f1( A[,,i], x[,i] , b[i] ) }
>
> This works, but is slow. I'd like to be able to do something like:
> generalized.apply( A, MARGIN=3, FUN=f1, list(x=x,MARGIN=2),
> list(b=b,MARGIN=1) ), where the lists tell generalized.apply to pass x[,i]
> and b[i] to FUN in addition to A[,,i].
>
> Does such a generalized.apply already exist somewhere? While I can write a C
> function to do a particular case, it would be nice if there was a fast,
> general way to do this.

I would have thought that this would achieve the same result:

result <- sapply( seq_along(b) , function(i) { f1( A[,,i], x[,i] , b[i] )} )

Or:

result <- sapply( seq.int( dim(A)[3] ) , function(i) { f1( A[,,i], x[,i] , b[i] )} )

(I doubt it will be any faster, but if 'i' is large, parallelism might help. The inner function appears to be fairly efficient.)

-- 
David Winsemius
Alameda, CA, USA

====================================================================================

Thanks for the response. I gave a toy example with 8 iterations to illustrate the point, so I thought I would bump it up to make my point about speed. But to my surprise, using a 'for' loop is FASTER than using 'sapply' as David suggests, or even 'apply' on a slightly simpler problem.
Here is the example:

n <- 800000; m <- 10; k <- 10
A <- array( 1:(m*n*k), dim=c(m,k,n) )
y <- matrix( 1:(k*n), nrow=k, ncol=n )
b <- 1:n
f1 <- function( A, y, b ) { sum( A %*% y ) + b }

# use a for loop
time1 <- system.time( {
  result <- rep(0.0,n)
  for (i in 1:n) { result[i] <- f1( A[,,i], y[,i] , b[i] ) }
  result
} )

# use sapply
time2 <- system.time( result2 <- sapply( seq.int( dim(A)[3] ) , function(i) { f1( A[,,i], y[,i] , b[i] )} ))

# fix y and b, and use standard apply
time3 <- system.time( result3 <- apply( A, MARGIN=3, FUN=f1, y=y[,1], b=b[1] ) )

# user times, then ratios of user times
c( time1[1], time2[1],time3[1]); c( time2[1]/time1[1], time3[1]/time1[1] )
# 4.84 5.22 5.32
# 1.078512 1.099174

So using a for loop saves 8-10% of the execution time as compared to sapply and apply!? Years ago I experimented and found out I could speed things up noticeably by replacing loops with apply. This is no longer the case, at least in this simple experiment. Is this a result of byte code? Can someone tell us when a for loop is going to be slower than using apply? A more complicated loop that computes multiple quantities?

John

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
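
For readers who want to try the interface proposed in the original post, here is a minimal sketch in base R. The names generalized_apply and slice_along are hypothetical (they are not part of base R or any package mentioned in this thread), and the sketch assumes FUN returns a single number and that no array has a length-1 dimension that must be preserved. It is only a convenience wrapper around an R-level loop via vapply, so there is no reason to expect it to be faster than the 'for' loop timed above.

```r
# Hypothetical helper: take the i-th slice of `a` along dimension `margin`.
# Plain vectors (no dim attribute) are simply indexed by i.
slice_along <- function(a, margin, i) {
  if (is.null(dim(a))) return(a[i])
  idx <- rep(list(TRUE), length(dim(a)))   # TRUE selects a whole dimension
  idx[[margin]] <- i
  do.call(`[`, c(list(a), idx, list(drop = TRUE)))
}

# Hypothetical sketch of the proposed interface. Each element of `...` is a
# list whose first entry is a named array and whose MARGIN entry says which
# dimension of that array runs in step with MARGIN of X.
generalized_apply <- function(X, MARGIN, FUN, ...) {
  extras <- list(...)
  arg_names <- vapply(extras, function(e) names(e)[1], character(1))
  vapply(seq_len(dim(X)[MARGIN]), function(i) {
    args <- lapply(extras, function(e) slice_along(e[[1]], e[["MARGIN"]], i))
    names(args) <- arg_names
    do.call(FUN, c(list(slice_along(X, MARGIN, i)), args))
  }, numeric(1))   # assumes FUN returns one numeric value
}

# Usage with the toy data from the original post
set.seed(1)
A <- array(runif(160), dim = c(5, 4, 8))
x <- matrix(runif(32), nrow = 4, ncol = 8)
b <- runif(8)
f1 <- function(A, x, b) { sum(A %*% x) + b }

res <- generalized_apply(A, MARGIN = 3, FUN = f1,
                         list(x = x, MARGIN = 2), list(b = b, MARGIN = 1))
```

The call passes A[,,i], x[,i] and b[i] to f1 for each i, so res should match the result of the explicit loop in the original post.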