That _is_ interesting. Reduce() calls the sum function at the
interpreted level, so I would not have expected this. Can you check
whether most of the time for my "vectorized" version is spent on the
do.call(cbind, ...) part? That is what I would guess. Otherwise, this
sounds strange, since .rowSums is specifically built for speed -- or so
it says. I also assume z is as I constructed it.
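
A minimal sketch of that check, timing the cbind step separately from
rowSums(). It assumes z is rebuilt as in my earlier message; the seed and
the names t_bind, t_sum, m and s are only for illustration:

```r
# Assumption: z is rebuilt as in the earlier message
# (10,000 arrays of dimension 2 x 3 x 4).
set.seed(1)
z <- lapply(seq_len(10000), function(i) array(runif(24), dim = 2:4))

# Step 1: bind the arrays column-wise. cbind() treats each 3-d array as a
# plain vector, so every array becomes one 24-element column.
t_bind <- system.time(m <- do.call(cbind, z))

# Step 2: sum across the columns of the already-built 24 x 10000 matrix.
t_sum <- system.time(s <- rowSums(m))

print(t_bind)
print(t_sum)

# Sanity check against the Reduce() result (equal up to rounding).
ans2 <- array(s, dim = 2:4)
ans3 <- Reduce(`+`, z)
print(all.equal(ans2, ans3))
```

If t_bind dominates, the cost is in building the big matrix rather than
in rowSums() itself. Timings will of course vary by machine and R version.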

-- Bert

On Mon, Apr 16, 2012 at 3:01 PM, David Winsemius <> wrote:
> On Apr 16, 2012, at 4:32 PM, Bert Gunter wrote:
>> David:
>> Here is a comparison of the gains to be made by vectorization (again,
>> assuming I have interpreted your query correctly)
>> ## create a list of arrays
>>> z <- lapply(seq_len(10000),function(i)array(runif(24),dim=2:4))
>> ## Using an apply type approach
>>> system.time(ans1 <- array(do.call(mapply,c(sum,z)),dim=2:4))
>>  user  system elapsed
>>  0.62    0.00    0.62
>> ## vectorizing via rowSums and cbind
>>> system.time(ans2 <-array(rowSums(do.call(cbind,z)),dim=2:4))
>>  user  system elapsed
>>  0.02    0.00    0.02
>>> identical(ans1,ans2)
>> [1] TRUE
> It's an example as well for the possibility that different OSes may perform
> differently. My Mac (an early 2008 model) is nowhere near as efficient
> with the second solution, despite being in the same ballpark with the
> first:
>> system.time(ans1 <- array(do.call(mapply,c(sum,z)),dim=2:4))
>   user  system elapsed
>  0.841   0.007   0.851
>> system.time(ans2 <-array(rowSums(do.call(cbind,z)),dim=2:4))
>   user  system elapsed
>  0.132   0.003   0.145
> And on my system ....  the Reduce strategy is fastest:
>> system.time(ans3 <- Reduce("+", z) )
>   user  system elapsed
>  0.129   0.001   0.134
> And ...the Reduce() strategy would preserve other object attributes,
> something I'm quite sure the re-dimensioning of rowSums(cbind(.)) could not
> preserve.
>  L <- list( table(a, sample(a)) ,
>            table(a, sample(a)),
>            table(a, sample(a)),
>            table(a, sample(a)),
>            table(a, sample(a)) )
>  str(Reduce("+", L) )
>  'table' int [1:3, 1:3] 1 1 3 4 0 1 0 4 1
>  - attr(*, "dimnames")=List of 2
>  ..$ a: chr [1:3] "a" "b" "c"
>  ..$  : chr [1:3] "a" "b" "c"
>  str( array(rowSums(do.call(cbind,L)),dim=c(3,3))  )
>  num [1:3, 1:3] 5 5 5 5 5 5 5 5 5
> -- David.
>> Cheers,
>> Bert
>> On Mon, Apr 16, 2012 at 1:19 PM, David A Vavra <>
>> wrote:
>>> Thanks Bill,
>>> For reasons that aren't important here, I must start from a list.
>>> Computing the sum while generating the tables may be a solution but it
>>> means doing something in one piece of code that is unrelated to the
>>> surrounding code. Bad practice where I'm from. If it's needed it's
>>> needed but if I can avoid doing so, I will.
>>> I haven't done any timing but because of the extra operations of get and
>>> assign, the non-loop implementation will likely suffer. It seems you have
>>> shown this to be true.
>>> DAV
>>> -----Original Message-----
>>> From: William Dunlap []
>>> Sent: Monday, April 16, 2012 3:26 PM
>>> To: David A Vavra; 'Bert Gunter'
>>> Cc:
>>> Subject: RE: [R] Effeciently sum 3d table
>>>> Example in partial code:
>>>> Env <- CreatEnv() # my own function
>>>> Assign('final',T1-T1,envir=env)
>>>> L<-listOfTables
>>>> lapply(L,function(t) {
>>>>    final <- get('final',envir=env) + t
>>>>    assign('final',final,envir=env)
>>>>    NULL
>>>> })
>>> First, finish writing that code so it runs and you can make sure its
>>> output is ok:
>>> # list of 50,000 2x2 matrices
>>> L <- lapply(1:50000, function(i) array(i:(i+3), c(2,2)))
>>> env <- new.env()
>>> assign('final', L[[1]] - L[[1]], envir=env)
>>> junk <- lapply(L, function(t) {
>>>    final <- get('final', envir=env) + t
>>>    assign('final', final, envir=env)
>>>    NULL
>>> })
>>> get('final', envir=env)
>>> #            [,1]       [,2]
>>> # [1,] 1250025000 1250125000
>>> # [2,] 1250075000 1250175000
>>> sum( (2:50001) ) # should be final[2,1]
>>> # [1] 1250075000
>>> You asked for something less "clunky".
>>> You are fighting the system by using get() and assign(); just use
>>> ordinary expression syntax to get and set variables:
>>> final <- L[[1]]
>>> for(i in seq_along(L)[-1]) final <- final + L[[i]]
>>> final
>>> #           [,1]       [,2]
>>> # [1,] 1250025000 1250125000
>>> # [2,] 1250075000 1250175000
>>> The former took 0.22 seconds on my machine, the latter 0.06.
>>> You don't have to compute the whole list of matrices before
>>> doing the sum, just add to the current sum when you have
>>> computed one matrix and then forget about it.
>>> Bill Dunlap
>>> Spotfire, TIBCO Software
>>> wdunlap
>>>> -----Original Message-----
>>>> From: [] On Behalf Of David A Vavra
>>>> Sent: Monday, April 16, 2012 11:35 AM
>>>> To: 'Bert Gunter'
>>>> Cc:
>>>> Subject: Re: [R] Effeciently sum 3d table
>>>> Thanks Gunter,
>>>> I mean what I think is the normal definition of 'sum' as in:
>>>>   T1 + T2 + T3 + ...
>>>> It never occurred to me that there would be a question.
>>>> I have gotten the impression that a for loop is very inefficient.
>>>> Whenever I change them to lapply calls there is a noticeable
>>>> improvement in run time for whatever reason. The problem with lapply
>>>> here is that I effectively need a global table to hold the final sum.
>>>> lapply also wants to return a value.
>>>> You may be correct that in the long run, the loop is the best. There's
>>>> a lot of extraneous memory wastage holding all of the tables in a list
>>>> as well as the return 'values'.
>>>> As an alternate and given a pre-existing list of tables, I was thinking
>>>> of creating a temporary environment to hold the final result so it
>>>> could be passed globally to each lapply execution level but that seems
>>>> clunky and wasteful as well.
>>>> Example in partial code:
>>>> Env <- CreatEnv() # my own function
>>>> Assign('final',T1-T1,envir=env)
>>>> L<-listOfTables
>>>> lapply(L,function(t) {
>>>>    final <- get('final',envir=env) + t
>>>>    assign('final',final,envir=env)
>>>>    NULL
>>>> })
>>>> But I was hoping for a more elegant and hopefully more efficient
>>>> solution.
>>>> Greg's suggestion for using reduce seems in order but as yet I'm
>>>> unfamiliar with the function.
>>>> DAV
>>>> -----Original Message-----
>>>> From: Bert Gunter []
>>>> Sent: Monday, April 16, 2012 12:42 PM
>>>> To: Greg Snow
>>>> Cc: David A Vavra;
>>>> Subject: Re: [R] Effeciently sum 3d table
>>>> Define "sum" . Do you mean you want to get a single sum for each
>>>> array? -- get marginal sums for each array? -- get a single array in
>>>> which each value is the sum of all the individual values at the
>>>> position?
>>>> Due thought and consideration for those trying to help by formulating
>>>> your query carefully and concisely vastly increases the chance of
>>>> getting a useful answer. See the posting guide -- this is a skill that
>>>> needs to be learned and the guide is quite helpful. And I must
>>>> acknowledge that it is a skill that I also have not yet mastered.
>>>> Concerning your query, I would only note that the two responses from
>>>> Greg and Petr that you received are unlikely to be significantly
>>>> faster than just using loops, since both are still essentially looping
>>>> at the interpreted level. Whether either gives you what you want, I do
>>>> not know.
>>>> -- Bert
>>>> On Mon, Apr 16, 2012 at 8:53 AM, Greg Snow <> wrote:
>>>>> Look at the Reduce function.
>>>>> On Mon, Apr 16, 2012 at 8:28 AM, David A Vavra <> wrote:
>>>>>> I have a large number of 3d tables that I wish to sum.
>>>>>> Is there an efficient way to do this? Or perhaps a function I can
>>>>>> call?
>>>>>> I tried using do.call("sum",listoftables) but that returns a single
>>>>>> value.
>>>>>> So far, it seems only a loop will do the job.
>>>>>> TIA,
>>>>>> DAV
>>>> --
>>>> Bert Gunter
>>>> Genentech Nonclinical Biostatistics
>>>> Internal Contact Info:
>>>> Phone: 467-7374
>>>> Website:
>>>> atistics/pdb-ncb-home.htm
>>>> ______________________________________________
>>>> mailing list
>>>> PLEASE do read the posting guide
>>>> and provide commented, minimal, self-contained, reproducible code.
> David Winsemius, MD
> West Hartford, CT
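
For reference, a small sketch comparing the approaches discussed in this
thread on 2-d tables. The definition of a is an assumption (it is not
given in David's example; a three-level vector is consistent with the 3x3
output he shows), and the seed and the names r, f and v are only for
illustration:

```r
# Assumption: 'a' is not defined in the quoted example; a 3-level vector
# is consistent with the 3x3 tables shown there.
set.seed(42)
a <- letters[1:3]
L <- lapply(1:5, function(i) table(a, sample(a)))

# Reduce() adds element-wise and keeps the 'table' class and dimnames.
r <- Reduce(`+`, L)

# A plain loop performs the same additions in the same order.
f <- L[[1]]
for (i in seq_along(L)[-1]) f <- f + L[[i]]

# For 2-d tables cbind() binds columns instead of flattening, so flatten
# each table explicitly before summing and then restore the dimensions.
v <- array(rowSums(sapply(L, as.vector)), dim = dim(L[[1]]))

print(identical(r, f))   # loop and Reduce() agree, attributes included
print(all(v == r))       # same cell values ...
print(class(r))          # ... but only r is still a 'table'
print(class(v))
```

Note that do.call(cbind, L) on 3x3 tables yields a 3 x 15 matrix, so
rowSums() returns only three values and array() recycles them. That would
explain the all-5 result in David's output; the explicit as.vector() step
above avoids it, although the dimnames are still lost.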

