Re: [R] Effeciently sum 3d table

2012-04-16 Thread Petr Savicky
On Mon, Apr 16, 2012 at 10:28:43AM -0400, David A Vavra wrote:
> I have a large number of 3d tables that I wish to sum.
> Is there an efficient way to do this? Or perhaps a function I can call?
>
> I tried using do.call(sum,listoftables) but that returns a single value.
>
> So far, it seems only a loop will do the job.

Hi.

Use lapply(), for example

  listoftables <- list(array(1:8, dim=c(2, 2, 2)), array(2:9, dim=c(2, 2, 2)))
  lapply(listoftables, sum)

  [[1]]
  [1] 36

  [[2]]
  [1] 44

Hope this helps.

Petr Savicky.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Effeciently sum 3d table

2012-04-16 Thread Greg Snow
Look at the Reduce function.
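
A minimal sketch of what this suggestion amounts to, assuming the tables all share the same dimensions. The two small arrays are made-up stand-ins for the real data, and the name total is only illustrative:

```r
# Two hypothetical 2x2x2 tables standing in for the real data
listoftables <- list(array(1:8, dim = c(2, 2, 2)),
                     array(2:9, dim = c(2, 2, 2)))

# Reduce() folds the list with "+", i.e. ((T1 + T2) + T3) + ...
total <- Reduce(`+`, listoftables)

dim(total)       # same shape as each input table
total[1, 1, 1]   # 1 + 2
```

The result is a single table of cell-by-cell sums rather than one number, which is the difference from do.call(sum, listoftables).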

On Mon, Apr 16, 2012 at 8:28 AM, David A Vavra dava...@verizon.net wrote:
> I have a large number of 3d tables that I wish to sum.
> Is there an efficient way to do this? Or perhaps a function I can call?
>
> I tried using do.call(sum,listoftables) but that returns a single value.
>
> So far, it seems only a loop will do the job.
>
>
> TIA,
> DAV




-- 
Gregory (Greg) L. Snow Ph.D.
538...@gmail.com



Re: [R] Effeciently sum 3d table

2012-04-16 Thread Bert Gunter
Define "sum". Do you mean you want to get a single sum for each
array? -- get marginal sums for each array? -- get a single array in
which each value is the sum of all the individual values at the
position?
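
The three readings can be told apart with a small sketch. The arrays and the names per_array, margins and cellwise are assumptions made up for illustration, not taken from the original question:

```r
tabs <- list(array(1:8, dim = c(2, 2, 2)),
             array(2:9, dim = c(2, 2, 2)))

# (a) a single sum for each array
per_array <- sapply(tabs, sum)

# (b) marginal sums for each array, here over the first dimension
margins <- lapply(tabs, function(a) apply(a, 1, sum))

# (c) one array in which each cell is the sum of the cells at that position
cellwise <- Reduce(`+`, tabs)
```

Reading (a) gives one number per table, (b) gives one vector per table, and (c) gives one table of the original shape.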

Due thought and consideration for those trying to help by formulating
your query carefully and concisely vastly increases the chance of
getting a useful answer. See the posting guide -- this is a skill that
needs to be learned and the guide is quite helpful. And I must
acknowledge that it is a skill that I also have not yet mastered.

Concerning your query, I would only note that the two responses from
Greg and Petr that you received are unlikely to be significantly
faster than just using loops, since both are still essentially looping
at the interpreted level. Whether either gives you what you want, I do
not know.

-- Bert


-- 

Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm



Re: [R] Effeciently sum 3d table

2012-04-16 Thread David A Vavra
Thanks Gunter,

I mean what I think is the normal definition of 'sum' as in: 
   T1 + T2 + T3 + ...
It never occurred to me that there would be a question.

I have gotten the impression that a for loop is very inefficient. Whenever I
change them to lapply calls there is a noticeable improvement in run time
for whatever reason. The problem with lapply here is that I effectively need
a global table to hold the final sum. lapply also  wants to return a value.

You may be correct that in the long run, the loop is the best. There's a lot
of extraneous memory wastage holding all of the tables in a list as well as
the return 'values'.

As an alternative, and given a pre-existing list of tables, I was thinking of
creating a temporary environment to hold the final result so it could be
passed globally to each lapply execution level but that seems clunky and
wasteful as well. 

Example in partial code:

env <- CreatEnv() # my own function
assign('final', T1 - T1, envir=env) # start from a table of zeroes
L <- listOfTables

lapply(L, function(t) {
  final <- get('final', envir=env) + t
  assign('final', final, envir=env)
  NULL
})

But I was hoping for a more elegant and hopefully more efficient solution.
Greg's suggestion for using reduce seems in order but as yet I'm unfamiliar
with the function.

DAV





Re: [R] Effeciently sum 3d table

2012-04-16 Thread David A Vavra
Thanks Petr,

I'm after T1 + T2 + T3 + ... and your solution is giving a list of n items
each containing sum(T[i]). I guess I should have been clearer in stating
what I need.

Cheers,
DAV 





Re: [R] Effeciently sum 3d table

2012-04-16 Thread David A Vavra
Thanks Greg,

I think this may be what I'm after but the documentation for it isn't
particularly clear. I hate it when someone documents a piece of code saying
it works kinda like some other code (running elsewhere, of course) making
the tacit assumption that everybody will immediately know what that means
and implies. 

I'm sure I'll understand it once I know what it is trying to say. :) There's
an item in the examples which may be exactly what I'm after.

DAV




Re: [R] Effeciently sum 3d table

2012-04-16 Thread Bert Gunter
David:

1. My first name is Bert.

2. > It never occurred to me that there would be a question.
Indeed. But in fact you got solutions for two different
interpretations (Greg's is what you wanted). That is what I meant when
I said that clarity in asking the question is important.

3. > I have gotten the impression that a for loop is very inefficient.
> Whenever I change them to lapply calls there is a noticeable improvement
> in run time for whatever reason.
I'd like to see your data on this. My experience is that they are
typically comparable. Chambers in his "Software for Data Analysis"
book says (p. 213), of apply-type functions rather than explicit
loops: "The computation should run faster... However, none of the
apply mechanisms changes the number of times the supplied function is
called, so serious improvements will be limited to iterating simple
calculations many times."

4. You can get serious improvements by vectorizing; and you can do
that here, if I understand correctly, because all your arrays have
identical dim = d. Here's how:

## assume your list of arrays is in listoftables

alldat <- do.call(cbind, listoftables) ## one column per array; this might be the slow part
ans <- array(rowSums(alldat), dim = d)

See ?rowSums for explanations and caveats, especially with NA's .
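
A small sketch of this approach, checked against Reduce(), assuming every array has the same dim. The three toy arrays are made up, and flattening with as.vector() is just one explicit way of getting one column per array:

```r
listoftables <- list(array(1:8,  dim = c(2, 2, 2)),
                     array(2:9,  dim = c(2, 2, 2)),
                     array(3:10, dim = c(2, 2, 2)))
d <- dim(listoftables[[1]])

# One column per array, one row per cell position
alldat <- do.call(cbind, lapply(listoftables, as.vector))

# rowSums() does the adding in compiled code; reshape back to the original dim
ans <- array(rowSums(alldat), dim = d)

# Same values as the element-wise fold
same <- isTRUE(all.equal(ans, Reduce(`+`, listoftables)))

# The NA caveat: by default a missing cell makes that position NA,
# while na.rm = TRUE silently treats it as zero
with_na <- alldat
with_na[1, 1] <- NA
na_default <- rowSums(with_na)[1]
na_removed <- rowSums(with_na, na.rm = TRUE)[1]
```

Note that rowSums() returns doubles even for integer input, so the comparison uses all.equal() rather than identical().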

Cheers,
Bert



Re: [R] Effeciently sum 3d table

2012-04-16 Thread David Winsemius


On Apr 16, 2012, at 2:43 PM, David A Vavra wrote:

> Thanks Petr,
>
> I'm after T1 + T2 + T3 + ...

Which would be one number ... i.e. the result you originally said you
did not want.

> and your solution is giving a list of n items
> each containing sum(T[i]). I guess I should have been clearer in
> stating what I need.

Or even now you _could_ be clearer. Do you want successive partial
sums? That would yield to:


Reduce("+", listoftables, accumulate=TRUE)
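
For illustration, a sketch of what accumulate=TRUE returns, using three made-up arrays (the names partial and final are assumptions):

```r
listoftables <- list(array(1:8,  dim = c(2, 2, 2)),
                     array(2:9,  dim = c(2, 2, 2)),
                     array(3:10, dim = c(2, 2, 2)))

# A list as long as the input: T1, T1 + T2, T1 + T2 + T3
partial <- Reduce(`+`, listoftables, accumulate = TRUE)

# The last element is the overall cell-by-cell sum
final <- partial[[length(partial)]]
```

Without accumulate=TRUE only the last of these, the overall sum, is returned.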







David Winsemius, MD
West Hartford, CT



Re: [R] Effeciently sum 3d table

2012-04-16 Thread William Dunlap
> Example in partial code:
>
> env <- CreatEnv() # my own function
> assign('final',T1-T1,envir=env)
> L<-listOfTables
>
> lapply(L,function(t) {
>   final <- get('final',envir=env) + t
>   assign('final',final,envir=env)
>   NULL
> })

First, finish writing that code so it runs and you can make sure its
output is ok:

L <- lapply(1:50000, function(i) array(i:(i+3), c(2,2))) # list of 50,000 2x2 matrices
env <- new.env()
assign('final', L[[1]] - L[[1]], envir=env)
junk <- lapply(L, function(t) {
  final <- get('final', envir=env) + t
  assign('final', final, envir=env)
  NULL
})
get('final', envir=env)
#            [,1]       [,2]
# [1,] 1250025000 1250125000
# [2,] 1250075000 1250175000
sum( (2:50001) ) # should be final[2,1]
# [1] 1250075000

You asked for something less clunky.
You are fighting the system by using get() and assign(); just use
ordinary expression syntax to get and set variables:
final <- L[[1]]
for(i in seq_along(L)[-1]) final <- final + L[[i]]
final
#            [,1]       [,2]
# [1,] 1250025000 1250125000
# [2,] 1250075000 1250175000

The former took 0.22 seconds on my machine, the latter 0.06.

You don't have to compute the whole list of matrices before
doing the sum, just add to the current sum when you have
computed one matrix and then forget about it.
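
A sketch of that idea, with make_table() as a purely hypothetical stand-in for whatever code produces each table:

```r
# make_table() is a made-up placeholder for the real table-producing code
make_table <- function(i) array(i:(i + 7), dim = c(2, 2, 2))

n <- 1000
final <- make_table(1)
for (i in seq_len(n)[-1]) {
  # add the new table to the running sum; the temporary is then discarded
  final <- final + make_table(i)
}
```

Only the running sum and one temporary table exist at any time, so memory use does not grow with n.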

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com



Re: [R] Effeciently sum 3d table

2012-04-16 Thread David A Vavra
Bert,

My apologies on the name.

I haven't kept any data on loop times. I don't know why lapply seems faster
but the difference is quite noticeable. It has struck me as odd. I would
have thought lapply would be slower. It has taken an effort to change my
thinking to force fit solutions to it but I've gotten used to it. As of now
I reserve loops to times when there are only a few iterations (as in 10) and
to solutions that require passing large amounts of information among
iterations. lapply is particularly handy when constructing lists.

As for vectorizing, see the code below. Note that it uses mapply but that
simply may have made implementation easier. However, if vectorizing gives an
improvement over looping, the mapply may be the reason.

> f <- function(x,y,z) catn("do something")
> Vectorize(f,c('x','y'))
function (x, y, z) 
{
    args <- lapply(as.list(match.call())[-1L], eval, parent.frame())
    names <- if (is.null(names(args))) 
        character(length(args))
    else names(args)
    dovec <- names %in% vectorize.args
    do.call("mapply", c(FUN = FUN, args[dovec], MoreArgs = list(args[!dovec]), 
        SIMPLIFY = SIMPLIFY, USE.NAMES = USE.NAMES))
}
<environment: 0x7fb3442553c8>

DAV



Re: [R] Effeciently sum 3d table

2012-04-16 Thread David Winsemius


On Apr 16, 2012, at 3:26 PM, David Winsemius wrote:

> On Apr 16, 2012, at 2:43 PM, David A Vavra wrote:
>
>> Thanks Petr,
>>
>> I'm after T1 + T2 + T3 + ...
>
> Which would be one number ... i.e. the result you originally said
> you did not want.
>
>> and your solution is giving a list of n items
>> each containing sum(T[i]). I guess I should have been clearer in
>> stating what I need.
>
> Or even now you _could_ be clearer. Do you want successive partial
> sums? That would yield to:
>
> Reduce("+", listoftables, accumulate=TRUE)

If Dunlap's interpretation is correct then consider this:

L <- lapply(1:5, function(i) array(i:(i+7), c(2,2,2)))
system.time({final <- L[[1]]
for(i in seq_along(L)[-1]) final <- final + L[[i]]
final}  )
#   user  system elapsed
#  0.179   0.002   0.187

system.time(Reduce("+", L))
#   user  system elapsed
#  0.150   0.002   0.157

identical(Reduce("+", L), final)
# [1] TRUE










David Winsemius, MD
West Hartford, CT



Re: [R] Effeciently sum 3d table

2012-04-16 Thread David A Vavra
> even now you _could_ be clearer

I fail to see why it's unclear.

>> I'm after T1 + T2 + T3 + ...
> Which would be one number ... i.e. the result you originally said you
> did not want.

I think it's precisely what I want. If I have two 3d tables, T1 and T2, then
say either
1) T1 + T2
2) T1 - T2
(1) yields a third table equal to the sum of the individual cells and (2)
yields a table of the cell-by-cell differences (all zeroes when T2 equals
T1). At least it does for matrices. Are you saying the T1+T2+T3+... above
is equivalent to:

   sum(T1)+sum(T2)+sum(T3)+...

when the table has more than two dimensions? I tried it out by hand and I
get the result I'm after. What I want is a general solution. Reduce may be
the answer but I find the documentation for it a bit daunting. Not to
mention that it is far from obvious that I should have originally thought
of using it.
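
The distinction can be checked on two made-up 3d arrays (S and Z are illustrative names only):

```r
T1 <- array(1:8, dim = c(2, 2, 2))
T2 <- array(2:9, dim = c(2, 2, 2))

S <- T1 + T2    # cell-by-cell sum, still a 2x2x2 array
Z <- T1 - T1    # same shape, every cell zero

grand <- sum(T1) + sum(T2)   # one number: the total over all cells
```

So "+" keeps the table shape in any number of dimensions, while sum() collapses everything to a single value; sum(S) and grand agree, but S itself carries the per-cell results.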

DAV



-Original Message-
From: David Winsemius [mailto:dwinsem...@comcast.net] 
Sent: Monday, April 16, 2012 3:26 PM
To: David A Vavra
Cc: 'Petr Savicky'; r-help@r-project.org
Subject: Re: [R] Effeciently sum 3d table


On Apr 16, 2012, at 2:43 PM, David A Vavra wrote:

 Thanks Petr,

 I'm after T1 + T2 + T3 + ...

Which would be one number ... i.e. the result you originally said you  
did not want.

 and your solution is giving a list of n items
 each containing sum(T[i]). I guess I should have been clearer in  
 stating
 what I need.

Or even now you _could_ be clearer. Do you want successive partial  
sums? That would yield to:

Reduce(+, listoftables, accumaulate=TRUE)





 Cheers,
 DAV   



 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org 
 ] On
 Behalf Of Petr Savicky
 Sent: Monday, April 16, 2012 11:07 AM
 To: r-help@r-project.org
 Subject: Re: [R] Effeciently sum 3d table

 On Mon, Apr 16, 2012 at 10:28:43AM -0400, David A Vavra wrote:
 I have a large number of 3d tables that I wish to sum
 Is there an efficient way to do this? Or perhaps a function I can  
 call?

 I tried using do.call(sum,listoftables) but that returns a single  
 value.


 So far, it seems only a loop will do the job.

 Hi.

 Use lapply(), for example

  listoftables - list(array(1:8, dim=c(2, 2, 2)), array(2:9,  
 dim=c(2, 2,
 2)))
  lapply(listoftables, sum)

  [[1]]
  [1] 36

  [[2]]
  [1] 44

 Hope this helps.

 Petr Savicky.


David Winsemius, MD
West Hartford, CT



Re: [R] Effeciently sum 3d table

2012-04-16 Thread Bert Gunter
For purposes of clarity only...

On Mon, Apr 16, 2012 at 12:40 PM, David A Vavra dava...@verizon.net wrote:
 Bert,

 My apologies on the name.

 I haven't kept any data on loop times. I don't know why lapply seems faster
 but the difference is quite noticeable. It has struck me as odd. I would
 have thought lapply would be slower. It has taken an effort to change my
 thinking to force fit solutions to it but I've gotten used to it. As of now
 I reserve loops to times when there are only a few iterations (as in 10) and
 to solutions that require passing large amounts of information among
 iterations. lapply is particularly handy when constructing lists.

 As for vectorizing, see the code below.

No. Despite the name, this is **not** what I mean by vectorization.
What I mean is pushing the loops down to the C level rather than doing
them at the interpreted level, which is where your code below still
leaves you.
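
A rough sketch of the distinction, assuming two made-up numeric vectors (x, y and their length are illustrative only). Both lines compute the same numbers, but the first calls an R function once per element pair, while the second is a single call whose loop over elements runs in compiled code:

```r
x <- runif(1e5)
y <- runif(1e5)

# Interpreted-level iteration: one R function call per element pair
slow <- mapply(function(a, b) a + b, x, y)

# Vectorized: one call, the element loop happens in compiled code
fast <- x + y

all.equal(slow, fast)  # same values either way
# Wrapping each assignment in system.time() shows the gap on a given machine
```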

-- Bert

 Note that it uses mapply but that
 simply may have made implementation easier. However, if vectorizing gives an
 improvement over looping, the mapply may be the reason.

 f <- function(x,y,z) catn("do something")
 Vectorize(f,c('x','y'))
 function (x, y, z)
 {
    args <- lapply(as.list(match.call())[-1L], eval, parent.frame())
    names <- if (is.null(names(args)))
        character(length(args))
    else names(args)
    dovec <- names %in% vectorize.args
    do.call("mapply", c(FUN = FUN, args[dovec], MoreArgs =
 list(args[!dovec]),
        SIMPLIFY = SIMPLIFY, USE.NAMES = USE.NAMES))
 }
 <environment: 0x7fb3442553c8>

 DAV


 -Original Message-
 From: Bert Gunter [mailto:gunter.ber...@gene.com]
 Sent: Monday, April 16, 2012 3:07 PM
 To: David A Vavra
 Cc: r-help@r-project.org
 Subject: Re: [R] Effeciently sum 3d table

 David:

 1. My first name is Bert.

 2. "It never occurred to me that there would be a question."
 Indeed. But in fact you got solutions for two different
 interpretations (Greg's is what you wanted). That is what I meant when
 I said that clarity in asking the question is important.

 3. "I have gotten the impression that a for loop is very inefficient.
 Whenever I
 change them to lapply calls there is a noticeable improvement in run time
 for whatever reason."
 I'd like to see your data on this. My experience is that they are
 typically comparable. Chambers in his "Software for Data Analysis"
 book says (p. 213) (with apply type functions rather than explicit
 loops): "The computation should run faster... However, none of the
 apply mechanisms changes the number of times the supplied function is
 called, so serious improvements will be limited to iterating simple
 calculations many times."

 4. You can get serious improvements by vectorizing; and you can do
 that here, if I understand correctly, because all your arrays have
 identical dim = d. Here's how:

 ## assume your list of arrays is in listoftables

 alldat <- do.call(cbind,listoftables) ## this might be the slow part
 ans <- array(rowSums(alldat), dim = d)

 See ?rowSums for explanations and caveats, especially with NA's .

 Cheers,
 Bert

 On Mon, Apr 16, 2012 at 11:35 AM, David A Vavra dava...@verizon.net wrote:
 Thanks Gunter,

 I mean what I think is the normal definition of 'sum' as in:
   T1 + T2 + T3 + ...
 It never occurred to me that there would be a question.

 I have gotten the impression that a for loop is very inefficient. Whenever
 I
 change them to lapply calls there is a noticeable improvement in run time
 for whatever reason. The problem with lapply here is that I effectively
 need
 a global table to hold the final sum. lapply also  wants to return a
 value.

 You may be correct that in the long run, the loop is the best. There's a
 lot
 of extraneous memory wastage holding all of the tables in a list as well
 as
 the return 'values'.

 As an alternate and given a pre-existing list of tables, I was thinking of
 creating a temporary environment to hold the final result so it could be
 passed globally to each lapply execution level but that seems clunky and
 wasteful as well.

 Example in partial code:

 Env <- CreatEnv() # my own function
 Assign('final',T1-T1,envir=env)
 L<-listOfTables

 lapply(L,function(t) {
        final <- get('final',envir=env) + t
        assign('final',final,envir=env)
        NULL
 })

 But I was hoping for a more elegant and hopefully more efficient solution.
 Greg's suggestion for using reduce seems in order but as yet I'm
 unfamiliar
 with the function.

 DAV




Re: [R] Effeciently sum 3d table

2012-04-16 Thread David A Vavra
Thanks Bill,

 

For reasons that aren't important here, I must start from a list. Computing
the sum while generating the tables may be a solution but it means doing
something in one piece of code that is unrelated to the surrounding code.
Bad practice where I'm from. If it's needed it's needed but if I can avoid
doing so, I will. 

 

I haven't done any timing but because of the extra operations of get and
assign, the non-loop implementation will likely suffer. It seems you have
shown this to be true.

 

DAV



 

-Original Message-
From: William Dunlap [mailto:wdun...@tibco.com] 
Sent: Monday, April 16, 2012 3:26 PM
To: David A Vavra; 'Bert Gunter'
Cc: r-help@r-project.org
Subject: RE: [R] Effeciently sum 3d table

 

 Example in partial code:

 Env <- CreatEnv() # my own function
 Assign('final',T1-T1,envir=env)
 L<-listOfTables

 lapply(L,function(t) {
     final <- get('final',envir=env) + t
     assign('final',final,envir=env)
     NULL
 })

First, finish writing that code so it runs and you can make sure its
output is ok:

L <- lapply(1:50000, function(i) array(i:(i+3), c(2,2))) # list of 50,000 2x2 matrices
env <- new.env()
assign('final', L[[1]] - L[[1]], envir=env)
junk <- lapply(L, function(t) {
    final <- get('final', envir=env) + t
    assign('final', final, envir=env)
    NULL
})
get('final', envir=env)
#            [,1]       [,2]
# [1,] 1250025000 1250125000
# [2,] 1250075000 1250175000
sum( (2:50001) ) # should be final[2,1]
# [1] 1250075000

You asked for something less clunky.
You are fighting the system by using get() and assign(), just use
ordinary expression syntax to get and set variables:

final <- L[[1]]
for(i in seq_along(L)[-1]) final <- final + L[[i]]
final
#            [,1]       [,2]
# [1,] 1250025000 1250125000
# [2,] 1250075000 1250175000

The former took 0.22 seconds on my machine, the latter 0.06.

You don't have to compute the whole list of matrices before
doing the sum, just add to the current sum when you have
computed one matrix and then forget about it.
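
A sketch of that running-sum idea, assuming a hypothetical generator function make_table() that stands in for whatever code produces each table (the name and its body are invented for this example):

```r
# Hypothetical generator standing in for the code that produces table i
make_table <- function(i) array(i:(i + 3), c(2, 2))

n <- 50000
final <- make_table(1)
for (i in seq_len(n)[-1]) {
  final <- final + make_table(i)  # add, then let the temporary table go
}

final[2, 1] == sum(2:50001)  # same check as used for the list version
```

Only one table plus the running total is alive at any time, so the full list never has to be held in memory.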

 

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

 

 


Re: [R] Effeciently sum 3d table

2012-04-16 Thread Bert Gunter
David:

Here is a comparison of the gains to be made by vectorization (again,
assuming I have interpreted your query correctly)

## create a list of arrays
> z <- lapply(seq_len(1),function(i)array(runif(24),dim=2:4))
## Using an apply type approach
> system.time(ans1 <- array(do.call(mapply,c(sum,z)),dim=2:4))
   user  system elapsed
   0.62    0.00    0.62
## vectorizing via rowSums and cbind
> system.time(ans2 <- array(rowSums(do.call(cbind,z)),dim=2:4))
   user  system elapsed
   0.02    0.00    0.02
> identical(ans1,ans2)
[1] TRUE
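
A variant of the same idea, offered only as a sketch: instead of cbind, the tables can be stacked along an extra dimension and summed with the dims argument of rowSums. The list length of 1000 and the seed are assumptions made for this example, not figures from the timing above.

```r
set.seed(1)  # arbitrary seed, only so the sketch is repeatable
z <- lapply(seq_len(1000), function(i) array(runif(24), dim = 2:4))
d <- dim(z[[1]])

# Stack the tables along a new last dimension: a 2 x 3 x 4 x 1000 array
stacked <- array(unlist(z), dim = c(d, length(z)))

# Sum over the last dimension; the first length(d) dimensions are kept
ans_rowsums <- rowSums(stacked, dims = length(d))

# Same result from pairwise addition at the interpreted level
ans_reduce <- Reduce(`+`, z)

all.equal(ans_rowsums, ans_reduce)
```

all.equal() is used rather than identical() because the two routes may add the numbers in a different order, which can change the last few bits of a floating-point result.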

Cheers,
Bert




Re: [R] Effeciently sum 3d table

2012-04-16 Thread William Dunlap
I generally prefer the list approach too.  I only mentioned that you didn't
need to have a list of inputs before starting the summation because
you said
   "There's a lot
   of extraneous memory wastage holding all of the tables in a list as well as
   the return 'values'."
I guess I misinterpreted that sentence.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com


Re: [R] Effeciently sum 3d table

2012-04-16 Thread David A Vavra
OK. I'll take your word for it. The mapply function calls do_mapply so I
would have thought it is passing the operation down to the C code. I haven't
tracked it any further than below.

> mapply
function (FUN, ..., MoreArgs = NULL, SIMPLIFY = TRUE, USE.NAMES = TRUE) 
{
    FUN <- match.fun(FUN)
    dots <- list(...)
    answer <- .Call("do_mapply", FUN, dots, MoreArgs, environment(), 
        PACKAGE = "base")

... etc.



Re: [R] Effeciently sum 3d table

2012-04-16 Thread Bert Gunter
On Mon, Apr 16, 2012 at 1:39 PM, David A Vavra dava...@verizon.net wrote:
 OK. I'll take your word for it. The mapply function calls do_mapply so I
 would have thought it is passing the operation down to the C code. I haven't
 tracked it any further than below.

No, they can't. Function evaluation must take place at the interpreted
level. However, don't take my word -- take Chambers's.
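
One way to see this for oneself, sketched with a hypothetical wrapper counted_sum() (the wrapper and the two small arrays are invented for the illustration): count how often the R function is entered when mapply drives it.

```r
calls <- 0L
counted_sum <- function(...) {
  calls <<- calls + 1L   # record every entry into the R function
  sum(...)
}

# Two made-up 2 x 3 x 4 arrays whose cells add up to 25 everywhere
z <- list(array(1:24, dim = 2:4), array(24:1, dim = 2:4))
ans <- array(do.call(mapply, c(counted_sum, z)), dim = 2:4)

calls             # one call per cell, however mapply iterates internally
all(ans == 25L)
```

The iteration over cells is driven from C, but each cell still costs one evaluation of the supplied R function, which is where the time goes.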

-- Bert



Re: [R] Effeciently sum 3d table

2012-04-16 Thread David Winsemius


On Apr 16, 2012, at 4:04 PM, David A Vavra wrote:


even now you _could_ be clearer


I fail to see why it's unclear.


I'm after T1 + T2 + T3 + ...

Which would be one number ... i.e. the result you originally said you
did not want.


I think it's precisely what I want. If I have two 3d tables, T1 and  
T2, then

say either
1) T1 + T2
2) T1 - T2
(1) yields a third table equal to the sum of the individual cells  
and (2)
yields a table full of zeroes. At least it does for matrices. Are  
you saying

the T1+T2+T3+... above is equivalent to:

  sum(T1)+sum(T2)+sum(T3)+

when the table has more than 2d? I tried it out by hand and I get the
result I'm

after.


For me (with my slightly constricted mindset) it would have been
clearer to have started out talking about matrices and arrays.  An
example would have saved a bunch of time.



What I want is a general solution. Reduce may be the answer but I
find the documentation for it a bit daunting. Not to mention that it  
is far

from obvious that I should have originally thought of using it.


It is a function designed to do exactly what you requested: Reduce  
uses a binary function to successively combine the elements of a given  
vector. As it turns out the term 'vector' in this case includes lists  
of classed and/or dimensioned objects rather than being restricted to  
atomic vectors.
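
To make the distinction in this thread concrete, here is a minimal sketch (the two small arrays are made-up stand-ins for the real tables, not data from the original post). It contrasts the single number that do.call(sum, ...) returns with the cell-wise table that Reduce returns:

```r
# Two small 3-d arrays standing in for the tables (made-up data).
listoftables <- list(array(1:8, dim = c(2, 2, 2)),
                     array(2:9, dim = c(2, 2, 2)))

# do.call(sum, ...) collapses every cell of every table into one number.
grand_total <- do.call(sum, listoftables)

# Reduce() applies `+` to the elements pairwise, so the result keeps
# the 2 x 2 x 2 shape: each cell is the sum of the matching cells.
cellwise <- Reduce(`+`, listoftables)

dim(cellwise)        # 2 2 2
cellwise[1, 1, 1]    # 1 + 2 = 3
grand_total          # 36 + 44 = 80
```

The same call should work unchanged for any number of tables in the list, provided they all share the same dimensions.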


--
David.



DAV



-Original Message-
From: David Winsemius [mailto:dwinsem...@comcast.net]
Sent: Monday, April 16, 2012 3:26 PM
To: David A Vavra
Cc: 'Petr Savicky'; r-help@r-project.org
Subject: Re: [R] Effeciently sum 3d table


On Apr 16, 2012, at 2:43 PM, David A Vavra wrote:


Thanks Petr,

I'm after T1 + T2 + T3 + ...


Which would be one number ... i.e. the result you originally said you
did not want.


and your solution is giving a list of n items
each containing sum(T[i]). I guess I should have been clearer in
stating
what I need.


Or even now you _could_ be clearer. Do you want successive partial
sums? That would be:

Reduce("+", listoftables, accumulate=TRUE)
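
As a small sketch of what the accumulate argument gives back (the constant-valued arrays are invented for illustration), the result is a list of running sums whose last element is the overall total:

```r
# Three constant 3-d arrays (made-up data) so the running sums are easy to read.
listoftables <- list(array(1, dim = c(2, 2, 2)),
                     array(2, dim = c(2, 2, 2)),
                     array(3, dim = c(2, 2, 2)))

# accumulate = TRUE keeps every intermediate result:
# partial[[1]] is T1, partial[[2]] is T1 + T2, partial[[3]] is T1 + T2 + T3.
partial <- Reduce(`+`, listoftables, accumulate = TRUE)

length(partial)            # one entry per input table
partial[[2]][1, 1, 1]      # 1 + 2 = 3

# The final element is the same table a plain Reduce() call would return.
total <- partial[[length(partial)]]
```

If only the final table is wanted, leaving accumulate at its default of FALSE avoids holding all the intermediate tables.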






Cheers,
DAV 



-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of Petr Savicky
Sent: Monday, April 16, 2012 11:07 AM
To: r-help@r-project.org
Subject: Re: [R] Effeciently sum 3d table

On Mon, Apr 16, 2012 at 10:28:43AM -0400, David A Vavra wrote:

I have a large number of 3d tables that I wish to sum
Is there an efficient way to do this? Or perhaps a function I can
call?

I tried using do.call(sum,listoftables) but that returns a single
value.




So far, it seems only a loop will do the job.


Hi.

Use lapply(), for example

listoftables <- list(array(1:8, dim=c(2, 2, 2)), array(2:9, dim=c(2, 2, 2)))
lapply(listoftables, sum)

[[1]]
[1] 36

[[2]]
[1] 44

Hope this helps.

Petr Savicky.



David Winsemius, MD
West Hartford, CT



David Winsemius, MD
West Hartford, CT



Re: [R] Effeciently sum 3d table

2012-04-16 Thread Greg Snow
Here is a simple example:

 mylist <- replicate(4, matrix(rnorm(12), ncol=3), simplify=FALSE)
 A <- Reduce( `+`, mylist )
 B <- mylist[[1]] + mylist[[2]] + mylist[[3]] + mylist[[4]]
 all.equal(A,B)
[1] TRUE

Basically what Reduce does is it first applies the function (`+` in
this case) to the 1st 2 elements of mylist, then applies it to that
result and the 3rd element, then that result and the 4th element (and
would continue on if mylist had more than 4 elements).  It is
basically a way to create functions like sum from functions like `+`
which only work on 2 objects at a time.
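
The stepwise description above can be sketched as a hand-written left fold. The helper name fold_left is hypothetical and this is only an illustration of the idea, not how Reduce is actually implemented:

```r
# Hypothetical helper: combine the elements of a list from left to right
# with a two-argument function f. Not the real implementation of Reduce().
fold_left <- function(f, x) {
  acc <- x[[1]]                  # start with the first element
  for (item in x[-1]) {          # then fold in the 2nd, 3rd, ... elements
    acc <- f(acc, item)
  }
  acc
}

# Small integer matrices (made-up data) so the comparison is exact.
mylist <- list(matrix(1:6,   ncol = 3),
               matrix(7:12,  ncol = 3),
               matrix(13:18, ncol = 3))

same <- identical(fold_left(`+`, mylist), Reduce(`+`, mylist))
same
```

Reduce also supports options this sketch ignores, such as an initial value, folding from the right, and accumulate.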

Another way to see what it is doing is to run something like:

 Reduce( function(a,b){ cat("I am adding",a,"and",b,"\n"); a+b }, 1:10 )

The Reduce function will probably not be any faster than a really well
written loop, but will probably be faster (both to write the command
and to run) than a poorly designed naive loop application.


On Mon, Apr 16, 2012 at 12:52 PM, David A Vavra dava...@verizon.net wrote:
 Thanks Greg,

 I think this may be what I'm after but the documentation for it isn't
 particularly clear. I hate it when someone documents a piece of code saying
 it works kinda like some other code (running elsewhere, of course) making
 the tacit assumption that everybody will immediately know what that means
 and implies.

 I'm sure I'll understand it once I know what it is trying to say. :) There's
 an item in the examples which may be exactly what I'm after.

 DAV


 -Original Message-
 From: Greg Snow [mailto:538...@gmail.com]
 Sent: Monday, April 16, 2012 11:54 AM
 To: David A Vavra
 Cc: r-help@r-project.org
 Subject: Re: [R] Effeciently sum 3d table

 Look at the Reduce function.

 On Mon, Apr 16, 2012 at 8:28 AM, David A Vavra dava...@verizon.net wrote:
 I have a large number of 3d tables that I wish to sum
 Is there an efficient way to do this? Or perhaps a function I can call?

 I tried using do.call(sum,listoftables) but that returns a single value.

 So far, it seems only a loop will do the job.


 TIA,
 DAV


 --
 Gregory (Greg) L. Snow Ph.D.
 538...@gmail.com




-- 
Gregory (Greg) L. Snow Ph.D.
538...@gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Effeciently sum 3d table

2012-04-16 Thread David Winsemius


On Apr 16, 2012, at 5:41 PM, Greg Snow wrote:


Here is a simple example:


mylist <- replicate(4, matrix(rnorm(12), ncol=3), simplify=FALSE)
A <- Reduce( `+`, mylist )
B <- mylist[[1]] + mylist[[2]] + mylist[[3]] + mylist[[4]]
all.equal(A,B)

[1] TRUE

Basically what Reduce does is it first applies the function (`+` in
this case) to the 1st 2 elements of mylist, then applies it to that
result and the 3rd element, then that result and the 4th element (and
would continue on if mylist had more than 4 elements).  It is
basically a way to create functions like sum from functions like `+`
which only work on 2 objects at a time.

Another way to see what it is doing is to run something like:

Reduce( function(a,b){ cat("I am adding",a,"and",b,"\n"); a+b }, 1:10 )


The Reduce function will probably not be any faster than a really well
written loop, but will probably be faster (both to write the command
and to run) than a poorly designed naive loop application.



It's faster on my machine (but only fractionally) but it has the as  
yet unremarked-upon advantage that it will preserve attributes of the  
tables such as dimnames.


 system.time(ans1 <- array(do.call(mapply,c(sum,z)),dim=2:4))
   user  system elapsed
  0.841   0.007   0.851
 system.time(ans2 <- array(rowSums(do.call(cbind,z)),dim=2:4))
   user  system elapsed
  0.132   0.003   0.145

And on my system the Reduce strategy wins by a hair:

 system.time(ans3 <- Reduce("+", z) )
   user  system elapsed
  0.129   0.001   0.134

--
(the other) David.


On Mon, Apr 16, 2012 at 12:52 PM, David A Vavra dava...@verizon.net wrote:

Thanks Greg,

I think this may be what I'm after but the documentation for it isn't
particularly clear. I hate it when someone documents a piece of code saying
it works kinda like some other code (running elsewhere, of course) making
the tacit assumption that everybody will immediately know what that means
and implies.

I'm sure I'll understand it once I know what it is trying to say. :) There's
an item in the examples which may be exactly what I'm after.

DAV


-Original Message-
From: Greg Snow [mailto:538...@gmail.com]
Sent: Monday, April 16, 2012 11:54 AM
To: David A Vavra
Cc: r-help@r-project.org
Subject: Re: [R] Effeciently sum 3d table

Look at the Reduce function.

On Mon, Apr 16, 2012 at 8:28 AM, David A Vavra dava...@verizon.net wrote:

I have a large number of 3d tables that I wish to sum
Is there an efficient way to do this? Or perhaps a function I can call?

I tried using do.call(sum,listoftables) but that returns a single value.

So far, it seems only a loop will do the job.


TIA,
DAV



--
Gregory (Greg) L. Snow Ph.D.
538...@gmail.com





--
Gregory (Greg) L. Snow Ph.D.
538...@gmail.com



David Winsemius, MD
West Hartford, CT



Re: [R] Effeciently sum 3d table

2012-04-16 Thread David Winsemius


On Apr 16, 2012, at 4:32 PM, Bert Gunter wrote:


David:

Here is a comparison of the gains to be made by vectorization (again,
assuming I have interpreted your query correctly)

## create a list of arrays

z <- lapply(seq_len(1),function(i)array(runif(24),dim=2:4))

## Using an apply type approach

system.time(ans1 <- array(do.call(mapply,c(sum,z)),dim=2:4))

  user  system elapsed
  0.62    0.00    0.62
## vectorizing via rowSums and cbind

system.time(ans2 <- array(rowSums(do.call(cbind,z)),dim=2:4))

  user  system elapsed
  0.02    0.00    0.02

identical(ans1,ans2)

[1] TRUE



It's an example as well of the possibility that different OSes may
perform differently. My Mac (an early 2008 model) is nowhere near as
efficient with the second solution, despite being in the same
ballpark with the first:


 system.time(ans1 <- array(do.call(mapply,c(sum,z)),dim=2:4))
   user  system elapsed
  0.841   0.007   0.851
 system.time(ans2 <- array(rowSums(do.call(cbind,z)),dim=2:4))
   user  system elapsed
  0.132   0.003   0.145

And on my system the Reduce strategy is fastest:

 system.time(ans3 <- Reduce("+", z) )
   user  system elapsed
  0.129   0.001   0.134

And ...the Reduce() strategy would preserve other object attributes,  
something I'm quite sure the re-dimensioning of rowSums(cbind(.))  
could not preserve.


 a <- letters[1:3]  # assumption: `a` was not defined in the post; three labels match the output below
 L <- list( table(a, sample(a)),
            table(a, sample(a)),
            table(a, sample(a)),
            table(a, sample(a)),
            table(a, sample(a)) )

 str(Reduce("+", L) )
 'table' int [1:3, 1:3] 1 1 3 4 0 1 0 4 1
 - attr(*, "dimnames")=List of 2
  ..$ a: chr [1:3] "a" "b" "c"
  ..$  : chr [1:3] "a" "b" "c"

 str( array(rowSums(do.call(cbind,L)),dim=c(3,3))  )
 num [1:3, 1:3] 5 5 5 5 5 5 5 5 5


-- David.



Cheers,
Bert



On Mon, Apr 16, 2012 at 1:19 PM, David A Vavra dava...@verizon.net wrote:

Thanks Bill,

For reasons that aren't important here, I must start from a list. Computing
the sum while generating the tables may be a solution but it means doing
something in one piece of code that is unrelated to the surrounding code.
Bad practice where I'm from. If it's needed it's needed but if I can avoid
doing so, I will.

I haven't done any timing but because of the extra operations of get and
assign, the non-loop implementation will likely suffer. It seems you have
shown this to be true.



DAV





-Original Message-
From: William Dunlap [mailto:wdun...@tibco.com]
Sent: Monday, April 16, 2012 3:26 PM
To: David A Vavra; 'Bert Gunter'
Cc: r-help@r-project.org
Subject: RE: [R] Effeciently sum 3d table




Example in partial code:

Env <- CreatEnv() # my own function
Assign('final',T1-T1,envir=env)
L <- listOfTables

lapply(L,function(t) {
    final <- get('final',envir=env) + t
    assign('final',final,envir=env)
    NULL
})




First, finish writing that code so it runs and you can make sure its
output is ok:

L <- lapply(1:50000, function(i) array(i:(i+3), c(2,2))) # list of 50,000 2x2 matrices
env <- new.env()
assign('final', L[[1]] - L[[1]], envir=env)
junk <- lapply(L, function(t) {
    final <- get('final', envir=env) + t
    assign('final', final, envir=env)
    NULL
})
get('final', envir=env)
#            [,1]       [,2]
# [1,] 1250025000 1250125000
# [2,] 1250075000 1250175000
sum( (2:50001) ) # should be final[2,1]
# [1] 1250075000



You asked for something less clunky.
You are fighting the system by using get() and assign(), just use
ordinary expression syntax to get and set variables:

final <- L[[1]]
for(i in seq_along(L)[-1]) final <- final + L[[i]]
final
#            [,1]       [,2]
# [1,] 1250025000 1250125000
# [2,] 1250075000 1250175000

The former took 0.22 seconds on my machine, the latter 0.06.

You don't have to compute the whole list of matrices before
doing the sum, just add to the current sum when you have
computed one matrix and then forget about it.
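
A minimal sketch of that last suggestion, assuming the tables come from some generator function (make_table here is a hypothetical stand-in, and n is kept small for illustration):

```r
# Hypothetical generator standing in for whatever code produces each table.
make_table <- function(i) array(i:(i + 3), c(2, 2))

n <- 1000   # number of tables; an arbitrary choice for this sketch

# Keep only the running sum; each new table is added and then discarded,
# so the full list of tables never has to exist in memory.
final <- make_table(1)
for (i in seq_len(n)[-1]) {
  final <- final + make_table(i)
}

final[1, 1]   # sum(1:1000) = 500500
```

This gives the same table as building the whole list first and reducing it, but it does tie the summing to the code that generates the tables.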



Bill Dunlap

Spotfire, TIBCO Software

wdunlap tibco.com






-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
On Behalf Of David A Vavra
Sent: Monday, April 16, 2012 11:35 AM
To: 'Bert Gunter'
Cc: r-help@r-project.org
Subject: Re: [R] Effeciently sum 3d table

Thanks Gunter,

I mean what I think is the normal definition of 'sum' as in:

   T1 + T2 + T3 + ...

It never occurred to me that there would be a question.

I have gotten the impression that a for loop is very inefficient. Whenever I
change them to lapply calls there is a noticeable improvement in run time
for whatever reason. The problem with lapply here is that I effectively need
a global table to hold the final sum. lapply also wants to return a value.

You may be correct that in the long run, the loop is the best. There's a lot
of extraneous memory wastage holding all of the tables in a list as well as
the return 'values'.

As an alternate

Re: [R] Effeciently sum 3d table

2012-04-16 Thread Bert Gunter
That _is_ interesting. Reduce() calls the sum function at the
interpreted level, so I would not expect this. Can you check whether
most of the time for my vectorized version is spent on the
do.call(cbind ...) part, which is what I would guess? Otherwise, this
sounds strange, since .rowSums is specifically built for speed (so
it says). I also assume z is as I constructed.
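
One way to check this might be to time the two steps of the vectorized version separately. This is only a sketch: the list length of 1000 is an assumption (the size used for the timings in this thread is not recoverable here), and the numbers will vary by machine and R version.

```r
# Rebuild a list of 2 x 3 x 4 arrays; the length 1000 is an assumed value.
z <- lapply(seq_len(1000), function(i) array(runif(24), dim = 2:4))

# Time the binding step and the summing step of the vectorized version
# separately, then time Reduce() on the same list.
t_bind <- system.time(m <- do.call(cbind, z))   # 24 x 1000 matrix
t_sum  <- system.time(s <- rowSums(m))          # 24 cell totals
t_red  <- system.time(r <- Reduce(`+`, z))

# Both routes should agree up to floating-point rounding.
ans2 <- array(s, dim = 2:4)
all.equal(ans2, r)

# Elapsed seconds for each step.
c(cbind   = t_bind[["elapsed"]],
  rowSums = t_sum[["elapsed"]],
  Reduce  = t_red[["elapsed"]])
```

For steps this short, repeating each call many times (or using a much longer list) would give more stable timings than a single system.time call.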

-- Bert



On Mon, Apr 16, 2012 at 3:01 PM, David Winsemius dwinsem...@comcast.net wrote:

 On Apr 16, 2012, at 4:32 PM, Bert Gunter wrote:

 David:

 Here is a comparison of the gains to be made by vectorization (again,
 assuming I have interpreted your query correctly)

 ## create a list of arrays

 z <- lapply(seq_len(1),function(i)array(runif(24),dim=2:4))

 ## Using an apply type approach

 system.time(ans1 <- array(do.call(mapply,c(sum,z)),dim=2:4))

  user  system elapsed
  0.62    0.00    0.62
 ## vectorizing via rowSums and cbind

 system.time(ans2 <- array(rowSums(do.call(cbind,z)),dim=2:4))

  user  system elapsed
  0.02    0.00    0.02

 identical(ans1,ans2)

 [1] TRUE


 It's an example as well of the possibility that different OSes may perform
 differently. My Mac (an early 2008 model) is nowhere near as efficient
 with the second solution, despite being in the same ballpark with the
 first:

 system.time(ans1 <- array(do.call(mapply,c(sum,z)),dim=2:4))
   user  system elapsed
  0.841   0.007   0.851
 system.time(ans2 <- array(rowSums(do.call(cbind,z)),dim=2:4))
   user  system elapsed
  0.132   0.003   0.145

 And on my system the Reduce strategy is fastest:

 system.time(ans3 <- Reduce("+", z) )
   user  system elapsed
  0.129   0.001   0.134

 And ...the Reduce() strategy would preserve other object attributes,
 something I'm quite sure the re-dimensioning of rowSums(cbind(.)) could not
 preserve.

  a <- letters[1:3]  # assumption: `a` was not defined in the post; three labels match the output below
  L <- list( table(a, sample(a)),
             table(a, sample(a)),
             table(a, sample(a)),
             table(a, sample(a)),
             table(a, sample(a)) )

  str(Reduce("+", L) )
  'table' int [1:3, 1:3] 1 1 3 4 0 1 0 4 1
  - attr(*, "dimnames")=List of 2
   ..$ a: chr [1:3] "a" "b" "c"
   ..$  : chr [1:3] "a" "b" "c"

  str( array(rowSums(do.call(cbind,L)),dim=c(3,3))  )
  num [1:3, 1:3] 5 5 5 5 5 5 5 5 5


 -- David.


 Cheers,
 Bert



 On Mon, Apr 16, 2012 at 1:19 PM, David A Vavra dava...@verizon.net wrote:

 Thanks Bill,

 For reasons that aren't important here, I must start from a list. Computing
 the sum while generating the tables may be a solution but it means doing
 something in one piece of code that is unrelated to the surrounding code.
 Bad practice where I'm from. If it's needed it's needed but if I can avoid
 doing so, I will.

 I haven't done any timing but because of the extra operations of get and
 assign, the non-loop implementation will likely suffer. It seems you have
 shown this to be true.



 DAV





 -Original Message-
 From: William Dunlap [mailto:wdun...@tibco.com]
 Sent: Monday, April 16, 2012 3:26 PM
 To: David A Vavra; 'Bert Gunter'
 Cc: r-help@r-project.org
 Subject: RE: [R] Effeciently sum 3d table



 Example in partial code:

 Env <- CreatEnv() # my own function
 Assign('final',T1-T1,envir=env)
 L <- listOfTables

 lapply(L,function(t) {
    final <- get('final',envir=env) + t
    assign('final',final,envir=env)
    NULL
 })




 First, finish writing that code so it runs and you can make sure its
 output is ok:

 L <- lapply(1:50000, function(i) array(i:(i+3), c(2,2))) # list of 50,000 2x2 matrices
 env <- new.env()
 assign('final', L[[1]] - L[[1]], envir=env)
 junk <- lapply(L, function(t) {
    final <- get('final', envir=env) + t
    assign('final', final, envir=env)
    NULL
 })
 get('final', envir=env)
 #            [,1]       [,2]
 # [1,] 1250025000 1250125000
 # [2,] 1250075000 1250175000
 sum( (2:50001) ) # should be final[2,1]
 # [1] 1250075000



 You asked for something less clunky.
 You are fighting the system by using get() and assign(), just use
 ordinary expression syntax to get and set variables:

 final <- L[[1]]
 for(i in seq_along(L)[-1]) final <- final + L[[i]]
 final
 #            [,1]       [,2]
 # [1,] 1250025000 1250125000
 # [2,] 1250075000 1250175000

 The former took 0.22 seconds on my machine, the latter 0.06.

 You don't have to compute the whole list of matrices before
 doing the sum, just add to the current sum when you have
 computed one matrix and then forget about it.



 Bill Dunlap

 Spotfire, TIBCO Software

 wdunlap tibco.com





 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
 On Behalf Of David A Vavra
 Sent: Monday, April 16, 2012 11:35 AM
 To: 'Bert Gunter'
 Cc: r-help@r-project.org
 Subject: Re: [R] Effeciently sum 3d table

 Thanks Gunter,

 I mean what I think is the normal definition of 'sum' as in:

   T1 + T2 + T3 + ...

 It never occurred to me that there would be a question.

 I have gotten the impression

Re: [R] Effeciently sum 3d table

2012-04-16 Thread David A Vavra
OK, then. Thanks. I've read the docs more carefully and Reduce does indeed
look like the ticket. For whatever reason, the first time I looked at the
documentation my initial reaction was: huh?

DAV


-Original Message-
From: David Winsemius [mailto:dwinsem...@comcast.net] 
Sent: Monday, April 16, 2012 4:55 PM
To: David A Vavra
Cc: r-help@r-project.org
Subject: Re: [R] Effeciently sum 3d table


On Apr 16, 2012, at 4:04 PM, David A Vavra wrote:

 even now you _could_ be clearer

 I fail to see why it's unclear.

 I'm after T1 + T2 + T3 + ...
 Which would be one number ... i.e. the result you originally said you
 did not want.

 I think it's precisely what I want. If I have two 3d tables, T1 and T2, then
 say either
   1) T1 + T2
   2) T1 - T2
 (1) yields a third table equal to the sum of the individual cells and (2)
 yields a table full of zeroes. At least it does for matrices. Are you saying
 the T1+T2+T3+... above is equivalent to:

   sum(T1)+sum(T2)+sum(T3)+

 when the table has more than 2d? I tried it out by hand and I get the result
 I'm after.

For me (with my slightly constricted mindset) it would have been
clearer to have started out talking about matrices and arrays.  An
example would have saved a bunch of time.

 What I want is a general solution. Reduce may be the answer but I
 find the documentation for it a bit daunting. Not to mention that it is far
 from obvious that I should have originally thought of using it.

It is a function designed to do exactly what you requested: Reduce  
uses a binary function to successively combine the elements of a given  
vector. As it turns out the term 'vector' in this case includes lists  
of classed and/or dimensioned objects rather than being restricted to  
atomic vectors.

-- 
David.


 DAV



 -Original Message-
 From: David Winsemius [mailto:dwinsem...@comcast.net]
 Sent: Monday, April 16, 2012 3:26 PM
 To: David A Vavra
 Cc: 'Petr Savicky'; r-help@r-project.org
 Subject: Re: [R] Effeciently sum 3d table


 On Apr 16, 2012, at 2:43 PM, David A Vavra wrote:

 Thanks Petr,

 I'm after T1 + T2 + T3 + ...

 Which would be one number ... i.e. the result you originally said you
 did not want.

 and your solution is giving a list of n items
 each containing sum(T[i]). I guess I should have been clearer in
 stating
 what I need.

 Or even now you _could_ be clearer. Do you want successive partial
 sums? That would be:

 Reduce("+", listoftables, accumulate=TRUE)





 Cheers,
 DAV  



 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
 Behalf Of Petr Savicky
 Sent: Monday, April 16, 2012 11:07 AM
 To: r-help@r-project.org
 Subject: Re: [R] Effeciently sum 3d table

 On Mon, Apr 16, 2012 at 10:28:43AM -0400, David A Vavra wrote:
 I have a large number of 3d tables that I wish to sum
 Is there an efficient way to do this? Or perhaps a function I can
 call?

 I tried using do.call(sum,listoftables) but that returns a single
 value.


 So far, it seems only a loop will do the job.

 Hi.

 Use lapply(), for example

 listoftables <- list(array(1:8, dim=c(2, 2, 2)), array(2:9, dim=c(2, 2, 2)))
 lapply(listoftables, sum)

 [[1]]
 [1] 36

 [[2]]
 [1] 44

 Hope this helps.

 Petr Savicky.


 David Winsemius, MD
 West Hartford, CT


David Winsemius, MD
West Hartford, CT



Re: [R] Effeciently sum 3d table

2012-04-16 Thread David A Vavra
Thanks again, Greg. I must have gotten up on the wrong side of the keyboard
this morning and been having a spate of dim insight. What you've said here
makes things clearer.

DAV


-Original Message-
From: Greg Snow [mailto:538...@gmail.com] 
Sent: Monday, April 16, 2012 5:42 PM
To: David A Vavra
Cc: r-help@r-project.org
Subject: Re: [R] Effeciently sum 3d table

Here is a simple example:

 mylist <- replicate(4, matrix(rnorm(12), ncol=3), simplify=FALSE)
 A <- Reduce( `+`, mylist )
 B <- mylist[[1]] + mylist[[2]] + mylist[[3]] + mylist[[4]]
 all.equal(A,B)
[1] TRUE

Basically what Reduce does is it first applies the function (`+` in
this case) to the 1st 2 elements of mylist, then applies it to that
result and the 3rd element, then that result and the 4th element (and
would continue on if mylist had more than 4 elements).  It is
basically a way to create functions like sum from functions like `+`
which only work on 2 objects at a time.

Another way to see what it is doing is to run something like:

 Reduce( function(a,b){ cat("I am adding",a,"and",b,"\n"); a+b }, 1:10 )

The Reduce function will probably not be any faster than a really well
written loop, but will probably be faster (both to write the command
and to run) than a poorly designed naive loop application.


On Mon, Apr 16, 2012 at 12:52 PM, David A Vavra dava...@verizon.net wrote:
 Thanks Greg,

 I think this may be what I'm after but the documentation for it isn't
 particularly clear. I hate it when someone documents a piece of code saying
 it works kinda like some other code (running elsewhere, of course) making
 the tacit assumption that everybody will immediately know what that means
 and implies.

 I'm sure I'll understand it once I know what it is trying to say. :) There's
 an item in the examples which may be exactly what I'm after.

 DAV


 -Original Message-
 From: Greg Snow [mailto:538...@gmail.com]
 Sent: Monday, April 16, 2012 11:54 AM
 To: David A Vavra
 Cc: r-help@r-project.org
 Subject: Re: [R] Effeciently sum 3d table

 Look at the Reduce function.

 On Mon, Apr 16, 2012 at 8:28 AM, David A Vavra dava...@verizon.net wrote:
 I have a large number of 3d tables that I wish to sum
 Is there an efficient way to do this? Or perhaps a function I can call?

 I tried using do.call(sum,listoftables) but that returns a single value.

 So far, it seems only a loop will do the job.


 TIA,
 DAV


 --
 Gregory (Greg) L. Snow Ph.D.
 538...@gmail.com




-- 
Gregory (Greg) L. Snow Ph.D.
538...@gmail.com
