Prof Brian Ripley wrote:
On Thu, 2 Jun 2005, hadley wickham wrote:
An environment is a hash table, and copying occurs only on
modification,
when any language would have to copy in this context.
Caveat: the default is new.env(hash=FALSE), so an environment is a
hash table in one sense but not necessarily so in the strict sense.
Yes, I'm aware that copying only occurs on modification. However, it
was my understanding that
a <- list(a =1)
a$b <- 4
would create a new copy of a,
Thomas was talking about environments, not lists (and $ works
differently for them).
whereas in Java say
HashMap a = new HashMap();
a.put("a", 1);
a.put("b", 2);
wouldn't create a copy of a. (please bear in mind my java syntax is
very rusty!)
Caching data implies updating at each step, thus possibly creating n
copies of the list. Is that wrong?
It depends what you mean by `copy'. If you expand a hash table you at
some point need to re-arrange the hashing of the entries. That's what
will happen with an R environment or an R list too. The possible
difference is that you might expect the table to have some room for
expansion, and in your list example you did not give any.
R's operations on lists make more copies than are strictly necessary,
but it is not clear that this is more costly than the housekeeping
that would otherwise be necessary. In a$b <- 4, the wrapper VECSXP is
recreated and the pointers copied across, but the list elements are
not copied. For
a$a <- 4 it is probable that no copying is done (although if a2 <- a
had been done previously, the pending recursive copy would then be done).
(It is also possible that I have overlooked something in the rather
complex code used to do these subassignments.)
The original poster asked for caching of results. here is an example
using new.env():
memo <- function(fun) {
mem <- new.env()
function(x) {
if (exists(as.character(x), envir=mem)) get(as.character(x),
envir=mem, inherits=FALSE) else {
val <- fun(x)
assign(as.character(x), value=val, envir=mem)
val }
}
}
> fib <- memo(function(n) if(n<=1) 1 else fib(n-1)+fib(n-2))
> system.time( fib(300) )
[1] 0.01 0.00 0.02 NA NA
ls(get("mem", env=environment(fib)))
*output supressed*
To compare:
system.time( {fib2 <- function(n)if(n<=1)1 else
fib2(n-1)+fib2(n-2);fib2(30)})
[1] 8.07 0.08 12.75 NA NA
(there is (at least) one problem with this solution: if the global
workspace contains a
variable `6`, it gives error. Why?)
Kjetil
--
Kjetil Halvorsen.
Peace is the most effective weapon of mass construction.
-- Mahdi Elmandjra
--
No virus found in this outgoing message.
Checked by AVG Anti-Virus.
______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html