On Thu, 10 Jul 2008, Henrik Bengtsson wrote:

Hi,

by replacing 'll' with 'wh' in the source code for base::which() one
gets ~20% speed up for *named logical vectors*.


The amount of speedup depends on how sparse the TRUE values are.

When the proportion of TRUEs gets small the speedup is more than twofold on my macbook. For high proportions of TRUE, the speedup is more like the 20% you cite.

HTH,

Chuck


CURRENT CODE:

which <- function(x, arr.ind = FALSE)
{
   if(!is.logical(x))
        stop("argument to 'which' is not logical")
   wh <- seq_along(x)[ll <- x & !is.na(x)]
   m <- length(wh)
   dl <- dim(x)
   if (is.null(dl) || !arr.ind) {
       names(wh) <- names(x)[ll]
   }
   ...
   wh;
}

SUGGESTED CODE: (Remove 'll' and use 'wh')

which2 <- function(x, arr.ind = FALSE)
{
   if(!is.logical(x))
        stop("argument to 'which' is not logical")
   wh <- seq_along(x)[x & !is.na(x)]
   m <- length(wh)
   dl <- dim(x)
   if (is.null(dl) || !arr.ind) {
       names(wh) <- names(x)[wh]
   }
   ...
   wh;
}

That's all.

BENCHMARKING:

# To measure both in same environment
which1 <- base::which;
environment(which1) <- globalenv();  # Needed?

N <- 1e6;
set.seed(0xbeef);
x <- sample(c(TRUE, FALSE), size=N, replace=TRUE);
names(x) <- seq_along(x);
B <- 10;
t1 <- system.time({ for (bb in 1:B) idxs1 <- which1(x); });
t2 <- system.time({ for (bb in 1:B) idxs2 <- which2(x); });
stopifnot(identical(idxs1, idxs2));
print(t1/t2);
# Fair benchmarking
t2 <- system.time({ for (bb in 1:B) idxs2 <- which2(x); });
t1 <- system.time({ for (bb in 1:B) idxs1 <- which1(x); });
print(t1/t2);
##      user    system   elapsed
##   1.283186   1.052632   1.250000

You get similar results if you put for loop outside the system.time()
call (and sum up the timings).

Cheers

Henrik

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Charles C. Berry                            (858) 534-2098
                                            Dept of Family/Preventive Medicine
E mailto:[EMAIL PROTECTED]                  UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Reply via email to