The previous post was in error since s3 was never assigned there. The as.numeric in the calculation of s3 can be omitted if it's OK to have an integer rather than a numeric result, and in that case it is faster still.
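To illustrate the integer versus numeric point, here is a minimal sketch on a small made-up vector. The names x, s_int, s_num and s_ref are chosen only for this illustration and do not appear in the timings; the sketch assumes nothing beyond base R.

```r
# Hypothetical small input; the timings in this thread use one million values.
x <- c("a", "b", "a")
tmp <- x == "a"

s_int <- tmp - !tmp               # arithmetic on logicals gives an integer vector
s_num <- as.numeric(tmp - !tmp)   # explicit conversion to double
s_ref <- ifelse(x == "a", 1, -1)  # double, built the same way as s1

typeof(s_int)            # "integer"
typeof(s_num)            # "double"
identical(s_ref, s_num)  # TRUE
identical(s_ref, s_int)  # FALSE: same values, different storage type
all(s_ref == s_int)      # TRUE: the values themselves agree
```

So dropping as.numeric is only safe if nothing downstream relies on identical() or on the result being a double.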

> set.seed(1)
> C <- sample(c("a", "b"), 1000000, replace = TRUE)
> system.time({
+ s0 <- vector(length = length(C))
+ for(i in seq_along(C)) s0[i] <- if (C[i] == "a") 1 else -1
+ s0
+ })
   user  system elapsed
  21.32    0.02   26.10
> system.time(s1 <- ifelse(C == "a", 1, -1))
   user  system elapsed
   2.37    0.26    2.64
> system.time(s2 <- 2 * (C == "a") - 1)
   user  system elapsed
   0.32    0.02    0.35
> system.time({tmp <- C == "a"; s3 <- as.numeric(tmp - !tmp)})
   user  system elapsed
   0.28    0.02    0.31
> identical(s0, s1)
[1] TRUE
> identical(s0, s2)
[1] TRUE
> identical(s0, s3)
[1] TRUE

On 7/4/07, Gabor Grothendieck <[EMAIL PROTECTED]> wrote:
> In thinking about this a bit more I have found a slightly faster one still.
> See s3.  Also I have added s0, the original solution, to the timings.
>
> > set.seed(1)
> > C <- sample(c("a", "b"), 1000000, replace = TRUE)
> > system.time({
> + s0 <- vector(length = length(C))
> + for(i in seq_along(C)) s0[i] <- if (C[i] == "a") 1 else -1
> + s0
> + })
>    user  system elapsed
>   21.75    0.02   25.99
> > system.time(s1 <- ifelse(C == "a", 1, -1))
>    user  system elapsed
>    2.32    0.17    2.54
> > system.time(s2 <- 2 * (C == "a") - 1)
>    user  system elapsed
>    0.29    0.02    0.32
> > system.time({tmp <- C == "a"; tmp - !tmp})
>    user  system elapsed
>    0.21    0.00    0.21
> > identical(s0, s1)
> [1] TRUE
> > identical(s0, s2)
> [1] TRUE
> > identical(s0, s3)
> [1] TRUE
>
> On 7/4/07, Gabor Grothendieck <[EMAIL PROTECTED]> wrote:
> > Here are two ways.  The second way is more than 10x faster.
> >
> > > set.seed(1)
> > > C <- sample(c("a", "b"), 100000, replace = TRUE)
> > > system.time(s1 <- ifelse(C == "a", 1, -1))
> >    user  system elapsed
> >    0.37    0.01    0.38
> > > system.time(s2 <- 2 * (C == "a") - 1)
> >    user  system elapsed
> >    0.02    0.00    0.02
> > > identical(s1, s2)
> > [1] TRUE
> >
> > On 7/4/07, Keith Alan Chamberlain <[EMAIL PROTECTED]> wrote:
> > > Dear Rhelpers,
> > >
> > > Is there a faster way than below to set a vector based on values from
> > > another vector? I'd like to call a pre-existing function for this, but one
> > > which can also handle an arbitrarily large number of categories. Any
> > > ideas?
> > >
> > > Cat=c('a','a','a','b','b','b','a','a','b') # Categorical variable
> > > C1=vector(length=length(Cat)) # New vector for numeric values
> > >
> > > # Cycle through each column and set C1 to corresponding value of Cat.
> > > for(i in 1:length(C1)){
> > >   if(Cat[i]=='a') C1[i]=-1 else C1[i]=1
> > > }
> > >
> > > C1
> > > [1] -1 -1 -1  1  1  1 -1 -1  1
> > > Cat
> > > [1] "a" "a" "a" "b" "b" "b" "a" "a" "b"
> > >
> > > Sincerely,
> > > KeithC.
> > > Psych Undergrad, CU Boulder (US)
> > > RE McNair Scholar
> > >
> > > ______________________________________________
> > > R-help@stat.math.ethz.ch mailing list
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide
> > > http://www.R-project.org/posting-guide.html
> > > and provide commented, minimal, self-contained, reproducible code.