Keith Alan Chamberlain Keith.Chamberlain at Colorado.EDU writes:
Cat=c('a','a','a','b','b','b','a','a','b')# Categorical variable
C1=vector(length=length(Cat)) # New vector for numeric values
for(i in 1:length(C1)){
if(Cat[i]=='a') C1[i]=-1 else C1[i]=1
}
C1
[1] -1 -1 -1 1 1
Dear Rhelpers,
Is there a faster way than below to set a vector based on values from
another vector? I'd like to call a pre-existing function for this, but one
which can also handle an arbitrarily large number of categories. Any ideas?
Cat=c('a','a','a','b','b','b','a','a','b') #
Subject: [R] A More efficient method?
C1 <- rep(-1, length(Cat))
C1[Cat == "b"] <- 1
On Jul 4, 2007, at 9:44 AM, Keith Alan Chamberlain wrote:
Cat=c('a','a','a','b','b','b','a','a','b')# Categorical variable
C1=vector(length=length(Cat)) # New vector for numeric values
# Cycle through each column and set C1 to corresponding value of Cat.
for(i in 1:length(C1)){
if(Cat[i]=='a') C1[i]=-1 else C1[i]=1
}
C1
[1] -1 -1 -1
On 04-Jul-07 13:44:44, Keith Alan Chamberlain wrote:
Dear Rhelpers,
Is there a faster way than below to set a vector based on values
from another vector? I'd like to call a pre-existing function for
this, but one which can also handle an arbitrarily large number
of categories. Any ideas?
Here are two ways. The second way is more than 10x faster.
set.seed(1)
C <- sample(c("a", "b"), 10, replace = TRUE)
system.time(s1 <- ifelse(C == "a", 1, -1))
   user  system elapsed
   0.37    0.01    0.38
system.time(s2 <- 2 * (C == "a") - 1)
   user  system elapsed
   0.02    0.00    0.02
04/07/2007 17:17, Subject: Re: [R] A More efficient method
#Given
Cat=c('a','a','a','b','b','b','a','a','b') # Categorical variable
#and defining
coding <- array(c(-1,1), dimnames=list(unique(Cat)))
#(i.e. an array of values corresponding to your character array levels,
# with names set to those levels)
coding[Cat]
#does what you want.
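The same name-based lookup extends to more than two categories, which the original question asked about. A minimal sketch, assuming a made-up three-level variable and made-up numeric codes (the name `codes` and its values are illustrative, not from the thread):

```r
# Hypothetical data with three categories; the codes are arbitrary examples.
Cat <- c("a", "c", "b", "a", "c")

# A named numeric vector: the names are the category labels.
codes <- c(a = -1, b = 0, c = 1)

# Indexing by a character vector looks up each element by name;
# unname() drops the labels so a plain numeric vector remains.
C1 <- unname(codes[Cat])
C1
```

Adding a category then only means adding one more named entry to `codes`; no loop or nested `ifelse` is needed.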
Keith
Dear Ted,
You are correct in that factors are probably what I had in mind since I
would be using them as predictors in a regression. I didn't know the syntax
to get R to do the arithmetic.
Many thanks to everyone who replied!
Sincerely,
KeithC.
Psych Undergrad, CU Boulder (US)
RE McNair
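For readers wondering what "getting R to do the arithmetic" on a factor might look like, here is a minimal sketch. It is one plausible approach, not necessarily the one suggested in the reply being thanked, and it assumes the two-level `Cat` vector from the original question.

```r
Cat <- c("a", "a", "a", "b", "b", "b", "a", "a", "b")

# factor() sorts the levels alphabetically: "a" is level 1, "b" is level 2.
f <- factor(Cat)

# as.integer() exposes the level codes (1, 2); rescale them to -1 and 1.
C1 <- 2 * as.integer(f) - 3
C1

# For regression, the coding can instead be attached to the factor itself.
# Note that contr.sum(2) codes the first level as 1 and the second as -1,
# i.e. the opposite sign to C1 above.
contrasts(f) <- contr.sum(2)
contrasts(f)
```

With the contrasts set this way, the factor can be passed directly to a model formula and R builds the numeric column itself.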
In thinking about this a bit more I have found a slightly faster one still.
See s3. Also I have added s0, the original solution, to the timings.
set.seed(1)
C <- sample(c("a", "b"), 100, replace = TRUE)
system.time({
+ s0 <- vector(length = length(C))
+ for(i in seq_along(C)) s0[i] <- if (C[i]
This was in error since s3 was not set. The as.numeric in the calculation
of s3 can be omitted if it's OK to have an integer rather than a numeric result,
and in that case it's faster still.
set.seed(1)
C <- sample(c("a", "b"), 100, replace = TRUE)
system.time({
+ s0 <- vector(length =
[Keith Alan Chamberlain]
Is there a faster way than below to set a vector based on values
from another vector? I'd like to call a pre-existing function for
this, but one which can also handle an arbitrarily large number of
categories. Any ideas?
Cat=c('a','a','a','b','b','b','a','a','b') #
User and System are a measure of the CPU time that was consumed. Elapsed
time is the wall clock and even though they are both measured in seconds,
they are not really the same units. The reason for the difference is any
idle time that the system may have waiting for I/O to complete which does
One other thing: in a multiprocessor configuration, if your application is
making use of the additional CPUs, then
User + System > Elapsed
in some cases.
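To compare CPU time with wall-clock time directly, the components of a `system.time()` result can be pulled out by name. A small sketch, assuming an arbitrary vectorised workload (the vector `x` and its size are illustrative only):

```r
# Arbitrary workload: recode a large character vector to -1 / 1.
x <- sample(c("a", "b"), 1e5, replace = TRUE)
tm <- system.time(s <- 2 * (x == "a") - 1)

# user.self and sys.self are CPU seconds used by this R process;
# elapsed is wall-clock seconds.
tm[c("user.self", "sys.self", "elapsed")]

# CPU total; this can be below elapsed (time spent waiting) or,
# when several cores are busy, above it.
cpu <- tm[["user.self"]] + tm[["sys.self"]]
cpu
```

The actual numbers depend on the machine and its load, so none are shown here.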