[R] How do I vectorize this loop....

2009-10-21 Thread chipmaney

Basically I need to use the following data to calculate a squared error for
each Sample based on the expected Survival for the zone.

Basically, this code has Live/Dead for each sample, and I need to calculate
the squared error based on the Expected Mean (i.e., Survival).  The code looks
up the expectation for each zone and applies it to each sample in the zone
using a loop:

Data1 <- data.frame(Sample=1:6, Live=c(24,25,30,31,22,23),
                    Baseline=c(35,35,35,32,34,33),
                    Zone=c(rep("Cottonwood",3),rep("OregonAsh",3)))

Data2 <- data.frame(Zone=c("Cottonwood","OregonAsh"), Survival=c(0.83,0.76))

for (i in 1:nrow(Data1)) #(Yi -Ybar*Yo)^2
  Data1$SquaredError[i] <- (Data1$Live[i] -
    Data2$Survival[which(Data1$Zone[i]==Data2$Zone)]*Data1$Baseline[i])^2


My question is, can I vectorize this code to avoid the loop?  Obviously, I
could merge the 2 datasets first, but that would still require 2 steps and
Data1 would have a bunch of redundant data.  So, is there a better
alternative?  Is there some way I can improve indexing syntax efficiency by
using rownames instead of a column vector?



-- 
View this message in context: 
http://www.nabble.com/How-do-I-vectorize-this-loop-tp26000933p26000933.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How do I vectorize this loop....

2009-10-21 Thread William Dunlap


Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com  

> -----Original Message-----
> From: r-help-boun...@r-project.org
> [mailto:r-help-boun...@r-project.org] On Behalf Of chipmaney
> Sent: Wednesday, October 21, 2009 2:58 PM
> To: r-help@r-project.org
> Subject: [R] How do I vectorize this loop
>
> Basically I need to use the following data to calculate a squared error for
> each Sample based on the expected Survival for the zone.
>
> Basically, this code has Live/Dead for each sample, and I need to calculate
> the squared error based on the Expected Mean (i.e., Survival).  The code looks
> up the expectation for each zone and applies it to each sample in the zone
> using a loop:
>
> Data1 <- data.frame(Sample=1:6, Live=c(24,25,30,31,22,23),
>                     Baseline=c(35,35,35,32,34,33),
>                     Zone=c(rep("Cottonwood",3),rep("OregonAsh",3)))
>
> Data2 <- data.frame(Zone=c("Cottonwood","OregonAsh"), Survival=c(0.83,0.76))
>
> for (i in 1:nrow(Data1)) #(Yi -Ybar*Yo)^2
>   Data1$SquaredError[i] <- (Data1$Live[i] -
>     Data2$Survival[which(Data1$Zone[i]==Data2$Zone)]*Data1$Baseline[i])^2
>
> My question is, can I vectorize this code to avoid the loop?

Does the following do what you want?
   > (Data1$Live - Data2$Survival[match(Data1$Zone,Data2$Zone)]*Data1$Baseline)^2
   [1] 25.5025 16.4025  0.9025 44.6224 14.7456  4.3264
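
For reference, the match() lookup above can be written out as a self-contained sketch that stores the result as a column. The data frames are the ones from the question; the idx variable and the stopifnot() guard against zones missing from Data2 are additions for illustration, not part of the original reply.

```r
Data1 <- data.frame(Sample = 1:6,
                    Live = c(24, 25, 30, 31, 22, 23),
                    Baseline = c(35, 35, 35, 32, 34, 33),
                    Zone = c(rep("Cottonwood", 3), rep("OregonAsh", 3)))
Data2 <- data.frame(Zone = c("Cottonwood", "OregonAsh"),
                    Survival = c(0.83, 0.76))

# For each row of Data1, match() gives the row of Data2 with the same Zone.
idx <- match(Data1$Zone, Data2$Zone)

# Added guard (an assumption about what you want): stop if any zone in
# Data1 has no entry in Data2, since match() would return NA for it.
stopifnot(!anyNA(idx))

# One vectorized expression replaces the loop: (Yi - Ybar*Yo)^2
Data1$SquaredError <- (Data1$Live - Data2$Survival[idx] * Data1$Baseline)^2
```

Because match() returns one index per row of Data1, Data2$Survival[idx] is already aligned with Data1 and no merge is needed.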
  
> Obviously, I could merge the 2 datasets first, but that would still
> require 2 steps and Data1 would have a bunch of redundant data.

Why worry about a bunch of 'redundant' data?  Its space is
freed after it is no longer referenced, so if you do the
processing in a function and don't put all the temporarily
needed data in a permanent dataset it won't take up space
for long.  If you are running into the memory limits of your
hardware then you have to work harder to avoid using more
memory than you need.  (Note that vec[which(logical)] generally
gives the same result as vec[logical] but uses more space doing
so.)
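
A rough sketch of the "do the processing in a function" idea, assuming the data frames from the question. The helper name squared_error and its argument names are hypothetical, and the realignment step assumes Sample is unique. The last few lines show the one case where vec[which(logical)] and vec[logical] differ, which is an NA in the logical index.

```r
Data1 <- data.frame(Sample = 1:6,
                    Live = c(24, 25, 30, 31, 22, 23),
                    Baseline = c(35, 35, 35, 32, 34, 33),
                    Zone = c(rep("Cottonwood", 3), rep("OregonAsh", 3)))
Data2 <- data.frame(Zone = c("Cottonwood", "OregonAsh"),
                    Survival = c(0.83, 0.76))

# Hypothetical helper (the name is an assumption, not from the thread).
squared_error <- function(samples, expected) {
  # 'merged' exists only inside this call, so the duplicated Survival
  # values can be garbage-collected once the function returns.
  merged <- merge(samples, expected, by = "Zone")
  # merge() does not promise the original row order; realign on Sample,
  # which is assumed to be unique.
  merged <- merged[match(samples$Sample, merged$Sample), ]
  (merged$Live - merged$Survival * merged$Baseline)^2
}

# Data1 gains only the result column, not the merged-in Survival values.
Data1$SquaredError <- squared_error(Data1, Data2)

# Where vec[which(logical)] and vec[logical] differ: NA in the index.
v <- c(10, 20, 30)
keep <- c(TRUE, NA, FALSE)
with_logical <- v[keep]         # keeps an NA element for the NA index
with_which   <- v[which(keep)]  # which() drops the NA
```

Note that merge() performs an inner join by default, so a sample whose zone is missing from the second data frame would come back as NA from this helper rather than raising an error.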

> So, is there a better alternative?  Is there some way I can improve
> indexing syntax efficiency by using rownames instead of a column vector?
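
On the rownames question, one possible sketch (not something proposed in the thread) is to index by name, either through a named vector or through row names on Data2. The variable names survival, expected and by_rowname are assumptions for illustration. The as.character() call matters when Zone is a factor, because indexing with a factor uses its integer codes rather than its labels.

```r
Data1 <- data.frame(Sample = 1:6,
                    Live = c(24, 25, 30, 31, 22, 23),
                    Baseline = c(35, 35, 35, 32, 34, 33),
                    Zone = c(rep("Cottonwood", 3), rep("OregonAsh", 3)))
Data2 <- data.frame(Zone = c("Cottonwood", "OregonAsh"),
                    Survival = c(0.83, 0.76))

# Option 1: a named vector used as a lookup table.
survival <- setNames(Data2$Survival, as.character(Data2$Zone))
expected <- unname(survival[as.character(Data1$Zone)])

# Option 2: row names on Data2, then index rows by zone name.
rownames(Data2) <- as.character(Data2$Zone)
by_rowname <- Data2[as.character(Data1$Zone), "Survival"]

# Either lookup feeds the same vectorized expression.
Data1$SquaredError <- (Data1$Live - expected * Data1$Baseline)^2
```

Both forms give the same values as the match() version; whether they are any faster is not something the thread establishes, so treat the choice as a matter of readability.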