On Sep 8, 2010, at 2:24 PM, Joshua Wiley wrote:

Hi Jakob,

You can use is.na() to create an index of which rows in column 3 are
missing data, and then select these from column 1.  Here is a simple
example:

dat <- data.frame(V1 = 1:5, V3 = c(1, NA, 3, 4,  NA))
dat$new <- dat$V3
my.na <- is.na(dat$V3)
dat$new[my.na] <- dat$V1[my.na]

dat

This should be quite fast.  I broke the steps up to be explicit, but
you can readily simplify them.
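
One way the steps above might be condensed is sketched below. It assumes the same toy `dat`; the helper name `fill_na` is hypothetical and does not appear in the thread.

```r
# Toy data from the example above
dat <- data.frame(V1 = 1:5, V3 = c(1, NA, 3, 4, NA))

# Hypothetical helper (not from the thread):
# return x, with each NA replaced by the value of y at the same position
fill_na <- function(x, y) {
  na <- is.na(x)   # logical index of the missing entries
  x[na] <- y[na]   # vectorised replacement, no loop
  x
}

dat$new <- fill_na(dat$V3, dat$V1)
dat
```

The logical index is computed once and reused, which is the same idea as the `my.na` vector in the step-by-step version.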

I was about to post something similar, except that I was going to avoid the "$" operator, thinking (incorrectly, as it turned out) that it would be faster. I also included the Holtman/Rizopoulos suggestion of ifelse() in the benchmark below, and was surprised that ifelse() is the winning strategy. My "$"-free version:

dat[4] <- dat[3]; idx <- is.na(dat[, 3])
dat[idx, 4] <- dat[idx, 1]

> library(rbenchmark)
> benchmark(meth.ifelse = {dat$z.new <- ifelse(is.na(dat$V3), dat$V1, dat$V3)},
+  meth.dlr.sign={dat$new <- dat$V3
+  my.na <- is.na(dat$V3)
+  dat$new[my.na] <- dat$V1[my.na]},
+  meth.index ={dat[4] <- dat[3]; idx <-is.na(dat[, 3])
+  dat[idx, 4] <- dat[idx, 1]},
+ meth.forloop ={for (i in 1:nrow(dat)){
+ if (is.na(dat[i,3])==TRUE){
+ dat[i,4]<- dat[i,1]}
+ else{
+ dat[i,4]<- dat[i,3]} }
+ },
+ replications=5000, columns = c("test", "replications", "elapsed",
+      "relative", "user.self") )
           test replications elapsed  relative user.self
2 meth.dlr.sign         5000   0.502  1.081897     0.501
4  meth.forloop         5000   6.419 13.834052     6.409
1   meth.ifelse         5000   0.464  1.000000     0.463
3    meth.index         5000   2.908  6.267241     2.904
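
A caveat on the ifelse() route: the original loop quoted further down wraps each value in as.character(), which suggests the real columns may be factors. The thread does not say so, so treat that as an assumption. If they are factors, ifelse() returns the underlying integer codes rather than the labels. A minimal sketch with a hypothetical data frame `m2`:

```r
# Hypothetical stand-in for the real data, with factor columns (an assumption)
m2 <- data.frame(V1 = c("a", "b", "c"),
                 V3 = c("x", NA, "z"),
                 stringsAsFactors = TRUE)

# Applied directly to factors, ifelse() gives integer codes, not labels
codes <- ifelse(is.na(m2$V3), m2$V1, m2$V3)

# Converting to character first keeps the labels
v1 <- as.character(m2$V1)
v3 <- as.character(m2$V3)
m2$new <- ifelse(is.na(v3), v1, v3)
m2$new
```

Converting each column once, outside any loop, keeps the approach vectorised.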

--
David.

HTH,

Josh

On Wed, Sep 8, 2010 at 11:17 AM, Jakob Hedegaard
<jakob.hedega...@agrsci.dk> wrote:
Hi list,

I have a data frame (m) with 169221 rows and 10 columns and would like to make a new column containing the content of column 3 but replace the NAs in column 3 with the data in column 1 (from the same row as the NA in column 3). Column 1 has data in all rows.

My first attempt was:

for (i in 1:169221){
if (is.na(m[i,3])==TRUE){
m[i,11] <- as.character(m[i,1])}
else{
m[i,11] <- as.character(m[i,3])}
}

It works, but takes far too long.
I would appreciate alternative solutions.

Best regards, Jakob

--

David Winsemius, MD
West Hartford, CT

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
