Re: [R] Choose between duplicated rows

2012-04-15 Thread francy
Thank you both very much for your replies. Trinker's solution works great for a small dataset, but the 'split' function just hangs when I try to apply it to all my data (around 9,000 rows)… Has anyone encountered this problem before, and do you know what I could try? Thanks again.
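One thing that may be worth trying when 'split' is slow is to avoid splitting altogether. The following is only a minimal sketch, not the code posted in this thread: it assumes the grouping column is A and that "best" means the fewest NAs across v1 to v3, and it uses toy data modelled on the example frame from the original question. The names num_miss, x_sorted and best are made up for illustration.

```r
# Toy data modelled on the example in this thread (an assumption, not the real dataset).
x <- data.frame(
  id = c("id1", "id1", "id1", "id2", "id2", "id2"),
  A  = c(11905, 11907, 11907, 11829, 11829, 11829),
  v1 = c(NA, 3, NA, 1, 2, NA),
  v2 = c(NA, 2, NA, 2, NA, NA),
  v3 = c(NA, 1, NA, 1, NA, NA),
  stringsAsFactors = FALSE
)

# Count the missing values in each row with one vectorised call.
num_miss <- rowSums(is.na(x[, c("v1", "v2", "v3")]))

# Sort so that, within each value of A, the most complete row comes first.
x_sorted <- x[order(x$A, num_miss), ]

# Keep only the first row encountered for each value of A.
best <- x_sorted[!duplicated(x_sorted$A), ]
print(best)
```

Because this relies only on rowSums(), order() and duplicated(), it does a single pass over the data frame instead of building one small data frame per group. Ties are resolved in favour of the row that appears first in the original data.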

Re: [R] Choose between duplicated rows

2012-04-15 Thread francy
I also tried using Jim's code, but it doesn't work as expected with my real dataset. This is what I did:

Best.na <- do.call(rbind, lapply(split(x, x$A), function(.grp){
    best <- which.min(apply(.grp, 1, function(a) sum(is.na(a))))
    .grp[best, ]
}))
df.split <- split(Best.na,

[R] Choose between duplicated rows

2012-04-14 Thread francy
Dear R experts, Sorry for this basic question, but I can't seem to find a solution… I have this data frame: df <- data.frame(id = c("id1", "id1", "id1", "id2", "id2", "id2"), A = c(11905, 11907, 11907, 11829, 11829, 11829), v1 = c(NA, 3, NA, 1, 2, NA), v2 = c(NA, 2, NA, 2, NA, NA), v3 = c(NA, 1, NA, 1, NA, NA), v4 =

Re: [R] Choose between duplicated rows

2012-04-14 Thread jim holtman
Try this:

x  # print data
   id     A v1 v2 v3 v4 v5 numMiss
1 id1 11905 NA NA NA  N  0       3
2 id1 11907  3  2  1  Y  0       0
3 id1 11907 NA NA NA  N  0       3
4 id2 11829  1  2  1  Y  1       0
5 id2 11829  2 NA NA  N  0       2
6 id2 11829 NA NA NA  N  0       3
# select best data
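The rest of this message is cut off in the archive, so the selection step is not shown. As a rough sketch of one way a numMiss column like the one printed above could be used (an assumption, not necessarily what was posted), the rows whose numMiss equals the minimum within their A group can be kept with ave(). The data below is retyped from the printed table; treating numMiss as the NA count over v1 to v3 is inferred from those values.

```r
# Data retyped from the table printed in this message.
x <- data.frame(
  id = c("id1", "id1", "id1", "id2", "id2", "id2"),
  A  = c(11905, 11907, 11907, 11829, 11829, 11829),
  v1 = c(NA, 3, NA, 1, 2, NA),
  v2 = c(NA, 2, NA, 2, NA, NA),
  v3 = c(NA, 1, NA, 1, NA, NA),
  v4 = c("N", "Y", "N", "Y", "N", "N"),
  v5 = c(0, 0, 0, 1, 0, 0),
  stringsAsFactors = FALSE
)

# Number of missing values per row across v1..v3 (assumed definition of numMiss).
x$numMiss <- rowSums(is.na(x[, c("v1", "v2", "v3")]))

# For every row, look up the smallest numMiss within its A group.
group_min <- ave(x$numMiss, x$A, FUN = min)

# Keep the rows that reach their group's minimum; tied rows are all kept.
best <- x[x$numMiss == group_min, ]
print(best)
```

Unlike which.min(), this keeps every row that ties for the fewest missing values, so a follow-up rule (for example on v4 or v5) would still be needed if exactly one row per group is required.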

Re: [R] Choose between duplicated rows

2012-04-14 Thread Tyler Rinker
12:03:36 -0700
From: francy.casal...@gmail.com
To: r-help@r-project.org
Subject: [R] Choose between duplicated rows

Dear R experts, Sorry for this basic question, but I can't seem to find a solution… I have this data frame: df <- data.frame(id = c("id1", "id1", "id1", "id2", "id2", "id2"), A = c