Thank you both very much for your replies.
Trinker's solution works great for a small dataset, but the 'split' function
just hangs when I try to apply it to all my data (around 9,000 rows)… Has
anyone encountered this problem before, and do you know what I could try?
Thanks again.
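
One thing that might be worth trying is to skip 'split' altogether: count the NAs per row, sort so the most complete row comes first within each value of A, and keep the first row per A. This is only a sketch under assumptions. The 9,000-row data frame below is synthetic (the real data is not shown in the thread), and the column names A, v1, v2, v3 are borrowed from the small example in the original question.

```r
# Synthetic stand-in for the real data (an assumption, not the poster's data).
set.seed(1)
n <- 9000
x <- data.frame(
  A  = sample(1:3000, n, replace = TRUE),
  v1 = sample(c(NA, 1:3), n, replace = TRUE),
  v2 = sample(c(NA, 1:3), n, replace = TRUE),
  v3 = sample(c(NA, 1:3), n, replace = TRUE)
)

# Count missing values per row in a single vectorized call.
x$numMiss <- rowSums(is.na(x[, c("v1", "v2", "v3")]))

# Sort by A, then by number of NAs, so the most complete row
# comes first within each value of A.
x.sorted <- x[order(x$A, x$numMiss), ]

# Keep only the first row for each value of A.
Best.na <- x.sorted[!duplicated(x.sorted$A), ]
```

Best.na should then hold one row per distinct value of A. Because no per-group data frames are built, this may scale better than the split/lapply version, though that would need checking against the real data.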
I also tried using Jim's code, but it doesn't work as expected with my real
dataset. This is what I did:
Best.na <- do.call(rbind, lapply(split(x, x$A), function(.grp){
best <- which.min(apply(.grp, 1, function(a) sum(is.na(a))))
.grp[best, ]
}))
df.split <- split(Best.na,
Dear r experts,
Sorry for this basic question, but I can't seem to find a solution…
I have this data frame:
df <- data.frame(id = c("id1", "id1", "id1", "id2", "id2", "id2"), A =
c(11905, 11907, 11907, 11829, 11829, 11829), v1 = c(NA, 3, NA, 1, 2, NA), v2 =
c(NA, 2, NA, 2, NA, NA), v3 = c(NA, 1, NA, 1, NA, NA), v4 =
try this:
x  # print data
   id     A v1 v2 v3 v4 v5 numMiss
1 id1 11905 NA NA NA  N  0       3
2 id1 11907  3  2  1  Y  0       0
3 id1 11907 NA NA NA  N  0       3
4 id2 11829  1  2  1  Y  1       0
5 id2 11829  2 NA NA  N  0       2
6 id2 11829 NA NA NA  N  0       3
# select best data
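
The rest of this reply is cut off in the archive, so what follows is a reconstruction sketch and not necessarily the code that was posted. It rebuilds x from the printed table above, recomputes numMiss as the number of NAs in v1 to v3 (an assumption that matches the printed values), and then picks the row with the fewest NAs for each value of A using the split/lapply pattern quoted elsewhere in the thread.

```r
# Data rebuilt from the printed table above.
x <- data.frame(
  id = c("id1", "id1", "id1", "id2", "id2", "id2"),
  A  = c(11905, 11907, 11907, 11829, 11829, 11829),
  v1 = c(NA, 3, NA, 1, 2, NA),
  v2 = c(NA, 2, NA, 2, NA, NA),
  v3 = c(NA, 1, NA, 1, NA, NA),
  v4 = c("N", "Y", "N", "Y", "N", "N"),
  v5 = c(0, 0, 0, 1, 0, 0),
  stringsAsFactors = FALSE
)

# Number of missing values per row across v1..v3
# (assumed definition; it reproduces the numMiss column shown above).
x$numMiss <- rowSums(is.na(x[, c("v1", "v2", "v3")]))

# For each value of A, keep the row with the fewest missing values.
# which.min() returns the first minimum, so ties go to the earlier row.
best <- do.call(rbind, lapply(split(x, x$A), function(.grp) {
  .grp[which.min(.grp$numMiss), ]
}))
```

On this sample, best keeps original rows 4, 1 and 2 (one each for A = 11829, 11905 and 11907).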
12:03:36 -0700
From: francy.casal...@gmail.com
To: r-help@r-project.org
Subject: [R] Choose between duplicated rows