Re: [R] the union of several data frame rows
Hi,

Thanks to Henrique Dallazuanna, Erik Iverson, Mark Leeds, and J. Scott Olson for pointing me down the path of joy. I finally figured out a solution to the problem.

Given the following list of partially overlapping test keys, a data frame called keys1:

   ID X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15
A KEY  D NA  D  A NA  D  D  D  A  NA  NA  NA  NA  NA  NA
B KEY  D NA  D  A NA  D  D  D  A  NA  NA  NA  NA  NA  NA
C KEY  D NA  D  A NA  D  D  D  A  NA  NA  NA  NA  NA  NA
D KEY  D  C  D  A  B  D  D  D  A   D   D   D   A   C   C
E KEY  D  C  D  A  B  D  D  D  A   D   D   D   A   C   C
F KEY  D  C  D NA  B  D NA NA NA   D  NA  NA  NA  NA  NA
G KEY  D NA  D  A NA  D  D  D  A  NA  NA  NA  NA  NA  NA
H KEY  D  C  D  A  B  D  D  D  A   D   D   D   A   C   C
I KEY  D NA  D  A NA  D  D  D  A  NA  NA  NA  NA  NA  NA
J KEY  D  C  D  A  B NA NA NA NA  NA   D   D   A   C   C
K KEY  D  C NA NA NA NA NA NA NA  NA  NA  NA  NA  NA  NA
L KEY  D  C  D NA  B  D NA NA NA   D  NA  NA  NA  NA  NA
M KEY  D NA  D  A NA  D  D  D  A  NA  NA  NA  NA  NA  NA
N KEY  D NA  D  A NA  D  D  D  A  NA  NA  NA  NA  NA  NA

The goal was to wind up with a common test key:

Common Key  D  C  D  A  B  D  D  D  A   D   D   D   A   C   C

What worked was the following:

    # fill the cells of row 1 that are still NA from each later row in turn
    ck <- for (i in 1:dim(keys1)[1]) {
        keys1[1, is.na(keys1[1, ])] <- keys1[i + 1, is.na(keys1[1, ])]
    }

I neglected to mention in my first example that there were NA observations, which may have affected the kinds of solutions that were suggested. Chalk up another testimonial in favor of providing small workable examples when asking for help.

Thanks very much,

Scot

--
Scot McNary
smcnary at charm dot net

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
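The NA-filling loop in the follow-up above overwrites row 1 of keys1 as it goes. As a rough sketch of the same idea that leaves the data frame untouched, the example below takes the first non-NA answer in each column. The small keys1 here holds only the first four items of rows A, F and K from the table above, and first_non_na, common_key and conflicts are names made up for this illustration, not anything from the original post.

```r
# Illustrative subset: items X1..X4 of rows A, F and K from the thread.
# NA marks an item that a given test form does not contain.
keys1 <- data.frame(
  X1 = c("D", "D", "D"),
  X2 = c(NA,  "C", "C"),
  X3 = c("D", "D", NA),
  X4 = c("A", NA,  NA),
  row.names = c("A", "F", "K"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: first non-missing value of a column,
# or NA if no form contains the item at all.
first_non_na <- function(x) {
  x <- x[!is.na(x)]
  if (length(x) == 0) NA_character_ else x[1]
}

# One answer per item, taken column by column.
common_key <- vapply(keys1, first_non_na, character(1))

# Items on which two forms disagree would be silently resolved above,
# so list them separately as a sanity check.
n_distinct <- vapply(keys1, function(x) length(unique(x[!is.na(x)])), integer(1))
conflicts <- names(n_distinct)[n_distinct > 1]

print(common_key)
print(conflicts)
```

Checking conflicts first seems worthwhile with 300 items and 10 keys, since taking the first non-NA value hides any disagreement between forms.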
Re: [R] the union of several data frame rows
It's ugly, but you could use something like

    sum(tmp[i, ] == "A") > 0

on each column.

pax,
Scott

On Fri, Feb 1, 2008 at 1:58 PM, Scot W. McNary wrote:
> I have a question about how to obtain the union of several data frame rows. I'm trying to create a common key for several tests composed of different items. [...]
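A minimal sketch of the presence-test idea above, assuming it is applied item by item to the 4 x 6 key matrix from the original question (without the id column) rather than to tmp. The names letters_used, present and common_key are hypothetical, and restricting the answers to A through D is an assumption based on the example data.

```r
# Example keys from the original question, one row per test, "" = item absent.
key <- matrix(c("A", "C", "",  "",  "",  "",
                "",  "",  "",  "",  "B", "D",
                "A", "",  "D", "B", "",  "",
                "",  "C", "D", "",  "B", "D"),
              byrow = TRUE, ncol = 6,
              dimnames = list(1:4, paste0("q", 1:6)))

# Assumed answer alphabet.
letters_used <- c("A", "B", "C", "D")

# present[l, q] is TRUE when at least one test keys item q with letter l.
present <- sapply(colnames(key), function(q) {
  sapply(letters_used, function(l) sum(key[, q] == l) > 0)
})

# Keep the letter when exactly one letter is present for an item;
# otherwise leave the item as NA so it can be inspected by hand.
common_key <- apply(present, 2, function(p) {
  if (sum(p) == 1) letters_used[p] else NA_character_
})

print(common_key)
```

This does more comparisons than strictly needed, but the logical matrix present makes it easy to see which items are keyed inconsistently across tests.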
Re: [R] the union of several data frame rows
Perhaps:

    data <- data.frame(key, row.names = 1)
    names(data) <- paste("q", 1:6, sep = "")
    apply(data, 2, function(x) unique(x)[unique(x) != ""])

On 01/02/2008, Scot W. McNary wrote:
> I have a question about how to obtain the union of several data frame rows. I'm trying to create a common key for several tests composed of different items. [...]

--
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O
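For reference, a self-contained sketch of the suggestion above, run against the example key matrix from the original question. It is a variation rather than the exact code posted: it swaps in lapply() and setdiff() so that an item with more than one distinct answer can be detected, and the names answers, ambiguous and common_key are assumptions made for this illustration.

```r
# The example matrix from the original question; column 1 is the test id,
# "" marks an item that a test does not contain.
key <- matrix(c(1, "A", "C", "",  "",  "",  "",
                2, "",  "",  "",  "",  "B", "D",
                3, "A", "",  "D", "B", "",  "",
                4, "",  "C", "D", "",  "B", "D"),
              byrow = TRUE, ncol = 7)

# Use the first column as row names and label the items q1..q6.
data <- data.frame(key, row.names = 1, stringsAsFactors = FALSE)
names(data) <- paste("q", 1:6, sep = "")

# Distinct non-empty answers for each item.
answers <- lapply(data, function(x) setdiff(unique(x), ""))

# Items that do not have exactly one distinct answer need a closer look.
ambiguous <- names(answers)[lengths(answers) != 1]

# With no ambiguous items, this collapses to one answer per item.
common_key <- unlist(answers)

print(common_key)
print(ambiguous)
```

Note that this version treats "" as the missing marker; if the real data uses NA instead, the setdiff() call would need to drop NA as well.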
[R] the union of several data frame rows
Hi,

I have a question about how to obtain the union of several data frame rows. I'm trying to create a common key for several tests composed of different items. Here is a small scale version of the problem. These are keys for 4 different tests, not all mutually exclusive:

id q1 q2 q3 q4 q5 q6
1  A  C
2              B  D
3  A     D  B
4     C  D     B  D

I would like to create a single key for all test versions, the union of the above:

id  q1 q2 q3 q4 q5 q6
key A  C  D  B  B  D

Here is what I have (unsuccessfully) tried so far:

    key <-
        matrix(c(1, "A", "C", "", "", "", "",
                 2, "", "", "", "", "B", "D",
                 3, "A", "", "D", "B", "", "",
                 4, "", "C", "D", "", "B", "D"),
               byrow = TRUE, ncol = 7)

    k1 <- key[1, 2:7]
    k2 <- key[2, 2:7]
    k3 <- key[3, 2:7]
    k4 <- key[4, 2:7]

    itemid <- c("q1", "q2", "q3", "q4", "q5", "q6")

    k1 <- cbind(itemid, k1)
    k2 <- cbind(itemid, k2)
    k3 <- cbind(itemid, k3)
    k4 <- cbind(itemid, k4)

    tmp <- merge(k1, k2, by = "itemid")
    tmp <- merge(tmp, k3, by = "itemid")
    tmp <- merge(tmp, k4, by = "itemid")

    t(tmp)
           [,1] [,2] [,3] [,4] [,5] [,6]
    itemid "q1" "q2" "q3" "q4" "q5" "q6"
    k1     "A"  "C"  ""   ""   ""   ""
    k2     ""   ""   ""   ""   "B"  "D"
    k3     "A"  ""   "D"  "B"  ""   ""
    k4     ""   "C"  "D"  ""   "B"  "D"

The actual problem involves 300 or so items instead of 6 and 10 different keys instead of four.

Any suggestions welcome.

Thanks in advance,

Scot McNary

    > version
                   _
    platform       i386-pc-mingw32
    arch           i386
    os             mingw32
    system         i386, mingw32
    status
    major          2
    minor          6.1
    year           2007
    month          11
    day            26
    svn rev        43537
    language       R
    version.string R version 2.6.1 (2007-11-26)

--
Scot McNary
smcnary at charm dot net