Re: [R] the union of several data frame rows

2008-02-08 Thread Scot W. McNary
Hi,

Thanks to Henrique Dallazuanna, Erik Iverson, Mark Leeds, and J. Scott 
Olsson for pointing me down the path of joy.  I finally figured out a 
solution to the problem:

Given the following list of partially overlapping test keys, a data 
frame called keys1:

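   ID   X1   X2   X3   X4   X5   X6   X7   X8   X9  X10  X11  X12  X13  X14  X15
A KEY    D <NA>    D    A <NA>    D    D    D    A <NA> <NA> <NA> <NA> <NA> <NA>
B KEY    D <NA>    D    A <NA>    D    D    D    A <NA> <NA> <NA> <NA> <NA> <NA>
C KEY    D <NA>    D    A <NA>    D    D    D    A <NA> <NA> <NA> <NA> <NA> <NA>
D KEY    D    C    D    A    B    D    D    D    A    D    D    D    A    C    C
E KEY    D    C    D    A    B    D    D    D    A    D    D    D    A    C    C
F KEY    D    C    D <NA>    B    D <NA> <NA> <NA>    D <NA> <NA> <NA> <NA> <NA>
G KEY    D <NA>    D    A <NA>    D    D    D    A <NA> <NA> <NA> <NA> <NA> <NA>
H KEY    D    C    D    A    B    D    D    D    A    D    D    D    A    C    C
I KEY    D <NA>    D    A <NA>    D    D    D    A <NA> <NA> <NA> <NA> <NA> <NA>
J KEY    D    C    D    A    B <NA> <NA> <NA> <NA> <NA>    D    D    A    C    C
K KEY    D    C <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
L KEY    D    C    D <NA>    B    D <NA> <NA> <NA>    D <NA> <NA> <NA> <NA> <NA>
M KEY    D <NA>    D    A <NA>    D    D    D    A <NA> <NA> <NA> <NA> <NA> <NA>
N KEY    D <NA>    D    A <NA>    D    D    D    A <NA> <NA> <NA> <NA> <NA> <NA>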

The goal was to wind up with a common test key:

Common Key  D  C  D  A  B  D  D  D  A  D  D  D  A  C  C

What worked was the following:

ck <- for (i in 1:dim(keys1)[1]) {keys1[1, is.na(keys1[1,])] <- 
keys1[i+1, is.na(keys1[1,])]}
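
For anyone reading this later, the same row union can also be sketched without a loop. This is only an illustration under stated assumptions, not the code from my session: it rebuilds a small excerpt of keys1 (rows A, F and J, items X1 to X5 only), and the helper name first_non_na is made up for the example. Note that a for loop in R returns NULL, so in the loop above the filled-in key ends up in keys1[1, ] rather than in ck.

```r
# Excerpt of keys1: rows A, F and J, items X1..X5 only (illustrative subset).
keys1 <- data.frame(
  X1 = c("D", "D", "D"),
  X2 = c(NA,  "C", "C"),
  X3 = c("D", "D", "D"),
  X4 = c("A", NA,  "A"),
  X5 = c(NA,  "B", "B"),
  row.names = c("A", "F", "J"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: first non-missing answer in one column
# (NA if no test version keys that item).
first_non_na <- function(x) {
  x <- x[!is.na(x)]
  if (length(x) > 0) x[1] else NA_character_
}

# One answer per item: the union of the partial keys.
common_key <- vapply(keys1, first_non_na, character(1))

# Sanity check: no item may carry two different non-missing answers.
conflicts <- vapply(keys1,
                    function(x) length(unique(x[!is.na(x)])) > 1,
                    logical(1))
stopifnot(!any(conflicts))

print(common_key)
```

The conflict check is optional, but it guards against two test versions disagreeing on the same item, which a plain "take the first non-NA value" rule would silently hide.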

I neglected to mention in my first example that there were NA 
observations, which may have affected the kinds of solutions that were 
suggested.  Chalk up another testimonial in favor of providing a small 
workable example when asking for help.

Thanks very much,

Scot


-- 
Scot McNary
smcnary at charm dot net

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] the union of several data frame rows

2008-02-01 Thread J. Scott Olsson
It's ugly, but you could use something like

 sum(tmp[i,] == "A") > 0

on each column.
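
A rough sketch of how that test could be applied to every item and every answer option is below. It assumes the key matrix from the original post, with a single space marking a blank entry, and it assumes the answer options are A through D; the names answers, options and present are made up for illustration.

```r
# Key matrix from the original post; " " marks an item not on that test.
key <- matrix(c(1, "A", "C", " ", " ", " ", " ",
                2, " ", " ", " ", " ", "B", "D",
                3, "A", " ", "D", "B", " ", " ",
                4, " ", "C", "D", " ", "B", "D"),
              byrow = TRUE, ncol = 7)
answers <- key[, 2:7]
colnames(answers) <- paste("q", 1:6, sep = "")

options <- c("A", "B", "C", "D")  # assumed set of answer options

# present[o, q] is TRUE when option o is keyed for item q on any test.
present <- sapply(colnames(answers), function(q) {
  sapply(options, function(o) sum(answers[, q] == o) > 0)
})

# Assumes exactly one option is present per item; otherwise apply()
# would return a list rather than a character vector.
common_key <- apply(present, 2, function(p) options[p])

print(common_key)
```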

pax,
Scott


Re: [R] the union of several data frame rows

2008-02-01 Thread Henrique Dallazuanna
Perhaps:

data <- data.frame(key, row.names=1)
names(data) <- paste("q", 1:6, sep="")
apply(data, 2, function(x)unique(x)[unique(x) != " "])
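
A self-contained version of the above, run against the key matrix from the original post, might look like the sketch below. The second half is an assumption rather than part of the suggestion: it shows how the same idea could be adapted when blanks are coded as NA instead of " ". The names data_na, blank_key and common_key are illustrative.

```r
# Key matrix from the original post; " " marks an item not on that test.
key <- matrix(c(1, "A", "C", " ", " ", " ", " ",
                2, " ", " ", " ", " ", "B", "D",
                3, "A", " ", "D", "B", " ", " ",
                4, " ", "C", "D", " ", "B", "D"),
              byrow = TRUE, ncol = 7)

# Column 1 (the test id) becomes the row names; the rest are items q1..q6.
data <- data.frame(key, row.names = 1, stringsAsFactors = FALSE)
names(data) <- paste("q", 1:6, sep = "")

# Blank-coded data: keep the unique non-blank value in each column.
blank_key <- apply(data, 2, function(x) unique(x)[unique(x) != " "])

# NA-coded variant (assumption): recode blanks to NA, then drop NAs
# before taking the unique value.
data_na <- data
data_na[data_na == " "] <- NA
common_key <- apply(data_na, 2, function(x) unique(x[!is.na(x)]))

print(common_key)
```

Both versions return a plain character vector only when every item has exactly one distinct answer across tests; if two keys disagree on an item, apply() returns a list instead.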


-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O


[R] the union of several data frame rows

2008-02-01 Thread Scot W. McNary
Hi,

I have a question about how to obtain the union of several data frame 
rows.  I'm trying to create a common key for several tests composed of 
different items.   Here is a small scale version of the problem.  These 
are keys for 4 different tests, not all mutually exclusive:

id q1 q2 q3 q4 q5 q6
1  A  C
2              B  D
3  A     D  B
4     C  D     B  D

I would like to create a single key for all test versions, the union of 
the above:

id   q1 q2 q3 q4 q5 q6
key  A  C  D  B  B  D


Here is what I have (unsuccessfully) tried so far:

> key <-
+   matrix(c(1, "A", "C", " ", " ", " ", " ",
+            2, " ", " ", " ", " ", "B", "D",
+            3, "A", " ", "D", "B", " ", " ",
+            4, " ", "C", "D", " ", "B", "D"),
+          byrow=TRUE, ncol = 7)
>
> k1 <- key[1, 2:7]
> k2 <- key[2, 2:7]
> k3 <- key[3, 2:7]
> k4 <- key[4, 2:7]
>
> itemid <- c("q1", "q2", "q3", "q4", "q5", "q6")
>
> k1 <- cbind(itemid, k1)
> k2 <- cbind(itemid, k2)
> k3 <- cbind(itemid, k3)
> k4 <- cbind(itemid, k4)
>
> tmp <- merge(k1, k2, by = "itemid")
> tmp <- merge(tmp, k3, by = "itemid")
> tmp <- merge(tmp, k4, by = "itemid")
>
> t(tmp)
       [,1] [,2] [,3] [,4] [,5] [,6]
itemid "q1" "q2" "q3" "q4" "q5" "q6"
k1     "A"  "C"  " "  " "  " "  " "
k2     " "  " "  " "  " "  "B"  "D"
k3     "A"  " "  "D"  "B"  " "  " "
k4     " "  "C"  "D"  " "  "B"  "D"

The actual problem involves 300 or so items instead of 6 and 10 
different keys instead of four.  Any suggestions welcome.

Thanks in advance,

Scot McNary

> version
               _
platform       i386-pc-mingw32
arch           i386
os             mingw32
system         i386, mingw32
status
major          2
minor          6.1
year           2007
month          11
day            26
svn rev        43537
language       R
version.string R version 2.6.1 (2007-11-26)


-- 
Scot McNary
smcnary at charm dot net
