Re: [R] Creating contingency table from mixed data
On 05-May-07 23:14:38, spime wrote:
> Hi,
> I am new in R. Please help me in the following case. I have data in hand:
> http://www.nabble.com/file/8225/Data.txt
> There are some categorical (binary and nominal) and continuous variables.
> How can I get a generic RxC contingency table from this table? My main
> objective is to find the count in each cell and the mean of the continuous
> variables in each cell.
> Please reply. Thanks in advance

If what is in that file is all your data, then it is easily and quite
quickly (10 minutes) done by hand, facilitated by first re-ordering
your data as:

  Var1 Var2 Var3 Var4 Var5
     0   11    1    0  144
     0   17    1    1  123
     0   15    1    1  117
     0   18    2    0   99
     0   22    2    1  142
     1   17    1    0  136
     1   10    1    1  109
     1    8    2    1  133
     1   17    2    1  108
     1   11    3    0  112
     1   16    3    0  121
     1   12    3    1  152

From which, the following is easy to obtain:

          Var3:
------------------------------------------------------
Var1:0  |      1       |      2       |      3       |
======================================================
Var4:0  |  (11,144)    |  (18, 99)    |              |
        |              |              |              |
------------------------------------------------------
Count:  |      1       |      1       |      0       |
Mean:   |  (11,144)    |  (18, 99)    |              |
======================================================
Var4:1  |  (17,123)    |  (22,142)    |              |
        |  (15,117)    |              |              |
------------------------------------------------------
Count:  |      2       |      1       |      0       |
Mean:   |  (16,120)    |  (22,142)    |              |
======================================================

          Var3:
------------------------------------------------------
Var1:1  |      1       |      2       |      3       |
======================================================
Var4:0  |  (17,136)    |              |  (11,112)    |
        |              |              |  (16,121)    |
------------------------------------------------------
Count:  |      1       |      0       |      2       |
Mean:   |  (17,136)    |              | (13.5,116.5) |
======================================================
Var4:1  |  (10,109)    |  ( 8,133)    |  (12,152)    |
        |              |  (17,108)    |              |
------------------------------------------------------
Count:  |      1       |      2       |      1       |
Mean:   |  (10,109)    | (12.5,120.5) |  (12,152)    |
======================================================

To do it automatically, you could get the counts alone by applying
table() to the factor columns (vars 1, 3, 4, taken all together).
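Thus (where Dat is a dataframe with columns Var1,...,Var5):

  table(Dat$Var4,Dat$Var3,Dat$Var1,dnn=c("Var4","Var3","Var1"))

  , , Var1 = 0

      Var3
  Var4 1 2 3
     0 1 1 0
     1 2 1 0

  , , Var1 = 1

      Var3
  Var4 1 2 3
     0 1 0 2
     1 1 2 1

which is basically a contingency table format already, or you could get
counts and means by applying by(), with a sum() for the count and column
means for the continuous variables, thus:

  # colMeans() rather than mean(): current R no longer averages the
  # columns of a data frame via mean()
  CT <- by(Dat,list(var1=Dat$Var1,Var3=Dat$Var3,Var4=Dat$Var4),
           function(x){list(Count=sum(x[,2]>0),Mean=colMeans(x[,c(2,5)]))})

which produces:

  var1: 0
  Var3: 1
  Var4: 0
  $Count
  [1] 1

  $Mean
  Var2 Var5
    11  144
  ------------------------------------------------------
  var1: 1
  Var3: 1
  Var4: 0
  $Count
  [1] 1

  $Mean
  Var2 Var5
    17  136
  ------------------------------------------------------
  var1: 0
  Var3: 2
  Var4: 0
  $Count
  [1] 1

  $Mean
  Var2 Var5
    18   99
  ------------------------------------------------------
  var1: 1
  Var3: 2
  Var4: 0
  NULL
  ------------------------------------------------------
  var1: 0
  Var3: 3
  Var4: 0
  NULL
  ------------------------------------------------------
  var1: 1
  Var3: 3
  Var4: 0
  $Count
  [1] 2

  $Mean
   Var2  Var5
   13.5 116.5
  ------------------------------------------------------
  var1: 0
  Var3: 1
  Var4: 1
  $Count
  [1] 2

  $Mean
  Var2 Var5
    16  120
  ------------------------------------------------------
  var1: 1
  Var3: 1
  Var4: 1
  $Count
  [1] 1

  $Mean
  Var2 Var5
    10  109
  ------------------------------------------------------
  var1: 0
  Var3: 2
  Var4: 1
  $Count
  [1] 1

  $Mean
  Var2 Var5
    22  142
  ------------------------------------------------------
  var1: 1
  Var3: 2
  Var4: 1
  $Count
  [1] 2

  $Mean
   Var2  Var5
   12.5 120.5
  ------------------------------------------------------
  var1: 0
  Var3: 3
  Var4: 1
  NULL
  ------------------------------------------------------
  var1: 1
  Var3: 3
  Var4: 1
  $Count
  [1] 1

  $Mean
  Var2 Var5
    12  152

but this format is not very convenient for incorporating into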
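One possible way to get the counts and means in a flat, one-row-per-cell form is aggregate() followed by merge(). This is only a sketch and not part of the original reply: it assumes the twelve rows listed above are typed in by hand as Dat, and the object names counts, means, n and cells are purely illustrative.

```r
# Assumption: the 12 rows listed in the reply, entered by hand.
Dat <- data.frame(
  Var1 = c(0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1),
  Var2 = c(11, 17, 15, 18, 22, 17, 10, 8, 17, 11, 16, 12),
  Var3 = c(1, 1, 1, 2, 2, 1, 1, 2, 2, 3, 3, 3),
  Var4 = c(0, 1, 1, 0, 1, 0, 1, 1, 1, 0, 0, 1),
  Var5 = c(144, 123, 117, 99, 142, 136, 109, 133, 108, 112, 121, 152)
)

# Cell counts as a three-way table (empty cells show up as 0).
counts <- table(Var4 = Dat$Var4, Var3 = Dat$Var3, Var1 = Dat$Var1)

# Means of the continuous variables, one row per non-empty cell.
means <- aggregate(cbind(Var2, Var5) ~ Var1 + Var3 + Var4,
                   data = Dat, FUN = mean)

# Number of rows in each non-empty cell.
n <- aggregate(Var2 ~ Var1 + Var3 + Var4, data = Dat, FUN = length)
names(n)[4] <- "Count"

# Join counts and means on the shared factor columns Var1, Var3, Var4.
cells <- merge(n, means)
cells
```

Unlike the by() output, cells is an ordinary data frame, so it can be sorted, subset or written out with write.table(). Empty cells are simply absent from it; the counts table still reports them as zeros.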
[R] Creating contingency table from mixed data
Hi,

I am new in R. Please help me in the following case. I have data in hand:
http://www.nabble.com/file/8225/Data.txt

There are some categorical (binary and nominal) and continuous variables.
How can I get a generic RxC contingency table from this table? My main
objective is to find the count in each cell and the mean of the continuous
variables in each cell.

Please reply. Thanks in advance.

--
View this message in context: http://www.nabble.com/Creating-contingency-table-from-mixed-data-tf3698055.html#a10341180
Sent from the R help mailing list archive at Nabble.com.

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
[R] Creating a table
Dear R List,

I am new to R, so my question may be easy for you to answer.
I have a dataframe, for example:

  df <- data.frame(loc=c("A","B","A","A","A"),
                   year=as.numeric(c(1970,1970,1970,1976,1980)))

and I want to create the following table without using loops:

          1970-74  1975-79  1980-85  rowsum
  A             2        1        1       4
  B             1        0        0       1
  colsum        3        1        1       5

so that the frequencies of df$loc are shown in the table for different
time intervals.

Thanks in advance for any hint,
Michael Graber
Re: [R] Creating a table
  addmargins(table(df))

On 14/11/06, Michael Graber wrote:
> Dear R List,
>
> I am new to R, so my question may be easy for you to answer.
> I have a dataframe, for example:
>
>   df <- data.frame(loc=c("A","B","A","A","A"),
>                    year=as.numeric(c(1970,1970,1970,1976,1980)))
>
> and I want to create the following table without using loops:
>
>           1970-74  1975-79  1980-85  rowsum
>   A             2        1        1       4
>   B             1        0        0       1
>   colsum        3        1        1       5
>
> so that the frequencies of df$loc are shown in the table for different
> time intervals.
>
> Thanks in advance for any hint,
> Michael Graber

--
David Barron
Said Business School
University of Oxford
Park End Street
Oxford OX1 1HP
Re: [R] Creating a table
  # counts of loc by 5-year interval: [1970,1975), [1975,1980), [1980,1985)
  tb = table(df$loc, cut(df$year, seq(1970, 1985, by=5), right=FALSE))
  # append row sums as a last column
  rs = rowSums(tb)
  tb = cbind(tb, rs)
  # append column sums (including the total) as a last row
  cs = colSums(tb)
  tb = rbind(tb, cs)

cheers,

b

On Nov 14, 2006, at 3:20 PM, Michael Graber wrote:
> Dear R List,
>
> I am new to R, so my question may be easy for you to answer.
> I have a dataframe, for example:
>
>   df <- data.frame(loc=c("A","B","A","A","A"),
>                    year=as.numeric(c(1970,1970,1970,1976,1980)))
>
> and I want to create the following table without using loops:
>
>           1970-74  1975-79  1980-85  rowsum
>   A             2        1        1       4
>   B             1        0        0       1
>   colsum        3        1        1       5
>
> so that the frequencies of df$loc are shown in the table for different
> time intervals.
>
> Thanks in advance for any hint,
> Michael Graber
Re: [R] Creating a table
Michael

One solution:

  df <- data.frame(loc=c("A","B","A","A","A"),
                   year=c(1970,1970,1970,1976,1980))
  df[,3] <- cut(df$year, c(1969.5,1974.5,1979.5,1984.5),
                c('1970-74','1975-79','1980-85'))
  with(df, addmargins(table(loc, V3)))

Peter Alspach

-----Original Message-----
From: On Behalf Of Michael Graber
Sent: Wednesday, 15 November 2006 9:21 a.m.
To: R-Mailingliste
Subject: [R] Creating a table

Dear R List,

I am new to R, so my question may be easy for you to answer.
I have a dataframe, for example:

  df <- data.frame(loc=c("A","B","A","A","A"),
                   year=as.numeric(c(1970,1970,1970,1976,1980)))

and I want to create the following table without using loops:

          1970-74  1975-79  1980-85  rowsum
  A             2        1        1       4
  B             1        0        0       1
  colsum        3        1        1       5

so that the frequencies of df$loc are shown in the table for different
time intervals.

Thanks in advance for any hint,
Michael Graber
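The suggestions in this thread can be combined to reproduce the requested layout, including the "rowsum" and "colsum" labels. The following is a minimal sketch rather than any poster's own code: the name period and the upper break of 1986 (chosen so that the "1980-85" bin includes 1985) are assumptions made for illustration.

```r
df <- data.frame(loc  = c("A", "B", "A", "A", "A"),
                 year = c(1970, 1970, 1970, 1976, 1980))

# Bin the years into left-closed intervals:
# [1970,1975), [1975,1980), [1980,1986)
period <- cut(df$year,
              breaks = c(1970, 1975, 1980, 1986),
              labels = c("1970-74", "1975-79", "1980-85"),
              right  = FALSE)

# Cross-tabulate and add the marginal sums (labelled "Sum" by default).
tab <- addmargins(table(loc = df$loc, period = period))

# Rename the margins to match the labels asked for in the question.
rownames(tab)[nrow(tab)] <- "colsum"
colnames(tab)[ncol(tab)] <- "rowsum"
tab
```

Passing labels to cut() avoids the default interval names such as "[1970,1975)", and addmargins() does the work of the separate rowSums()/colSums() steps.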