Re: [R] Summarize data for MCA (FactoMineR)
Nelson Castillo wrote:

> On Sun, Apr 27, 2008 at 10:10 AM, David Winsemius wrote:
>> Nelson Castillo wrote:
>> [snip]
>
> That was exactly what I needed :-) Thanks a lot.
>
> I tried to make the list that you have to pass as "by" from
> colnames(edom)[2:14]:
>
>  [1] "VB21"     "VB17_NEV" "VB17_LAV" "VB17_EQS" "VB17_CAL" "VB17_DEL"
>  [7] "VB17_LIC" "VB17_HEL" "VB17_AIR" "VB17_VEN" "VB17_TVC" "VB17_PC"
> [13] "VB17_HMI"
>
> But I couldn't do it.

You needed a list, not the character vector which colnames returns. I wonder if it would have worked if you had used:

    by = as.list(colnames(edom)[2:14])

Using the original examples, note the class of the different objects.

    > s.wt
      var1 var2 weight
    1    A    B      5
    2    C    D      2
    > colnames(s.wt)[2:3]
    [1] "var2"   "weight"
    > class(colnames(s.wt)[2:3])
    [1] "character"
    > as.list(colnames(s.wt)[2:3])
    [[1]]
    [1] "var2"

    [[2]]
    [1] "weight"

    > class(as.list(colnames(s.wt)[2:3]))
    [1] "list"

If you sent an un-named list to by, you end up with column names Group.1 through Group.13 in the returned data frame. I think you could have then used:

    colnames(edom2)[1:13] <- c("Income", colnames(edom)[3:14])

It may not have been worth the extra trouble, but it serves to emphasize the need for using the proper class for function arguments.

-- 
David Winsemius

> So, I did the list by hand.
>
> edom2 = with(edom, aggregate(FACT_EXP_CAL_H,
>     by=list(Income=VB21, VB17_NEV=VB17_NEV, VB17_LAV=VB17_LAV,
>             VB17_EQS=VB17_EQS, VB17_CAL=VB17_CAL, VB17_DEL=VB17_DEL,
>             VB17_LIC=VB17_LIC, VB17_HEL=VB17_HEL, VB17_AIR=VB17_AIR,
>             VB17_VEN=VB17_VEN, VB17_TVC=VB17_TVC, VB17_PC=VB17_PC,
>             VB17_HMI=VB17_HMI), sum))
>
> nrow(edom2)
> [1] 9817
>
> And the row count matches what I did before with Perl :-)

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
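Editorial sketch on building "by" from column names. The thread does not show edom itself, so the data frame edom_toy and its columns wt, g1 and g2 below are hypothetical stand-ins. Under that assumption, as.list() applied to the names gives a list of strings rather than a list of columns, and aggregate() appears to reject it. Since a data frame is already a named list of its columns, indexing it by the names seems to be enough, and it keeps the column names in the result.

```r
# Hypothetical toy data standing in for edom (not from the thread):
# one weight column plus two grouping columns.
edom_toy <- data.frame(
  wt = c(1, 1, 2, 1, 2),
  g1 = factor(c("A", "A", "A", "A", "C")),
  g2 = factor(c("B", "B", "B", "B", "D"))
)

grp <- c("g1", "g2")   # names of the grouping columns

# as.list(grp) is a list of two one-element strings, not of columns, so its
# elements do not have one entry per row and aggregate() stops with an error.
bad <- try(aggregate(edom_toy$wt, by = as.list(grp), FUN = sum), silent = TRUE)

# Indexing the data frame by name yields a named list of columns,
# which is what 'by' expects; the grouping names carry over.
res <- aggregate(edom_toy$wt, by = edom_toy[grp], FUN = sum)
names(res)[ncol(res)] <- "wt"   # the aggregated column is called "x" by default
res
```

With the real data this would correspond to something like by = edom[2:14], followed by renaming the first grouping column to Income if that label is wanted.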
Re: [R] Summarize data for MCA (FactoMineR)
On Sun, Apr 27, 2008 at 10:10 AM, David Winsemius wrote:
> Nelson Castillo wrote:
(cut)
>> That is, from:
>>
>> x
>>   weight var1 var2
>> 1      1    A    B
>> 2      1    A    B
>> 3      2    A    B
>> 4      1    A    B
>> 5      2    C    D
>>
>> to:
>>
>> y
>>   weight var1 var2
>> 1      5    A    B
>> 2      2    C    D
>
> Does this suffice?
>
> s.wt <- with(x, aggregate(weight, by=list(var1=var1, var2=var2), sum))
> # s.wt
> #   var1 var2 x
> # 1    A    B 5
> # 2    C    D 2
> # then fix names
> names(s.wt)[3] <- "weight"
> # s.wt
> #   var1 var2 weight
> # 1    A    B      5
> # 2    C    D      2

That was exactly what I needed :-) Thanks a lot.

I tried to make the list that you have to pass as "by" from colnames(edom)[2:14]:

 [1] "VB21"     "VB17_NEV" "VB17_LAV" "VB17_EQS" "VB17_CAL" "VB17_DEL"
 [7] "VB17_LIC" "VB17_HEL" "VB17_AIR" "VB17_VEN" "VB17_TVC" "VB17_PC"
[13] "VB17_HMI"

But I couldn't do it. So, I did the list by hand.

edom2 = with(edom, aggregate(FACT_EXP_CAL_H,
    by=list(Income=VB21, VB17_NEV=VB17_NEV, VB17_LAV=VB17_LAV,
            VB17_EQS=VB17_EQS, VB17_CAL=VB17_CAL, VB17_DEL=VB17_DEL,
            VB17_LIC=VB17_LIC, VB17_HEL=VB17_HEL, VB17_AIR=VB17_AIR,
            VB17_VEN=VB17_VEN, VB17_TVC=VB17_TVC, VB17_PC=VB17_PC,
            VB17_HMI=VB17_HMI), sum))

nrow(edom2)
[1] 9817

And the row count matches what I did before with Perl :-)

> I believe that the reshape or reShape packages could do this in one step.

I skimmed over the paper and reshape seems to be very powerful. I didn't know how to use it in this case, but I guess I'll get back to the paper some other time.

Regards,
Nelson.-

-- 
http://arhuaco.org
Re: [R] Summarize data for MCA (FactoMineR)
Nelson Castillo wrote:

> Hi :-)
>
> I'm new to R and I started using it for a project (I'm the CS guy in
> a group of statisticians, helping them find out how to solve issues as
> they come up). This is my first post to the list and I am starting to
> learn R.
>
> Well, they were used to doing MCA analysis in other programs where the
> data seems to be preprocessed automatically before running MCA.
>
> So, they need to process a data set that comes with N=100 of elements,
> but there are really about N/100 distinct elements over all the
> variables, so the MCA can be run in reasonable time by summarizing
> the data.
>
> So, the question is: how can I turn x from:
>
> x <- structure(list(weight = c(1, 1, 2, 1, 2),
>     var1 = structure(c(1L, 1L, 1L, 1L, 2L), .Label = c("A", "C"),
>                      class = "factor"),
>     var2 = structure(c(1L, 1L, 1L, 1L, 2L), .Label = c("B", "D"),
>                      class = "factor")),
>     .Names = c("weight", "var1", "var2"),
>     row.names = c(NA, 5L), class = "data.frame")
>
> to:
>
> y <- structure(list(weight = c(5L, 2L),
>     var1 = structure(1:2, .Label = c("A", "C"), class = "factor"),
>     var2 = structure(1:2, .Label = c("B", "D"), class = "factor")),
>     .Names = c("weight", "var1", "var2"),
>     class = "data.frame", row.names = c(NA, -2L))
>
> using R?
>
> That is, from:
>
> x
>   weight var1 var2
> 1      1    A    B
> 2      1    A    B
> 3      2    A    B
> 4      1    A    B
> 5      2    C    D
>
> to:
>
> y
>   weight var1 var2
> 1      5    A    B
> 2      2    C    D

Does this suffice?

s.wt <- with(x, aggregate(weight, by=list(var1=var1, var2=var2), sum))
# s.wt
#   var1 var2 x
# 1    A    B 5
# 2    C    D 2
# then fix names
names(s.wt)[3] <- "weight"
# s.wt
#   var1 var2 weight
# 1    A    B      5
# 2    C    D      2

I believe that the reshape or reShape packages could do this in one step.

-- 
David Winsemius

> The idea is that there is one occurrence of A B repeated 4 times in
> the original table, and it is summarized in the second table,
> computing the sum of the weights.
>
> I solved the problem using Perl, but I'd like to know what I have to
> read in order to do it in R.
>
> Regards,
> Nelson.-
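Editorial sketch of a one-step variant. It does not use the reshape package mentioned above; instead it assumes a version of R recent enough to have the formula interface to aggregate(), which was not part of the original exchange. The data frame x is rebuilt here from the values in Nelson's post.

```r
# Rebuild x from the values given in the original post.
x <- data.frame(
  weight = c(1, 1, 2, 1, 2),
  var1   = factor(c("A", "A", "A", "A", "C")),
  var2   = factor(c("B", "B", "B", "B", "D"))
)

# Formula interface: the left-hand side is summed within each combination
# of the right-hand-side variables, and the column keeps its own name,
# so no renaming step is needed afterwards.
s.wt <- aggregate(weight ~ var1 + var2, data = x, FUN = sum)
s.wt
```

The grouping columns come first and the summed weight last; s.wt[c("weight", "var1", "var2")] would reorder the columns to match the layout of y if that matters.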
[R] Summarize data for MCA (FactoMineR)
Hi :-)

I'm new to R and I started using it for a project (I'm the CS guy in a group of statisticians, helping them find out how to solve issues as they come up). This is my first post to the list and I am starting to learn R.

Well, they were used to doing MCA analysis in other programs where the data seems to be preprocessed automatically before running MCA.

So, they need to process a data set that comes with N=100 of elements, but there are really about N/100 distinct elements over all the variables, so the MCA can be run in reasonable time by summarizing the data.

So, the question is: how can I turn x from:

x <- structure(list(weight = c(1, 1, 2, 1, 2),
    var1 = structure(c(1L, 1L, 1L, 1L, 2L), .Label = c("A", "C"),
                     class = "factor"),
    var2 = structure(c(1L, 1L, 1L, 1L, 2L), .Label = c("B", "D"),
                     class = "factor")),
    .Names = c("weight", "var1", "var2"),
    row.names = c(NA, 5L), class = "data.frame")

to:

y <- structure(list(weight = c(5L, 2L),
    var1 = structure(1:2, .Label = c("A", "C"), class = "factor"),
    var2 = structure(1:2, .Label = c("B", "D"), class = "factor")),
    .Names = c("weight", "var1", "var2"),
    class = "data.frame", row.names = c(NA, -2L))

using R?

That is, from:

x
  weight var1 var2
1      1    A    B
2      1    A    B
3      2    A    B
4      1    A    B
5      2    C    D

to:

y
  weight var1 var2
1      5    A    B
2      2    C    D

The idea is that there is one occurrence of A B repeated 4 times in the original table, and it is summarized in the second table, computing the sum of the weights.

I solved the problem using Perl, but I'd like to know what I have to read in order to do it in R.

Regards,
Nelson.-

-- 
http://arhuaco.org