Re: [R] Summarize data for MCA (FactoMineR)

2008-05-03 Thread David Winsemius
Nelson Castillo [EMAIL PROTECTED] wrote in
news:[EMAIL PROTECTED]: 

 On Sun, Apr 27, 2008 at 10:10 AM, David Winsemius
 [EMAIL PROTECTED] wrote:
 Nelson Castillo [EMAIL PROTECTED] wrote in
  news:[EMAIL PROTECTED]:
 
snip
 
 That was exactly what I needed :-) Thanks a lot. I tried to make the
 list that you have to pass
 as by from   colnames(edom)[2:14] :
 
   [1] "VB21"     "VB17_NEV" "VB17_LAV" "VB17_EQS" "VB17_CAL" "VB17_DEL"
   [7] "VB17_LIC" "VB17_HEL" "VB17_AIR" "VB17_VEN" "VB17_TVC" "VB17_PC"
  [13] "VB17_HMI"
 
 
 But I couldn't do it. 

You needed a list, not the character vector which colnames returns. I 
wonder if it would have worked if you had used:

 by= as.list(colnames(edom)[2:14]) 

Using the original examples, note the class of the different objects.
> s.wt
  var1 var2 weight
1    A    B      5
2    C    D      2

> colnames(s.wt)[2:3]
[1] "var2"   "weight"
> class(colnames(s.wt)[2:3])
[1] "character"
> as.list(colnames(s.wt)[2:3])
[[1]]
[1] "var2"

[[2]]
[1] "weight"

> class(as.list(colnames(s.wt)[2:3]))
[1] "list"

If you sent an un-named list to by, you end up with column names: 
Group.1 through Group.13 in the returned dataframe. I think you could 
have then used:

 colnames(edom2)[1:13] <- c("Income", colnames(edom)[3:14])

It may not have been worth the extra trouble, but it serves to 
emphasize the need for using the proper class for function arguments.
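
As an illustration of the point about argument classes, here is a minimal sketch of building the by argument from column names instead of typing it by hand. It uses the toy data from the original question rather than the real edom table, and the name group_cols is made up for this example. It relies on the fact that a data frame is itself a list of columns, so subsetting it by name gives a named list of the actual columns, whereas as.list() applied to colnames() gives a list of one-element strings, which does not carry the data that aggregate() needs to group on.

```r
# Toy data from the original question (not the real 'edom' table).
x <- data.frame(weight = c(1, 1, 2, 1, 2),
                var1 = factor(c("A", "A", "A", "A", "C")),
                var2 = factor(c("B", "B", "B", "B", "D")))

# Hypothetical name: the grouping columns, selected by position.
group_cols <- colnames(x)[2:3]

# as.list() on the names yields strings, not the columns themselves.
name_list <- as.list(group_cols)
lengths(name_list)          # every element has length 1, not nrow(x)

# Subsetting the data frame yields a named list of the real columns.
by_list <- x[group_cols]
is.list(by_list)            # a data frame is a list

# The names of by_list are carried into the result, so only the
# summed column (called "x" by aggregate) needs renaming.
s.wt <- aggregate(x$weight, by = by_list, FUN = sum)
names(s.wt)[3] <- "weight"
s.wt
```

Because the list is already named, the Group.1 ... Group.13 renaming step should not be needed with this approach; a single grouping column could still be renamed afterwards if a different label (such as Income) is wanted.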

-- 
David Winsemius


 So, I did the list by hand.
 
 edom2 = with(edom,aggregate(FACT_EXP_CAL_H,
 by=list(Income=VB21,VB17_NEV=VB17_NEV, VB17_LAV=VB17_LAV,
 VB17_EQS=VB17_EQS, VB17_CAL=VB17_CAL, VB17_DEL=VB17_DEL,
 VB17_LIC=VB17_LIC, VB17_HEL=VB17_HEL, VB17_AIR=VB17_AIR,
 VB17_VEN=VB17_VEN, VB17_TVC=VB17_TVC, VB17_PC=VB17_PC,
 VB17_HMI=VB17_HMI), sum))
 
 nrow(edom2)
 
 [1] 9817
 
 And the row count matches what I did before with Perl :-)

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Summarize data for MCA (FactoMineR)

2008-05-02 Thread Nelson Castillo
On Sun, Apr 27, 2008 at 10:10 AM, David Winsemius
[EMAIL PROTECTED] wrote:
 Nelson Castillo [EMAIL PROTECTED] wrote in
  news:[EMAIL PROTECTED]:

(cut)

   That is, from:
  
   x
     weight var1 var2
   1      1    A    B
   2      1    A    B
   3      2    A    B
   4      1    A    B
   5      2    C    D
  
   to:
  
   y
     weihgt var1 var2
   1      5    A    B
   2      2    C    D
  

  Does this suffice?

  s.wt <- with(x,
   aggregate(weight, by=list(var1=var1,var2=var2), sum)
  )
  # s.wt
  #   var1 var2 x
  # 1    A    B 5
  # 2    C    D 2

  #then fix names
  names(s.wt)[3] <- "weight"

  # s.wt
  #   var1 var2 weight
  # 1    A    B      5
  # 2    C    D      2

That was exactly what I needed :-) Thanks a lot. I tried to build the
list that has to be passed as the by argument from
colnames(edom)[2:14]:

 [1] "VB21"     "VB17_NEV" "VB17_LAV" "VB17_EQS" "VB17_CAL" "VB17_DEL"
 [7] "VB17_LIC" "VB17_HEL" "VB17_AIR" "VB17_VEN" "VB17_TVC" "VB17_PC"
[13] "VB17_HMI"


But I couldn't do it. So, I did the list by hand.

edom2 = with(edom,aggregate(FACT_EXP_CAL_H,
by=list(Income=VB21,VB17_NEV=VB17_NEV, VB17_LAV=VB17_LAV,
VB17_EQS=VB17_EQS, VB17_CAL=VB17_CAL, VB17_DEL=VB17_DEL,
VB17_LIC=VB17_LIC, VB17_HEL=VB17_HEL, VB17_AIR=VB17_AIR,
VB17_VEN=VB17_VEN, VB17_TVC=VB17_TVC, VB17_PC=VB17_PC,
VB17_HMI=VB17_HMI), sum))

 nrow(edom2)

[1] 9817

And the row count matches what I did before with Perl :-)
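
A row-count check like the one above can also be done inside R, without an external tool. The following is a minimal sketch on the toy data from the original question (the real edom table is not available here, and the name x2 is made up for this example): the number of aggregated rows should equal the number of distinct combinations of the grouping columns, and the total weight should be unchanged.

```r
# Toy data from the original question (not the real 'edom' table).
x <- data.frame(weight = c(1, 1, 2, 1, 2),
                var1 = factor(c("A", "A", "A", "A", "C")),
                var2 = factor(c("B", "B", "B", "B", "D")))

# Same aggregation as in the thread; the summed column is named "x".
x2 <- aggregate(x$weight, by = list(var1 = x$var1, var2 = x$var2), FUN = sum)

# Check 1: one output row per distinct combination of grouping values.
n_distinct <- nrow(unique(x[c("var1", "var2")]))
nrow(x2) == n_distinct

# Check 2: summarizing must not change the total weight.
sum(x2$x) == sum(x$weight)
```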

  I believe that the reshape or reShape packages could do this in one
  step.

I skimmed over the paper and reshape seems to be very powerful. I
didn't know how to use it in this case, but I guess I'll get back to
the paper some other time.

Regards,
Nelson.-

-- 
http://arhuaco.org



Re: [R] Summarize data for MCA (FactoMineR)

2008-04-27 Thread David Winsemius
Nelson Castillo [EMAIL PROTECTED] wrote in
news:[EMAIL PROTECTED]: 

 Hi :-)
 
 I'm new to R and I started using it for a project (I'm the CS guy in
 a group of statisticians helping them find out how to solve issues
 as they come out). This is my first post to the list and I am
 starting to learn R. 
 
 Well, they were used to doing MCA analysis in other programs where
 the data seems to be preprocessed automatically before running MCA.
 
 So, they need to process a data set that comes with N=100 of
 elements, but there are really about N/100 distinct elements over
 all the variables, so the MCA can be run in reasonable time
 summarizing data. 
 
 So, the question is:
 
 How can I turn x from:
 
 x <-
 structure(list(weight = c(1, 1, 2, 1, 2), var1 = structure(c(1L,
 1L, 1L, 1L, 2L), .Label = c("A", "C"), class = "factor"), var2 =
 structure(c(1L,
 1L, 1L, 1L, 2L), .Label = c("B", "D"), class = "factor")), .Names =
 c("weight", "var1", "var2"), row.names = c(NA, 5L), class =
 "data.frame")
 
 to:
 
 y <-
 structure(list(weihgt = c(5L, 2L), var1 = structure(1:2, .Label =
 c("A", "C"), class = "factor"), var2 = structure(1:2, .Label =
 c("B", "D"), class = "factor")), .Names = c("weihgt", "var1", "var2"
 ), class = "data.frame", row.names = c(NA, -2L))
 
 using R?
 
 That is, from:
 
 x
   weight var1 var2
 1      1    A    B
 2      1    A    B
 3      2    A    B
 4      1    A    B
 5      2    C    D
 
 to:
 
 y
   weihgt var1 var2
 1      5    A    B
 2      2    C    D
 

Does this suffice?

s.wt <- with(x,
  aggregate(weight, by=list(var1=var1,var2=var2), sum)
 )
# s.wt
#   var1 var2 x
# 1    A    B 5
# 2    C    D 2

#then fix names
names(s.wt)[3] <- "weight"

# s.wt
#   var1 var2 weight
# 1    A    B      5
# 2    C    D      2

I believe that the reshape or reShape packages could do this in one 
step.
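
For comparison, a one-step version is possible without an extra package by using the formula interface of aggregate() in base R. This is only a sketch on the toy data from the question; the formula method is, as far as I know, available only in R releases newer than the one current when this thread was written, so treat its availability as an assumption about your installation.

```r
# Toy data from the original question.
x <- data.frame(weight = c(1, 1, 2, 1, 2),
                var1 = factor(c("A", "A", "A", "A", "C")),
                var2 = factor(c("B", "B", "B", "B", "D")))

# Sum 'weight' within each combination of var1 and var2.
# The result keeps the original column names, so no renaming step
# is needed afterwards.
y <- aggregate(weight ~ var1 + var2, data = x, FUN = sum)
y
```

The grouping columns come first and the summed column last, so the column order differs from the y requested in the question; y[c("weight", "var1", "var2")] would reorder it if that matters.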


-- 
David Winsemius


 
 The idea is that there is one occurrence of A B repeated 4 times
 in the original table,
 and it is summarized in the second table, computing the sum of the
 weights. 
 
 I solved the problem using Perl, but I'd like to know what I have to
 read in order to
 do it in R.
 
 Regards,
 Nelson.-




[R] Summarize data for MCA (FactoMineR)

2008-04-25 Thread Nelson Castillo
Hi :-)

I'm new to R and I started using it for a project (I'm the CS guy in a group
of statisticians helping them find out how to solve issues as they come out).
This is my first post to the list and I am starting to learn R.

Well, they were used to doing MCA analysis in other programs where the data
seems to be preprocessed automatically before running MCA.

So, they need to process a data set that comes with N=100 of elements,
but there are really about N/100 distinct elements over all the variables, so
the MCA can be run in reasonable time summarizing data.

So, the question is:

How can I turn x from:

x <-
structure(list(weight = c(1, 1, 2, 1, 2), var1 = structure(c(1L,
1L, 1L, 1L, 2L), .Label = c("A", "C"), class = "factor"), var2 =
structure(c(1L,
1L, 1L, 1L, 2L), .Label = c("B", "D"), class = "factor")), .Names = c("weight",
"var1", "var2"), row.names = c(NA, 5L), class = "data.frame")

to:

y <-
structure(list(weihgt = c(5L, 2L), var1 = structure(1:2, .Label = c("A",
"C"), class = "factor"), var2 = structure(1:2, .Label = c("B",
"D"), class = "factor")), .Names = c("weihgt", "var1", "var2"
), class = "data.frame", row.names = c(NA, -2L))

using R?

That is, from:

 x
  weight var1 var2
1      1    A    B
2      1    A    B
3      2    A    B
4      1    A    B
5      2    C    D

to:

 y
  weihgt var1 var2
1      5    A    B
2      2    C    D


The idea is that the combination A B occurs 4 times in the original
table and is summarized as a single row in the second table, whose
weight is the sum of the original weights.

I solved the problem using Perl, but I'd like to know what I have to
read in order to do it in R.

Regards,
Nelson.-

-- 
http://arhuaco.org
