Re: [R] How to use PC1 of PCA and dim1 of MCA as a predictor in logistic regression model for data reduction

2011-08-20 Thread khosoda

Dear Mark,

Thank you very much for your advice.
I will try it.

I really appreciate all your kind advice.
Thanks a lot again.

Best regards,

Kohkichi



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.





Re: [R] How to use PC1 of PCA and dim1 of MCA as a predictor in logistic regression model for data reduction

2011-08-19 Thread khosoda

Dear Mark,

Thank you very much for your kind advice.

Actually, I already performed penalized logistic regression by pentrace 
and lrm in package rms.


The reason I wanted to reduce the dimensionality of those 9 variables 
was that, according to subject-matter knowledge, they are not so 
important, and I wanted to avoid the events-per-variable problem.


Your answer about dudi.mix$l1 helped me a lot.
I finally was able to perform penalized logistic regression for data 
consisting of 4 important variables and x18.dudi.mix$l1[, 1]. Thanks a 
lot again.


One more question: I also investigated the homals package and found that 
it has an ndim option.


My data are as follows:

> head(x10homals.df)
  age sex      symptom       HT       DM      IHD  smoking hyperlipidemia   Statin Response
1  62   M asymptomatic positive negative negative positive       positive positive negative
2  82   M  symptomatic positive negative negative negative       positive positive negative
3  64   M asymptomatic negative positive negative negative       positive positive negative
4  55   M  symptomatic positive positive positive negative       positive positive negative
5  67   M  symptomatic positive negative negative negative       negative positive negative
6  79   M asymptomatic positive positive negative negative       positive positive negative


age is a continuous variable, and Response should not be active in the 
computation, so:


> x10.homals4 <- homals(x10homals.df, active = c(rep(TRUE, 9), FALSE),
+                       level = c("numerical", rep("nominal", 9)), ndim = 4)


I did this with ndim from 2 to 9 and compared the classification rate of 
Response given by predict(x10.homals).


> p.x10.homals4

Classification rate:
         Variable Cl. Rate %Cl. Rate
1             age   0.4712     47.12
2             sex   0.9808     98.08
3         symptom   0.8269     82.69
4              HT   0.9135     91.35
5              DM   0.8558     85.58
6             IHD   0.8750     87.50
7         smoking   0.9423     94.23
8  hyperlipidemia   0.9519     95.19
9          Statin   0.8942     89.42
10       Response   0.6154     61.54

ndim=4 gave the best classification rate for Response, so I selected it. 
Then I found objscores.


> head(x10.homals4$objscores)
            D1           D2           D3          D4
1 -0.002395321 -0.034032230 -0.008140378  0.02369123
2  0.036788626 -0.010308707  0.005725984 -0.02751958
3  0.014363031  0.049594466 -0.025627467  0.06254055
4  0.083092285  0.065147519  0.045903394 -0.03751551
5 -0.013692504  0.005106661 -0.007656776 -0.04107009
6  0.002320747  0.024375393 -0.017785415 -0.01752556

I used x10.homals4$objscores[, 1] as a predictor for logistic regression 
in the same way as PC1 in PCA.


Am I going the right way?

Thanks a lot for your help in advance.

Best regards

--
Kohkichi Hosoda


(11/08/19 4:21), Mark Difford wrote:

On Aug 18, 2011 khosoda wrote:


I'm trying to do model reduction for logistic regression.


Hi Kohkichi,

My general advice to you would be to do this by fitting a penalized logistic
model (see lrm in package rms and glmnet in package glmnet; there are
several others).
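
To make the idea of a penalized logistic model concrete, here is a minimal base-R sketch of a ridge (L2) penalty fitted with optim(). It is not the lrm/pentrace or glmnet interface named above; the simulated data, the function names pen_nll, pen_grad and fit_ridge, and the penalty value 10 are all assumptions chosen for illustration.

```r
# Simulated stand-in data (hypothetical): 104 subjects, 9 standardized predictors.
set.seed(5)
n <- 104
p <- 9
X <- scale(matrix(rnorm(n * p), n, p))
y <- rbinom(n, 1, plogis(0.7 * X[, 1] - 0.5 * X[, 2]))

# Negative log-likelihood plus a ridge penalty on the slopes.
# The intercept beta[1] is left unpenalized.
pen_nll <- function(beta, lambda) {
  eta <- beta[1] + drop(X %*% beta[-1])
  -sum(y * eta - log1p(exp(eta))) + 0.5 * lambda * sum(beta[-1]^2)
}

# Analytic gradient of pen_nll, used by BFGS.
pen_grad <- function(beta, lambda) {
  mu <- plogis(beta[1] + drop(X %*% beta[-1]))
  c(-sum(y - mu), -drop(crossprod(X, y - mu)) + lambda * beta[-1])
}

# Fit for a given penalty; lambda is passed through optim's '...' argument.
fit_ridge <- function(lambda) {
  optim(rep(0, p + 1), pen_nll, pen_grad, lambda = lambda, method = "BFGS")$par
}

b_unpen <- fit_ridge(0)   # no penalty: ordinary maximum likelihood
b_ridge <- fit_ridge(10)  # penalized: slopes shrunk toward zero

print(round(cbind(unpenalized = b_unpen, ridge = b_ridge), 3))
```

With lambda = 0 the fit should agree with glm(); increasing lambda shrinks the slopes, which is what keeps all 9 variables in the model without spending a full degree of freedom on each.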

Other points: the amounts of variance explained by mixed PCA and by MCA
are not comparable. Furthermore, homals() is a much better choice than MCA
because it handles different types of variables, whereas MCA is for
categorical variables only.

On the more specific question of whether you should use dudi.mix$l1 or
dudi.mix$li, it doesn't matter: the former is a scaled version of the
latter. Same for dudi.acm. To see this do the following:

##
plot(x18.dudi.mix$li[, 1], x18.dudi.mix$l1[, 1])
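
The same point can be checked numerically without ade4. The sketch below uses prcomp() on simulated data as a stand-in: li1 plays the role of the raw first-axis scores and l11 the same scores rescaled to unit variance. These names and the data are assumptions, not objects from the thread. For an ordinary (unpenalized) logistic fit, the two versions give identical fitted probabilities; only the coefficient changes by the scale factor.

```r
# Simulated stand-in data (hypothetical).
set.seed(4)
n <- 104
X <- matrix(rnorm(n * 5), n, 5)
y <- rbinom(n, 1, plogis(0.8 * X[, 1]))

pca <- prcomp(X, scale. = TRUE)
li1 <- pca$x[, 1]           # raw scores on the first axis
l11 <- li1 / pca$sdev[1]    # same scores rescaled to unit variance

fit_li <- glm(y ~ li1, family = binomial)
fit_l1 <- glm(y ~ l11, family = binomial)

# Perfectly collinear scores give the same fitted probabilities.
print(cor(li1, l11))
print(max(abs(fitted(fit_li) - fitted(fit_l1))))

# The slope is multiplied by the scale factor, nothing else changes.
print(c(coef(fit_li)[2] * pca$sdev[1], coef(fit_l1)[2]))
```

Note that a penalized fit is not automatically invariant to this rescaling; whether it is depends on how the penalty is scaled.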

Regards, Mark.

-
Mark Difford (Ph.D.)
Research Associate
Botany Department
Nelson Mandela Metropolitan University
Port Elizabeth, South Africa
--
View this message in context: 
http://r.789695.n4.nabble.com/How-to-use-PC1-of-PCA-and-dim1-of-MCA-as-a-predictor-in-logistic-regression-model-for-data-reduction-tp3750251p3753437.html
Sent from the R help mailing list archive at Nabble.com.





--
*
 Department of Neurosurgery, Kobe University Graduate School of Medicine
 Kohkichi Hosoda
 
 7-5-1 Kusunoki-cho, Chuo-ku, Kobe 650-0017, Japan
Phone: 078-382-5966
Fax  : 078-382-5979
E-mail address
Office: khos...@med.kobe-u.ac.jp
Home  : khos...@venus.dti.ne.jp



Re: [R] How to use PC1 of PCA and dim1 of MCA as a predictor in logistic regression model for data reduction

2011-08-19 Thread Mark Difford
On Aug 19, 2011 khosoda wrote:

 I used x10.homals4$objscores[, 1] as a predictor for logistic regression 
 in the same way as PC1 in PCA. 
 Am I going the right way?

Hi Kohkichi,

Yes, but maybe explore the sets= argument (set Response as the target
variable and the others as the predictor variables). Then use Dim1 scores.
Also think about fitting a rank-1 restricted model, combined with the sets=
option.

See the vignette to the package and look at

@ARTICLE{MIC98,
  author = {Michailides, G. and de Leeuw, J.},
  title = {The {G}ifi system of descriptive multivariate analysis},
  journal = {Statistical Science},
  year = {1998},
  volume = {13},
  pages = {307--336},
  abstract = {}
}

Regards, Mark.

-
Mark Difford (Ph.D.)
Research Associate
Botany Department
Nelson Mandela Metropolitan University
Port Elizabeth, South Africa
--
View this message in context: 
http://r.789695.n4.nabble.com/How-to-use-PC1-of-PCA-and-dim1-of-MCA-as-a-predictor-in-logistic-regression-model-for-data-reduction-tp3750251p3755163.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] How to use PC1 of PCA and dim1 of MCA as a predictor in logistic regression model for data reduction

2011-08-18 Thread Daniel Malter
Pooling nominal with numeric variables and running pca on them sounds like
conceptual nonsense to me. You use PCA to reduce the dimensionality of the
data if the data are numeric. For categorical data analysis, you should use
latent class analysis or something along those lines.

The fact that your first PC captures only 20 percent of the variance
indicates that either you apply the wrong technique or that dimensionality
reduction is of little use for these data more generally. The first step
should generally be to check the correlations/associations between the
variables to inspect whether what you intend to do makes sense.
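
A minimal base-R sketch of such a check, on simulated data: the variable names mirror those in the thread, but the values and the dependence of DM on HT are made up. On 0/1 variables the Pearson correlation is the phi coefficient, so cor() gives a quick overview, and a chi-squared test can be run on any single pair.

```r
# Simulated stand-in data (hypothetical values).
set.seed(6)
n <- 104
HT  <- rbinom(n, 1, 0.6)
DM  <- rbinom(n, 1, plogis(-1 + 1.2 * HT))  # built to depend on HT
IHD <- rbinom(n, 1, 0.2)
age <- round(rnorm(n, mean = 71, sd = 7))
vars <- data.frame(age, HT, DM, IHD)

# Pairwise correlations; between two 0/1 variables this is the phi coefficient.
assoc <- cor(vars)
print(round(assoc, 2))

# Chi-squared test of association for one pair of binary variables.
tab <- table(HT, DM)
print(chisq.test(tab))
```

If most off-diagonal entries are close to zero, the variables share little variance and a single component cannot summarize them well, which would be consistent with a first component explaining only about 20%.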

HTH,
Daniel




--
View this message in context: 
http://r.789695.n4.nabble.com/How-to-use-PC1-of-PCA-and-dim1-of-MCA-as-a-predictor-in-logistic-regression-model-for-data-reduction-tp3750251p3752062.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] How to use PC1 of PCA and dim1 of MCA as a predictor in logistic regression model for data reduction

2011-08-18 Thread khosoda

Dear Daniel,

Thank you for your mail.
Your comment is exactly what I was worried about.

I know very little about latent class analysis, so I would like to use 
multiple correspondence analysis (MCA) for data reduction. Besides, the 
first dimension of the MCA captured 43% of the variance.


Do you think my use of mjca1$rowcoord[, 1] in ca package for data 
reduction in the previous mail is O.K.?


Thank you for your help.

--
Kohkichi Hosoda


Re: [R] How to use PC1 of PCA and dim1 of MCA as a predictor in logistic regression model for data reduction

2011-08-18 Thread Mark Difford
On Aug 17, 2011 khosoda wrote:

 1. Is it O.K. to perform PCA for data consisting of 1 continuous 
 variable and 8 binary variables? 
 2. Is it O.K to perform transformation of age from continuous variable 
 to factor variable for MCA? 
 3. Is mjca1$rowcoord[, 1] the correct values as a predictor of 
 logistic regression model like PC1 of PCA?

Hi Kohkichi,

If you want to do this, i.e. PCA-type analysis with different
variable-types, then look at dudi.mix() in package ade4 and homals() in
package homals.

Regards, Mark.

-
Mark Difford (Ph.D.)
Research Associate
Botany Department
Nelson Mandela Metropolitan University
Port Elizabeth, South Africa
--
View this message in context: 
http://r.789695.n4.nabble.com/How-to-use-PC1-of-PCA-and-dim1-of-MCA-as-a-predictor-in-logistic-regression-model-for-data-reduction-tp3750251p3752168.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] How to use PC1 of PCA and dim1 of MCA as a predictor in logistic regression model for data reduction

2011-08-18 Thread Mark Difford
On Aug 18, 2011; Daniel Malter wrote:

 Pooling nominal with numeric variables and running pca on them sounds like
 conceptual 
 nonsense to me.

Hi Daniel,

This is not true. There are methods that are specifically designed to do a
PCA-type analysis on mixed categorical and continuous variables, viz
dudi.mix and dudi.hillsmith in package ade4. De Leeuw's homals method takes
this a step further, doing amongst other things, a non-linear version of PCA
using any type of variable.

Regards, Mark.

-
Mark Difford (Ph.D.)
Research Associate
Botany Department
Nelson Mandela Metropolitan University
Port Elizabeth, South Africa
--
View this message in context: 
http://r.789695.n4.nabble.com/How-to-use-PC1-of-PCA-and-dim1-of-MCA-as-a-predictor-in-logistic-regression-model-for-data-reduction-tp3750251p3752516.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] How to use PC1 of PCA and dim1 of MCA as a predictor in logistic regression model for data reduction

2011-08-18 Thread khosoda

Dear Mark,

Thank you very much for your mail. This is what I really wanted!
I tried dudi.mix in ade4 package.

> ade4plaque.df <- x18.df[c("age", "sex", "symptom", "HT", "DM", "IHD",
+                           "smoking", "DL", "Statin")]


> head(ade4plaque.df)
  age sex      symptom       HT       DM      IHD  smoking hyperlipidemia   Statin
1  62   M asymptomatic positive negative negative positive       positive positive
2  82   M  symptomatic positive negative negative negative       positive positive
3  64   M asymptomatic negative positive negative negative       positive positive
4  55   M  symptomatic positive positive positive negative       positive positive
5  67   M  symptomatic positive negative negative negative       negative positive
6  79   M asymptomatic positive positive negative negative       positive positive


> x18.dudi.mix <- dudi.mix(ade4plaque.df)
> x18.dudi.mix$eig
[1] 1.7750557 1.4504641 1.2178640 1.0344946 0.8496640 0.8248379 0.7011151
[8] 0.6367328 0.5097718

> x18.dudi.mix$eig[1:9]/sum(x18.dudi.mix$eig)
[1] 0.19722841 0.16116268 0.13531822 0.11494385 0.09440711 0.09164866
[7] 0.07790168 0.07074809 0.05664131


Still, the first component explained only 19.7% of the variance, right?
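
As a quick check of that arithmetic, the eigenvalues printed above can be turned into per-axis and cumulative shares in base R. The vector below is copied from the output; the names eig, prop and cum are just for illustration.

```r
# Eigenvalues copied from the x18.dudi.mix$eig output above.
eig <- c(1.7750557, 1.4504641, 1.2178640, 1.0344946, 0.8496640,
         0.8248379, 0.7011151, 0.6367328, 0.5097718)

prop <- eig / sum(eig)   # share of total inertia per axis
cum  <- cumsum(prop)     # cumulative share

print(sum(eig))
print(round(100 * prop, 1))
print(round(100 * cum, 1))
```

The eigenvalues sum to 9, so each share is simply the eigenvalue divided by 9, and the first axis accounts for about 19.7%.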

Then I looked for the values in dudi.mix that correspond to PC1 in PCA. 
The help file says:

l1   principal components, data frame with n rows and nf columns
li   row coordinates, data frame with n rows and nf columns

So, I guess I should use x18.dudi.mix$l1[, 1].
Am I right?

Or should I use multiple correspondence analysis because its first 
dimension explained 43% of the variance?


Thank you for your help in advance.

Kohkichi




[R] How to use PC1 of PCA and dim1 of MCA as a predictor in logistic regression model for data reduction

2011-08-17 Thread khosoda
Hi all,

I'm trying to do model reduction for logistic regression. I have 13
predictors (4 continuous variables and 9 binary variables). Using subject
matter knowledge, I selected 4 important variables. For the remaining 9
variables, I tried to perform data reduction by principal component
analysis (PCA). However, 8 of the 9 variables were binary and only one was
continuous. I transformed the data with transcan from the rms package and
did PCA with princomp. PC1 explained only 20% of the variance. Still, I
used PC1 as a predictor in the logistic model and obtained some results.
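
A minimal base-R sketch of this workflow, on simulated data: all object names and values here are hypothetical, prcomp() stands in for princomp(), and the transcan step is omitted.

```r
# Simulated stand-in for the "less important" variables (hypothetical values).
set.seed(1)
n <- 104
minor <- data.frame(
  age = round(rnorm(n, mean = 71, sd = 7)),
  HT  = rbinom(n, 1, 0.6),
  DM  = rbinom(n, 1, 0.3),
  IHD = rbinom(n, 1, 0.2)
)
response <- rbinom(n, 1, 0.4)

# PCA on standardized columns, so age does not dominate the 0/1 variables.
pca <- prcomp(minor, center = TRUE, scale. = TRUE)
prop_var <- pca$sdev^2 / sum(pca$sdev^2)  # share of variance per component

# First-component scores: one value per subject.
pc1 <- pca$x[, 1]

# PC1 replaces the four variables with a single predictor.
fit <- glm(response ~ pc1, family = binomial)

print(round(prop_var, 3))
print(coef(fit))
```

The scores in pca$x[, 1] are what gets carried into the model; prop_var shows how much of the original variation that single column retains.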

Then I tried multiple correspondence analysis (MCA). The only
continuous variable was age. I transformed the age variable into the
factor variable age_Q as follows.

> quantile(mydata.df$age)
   0%   25%   50%   75%  100%
53.00 66.75 72.00 76.25 85.00
> age_Q <- cut(x17.df$age, right=TRUE, breaks=c(-Inf, 66, 72, 76, Inf),
+              labels=c("53-66", "67-72", "73-76", "77-85"))
> table(age_Q)
age_Q
53-66 67-72 73-76 77-85
   26    27    25    26
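
For readers who want to try the binning step, here is a small self-contained sketch on simulated ages (the age vector is made up; the breaks and labels are the ones used above). With right=TRUE each interval is closed on the right, so an age of exactly 66 falls in the "53-66" group.

```r
# Hypothetical ages in the same range as the data above.
set.seed(2)
age <- sample(53:85, 104, replace = TRUE)

q <- quantile(age)  # 0%, 25%, 50%, 75%, 100%
print(q)

# Fixed breaks with readable labels; intervals are (lower, upper].
age_Q <- cut(age, breaks = c(-Inf, 66, 72, 76, Inf), right = TRUE,
             labels = c("53-66", "67-72", "73-76", "77-85"))
print(table(age_Q))

# Alternative: take the breaks straight from the quantiles.
# include.lowest = TRUE keeps the minimum age in the first interval.
age_Q2 <- cut(age, breaks = unique(q), include.lowest = TRUE)
print(table(age_Q2))
```

Cutting at quartiles gives groups of roughly equal size, at the cost of discarding the ordering and spacing information within each group.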

Then I used mjca from the ca package for MCA.

> mjca1 <- mjca(mydata.df[, c("age_Q", "sex", "symptom", "HT", "DM",
+                             "IHD", "smoking", "DL", "Statin")])

> summary(mjca1)

Principal inertias (eigenvalues):

 dim    value      %   cum%   scree plot
 1      0.009592  43.4  43.4  *
 2      0.003983  18.0  61.4  **
 3      0.001047   4.7  66.1  **
 4      0.000367   1.7  67.8
        --------
 Total: 0.022111

Dimension 1 explained 43% of the variance. I then wondered which values
I could use in the same way as PC1 in PCA. I explored mjca1 and found
rowcoord.

> mjca1$rowcoord
              [,1]          [,2]        [,3]         [,4]
  [1,]  0.07403748  0.8963482181  0.10828273  1.581381849
  [2,]  0.92433996 -1.1497911361  1.28872517  0.304065865
  [3,]  0.49833354  0.6482940556 -2.4314  0.365023261
  [4,]  0.18998290 -1.4028117048 -1.70962159  0.451951744
  [5,] -0.13008173  0.2557656854  1.16561601 -1.012992485
.
.
[101,] -1.86940216  0.5918128751  0.87352987 -1.118865117
[102,] -2.19096615  1.2845448725  0.25227354 -0.938612155
[103,]  0.77981265 -1.1931087587  0.23934034  0.627601413
[104,] -2.37058237 -1.4014005013 -0.73578248 -1.455055095

Then I used mjca1$rowcoord[, 1] as follows.

> mydata.df$NewScore <- mjca1$rowcoord[, 1]

I used this NewScore as one of the predictors in the model instead of
the original 9 variables.
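
To show where such row scores come from, here is a minimal base-R sketch of MCA computed from the indicator matrix by SVD, on simulated factors. This is an assumption-laden illustration, not what mjca() does internally: mjca() by default analyses the Burt matrix with adjusted inertias, so its percentages and scaling can differ. All data and object names below are hypothetical.

```r
# Simulated categorical data (hypothetical values).
set.seed(3)
n <- 104
df <- data.frame(
  sex     = factor(sample(c("F", "M"), n, replace = TRUE)),
  HT      = factor(sample(c("negative", "positive"), n, replace = TRUE)),
  smoking = factor(sample(c("negative", "positive"), n, replace = TRUE)),
  age_Q   = factor(sample(c("53-66", "67-72", "73-76", "77-85"), n, replace = TRUE))
)

# Indicator matrix: one column per category, exactly one 1 per variable per row.
Z <- do.call(cbind, lapply(df, function(f) model.matrix(~ f - 1)))

P      <- Z / sum(Z)   # correspondence matrix
r_mass <- rowSums(P)   # row masses
c_mass <- colSums(P)   # column masses

# Standardized residuals from independence, then SVD.
E   <- outer(r_mass, c_mass)
S   <- (P - E) / sqrt(E)
dec <- svd(S)

inertia <- dec$d^2                  # principal inertias (indicator version)
row_std <- dec$u / sqrt(r_mass)     # standard row coordinates
score1  <- row_std[, 1]             # first-dimension score, one per subject

print(round(inertia[1:4] / sum(inertia), 3))
print(head(score1))
```

score1 plays the same role here as mjca1$rowcoord[, 1] above: a single numeric column, centred and scaled with respect to the row masses, that can be added to the data frame and used as a predictor.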

The final logistic model obtained by use of MCA was similar to the one
obtained by use of PCA.

My questions are:

1. Is it O.K. to perform PCA for data consisting of 1 continuous
variable and 8 binary variables?

2. Is it O.K to perform transformation of age from continuous variable
to factor variable for MCA?

3. Are the values in mjca1$rowcoord[, 1] the correct ones to use as a
predictor in a logistic regression model, like PC1 of PCA?

I would appreciate your help in advance.

--
Kohkichi Hosoda
