Re: [R] Appropriate regression model for categorical variables

Moshe Olshansky Wed, 13 Jun 2007 18:37:45 -0700

Tirtha wrote:

>Dear users,
>In my psychometric test I have applied logistic regression to my data.
>My data consist of 50 predictors (22 continuous and 28 categorical)
>plus a binary response.
>
>Using glm() and stepAIC() I didn't get a satisfactory result, as the
>misclassification rate is too high. I think the categorical variables
>are responsible for this debacle. Some of them have more than 6 levels
>(one has 10 levels).
>
>Please suggest a better regression model for this situation. If
>possible, please also suggest some articles.
>
>Thanking you.
>
>Tirtha



Hi Tirtha,

Are your categorical variables really categorical?
What I mean is this: if your variable is a user's
satisfaction level (0 for very unsatisfied, 1 for
moderately unsatisfied, 2 for slightly unsatisfied,
3 for neutral, etc., up to 6 for very satisfied), then
it is ordinal rather than truly categorical (since 1 is
closer to 3 than to 6), and you should try what other
people suggest.  However, if your variable is, say, the
50-th amino acid in a certain gene (with values of 1 for
the first amino acid, 2 for the second one, ..., 20 for
the 20-th one), then your variable is really categorical
(you generally cannot say that amino acid 2 is much
closer to amino acid 3 than to amino acid 17).  In
such a case I would try a classification method that
can handle categorical variables or, alternatively,
maybe regression trees (i.e. split on the values of
the categorical variables and at each "node" fit
regression coefficients for the continuous variables).
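
The distinction above can be sketched in R roughly as follows. This is
only an illustration on simulated data, not the original poster's data:
the names satisfaction, residue, x and y are made up for the example.
The first part compares coding an ordinal predictor as a number (one
slope) with coding it as a factor (one coefficient per level). The
second part is a hand-rolled, one-level "split then fit" on a nominal
predictor, not a real tree algorithm.

```r
# Simulated data; every variable name here is hypothetical.
set.seed(1)
n <- 400
satisfaction <- sample(0:6, n, replace = TRUE)   # ordinal scale 0..6
residue <- factor(sample(c("A", "G", "L", "V"), n, replace = TRUE))  # nominal
x <- rnorm(n)                                    # continuous predictor

# Assumed true model: linear effect of satisfaction, and an effect of x
# that is present only when residue == "L".
eta <- -1 + 0.4 * satisfaction + ifelse(residue == "L", 1.5, 0) * x
y <- rbinom(n, 1, plogis(eta))
d <- data.frame(y, x, satisfaction, residue)

# Ordinal predictor: numeric coding uses 1 parameter,
# factor coding uses (number of levels - 1) parameters.
fit_num <- glm(y ~ x + satisfaction, family = binomial, data = d)
fit_fac <- glm(y ~ x + factor(satisfaction), family = binomial, data = d)
print(AIC(fit_num, fit_fac))

# Nominal predictor: split the data on its levels and fit a separate
# logistic regression on the continuous variable within each subset.
fits <- lapply(split(d, d$residue), function(part) {
  glm(y ~ x, family = binomial, data = part)
})
print(sapply(fits, coef))   # one column of coefficients per level
```

With 28 categorical predictors, choosing the splits by hand like this
does not scale. Packages such as rpart (classification trees) or
partykit (glmtree, model-based trees) automate the choice of splits and
may be worth a look.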

Regards,

Moshe Olshansky
[EMAIL PROTECTED]

