Dear All,

I am trying to discriminate among several individuals based on a few sets of variables. (I think some variables do not make sense unless they are entered together, so I "force" them into the models together, hence "sets".) I have done so with linear discriminant analysis (LDA) using MASS::lda, with acceptable results. However, one of my collaborators suggested I use multinomial regression instead. I think his suggestion mainly concerns the choice of which variables (sets) best describe the data. So far I have used a stepwise approach (klaR::stepclass), with the proportion of correct classifications as the criterion for choosing among the sets of variables. However, it has been suggested that I use a method that yields an AIC instead, which would "penalize" the use of more variables. I have never done multinomial regression and am uncertain about some details. I am looking into using R for this, and the function multinom (in package nnet, which accompanies MASS) in particular.
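
To make the AIC idea concrete, here is a minimal sketch of comparing variable sets with nnet::multinom. It is only an illustration under assumptions: the built-in iris data stands in for the real measurements, Species stands in for individual identity, and the two variable "sets" (sepal and petal) and the helper fit_set are hypothetical names.

```r
library(nnet)  # multinom() is provided by nnet, a recommended package shipped with R

# Hypothetical variable sets; each set is always entered as a whole.
sets <- list(
  sepal = c("Sepal.Length", "Sepal.Width"),
  petal = c("Petal.Length", "Petal.Width")
)

# Candidate models: each set alone, and both sets together.
candidates <- list(
  sepal = sets$sepal,
  petal = sets$petal,
  both  = unlist(sets, use.names = FALSE)
)

# Fit one multinomial model for a given vector of predictor names.
fit_set <- function(vars, data) {
  multinom(reformulate(vars, response = "Species"), data = data,
           trace = FALSE, maxit = 500)
}

fits <- lapply(candidates, fit_set, data = iris)
aic  <- sapply(fits, AIC)   # one AIC per candidate model
print(sort(aic))            # lower AIC is preferred
```

Because whole sets are added or dropped at once, the AIC penalty applies to all the coefficients a set contributes, which is the "penalize more variables" behaviour mentioned above.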

In my previous analysis with LDA I measured the proportion of correct classifications using a jackknife (leave-one-out) procedure, i.e. leaving out one datum at a time, fitting the LDA to the rest, and using the resulting discriminant functions to classify the omitted datum. I am thinking about doing the same with the multinomial regression. I would appreciate hearing of any reason why this might not be a good idea.
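
The jackknife described above could be sketched for multinom roughly as follows. Again this assumes iris as stand-in data; the function name loo_accuracy and its arguments are made up for illustration and are not part of nnet.

```r
library(nnet)

# Leave-one-out proportion of correct classifications.
# `response` is the name of the factor column holding the true class.
loo_accuracy <- function(formula, data, response) {
  hits <- vapply(seq_len(nrow(data)), function(i) {
    # Fit without observation i, then classify observation i.
    fit  <- multinom(formula, data = data[-i, , drop = FALSE],
                     trace = FALSE, maxit = 500)
    pred <- predict(fit, newdata = data[i, , drop = FALSE], type = "class")
    as.character(pred) == as.character(data[[response]][i])
  }, logical(1))
  mean(hits)
}

acc <- loo_accuracy(Species ~ Petal.Length + Petal.Width, iris, "Species")
print(acc)
```

One thing to watch with real data: if a class has a single observation, leaving it out removes that class from the fit, so it can never be predicted correctly.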

Also, with the LDA I looked at how much better the discriminant functions perform than random assignment of individual identity. To do this I randomly shuffle the categories, run the LDA, and measure the proportion of correct classifications using the jackknife procedure described above. I repeat this for many iterations and compare the resulting distribution of proportions of correct classifications with the proportion I obtained from the original data. Again, I thought about repeating this with multinom. Is this unnecessary because an equivalent check is already included in the multinom function?
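
The shuffling procedure above, carried over to multinom, might look like the sketch below. The assumptions are the same as before (iris as stand-in data, hypothetical helper loo_accuracy), and the data are thinned and the number of permutations kept very small purely so the example runs quickly. The "+1" form of the p-value is one common convention, not something taken from the analysis described here.

```r
library(nnet)

# Leave-one-out proportion of correct classifications (illustrative helper).
loo_accuracy <- function(formula, data, response) {
  hits <- vapply(seq_len(nrow(data)), function(i) {
    fit  <- multinom(formula, data = data[-i, , drop = FALSE],
                     trace = FALSE, maxit = 200)
    pred <- predict(fit, newdata = data[i, , drop = FALSE], type = "class")
    as.character(pred) == as.character(data[[response]][i])
  }, logical(1))
  mean(hits)
}

set.seed(1)
dat  <- iris[seq(1, nrow(iris), by = 3), ]   # every third row, to keep the run short
form <- Species ~ Petal.Length + Petal.Width

observed <- loo_accuracy(form, dat, "Species")

n_perm <- 20   # far too few for real use; several hundred or more would be usual
null_acc <- replicate(n_perm, {
  shuffled <- dat
  shuffled$Species <- sample(shuffled$Species)  # break the class/measurement link
  loo_accuracy(form, shuffled, "Species")
})

# Proportion of shuffled runs doing at least as well as the real labels.
p_value <- (1 + sum(null_acc >= observed)) / (1 + n_perm)
print(c(observed = observed, null_mean = mean(null_acc), p_value = p_value))
```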

Perhaps this is more of a general statistics question than one about the use of R, but I would appreciate any helpful comments.

Thank you in advance.

Ricardo Antunes
