In article <[EMAIL PROTECTED]>, PrimateAvenger <[EMAIL PROTECTED]> wrote:
>I need help with a, hopefully, easy stats problem. I have two independent
>variables, gender and marital status. These take on the values
>of A or B and X, Y or Z, respectively. I also have a dependent
>variable, shoe size, that takes on real values.

>I've sampled the data and want to run a linear regression model to
>predict shoe size based on gender and marital status. How is this
>done? Can anyone tell me a good reference to learn about dealing
>with categorical values?

Whatever encoding you use, you can run a regression. Normality
for the VARIABLES in a regression is NEVER needed, and even the
"full" theory only needs normality and homoscedasticity in the
errors.

If you code your independent variables as five different indicator
variables (in addition to the constant term), so that each of the
six gender-by-marital-status categories gets its own fitted value,
the resulting regression will give the same prediction as just
using the observed mean in each of the six categories.

You could also do the analysis in various ways by using some
ANOVA methods, but note that you have replications, and quite
likely different numbers of individuals in the six different
classes.
-- 
This address is for information only.  I do not claim that these views
are those of the Statistics Department or of Purdue University.
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907-1399
[EMAIL PROTECTED]         Phone: (765)494-6054   FAX: (765)494-0558
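As a rough illustration of the point about five coded variables, here is a minimal Python sketch using NumPy. The data are made up for the example: the cell counts, the simulated shoe sizes, and the names `cell_counts`, `cell_of`, and `shoe_size` are all assumptions, not anything from the original question. It fits a least-squares regression on a constant plus five 0/1 indicators (one for each gender-by-status cell except the first) and checks that the fitted values match the observed cell means, even with unequal cell sizes.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: unequal numbers of individuals in the six cells.
cell_counts = {("A", "X"): 4, ("A", "Y"): 7, ("A", "Z"): 3,
               ("B", "X"): 6, ("B", "Y"): 2, ("B", "Z"): 5}
cells = list(cell_counts)

# cell_of[i] is the index (0..5) of the cell that individual i belongs to.
cell_of = np.repeat(np.arange(len(cells)), list(cell_counts.values()))

# Simulated shoe sizes (an assumption made purely for illustration).
shoe_size = rng.normal(loc=9.0, scale=1.5, size=cell_of.size)

# Design matrix: a constant column plus five 0/1 indicator columns,
# one for every cell except the first, which serves as the baseline.
indicators = (cell_of[:, None] == np.arange(1, len(cells))[None, :]).astype(float)
X = np.column_stack([np.ones(cell_of.size), indicators])

# Ordinary least squares fit.
coef, *_ = np.linalg.lstsq(X, shoe_size, rcond=None)
fitted = X @ coef

# Observed mean of shoe size within each of the six cells.
cell_means = np.array([shoe_size[cell_of == k].mean() for k in range(len(cells))])

# The regression prediction for each individual equals that individual's cell mean.
agrees = np.allclose(fitted, cell_means[cell_of])
print("fitted values equal cell means:", agrees)
```

In this coding the constant term estimates the mean of the baseline cell, and each indicator coefficient estimates the difference between its cell mean and the baseline. Other full-rank codings of the six cells (for example, main effects plus interactions) would give different coefficients but the same predictions.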
