On Thu, 18 Sep 2003 13:57:14 -0700, Bastian wrote:
> Hello,
>
> I did a regression analysis with 15 variables and 4 of them were not
> significant. I'm not quite sure what's the best solution for this
> problem:
>
> - leaving the regression equation like it is with all variables and
> just don't interpret the not significant variables
>
> or
>
> - making a new regression analysis without the not significant
> variables, i.e. with the method "stepwise".
>
> Any comments on this or literature how to solve this problem right? I
> really appreciate every answer and have to admit I'm quite a newbie in
> statistics...
>
> Thanks a lot,
>
> Bastian

Some of the questions you need to ask:

1. Do the 'non-significant' (NS) variables contribute to improving the
   model fit? Compare model fit metrics with and without the NS
   variables.

2. Do the NS variables contribute to the interpretation of the model
   for the target audience/users, or detract from it?

3. If you remove the NS variables from the model, are there other
   variables in your dataset that might be considered? Do the 15
   constitute all of your available data or only a subset?

4. Would the inclusion of the NS variables result in over-fitting of
   the model?

5. Are there any transformations of the NS variables that might
   increase their power in the model? For example, log(var) or var^2
   for continuous variables. If so, how might this impact the other
   variables and the model fit?

6. On a univariate basis, how, if at all, are the NS variables
   correlated with the dependent (response) variable? Does the
   correlation make sense within the context of your data?

7. As with 6, within the multivariable model, do the regression
   coefficients for the NS variables make sense?

8. If you drop the NS variables, how does that impact the remaining
   variables (which goes back to number 1 above) and the
   interpretation of the model?

9. Is there any prior work in the domain of your data that can offer
   some guidance? This may offer some insight into what others have
   done and possibly any domain-specific community standards that
   might be applicable.

10. What is the intended purpose of the model? Are you doing
    exploratory work or trying to create a prediction model?

Also, one thought to keep in mind: non-significant does not mean
irrelevant.

A good book to review would be:

Regression Modeling Strategies
by Frank E. Harrell
http://www.amazon.com/exec/obidos/ASIN/0387952322/

While some would suggest that Frank's book is targeted to more advanced
users, many of the fundamental design concepts in his book can be used
by anyone.
You might want to see if a copy is available in your library for
review.

You are in an area that can at times be more art than science, so you
may want to get some support from folks with more experience who can
look over your shoulder and offer advice.

HTH,

Marc Schwartz
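
As a rough illustration of questions 1 and 4 above (comparing fit with and without the NS variables), here is a minimal Python sketch. Everything in it is an assumption made for illustration: the data are simulated (two real predictors, two pure-noise predictors), the function names fit_ols and compare_nested are hypothetical, and the small hand-rolled least-squares solver is only there to keep the example self-contained. With real data you would use your statistics package's own model-comparison tools.

```python
import math
import random


def fit_ols(rows, y):
    """Ordinary least squares with an intercept; returns (coefficients, RSS)."""
    X = [[1.0] + list(r) for r in rows]  # prepend the intercept column
    p = len(X[0])
    # Augmented normal equations [X'X | X'y].
    A = [[sum(x[i] * x[j] for x in X) for j in range(p)]
         + [sum(x[i] * yi for x, yi in zip(X, y))]
         for i in range(p)]
    # Gaussian elimination with partial pivoting.
    for c in range(p):
        piv = max(range(c, p), key=lambda r: abs(A[r][c]))
        A[c], A[piv] = A[piv], A[c]
        for r in range(c + 1, p):
            f = A[r][c] / A[c][c]
            for k in range(c, p + 1):
                A[r][k] -= f * A[c][k]
    # Back substitution.
    b = [0.0] * p
    for i in reversed(range(p)):
        b[i] = (A[i][p] - sum(A[i][j] * b[j] for j in range(i + 1, p))) / A[i][i]
    rss = sum((yi - sum(bj * xj for bj, xj in zip(b, x))) ** 2
              for x, yi in zip(X, y))
    return b, rss


def compare_nested(rows, y, drop):
    """Compare the full model with the model omitting the columns in `drop`."""
    n = len(y)
    keep = [j for j in range(len(rows[0])) if j not in drop]
    _, rss_full = fit_ols(rows, y)
    _, rss_red = fit_ols([[r[j] for j in keep] for r in rows], y)
    p_full, p_red = len(rows[0]) + 1, len(keep) + 1  # counts include intercept
    mean_y = sum(y) / n
    tss = sum((yi - mean_y) ** 2 for yi in y)

    def adj_r2(rss, p):
        return 1.0 - (rss / (n - p)) / (tss / (n - 1))

    def aic(rss, p):
        # Gaussian AIC up to an additive constant.
        return n * math.log(rss / n) + 2 * p

    # Partial F statistic for the dropped variables as a group.
    f_stat = ((rss_red - rss_full) / (p_full - p_red)) / (rss_full / (n - p_full))
    return {
        "rss_full": rss_full, "rss_reduced": rss_red,
        "adj_r2_full": adj_r2(rss_full, p_full),
        "adj_r2_reduced": adj_r2(rss_red, p_red),
        "aic_full": aic(rss_full, p_full),
        "aic_reduced": aic(rss_red, p_red),
        "f_stat": f_stat,
    }


# Simulated data (an assumption): columns 0 and 1 matter, 2 and 3 are noise.
random.seed(1)
n = 200
rows = [[random.gauss(0, 1) for _ in range(4)] for _ in range(n)]
y = [1.0 + 2.0 * r[0] - 1.5 * r[1] + random.gauss(0, 1) for r in rows]

result = compare_nested(rows, y, drop={2, 3})
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

The partial F statistic is referred to an F distribution with (number of dropped variables, n minus number of full-model parameters) degrees of freedom; getting a p-value needs an F distribution function, for example scipy.stats.f.sf if SciPy is available. Adjusted R^2 and AIC both penalize the extra parameters, which speaks to the over-fitting question. None of these numbers settles the matter alone: they feed into the judgment calls in the list above rather than replacing them.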
