On Thu, 18 Sep 2003 13:57:14 -0700, Bastian wrote:

> Hello,
> 
> I did a regression analysis with 15 variables and 4 of them were not
> significant. I'm not quite sure what's the best solution for this
> problem:
> 
> - leaving the regression equation as it is, with all variables, and
> just not interpreting the non-significant variables
> 
> or
> 
> - running a new regression analysis without the non-significant
> variables, e.g. with the "stepwise" method.
> 
> Any comments on this or literature how to solve this problem right? I
> really appreciate every answer and have to admit I'm quite a newbie in
> statistics...
> 
> Thanks a lot,
> 
> Bastian


Some of the questions you need to ask:

1. Do the 'non-significant' (NS) variables contribute to improving the
model fit?  Compare model fit metrics with and without the NS variables.

2. Do the NS variables contribute to the interpretation of
the model for the target audience/users or detract from it?

3. If you remove the NS variables from the model, are there other
variables in your dataset that might be considered? Do the 15 constitute
all of your available data or only a subset?

4. Would the inclusion of the NS variables result in over-fitting of the
model?

5. Are there any transformations of the NS variables that might
increase their explanatory power in the model? For example, log(var) or
var^2 for continuous variables. If so, how might this impact the other
variables and model fit?

6. On a univariate basis how, if at all, are the NS variables correlated
with the dependent (response) variable? Does the correlation make sense
within the context of your data?

7. As with 6, within the multivariable model, do the regression model
parameters for the NS variables make sense?

8. If you drop the NS variables, how does that impact the remaining
variables (which goes back to number 1 above) and the interpretation of
the model?

9. Is there any prior work in the domain of your data that can offer
some guidance? This may offer some insight into what others have done and
possibly any domain-specific community standards that might be applicable.

10. What is the intended purpose of the model? Are you doing exploratory
reviews or trying to create a prediction model?
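
As a rough illustration of question 1, here is a minimal Python sketch
that fits a full and a reduced model and compares them by adjusted R^2
and a partial F statistic. Everything in it is an assumption made for the
example: the data are simulated, x3 merely plays the role of an NS
variable, and the helper names (solve, fit_rss, adjusted_r2) are made up.
It uses only the standard library, so it stops at the F statistic rather
than computing a p-value.

```python
import random

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_rss(columns, y):
    """Least-squares fit with an intercept; return (RSS, number of coefficients)."""
    rows = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    p = len(rows[0])
    xtx = [[sum(r[j] * r[k] for r in rows) for k in range(p)] for j in range(p)]
    xty = [sum(r[j] * yi for r, yi in zip(rows, y)) for j in range(p)]
    beta = solve(xtx, xty)
    rss = sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2
              for r, yi in zip(rows, y))
    return rss, p

def adjusted_r2(rss, tss, n, p):
    """Adjusted R^2, where p counts all coefficients including the intercept."""
    return 1.0 - (rss / (n - p)) / (tss / (n - 1))

# Simulated data (an assumption): y depends on x1 and x2 only; x3 is noise.
random.seed(1)
n = 60
x1 = [random.gauss(0, 1) for _ in range(n)]
x2 = [random.gauss(0, 1) for _ in range(n)]
x3 = [random.gauss(0, 1) for _ in range(n)]
y = [1.0 + 2.0 * a - 1.5 * b + random.gauss(0, 1) for a, b in zip(x1, x2)]

ybar = sum(y) / n
tss = sum((v - ybar) ** 2 for v in y)

rss_full, p_full = fit_rss([x1, x2, x3], y)   # with the 'NS' variable
rss_red, p_red = fit_rss([x1, x2], y)         # without it

adj_full = adjusted_r2(rss_full, tss, n, p_full)
adj_red = adjusted_r2(rss_red, tss, n, p_red)

# Partial F statistic for the q dropped terms; compare against an
# F(q, n - p_full) distribution to judge their joint contribution.
q = p_full - p_red
f_stat = ((rss_red - rss_full) / q) / (rss_full / (n - p_full))

print(f"adjusted R^2, full model:    {adj_full:.4f}")
print(f"adjusted R^2, reduced model: {adj_red:.4f}")
print(f"partial F on {q} and {n - p_full} df: {f_stat:.4f}")
```

With several NS variables, pass all of them to the full fit and drop them
together; q then counts how many terms were removed, and the F statistic
tests them jointly rather than one at a time.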

Also, one thought to keep in mind:

Non-significant does not mean irrelevant.
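
One way this can happen, sketched in Python on made-up data: a predictor
whose straight-line association with the response is essentially zero
even though the response is completely determined by it, because the
relationship is curved. The data and the pearson helper are assumptions
for illustration only, and the sketch looks at correlations rather than
at a formal significance test.

```python
def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / (var_a * var_b) ** 0.5

# Hypothetical predictor, symmetric about zero, and a response that is an
# exact quadratic function of it.
x = [i / 2 for i in range(-10, 11)]
y = [2.0 + 0.5 * v * v for v in x]

r_linear = pearson(x, y)                       # association with x itself
r_quadratic = pearson([v * v for v in x], y)   # association with x^2

print(f"correlation of y with x:   {r_linear:.4f}")
print(f"correlation of y with x^2: {r_quadratic:.4f}")
```

A linear term for x would look useless here, while the var^2
transformation from question 5 recovers the relationship.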


A good book to review would be:

Regression Modeling Strategies
by Frank E. Harrell 
http://www.amazon.com/exec/obidos/ASIN/0387952322/

While some would suggest that Frank's book is targeted at more advanced
users, many of the fundamental design concepts in his book can be used by
anyone.  You might want to see if a copy is available in your library for
review.

You are in an area that can at times be more art than science, so you may
want to get some support from folks with more experience who can look over
your shoulder and offer advice.

HTH,

Marc Schwartz

