Dear Jason,

In relation to question 1, I believe that what is critical is the final use
of the regression. If one is making causal claims, then it is very
important to understand the causal structure since conditioning on
inappropriate variables can lead to nonsense results. If the use of the
regression is purely descriptive then collinearity may be the only problem,
but one must be careful not to make causal interpretations. The book
"Causality: Models, Reasoning and Inference" by Judea Pearl discusses the
causal question and is full of references.
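
To make the point about conditioning concrete for students, here is a minimal simulation sketch. It is my own illustration, not taken from Pearl's book or from the paper under review, and it uses ordinary least squares rather than logistic regression to keep it short. The variable names and the data-generating process are assumptions: x and y are independent by construction, and c is a "collider" caused by both.

```python
import numpy as np

def collider_demo(n=50_000, seed=0):
    """Return the slope of y on x without and with a collider as a control."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    y = rng.normal(size=n)              # independent of x by construction
    c = x + y + rng.normal(size=n)      # collider: caused by both x and y
    ones = np.ones(n)

    # Regression of y on x alone (intercept + x).
    coef_without, *_ = np.linalg.lstsq(np.column_stack([ones, x]), y, rcond=None)
    # Same regression, but also conditioning on the collider c.
    coef_with, *_ = np.linalg.lstsq(np.column_stack([ones, x, c]), y, rcond=None)
    return coef_without[1], coef_with[1]

slope_without, slope_with = collider_demo()
print(f"slope of y on x, c omitted:  {slope_without:.3f}")
print(f"slope of y on x, c included: {slope_with:.3f}")
```

Under this setup the slope on x should be close to zero when c is omitted and close to -0.5 when c is included, even though x has no effect on y at all. With a sample this large, the spurious coefficient would also be highly "significant".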

In relation to question 2, I've seen this question mentioned in many
books. In his Introductory Econometrics book, Wooldridge draws the
distinction between statistical and economic significance; however, I don't
know whether he cites any particular paper on the subject.
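
For the numbers quoted in question 2 (b=-0.18, se=0.08), the arithmetic can be sketched as below. This assumes a Wald-type 95% interval based on the normal approximation, which may not be exactly what the paper used; the function name and dictionary keys are just illustrative.

```python
import math

def wald_summary(b, se, z_crit=1.959964):
    """Wald z statistic, two-sided p-value and 95% CI for a logit coefficient."""
    z = b / se
    ci_low, ci_high = b - z_crit * se, b + z_crit * se
    # Two-sided p-value under the standard normal approximation.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return {
        "z": z,
        "p_value": p_value,
        "ci_low": ci_low,
        "ci_high": ci_high,
        # Exponentiate to express the estimate and interval as odds ratios.
        "or": math.exp(b),
        "or_low": math.exp(ci_low),
        "or_high": math.exp(ci_high),
    }

summary = wald_summary(b=-0.18, se=0.08)
for name, value in summary.items():
    print(f"{name:8s} {value: .4f}")
```

Because b is negative, the bound nearest zero is the upper one, roughly -0.02 on the logit scale, which corresponds to an odds ratio of about 0.98. Looking at the interval on the odds-ratio scale is one way to discuss practical rather than statistical significance.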

Good teaching,
Donald Pianto
Department of Statistics
University of Brasília

On Wed, Mar 21, 2012 at 5:59 AM, Iasonas Lamprianou <[email protected]> wrote:

> Dear all,
> I have a question which can be extended to the general context of
> regression modelling. If you feel that this question is beyond the scope
> of this list, please say so and I will apologize. However, it does have
> to do with teaching.
>
>
> Question 1: I am reviewing a paper in which the author uses a sample of
> around 50,000 cases to run a logistic regression with 22 independent
> variables. Using too many independent variables may cause collinearity
> problems. Beyond this, however, I am not aware of any other problems
> caused by using too many variables in a model. This is also related to the
> problem of throwing tens of variables into a model en masse and then
> waiting for statistically significant results. Can anyone suggest relevant
> literature to give to my students to read?
>
>
> Question 2: Some coefficients of a different logistic model in the same
> paper are marginally significant, e.g. b=-0.18 and se=0.08. The only reason
> this is significant is that the researcher used a large sample for this
> model (around two thousand cases, N=2000). The lower bound of the
> confidence interval is almost zero. Can anyone suggest a good reference to
> say that in such a case we should also check the "practical significance"
> and, since the lower bound is so close to zero, we should be careful about
> what we claim about the effect?
>
> Thank you for your time
> Jason
>
>
>
> Dr. Iasonas Lamprianou
> Department of Social and Political Sciences
> University of Cyprus
>
> _______________________________________________
> [email protected] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-teaching