Dear all,
I have a question that applies to regression modelling in general. If you 
feel that this question is beyond the scope of this list, please say so and 
I will apologize. However, it does relate to teaching. 


Question 1: I am reviewing a paper in which the author uses a sample of 
around 50,000 cases to run a logistic regression with 22 independent 
variables. Using too many independent variables may cause collinearity 
problems. Beyond this, however, I am not aware of any other problems caused by 
using too many variables in a model. This is also related to the practice of 
throwing tens of variables into a model and then waiting for statistically 
significant results. Can anyone suggest relevant literature to give to my 
students to read?
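
To make the "waiting for significant results" concern concrete, here is a 
minimal sketch in Python. It assumes, purely for illustration, that all 22 
coefficients are truly zero, that each is tested at the 5% level, and that the 
tests are independent (which real predictors rarely are). The variable names 
are my own.

```python
# Chance of at least one "significant" coefficient when every null is true.
# Assumptions (illustrative only): independent tests, alpha = 0.05 each.
alpha = 0.05   # per-test significance level
k = 22         # number of independent variables tested

# Probability that at least one of the k tests rejects by chance alone.
fwer = 1 - (1 - alpha) ** k

# Expected number of spurious rejections among the k tests.
expected_false_positives = alpha * k

print(f"P(at least one false positive) = {fwer:.3f}")
print(f"Expected false positives       = {expected_false_positives:.2f}")
```

Under these assumptions, a model with 22 unrelated predictors is more likely 
than not to show at least one significant coefficient.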


Question 2: Some coefficients of a different logistic model in the same paper 
are marginally significant, e.g. b=-0.18 and se=0.08. The only reason this is 
significant is that the researcher used a large sample for this model 
(around two thousand cases, N=2000). The bound of the confidence interval 
nearest to zero almost reaches zero. Can anyone suggest a good reference to say 
that in such a case we should also check the "practical significance" and 
that, since that bound is so close to zero, we should be careful about what we 
claim about the effect?
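
For concreteness, the figures quoted above can be worked through as follows. 
This is a minimal sketch in Python, assuming a Wald-type normal approximation 
and a 95% confidence level; neither is stated in the paper as far as I have 
described it, and the variable names are my own.

```python
import math
from statistics import NormalDist

# Values quoted from the paper.
b = -0.18   # estimated logit coefficient
se = 0.08   # its standard error

std_normal = NormalDist()  # standard normal, mean 0 and sd 1

# Wald z statistic and two-sided p-value (normal approximation assumed).
z = b / se
p_value = 2 * (1 - std_normal.cdf(abs(z)))

# 95% confidence interval on the logit scale (confidence level assumed).
z_crit = std_normal.inv_cdf(0.975)
ci_low = b - z_crit * se
ci_high = b + z_crit * se

# The same estimate and interval expressed as odds ratios.
odds_ratio = math.exp(b)
or_low = math.exp(ci_low)
or_high = math.exp(ci_high)

print(f"z = {z:.2f}, p = {p_value:.4f}")
print(f"95% CI for b:  ({ci_low:.4f}, {ci_high:.4f})")
print(f"Odds ratio:    {odds_ratio:.3f} ({or_low:.3f}, {or_high:.3f})")
```

On the odds-ratio scale the interval runs from a reduction in the odds of 
roughly 29% to one of roughly 2%, so the data are compatible with an effect 
that is practically negligible.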

Thank you for your time
Jason  


 
Dr. Iasonas Lamprianou
Department of Social and Political Sciences
University of Cyprus



