Thanks, I will have a look. Judea Pearl's book seems to be famous!

 
Dr. Iasonas Lamprianou
Department of Social and Political Sciences
University of Cyprus


>________________________________
> From: Donald Pianto <[email protected]>
>To: Iasonas Lamprianou <[email protected]> 
>Cc: "[email protected]" <[email protected]> 
>Sent: Wednesday, 21 March 2012, 16:50
>Subject: Re: [R-sig-teaching] regerssion issues
> 
>
>Dear Jason,
>
>
>In relation to question 1, I believe that what is critical is the final use of 
>the regression. If one is making causal claims, then it is very important to 
>understand the causal structure since conditioning on inappropriate variables 
>can lead to nonsense results. If the use of the regression is purely 
>descriptive then collinearity may be the only problem, but one must be careful 
>not to make causal interpretations. The book "Causality: Models, Reasoning and 
>Inference" by Judea Pearl discusses the causal question and is full of 
>references.
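The point about conditioning on inappropriate variables can be illustrated with a small simulation. The sketch below is not from the thread: it assumes a linear model rather than the logistic model under discussion, and the variable names (x, y, collider) and the data-generating process are hypothetical. Here x and y are generated independently, and collider is caused by both. Regressing y on x alone should give a slope near zero, while adding the collider as a covariate produces a clearly negative coefficient on x even though x has no effect on y.

```python
import random

random.seed(2012)  # fixed seed so the run is reproducible
n = 20000

# Hypothetical data-generating process: x and y are independent,
# and `collider` is a common effect of both.
x = [random.gauss(0, 1) for _ in range(n)]
y = [random.gauss(0, 1) for _ in range(n)]
collider = [xi + yi + random.gauss(0, 1) for xi, yi in zip(x, y)]


def centred_cross_product(a, b):
    """Sum of (a_i - mean(a)) * (b_i - mean(b))."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))


def slope_simple(x1, y):
    """Least-squares slope of y on x1 (with intercept)."""
    return centred_cross_product(x1, y) / centred_cross_product(x1, x1)


def slopes_two_predictors(x1, x2, y):
    """Least-squares slopes of y on x1 and x2 (with intercept),
    obtained by solving the 2x2 normal equations on centred data."""
    s11 = centred_cross_product(x1, x1)
    s22 = centred_cross_product(x2, x2)
    s12 = centred_cross_product(x1, x2)
    s1y = centred_cross_product(x1, y)
    s2y = centred_cross_product(x2, y)
    det = s11 * s22 - s12 ** 2
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    return b1, b2


slope_unadjusted = slope_simple(x, y)
slope_adjusted, slope_collider = slopes_two_predictors(x, collider, y)

print(f"y ~ x            : slope on x = {slope_unadjusted:+.3f}")
print(f"y ~ x + collider : slope on x = {slope_adjusted:+.3f}")
```

Under these assumptions the adjusted slope on x settles near -0.5. That value is an artefact of conditioning on the collider, not a causal effect.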
>
>
>In relation to question 2, I've seen mention of this question in many books. 
>In Wooldridge's Introductory Econometrics book he draws the distinction 
>between statistical and economic significance; however, I don't know if he 
>cites any particular paper on the subject.
>
>
>Good teaching,
>Donald Pianto
>Department of Statistics
>University of Brasília
>
>
>On Wed, Mar 21, 2012 at 5:59 AM, Iasonas Lamprianou <[email protected]> 
>wrote:
>
>>Dear all,
>>I have a question that applies to regression modelling in general. If you
>>feel that this question is beyond the scope of 
>>this list, please say so and I will apologize. However, this has to do with 
>>teaching.
>>
>>
>>Question 1:  I am reviewing a paper and the author uses a sample size of 
>>around 50,000 cases to run a logistic regression. He is using 22 independent
>>variables. Using too many independent variables may cause collinearity
>>problems. Beyond this, however, I am not aware of any other problems caused by
>>using too many variables in a model. However, this is also related to the 
>>problem of indiscriminately throwing tens of variables into a model and then waiting 
>>for statistically significant results. Can anyone suggest relevant literature 
>>to give to my students to read?
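The concern about waiting for statistically significant results can be made concrete with a little arithmetic. The sketch below assumes, purely for illustration, that all 22 coefficients are truly zero and that the 22 tests are independent and each run at the 5% level. Neither assumption comes from the paper, and correlated predictors would change the exact figures.

```python
n_tests = 22   # number of independent variables mentioned in the question
alpha = 0.05   # assumed per-test significance level

# Chance that at least one test rejects when every null hypothesis is true,
# under the (strong) assumption that the tests are independent.
p_at_least_one = 1 - (1 - alpha) ** n_tests

# Expected number of false positives under the same assumptions.
expected_false_positives = n_tests * alpha

# Bonferroni-adjusted per-test threshold that keeps the family-wise
# error rate at or below alpha.
bonferroni_threshold = alpha / n_tests

print(f"P(at least one false positive) = {p_at_least_one:.3f}")
print(f"Expected false positives       = {expected_false_positives:.2f}")
print(f"Bonferroni per-test threshold  = {bonferroni_threshold:.5f}")
```

Under these assumptions the chance of at least one spurious "significant" coefficient is roughly two in three, which is why a handful of significant results among 22 predictors is weak evidence on its own.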
>>
>>
>>Question 2: Some coefficients of a different logistic model in the same paper 
>>are marginally significant, e.g. b=-0.18 and se=0.08. The only reason this is 
>>significant is that the researcher used a large sample for this model 
>>(around two thousand cases, N=2000). The confidence interval almost reaches 
>>zero. Can anyone suggest a good reference to say that in 
>>such a case we should also check the "practical significance" and, since the 
>>interval comes so close to zero, we should be careful about what we claim about 
>>the effect?
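The figures quoted in Question 2 can be checked directly. The sketch below assumes a Wald test with a normal approximation and a 95% confidence level; the thread does not say how the paper computed its intervals. It reports the z statistic, the two-sided p-value, the confidence interval on the log-odds scale, and the same interval as an odds ratio, which is often easier to judge for practical significance.

```python
from math import exp
from statistics import NormalDist

b = -0.18   # coefficient quoted in the question (log-odds scale)
se = 0.08   # its standard error

std_normal = NormalDist()  # standard normal, mean 0 and sd 1

# Wald z statistic and two-sided p-value.
z = b / se
p_value = 2 * (1 - std_normal.cdf(abs(z)))

# 95% confidence interval on the log-odds scale.
z_crit = std_normal.inv_cdf(0.975)
ci_low = b - z_crit * se
ci_high = b + z_crit * se

# The same estimate and interval expressed as odds ratios.
odds_ratio = exp(b)
or_low, or_high = exp(ci_low), exp(ci_high)

print(f"z = {z:.2f}, two-sided p = {p_value:.4f}")
print(f"95% CI for b          : ({ci_low:.4f}, {ci_high:.4f})")
print(f"odds ratio and 95% CI : {odds_ratio:.3f} ({or_low:.3f}, {or_high:.3f})")
```

With these inputs the interval runs from about -0.34 to about -0.02 on the log-odds scale, so the data are compatible with anything from a sizeable effect to a negligible one. Because b is negative, the bound closest to zero is the upper one.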
>>
>>Thank you for your time
>>Jason 
>>
>>
>> 
>>Dr. Iasonas Lamprianou
>>Department of Social and Political Sciences
>>University of Cyprus
>>
>>
>>
>>
>>
>>
>>_______________________________________________
>>[email protected] mailing list
>>https://stat.ethz.ch/mailman/listinfo/r-sig-teaching
>>
>>
>
>
>

