On 04/05/2017 10:28 AM, Nick Brown wrote:
Hello,

I hope I am posting to the right place. I was advised to try this list by Ben 
Bolker (https://twitter.com/bolkerb/status/859909918446497795). I also posted 
this question to StackOverflow 
(http://stackoverflow.com/questions/43771269/lm-gives-different-results-from-lm-ridgelambda-0).
 I am a relative newcomer to R, but I wrote my first program in 1975 and have 
been paid to program in about 15 different languages, so I have some general 
background knowledge.


I have a regression from which I extract the coefficients like this:
lm(y ~ x1 * x2, data=ds)$coef
That gives: x1=0.40, x2=0.37, x1*x2=0.09



When I do the same regression in SPSS, I get:
beta(x1)=0.40, beta(x2)=0.37, beta(x1*x2)=0.14.
So the main effects are in agreement, but there is quite a difference in the 
coefficient for the interaction.

I don't know about this instance, but a common cause of this sort of difference is a different parametrization. If that's the case, then predictions in the two systems would match, even if coefficients don't.

Duncan Murdoch
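A minimal sketch of that check, assuming the same ds/y/x1/x2 as in the post (MASS::lm.ridge() has no predict() method, so predictions are built by hand from coef(), which returns the intercept and slopes on the original data scale):

library(MASS)

fit_lm    <- lm(y ~ x1 * x2, data = ds)
fit_ridge <- lm.ridge(y ~ x1 * x2, lambda = 0, data = ds)

X <- model.matrix(~ x1 * x2, data = ds)     # same design matrix for both fits
pred_lm    <- fitted(fit_lm)
pred_ridge <- drop(X %*% coef(fit_ridge))   # manual prediction for the ridge fit

# If the difference is only one of parametrization/scaling, the fitted
# values should agree to numerical precision even if the coefficients differ.
all.equal(unname(pred_lm), unname(pred_ridge))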



x1 and x2 are correlated at about .75 (yes, yes, I know - this model wasn't my 
idea, but it got published), so there is quite possibly something going on with 
collinearity. I therefore thought I'd try lm.ridge() to see if I can get an idea of 
where the problems are occurring.


The starting point is to run lm.ridge() with lambda=0 (i.e., no ridge penalty) 
and check that we get the same results as with lm():
lm.ridge(y ~ x1 * x2, lambda=0, data=ds)$coef
x1=0.40, x2=0.37, x1*x2=0.14
So lm.ridge() agrees with SPSS, but not with lm(). (Of course, lambda=0 is the default, 
so it can be omitted; I can alternate between including and deleting ".ridge" in 
the function call, and watch the coefficient for the interaction change.)
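The two calls side by side, as a sketch using the same names as above; the coef() extractor for the lm.ridge fit is shown for contrast, since the $coef component of a fitted object and the coef() method are not guaranteed to return the same thing:

library(MASS)

coef(lm(y ~ x1 * x2, data = ds))                    # lm: interaction reported as ~0.09
lm.ridge(y ~ x1 * x2, lambda = 0, data = ds)$coef   # $coef component quoted above: interaction ~0.14
coef(lm.ridge(y ~ x1 * x2, lambda = 0, data = ds))  # coef() method, for contrast with $coef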



What seems slightly strange to me here is that I had assumed that lm.ridge() just piggybacks 
on lm() anyway, so in the specific case where lambda=0 and there is no 
"ridging" to do, I'd expect exactly the same results.


Unfortunately there are 34,000 cases in the dataset, so a "minimal" reprex will 
not be easy to make, but I can share the data via Dropbox or something if that would help.
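In the meantime, here is a hypothetical simulated stand-in (not the real 34,000-case dataset): two predictors correlated at roughly .75, with coefficients loosely based on the estimates quoted above, so anyone can run the same comparison:

library(MASS)

set.seed(1)
n  <- 1000
Xs <- mvrnorm(n, mu = c(0, 0),
              Sigma = matrix(c(1, 0.75, 0.75, 1), nrow = 2))  # cor(x1, x2) ~ 0.75
ds_sim <- data.frame(x1 = Xs[, 1], x2 = Xs[, 2])
ds_sim$y <- with(ds_sim, 0.4 * x1 + 0.37 * x2 + 0.1 * x1 * x2 + rnorm(n))

coef(lm(y ~ x1 * x2, data = ds_sim))
lm.ridge(y ~ x1 * x2, lambda = 0, data = ds_sim)$coef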



I appreciate that when there is strong collinearity, all bets are off in 
terms of what the betas mean, but I would really expect lm() and lm.ridge() to 
give the same results. (I would be happy to ignore SPSS, but for the moment 
it's part of the majority!)



Thanks for reading,
Nick



______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

