Re: [Rd] lm() gives different results to lm.ridge() and SPSS

2017-05-06 Thread peter dalgaard

> On 6 May 2017, at 01:49 , Nick Brown <nick.br...@free.fr> wrote:
> 
> Hi John,
> 
> Thanks for the comment... but that appears to mean that SPSS has a big 
> problem.  I have always been told that to include an interaction term in a 
> regression, the only way is to do the multiplication by hand.  But then it 
> seems to be impossible to stop SPSS from re-standardizing the variable that 
> corresponds to the interaction term.  Am I missing something?  Is there a way 
> to perform the regression with the interaction in SPSS without computing the 
> interaction as a separate variable?

Just look at the unstandardized coefficients in SPSS. The standardized ones are 
of some use, but their value is limited in the case of synthesized regressors 
like product terms. I imagine that the interpretation also goes wacky for 
squared terms, dummy-coded groupings, etc.

(Does SPSS really still not have an automated way of generating interaction 
terms? 1977 called... Some quick googling suggests that GLM understands them, 
but REGRESSION does not.)
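
For contrast, R's formula interface builds the interaction column 
automatically. A toy illustration (made-up data, so purely a sketch):

> d <- data.frame(x1 = c(1, 2, 3), x2 = c(4, 5, 6))
> model.matrix(~ x1 * x2, data = d)
  (Intercept) x1 x2 x1:x2
1           1  1  4     4
2           1  2  5    10
3           1  3  6    18
attr(,"assign")
[1] 0 1 2 3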

-pd


Re: [Rd] lm() gives different results to lm.ridge() and SPSS

2017-05-05 Thread Nick Brown
Hi John, 

Thanks for the comment... but that appears to mean that SPSS has a big problem. 
I have always been told that to include an interaction term in a regression, 
the only way is to do the multiplication by hand. But then it seems to be 
impossible to stop SPSS from re-standardizing the variable that corresponds to 
the interaction term. Am I missing something? Is there a way to perform the 
regression with the interaction in SPSS without computing the interaction as a 
separate variable? 

Best, 
Nick 


Re: [Rd] lm() gives different results to lm.ridge() and SPSS

2017-05-05 Thread Viechtbauer Wolfgang (SP)
Well, one correction -- the 'standardized coefficients' that SPSS shows are 
based on standardizing all variables separately (so x1, x2, and x1*x2 are all 
standardized). So with respect to that, the criticism certainly stands.
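
In R terms, a sketch of what that Beta column then corresponds to, if I read 
the Beta definition correctly (b rescaled by sd(x)/sd(y)): the outcome and 
every regressor, including the product, standardized separately (dat as 
elsewhere in this thread; not run here):

res <- lm(scale(DEPRESSION) ~ scale(ZMEAN_PA) + scale(ZDIVERSITY_PA) +
          scale(ZMEAN_PA * ZDIVERSITY_PA), data=dat)
coef(res)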


Re: [Rd] lm() gives different results to lm.ridge() and SPSS

2017-05-05 Thread Viechtbauer Wolfgang (SP)
Totally agree that standardizing the interaction term is nonsense. But in all 
fairness, SPSS doesn't do that. In fact, the 'REGRESSION' command in SPSS 
doesn't compute any interaction terms -- one has to first compute them 'by 
hand' and then add them to the model like any other variable. So somebody 
worked extra hard to standardize that interaction term in SPSS as well :/

Best,
Wolfgang


Re: [Rd] lm() gives different results to lm.ridge() and SPSS

2017-05-05 Thread Fox, John
Dear Nick,


On 2017-05-05, 9:40 AM, "R-devel on behalf of Nick Brown"
<r-devel-boun...@r-project.org on behalf of nick.br...@free.fr> wrote:

>>I conjecture that something in the vicinity of
>> res <- lm(DEPRESSION ~ scale(ZMEAN_PA) + scale(ZDIVERSITY_PA) +
>>scale(ZMEAN_PA * ZDIVERSITY_PA), data=dat)
>>summary(res) 
>> would reproduce the SPSS Beta values.
>
>Yes, that works. Thanks!

That you have to work hard in R to match the SPSS results isn’t such a bad
thing when you factor in the observation that standardizing the
interaction regressor, ZMEAN_PA * ZDIVERSITY_PA, separately from each of
its components, ZMEAN_PA and ZDIVERSITY_PA, is nonsense.

Best,
 John

-
John Fox, Professor
McMaster University
Hamilton, Ontario, Canada
Web: http://socserv.mcmaster.ca/jfox/



Re: [Rd] lm() gives different results to lm.ridge() and SPSS

2017-05-05 Thread Nick Brown
>I conjecture that something in the vicinity of 
> res <- lm(DEPRESSION ~ scale(ZMEAN_PA) + scale(ZDIVERSITY_PA) + 
> scale(ZMEAN_PA * ZDIVERSITY_PA), data=dat) 
>summary(res) 
> would reproduce the SPSS Beta values. 

Yes, that works. Thanks! 


Re: [Rd] lm() gives different results to lm.ridge() and SPSS

2017-05-05 Thread peter dalgaard
Thanks, I was getting to try this, but got side tracked by actual work...

Your analysis reproduces the SPSS unscaled estimates. It still remains to 
figure out how Nick got

> coefficients(lm(ZDEPRESSION ~ ZMEAN_PA * ZDIVERSITY_PA, data=s1))
           (Intercept)               ZMEAN_PA          ZDIVERSITY_PA ZMEAN_PA:ZDIVERSITY_PA
            0.07342198            -0.39650356            -0.36569488            -0.09435788


which does not match your output. I suspect that ZMEAN_PA and ZDIVERSITY_PA 
were scaled for this analysis (but the interaction term still obviously is 
not). I conjecture that something in the vicinity of

res <- lm(DEPRESSION ~ scale(ZMEAN_PA) + scale(ZDIVERSITY_PA) +
          scale(ZMEAN_PA * ZDIVERSITY_PA), data=dat)
summary(res)

would reproduce the SPSS Beta values.



Re: [Rd] lm() gives different results to lm.ridge() and SPSS

2017-05-05 Thread Viechtbauer Wolfgang (SP)
I had no problems running regression models in SPSS and R that yielded the same 
results for these data.

The difference you are observing is from fitting different models. In R, you 
fitted:

res <- lm(DEPRESSION ~ ZMEAN_PA * ZDIVERSITY_PA, data=dat)
summary(res)

The interaction term is the product of ZMEAN_PA and ZDIVERSITY_PA. This is not 
a standardized variable itself and not the same as "ZINTER_PA_C" in the png you 
showed, which is not a variable in the dataset, but can be created with:

dat$ZINTER_PA_C <- with(dat, scale(ZMEAN_PA * ZDIVERSITY_PA))

If you want the same results as in SPSS, then you need to fit:

res <- lm(DEPRESSION ~ ZMEAN_PA + ZDIVERSITY_PA + ZINTER_PA_C, data=dat)
summary(res)

This yields:

Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept)    6.41041    0.01722  372.21   <2e-16 ***
ZMEAN_PA      -1.62726    0.04200  -38.74   <2e-16 ***
ZDIVERSITY_PA -1.50082    0.07447  -20.15   <2e-16 ***
ZINTER_PA_C   -0.58955    0.05288  -11.15   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Exactly the same as in the png.

Peter already mentioned this as a possible reason for the discrepancy: 
https://stat.ethz.ch/pipermail/r-devel/2017-May/074191.html ("Is it perhaps the 
case that x1 and x2 have already been scaled to have standard deviation 1? In 
that case, x1*x2 won't be.")
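
A quick simulated illustration of that last point (toy data; the .75 
correlation mirrors what Nick reported for these variables): the product of 
two standardized variables generally does not itself have SD 1.

library(MASS)
set.seed(42)
X  <- mvrnorm(1e5, c(0, 0), matrix(c(1, .75, .75, 1), 2))
x1 <- c(scale(X[, 1]))
x2 <- c(scale(X[, 2]))
sd(x1 * x2)   ## about 1.25 here, i.e. roughly sqrt(1 + rho^2), not 1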

Best,
Wolfgang


Re: [Rd] lm() gives different results to lm.ridge() and SPSS

2017-05-05 Thread Nick Brown
Hi, 

Here is (I hope) all the relevant output from R. 

> mean(s1$ZDEPRESSION, na.rm=T)
[1] -1.041546e-16
> mean(s1$ZDIVERSITY_PA, na.rm=T)
[1] -9.660583e-16
> mean(s1$ZMEAN_PA, na.rm=T)
[1] -5.430282e-15
> lm.ridge(ZDEPRESSION ~ ZMEAN_PA * ZDIVERSITY_PA, data=s1)$coef
              ZMEAN_PA          ZDIVERSITY_PA ZMEAN_PA:ZDIVERSITY_PA
            -0.3962254             -0.3636026             -0.1425772
## This is what I thought was the problem originally. :-)

> coefficients(lm(ZDEPRESSION ~ ZMEAN_PA * ZDIVERSITY_PA, data=s1))
           (Intercept)               ZMEAN_PA          ZDIVERSITY_PA ZMEAN_PA:ZDIVERSITY_PA
            0.07342198            -0.39650356            -0.36569488            -0.09435788
> coefficients(lm.ridge(ZDEPRESSION ~ ZMEAN_PA * ZDIVERSITY_PA, data=s1))
                              ZMEAN_PA          ZDIVERSITY_PA ZMEAN_PA:ZDIVERSITY_PA
            0.07342198            -0.39650356            -0.36569488            -0.09435788

The equivalent from SPSS is attached. The unstandardized coefficients in SPSS 
look nothing like those in R. The standardized coefficients in SPSS match the 
lm.ridge()$coef numbers very closely indeed, suggesting that the same algorithm 
may be in use. 

I have put the dataset file, which is the untouched original I received from 
the authors, in this Dropbox folder: 
https://www.dropbox.com/sh/xsebjy55ius1ysb/AADwYUyV1bl6-iAw7ACuF1_La?dl=0. You 
can read it into R with this code (one variable needs to be standardized and 
centered; everything else is already in the file): 

s1 <- read.csv("Emodiversity_Study1.csv", stringsAsFactors=FALSE) 
s1$ZDEPRESSION <- scale(s1$DEPRESSION) 

Hey, maybe R is fine and I've stumbled on a bug in SPSS? If so, I'm sure IBM 
will want to fix it quickly (ha ha ha). 

Nick 




Re: [Rd] lm() gives different results to lm.ridge() and SPSS

2017-05-05 Thread peter dalgaard
I asked you before, but in case you missed it: Are you looking at the right 
place in SPSS output?

The UNstandardized coefficients should be comparable to R, i.e. the "B" column, 
not "Beta".

-pd


Re: [Rd] lm() gives different results to lm.ridge() and SPSS

2017-05-04 Thread Nick Brown
Hi Simon, 

Yes, if I use coefficients() I get the same results for lm() and lm.ridge(). 
So that's consistent, at least. 

Interestingly, the "wrong" number I get from lm.ridge()$coef agrees with the 
value from SPSS to 5dp, which is an interesting coincidence if these numbers 
have no particular external meaning in lm.ridge(). 
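
In hindsight the agreement seems explicable rather than coincidental, if the 
explanations elsewhere in this thread are right: lm.ridge()$coef is b 
multiplied by the predictor's SD, SPSS's Beta is b * sd(x)/sd(y), and 
ZDEPRESSION is already standardized, so sd(y) is essentially 1 and the two 
coincide up to the n versus n-1 divisor. A sketch against these data (not run 
here):

fit <- lm(ZDEPRESSION ~ ZMEAN_PA * ZDIVERSITY_PA, data=s1)
coef(fit)[-1] * apply(model.matrix(fit)[, -1], 2, sd) / sd(s1$ZDEPRESSION)
## should track both lm.ridge(...)$coef and the SPSS Beta column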

Kind regards, 
Nick 


Re: [Rd] lm() gives different results to lm.ridge() and SPSS

2017-05-04 Thread Simon Bonner
Hi Nick,

I think that the problem here is your use of $coef to extract the coefficients 
of the ridge regression. The help for lm.ridge states that coef is a "matrix of 
coefficients, one row for each value of lambda. Note that these are not on the 
original scale and are for use by the coef method."

I ran a small test with simulated data, code is copied below, and indeed the 
output from lm.ridge differs depending on whether the coefficients are accessed 
via $coef or via the coefficients() function. The latter does produce results 
that match the output from lm.

I hope that helps.

Cheers,

Simon

## Load packages
library(MASS)

## Set seed (the numeric value was lost in the list archive; any fixed seed works)
set.seed(1)

## Set parameters
n <- 100
beta <- c(1,0,1)
sigma <- .5
rho <- .75

## Simulate correlated covariates
Sigma <- matrix(c(1,rho,rho,1),ncol=2)
X <- mvrnorm(n,c(0,0),Sigma=Sigma)

## Simulate data
mu <- beta[1] + X %*% beta[-1]
y <- rnorm(n,mu,sigma)

## Fit model with lm()
fit1 <- lm(y ~ X)

## Fit model with lm.ridge()
fit2 <- lm.ridge(y ~ X)

## Compare coefficients
cbind(fit1$coefficients,c(NA,fit2$coef),coefficients(fit2))

                   [,1]        [,2]        [,3]
(Intercept)  0.99276001          NA  0.99276001
X1          -0.03980772 -0.04282391 -0.03980772
X2           1.11167179  1.06200476  1.11167179
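
The mapping behind that third column, as I read the MASS source (so treat this 
as a sketch): $coef holds coefficients for predictors scaled to unit 
(population) SD, and the coef() method divides by the SDs that lm.ridge() 
stores in $scales before adding back an intercept:

cbind(fit2$coef, coef(fit2)[-1] * fit2$scales)   ## the two columns agree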

--

Simon Bonner
Assistant Professor of Environmetrics/ Director MMASc 
Department of Statistical and Actuarial Sciences/Department of Biology 
University of Western Ontario

Office: Western Science Centre rm 276

Email: sbonn...@uwo.ca | Telephone: 519-661-2111 x88205 | Fax: 519-661-3813
Twitter: @bonnerstatslab | Website: http://simon.bonners.ca/bonner-lab/wpblog/



Re: [Rd] lm() gives different results to lm.ridge() and SPSS

2017-05-04 Thread peter dalgaard
Um, the link to StackOverflow does not seem to contain the same question. It 
does contain a stern warning not to use the $coef component of lm.ridge...

Is it perhaps the case that x1 and x2 have already been scaled to have standard 
deviation 1? In that case, x1*x2 won't be.

Also notice that SPSS tends to use "Beta" for standardized regression 
coefficients, and (AFAIR) "b" for the regular ones.

-pd


-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com



Re: [Rd] lm() gives different results to lm.ridge() and SPSS

2017-05-04 Thread Duncan Murdoch

On 04/05/2017 10:28 AM, Nick Brown wrote:

Hallo,

I hope I am posting to the right place. I was advised to try this list by Ben 
Bolker (https://twitter.com/bolkerb/status/859909918446497795). I also posted 
this question to StackOverflow 
(http://stackoverflow.com/questions/43771269/lm-gives-different-results-from-lm-ridgelambda-0).
 I am a relative newcomer to R, but I wrote my first program in 1975 and have 
been paid to program in about 15 different languages, so I have some general 
background knowledge.


I have a regression from which I extract the coefficients like this:
lm(y ~ x1 * x2, data=ds)$coef
That gives: x1=0.40, x2=0.37, x1*x2=0.09



When I do the same regression in SPSS, I get:
beta(x1)=0.40, beta(x2)=0.37, beta(x1*x2)=0.14.
So the main effects are in agreement, but there is quite a difference in the 
coefficient for the interaction.


I don't know about this instance, but a common cause of this sort of 
difference is a different parametrization.  If that's the case, then 
predictions in the two systems would match, even if coefficients don't.


Duncan Murdoch
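
A small simulated sketch of that point (hypothetical data): swapping the raw 
product for its separately standardized version changes the coefficients but 
not the column space, so the predictions agree.

set.seed(1)
x1 <- rnorm(100); x2 <- rnorm(100)
y  <- x1 + x2 + x1 * x2 + rnorm(100)
fitA <- lm(y ~ x1 * x2)
fitB <- lm(y ~ x1 + x2 + scale(x1 * x2))
all.equal(fitted(fitA), fitted(fitB))   ## TRUE, though the coefficients differ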




X1 and X2 are correlated about .75 (yes, yes, I know - this model wasn't my 
idea, but it got published), so there is quite possibly something going on with 
collinearity. So I thought I'd try lm.ridge() to see if I can get an idea of 
where the problems are occurring.


The starting point is to run lm.ridge() with lambda=0 (i.e., no ridge penalty) 
and check we get the same results as with lm():
lm.ridge(y ~ x1 * x2, lambda=0, data=ds)$coef
x1=0.40, x2=0.37, x1*x2=0.14
So lm.ridge() agrees with SPSS, but not with lm(). (Of course, lambda=0 is the default, 
so it can be omitted; I can alternate between including or deleting ".ridge" in 
the function call, and watch the coefficient for the interaction change.)



What seems slightly strange to me here is that I assumed that lm.ridge() just piggybacks 
on lm() anyway, so in the specific case where lambda=0 and there is no 
"ridging" to do, I'd expect exactly the same results.


Unfortunately there are 34,000 cases in the dataset, so a "minimal" reprex will 
not be easy to make, but I can share the data via Dropbox or something if that would help.



I appreciate that when there is strong collinearity then all bets are off in 
terms of what the betas mean, but I would really expect lm() and lm.ridge() to 
give the same results. (I would be happy to ignore SPSS, but for the moment 
it's part of the majority!)



Thanks for reading,
Nick

