Re: [R] Homogeneity of regression slopes

2010-09-15 Thread Doug Adams
That's good insight, and gives me some good ideas for what direction
to take this.  Thanks, everyone!

Doug

P.S. - I guess if you have a significant interaction, that implies the
slopes of the individual regression lines are significantly different
anyway, doesn't it...




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Homogeneity of regression slopes

2010-09-14 Thread Clifford Long
Hi Thomas,

Thanks for the additional information.

Just wondering, and hoping to learn ... would any lack of homogeneity of
variance (which is what I believe you mean by different stddev estimates) be
found when performing standard regression diagnostics, such as residual
plots, Levene's test (or equivalent), etc.?  If so, then would a WLS routine
or some type of variance stabilizing transformation be useful?

Again, hoping to learn.  I'll check out the gls() routine in the nlme
package, as you mentioned.

Thanks.

Cliff


On Mon, Sep 13, 2010 at 10:02 PM, Thomas Stewart tgstew...@gmail.comwrote:

 Allow me to add to Michael's and Clifford's responses.

 If you fit the same regression model for each group, then you are also
 fitting a standard deviation parameter for each model.  The solution
 proposed by Michael and Clifford is a good one, but the solution assumes
 that the standard deviation parameter is the same for all three models.

 You may want to consider the degree by which the standard deviation
 estimates differ for the three separate models.  If they differ wildly, the
 method described by Michael and Clifford may not be the best.  Rather, you
 may want to consider gls() in the nlme package to explicitly allow the
 variance parameters to vary.

 -tgs

 On Mon, Sep 13, 2010 at 4:52 PM, Doug Adams f...@gmx.com wrote:

  Hello,
 
  We've got a dataset with several variables, one of which we're using
  to split the data into 3 smaller subsets.  (as the variable takes 1 of
  3 possible values).
 
  There are several more variables too, many of which we're using to fit
  regression models using lm.  So I have 3 models fitted (one for each
  subset of course), each having slope estimates for the predictor
  variables.
 
  What we want to find out, though, is whether or not the overall slopes
  for the 3 regression lines are significantly different from each
  other.  Is there a way, in R, to calculate the overall slope of each
  line, and test whether there's homogeneity of regression slopes?  (Am
  I using that phrase in the right context -- comparing the slopes of
  more than one regression line rather than the slopes of the predictors
  within the same fit.)
 
  I hope that makes sense.  We really wanted to see if the predicted
  values at the ends of the 3 regression lines are significantly
  different... But I'm not sure how to do the Johnson-Neyman procedure
  in R, so I think testing for slope differences will suffice!
 
  Thanks to any who may be able to help!
 
  Doug Adams
 
  __
  R-help@r-project.org mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide
  http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.
 

 [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Homogeneity of regression slopes

2010-09-14 Thread Thomas Stewart
If you are interested in exploring the homogeneity of variance assumption,
I would suggest you model the variance explicitly.  Doing so allows you to
compare the homogeneous variance model to the heterogeneous variance model
within a nested model framework.  In that framework, you'll have likelihood
ratio tests, etc.

This is why I suggested the nlme package and the gls function.  The gls
function allows you to model the variance.
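
As a sketch of that nested comparison (untested; `dat`, `y`, `x`, and `g`
are placeholder names for the data frame, response, predictor, and
3-level grouping factor):

```r
library(nlme)

## Homogeneous residual variance: one SD for all three groups.
m.hom <- gls(y ~ x * g, data = dat)

## Heterogeneous residual variance: a separate SD per level of g.
m.het <- gls(y ~ x * g, data = dat,
             weights = varIdent(form = ~ 1 | g))

## Same fixed effects, nested variance structures, so a
## likelihood-ratio test applies.
anova(m.hom, m.het)
```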

-tgs

P.S. WLS is a type of GLS.
P.P.S. It isn't clear to me how a variance stabilizing transformation would
help in this case.



[R] Homogeneity of regression slopes

2010-09-13 Thread Doug Adams
Hello,

We've got a dataset with several variables, one of which we're using
to split the data into 3 smaller subsets (as the variable takes 1 of
3 possible values).

There are several more variables too, many of which we're using to fit
regression models using lm.  So I have 3 models fitted (one for each
subset of course), each having slope estimates for the predictor
variables.

What we want to find out, though, is whether or not the overall slopes
for the 3 regression lines are significantly different from each
other.  Is there a way, in R, to calculate the overall slope of each
line and test whether there's homogeneity of regression slopes?  (Am
I using that phrase in the right context -- comparing the slopes of
more than one regression line rather than the slopes of the predictors
within the same fit?)

I hope that makes sense.  We really wanted to see if the predicted
values at the ends of the 3 regression lines are significantly
different... But I'm not sure how to do the Johnson-Neyman procedure
in R, so I think testing for slope differences will suffice!

Thanks to any who may be able to help!

Doug Adams



Re: [R] Homogeneity of regression slopes

2010-09-13 Thread Michael Bedward
Hello Doug,

Perhaps it would just be easier to keep your data together and have a
single regression with a term for the grouping variable (a factor with
3 levels). If the groups give identical results, the coefficients for
the two non-reference grouping-variable levels will include 0 in their
confidence intervals.
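
A minimal sketch of that single-model approach (untested; `dat`, `y`,
`x`, and `g` are placeholder names):

```r
dat$g <- factor(dat$g)          # the 3-level grouping variable

## One fit: g contributes intercept shifts and x:g contributes slope
## shifts for the two non-reference levels.
fit <- lm(y ~ x * g, data = dat)
summary(fit)
confint(fit)                    # do the group terms' intervals include 0?
```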

Michael




Re: [R] Homogeneity of regression slopes

2010-09-13 Thread Clifford Long
If you'll allow me to throw in two cents ...

Like Michael said, the dummy variable route is the way to go, but I believe
that the coefficients on the dummy variables test for equal intercepts.  For
equality of slopes, do we need the interaction between the dummy variable
and the explanatory variable whose slope (coefficient) is of interest?  I'll
add some detail below.


For only two groups, we could use a single 2-level dummy variable D:
D = 0 is the reference level (group)
D = 1 is the other level (group)


Equality of intercepts

y = b0 + b1*x + b2*D

If D = 0, then y = b0 + b1*x
If D = 1, then y = b0 + b1*x + b2 ... grouping like terms: y = (b0 + b2) + b1*x

If coefficient b2 = 0, then we might fail to reject the null hypothesis that
the intercepts are equal
If coefficient b2 ≠ 0, then we would reject the null hypothesis that the
intercepts are equal


Equality of slopes model

 y = b0 + b1*x + b2*D + b3*x*D

(we added the interaction between x and D)


If D = 0, then y = b0 + b1*x
If D = 1, then y = b0 + b1*x + b2 + b3*x ... grouping like terms: y = (b0 + b2) + (b1 + b3)*x

If coefficient b3 = 0, then we might fail to reject the null hypothesis that
the slopes are equal
If coefficient b3 ≠ 0, then we would reject the null hypothesis that
the slopes are equal


For a model with three groups, assuming that lm / glm / etc. would really do
this for you, the explicit dummy variable coding might look like:

          D1   D2
group 1    0    0    (reference level ... can usually choose)
group 2    1    0
group 3    0    1

I believe that this is called a sigma-restricted model (??), as opposed to
an overparameterized model where three groups would have three dummy
variables.
You can probably find this info in most books on basic regression.  This
might be overly simplistic, and I'll happily stand corrected if I've made
any mistakes.

Otherwise, I hope that this helps.
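
In R, lm() builds these dummy and interaction columns automatically from
a factor, so the b3-style test generalizes to three groups as a
nested-model F-test. A sketch (untested; `dat`, `y`, `x`, and `g` are
placeholder names):

```r
## Common slope, separate intercepts (b2-style terms only).
fit0 <- lm(y ~ x + g, data = dat)

## Separate slopes as well (adds the x:g interaction, i.e. the
## b3-style terms for each non-reference level).
fit1 <- lm(y ~ x * g, data = dat)

## Joint F-test that all interaction coefficients are zero,
## i.e. that the three slopes are equal.
anova(fit0, fit1)
```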

Cliff






Re: [R] Homogeneity of regression slopes

2010-09-13 Thread Michael Bedward
Thanks for turning my half-baked suggestion into something that would
actually work, Cliff :)

Michael



Re: [R] Homogeneity of regression slopes

2010-09-13 Thread Thomas Stewart
Allow me to add to Michael's and Clifford's responses.

If you fit the same regression model for each group, then you are also
fitting a standard deviation parameter for each model.  The solution
proposed by Michael and Clifford is a good one, but the solution assumes
that the standard deviation parameter is the same for all three models.

You may want to consider the degree to which the standard deviation
estimates differ for the three separate models.  If they differ wildly, the
method described by Michael and Clifford may not be the best.  Rather, you
may want to consider gls() in the nlme package to explicitly allow the
variance parameters to vary.
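
One way to eyeball how much those standard deviation estimates differ
(a sketch, untested; `dat`, `y`, `x`, and `g` are placeholder names):

```r
## Fit the same model within each group and compare residual SDs.
fits <- lapply(split(dat, dat$g), function(d) lm(y ~ x, data = d))
sapply(fits, function(f) summary(f)$sigma)
```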

-tgs
