Re: [R] Homogeneity of regression slopes
That's good insight, and it gives me some good ideas for what direction to take this. Thanks, everyone!

Doug

P.S. I guess if you have a significant interaction, that implies the slopes of the individual regression lines are significantly different anyway, doesn't it?

On Tue, Sep 14, 2010 at 11:33 AM, Thomas Stewart tgstew...@gmail.com wrote:
[quoted text snipped]

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] Homogeneity of regression slopes
Hi Thomas,

Thanks for the additional information.

Just wondering, and hoping to learn: would any lack of homogeneity of variance (which is what I believe you mean by differing standard deviation estimates) be found when performing standard regression diagnostics, such as residual plots, Levene's test, or an equivalent? If so, would a WLS routine or some type of variance-stabilizing transformation be useful?

Again, hoping to learn. I'll check out the gls() routine in the nlme package, as you mentioned.

Thanks.

Cliff

On Mon, Sep 13, 2010 at 10:02 PM, Thomas Stewart tgstew...@gmail.com wrote:
[quoted text snipped]
Re: [R] Homogeneity of regression slopes
If you are interested in exploring the homogeneity of variance assumption, I would suggest you model the variance explicitly. Doing so allows you to compare the homogeneous-variance model to the heterogeneous-variance model within a nested-model framework, where you have likelihood ratio tests and the like at your disposal. This is why I suggested the nlme package and the gls function: gls allows you to model the variance.

-tgs

P.S. WLS is a type of GLS.

P.P.S. It isn't clear to me how a variance-stabilizing transformation would help in this case.

On Tue, Sep 14, 2010 at 6:53 AM, Clifford Long gnolff...@gmail.com wrote:
[quoted text snipped]
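Thomas's suggestion can be sketched in a few lines of R. This is a minimal illustration, not his exact code: the data frame `dat` and the variable names `y`, `x`, and `group` are hypothetical stand-ins. The idea is to fit the same mean model twice with gls(), once with a single residual standard deviation and once with a separate one per group via varIdent(), and compare the two nested variance structures with a likelihood ratio test.

```r
library(nlme)

## Hypothetical data: three groups with different slopes and error SDs
set.seed(1)
dat <- data.frame(
  x     = rep(seq(0, 10, length.out = 30), 3),
  group = factor(rep(c("a", "b", "c"), each = 30))
)
dat$y <- with(dat, 2 + c(1, 1.5, 2)[group] * x +
                   rnorm(90, sd = c(1, 1, 3)[group]))

## Homogeneous residual variance: one sigma shared by all groups
m.hom <- gls(y ~ x * group, data = dat)

## Heterogeneous: a separate residual variance per group
m.het <- gls(y ~ x * group, data = dat,
             weights = varIdent(form = ~ 1 | group))

## Likelihood ratio test of the nested variance structures
anova(m.hom, m.het)
```

Because the two fits share the same fixed effects and differ only in the variance structure, the default REML fits are directly comparable here.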
[R] Homogeneity of regression slopes
Hello,

We've got a dataset with several variables, one of which we're using to split the data into 3 smaller subsets (the variable takes 1 of 3 possible values). There are several more variables too, many of which we're using to fit regression models with lm. So I have 3 fitted models (one for each subset, of course), each having slope estimates for the predictor variables.

What we want to find out is whether the overall slopes for the 3 regression lines are significantly different from each other. Is there a way, in R, to calculate the overall slope of each line and test whether there's homogeneity of regression slopes? (Am I using that phrase in the right context: comparing the slopes of more than one regression line, rather than the slopes of the predictors within the same fit?)

I hope that makes sense. We really wanted to see if the predicted values at the ends of the 3 regression lines are significantly different, but I'm not sure how to do the Johnson-Neyman procedure in R, so I think testing for slope differences will suffice!

Thanks to any who may be able to help!

Doug Adams
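Doug's side question about comparing predicted values at the ends of the lines can at least be screened informally in R. The sketch below uses hypothetical names (`y`, `x`, `group`): fit one model across groups and ask predict() for fitted values with confidence intervals at a chosen x. Non-overlapping intervals are only a rough screen, not a formal test; a formal comparison would use contrasts on the coefficients.

```r
## Hypothetical data: three groups, same underlying line
set.seed(42)
dat <- data.frame(
  x     = runif(90, 0, 10),
  group = factor(rep(c("a", "b", "c"), each = 30))
)
dat$y <- 1 + 2 * dat$x + rnorm(90)

## One model with group-specific intercepts and slopes
fit <- lm(y ~ x * group, data = dat)

## Predicted values and 95% confidence intervals at x = 10,
## i.e., at the end of the observed range, one row per group
newd <- data.frame(x = 10,
                   group = factor(levels(dat$group),
                                  levels = levels(dat$group)))
predict(fit, newdata = newd, interval = "confidence")
```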
Re: [R] Homogeneity of regression slopes
Hello Doug,

Perhaps it would be easier to keep your data together and fit a single regression with a term for the grouping variable (a factor with 3 levels). If the groups give identical results, the coefficients for the two non-reference levels of the grouping variable will include 0 in their confidence intervals.

Michael

On 14 September 2010 06:52, Doug Adams f...@gmx.com wrote:
[quoted text snipped]
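Michael's suggestion, sketched in R with hypothetical variable names (`y`, `x`, `group`): fit one model to the pooled data with the grouping factor as an additive term, then check whether the confidence intervals for the non-reference group coefficients include 0.

```r
## Hypothetical data: y modeled on x, three groups behaving identically
set.seed(7)
dat <- data.frame(
  x     = runif(60, 0, 10),
  group = factor(rep(c("a", "b", "c"), each = 20))
)
dat$y <- 3 + 1.2 * dat$x + rnorm(60)

## Single pooled model: group coefficients shift the intercept
fit <- lm(y ~ x + group, data = dat)

## Confidence intervals for the two non-reference group coefficients;
## intervals containing 0 are consistent with identical groups
confint(fit)[c("groupb", "groupc"), ]
```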
Re: [R] Homogeneity of regression slopes
If you'll allow me to throw in two cents...

Like Michael said, the dummy-variable route is the way to go, but I believe the coefficients on the dummy variables themselves test for equal intercepts. For equality of slopes, we need the interaction between the dummy variable and the explanatory variable whose slope (coefficient) is of interest. I'll add some detail below.

For only two groups, we could use a single two-level dummy variable D:

  D = 0 is the reference level (group)
  D = 1 is the other level (group)

Equality of intercepts:

  y = b0 + b1*x + b2*D
  If D = 0, then y = b0 + b1*x
  If D = 1, then y = b0 + b1*x + b2; grouping like terms: y = (b0 + b2) + b1*x

If coefficient b2 = 0, we might fail to reject the null hypothesis that the intercepts are equal; if b2 ≠ 0, we would reject it.

Equality of slopes (we add the interaction between x and D):

  y = b0 + b1*x + b2*D + b3*x*D
  If D = 0, then y = b0 + b1*x
  If D = 1, then y = b0 + b1*x + b2 + b3*x; grouping like terms: y = (b0 + b2) + (b1 + b3)*x

If coefficient b3 = 0, we might fail to reject the null hypothesis that the slopes are equal; if b3 ≠ 0, we would reject it.

For a model with three groups (assuming that lm / glm / etc. wouldn't already do this coding for you), the explicit dummy-variable coding might look like:

           D1  D2
  group 1   0   0   (reference level ... can usually choose)
  group 2   1   0
  group 3   0   1

I believe this is called a sigma-restricted model (??), as opposed to an overparameterized model, where three groups would get three dummy variables. You can probably find this info in most books on basic regression.

This might be overly simplistic, and I'll happily stand corrected if I've made any mistakes. Otherwise, I hope that this helps.

Cliff

On Mon, Sep 13, 2010 at 7:12 PM, Michael Bedward michael.bedw...@gmail.com wrote:
[quoted text snipped]
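Cliff's two-group algebra translates directly into an lm() call. A minimal sketch with hypothetical names (`y`, `x`, and a 0/1 indicator `D`): the t-test on the interaction coefficient b3 is the equality-of-slopes test, and with three or more groups coded as a factor the analogous overall test is an F-test comparing the models with and without the interaction.

```r
## Hypothetical two-group data with genuinely different slopes (b3 = 0.7)
set.seed(3)
n <- 40
D <- rep(0:1, each = n)                 # two-level dummy variable
x <- runif(2 * n, 0, 10)
y <- 1 + 2 * x + 0.5 * D + 0.7 * x * D + rnorm(2 * n)

## y = b0 + b1*x + b2*D + b3*x*D
fit <- lm(y ~ x + D + x:D)

## t-test of b3: equal slopes corresponds to b3 = 0
summary(fit)$coefficients["x:D", ]

## F-test version, which generalizes to 3+ groups coded as a factor
anova(lm(y ~ x + D), fit)
```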
Re: [R] Homogeneity of regression slopes
Thanks for turning my half-baked suggestion into something that would actually work, Cliff :)

Michael

On 14 September 2010 12:27, Clifford Long gnolff...@gmail.com wrote:
[quoted text snipped]
Re: [R] Homogeneity of regression slopes
Allow me to add to Michael's and Clifford's responses.

If you fit the same regression model for each group, then you are also fitting a standard deviation parameter for each model. The solution proposed by Michael and Clifford is a good one, but it assumes that the standard deviation parameter is the same for all three models.

You may want to consider how much the standard deviation estimates differ across the three separate models. If they differ wildly, the method described by Michael and Clifford may not be the best. Rather, you may want to consider gls() in the nlme package to explicitly allow the variance parameters to vary.

-tgs

On Mon, Sep 13, 2010 at 4:52 PM, Doug Adams f...@gmx.com wrote:
[quoted text snipped]