Re: [R] understanding the output from gls

2009-09-02 Thread Timothy_Handley
Kingsford,

Thanks for the information. As you suggest, if I don't hear from anyone
else about the degrees of freedom issue in a couple days, I'll try r-devel.

Also, while I appreciate your explanation of the correlation matrix
produced by summary.gls, I'm afriad I don't have the statistical background
to fully understand it. If it wouldn't impose too much on your time, could
you suggest a more intuitive way to interpret these numbers, or perhaps a
reference with a more intuitive explanation?

To me, the idea that fixed effects estimates could be correlated just
sounds odd. I understand that the effects themselves could be correlated.
While I have good biological reasons for thinking that my two fixed effects
are independent, the data may prove me wrong. However, the fixed effects
estimates are each just a single number for the data/model as a whole. How
can one pair of two single numbers be correlated?

Tim Handley
Fire Effects Monitor
Santa Monica Mountains National Recreation Area
401 W. Hillcrest Dr.
Thousand Oaks, CA 91360
805-370-2347




   
 Kingsford Jones   
 kingsfordjo...@g 
 mail.com  To 
   timothy_hand...@nps.gov 
 09/01/2009 05:53   cc 
 PMR-help@r-project.org
   Subject 
   Re: [R] understanding the output
   from gls
   
   
   
   
   
   




Hi Tim,

On Tue, Sep 1, 2009 at 2:00 PM, timothy_hand...@nps.gov wrote:

 I'd like to compare two models which were fitted using gls, however I'm
 having trouble interpreting the results of gls. If any of you could offer
 me some advice, I'd greatly appreciate it.

 Short explanation of models: These two models have the same fixed-effects
 structure (two independent, linear effects),

Known to be independent?

 and differ only in that the
 second model includes a corExp structure for spatial autocorrelation.
(more
 detailed explanation of the models at end).

 Specific questions:

 1. The second model estimates two additional parameters in the process of
 fitting the corSpatial object - the range and nugget of the spatial
 autocorrelation. Based on this, I would expect the second model to have
two
 fewer residual degrees of freedom. However, the summary function reports
 that both models have the same number of residual degrees of
freedom.  Why
 is this? (Interestingly, the difference in AIC between the two models
 reflects this difference in the number of model parameters)


That's a good question.  It does seem degrees of freedom would be
spent in estimating the spatial parameters.

The reason the functions using logLik get the number of parameters
right is because logLik.gls includes:

attr(val, df) - p + length(coef(object[[modelStruct]])) + 1

where modelStruct holds the estimated parameters that structure
residual variance and correlation, and I believe p is the number of
columns in the design matrix, but I couldn't easily follow the code
within gls that produces it.

If nobody offers an explanation for the current result from
print.summary.gls you could ask on r-devel if the results are
intended.

 2. In the model summary, what is the meaning of the small correlation
 matrix under the heading Correlation:? At first, I thought that this
was
 describing possible correlations among the predictor variables, but then
I
 saw that it also included the model intercept. What do these correlation
 value mean?

The GLS covariance of estimated fixed effects (including the
intercept) is (X' \hat{\Sigma}^-1 X)^-1, where X is the model matrix
and \Sigma is the response/error covariance matrix.  This will
generally have non-zero off-diagonals, implying the fixed effects
estimates are correlated.  The values you're inquiring about are the
scaled estimates of those off-diagonals.

hth,

Kingsford Jones


 ##More detailed information
 ##function calls:
  sppl.i.xx = gls(all.all.rch~l10area+newx, data = gtemp, method=ML)
  sppl.i.ex = gls(all.all.rch~l10area+newx, data = gtemp, method=ML,
              correlation = corExp(c(20,.8), form=~x+y|area, nugget=TRUE))

 ##model summaries

Re: [R] understanding the output from gls

2009-09-02 Thread Kingsford Jones
On Wed, Sep 2, 2009 at 2:39 PM, timothy_hand...@nps.gov wrote:
 Kingsford,

 Thanks for the information. As you suggest, if I don't hear from anyone
 else about the degrees of freedom issue in a couple days, I'll try r-devel.

Determining denominator degrees of freedom for F tests in mixed models
is an area of current research (and some contention) and it may be
that the same holds true for df in a linear model w/ estimated
parameters structuring the error covariance.  So, these may be some
deep waters that nobody is eager to wade intoor maybe there's a
simple explanation...


 Also, while I appreciate your explanation of the correlation matrix
 produced by summary.gls, I'm afriad I don't have the statistical background
 to fully understand it. If it wouldn't impose too much on your time, could
 you suggest a more intuitive way to interpret these numbers, or perhaps a
 reference with a more intuitive explanation?

 To me, the idea that fixed effects estimates could be correlated just
 sounds odd. I understand that the effects themselves could be correlated.
 While I have good biological reasons for thinking that my two fixed effects
 are independent, the data may prove me wrong. However, the fixed effects
 estimates are each just a single number for the data/model as a whole. How
 can one pair of two single numbers be correlated?



Well the issue is that you're assuming a model.  Here's a simple idea
that may help you intuition:

Visualize a scatterplot and place a point on that plot that marks
(mean x, mean y).  The way the math works out, a regression line fit
to the data must go through that point. Now think about adjusting the
intercept of that line, while keeping it going through the (mean x,
mean y) point -- clearly the slope will change too.  So, x and y are
not independent.

The reason the covariance matrix of the fixed effects coefficients is
usually of interest is for inference.  For example a joint confidence
region for the slope and intercept would be an ellipse with elongation
determined by the covariance.


hth,

Kingsford


 Tim Handley
 Fire Effects Monitor
 Santa Monica Mountains National Recreation Area
 401 W. Hillcrest Dr.
 Thousand Oaks, CA 91360
 805-370-2347





             Kingsford Jones
             kingsfordjo...@g
             mail.com                                                  To
                                       timothy_hand...@nps.gov
             09/01/2009 05:53                                           cc
             PM                        r-h...@r-project.org
                                                                   Subject
                                       Re: [R] understanding the output
                                       from gls










 Hi Tim,

 On Tue, Sep 1, 2009 at 2:00 PM, timothy_hand...@nps.gov wrote:

 I'd like to compare two models which were fitted using gls, however I'm
 having trouble interpreting the results of gls. If any of you could offer
 me some advice, I'd greatly appreciate it.

 Short explanation of models: These two models have the same fixed-effects
 structure (two independent, linear effects),

 Known to be independent?

 and differ only in that the
 second model includes a corExp structure for spatial autocorrelation.
 (more
 detailed explanation of the models at end).

 Specific questions:

 1. The second model estimates two additional parameters in the process of
 fitting the corSpatial object - the range and nugget of the spatial
 autocorrelation. Based on this, I would expect the second model to have
 two
 fewer residual degrees of freedom. However, the summary function reports
 that both models have the same number of residual degrees of
 freedom.  Why
 is this? (Interestingly, the difference in AIC between the two models
 reflects this difference in the number of model parameters)


 That's a good question.  It does seem degrees of freedom would be
 spent in estimating the spatial parameters.

 The reason the functions using logLik get the number of parameters
 right is because logLik.gls includes:

 attr(val, df) - p + length(coef(object[[modelStruct]])) + 1

 where modelStruct holds the estimated parameters that structure
 residual variance and correlation, and I believe p is the number of
 columns in the design matrix, but I couldn't easily follow the code
 within gls that produces it.

 If nobody offers an explanation for the current result from
 print.summary.gls you could ask on r-devel if the results are
 intended.

 2. In the model summary, what is the meaning of the small correlation
 matrix under the heading Correlation:? At first, I thought that this
 was
 describing possible correlations among the predictor variables, but then
 I
 saw that it also included the model intercept. What do these correlation
 value mean?

 The GLS covariance of estimated fixed effects (including the
 intercept) is (X' \hat{\Sigma}^-1 X)^-1, where X is the model matrix
 and \Sigma is the response/error

Re: [R] understanding the output from gls

2009-09-01 Thread Kingsford Jones
Hi Tim,

On Tue, Sep 1, 2009 at 2:00 PM, timothy_hand...@nps.gov wrote:

 I'd like to compare two models which were fitted using gls, however I'm
 having trouble interpreting the results of gls. If any of you could offer
 me some advice, I'd greatly appreciate it.

 Short explanation of models: These two models have the same fixed-effects
 structure (two independent, linear effects),

Known to be independent?

 and differ only in that the
 second model includes a corExp structure for spatial autocorrelation. (more
 detailed explanation of the models at end).

 Specific questions:

 1. The second model estimates two additional parameters in the process of
 fitting the corSpatial object - the range and nugget of the spatial
 autocorrelation. Based on this, I would expect the second model to have two
 fewer residual degrees of freedom. However, the summary function reports
 that both models have the same number of residual degrees of freedom.  Why
 is this? (Interestingly, the difference in AIC between the two models
 reflects this difference in the number of model parameters)


That's a good question.  It does seem degrees of freedom would be
spent in estimating the spatial parameters.

The reason the functions using logLik get the number of parameters
right is because logLik.gls includes:

attr(val, df) - p + length(coef(object[[modelStruct]])) + 1

where modelStruct holds the estimated parameters that structure
residual variance and correlation, and I believe p is the number of
columns in the design matrix, but I couldn't easily follow the code
within gls that produces it.

If nobody offers an explanation for the current result from
print.summary.gls you could ask on r-devel if the results are
intended.

 2. In the model summary, what is the meaning of the small correlation
 matrix under the heading Correlation:? At first, I thought that this was
 describing possible correlations among the predictor variables, but then I
 saw that it also included the model intercept. What do these correlation
 value mean?

The GLS covariance of estimated fixed effects (including the
intercept) is (X' \hat{\Sigma}^-1 X)^-1, where X is the model matrix
and \Sigma is the response/error covariance matrix.  This will
generally have non-zero off-diagonals, implying the fixed effects
estimates are correlated.  The values you're inquiring about are the
scaled estimates of those off-diagonals.

hth,

Kingsford Jones


 ##More detailed information
 ##function calls:
  sppl.i.xx = gls(all.all.rch~l10area+newx, data = gtemp, method=ML)
  sppl.i.ex = gls(all.all.rch~l10area+newx, data = gtemp, method=ML,
              correlation = corExp(c(20,.8), form=~x+y|area, nugget=TRUE))

 ##model summaries

 summary(sppl.i.xx)
 Generalized least squares fit by maximum likelihood
  Model: all.all.rch ~ l10area + newx
  Data: gtemp
       AIC     BIC    logLik
  567.4893 578.572 -279.7446

 Coefficients:
               Value Std.Error   t-value p-value
 (Intercept) 6.891867 0.3295097 20.915522   0e+00
 l10area     6.586182 0.3048870 21.602046   0e+00
 newx        0.047901 0.0117281  4.084307   1e-04

  Correlation:
        (Intr) l10are
 l10area -0.364
 newx     0.577 -0.007

 Standardized residuals:
        Min          Q1         Med          Q3         Max
 -3.34307266 -0.57949890 -0.07214605  0.64309760  2.66409931

 Residual standard error: 2.590313
 Degrees of freedom: 118 total; 115 residual

 summary(sppl.i.ex)
 Generalized least squares fit by maximum likelihood
  Model: all.all.rch ~ l10area + newx
  Data: gtemp
      AIC      BIC    logLik
  559.167 575.7911 -273.5835

 Correlation Structure: Exponential spatial correlation
  Formula: ~x + y | area
  Parameter estimate(s):
     range     nugget
 15.4448835  0.3741476

 Coefficients:
               Value Std.Error   t-value p-value
 (Intercept) 7.621306 0.7648135  9.964921  0.
 l10area     6.400442 0.5588160 11.453576  0.
 newx        0.066535 0.0204417  3.254857  0.0015

  Correlation:
        (Intr) l10are
 l10area -0.592
 newx     0.358  0.014

 Standardized residuals:
       Min         Q1        Med         Q3        Max
 -3.0035983 -0.5990432 -0.2226852  0.5113270  2.263

 Residual standard error: 2.820337
 Degrees of freedom: 118 total; 115 residual




 Tim Handley
 Fire Effects Monitor
 Santa Monica Mountains National Recreation Area
 401 W. Hillcrest Dr.
 Thousand Oaks, CA 91360
 805-370-2347

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.