Thanks for these perfectly consistent replies - I didn't understand the purpose 
of m = sum(w * f/sum(w)) and saw it merely as a weighted average of the fitted 
values.

My ultimate concern is how to compute an appropriate weighted TSS (or 
equivalently, MSS) for PRESS-R^2 = 1 - PRESS/TSS = 1 - PRESS/(MSS + PRESS). Do
you think it then makes sense to substitute the vector of leave-one-out fitted 
values for f here?

m <- sum(w * f/sum(w))
mss <-  sum(w * (f - m)^2)
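
A minimal sketch of that substitution on simulated data. The variable names, the simulated weights, and the explicit refitting loop are illustrative assumptions, not code from summary.lm; PRESS is taken here to be the weighted sum of squared leave-one-out prediction errors.

```r
set.seed(1)
n <- 30
x <- runif(n)
w <- runif(n, 0.5, 2)                         # hypothetical positive weights
y <- 1 + 2 * x + rnorm(n, sd = 1 / sqrt(w))   # variance proportional to 1/w
dat <- data.frame(x = x, y = y, w = w)

fit <- lm(y ~ x, data = dat, weights = w)

# Leave-one-out fitted values: refit without case i, then predict case i
f_loo <- vapply(seq_len(n), function(i) {
  fit_i <- lm(y ~ x, data = dat[-i, ], weights = w)
  unname(predict(fit_i, newdata = dat[i, ]))
}, numeric(1))

press <- sum(w * (y - f_loo)^2)               # weighted PRESS

# Option A: MSS built from the leave-one-out fitted values
m_loo   <- sum(w * f_loo) / sum(w)
mss_loo <- sum(w * (f_loo - m_loo)^2)
press_r2_a <- 1 - press / (mss_loo + press)

# Option B: weighted TSS built from the observations themselves
m_y <- sum(w * y) / sum(w)
tss <- sum(w * (y - m_y)^2)
press_r2_b <- 1 - press / tss

c(optionA = press_r2_a, optionB = press_r2_b, R2 = summary(fit)$r.squared)
```

The identity TSS = MSS + RSS relies on in-sample fitted values, so the two denominators above need not coincide once leave-one-out values are substituted; comparing them on real data may help decide which is appropriate.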

Murray
________________________________________
From: peter dalgaard <[email protected]>
Sent: Friday, 8 April 2016 11:28 p.m.
To: Duncan Murdoch
Cc: Murray Efford; [email protected]
Subject: Re: [R] R.squared in summary.lm with weights

On 08 Apr 2016, at 12:57 , Duncan Murdoch <[email protected]> wrote:

> On 07/04/2016 5:21 PM, Murray Efford wrote:
>> Following some old advice on this list, I have been reading the code for 
>> summary.lm to understand the computation of R-squared from a weighted 
>> regression. Usually weights in lm are applied to squared residuals, but I 
>> see that the weighted mean of the observations is calculated as if the 
>> weights are on the original scale:
>>
>> [...]
>>     f <- z$fitted.values
>>     w <- z$weights
>> [...]
>>             m <- sum(w * f/sum(w))
>>             [mss <-]  sum(w * (f - m)^2)
>> [...]
>>
>> This seems inconsistent to me. What am I missing?
>
> I think you are expecting consistency where there needn't be any.  Why do you 
> see an inconsistency here?  Those are different calculations. You get 
> expressions like these if you assume observations have variance sigma^2/w, 
> and you're trying to estimate sigma^2.
>


It's also perfectly consistent that m is the minimizer of mss:

d/dm sum(w*(f-m)^2) = -2 sum(w*(f-m)) = 0 => m = sum(w*f) / sum(w)
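
A quick numerical check of this, as a sketch only: f and w below are simulated stand-ins for the fitted values and weights, and wss is a helper name introduced for illustration.

```r
set.seed(42)
f <- rnorm(20, mean = 5)      # simulated stand-in for fitted values
w <- runif(20, 0.5, 3)        # simulated stand-in for positive weights

# Weighted sum of squares about a candidate centre m
wss <- function(m) sum(w * (f - m)^2)

m_closed  <- sum(w * f) / sum(w)                                  # closed form
m_numeric <- optimize(wss, interval = range(f), tol = 1e-8)$minimum

all.equal(m_closed, m_numeric, tolerance = 1e-6)   # numerical minimizer agrees
all.equal(m_closed, weighted.mean(f, w))           # same as weighted.mean()
```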

However, beware the distinction between inverse variance weights, replication 
weights, and sampling weights.
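
One way to see why the distinction matters, sketched on made-up data (the counts k are hypothetical): passing replication counts as weights gives the same coefficients and R-squared as physically repeating the rows, but lm() treats the weights as inverse variance weights, so the residual degrees of freedom and the estimate of sigma differ.

```r
set.seed(7)
d <- data.frame(x = 1:8)
d$y <- 2 + 0.5 * d$x + rnorm(8)
d$k <- c(1, 3, 2, 1, 4, 2, 1, 3)     # hypothetical replication counts

# Counts passed as weights
fit_w <- lm(y ~ x, data = d, weights = k)

# Rows physically repeated k times
d_rep   <- d[rep(seq_len(nrow(d)), d$k), ]
fit_rep <- lm(y ~ x, data = d_rep)

all.equal(coef(fit_w), coef(fit_rep))                 # same point estimates
all.equal(summary(fit_w)$r.squared,
          summary(fit_rep)$r.squared)                 # same R-squared
c(df.residual(fit_w), df.residual(fit_rep))           # 6 versus 15
c(summary(fit_w)$sigma, summary(fit_rep)$sigma)       # different sigma estimates
```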


> Duncan Murdoch
>
> ______________________________________________
> [email protected] mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: [email protected]  Priv: [email protected]


