[R] Servreg $loglik

2010-07-20 Thread Charles Annis, P.E.
Dear R-experts:

I am using survreg() to estimate the parameters of a Weibull density having
right-censored observations.  Some observations are weighted.  To do that I
regress the weighed observations against a column of ones.

When I enter the data as 37 weighted observations, the parameter estimates
are exactly the same as when I enter the data as the corresponding 70
unweighted observations.  This is to be expected, of course.

I don't understand, however, why the reported loglikelihood is 
 parameter.estimates$loglik
[1] -120.4699 -120.4699
for the 37 weighted observations, but
 parameter.estimates$loglik
[1] -135.1527 -135.1527
for the 70 unweighted observations.

(For the record, my computations of the loglikelihood, using the dweibull()
function for the observations and the pweibull() function for the censored
observations, is -135.1527 for both 37 weighted and 70 unweighted.)

I am using the data from Meeker and Escobar, _Statistical Methods for
Reliability Data_, Wiley (1998), Table C.1, shown below:

Hours   Status  Num.Parts
  450   Failure 1
  460   R-Censored  1
 1150   Failure 2
 1560   R-Censored  1
 1600   Failure 1
 1660   R-Censored  1
 1850   R-Censored  5
 2030   R-Censored  3
 2070   Failure 2
 2080   Failure 1
 2200   R-Censored  1
 3000   R-Censored  4
 3100   Failure 1
 3200   R-Censored  1
 3450   Failure 1
 3750   R-Censored  2
 4150   R-Censored  4
 4300   R-Censored  4
 4600   Failure 1
 4850   R-Censored  4
 5000   R-Censored  3
 6100   R-Censored  3
 6100   Failure 1
 6300   R-Censored  1
 6450   R-Censored  2
 6700   R-Censored  1
 7450   R-Censored  1
 7800   R-Censored  2
 8100   R-Censored  2
 8200   R-Censored  1
 8500   R-Censored  3
 8750   R-Censored  2
 8750   Failure 1
 9400   R-Censored  1
 9900   R-Censored  1
10100   R-Censored  3
11500   R-Censored  1

I am running R version 2.11.1 (2010-05-31) on a HP Windows 7 box with 8 gig
RAM.


Thank you for your help.

Charles Annis, P.E.

charles.an...@statisticalengineering.com
561-352-9699
http://www.StatisticalEngineering.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Servreg $loglik

2010-07-20 Thread David Winsemius


On Jul 20, 2010, at 11:20 AM, Charles Annis, P.E. wrote:


Dear R-experts:

I am using survreg() to estimate the parameters of a Weibull density  
having
right-censored observations.  Some observations are weighted.  To do  
that I

regress the weighed observations against a column of ones.

When I enter the data as 37 weighted observations, the parameter  
estimates

are exactly the same as when I enter the data as the corresponding 70
unweighted observations.  This is to be expected, of course.

I don't understand, however, why the reported loglikelihood is

parameter.estimates$loglik

[1] -120.4699 -120.4699
for the 37 weighted observations, but

parameter.estimates$loglik

[1] -135.1527 -135.1527
for the 70 unweighted observations.


This has come up on r-help many times before (and probably on other  
lists as well), despite not being an R question at all. It is  
commonplace in modeling grouped data to see likelihoods reported  
differently from the result obtained when modeling ungrouped data  
representations with the same frequencies. The only valid statistical  
process is to compare differences in the likelihoods (or log(L) ),  
since the likelihood (or log(L) ) is only defined up to an arbitrary  
constant. You need to be comparing the result to some sort of null  
model for it to have any meaning. (... or perhaps that is your null  
model and you need to be looking at the impact of adding a covariate  
or two.)


--
David.



(For the record, my computations of the loglikelihood, using the  
dweibull()
function for the observations and the pweibull() function for the  
censored

observations, is -135.1527 for both 37 weighted and 70 unweighted.)

I am using the data from Meeker and Escobar, _Statistical Methods for
Reliability Data_, Wiley (1998), Table C.1, shown below:

Hours   Status  Num.Parts
 450Failure 1
 460R-Censored  1
1150Failure 2
1560R-Censored  1
1600Failure 1
1660R-Censored  1
1850R-Censored  5
2030R-Censored  3
2070Failure 2
2080Failure 1
2200R-Censored  1
3000R-Censored  4
3100Failure 1
3200R-Censored  1
3450Failure 1
3750R-Censored  2
4150R-Censored  4
4300R-Censored  4
4600Failure 1
4850R-Censored  4
5000R-Censored  3
6100R-Censored  3
6100Failure 1
6300R-Censored  1
6450R-Censored  2
6700R-Censored  1
7450R-Censored  1
7800R-Censored  2
8100R-Censored  2
8200R-Censored  1
8500R-Censored  3
8750R-Censored  2
8750Failure 1
9400R-Censored  1
9900R-Censored  1
10100   R-Censored  3
11500   R-Censored  1

I am running R version 2.11.1 (2010-05-31) on a HP Windows 7 box  
with 8 gig

RAM.


Thank you for your help.

Charles Annis, P.E.

charles.an...@statisticalengineering.com
561-352-9699
http://www.StatisticalEngineering.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Servreg $loglik

2010-07-20 Thread Charles Annis, P.E.
David:

Thank you for your comments.  

I do understand that absolute values of likelihoods by themselves aren't
meaningful, and only gain meaning when compared with others computed using
the same model but with differing parameter values (for example). That is
why I compute likelihoods myself for confidence interval construction using
the loglikelihood ratio criterion.  But when my plain-vanilla
max(loglikelihood) didn't agree with that reported by survreg() (except when
the data are unweighted) I was afraid I overlooked something.  I am still
puzzled that the servreg(), using the same model with the same data (37
weighted and the corresponding 70 unweighted) produces different values for
loglik.

Thank you for your help.  It gives me peace of mind.  ~:-)

Charles Annis, P.E.

charles.an...@statisticalengineering.com
561-352-9699
http://www.StatisticalEngineering.com


-Original Message-
From: David Winsemius [mailto:dwinsem...@comcast.net] 
Sent: Tuesday, July 20, 2010 12:27 PM
To: charles.an...@statisticalengineering.com
Cc: r-help@r-project.org
Subject: Re: [R] Servreg $loglik


On Jul 20, 2010, at 11:20 AM, Charles Annis, P.E. wrote:

 Dear R-experts:

 I am using survreg() to estimate the parameters of a Weibull density  
 having
 right-censored observations.  Some observations are weighted.  To do  
 that I
 regress the weighed observations against a column of ones.

 When I enter the data as 37 weighted observations, the parameter  
 estimates
 are exactly the same as when I enter the data as the corresponding 70
 unweighted observations.  This is to be expected, of course.

 I don't understand, however, why the reported loglikelihood is
 parameter.estimates$loglik
 [1] -120.4699 -120.4699
 for the 37 weighted observations, but
 parameter.estimates$loglik
 [1] -135.1527 -135.1527
 for the 70 unweighted observations.

This has come up on r-help many times before (and probably on other  
lists as well), despite not being an R question at all. It is  
commonplace in modeling grouped data to see likelihoods reported  
differently from the result obtained when modeling ungrouped data  
representations with the same frequencies. The only valid statistical  
process is to compare differences in the likelihoods (or log(L) ),  
since the likelihood (or log(L) ) is only defined up to an arbitrary  
constant. You need to be comparing the result to some sort of null  
model for it to have any meaning. (... or perhaps that is your null  
model and you need to be looking at the impact of adding a covariate  
or two.)

-- 
David.


 (For the record, my computations of the loglikelihood, using the  
 dweibull()
 function for the observations and the pweibull() function for the  
 censored
 observations, is -135.1527 for both 37 weighted and 70 unweighted.)

 I am using the data from Meeker and Escobar, _Statistical Methods for
 Reliability Data_, Wiley (1998), Table C.1, shown below:

 Hours Status  Num.Parts
  450  Failure 1
  460  R-Censored  1
 1150  Failure 2
 1560  R-Censored  1
 1600  Failure 1
 1660  R-Censored  1
 1850  R-Censored  5
 2030  R-Censored  3
 2070  Failure 2
 2080  Failure 1
 2200  R-Censored  1
 3000  R-Censored  4
 3100  Failure 1
 3200  R-Censored  1
 3450  Failure 1
 3750  R-Censored  2
 4150  R-Censored  4
 4300  R-Censored  4
 4600  Failure 1
 4850  R-Censored  4
 5000  R-Censored  3
 6100  R-Censored  3
 6100  Failure 1
 6300  R-Censored  1
 6450  R-Censored  2
 6700  R-Censored  1
 7450  R-Censored  1
 7800  R-Censored  2
 8100  R-Censored  2
 8200  R-Censored  1
 8500  R-Censored  3
 8750  R-Censored  2
 8750  Failure 1
 9400  R-Censored  1
 9900  R-Censored  1
 10100 R-Censored  3
 11500 R-Censored  1

 I am running R version 2.11.1 (2010-05-31) on a HP Windows 7 box  
 with 8 gig
 RAM.


 Thank you for your help.

 Charles Annis, P.E.

 charles.an...@statisticalengineering.com
 561-352-9699
 http://www.StatisticalEngineering.com

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.