[R] Servreg $loglik
Dear R-experts: I am using survreg() to estimate the parameters of a Weibull density having right-censored observations. Some observations are weighted. To do that I regress the weighed observations against a column of ones. When I enter the data as 37 weighted observations, the parameter estimates are exactly the same as when I enter the data as the corresponding 70 unweighted observations. This is to be expected, of course. I don't understand, however, why the reported loglikelihood is parameter.estimates$loglik [1] -120.4699 -120.4699 for the 37 weighted observations, but parameter.estimates$loglik [1] -135.1527 -135.1527 for the 70 unweighted observations. (For the record, my computations of the loglikelihood, using the dweibull() function for the observations and the pweibull() function for the censored observations, is -135.1527 for both 37 weighted and 70 unweighted.) I am using the data from Meeker and Escobar, _Statistical Methods for Reliability Data_, Wiley (1998), Table C.1, shown below: Hours Status Num.Parts 450 Failure 1 460 R-Censored 1 1150 Failure 2 1560 R-Censored 1 1600 Failure 1 1660 R-Censored 1 1850 R-Censored 5 2030 R-Censored 3 2070 Failure 2 2080 Failure 1 2200 R-Censored 1 3000 R-Censored 4 3100 Failure 1 3200 R-Censored 1 3450 Failure 1 3750 R-Censored 2 4150 R-Censored 4 4300 R-Censored 4 4600 Failure 1 4850 R-Censored 4 5000 R-Censored 3 6100 R-Censored 3 6100 Failure 1 6300 R-Censored 1 6450 R-Censored 2 6700 R-Censored 1 7450 R-Censored 1 7800 R-Censored 2 8100 R-Censored 2 8200 R-Censored 1 8500 R-Censored 3 8750 R-Censored 2 8750 Failure 1 9400 R-Censored 1 9900 R-Censored 1 10100 R-Censored 3 11500 R-Censored 1 I am running R version 2.11.1 (2010-05-31) on a HP Windows 7 box with 8 gig RAM. Thank you for your help. Charles Annis, P.E. charles.an...@statisticalengineering.com 561-352-9699 http://www.StatisticalEngineering.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Servreg $loglik
On Jul 20, 2010, at 11:20 AM, Charles Annis, P.E. wrote: Dear R-experts: I am using survreg() to estimate the parameters of a Weibull density having right-censored observations. Some observations are weighted. To do that I regress the weighed observations against a column of ones. When I enter the data as 37 weighted observations, the parameter estimates are exactly the same as when I enter the data as the corresponding 70 unweighted observations. This is to be expected, of course. I don't understand, however, why the reported loglikelihood is parameter.estimates$loglik [1] -120.4699 -120.4699 for the 37 weighted observations, but parameter.estimates$loglik [1] -135.1527 -135.1527 for the 70 unweighted observations. This has come up on r-help many times before (and probably on other lists as well), despite not being an R question at all. It is commonplace in modeling grouped data to see likelihoods reported differently from the result obtained when modeling ungrouped data representations with the same frequencies. The only valid statistical process is to compare differences in the likelihoods (or log(L) ), since the likelihood (or log(L) ) is only defined up to an arbitrary constant. You need to be comparing the result to some sort of null model for it to have any meaning. (... or perhaps that is your null model and you need to be looking at the impact of adding a covariate or two.) -- David. (For the record, my computations of the loglikelihood, using the dweibull() function for the observations and the pweibull() function for the censored observations, is -135.1527 for both 37 weighted and 70 unweighted.) I am using the data from Meeker and Escobar, _Statistical Methods for Reliability Data_, Wiley (1998), Table C.1, shown below: Hours Status Num.Parts 450Failure 1 460R-Censored 1 1150Failure 2 1560R-Censored 1 1600Failure 1 1660R-Censored 1 1850R-Censored 5 2030R-Censored 3 2070Failure 2 2080Failure 1 2200R-Censored 1 3000R-Censored 4 3100Failure 1 3200R-Censored 1 3450Failure 1 3750R-Censored 2 4150R-Censored 4 4300R-Censored 4 4600Failure 1 4850R-Censored 4 5000R-Censored 3 6100R-Censored 3 6100Failure 1 6300R-Censored 1 6450R-Censored 2 6700R-Censored 1 7450R-Censored 1 7800R-Censored 2 8100R-Censored 2 8200R-Censored 1 8500R-Censored 3 8750R-Censored 2 8750Failure 1 9400R-Censored 1 9900R-Censored 1 10100 R-Censored 3 11500 R-Censored 1 I am running R version 2.11.1 (2010-05-31) on a HP Windows 7 box with 8 gig RAM. Thank you for your help. Charles Annis, P.E. charles.an...@statisticalengineering.com 561-352-9699 http://www.StatisticalEngineering.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Servreg $loglik
David: Thank you for your comments. I do understand that absolute values of likelihoods by themselves aren't meaningful, and only gain meaning when compared with others computed using the same model but with differing parameter values (for example). That is why I compute likelihoods myself for confidence interval construction using the loglikelihood ratio criterion. But when my plain-vanilla max(loglikelihood) didn't agree with that reported by survreg() (except when the data are unweighted) I was afraid I overlooked something. I am still puzzled that the servreg(), using the same model with the same data (37 weighted and the corresponding 70 unweighted) produces different values for loglik. Thank you for your help. It gives me peace of mind. ~:-) Charles Annis, P.E. charles.an...@statisticalengineering.com 561-352-9699 http://www.StatisticalEngineering.com -Original Message- From: David Winsemius [mailto:dwinsem...@comcast.net] Sent: Tuesday, July 20, 2010 12:27 PM To: charles.an...@statisticalengineering.com Cc: r-help@r-project.org Subject: Re: [R] Servreg $loglik On Jul 20, 2010, at 11:20 AM, Charles Annis, P.E. wrote: Dear R-experts: I am using survreg() to estimate the parameters of a Weibull density having right-censored observations. Some observations are weighted. To do that I regress the weighed observations against a column of ones. When I enter the data as 37 weighted observations, the parameter estimates are exactly the same as when I enter the data as the corresponding 70 unweighted observations. This is to be expected, of course. I don't understand, however, why the reported loglikelihood is parameter.estimates$loglik [1] -120.4699 -120.4699 for the 37 weighted observations, but parameter.estimates$loglik [1] -135.1527 -135.1527 for the 70 unweighted observations. This has come up on r-help many times before (and probably on other lists as well), despite not being an R question at all. It is commonplace in modeling grouped data to see likelihoods reported differently from the result obtained when modeling ungrouped data representations with the same frequencies. The only valid statistical process is to compare differences in the likelihoods (or log(L) ), since the likelihood (or log(L) ) is only defined up to an arbitrary constant. You need to be comparing the result to some sort of null model for it to have any meaning. (... or perhaps that is your null model and you need to be looking at the impact of adding a covariate or two.) -- David. (For the record, my computations of the loglikelihood, using the dweibull() function for the observations and the pweibull() function for the censored observations, is -135.1527 for both 37 weighted and 70 unweighted.) I am using the data from Meeker and Escobar, _Statistical Methods for Reliability Data_, Wiley (1998), Table C.1, shown below: Hours Status Num.Parts 450 Failure 1 460 R-Censored 1 1150 Failure 2 1560 R-Censored 1 1600 Failure 1 1660 R-Censored 1 1850 R-Censored 5 2030 R-Censored 3 2070 Failure 2 2080 Failure 1 2200 R-Censored 1 3000 R-Censored 4 3100 Failure 1 3200 R-Censored 1 3450 Failure 1 3750 R-Censored 2 4150 R-Censored 4 4300 R-Censored 4 4600 Failure 1 4850 R-Censored 4 5000 R-Censored 3 6100 R-Censored 3 6100 Failure 1 6300 R-Censored 1 6450 R-Censored 2 6700 R-Censored 1 7450 R-Censored 1 7800 R-Censored 2 8100 R-Censored 2 8200 R-Censored 1 8500 R-Censored 3 8750 R-Censored 2 8750 Failure 1 9400 R-Censored 1 9900 R-Censored 1 10100 R-Censored 3 11500 R-Censored 1 I am running R version 2.11.1 (2010-05-31) on a HP Windows 7 box with 8 gig RAM. Thank you for your help. Charles Annis, P.E. charles.an...@statisticalengineering.com 561-352-9699 http://www.StatisticalEngineering.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.