Re: [R] Fitting weibull, exponential and lognormal distributions to left-truncated data.
On Wed, 8 Oct 2008, Gough Lauren wrote:

> Hi,
>
> Thank you very much for your reply. This seems to be working OK when
> fitting weibull and lognormal distributions. However, fitdistr now
> requires me to include start values:

As documented.

> > ltwei <- function(x, shape, scale, log = FALSE) {
> +   dweibull(x, shape, scale, log)/pweibull(1, shape, scale, lower=FALSE)
> + }
> > ltweifit <- fitdistr(x, ltwei)   # x is observed data
> Error in fitdistr(x, ltwei) : 'start' must be a named list
> > ltweifit <- fitdistr(x, ltwei, start = list(shape = 0.5, scale = 0.5))
> There were 34 warnings (use warnings() to see them)
> > ltweifit
>       shape          scale
>    1.11108278    13.00703630
>  ( 0.01936651)  ( 0.42897340)
>
> Is there anyway I can fit to truncated data without having to name start
> values? Alternatively, is there any recommended technique for choosing
> sensible start values?

Not really, depends how heavy the truncation is.

> Further, when I try to fit an exponential distribution I get an error
> message:

But a truncated exponential is just a shifted exponential and has one
parameter -- you gave it two! Just fit an exponential to x-1.

> > ltexp <- function(x, rate, log = FALSE) {
> +   dexp(x, rate, log)/pexp(1, rate, lower=FALSE)
> + }
> > ltexpfit <- fitdistr(x, ltexp)
> Error in fitdistr(x, ltexp) : 'start' must be a named list
> > ltexpfit <- fitdistr(x, ltexp, start = list(0.1))
> Warning message:
> In optim(x = c(2.541609, 1.436143, 4.600524, 6.437174, 2.84974,  :
>   one-diml optimization by Nelder-Mead is unreliable: use optimize
> > ltexpfit
> Error in dn[[2]] : subscript out of bounds
>
> This error message seems to occur regardless of the start value used. Do
> you know why this is?
>
> Sorry to pester you again, and apologies if I am asking silly questions -
> my knowledge of R and probability distributions (except the normal!) are
> rather limited!
>
> Best wishes
> Lauren

--
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
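
For readers who want to try the two suggestions in this reply, here is a
minimal sketch on simulated data. The data, the helper name dltweibull and
its trunc argument are illustrative assumptions, not part of the thread. The
helper differs from the quoted ltwei in one respect: when log = TRUE it
subtracts the log of P(X > 1) instead of dividing by P(X > 1), because
dividing a log-density by that probability does not give the log of the
truncated density. Taking start values from the fit that ignores truncation
is only one plausible heuristic and may work poorly when truncation is heavy.

```r
library(MASS)

# Hypothetical helper: Weibull density left-truncated at `trunc`.
# On the log scale the normalising constant is subtracted, not divided.
dltweibull <- function(x, shape, scale, trunc = 1, log = FALSE) {
  logdens <- dweibull(x, shape, scale, log = TRUE) -
    pweibull(trunc, shape, scale, lower.tail = FALSE, log.p = TRUE)
  if (log) logdens else exp(logdens)
}

# Simulated stand-in for the observed data: Weibull draws, with values
# at or below 1 never recorded (left truncation at 1).
set.seed(1)
y <- rweibull(5000, shape = 1.2, scale = 10)
x <- y[y > 1]

# Heuristic start values (an assumption): the fit that ignores truncation.
naive <- fitdistr(x, "weibull")
wei_fit <- fitdistr(x, dltweibull, start = as.list(naive$estimate))
print(wei_fit)

# Truncated exponential: x - 1 is again exponential with the same rate,
# so no custom density and no start values are needed. (The simulated x
# is Weibull, so this line only demonstrates the call.)
exp_fit <- fitdistr(x - 1, "exponential")
print(exp_fit)
```

Note that the start list carries names (shape, scale) matching the arguments
of the density function, as the 'start' must be a named list message asks.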
Re: [R] Fitting weibull, exponential and lognormal distributions to left-truncated data.
The package 'eha' fits these distributions (and more) with general left
truncation and right censoring, and also regression models a la survreg.
Look at 'phreg' for parametric proportional hazards models and 'aftreg' for
accelerated failure time models. In your case of no covariates, the two
functions give (of course) identical results.

Hth,

Göran

--
Göran Broström
[R] Fitting weibull, exponential and lognormal distributions to left-truncated data.
Dear All,

I have two questions regarding distribution fitting.

I have several datasets, all left-truncated at x=1, that I am attempting to
fit distributions to (lognormal, weibull and exponential). I had been using
fitdistr in the MASS package as follows:

> fitdistr(x, "weibull")

However, this does not take into consideration the truncation at x=1. I
read another posting in this forum that suggested using the argument
"lower" to truncate the distribution fitting. However, this does not seem
to be working. For example, when I attempt to fit a weibull distribution
truncated at x=1 using "lower", it seems to set the best-fit shape
parameter at 1:

> fitdistr(x, "weibull", lower = 1)
      shape          scale
   1.             9.87964337
 (0.02358731)   (0.40649570)

## I have tried this on other datasets also truncated at x=1 and get the
## same result (i.e. shape=1).

Does anyone know how to successfully fit the exponential, weibull and
lognormal distributions to truncated data?

Secondly, as my datasets are large (1000 data points), assessing the fit of
the distribution with Kolmogorov-Smirnov goodness-of-fit tests is routinely
showing statistical significance for all distributions. Therefore, I would
like to plot the observed data with the theoretical best-fit distributions
(weibull, exponential and lognormal) to visually assess which fits the
observed data best. So far I have been doing this as follows:

> fitdistr(x, "weibull")
   shape   scale
     a       b
> D1 <- density(x)   ## density distribution of observed data
> D2 <- density(rweibull(1500, shape = a, scale = b))
  ## density of a random variable following the theoretical best-fit
  ## weibull distribution with shape parameter = a, scale parameter = b
> plot(range(D1$x), range(D1$y, D2$y), type = "n", xlab = "x", ylab = "Density")
> lines(D1, col = "red")
> lines(D2, col = "blue")

This successfully plots the two density curves on the same graph, but it
plots data below the x=1 threshold - even for the observed data! I have
tried limiting the scale of the x-axis using xlim=c(1,150), but the graph
still plots the origin of the graph as (0,0). I can only get different
origins if I limit x more extremely, e.g. xlim=c(50,150). Does anyone know
how I can successfully change the origin of the graph to (1,0)?

Sorry for the long e-mail! Any help would be greatly appreciated.

Regards,
Lauren
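
One way to approach the plotting question is sketched below, under stated
assumptions: the parameter values and the simulated data are placeholders,
not results from this thread, and dfit is a hypothetical helper name. The
sketch draws the fitted density directly, renormalised to x > 1, instead of
smoothing simulated draws, and it evaluates the kernel estimate of the
observed data only from x = 1 upward. By default R pads each axis range by
4% (xaxs = "r"), which is why xlim = c(1, 150) still shows values below 1;
setting xaxs = "i" and yaxs = "i" removes the padding so the axes meet at
(1, 0).

```r
# Placeholder parameter values, for illustration only.
shape_hat <- 1.1
scale_hat <- 13

# Simulated stand-in for data left-truncated at 1.
set.seed(2)
y <- rweibull(3000, shape_hat, scale_hat)
obs <- y[y > 1]

# Fitted Weibull density renormalised to the region x > 1.
dfit <- function(x) {
  dweibull(x, shape_hat, scale_hat) /
    pweibull(1, shape_hat, scale_hat, lower.tail = FALSE)
}

# Evaluate the kernel density estimate on a grid that starts at 1.
d_obs <- density(obs, from = 1)
grid <- seq(1, max(obs), length.out = 400)

plot(d_obs$x, d_obs$y, type = "l", col = "red",
     xlim = c(1, max(obs)),
     ylim = c(0, max(d_obs$y, dfit(grid))),
     xaxs = "i", yaxs = "i",          # no 4% padding around the limits
     xlab = "x", ylab = "Density")
lines(grid, dfit(grid), col = "blue")  # theoretical truncated density
```

The from argument only restricts where the kernel estimate is evaluated; the
estimate is still biased downward close to x = 1, so a histogram drawn with
hist(obs, freq = FALSE) may be a fairer comparison near the boundary.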
Re: [R] Fitting weibull, exponential and lognormal distributions to left-truncated data.
> > I have several datasets, all left-truncated at x=1, that I am
> > attempting to fit distributions to (lognormal, weibull and
> > exponential). I had been using fitdistr in the MASS package as
> > follows:
>
> A possible solution is to use the survreg() in the survival package
> without specifying the covariates, i.e.
>
> library(survival)
> survreg(Surv(..)~1, dist="weibull")
>
> where Surv(..) accepts information about times, censoring/truncation
> variables and dist allows to specify alternative distributions. See
> ?Surv and ?survreg

The survival package is mostly targeted at right-censored data. The NADA
package provides wrappers for many of the survival routines so they work
with left-censored data.

Regards,
Richie.

Mathematical Sciences Unit
HSL
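
As a point of comparison, the sketch below fits a Weibull to left-censored
data with survreg() directly, using Surv(..., type = "left"). It assumes
simulated data in which values at or below 1 are known to exist but are
recorded only as "<= 1"; this is censoring, not the truncation described in
the original question, so it is not a fit for that problem. The variable
names are illustrative. The conversion from the survreg parameterisation
(shape = 1/scale, scale = exp(intercept)) follows the survreg documentation.

```r
library(survival)

# Simulated left-censored data: values at or below 1 are recorded as 1
# and flagged as censored (detected == FALSE).
set.seed(3)
y <- rweibull(2000, shape = 1.2, scale = 10)
detected <- y > 1
time <- pmax(y, 1)

# Intercept-only Weibull model; type = "left" marks left censoring.
fit <- survreg(Surv(time, detected, type = "left") ~ 1, dist = "weibull")

# Convert to the shape/scale parameterisation used by dweibull().
shape_hat <- 1 / fit$scale
scale_hat <- exp(unname(coef(fit)[1]))
c(shape = shape_hat, scale = scale_hat)
```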
Re: [R] Fitting weibull, exponential and lognormal distributions to left-truncated data.
On Tue, 7 Oct 2008, [EMAIL PROTECTED] wrote:

> The survival package is mostly targeted at right-censored data. The NADA
> package provides wrappers for many of the survival routines so they work
> with left-censored data.

Left-censoring and left-truncation are not the same thing. With
left-censoring you see that you had observations < 1, and with
left-truncation you do not (at least how the terms are usually applied:
occasionally the meanings are reversed).

For left-truncation it is relatively easy, e.g.

ltwei <- function(x, shape, scale = 1, log = FALSE)
    dweibull(x, shape, scale, log)/pweibull(1, shape, scale, lower=FALSE)

and use this in fitdistr.

--
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
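
The distinction drawn in this message can be written out as two likelihoods.
The notation is an assumption introduced here, not taken from the thread:
f and F are the density and distribution function of the untruncated model
with parameters \theta, x_1, ..., x_n are the values observed above 1, and
m is the number of observations known to lie below 1 (available only under
censoring).

```latex
% Left truncation at 1: values below 1 are never seen, so each observed
% density is renormalised by the probability of exceeding 1.
L_{\mathrm{trunc}}(\theta) = \prod_{i=1}^{n} \frac{f(x_i;\theta)}{1 - F(1;\theta)},
\qquad x_i > 1 .

% Left censoring at 1: the m values below 1 are known to exist, and each
% contributes the probability of falling below 1.
L_{\mathrm{cens}}(\theta) = F(1;\theta)^{m} \prod_{i=1}^{n} f(x_i;\theta) .
```

The factor f(x)/(1 - F(1)) in the first expression is what the ltwei
function above computes when log = FALSE.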