Re: [R] Fitting weibull, exponential and lognormal distributions to left-truncated data.

2008-10-08 Thread Prof Brian Ripley

On Wed, 8 Oct 2008, Gough Lauren wrote:


> Hi,
>
> Thank you very much for your reply. This seems to be working OK when
> fitting weibull and lognormal distributions.  However, fitdistr now
> requires me to include start values:


As documented.


> > ltwei <- function(x, shape, scale, log = FALSE) {
> +   dweibull(x, shape, scale, log) / pweibull(1, shape, scale, lower = FALSE)
> + }
> > ltweifit <- fitdistr(x, ltwei)  # x is observed data
> Error in fitdistr(x, ltwei) : 'start' must be a named list
> > ltweifit <- fitdistr(x, ltwei, start = list(shape = 0.5, scale = 0.5))
> There were 34 warnings (use warnings() to see them)
> > ltweifit
>       shape         scale
>    1.11108278   13.00703630
>  ( 0.01936651) ( 0.42897340)
>
> Is there any way I can fit to truncated data without having to name start
> values?  Alternatively, is there any recommended technique for choosing
> sensible start values?


Not really, depends how heavy the truncation is.


> Further, when I try to fit an exponential distribution I get an error
> message:


But a truncated exponential is just a shifted exponential and has one 
parameter -- you gave it two!  Just fit an exponential to x-1.
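
A minimal sketch of the shifted-exponential fit suggested above, on simulated data (the sample `x`, its size and the true rate of 0.2 are assumptions made purely for illustration). By the memoryless property, if X is exponential then X - 1 given X > 1 is exponential with the same rate, so no custom density and no start value should be needed.

```r
library(MASS)

set.seed(1)
# Simulated stand-in for the observed data: exponential draws, keeping only
# values above the truncation point x = 1 (rate 0.2 is an arbitrary choice).
raw <- rexp(5000, rate = 0.2)
x <- raw[raw > 1]

# Fit an ordinary exponential to the shifted data.
fit <- fitdistr(x - 1, "exponential")
fit$estimate

# The MLE has a closed form, so the fit can be checked by hand.
rate_by_hand <- 1 / mean(x - 1)
```

As for the `dn[[2]]` error in the quoted session below, a plausible cause (an assumption, not checked against the 2008 MASS source) is the unnamed `start = list(0.1)`; `start = list(rate = 0.1)` would be the first thing to try.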



> > ltexp <- function(x, rate, log = FALSE) {
> +   dexp(x, rate, log) / pexp(1, rate, lower = FALSE)
> + }
> > ltexpfit <- fitdistr(x, ltexp)
> Error in fitdistr(x, ltexp) : 'start' must be a named list
> > ltexpfit <- fitdistr(x, ltexp, start = list(0.1))
> Warning message:
> In optim(x = c(2.541609, 1.436143, 4.600524, 6.437174, 2.84974,  :
>   one-diml optimization by Nelder-Mead is unreliable: use optimize
> > ltexpfit
> Error in dn[[2]] : subscript out of bounds
>
> This error message seems to occur regardless of the start value used.
> Do you know why this is?
>
> Sorry to pester you again, and apologies if I am asking silly questions
> - my knowledge of R and probability distributions (except the normal!)
> is rather limited!
>
> Best wishes
>
> Lauren

--
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Fitting weibull, exponential and lognormal distributions to left-truncated data.

2008-10-08 Thread Göran Broström
The package 'eha' fits these distributions (and more) with general
left truncation and right censoring, and also regression models a la
survreg. Look at 'phreg' for parametric proportional hazards models
and 'aftreg' for accelerated failure time models. In your case of no
covariates, the two functions give (of course) identical results.

Hth,
Göran

-- 
Göran Broström

[R] Fitting weibull, exponential and lognormal distributions to left-truncated data.

2008-10-07 Thread Gough Lauren
Dear All,

I have two questions regarding distribution fitting.

I have several datasets, all left-truncated at x=1, that I am attempting
to fit distributions to (lognormal, weibull and exponential).  I had
been using fitdistr in the MASS package as follows:

fitdistr(x, "weibull")

However, this does not take into consideration the truncation at x=1.  I
read another posting in this forum that suggested using the argument
"lower" to truncate the distribution fitting.  However, this does not
seem to be working.  For example, when I attempt to fit a weibull
distribution truncated at x=1 using "lower", it seems to set the
best-fit shape parameter at 1:

> fitdistr(x, "weibull", lower = 1)
      shape        scale
   1.           9.87964337
  (0.02358731) (0.40649570)

## I have tried this on other datasets also
## truncated at x=1 and get the same result (i.e. shape=1).
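
For context, a small sketch of what "lower" does in fitdistr, on simulated data (the data and the true parameter values are assumptions for illustration). As the MASS help page describes it, `lower` and `upper` are passed on to optim as bounds on the parameters, not on the data, so `lower = 1` only stops shape and scale from dropping below 1. That would explain a shape estimate sitting exactly on the boundary.

```r
library(MASS)

set.seed(42)
# Simulated data with a true shape below 1, so that the bound at 1 is active.
x <- rweibull(2000, shape = 0.7, scale = 5)

# Unconstrained fit versus a fit with both parameters bounded below by 1.
free  <- fitdistr(x, "weibull")
bound <- suppressWarnings(fitdistr(x, "weibull", lower = 1))

rbind(free = free$estimate, bounded = bound$estimate)
```

In the bounded fit the shape estimate is pinned at the bound, and nothing about the truncation of the data has been taken into account.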

Does anyone know how to successfully fit the exponential, weibull and
lognormal distributions to truncated data?



Secondly, as my datasets are large (1000 data points) assessing the fit
of the distribution with Kolmogorov-Smirnov goodness-of-fit tests is
routinely showing statistical significance for all distributions.
Therefore, I would like to plot the observed data with the theoretical
best fit distributions (weibull, exponential and lognormal) to visually
assess which fits the observed data best.  So far I have been doing this
as follows:

fitdistr(x, "weibull")
      shape        scale
        a            b

D1 <- density(x)  ## density distribution of observed data
## density of a random variable following the theoretical best fit
## weibull distribution with shape parameter = a, scale parameter = b
D2 <- density(rweibull(1500, shape = a, scale = b))

plot(range(D1$x), range(D1$y, D2$y), type = "n", xlab = "x", ylab = "Density")
lines(D1, col = "red")
lines(D2, col = "blue")

This successfully plots the two density curves on the same graph, but it
plots data below the x=1 threshold - even for the observed data!  I have
tried limiting the scale of x-axis using xlim=c(1,150) but the graph
still plots the origin of the graph as (0,0).  I can only get different
origins if I limit x more extremely e.g. xlim=c(50,150).  Does anyone
know how I can successfully change the origin of the graph to (1,0)?
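
One way to sidestep both plotting problems, sketched under assumptions (a simulated `x`, hypothetical fitted values `a` and `b`, and a helper named `dltwei` that is not from this thread): restrict the kernel density estimate to the observed range with the `from` argument of density(), draw the theoretical truncated density with curve() rather than simulating from it, and set `xaxs = "i"` and `yaxs = "i"` so the axes start exactly at the limits given instead of being padded by R's default 4%.

```r
set.seed(7)
# Simulated stand-in for data left-truncated at x = 1.
raw <- rweibull(3000, shape = 1.1, scale = 13)
x <- raw[raw > 1]

# Hypothetical fitted values; in practice take them from the fitted model.
a <- 1.1  # shape
b <- 13   # scale

# Truncated Weibull density: zero below 1, rescaled to integrate to 1 above it.
dltwei <- function(x, shape, scale) {
  ifelse(x < 1, 0,
         dweibull(x, shape, scale) /
           pweibull(1, shape, scale, lower.tail = FALSE))
}

# Kernel density estimate evaluated only from the truncation point upwards.
D1 <- density(x, from = 1)
ymax <- 1.2 * max(D1$y, dltwei(D1$x, a, b))

plot(D1, col = "red", main = "", xlab = "x", ylab = "Density",
     xlim = c(1, max(x)), ylim = c(0, ymax), xaxs = "i", yaxs = "i")
curve(dltwei(x, a, b), from = 1, to = max(x), add = TRUE, col = "blue")
```

The kernel estimate is still biased downwards close to x = 1, because part of each kernel falls below the truncation point, so some disagreement between the two curves near the boundary is to be expected.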


Sorry for the long e-mail! Any help would be greatly appreciated.

Regards,

Lauren



Re: [R] Fitting weibull, exponential and lognormal distributions to left-truncated data.

2008-10-07 Thread Richard . Cotton
>> I have several datasets, all left-truncated at x=1, that I am
>> attempting to fit distributions to (lognormal, weibull and
>> exponential).  I had been using fitdistr in the MASS package as
>> follows:
>
> A possible solution is to use survreg() in the survival package
> without specifying any covariates, i.e.
>
> library(survival)
> survreg(Surv(..)~1, dist="weibull")
>
> where Surv(..) accepts information about times and censoring/truncation
> variables, and dist allows alternative distributions to be specified.
> See ?Surv and ?survreg

The survival package is mostly targeted at right-censored data.  The NADA 
package provides wrappers for many of the survival routines so they work 
with left-censored data.

Regards,
Richie.

Mathematical Sciences Unit
HSL





Re: [R] Fitting weibull, exponential and lognormal distributions to left-truncated data.

2008-10-07 Thread Prof Brian Ripley

On Tue, 7 Oct 2008, [EMAIL PROTECTED] wrote:


>>> I have several datasets, all left-truncated at x=1, that I am
>>> attempting to fit distributions to (lognormal, weibull and
>>> exponential).  I had been using fitdistr in the MASS package as
>>> follows:
>>
>> A possible solution is to use survreg() in the survival package
>> without specifying any covariates, i.e.
>>
>> library(survival)
>> survreg(Surv(..)~1, dist="weibull")
>>
>> where Surv(..) accepts information about times and censoring/truncation
>> variables, and dist allows alternative distributions to be specified.
>> See ?Surv and ?survreg
>
> The survival package is mostly targeted at right-censored data.  The NADA
> package provides wrappers for many of the survival routines so they work
> with left-censored data.


Left-censoring and left-truncation are not the same thing.  With 
left-censoring you see that you had observations < 1, and with
left-truncation you do not (at least how the terms are usually applied: 
occasionally the meanings are reversed).


For left-truncation it is relatively easy, e.g.

ltwei <- function(x, shape, scale = 1, log = FALSE)
    dweibull(x, shape, scale, log)/pweibull(1, shape, scale, lower=FALSE)

and use this in fitdistr.
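
A minimal sketch of how such a density might be used with fitdistr, on simulated data (the sample, the truncation point `cutoff` and the function name `dltweibull` are assumptions for illustration). Two details differ from the one-liner above and reflect my reading of fitdistr rather than anything stated in this thread: fitdistr needs a named `start` list for a user-supplied density, and when that density has a `log` argument it appears to be called with `log = TRUE`, so the normalising constant is subtracted on the log scale here instead of divided out. Taking start values from an untruncated Weibull fit is only a heuristic and may be poor under heavy truncation.

```r
library(MASS)

set.seed(123)
cutoff <- 1  # left-truncation point
raw <- rweibull(5000, shape = 1.5, scale = 3)
x <- raw[raw > cutoff]

dltweibull <- function(x, shape, scale, log = FALSE) {
  # Log density of a Weibull conditional on exceeding the truncation point.
  ld <- dweibull(x, shape, scale, log = TRUE) -
    pweibull(cutoff, shape, scale, lower.tail = FALSE, log.p = TRUE)
  if (log) ld else exp(ld)
}

# Heuristic start values: ignore the truncation for a first guess.
start <- as.list(fitdistr(x, "weibull")$estimate)

# Bounding both parameters away from zero keeps the optimiser out of the
# region where the density is undefined.
fit <- fitdistr(x, dltweibull, start = start, lower = c(1e-3, 1e-3))
fit
```

On this simulated sample the estimates should land close to the true shape 1.5 and scale 3, whereas the untruncated fit used for the start values ignores the missing mass below the cutoff.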

--
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
