Re: [R] Fitting a Weibull/NaNs
On 19 Oct 2003 21:18:09 -0700, you wrote:

> The problem seems to be that some of the values of d$Age.Month are 0, and since the Weibull always has a value of 0 at 0, the log likelihood comes out insane. (I'm getting 0 values due to quantization error.) OTOH when I remove the 0 values it works great, but that seems kind of ad hoc. Is there some standard fix for this?

A standard fix is to replace zeros with some small positive number, but this isn't entirely satisfactory. It's definitely worthwhile doing a sensitivity test: e.g. do you get essentially the same answer using 0.01 and 0.0001 for the replacement value? If not, you might want to question the use of a model that predicts zero density in an area where you've got observations.

Duncan Murdoch

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
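The sensitivity test suggested above could be sketched roughly as follows. This is only an illustration on simulated data, not the poster's data: the vector `age`, the seed, and the helper `fit_weibull` are hypothetical, and the log parameterization discussed elsewhere in this thread is used to keep `nlm` away from negative parameter values.

```r
# Sensitivity test: replace zeros by a small eps and compare the fitted parameters.
# Simulated stand-in data (assumption); real observations would go in `age`.
set.seed(1)
age <- round(rweibull(200, shape = 1.5, scale = 40))
age[1:3] <- 0                      # force a few zeros for the demonstration

# Hypothetical helper: Weibull MLE with zeros replaced by `eps`.
fit_weibull <- function(x, eps) {
  x[x == 0] <- eps
  nll <- function(p) {
    -sum(dweibull(x, shape = exp(p[1]), scale = exp(p[2]), log = TRUE))
  }
  exp(nlm(nll, c(log(1.5), log(40)))$estimate)   # back-transform to (shape, scale)
}

fits <- sapply(c(0.01, 0.0001), function(eps) fit_weibull(age, eps))
dimnames(fits) <- list(c("shape", "scale"), c("eps=0.01", "eps=0.0001"))
print(fits)
```

If the two columns differ by more than the uncertainty one is willing to accept, the estimates are being driven by the arbitrary replacement value rather than by the data.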
[R] Fitting a Weibull/NaNs
I'm trying to fit a Weibull distribution to some data via maximum likelihood estimation. I'm following the procedure described by Doug Bates in his "Using Open Source Software to Teach Mathematical Statistics", but I keep getting warnings about NaNs being converted to maximum positive value:

    llfunc <- function(x) { -sum(dweibull(AM, shape=x[1], scale=x[2], log=TRUE)) }
    mle <- nlm(llfunc, c(shape=1.5, scale=40), hessian=TRUE)

    Warning messages:
    1: NaNs produced in: dweibull(x, shape, scale, log)
    2: NA/Inf replaced by maximum positive value
    3: NaNs produced in: dweibull(x, shape, scale, log)
    4: NA/Inf replaced by maximum positive value

Can someone offer some advice here?

Thanks,
-Ekr

--
[Eric Rescorla [EMAIL PROTECTED]
http://www.rtfm.com/
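One plausible source of such warnings, offered here as an assumption rather than a diagnosis of this particular data set, is that an unconstrained optimizer step tries a non-positive shape or scale, for which `dweibull` returns NaN. A minimal check:

```r
# dweibull() returns NaN (with a warning) when shape or scale is not positive.
ok  <- dweibull(10, shape = 1.5, scale = 40, log = TRUE)
bad <- suppressWarnings(dweibull(10, shape = -0.2, scale = 40, log = TRUE))
print(c(ok = ok, bad = bad))
```

The first value is an ordinary finite log density; the second is NaN, which `nlm` then replaces by the maximum positive value.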
Re: [R] Fitting a Weibull/NaNs
I have not used nlm, but that happens routinely with function minimizers trying to test negative values for one or more components of x. My standard approach to something like this is to parameterize llfunc in terms of log(shape) and log(scale), as follows:

    llfunc <- function(x) { -sum(dweibull(AM, shape=exp(x[1]), scale=exp(x[2]), log=TRUE)) }

Have you tried this? If not, I suspect the warnings will disappear when you try this. If they don't, I suggest you rewrite llfunc to store nlglk <- (-sum(...)) and then print out x whenever nlglk is NA, Inf or NaN.

Hope this helps.
Spencer Graves
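Both suggestions above (the log parameterization and printing x when the objective is not finite) can be combined in one sketch. The data here are simulated, so `AM`, the seed, and the starting values are assumptions made for illustration only.

```r
set.seed(42)
AM <- rweibull(100, shape = 1.5, scale = 40)   # simulated stand-in for the data

# Negative log-likelihood in terms of log(shape) and log(scale), so that any
# real-valued x maps to strictly positive Weibull parameters.
llfunc <- function(x) {
  nlglk <- -sum(dweibull(AM, shape = exp(x[1]), scale = exp(x[2]), log = TRUE))
  if (!is.finite(nlglk)) print(x)   # report parameter values giving NA, NaN or Inf
  nlglk
}

# Starting values must be given on the log scale as well.
mle <- nlm(llfunc, c(log(1.5), log(40)), hessian = TRUE)
est <- exp(mle$estimate)            # back-transform to the original scale
names(est) <- c("shape", "scale")
print(est)
```

Note that `mle$estimate` and `mle$hessian` refer to the log-scale parameters; only the point estimates are back-transformed here.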
Re: [R] Fitting a Weibull/NaNs
Spencer Graves [EMAIL PROTECTED] writes:

> I have not used nlm, but that happens routinely with function minimizers trying to test negative values for one or more components of x. My standard approach to something like this is to parameterize llfunc in terms of log(shape) and log(scale), as follows:
>
>     llfunc <- function(x) { -sum(dweibull(AM, shape=exp(x[1]), scale=exp(x[2]), log=TRUE)) }
>
> Have you tried this? If not, I suspect the warnings will disappear when you try this.

This works. I've got some more questions, though:

(1) Does it introduce bias to work with the logs like this?

(2) My original data set had zero values. I added .5 experimentally, which is how I got to this data set. This procedure doesn't work on the original data set. Instead I get (with the numbers below being the values that caused problems):

    [1] 0.41 3.70 1.00
    [1] 0.41 3.70 1.00
    [1] 0.410001 3.70 1.00
    [1] 0.41 3.74 1.00
    [1] 0.41 3.70 1.01
    Warning messages:
    1: NA/Inf replaced by maximum positive value
    2: NA/Inf replaced by maximum positive value
    3: NA/Inf replaced by maximum positive value
    4: NA/Inf replaced by maximum positive value

Thanks,
-Ekr
Re: [R] Fitting a Weibull/NaNs
If the algorithm works properly, you should get exactly the same answer using a linear or a log scale for the parameters. The bigger question is not bias but the accuracy of a normal approximation for confidence intervals and regions.

I have evaluated this by making contour plots of the log(likelihood). I use outer to compute this over an appropriate grid of the parameters. Then I use contour [or image with contour(..., add=TRUE)] to see the result. After I get a picture, I may specify the levels, using, e.g., the fact that 2*log(likelihood ratio) is approximately chi-square with 2 degrees of freedom. The normality assumption says that the contours should be close to elliptical. I've also fit log(likelihood) to a parabola in the parameters, possibly after deleting points beyond the 0.001 level for chi-square(2). If I get a good fit, I'm happy. If not, I try a different parameterization.

When I've done this, I've found that I tend to get more nearly normal contours by throwing the constraint to (-Inf) than leaving it at 0, i.e., by

Bates and Watts (1988) Nonlinear Regression Analysis and Its Applications (Wiley) explain that parameter effects curvature seems to be vastly greater than the intrinsic curvature of the nonlinear manifold, onto which a response vector is projected by nonlinear least squares. This is different from maximum likelihood, but I believe that this principle would still likely apply.

Does this make sense?
Spencer Graves

p.s. I don't understand what you are saying about 0.41 3.70 1.00 below. You are giving me a set of three numbers when you are trying to estimate two parameters and getting NAs, Infs and NaNs. Are you printing out x when the log(likelihood) is NA, NaN or Inf? If yes, is one component of x = 0?

Eric Rescorla wrote:

> Spencer Graves [EMAIL PROTECTED] writes:
>
>> I have not used nlm, but that happens routinely with function minimizers trying to test negative values for one or more components of x. My standard approach to something like this is to parameterize llfunc in terms of log(shape) and log(scale), as follows:
>>
>>     llfunc <- function(x) { -sum(dweibull(AM, shape=exp(x[1]), scale=exp(x[2]), log=TRUE)) }
>>
>> Have you tried this? If not, I suspect the warnings will disappear when you try this.
>
> This works. I've got some more questions, though:
>
> (1) Does it introduce bias to work with the logs like this?
>
> (2) My original data set had zero values. I added .5 experimentally, which is how I got to this data set. This procedure doesn't work on the original data set. Instead I get (with the numbers below being the values that caused problems):
>
>     [1] 0.41 3.70 1.00
>     [1] 0.41 3.70 1.00
>     [1] 0.410001 3.70 1.00
>     [1] 0.41 3.74 1.00
>     [1] 0.41 3.70 1.01
>     Warning messages:
>     1: NA/Inf replaced by maximum positive value
>     2: NA/Inf replaced by maximum positive value
>     3: NA/Inf replaced by maximum positive value
>     4: NA/Inf replaced by maximum positive value
>
> Thanks,
> -Ekr
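The contour-plot check described in this message might look roughly like the sketch below. It uses simulated data, and the grid limits, the probability levels, and the names `loglik`, `lshape`, and `lscale` are all assumptions chosen for illustration. The maximum over the grid stands in for the true maximum of the log-likelihood.

```r
set.seed(42)
AM <- rweibull(100, shape = 1.5, scale = 40)   # simulated stand-in data

# Log-likelihood at a single (log shape, log scale) pair.
loglik <- function(ls, lc) {
  sum(dweibull(AM, shape = exp(ls), scale = exp(lc), log = TRUE))
}

# Evaluate over a grid on the log scale (grid limits are arbitrary assumptions).
lshape <- seq(log(1.0), log(2.2), length.out = 60)
lscale <- seq(log(30), log(52), length.out = 60)
ll <- outer(lshape, lscale, Vectorize(loglik))

# Contour levels where 2 * (max loglik - loglik) equals a chi-square(2) quantile.
probs <- c(0.99, 0.95, 0.9, 0.5)
levs  <- max(ll) - qchisq(probs, df = 2) / 2
contour(lshape, lscale, ll, levels = levs, labels = probs,
        xlab = "log(shape)", ylab = "log(scale)")
```

Contours that look close to ellipses suggest the normal approximation is reasonable in this parameterization; repeating the plot with `shape` and `scale` on the original scale allows a direct comparison.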
Re: [R] Fitting a Weibull/NaNs
Spencer Graves [EMAIL PROTECTED] writes:

> Bates and Watts (1988) Nonlinear Regression Analysis and Its Applications (Wiley) explain that parameter effects curvature seems to be vastly greater than the intrinsic curvature of the nonlinear manifold, onto which a response vector is projected by nonlinear least squares. This is different from maximum likelihood, but I believe that this principle would still likely apply.
>
> Does this make sense?

Some :)

> p.s. I don't understand what you are saying about 0.41 3.70 1.00 below. You are giving me a set of three numbers when you are trying to estimate two parameters and getting NAs, Infs and NaNs. Are you printing out x when the log(likelihood) is NA, NaN or Inf? If yes, is one component of x = 0?

Doh! Typographical error to R. I had the hessian=TRUE clause inside the c(). Doesn't make any difference for the results, though. I'm doing the following:

    llfunc <- function(zzz) {
        tmp <- -sum(dweibull(d$Age.Month, shape=exp(zzz[1]), scale=exp(zzz[2]), log=TRUE))
        if (is.infinite(tmp) | is.na(tmp)) { print(zzz) }
        tmp
    }
    mle <- nlm(llfunc, c(shape=.37, scale=4.0), hessian=TRUE)

    [1] 0.37 4.00
    [1] 0.37 4.00
    [1] 0.370001 4.00
    [1] 0.37 4.04
    [1] 0.3701 4.
    [1] 0.3700 4.0004
    [1] 0.3702 4.
    [1] 0.3701 4.0004
    [1] 0.3700 4.0008
    Warning messages:
    1: NA/Inf replaced by maximum positive value
    2: NA/Inf replaced by maximum positive value
    3: NA/Inf replaced by maximum positive value
    4: NA/Inf replaced by maximum positive value
    5: NA/Inf replaced by maximum positive value
    6: NA/Inf replaced by maximum positive value
    7: NA/Inf replaced by maximum positive value
    8: NA/Inf replaced by maximum positive value

I'm a little vague on how this is supposed to work, but when I just compute

    -sum(dweibull(d$Age.Month, shape=1.5, scale=40, log=TRUE))

I get Inf.

The problem seems to be that some of the values of d$Age.Month are 0, and since the Weibull always has a value of 0 at 0, the log likelihood comes out insane. (I'm getting 0 values due to quantization error.) OTOH when I remove the 0 values it works great, but that seems kind of ad hoc. Is there some standard fix for this?

Thanks much,
-Ekr
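The behaviour at zero can be checked directly. Strictly, the Weibull density is 0 at x = 0 only when shape > 1; it equals 1/scale when shape = 1 and is unbounded when shape < 1. The toy vector `x` below is an assumption used only to show the effect of a single zero.

```r
# Log density at x = 0 for three shapes (scale = 40 throughout).
shapes <- c(0.5, 1, 1.5)
ld0 <- sapply(shapes, function(a) dweibull(0, shape = a, scale = 40, log = TRUE))
print(setNames(ld0, paste0("shape=", shapes)))

# With shape > 1, one zero observation makes the negative log-likelihood Inf.
x <- c(0, 12, 30, 55)   # toy data (assumption)
nll <- -sum(dweibull(x, shape = 1.5, scale = 40, log = TRUE))
print(nll)
```

This is consistent with the Inf reported above for shape=1.5 and scale=40: the optimizer cannot evaluate the objective at the starting point as long as the data contain exact zeros.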