On 06-Sep-05 Huntsinger, Reid wrote: > The MLE of beta is the reciprocal of the sample mean, so you > don't need an optimizer here. > > Reid Huntsinger
While that is true (and Naja clearly knew this), nevertheless one expects that using an optimiser should also work. Nadja's observations need an explanation. If things don't behave as expected, it is worth while embedding "debug prints" so as to monitor what is going on internally (as fas as one can). In this case, if one modifies Nadja's "ll" function to ll<-function(beta){ n<-24 x<-data2 temp<-(-n*log(beta)+beta*sum(x)) print(temp) temp } and re-runs 'mle', one sees that while there are some numerical values in the output, there are many NaNs. Also, given the warning message and the advice to look at "warnings()", one learns that "NaNs produced in: log(x)" repeatedly. This very strongly suggests that attempts have been made to take logs of negative numbers which in trun suggests that the method of computing the next approximation readily takes the value of beta outside the valid range of beta > 0. Now is the time to look at "?mle", which says that the default method is "BGFS" for which "see optim". Under "?optim" we learn that "BGFS" is a "quasi-Newton method". Such methods work by calculating a local tangent to the derivative function and extrapolating this until it meets the beta-axis, and this can easily take the estimate outside admissible ranges (try using Newton-Raphson to solve sqrt(x) = 0). However, a related method available for 'optim' is "L-BFGS-B" "which allows _box constraints_, that is each variable can be given a lower and/or upper bound. The initial value must satisfy the constraints." This can be set in a parameter for 'mle'. So now you can try something like est<-mle(minuslog=ll, start=list(beta=0.1), method="L-BFGS-B", lower=10*(.Machine$double.eps)) and now the trace-prints show a series of numbers, with no NaNs, so clearly we are getting somewhere (and have identified and dealt with at least one aspect of the problem). However, observing the prints, one sees that after an initial trend to convergence there is a tendency to oscillate between values in the neighbouhood of beta=360 and values in the neighbourhood of beta=800, finally failing when two successive values "360.6573" are printed, which in turn suggests that an attempt is made to comuted a gradient from identical points. So clearly there is something not right about how the method works for this particular problem (which, as a statistical estimation problem, could hardly be simpler!). Now, "?optim" has, at the end, a Note to the effect that the default method (admittedly Nelder-Mead, which is not relevant to the above) may not work well and suggests using 'optimize' instead. So let's try 'optimize' anyway. Now, with optimize(ll,lower=10*(.Machine$double.eps),upper=1e10) we get a clean set of debug-prints, and convergence to beta = 5.881105e-05 with "minimum" 'll' equal to 254.6480. Now compare with the known MLE which is beta = 1/mean(data2) = 6.766491e-05 giving ll(1/mean(data2)) = 254.4226, So clearly, now, using 'optimise' instead of 'optim' which is what 'mle' uses, we are now "in the right parish". However, there is apparently no parameter to 'mle' which would enable us to force it to use 'optimize' rather than 'optim'! This interesting saga, provoked by Nadja's query, now raises an important general question: Given the simplicity of the problem, why is the use of 'mle' so unexpectedly problematic? While in the case of an exponential distribution (which has a well-known analytical solution) one would not want to use 'mle' to find the MLE (except as a test of 'mle'. perhaps), one can easily think of other distributions, in form and behaviour very similar to the negative exponential but without analytical solution, for which use of 'mle' or some other optimisation routine would be required. Such distributions could well give rise to similar problems -- or worse: in Nadja's example,it was clear that it was not working; in other cases, it might appear to give a result, but the result might be very wrong and this would not be obvious. Hmmm. Ted. > > -----Original Message----- > From: [EMAIL PROTECTED] > [mailto:[EMAIL PROTECTED] On Behalf Of Nadja Riedwyl > Sent: Tuesday, September 06, 2005 9:39 AM > To: r-help@stat.math.ethz.ch > Subject: [R] fitting distributions with R > > > Dear all > I've got the dataset > data:2743;4678;21427;6194;10286;1505;12811;2161;6853;2625;14542;694;1149 > 1; > _ _ _ _ _ 14924;28640;17097;2136;5308;3477;91301;11488;3860;64114;14334 > I know from other testing that it should be possible to fit the data > with > the > exponentialdistribution. I tried to get parameterestimates for the > exponentialdistribution with R, but as the values > of the parameter are very close to 0 i get into troubles. Do you know, > what > i > could do in order to get estimates?How do you choose the starting > values? in > > my opinion it should be around 1/mean(data). > > >#Parameterestimation _with mle() with the log-likelihood funktion of the >#_ >#exponentialdistribution > library(stats4) > ll<-function(beta) > {n<-24 > x<-data2 > -n*log(beta)+beta*sum(x)} > est<-mle(minuslog=ll, start=list(beta=0.1)) > summary(est) > >#instead of a result, i get: > > > Error in optim(start, f, method = method, hessian = TRUE, ...) : > _ _ _ _ non-finite finite-difference value [1] > In addition: There were 50 or more warnings (use warnings() to see the > first > > 50) >#with fitdistr() for the exponentialdistribution > library(MASS) > fitdistr(data2,densfun=dexp,start=list(rate=0.1),lower=6e-06,method="BFG > S") > >#instead of a result, i get > > Error in optim(start, mylogfn, x = x, hessian = TRUE, ...) : > _ _ _ _ non-finite finite-difference value [1] > In addition: Warning messages: > 1: bounds can only be used with method L-BFGS-B in: optim(start, > mylogfn, x > = > x, hessian = TRUE, ...) > 2: NaNs produced in: dexp(x, 1/rate, log) > > > i'll be very happy for any help i can get to solve this problem > thank you! > > ______________________________________________ > R-help@stat.math.ethz.ch mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide! > http://www.R-project.org/posting-guide.html > > ______________________________________________ > R-help@stat.math.ethz.ch mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide! > http://www.R-project.org/posting-guide.html -------------------------------------------------------------------- E-Mail: (Ted Harding) <[EMAIL PROTECTED]> Fax-to-email: +44 (0)870 094 0861 Date: 06-Sep-05 Time: 16:46:12 ------------------------------ XFMail ------------------------------ ______________________________________________ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html