Re: [R] MLE Estimation of Gamma Distribution Parameters for data with 'zeros'
There's at least one package that can do zero-inflated gamma regression (Rfast2::zigamma). I'm not sure it's ML, though.

On Thu, Jan 19, 2023 at 10:17 AM Jeff Newmiller wrote:
> Beware of adding a constant... the magnitude of the constant used can have
> an outsized impact on the answer obtained. See e.g.
> https://gist.github.com/jdnewmil/99301a88de702ad2fcbaef33326b08b4
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] MLE Estimation of Gamma Distribution Parameters for data with 'zeros'
Another situation where 0s appear is dosage, when the concentration is below the detection limit. It is not necessary to discretize the data. We propose a method here:

Salvat-Leal I, Cortés-Gómez AA, Romero D, Girondot M (2022) New method for imputation of unquantifiable values using Bayesian statistics for a mixture of censored or truncated distributions: Application to trace elements measured in blood of olive ridley sea turtles from Mexico. Animals 2022: 2919. doi 10.3390/ani12212919 (with R code).

If the data have "true" 0s, the gamma distribution is not the best choice, as 0 is impossible under a gamma distribution.

Marc

Le 19/01/2023 à 12:49, peter dalgaard a écrit :
> Not necessarily homework, Bert. There's a generic issue with MLE and
> rounded data, in that gamma densities may be 0 at the boundary but small
> numbers are represented as 0, making the log-likelihood -Inf.
Re: [R] MLE Estimation of Gamma Distribution Parameters for data with 'zeros'
Beware of adding a constant... the magnitude of the constant used can have an outsized impact on the answer obtained. See e.g. https://gist.github.com/jdnewmil/99301a88de702ad2fcbaef33326b08b4

On January 19, 2023 3:49:29 AM PST, peter dalgaard wrote:
>Not necessarily homework, Bert. There's a generic issue with MLE and rounded
>data, in that gamma densities may be 0 at the boundary but small numbers are
>represented as 0, making the log-likelihood -Inf.

--
Sent from my phone. Please excuse my brevity.
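Jeff's warning can be illustrated with a small toy sketch (an assumption-laden stand-in, not the analysis from the linked gist): fit a gamma by ML after replacing zeros with two different small constants, and compare the resulting estimates.

```r
# Toy illustration of sensitivity to the added constant (simulated data,
# not the linked gist's analysis): zeros replaced by two different values.
set.seed(1)
x <- round(rgamma(500, shape = 0.5, rate = 2), 2)   # rounding creates zeros

fit_gamma <- function(y) {
  # ML fit on the log scale so shape and rate stay positive
  negll <- function(p) -sum(dgamma(y, shape = exp(p[1]), rate = exp(p[2]),
                                   log = TRUE))
  exp(optim(c(0, 0), negll)$par)
}

fit_gamma(ifelse(x == 0, 1e-3, x))   # zeros -> 0.001
fit_gamma(ifelse(x == 0, 1e-6, x))   # zeros -> 0.000001; estimates shift
```

With a shape parameter below 1, the density blows up near 0, so the smaller the substituted constant, the more the fitted shape is dragged down.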
Re: [R] MLE Estimation of Gamma Distribution Parameters for data with 'zeros'
Not necessarily homework, Bert. There's a generic issue with MLE and rounded data, in that gamma densities may be 0 at the boundary but small numbers are represented as 0, making the log-likelihood -Inf.

The cleanest way out is to switch to a discretized distribution in the likelihood, so that instead of log(dgamma(0, ...)) you use log(pgamma(.005, ...) - pgamma(0, ...)) == pgamma(.005, ..., log.p=TRUE). (For data rounded to the nearest .01, that is.) Cruder techniques would be to just add, say, .0025 to all the zeros.

-pd

> On 10 Jan 2023, at 18:42, Bert Gunter wrote:
>
> Is this homework? This list has a no-homework policy.

--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com
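A minimal base-R sketch of the discretized likelihood Peter describes, assuming data rounded to the nearest .01 (data simulated here purely for illustration):

```r
# Sketch of the discretized gamma likelihood: an observed 0 is treated as
# "the true value was somewhere below .005", via pgamma, so the
# log-likelihood stays finite instead of hitting log(dgamma(0, ...)).
set.seed(1)
x <- round(rgamma(500, shape = 0.5, rate = 2), 2)   # zeros from rounding

negll <- function(p) {
  shape <- exp(p[1]); rate <- exp(p[2])   # log scale keeps parameters > 0
  ll <- ifelse(x == 0,
               pgamma(0.005, shape, rate, log.p = TRUE),  # interval mass
               dgamma(x, shape, rate, log = TRUE))        # usual density
  -sum(ll)
}
fit <- optim(c(0, 0), negll)
exp(fit$par)   # finite shape and rate estimates despite the zeros
```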
Re: [R] MLE Estimation of Gamma Distribution Parameters for data with 'zeros'
Is this homework? This list has a no-homework policy.

-- Bert

On Tue, Jan 10, 2023 at 8:13 AM Nyasha wrote:
>
> Please how can one go about this one? I don't know how to go about it.
Re: [R] MLE Estimation of Gamma Distribution Parameters for data with 'zeros'
Please how can one go about this one? I don't know how to go about it.
Re: [R] MLE packages
Hi Bert, thanks for the quick reply. I spent a while searching before I posted, and also read through the documentation for the mle function and the maxLik and bbmle packages. As you say, it seems likely I'm reinventing something standard, but nothing I can find quite seems to do what I need. Hence posting on the mailing list...

best wishes,
Mohan

On Mon, 21 Oct 2019 at 16:40, Bert Gunter wrote:
> Are you familiar with R resources you can search?
> [...]
> I have not looked at your code in any detail, but I'd be willing to bet
> you're trying to reinvent a square wheel.
Re: [R] MLE packages
Are you familiar with R resources you can search?

1. CRAN task views: https://cran.r-project.org/web/views/

2. For searching: https://rseek.org/ (searching on "maximum likelihood" there appeared to bring up relevant resources).

3. RStudio resources: https://education.rstudio.com/ (Note: RStudio is a private company that is not part of the R Foundation, but may have useful programming resources for you.)

4. Tons of online tutorials: just search!

I have not looked at your code in any detail, but I'd be willing to bet you're trying to reinvent a square wheel.

Cheers,
Bert

Bert Gunter
"The trouble with having an open mind is that people keep coming along and sticking things into it." -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip)

On Mon, Oct 21, 2019 at 8:21 AM Mohan Ganesalingam wrote:
> I'm fairly new to R. The language is amazing, but I'm having trouble
> navigating packages. I have a solution that handles the problems I'm
> working on, but I don't know if it could be solved more cleanly with mle,
> bbmle, maxLik, etc.
[R] MLE packages
I'm fairly new to R. The language is amazing, but I'm having trouble navigating packages. I have a solution that handles the problems I'm working on, but I don't know if it could be solved more cleanly with mle, bbmle, maxLik, etc.

Here's an example problem first. I have run many WAV files through voice recognition software; the software returns 50 hypotheses for each, together with scores S_{ni} indicating how 'good' the i-th hypothesis is. I want to map the S_{ni} to a probability distribution. So I'm using MLE to fit a function f that maps scores to logs of relative probabilities. This means maximising

\sum_n [ f(S_{n c_n}) - \log \sum_i \exp f(S_{ni}) ]

where c_n is the index of the correct hypothesis for the n-th sample.

Here's the code:

ave_log_likelihood = function(f, scores) {
  def = scores %>% filter(Sc > 0)
  log_likelihoods = with(def, f(Sc) - matrixStats::rowLogSumExps(f(S), na.rm = TRUE))
  return(mean(log_likelihoods))
}

nlopts = list(algorithm = "NLOPT_LN_BOBYQA", maxeval = 500, print_level = 0)

best_linear_fit = function(scores) {
  res <- nloptr(c(0.01),
                function(a) -ave_log_likelihood(function(x) (a * x), scores),
                opts = nlopts)
  return(data.frame(log_likelihood = -res$objective, slope = res$solution,
                    doubling = log(2) / res$solution))
}

Now, I need to write a lot of variants of this with different objectives and with different classes of function. But there's a lot of verbiage in best_linear_fit which would currently be copy/pasted. Also, as written it makes it messy to fit on training data and then evaluate on test data.

I'd appreciate any advice on packages that might make it easier to write this more cleanly, ideally using the idioms used in `lm`, etc., such as formulae and `predict`. (Any pointers on writing cleaner R code would also be lovely!)
Thanks in advance;
Mohan
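One base-R way to factor out the repeated verbiage Mohan describes, sketched with optim() standing in for nloptr and a toy least-squares objective standing in for the real likelihood: a small fit-object idiom with an S3 predict() method, so each variant supplies only a function family and an objective.

```r
# Hypothetical sketch: a generic fitter returning an S3 object with a
# predict() method. optim() stands in for nloptr; the toy objective
# stands in for the real log-likelihood.
fit_transform <- function(make_f, objective, start) {
  res <- optim(start, objective, method = "BFGS")
  structure(list(par = res$par, make_f = make_f), class = "scorefit")
}
predict.scorefit <- function(object, newx, ...) {
  object$make_f(object$par)(newx)      # apply the fitted transform
}

# Toy usage: linear family f(x) = a*x, with a least-squares stand-in
# objective whose minimum is at a = 2.
make_linear <- function(a) function(x) a * x
toy_obj <- function(a) sum((make_linear(a)(1:10) - 2 * (1:10))^2)
fit <- fit_transform(make_linear, toy_obj, start = 0.01)
predict(fit, newx = 3)   # about 6
```

Separating the fitted object from the data it was fitted on is what makes the train/test split clean: fit on training scores, then call predict() on held-out scores.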
[R] MLE Estimation in Extreme Value Approach to VaR (using Frechet distribution)
Hi R-Users,

I am trying to estimate the 95% VaR (Value-at-Risk) of a portfolio using Extreme Value Theory. In particular, I'll use the Frechet distribution (heavy left tail). I have data on percentage returns (R_t) for T = 5000 past dates. This data has been divided into g = 50 non-overlapping periods of size n = 100 each, and we compute the minimum return r_i over each period (i = 1, 2, ..., 50).

Firstly, I need to estimate, by the maximum likelihood approach, the 3 unknown parameters: a (scale), b (shift) and alpha = -1/k (tail index).

Interpretation: (r_i - b)/a converges to the Frechet distribution, which is given by:

F*(x) = 1 - exp[ -( 1 + kx )^(1/k) ]

The likelihood (to be maximized wrt a, b and k) is given by L = f(r_1) * f(r_2) * ... * f(r_g), where

f(r_i) = (1/a) * [ 1 + k*m_i ]^(-1 + 1/k) * exp[ -( 1 + k*m_i )^(1/k) ],  i = 1, 2, ..., g

Here, as a short-hand, I have used m_i = (r_i - b)/a.

My question is: this ML estimation by differentiating L is going to be extremely messy, and the data may be poorly conditioned (e.g., the returns data may be positive, negative and of very small magnitude [~10^(-5) to 10^(-3)]). I wanted your help in performing this estimation process efficiently.

As a wrap-up, the 95% VaR would finally come to

VaR = b - (a/k) * [ 1 - {-n*log(0.95)}^k ]

but of course, I need to plug in the estimated a, b and k values here.

Any help will be sincerely appreciated. (For details, you can use Sections 7.5.2 & 7.6 of 'Analysis of Financial Time Series' by Ruey S. Tsay, 2nd edition.)

--
Preetam Pal
(+91)-9432212774
M-Stat 2nd Year, Room No. N-114
Statistics Division, C.V. Raman Hall
Indian Statistical Institute, B.H.O.S.
Kolkata.
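Since no data accompany the post, here is a hedged numerical sketch: rather than differentiating L by hand, code the negative log-likelihood from the f(r_i) given above and hand it to optim(). The period minima r below are simulated stand-ins, not real returns, and the starting values are arbitrary assumptions.

```r
# Numerical MLE sketch for the likelihood above; r holds the g = 50 period
# minima (simulated stand-ins here, not real return data).
set.seed(1)
r <- -rgamma(50, shape = 2, rate = 10)

negll <- function(p) {
  a <- exp(p[1]); b <- p[2]; k <- p[3]   # a > 0 enforced via log scale
  if (abs(k) < 1e-6) return(1e10)        # k = 0 not covered by this form
  m <- (r - b) / a
  z <- 1 + k * m
  if (any(z <= 0)) return(1e10)          # outside the support: penalize
  # log f(r_i) = -log(a) + (-1 + 1/k) * log(z_i) - z_i^(1/k)
  val <- -sum(-log(a) + (-1 + 1/k) * log(z) - z^(1/k))
  if (!is.finite(val)) 1e10 else val
}
fit <- optim(c(log(0.1), 0, -0.2), negll)
c(a = exp(fit$par[1]), b = fit$par[2], k = fit$par[3])
```

The estimated a, b and k can then be plugged into the VaR formula above; the log-scale trick for a and the penalty for leaving the support sidestep the conditioning problems the post worries about.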
[R] mle
Hello,

I am going to estimate the parameter of the count model pr(N = n) = integration{ B(x, alpha) - C(x, alpha) } by maximum likelihood estimation. n <- c(0, 1, 2, 3) and F <- c(0, 3, 4, 5) are the vectors of values and observed frequencies, respectively. The function C(x, alpha) is not defined for n = 0, but suppose C(x, alpha) = 1 when n = 0. When I want to insert this exception in the following loop, I don't get a reasonable estimate.

pari <- function(alpha) {
  nloglik <- function(alpha) {
    B <- function(x, k) {}
    C <- function(x, k) {}
    A <- function(x) {
      s <- rep(0, length(x))
      s <- s + C(x, k)
      s <- s + B(x, k)
      s
    }
    d <- 0
    for (n in seq(along = F)) {
      lik <- integrate(A, 0, 1)$value
      d <- d - F[n] * log(lik)
    }
    d
  }
  F <- c(0, 3, 4, 5)
  n <- length(F)
  mle(nloglik, start = list(alpha = alpha))
}

This program gives an answer for n = 1, 2, 3. But for n = 0 I get an error; I have to apply the exception C(x, alpha) = 1. Does anybody know where I need to put the exception in the program? (With 'if' branches, I don't get reasonable results.)

I would appreciate any help.

Best Regards,
[R] MLE
Dear Sir/Madam,

I am trying to get the maximum likelihood estimate of a parameter in a probability mass function. The problem is that my pmf includes a summation and one integral; it is not similar to other known pdfs or pmfs such as the normal, exponential, or Poisson. Does anybody know whether I can use the current packages (like maxLik) in R for getting the MLE of the parameter? Can anybody explain this to me with an example?

I would appreciate any help.

Best Regards,
Pari
Re: [R] MLE
Dear Pari,

On 21 December 2014 at 06:59, pari hesabi <statistic...@hotmail.com> wrote:
> I am trying to get the Maximum Likelihood Estimation of a parameter in a
> probability mass function. The problem is my pmf which includes a summation
> and one integral. It is not similar to other known pdfs or pmfs such as
> normal, exponential, poisson. Does anybody know whether I can use the
> current packages (like Maxlik) in R for getting the MLE of the parameter?

maxLik (not Maxlik) will probably work.

> Can anybody explain me with an example?

R> library(maxLik)
R> ?maxLik

http://dx.doi.org/10.1007/s00180-010-0217-1
https://absalon.itslearning.com/data/ku/103018/publications/maxLik.pdf

> I would appreciate any help.

http://www.R-project.org/posting-guide.html
http://maxlik.org/
https://r-forge.r-project.org/projects/maxlik/

Best regards,
Arne

--
Arne Henningsen
http://www.arne-henningsen.name
Re: [R] MLE with parameters restricted to a defined range using bbmle
Thanks, Rolf and Ben. Both solutions worked, but I finished by constraining parameter values using optim().

Best,
Bernardo Niebuhr

Em Segunda-feira, 8 de Dezembro de 2014 18:36, Rolf Turner <r.tur...@auckland.ac.nz> escreveu:
> I know nothing about the bbmle package and its mle2() function, but it is
> a general truth that if you need to constrain a parameter to be positive
> in an optimisation procedure, a simple and effective approach is to
> reparameterize using exp().
[R] MLE with parameters restricted to a defined range using bbmle
Dear all,

I am fitting models to data with the mle2 function of the bbmle package. Specifically, I want to fit a power-law distribution model, as defined here (http://arxiv.org/pdf/cond-mat/0412004v3.pdf), to data. However, one of the parameters, xmin, must necessarily be greater than zero. What can I do to restrict the possible values of a parameter that are passed to the optimizer?

Here is a sample of my code:

# Loading library
library(bbmle)

# Creating data
set.seed(1234)
data <- rexp(1000, rate = 0.1)  # The fit will not be too good, but it is just to test

# Creating the power-law distribution density function
dpowlaw <- function(x, alfa, xmin, log = FALSE) {
  c <- (alfa - 1) * xmin^(alfa - 1)
  if (log)
    ifelse(x < xmin, 0, log(c * x^(-alfa)))
  else
    ifelse(x < xmin, 0, c * x^(-alfa))
}

# Testing the function
integrate(dpowlaw, -Inf, Inf, alfa = 2, xmin = 1)
curve(dpowlaw(x, alfa = 2.5, xmin = 10), from = 0, to = 100, log = "")
curve(dpowlaw(x, alfa = 2.5, xmin = 1), from = 1, to = 100, log = "xy")

# Negative log-likelihood function
LLlevy <- function(mu, xmin) {
  -sum(dpowlaw(data, alfa = mu, xmin = xmin, log = TRUE))
}

# Fitting model to data
mlevy <- mle2(LLlevy, start = list(mu = 2, xmin = 1))

The result of model fitting here is

Coefficients:
        mu       xmin
 -916.4043   890.4248

but this does not make sense! xmin must be > 0, and mu must be > 1. What should I do?

Thanks in advance!
Bernardo Niebuhr
Re: [R] MLE with parameters restricted to a defined range using bbmle
I know nothing about the bbmle package and its mle2() function, but it is a general truth that if you need to constrain a parameter to be positive in an optimisation procedure, a simple and effective approach is to reparameterize using exp(). I.e. represent xmin as exp(lxmin) (say) and use lxmin as the argument to your objective function. This strategy rarely if ever fails to work.

cheers,
Rolf Turner

On 09/12/14 09:04, Bernardo Santos wrote:
> Dear all, I am fitting models to data with the mle2 function of the bbmle
> package. [...]

--
Rolf Turner
Technical Editor ANZJS
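Rolf's reparameterization trick, sketched on a simpler problem for illustration (an exponential rate, which must also be positive), with base-R optimize() standing in for mle2():

```r
# Sketch of the exp() reparameterization: optimize over lrate = log(rate),
# which is unconstrained, so rate = exp(lrate) can never go negative.
set.seed(1234)
x <- rexp(1000, rate = 0.1)

negll <- function(lrate) -sum(dexp(x, rate = exp(lrate), log = TRUE))
fit <- optimize(negll, interval = c(-10, 10))
exp(fit$minimum)   # near the true rate of 0.1
```

The same idea applies directly to xmin in the power-law fit: pass lxmin to mle2() and compute xmin <- exp(lxmin) inside the negative log-likelihood.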
Re: [R] MLE with parameters restricted to a defined range using bbmle
Rolf Turner <r.turner at auckland.ac.nz> writes:

> I know nothing about the bbmle package and its mle2() function, but it is a general truth that if you need to constrain a parameter to be positive in an optimisation procedure a simple and effective approach is to reparameterize using exp().

mle2() is a wrapper for R's built-in optim() function, so you can use method = "L-BFGS-B" and set the minimum values via the lower= argument. The only potentially tricky part is that you may want to set the lower bound slightly above the desired bound, as L-BFGS-B uses a finite-difference approximation to compute the gradients, and I'm not 100% sure that the finite-difference computation always respects the bounds automatically. (The finite-difference stepsize is set by the 'ndeps' control parameter and is 0.001 by default.)

> I.e. represent xmin as exp(lxmin) (say) and use lxmin as the argument to your objective function. This strategy rarely if ever fails to work. cheers, Rolf Turner

[original question and code quoted in full snipped]
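The exp() reparameterization Rolf describes can be sketched as follows. This is a minimal, self-contained illustration (not from the thread), using plain optim() on a gamma sample so it runs on its own: the optimizer works on the unconstrained log scale, and the estimates are back-transformed at the end.

```r
# Sketch of the exp() reparameterization: optimize over log-parameters
# so the underlying shape and rate are guaranteed positive.
set.seed(1)
x <- rgamma(500, shape = 3, rate = 2)

nll <- function(p) {                       # p = (log shape, log rate)
  shape <- exp(p[1]); rate <- exp(p[2])    # always positive
  -sum(dgamma(x, shape = shape, rate = rate, log = TRUE))
}

fit <- optim(c(0, 0), nll)                 # unconstrained optimization
est <- exp(fit$par)                        # back-transform the estimates
est                                        # close to c(3, 2)
```

The same trick works with mle2(): write the minus-log-likelihood in terms of the log-parameters and exponentiate inside the function body.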
[R] MLE
Hi, I have a probability mass function similar to

pr(N = n) = integral(((2-x)^n)*(exp(a*x-2))) - integral(((5-a*x)^n))

where both integrals are over the interval (0, 2) with respect to x. I want to estimate the parameter a by maximum likelihood. The log-likelihood is lnL = sum(F * log(pr(n))), where F is the vector of observed frequencies and n is the vector of inputs for the defined pmf. Can anybody suggest the fastest way of getting the MLE? I have tried this program:

n <- c(0,1,2,3,4,5,6,7,8)
F <- c(0,0,1,3,5,7,8,11,10)
loglik <- function(a) {sum(F*log(pr(n)))}
re <- maxLik(loglik, start = .5)
summary(re)

I don't know how to define the probability mass function pr(n) in the written program. I would appreciate any help. Best Regards
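One way to define pr(n) is to evaluate the two integrals numerically with integrate() inside a function of both n and a. The sketch below takes the pmf exactly as written in the message; whether it is a valid (non-negative, summing-to-one) pmf for a given value of a is not checked here, and the optimization interval is a guess.

```r
# Define pr(n, a) numerically with integrate(), vectorized over n,
# so it can be plugged into the log-likelihood.
pr <- function(n, a) {
  sapply(n, function(k) {
    i1 <- integrate(function(x) (2 - x)^k * exp(a*x - 2), 0, 2)$value
    i2 <- integrate(function(x) (5 - a*x)^k, 0, 2)$value
    i1 - i2
  })
}

n <- 0:8
F <- c(0, 0, 1, 3, 5, 7, 8, 11, 10)
loglik <- function(a) sum(F * log(pr(n, a)))
# A one-parameter problem can also use optimize() instead of maxLik():
#   optimize(loglik, interval = c(lo, hi), maximum = TRUE)
```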
[R] mle together with pnorm
Hi, the problem I have is that the standard errors for the estimates don't make any sense. Here is the background: the values in vector a are seen as the true values and I want to estimate them using mle. I make 100 disturbed vectors from a by adding noise, N(0, sigma^2). I sort every disturbed vector in decreasing order. Say that the observed order is:

y[1] > y[4] > y[5] > y[3] > y[2]

We want to calculate the probability of observing this order and we do so by:

P(Y[1]-Y[4] > 0 | a) * P(Y[4]-Y[5] > 0 | a) * P(Y[5]-Y[3] > 0 | a) * P(Y[3]-Y[2] > 0 | a)

(we ignore the dependency), where Y[1]-Y[4] ~ N(a[1]-a[4], 2*std^2) and so on. Usually when you use mle you use a density function, but here we use pnorm. The problem I have after running the code below is that the standard errors are all the same (and way too large) for the estimates. What can I have done wrong?

R code:

library(stats4)
n <- 100
x <- mat.or.vec(n, 5)
y <- mat.or.vec(n, 5)
a <- c(100, 100.5, 100.75, 100.7, 100.25)
std <- sd(a)
Var <- std^2
for(i in 1:n){
  y[i,] <- a + rnorm(5, mean = 0, sd = std)
  x[i,] <- sort(y[i,], decreasing = TRUE, index.return = TRUE)$ix
}
matris <- mat.or.vec(n, 4)
sigma <- sqrt(2*Var)
fit <- function(f1, f2, f3, f4, f5, x){
  for(i in 1:n){
    for(j in 1:4){
      P <- if(x[i,j] == 1) {f1} else if(x[i,j] == 2) {f2} else
           if(x[i,j] == 3) {f3} else if(x[i,j] == 4) {f4} else {f5}
      Q <- if(x[i,j+1] == 1) {f1} else if(x[i,j+1] == 2) {f2} else
           if(x[i,j+1] == 3) {f3} else if(x[i,j+1] == 4) {f4} else {f5}
      mu <- P - Q
      matris[i,j] <- pnorm(0, mean = mu, sd = sigma, lower.tail = FALSE, log.p = TRUE)
    }
  }
  -sum(matris)
}
mle.results <- mle(fit, start = list(f1=a[1], f2=a[2], f3=a[3], f4=a[4], f5=a[5]),
                   fixed = list(x=x))
summary(mle.results)

Best regards, Alex
[R] MLE
Hello, I know how to use R for estimating a parameter by MLE when I have a simple function f(x, a). I am trying to design a program for a more complicated function such as:

g(.) = sum(integral(f(x, a, t, k)))

where a is the parameter to be estimated, the integral depends on t, and the sum is over k. Does anybody have any idea? Thank you, Diba
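Structurally, this kind of likelihood is just a function of a that re-does the numerical integration on every evaluation. The toy sketch below is entirely made up (the integrand, the range of k, the integration limits, and the data are all hypothetical stand-ins) but shows the nesting: integrate() inside a sum over k, inside a negative log-likelihood handed to optimize().

```r
# Toy sketch: g(a, t) = sum over k of a numerical integral, then a
# negative log-likelihood over hypothetical observations 'tdat'.
set.seed(1)
tdat <- runif(50, 1, 2)                        # hypothetical observed t values

g <- function(a, t) {
  # sum over k = 1..3 of integral_0^1 f(x, a, t, k) dx, with a made-up f
  sum(sapply(1:3, function(k)
    integrate(function(x) exp(-a * x * t) * x^k, 0, 1)$value))
}

nll <- function(a) -sum(log(sapply(tdat, function(t) g(a, t))))
opt <- optimize(nll, interval = c(0.01, 10))   # 1-D minimization over a
```

The key point is that g() is recomputed, integrals and all, each time the optimizer evaluates nll(); nothing special is required beyond writing the likelihood as an ordinary R function of the parameter.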
[R] MLE for probit regression. How to avoid p=1 or p=0
Dear all: I am writing the following small function for a probit likelihood. As indicated, in order to avoid p=1 or p=0, I defined some precisions. I feel, however, that there might be a better way to do this. Any help is greatly appreciated.

## set limits to avoid px=0 or px=1
precision1 <- 0.99
precision0 <- 0.01

logpost <- function(par, data){
  px <- pnorm(b0 + b1x)
  # to avoid px=1 or px=0
  px[px > precision1] <- precision1
  px[px < precision0] <- precision0
  loga <- sum( y*log(px) + (1-y)*log(1-px) )
  loga
}

Best, Keramat Nourijelyani, PhD
Associate Professor of Biostatistics
Tehran University of Medical Sciences
http://tums.ac.ir/faculties/nourij
Re: [R] MLE for probit regression. How to avoid p=1 or p=0
Hello, you write a function of two arguments, 'par' and 'data', and do not use them in the body of the function. Furthermore, what are b0, b1x and y? Also, take a look at ?.Machine. In particular, couldn't you use

precision0 <- .Machine$double.eps
precision1 <- 1 - .Machine$double.eps

instead of 0.01 and 0.99?

Hope this helps, Rui Barradas

On 27-05-2013 16:21, knouri wrote: [original message quoted in full snipped]
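An alternative to clipping px at all, sketched below, is to compute both log(p) and log(1-p) directly on the log scale with pnorm's log.p argument, which stays finite far into the tails where pnorm() itself underflows to 0 or rounds to 1. The objects b0, b1, x and y here are hypothetical stand-ins for the poster's (undefined) objects.

```r
# Numerically stable probit log-likelihood: never forms pnorm(eta)
# on the probability scale, so no clipping is needed.
set.seed(1)
x <- rnorm(200)
y <- rbinom(200, 1, pnorm(0.5 + 2*x))          # simulated probit data

loglik <- function(b0, b1){
  eta <- b0 + b1*x
  # log P(Z <= eta) and log P(Z > eta) = log(1 - pnorm(eta))
  sum(y * pnorm(eta, log.p = TRUE) +
      (1 - y) * pnorm(eta, lower.tail = FALSE, log.p = TRUE))
}

loglik(0.5, 2)    # finite even where pnorm(eta) would underflow
```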
[R] mle function
Hallo, I'm working with the mle function and I would like to ask you a couple of questions. My goal is to construct the historical values of v1(t), v2(t) and v3(t) using maximum likelihood estimation. So, I need to optimize the following log-likelihood:

sum(E1_f[t,]*(v1*teta1 + v2*teta2 + v3*teta3) - E_f[t,]*log(1 + exp(v1*teta1 + v2*teta2 + v3*teta3)))

(E_f and E1_f are 136x111 matrices and teta1, teta2 and teta3 are 111x1 vectors). With the code below, I only obtain the result for t=1:

library(stats4)
likelihood <- function(v1, v2, v3){
  for (t in 1:136){
    return(-(sum(E1_f[t,]*(v1*teta1 + v2*teta2 + v3*teta3) -
                 E_f[t,]*log(1 + exp(v1*teta1 + v2*teta2 + v3*teta3)))))
  }
}
L_f <- mle(minuslog = likelihood, start = list(v1 = 1, v2 = 2, v3 = 3))
x <- summary(L_f)

What should I change in the code? And how can I store the values of v1(t), v2(t) and v3(t) in 3 vectors, in order to use them afterwards? Thank you very much. Roger
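The return() inside the loop exits on the first pass, which is why only t=1 is fitted. One possible fix, sketched below with small simulated stand-ins for the 136x111 matrices (none of the data here are from the thread), is to build the negative log-likelihood for a fixed row t, call mle() once per row, and store the coefficients in three vectors.

```r
# Fit row-by-row and store the per-t estimates in three vectors.
library(stats4)
set.seed(1)
Tn <- 5; p <- 20                          # small stand-ins for 136 and 111
teta1 <- rnorm(p); teta2 <- rnorm(p); teta3 <- rnorm(p)
E_f  <- matrix(rpois(Tn*p, 20), Tn, p)
eta  <- outer(rep(1, Tn), 0.3*teta1 - 0.2*teta2 + 0.1*teta3)
E1_f <- matrix(rbinom(Tn*p, as.vector(E_f), plogis(as.vector(eta))), Tn, p)

v1_hat <- v2_hat <- v3_hat <- numeric(Tn)
for (t in 1:Tn) {
  nll <- function(v1, v2, v3){            # minus log-likelihood for row t
    eta_t <- v1*teta1 + v2*teta2 + v3*teta3
    -sum(E1_f[t, ]*eta_t - E_f[t, ]*log(1 + exp(eta_t)))
  }
  fit <- mle(minuslogl = nll, start = list(v1 = 0, v2 = 0, v3 = 0))
  cf <- coef(fit)
  v1_hat[t] <- cf["v1"]; v2_hat[t] <- cf["v2"]; v3_hat[t] <- cf["v3"]
}
```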
[R] MLE with R
Hi everyone, I'm writing a thesis about financial copulas (Gaussian and t-Student copulas) but I have problems with the R code. I'll explain better: I downloaded 10 time series of financial indices and I have to apply the Gaussian copula. First I have to divide the ranks of the observations by T. Using these pseudo-observations, the copula parameters are estimated via maximum likelihood. I don't know the R code for the MLE. I tried optim but I don't have the function fn. I also tried the procedure explained by Christian Cech, but it's quite complex. How do I write the R code? Thanks
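Assuming the 'copula' package is available, the whole workflow can be quite short: pobs() builds the pseudo-observations (ranks scaled by n+1, close to the rank/T scaling described), and fitCopula() does the maximum pseudo-likelihood fit. The sketch below uses 3 simulated series (a stand-in for the 10 indices) and an exchangeable Gaussian copula; dispstr = "un" would give an unstructured correlation matrix instead.

```r
# Maximum pseudo-likelihood fit of a Gaussian copula to pseudo-observations.
library(copula)
set.seed(1)
d <- 3                                              # stand-in for 10 series
z <- rCopula(500, normalCopula(0.5, dim = d, dispstr = "ex"))
u <- pobs(z)                                        # pseudo-observations in (0,1)
fit <- fitCopula(normalCopula(dim = d, dispstr = "ex"), u, method = "mpl")
coef(fit)                                           # correlation estimate, near 0.5
```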
Re: [R] MLE of negative binomial distribution parameters
Thank you!!!
Re: [R] MLE of negative binomial distribution parameters
Zoraida <zmorales at ingellicom.com> writes:

> I need to estimate the parameters for negative binomial distribution (pdf) using maximum likelihood, I also need to estimate the parameter for the Poisson by ML, which can be done by hand, but later I need to conduct a likelihood ratio test between these two distributions and I don't know how to start! I'm not an expert programmer in R. Please help

It sounds like you might need some local help. If you're trying to fit the parameters to a single data set (i.e. no predictor variables, just a set of values), then you probably want fitdistr() from the MASS package. Modified from ?fitdistr:

library(MASS)
set.seed(123)
x4 <- rnegbin(500, mu = 5, theta = 4)
ff <- fitdistr(x4, "Negative Binomial")
ff2 <- fitdistr(x4, "Poisson")
ff
##      size         mu
##   4.2159071   4.9447685
##  (0.5043658) (0.1466082)
ff2
##     lambda
##     4.9440
##  (0.09943842)
logLik(ff)
## 'log Lik.' -1250.121 (df=2)
logLik(ff2)
## 'log Lik.' -1350.088 (df=1)

You can use the pchisq() function to compute the p-value for the likelihood ratio test (hint: use lower.tail=FALSE to compute the upper tail area ...). If you want to fit and compare negative binomial or Poisson models with covariates, use glm() and MASS::glm.nb(), or mle2() from the bbmle package ...
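Completing the pchisq() hint above, a sketch of the likelihood ratio test: the statistic is twice the log-likelihood difference, compared to a chi-squared with df equal to the difference in parameter counts (here 2 - 1 = 1). One caveat worth knowing: Poisson is the boundary case of the negative binomial (theta going to infinity), so the nominal chi-squared(1) reference is somewhat conservative.

```r
# Likelihood ratio test of negative binomial vs. Poisson, continuing
# the fitdistr() example from the reply above.
library(MASS)
set.seed(123)
x4  <- rnegbin(500, mu = 5, theta = 4)
ff  <- fitdistr(x4, "Negative Binomial")
ff2 <- fitdistr(x4, "Poisson")

lr   <- 2 * (as.numeric(logLik(ff)) - as.numeric(logLik(ff2)))  # LR statistic
pval <- pchisq(lr, df = 1, lower.tail = FALSE)                  # upper tail area
```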
[R] MLE of negative binomial distribution parameters
I need to estimate the parameters for the negative binomial distribution (pdf) using maximum likelihood. I also need to estimate the parameter for the Poisson by ML, which can be done by hand, but later I need to conduct a likelihood ratio test between these two distributions and I don't know how to start! I'm not an expert programmer in R. Please help.
Re: [R] MLE
Homework?? We don't do homework here. If not, see ?optim or look at the CRAN Optimization task view for optimizers. There is even a maxLik package that might be useful.

-- Bert

On Mon, Jul 2, 2012 at 8:58 PM, Ali Tamaddoni wrote: [original message quoted in full snipped]

-- Bert Gunter
Genentech Nonclinical Biostatistics
[R] MLE
Hi All, I have a data frame called nbd with two columns (x and T). Based on this dataset I want to find the parameters of a distribution with the following log-likelihood function, with r and alpha as its parameters:

log(gamma(nbd$x+r)) - log(gamma(r)) + r*log(alpha) - (r+nbd$x)*log(nbd$T+alpha)

The initial value for both parameters is 1. I would be thankful if you could help me to solve this problem in R. Regards, Ali
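A sketch of one way to do this with optim(), using lgamma() for the log-gamma terms and the log-reparameterization to keep r and alpha positive. The data frame nbd is simulated here (a gamma-Poisson setup that is consistent with this likelihood) purely for illustration; the true values used are r = 2, alpha = 4.

```r
# Simulated stand-in for the poster's nbd data frame: rates drawn from
# Gamma(r, alpha), counts Poisson with exposure T.
set.seed(1)
Tobs <- runif(500, 1, 10)
lam  <- rgamma(500, shape = 2, rate = 4)
nbd  <- data.frame(x = rpois(500, lam * Tobs), T = Tobs)

# Negative log-likelihood, summed over rows; p = (log r, log alpha).
nll <- function(p){
  r <- exp(p[1]); alpha <- exp(p[2])
  -sum(lgamma(nbd$x + r) - lgamma(r) + r*log(alpha) -
       (r + nbd$x)*log(nbd$T + alpha))
}

fit <- optim(c(0, 0), nll)   # start at r = alpha = 1 on the log scale
est <- exp(fit$par)          # back-transformed estimates of (r, alpha)
```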
[R] MLE for estimating the parameters of the TVECM in R
Dear Mr. Matthieu Stigler, I am very excited about your package 'tsDyn'. First let me introduce myself: I am a student at Gadjah Mada University, Indonesia. I am a new user of R and am applying it to a bivariate system (interest rate and inflation) with a threshold vector error correction model. I am writing my final examination about the threshold VECM and use as reference the paper on testing for two-regime threshold cointegration in vector error correction models by Hansen and Seo (2002) to estimate the parameters. I have derived the MLE, successfully: I now have A1(hat) and A2(hat) by MLE, and gamma(hat) and beta(hat) by grid search over the MLE. My problem: when I use the function HanSeo_TVECM() in R, it does not run fully; it only estimates the linear cointegration (VECM). And in package tsDyn version 0.8-1, HanSeo_TVECM() is not available; there is a TVECM() function, but it uses CLS to estimate the parameters. Would the MLE and CLS estimation give the same relative values? Can you help me, sir, with the HanSeo_TVECM() function?
Thanks a lot. R output from HanSeo_TVECM:

HanSeo_TVECM(data, lag = 2, trim = 0.05, gn = 300, bn = 300)

### Linear VECM estimates (by Johansen MLE)
Cointegrating vector: 1, -8.434287
Standard error of the cointegrating value: 2.616888

Parameters:
                     ECM     Intercept      bi -1       inf -1       bi -2        inf -2
Equation bi   -0.008627598 -0.006055985   0.758366   0.02512693  -0.1240975  -0.007603223
Equation inf   0.131374112 -0.355994435   3.971587   0.20726565  -2.7661240  -0.224955971

Standard errors (with Eicker-White covariance matrix estimator):
                     ECM     Intercept      bi -1       inf -1       bi -2        inf -2
Equation bi    0.004081985   0.01663758  0.08914615   0.02456267  0.08221155   0.01799844
Equation inf   0.034760850   0.08915086  1.66241171   0.19362012  0.93278372   0.06298056

Negative LL: -183.1256   AIC: -159.1256   BIC: -160.4877

Error in solve.default(t(zzj) %*% zzj) :
  system is computationally singular: reciprocal condition number = 1.15007e-020
Re: [R] MLE where loglikelihood function is a function of numerical solutions
Hi Berend, thank you for your reply.

2011/4/13 Berend Hasselman <b...@xs4all.nl>:

> Questions: 1. why are you defining Bo within a loop? 2. Why are you doing library(nleqslv) within the loop?

Yes, I see what you mean. There's no reason for defining that within the loop.

> Doing both those statements outside the loop once is more efficient. In your transdens function you are not using the function argument parameters, why? Shouldn't there be a with(parameters) since otherwise where is for example K_vv supposed to come from? I can't believe that the code worked: in the call of nleqslv you have ... control=list(matxit=10) ... It should be maxit and nleqslv will issue an error message and stop (at least in the latest versions). And why 10? If that is required, something is not ok with starting values and/or functions.

I get the error message 6, that the Jacobian is singular, which I think stems from the parameter values. The parameter values are from a similar study and serve only as a starting point. The 10 was just me playing around.

> Finally the likelihood function at the end of your code:
>
> # Maximum likelihood estimation using mle package
> library(stats4)
> # defining log-likelihood function
> # T <- length(v)
> # minuslogLik <- function(x, x2)
> # { f <- rep(NA, length(x))
> #   for(i in 1:T)
> #   {
> #     f[1] <- -1/T*sum(log(transdens(parameters = parameters, x = c(v[i], v[i+1]))) -
> #                      log(Jac(outmat = outmat, x2 = c(v[i], r[i]))))
> #   }
> #   f
> # }
>
> How do the arguments of your function x and x2 influence the calculations in the likelihood function?

What I was thinking was that x and x2 would be the input for the transdens() and Jac() functions, but I guess that is already taken care of when defining them.

> As written now, with arguments x and x2 not being used in the body of the function, there is nothing to optimize.

That is correct and that is the problem. The likelihood needs to be stated as a function of the parameters, here the vector 'parameters', because in the maximum likelihood estimation we want to maximize the value of the likelihood function by changing the parameter values. The likelihood function is a function of the parameters, but only through the functions: for example, Kristian() is a function of 'parameters', which feeds into Bo(), transdens() and Jac(). Do you have any suggestions how to get around this issue?

> Shouldn't f[1] be f[i], because otherwise the question is why are you looping for(i in 1:T)? But then returning f as a vector seems wrong here. Shouldn't a likelihood function return a scalar?

The likelihood function should return a scalar. I think the fix could be to make the function calculate the function value at each i and then return the mean of all the f[i]s.

Kristian
Re: [R] MLE where loglikelihood function is a function of numerical solutions
On 14-04-2011, at 09:00, Kristian Lind wrote:

> Hi Berend, thank you for your reply.
> ..
> What I was thinking was that x and x2 would be the input for the transdens() and Jac() functions, but I guess that is already taken care of when defining them.

Huh? If you define zz <- function(x) {...}, then x is an argument which must be specified when using the function.

> That is correct and that is the problem. The likelihood needs to be stated as a function of the parameters, here the vector 'parameters'. Because in the maximum likelihood estimation we want to maximize the value of the likelihood function by changing the parameter values. The likelihood function is a function of the parameters but only through the functions, for example Kristian() is a function of parameters which feeds into Bo(), transdens() and Jac(). Do you have any suggestions how to get around this issue?

What kind of problem? Why don't you then do (with parameters and outmat already known globally)

f[i] <- -1/T*sum(log(transdens(parameters = parameters, x = x)) -
                 log(Jac(outmat = outmat, x2 = x2)))

You should pass the arguments used by the optimizer in calling your likelihood function on to the functions you defined. That way f[] will depend on x and x2, and so will the likelihood.

> The likelihood function should return a scalar. I think the fix could be to make the function calculate the function value at each i and then make it return the mean of all the f[i]s.

Is the mean a likelihood?

Berend
Re: [R] MLE where loglikelihood function is a function of numerical solutions
Albyn and others, thank you for your replies. To be more specific, I've constructed my program below. I know it's long and in some places quite messy. It works until the last part, where the log-likelihood function has to be defined and maximized w.r.t. the parameters. The log-likelihood has the form

L = 1/T * sum_{t=0}^{T} [log(transdens(v_t, r_t)) - log(Jac(r_t, v_t))]

where the functions transdens(v_t, r_t) and Jac(r_t, v_t) are defined below. The problem remains the same: how do I construct the log-likelihood function such that the numerical procedures employed are updated in the maximization procedure? I was thinking something along the lines of

minuslogLik <- function(x, x2)
{
  f <- rep(NA, length(x))
  for(i in 1:T)
  {
    f[1] <- -1/T*sum(log(transdens(parameters = parameters, x = c(v[i], v[i+1]))) -
                     log(Jac(outmat = outmat, x2 = c(v[i], r[i]))))
  }
  f
}

Thank you in advance. Kristian

-- Program starts here --

# Solving ODEs
parameters <- c(K_vv <- 0.0047, K_rv <- -0.0268, K_rr <- 0.3384,
                theta_v <- 50, theta_r <- 5.68,
                Sigma_rv <- 0.0436, Sigma_rr <- 0.1145,
                lambda_v <- 0, lambda_r <- -0.0764,
                k_v <- K_vv*theta_v, k_r <- K_rv*theta_v + K_rr*theta_r,
                alpha_r <- 0, B_rv <- (Sigma_rr + Sigma_rv)^2)

# declaring state variables, i.e. the functions and their initial conditions
state <- c(b_1 = 0, b_2 = 0, a = 0)

# declaring function and system of equations
Kristian <- function(t, state, parameters) {
  with(as.list(c(state, parameters)), {
    db_1 = -((K_vv + lambda_v)*b_1 +
             (K_rv + Sigma_rv*lambda_v + Sigma_rr*lambda_r)*b_2 +
             0.5*b_1^2 + Sigma_rv*b_1*b_2 +
             0.5*(Sigma_rv^2 + Sigma_rr^2)*b_2^2)
    db_2 = -K_rr*b_2 + 1
    da = K_vv*theta_v*b_1 + (K_rv*theta_v + K_rr*theta_r)*b_2
    list(c(db_1, db_2, da))
  })
}

# time: a sequence from t to T evaluated at each delta, seq(t, T, by = delta)
times <- seq(0, 10, by = 0.5)

# solving the model using the deSolve function ode()
library(deSolve)
outmat <- ode(y = state, times = times, func = Kristian, parms = parameters)
# print(outmat)

# Solving 2 equations in 2 unknowns using nleqslv
# simulating data for testing
c2 <- matrix(data = rnorm(20, mean = 0, sd = 1) + 2, nrow = 20, ncol = 1)
c10 <- matrix(data = rnorm(20, mean = 0, sd = 1) + 5, nrow = 20, ncol = 1)
besselinput <- matrix(data = 0, nrow = 20, ncol = 1)
vncChi <- matrix(data = 0, nrow = 20, ncol = 1)
v <- matrix(data = 0, nrow = 20, ncol = 1)
r <- matrix(data = 0, nrow = 20, ncol = 1)

for(i in 1:20)
{
  Bo <- function(x, s, outmat) {
    f <- rep(NA, length(x))
    z <- -outmat[, 2:4] %*% c(x[2], x[1], 1)
    f[1] <- (1 - exp(z[4]))/sum(exp(z[1:4])) - s[1]
    f[2] <- (1 - exp(z[20]))/sum(exp(z)) - s[2]
    f
  }
  s <- c(c2[i], c10[i])
  p <- c(50, 5)
  # loading nleqslv package
  library(nleqslv)
  ans.nlq <- nleqslv(x = p, fn = Bo, s = s, outmat = outmat,
                     control = list(matxit = 10))
  v[i] <- ans.nlq$x[1]
  r[i] <- ans.nlq$x[2]
  # print(ans.nlq$termcd)
  # ans.nlq$fvec
}

# calculating transition density as a function of parameters
transdens <- function(parameters, x)
{
  delta <- 1
  c <- 2*K_vv*(1 - exp(-K_vv*delta))^(-1)   # 2.004704
  q <- 2*k_v - 1                            # 0.00959666
  f <- rep(NA, 1)
  # besselinput[2] <- 2*c*(x[2]*x[1]*exp(-K_vv*delta))^(0.5)
  f[1] <- c*exp(c*(x[2] + exp(-K_vv*delta)*x[1])) *
          (x[2]/(exp(-K_vv*delta)*x[1]))^(q/2) *
          besselI(2*c*(x[2]*x[1]*exp(-K_vv*delta))^(0.5), q, expon.scaled = FALSE)
  f
}
vncChi <- transdens(parameters = parameters, x = c(v[1], v[2]))
print(vncChi)

# calculating determinant of Jacobian as a function of outmat
Jac <- function(outmat, x2){
  f <- rep(NA, 1)
  y <- -outmat[, 2:4] %*% c(x2[1], x2[2], 1)
  w1 <- outmat[, 2]*exp(-outmat[, 2:4] %*% c(x2[1], x2[2], 1))
  w2 <- outmat[, 3]*exp(-outmat[, 2:4] %*% c(x2[1], x2[2], 1))
  f[1] <- abs((outmat[5,2]*exp(y[4])/sum(exp(y[1:4])) +
               (1 - exp(y[4]))*sum(w1[1:4])/sum(exp(y[1:4]))^2) *
              (outmat[21,3]*exp(y[21])/sum(exp(y)) +
               (1 - exp(y[21]))*sum(w2)/sum(exp(y))^2) -
              (outmat[21,2]*exp(y[21])/sum(exp(y)) +
               (1 - exp(y[21]))*sum(w1)/sum(exp(y))^2) *
              (outmat[5,3]*exp(y[4])/sum(exp(y[1:4])) +
               (1 - exp(y[4]))*sum(w2[1:4])/sum(exp(y[1:4]))^2))
  f
}

# -- Program works as it is supposed to until here --

# Maximum likelihood estimation using mle package
library(stats4)
# defining log-likelihood function
# T <- length(v)
# minuslogLik <- function(x, x2)
# { f <- rep(NA, length(x))
#   for(i in 1:T)
#   {
#     f[1] <- -1/T*sum(log(transdens(parameters = parameters, x = c(v[i], v[i+1]))) -
#                      log(Jac(outmat = outmat, x2 = c(v[i], r[i]))))
#   }
#   f
# }

2011/4/10 Albyn Jones <jo...@reed.edu>: to clarify: by if you knew that LL(psi+eps) were well
Re: [R] MLE where loglikelihood function is a function of numerical solutions
Questions:

1. Why are you defining Bo within a loop?
2. Why are you doing library(nleqslv) within the loop?

Doing both those statements outside the loop once is more efficient. In your transdens function you are not using the function argument 'parameters'; why? Shouldn't there be a with(parameters), since otherwise where is, for example, K_vv supposed to come from? I can't believe that the code worked: in the call of nleqslv you have ... control=list(matxit=10) ... It should be maxit, and nleqslv will issue an error message and stop (at least in the latest versions). And why 10? If that is required, something is not ok with starting values and/or functions.

Finally, the likelihood function at the end of your code:

# Maximum likelihood estimation using mle package
library(stats4)
# defining log-likelihood function
# T <- length(v)
# minuslogLik <- function(x, x2)
# { f <- rep(NA, length(x))
#   for(i in 1:T)
#   {
#     f[1] <- -1/T*sum(log(transdens(parameters = parameters, x = c(v[i], v[i+1]))) -
#                      log(Jac(outmat = outmat, x2 = c(v[i], r[i]))))
#   }
#   f
# }

How do the arguments of your function, x and x2, influence the calculations in the likelihood function? As written now, with arguments x and x2 not being used in the body of the function, there is nothing to optimize. Shouldn't f[1] be f[i], because otherwise why are you looping for(i in 1:T)? But then returning f as a vector seems wrong here. Shouldn't a likelihood function return a scalar?

Berend
[R] MLE where loglikelihood function is a function of numerical solutions
Hi there, I'm trying to solve an ML problem where the likelihood function depends on two numerical procedures, and I'm having some problems figuring out how to do this. The log-likelihood function is of the form

L(c, psi) = 1/T * sum [log(f(c, psi)) - log(g(c, psi))]

where c is a 2xT matrix of data and psi is the parameter vector. f(c, psi) is the transition density, which can be approximated. The problem is that in order to approximate it we need to first numerically solve 3 ODEs, and second numerically solve 2 non-linear equations in two unknowns w.r.t. the data. The g(c, psi) function is known, but depends on the numerical solutions. I have solved the ODEs using the deSolve package and the 2 non-linear equations using the BB package, but the results depend on the parameters. How can I write a program that will maximise this log-likelihood function, taking into account that the numerical procedures need to be updated for each iteration of the maximization procedure? Any help will be much appreciated. Kristian
Re: [R] MLE where loglikelihood function is a function of numerical solutions
Hi Kristian, the obvious approach is to treat it like any other MLE problem: evaluation of the log-likelihood is done as often as necessary for the optimizer you are using, e.g. a call to optim(psi, LL, ...) where LL(psi) evaluates the log-likelihood at psi. There may be computational shortcuts that would work if you knew that LL(psi+eps) were well approximated by LL(psi), for the values of eps used to evaluate numerical derivatives of LL. Of course, then you might need to write your own custom optimizer.

albyn

Quoting Kristian Lind <kristian.langgaard.l...@gmail.com>: [original message quoted in full snipped]
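Albyn's point can be illustrated with a deliberately trivial toy (nothing here is Kristian's actual model): the "inner numerical procedure" is just re-run inside the likelihood function on every evaluation, and the outer optimizer never needs to know it is there.

```r
# Toy sketch: an inner numerical solve (uniroot) nested inside the
# log-likelihood, re-executed each time the optimizer evaluates it.
set.seed(1)
y <- rnorm(200, mean = 2)

nll <- function(psi) {
  # stand-in inner procedure: solve m - psi = 0 for m numerically
  m <- uniroot(function(m) m - psi, c(-100, 100))$root
  -sum(dnorm(y, mean = m, log = TRUE))       # plug the solution in
}

opt <- optimize(nll, c(-10, 10))
opt$minimum   # the MLE of a normal mean, so it lands near mean(y)
```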
Re: [R] MLE where loglikelihood function is a function of numerical solutions
To clarify: by "if you knew that LL(psi + eps) were well approximated by LL(psi), for the values of eps used to evaluate numerical derivatives of LL" I mean that the derivatives of LL(psi + eps) are close to the derivatives of LL(psi), and perhaps you would want the Hessian to be close as well. albyn
[R] MLE estimation
Dear List, This problem is more a statistics question than an R one. Can anyone recommend some references, websites, or online papers on maximum likelihood estimation? I'm working on that now, but I still don't see how to prove that the estimated parameters are normally distributed. Thanks for your time and help! Best, Ning
Re: [R] MLE estimation
Check out Casella and Berger's Statistical Inference (ISBN 978-81-315-0394-2), or http://en.wikipedia.org/wiki/Maximum_likelihood as an online reference. --Mark J. Lamias
Re: [R] MLE estimation
I am partial to Gary King's book: Unifying Political Methodology: The Likelihood Theory of Statistical Inference (University of Michigan Press, 1998). Cheers, David Cross d.cr...@tcu.edu www.davidcross.us
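On the practical side of the asymptotic normality Ning asks about: the observed information (the Hessian of the negative log-likelihood at the optimum) yields the usual large-sample normal approximation and Wald intervals. A minimal sketch on made-up exponential data (the model and numbers are illustrative only, not Ning's):

```r
## Sketch: approximate normality of an MLE via the observed information.
set.seed(1)
x <- rexp(500, rate = 2)
## Negative log-likelihood, parametrized by log(rate) to avoid constraints.
nll <- function(log.rate) -sum(dexp(x, rate = exp(log.rate), log = TRUE))
fit <- optim(0, nll, method = "BFGS", hessian = TRUE)
rate.hat <- exp(fit$par)                  # the MLE of the rate
se.log   <- sqrt(1 / drop(fit$hessian))   # approx. SE on the log scale
## Approximate 95% Wald interval, back-transformed to the rate scale:
exp(fit$par + c(-1.96, 1.96) * se.log)
```

The formal statement (MLE asymptotically normal with variance the inverse Fisher information, under regularity conditions) is in the references above.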
[R] MLE
Hi, I am having problems carrying out an MLE for 3 parameters in a non-homogeneous Poisson process. I am trying to use the optim function to minimise the negative log-likelihood. When I use assumed values of my three parameters (20,1,1) the negative log-likelihood function returns a value of 1309122, but if I then use these values as a starting point in the optim function the parameter estimates and the function value are much lower. Below is a summary of the output: optim(c(20,1,1), fn=Poisson.lik, gr=NULL, method="Nelder-Mead", w=w, t1=t1, t2=t2) $par [1] 0.004487104 -2.657468035 12.186003355 $value [1] 289.6901 $counts function gradient 220 NA $convergence [1] 0 $message NULL There were 50 or more warnings (use warnings() to see the first 50) Where the warnings are: 1: In log(((theta0 * w * t2[i]) - (theta1 * cos(w * t2[i])) + ... : NaNs produced I thought I was using optim correctly but obviously not! Does anyone have any suggestions as to what to try? Thanks, Doug
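The NaN warnings arise when the search hands log() a non-positive intensity. One common remedy is to return a large finite penalty instead, so Nelder-Mead simply backs away from infeasible parameter values. A sketch, with a hypothetical stand-in for the poster's Poisson.lik (which is not shown in the post):

```r
## Hypothetical stand-in for the poster's Poisson.lik: any objective
## whose log() can go NaN when the intensity turns negative.
Poisson.lik <- function(theta, w, t1, t2)
  -sum(log(theta[1] + theta[2] * cos(w * t2) + theta[3]))
## Safe wrapper: replace NaN/Inf with a large finite penalty so the
## simplex moves away from infeasible regions instead of warning.
safe.nll <- function(theta, w, t1, t2) {
  val <- suppressWarnings(Poisson.lik(theta, w = w, t1 = t1, t2 = t2))
  if (!is.finite(val)) 1e10 else val
}
w <- 2 * pi; t1 <- 0; t2 <- seq(0.1, 10, by = 0.1)
fit <- optim(c(20, 1, 1), safe.nll, method = "Nelder-Mead",
             w = w, t1 = t1, t2 = t2)
```

A reparametrization that keeps the intensity positive by construction (e.g. optimizing log-parameters) is usually preferable to a penalty when the model allows it.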
[R] mle
Hi, I am looking for some help regarding the use of the mle function. I am trying to get MLEs for 3 parameters (theta0, theta1 and theta2) that have been defined in the log-likelihood equation as theta0=theta[1], theta1=theta[2] and theta2=theta[3]. My R code for mle is: mle(Poisson.lik, start=list(theta=c(20,1,1), method="Nelder-Mead", fixed=list(w=w, t1=t1, t2=t2)) But I keep getting the following error: Error in eval(expr, envir, enclos) : argument is missing, with no default I am trying to maximise over theta starting at (20,1,1) for fixed w, t1 and t2. Can anyone shed some light as to what is going on? Thanks, Doug
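Two things are going wrong in the call above: a misplaced parenthesis puts method and fixed inside the start list, and stats4::mle() expects a minuslogl whose formal arguments are the individual named parameters, not a single vector theta. A sketch of the shape the call should take (Poisson.lik, the data, and the stand-in objective below are hypothetical, since the poster's function is not shown):

```r
library(stats4)
## Hypothetical stand-in for the poster's Poisson.lik.
Poisson.lik <- function(theta, w, t1, t2) {
  lam <- exp(theta[1] + theta[2] * sin(w * t2) + theta[3] * cos(w * t2))
  sum(lam) - sum(log(lam))    # toy NHPP-style negative log-likelihood
}
w <- 1; t1 <- 0; t2 <- sort(runif(60, 0, 10))
## mle() wants one named formal per free parameter; the data are simply
## captured from the enclosing environment rather than passed as "fixed".
nll <- function(theta0 = 0, theta1 = 0.5, theta2 = 0.5)
  Poisson.lik(c(theta0, theta1, theta2), w = w, t1 = t1, t2 = t2)
fit <- mle(nll, start = list(theta0 = 0, theta1 = 0.5, theta2 = 0.5),
           method = "Nelder-Mead")
coef(fit)
```

The "argument is missing, with no default" error is exactly what mle() produces when the parentheses leave it without a properly closed start list.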
Re: [R] mle question
Hello, is there somebody who can help me with my question (see below)? Antje On 1 February 2011 09:09, Antje Niederlein niederlein-rs...@yahoo.de wrote: Hello, I tried to use mle to fit a distribution (zero-inflated negbin for count data). My call is very simple: mle(ll). ll() takes the three parameters I'd like to have estimated (size, mu and prob). But within the ll() function I have to judge whether the current parameter set gives a nice fit or not, so I have to apply the parameters to the observation data. But how does the method know about my observed data? The mle() examples define this data outside of the method and it works. For a simple example it was fine, but when it comes to a loop (tapply) providing different sets of observation data, it doesn't work anymore. I'm confused - is there any way to do better? Here is a little example which shows my problem: # R code: lambda.data <- runif(10, 0.5, 10) ll <- function(lambda = 1) { cat("x in ll()", x, "\n"); y.fit <- dpois(x, lambda); sum((y - y.fit)^2) } lapply(1:10, FUN = function(x) { raw.data <- rpois(100, lambda.data[x]); freqTab <- count(raw.data); x <- freqTab$x; y <- freqTab$freq / sum(freqTab$freq); cat("x in lapply", x, "\n"); fit <- mle(ll); coef(fit) }) Can anybody help? Antje
[R] mle question
Hello, I tried to use mle to fit a distribution (zero-inflated negbin for count data). My call is very simple: mle(ll). ll() takes the three parameters I'd like to have estimated (size, mu and prob). But within the ll() function I have to judge whether the current parameter set gives a nice fit or not, so I have to apply the parameters to the observation data. But how does the method know about my observed data? The mle() examples define this data outside of the method and it works. For a simple example it was fine, but when it comes to a loop (tapply) providing different sets of observation data, it doesn't work anymore. I'm confused - is there any way to do better? Here is a little example which shows my problem: # R code: lambda.data <- runif(10, 0.5, 10) ll <- function(lambda = 1) { cat("x in ll()", x, "\n"); y.fit <- dpois(x, lambda); sum((y - y.fit)^2) } lapply(1:10, FUN = function(x) { raw.data <- rpois(100, lambda.data[x]); freqTab <- count(raw.data); x <- freqTab$x; y <- freqTab$freq / sum(freqTab$freq); cat("x in lapply", x, "\n"); fit <- mle(ll); coef(fit) }) Can anybody help? Antje
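The usual answer to "how does ll() know about my data?" is lexical scoping: build the objective with a factory function, so each closure carries its own copy of the data. A sketch of that pattern (it keeps Antje's least-squares criterion but parametrizes by log(lambda) so the optimizer cannot wander negative; base table() stands in for plyr's count()):

```r
library(stats4)
## Factory: each call returns a fresh objective that closes over
## its own (xs, ys), so nothing leaks between loop iterations.
make.ll <- function(xs, ys) {
  function(loglambda = 0) sum((ys - dpois(xs, exp(loglambda)))^2)
}
set.seed(42)
lambda.data <- runif(10, 0.5, 10)
fits <- sapply(lambda.data, function(lam) {
  raw <- rpois(100, lam)
  tab <- table(raw)
  xs  <- as.integer(names(tab))          # observed counts
  ys  <- as.numeric(tab) / sum(tab)      # relative frequencies
  exp(coef(mle(make.ll(xs, ys))))        # back-transform to lambda
})
```

Note also that in the original example the loop variable x is overwritten inside the anonymous function, which makes the scoping even harder to follow; the factory avoids that entirely.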
[R] MLE for multivariate t-distribution
Hey, I've got a problem with the estimation of a multivariate t-distribution. I've got 200 observations for 20 variables. Now I want to estimate the parameters of the density function of the multivariate t-distribution with mean = 0. I found a function mst.mle in the package sn, but it is for a skewed t-distribution and the mean is also estimated. I need a function which estimates only the degrees of freedom and the covariance matrix / Omega. Can anybody help me please? Thanks a lot!
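One hedged possibility: MASS::cov.trob estimates the scatter matrix of a multivariate t for a given number of degrees of freedom and accepts a fixed location vector, so the df can be profiled over a grid. This is a sketch under the assumption that cov.trob's $cov can serve as the scatter matrix Omega in the t density (worth checking against ?cov.trob); the data here are a toy stand-in with the stated dimensions.

```r
library(MASS)     # cov.trob
library(mvtnorm)  # dmvt
set.seed(1)
X <- matrix(rt(200 * 20, df = 5), 200, 20)   # toy stand-in: 200 obs, 20 vars
p <- ncol(X)
## Profile over the df: for each candidate nu, estimate the scatter with
## the location held fixed at zero, then score the t log-likelihood.
nus  <- 3:30
prof <- sapply(nus, function(nu) {
  Omega <- cov.trob(X, center = rep(0, p), nu = nu)$cov
  sum(dmvt(X, delta = rep(0, p), sigma = Omega, df = nu, log = TRUE))
})
nu.hat <- nus[which.max(prof)]
```

Plotting prof against nus also shows how flat the likelihood typically is in the df, echoing the advice elsewhere in this thread.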
[R] MLE optimization
Folks, I'm kind of a newbie in R, but with some background in Matlab and VBA programming. Last month I was implementing a maximum likelihood estimation in Matlab, but the algorithms didn't converge, so my academic advisor suggested using R. My problem is to estimate the parameters of a mean-reverting jump diffusion. I've succeeded in deriving the likelihood function (which looks like a Gaussian mixture) and it is implemented in R. My main doubts are related to the inputs and outputs this function should have. For instance, in Matlab the function takes the parameters as input and outputs the likelihood using the sample data (imported within the function). In order to make R optimizers work I should, apparently, write a function that takes the parameters and the sample data as input and outputs the likelihood. Is that correct? Could someone reply with example code which exemplifies the type of function I should write and the syntax to optimize? Alternatively, could anyone suggest a good MLE tutorial and package? Thankfully, JC
Re: [R] MLE optimization
> should write a function that uses the parameters and the sample data as input and outputs the likelihood. Is it correct? Yes, that is correct. Take a look at the optim() function: ?optim. What type of convergence problems did you experience with Matlab? I am not sure that using R can overcome fundamental modeling and computational issues, such as over-specification of the model for the data at hand. But maybe you need to impose constraints on the parameters if you are fitting a Gaussian mixture. Another option is to use packages that are specially designed to model finite mixtures, such as flexmix or mixtools. Ravi. --- Ravi Varadhan, Ph.D. Assistant Professor, The Center on Aging and Health Division of Geriatric Medicine and Gerontology Johns Hopkins University Ph: (410) 502-2619 Fax: (410) 614-9625 Email: rvarad...@jhmi.edu Webpage: http://www.jhsph.edu/agingandhealth/People/Faculty_personal_pages/Varadhan.html
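The pattern Ravi confirms, as a minimal sketch: write a negative log-likelihood f(parameters, data) and let optim pass the data through its "..." argument unchanged at every iteration. A toy normal likelihood stands in for the jump-diffusion one, which is the poster's own.

```r
## Negative log-likelihood taking parameters first, data second.
## sd is optimized on the log scale so it stays positive.
nll <- function(par, x)
  -sum(dnorm(x, mean = par[1], sd = exp(par[2]), log = TRUE))
set.seed(1)
x <- rnorm(200, mean = 3, sd = 2)
## optim forwards x = x to nll on every evaluation.
fit <- optim(c(0, 0), nll, x = x, method = "BFGS", hessian = TRUE)
c(mu = fit$par[1], sigma = exp(fit$par[2]))
```

The hessian = TRUE argument additionally gives the observed information, from which approximate standard errors follow.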
Re: [R] MLE optimization
Hello On 1/4/10, jckval jcnogueirafi...@gmail.com wrote: Alternatively, could anyone suggest a good MLE tutorial and package? Search for MLE on Rseek.org and, among other results, check the Task Views. Also, search for MLE in vignettes on RSiteSearch [1]. [1] http://finzi.psych.upenn.edu/nmz.html To get a good list of MLE-related functions in R, try something similar to the following (you can also do this graphically with RcmdrPlugin.sos): require(sos) .sos <- findFn('mle') found 642 matches; retrieving 20 pages, 400 matches. 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 grepFn('mix', .sos, column='Description', ignore.case=TRUE) summary(.sos) Call: findFn(string = "mle") Total number of matches: 642 Downloaded 400 links in 121 packages. Packages with at least 8 matches using pattern 'mle': Package Count MaxScore TotalScore Date 1 stats 415 73 347 2009-11-26 2 FitAR 15 27 198 2008-11-20 3 sn 15 27 179 2005-07-18 4 wle 14 80 392 2006-09-01 5 MLEcens 13 56 308 2007-08-31 6 ghyp 13 45 206 2009-10-19 7 circular 11 32 107 2007-08-31 8 degreenet 11 13 76 2008-01-24 9 gbs 9 27 169 2008-05-10 10 SpatialExtremes 8 25 41 2009-06-13 11 DCluster 8 13 50 2009-01-15 12 distrMod 8 12 46 2009-11-05 To get more acquainted with R and user-defined functions, you might try [1] and [2]. HTH Liviu [1] http://www.statmethods.net/management/userfunctions.html [2] http://rforsasandspssusers.com/
Re: [R] MLE optimization
First of all, thanks for your answer! About the optimization problem, I'm pretty careful about constraints on the parameters. Papers usually do this kind of estimation just restricting the weight of the (bivariate) mixture to between 0 and 1, but this can lead to some strange results, like negative variance. That kind of constraint seems pretty easy to implement, since optim's method "L-BFGS-B" accepts lower and upper bounds for the parameters. Another acceptable restriction is to set one variance as a multiple of the other, like sig1 = k*sig2, with k different from 0 and infinity. The problem is that I can't figure out how to implement this kind of constraint in R... Since I have 6 parameters to estimate and usual bivariate Gaussian mixture problems have just 5, I guess packages designed for finite mixtures could be used to reduce the dimension of the optimization: by estimating the variance of the first Gaussian (sig1) via flexmix or mixtools I could generate a (probably linear) constraint for my problem (since sig1 in my estimation is a combination of two jump-diffusion parameters). Is it easy to declare linear constraints in R optimization? Anyway, I really appreciate your help and attention. Thankfully, JC
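One way to impose sig1 = k*sig2 with k in (0, infinity), and 0 < w < 1, is to optimize over unconstrained transforms, so no explicit constraint machinery is needed at all. A sketch, with a hypothetical 2-component Gaussian mixture standing in for the poster's 6-parameter jump-diffusion likelihood:

```r
## Hypothetical stand-in for the poster's mixture negative log-likelihood.
set.seed(1)
x <- c(rnorm(300, 0, 1), rnorm(100, 3, 2))
nll <- function(w, mu1, mu2, sig1, sig2)
  -sum(log(w * dnorm(x, mu1, sig1) + (1 - w) * dnorm(x, mu2, sig2)))
## Unconstrained reparametrization: plogis keeps 0 < w < 1, exp keeps
## sig2 > 0 and the ratio k = sig1/sig2 > 0, so sig1 = k * sig2 holds
## by construction at every iteration.
wrapped <- function(p) {
  w <- plogis(p[1]); k <- exp(p[2]); sig2 <- exp(p[3])
  nll(w, mu1 = p[4], mu2 = p[5], sig1 = k * sig2, sig2 = sig2)
}
fit <- optim(c(0, 0, 0, 0, 2), wrapped, method = "BFGS")
```

For genuinely linear inequality constraints, ?constrOptim takes them in the form ui %*% theta - ci >= 0.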
Re: [R] MLE for a t distribution
Brian Ripley, on this list or elsewhere, has sometimes suggested reparametrizing as 1/k. I have used that with good results. But you should be aware that data usually contain very little information about k, so that if you do not have a lot more than 100 observations you could be out of luck. You should try to plot the likelihood as a function of k, possibly also the profile likelihood. Kjetil
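Kjetil's suggested diagnostic can be sketched as follows: profile the log-likelihood over inv.k = 1/k on a grid, maximizing over mu and log(sigma) at each grid point. The data below are a toy stand-in.

```r
## Profile log-likelihood of a location-scale t over inv.k = 1/df.
set.seed(1)
x <- 5 + 2 * rt(150, df = 8)
grid <- seq(0.01, 0.9, length.out = 40)
prof <- sapply(grid, function(inv.k) {
  ## maximize over mu = p[1] and log-sigma = p[2] with df held fixed;
  ## log density of mu + sigma*T is dt((x-mu)/sigma, df, log) - log(sigma)
  nll <- function(p)
    -sum(dt((x - p[1]) / exp(p[2]), df = 1 / inv.k, log = TRUE) - p[2])
  -optim(c(median(x), log(mad(x))), nll)$value
})
plot(grid, prof, type = "l", xlab = "1/k", ylab = "profile log-likelihood")
```

A flat profile over much of the grid is the visual signature of the "data contain very little information about k" point above.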
[R] MLE for a t distribution
Given X1,...,Xn ~ t_k(mu, sigma), a Student t distribution with k degrees of freedom, mean mu and standard deviation sigma, I want to obtain the MLEs of the three parameters (mu, sigma and k). When I try traditional optimization techniques I don't find the MLEs; usually I just get k -> infinity. Does anybody know of any algorithms/functions in R that can help me obtain the MLEs? I am especially interested in the MLE for k, the degrees of freedom. Thank you! Barbara
Re: [R] MLE for a t distribution
k -> infinity gives the normal distribution. You probably don't care much about the difference between, say, k = 1000 and k = 10^6, so you might try reparametrizing df on [1, infinity) to a parameter on [0, 1]... albyn
Re: [R] MLE for a t distribution
Thank you. I actually found fitdistr() in the package MASS, which estimates the df, but it does a very bad job. I know that the main problem is that the t likelihood has a lot of local maxima, and of course when k -> infinity we have the normal distribution, which has nice and easy-to-obtain MLEs. I will try re-parametrizing k, but I doubt this will solve the problem with the multiple local maxima. I would like to implement something like the EM algorithm to get around this problem, but I don't know how to do that. Barbara
Re: [R] MLE in R
Hello Liang, Besides looking at ?optim, you may also want to look into this nice worked example: www.mayin.org/ajayshah/KB/R/documents/mle/mle.html Regards, Francisco Francisco J. Zagmutt Vose Consulting 1643 Spruce St., Boulder, CO 80302, USA www.voseconsulting.com
[R] MLE in R
Hi, dear R users, I am a newbie in R and I need to use the method of maximum likelihood to fit a Weibull distribution to my survival data. I use optim as follows: optim(c(1.14, 0.25), weibull.like, mydata=mydata, method="L-BFGS-B", hessian=TRUE) My question is: how do I set up the constraint that the two parameters of the Weibull must be positive? Many thanks! Any comments are greatly appreciated! Steven
Re: [R] MLE in R
Please read ?optim, in particular its arguments 'lower' and 'upper': "Bounds on the variables for the 'L-BFGS-B' method."

Uwe Ligges

Liang Wang wrote:
> Hi, dear R users. I am a newbie in R and I need to use the method of
> maximum likelihood to fit a Weibull distribution to my survival data.
> I use optim as follows:
> optim(c(1.14, 0.25), weibull.like, mydata = mydata, method = "L-BFGS-B", hessian = TRUE)
> My question is: how do I set up the constraint that the two parameters
> of the Weibull be positive?
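Both approaches mentioned in the thread can be sketched as follows. This is illustrative code, not from the thread; the simulated data and starting values are my own, and `weibull.like` is replaced by an explicit negative log-likelihood.

```r
## Keep the Weibull shape and scale positive either with L-BFGS-B box
## constraints or by optimizing on the log scale.
set.seed(1)
y <- rweibull(200, shape = 1.5, scale = 2)   # stand-in survival data

negll <- function(p) -sum(dweibull(y, shape = p[1], scale = p[2], log = TRUE))

## 1) box constraints, as Uwe suggests
fit1 <- optim(c(1, 1), negll, method = "L-BFGS-B",
              lower = c(1e-8, 1e-8), hessian = TRUE)
fit1$par                                     # shape, scale estimates

## 2) unconstrained optimization on the log scale (often better behaved)
negll.log <- function(q) negll(exp(q))
fit2 <- optim(log(c(1, 1)), negll.log, hessian = TRUE)
exp(fit2$par)                                # back-transform to shape, scale
```

Note that for censored survival data the likelihood would need a censoring term; survreg in the survival package handles that case directly.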
Re: [R] MLE for lambda of Poisson distribution using fitdistr
In general, Poisson data consist of pairs of numbers (y, n), where y is the event count for the unit and n is the size of the unit. The Poisson MLE is sum(y)/sum(n). A general example is county-level data where y is the number of events (rare cancer) and n is the county size. Two special cases are n == 1 for all cases, where the MLE = mean(y), and y == 1 for all subjects with n = observation time until the first event, where the MLE = 1/mean(n). My preferred way to fit the distribution is

glm(y ~ offset(log(n)) + other covariates, family = poisson)

because of the mature printout, standard errors, residuals, etc. The other covariates are optional, of course. If n = 1 for all observations the offset can be omitted.

Terry Therneau
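The offset formulation above can be checked against the closed-form MLE. This sketch uses made-up county-style data (the rate 0.002 and unit sizes are illustrative only):

```r
## y events per unit of size n; intercept-only Poisson glm with an
## offset recovers the closed-form rate MLE sum(y)/sum(n).
set.seed(1)
n <- rpois(50, 1000) + 1                # unit sizes
y <- rpois(50, lambda = 0.002 * n)      # event counts

fit <- glm(y ~ offset(log(n)), family = poisson)
exp(coef(fit))      # fitted event rate per unit of n
sum(y) / sum(n)     # closed-form Poisson MLE; the two agree
```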
Re: [R] MLE for lambda of Poisson distribution using fitdistr
On Oct 26, 2009, at 11:25 PM, ankush...@yahoo.com wrote:
> Hi, I am using fitdistr from MASS to get the MLE for the lambda of a
> Poisson distribution. When I run the fitdistr command, I get output
> that looks like:
>    lambda
>      3.75
>   (0.03343)
> A couple of questions:
> 1. Is the MLE 0.03343 for the lambda of the given distribution, then?
> 2. How would I calculate the variance of the MLE for the lambda?

It would be more typical statistical usage to have output of the form: estimate ( se(estimate) ), so I was expecting 3.75 to be the estimate. Looking at the help page and running str(.) on the fitdistr object of the first example confirms my expectations. Why did you think the help page was suggesting otherwise?

--
David Winsemius, MD
Heritage Laboratories
West Hartford, CT
Re: [R] MLE for lambda of Poisson distribution using fitdistr
What is wrong with using mean(x) to get the MLE of the Poisson lambda?

Kjetil

On Tue, Oct 27, 2009 at 9:17 AM, David Winsemius <dwinsem...@comcast.net> wrote:
> It would be more typical statistical usage to have output of the form:
> estimate ( se(estimate) ), so I was expecting 3.75 to be the estimate.
> Looking at the help page and running str(.) on the fitdistr object of
> the first example confirms my expectations. Why did you think the help
> page was suggesting otherwise?
Re: [R] MLE for lambda of Poisson distribution using fitdistr
Kjetil Halvorsen wrote:
> What is wrong with using mean(x) to get the MLE of the poisson lambda?

... and mean(x)/length(x) to get its estimated variance.

-Peter Ehlers
Re: [R] MLE for lambda of Poisson distribution using fitdistr
Thank you everyone for responding.

David, 3.75 in my example was equal to the mean of the values, which I thought was too much of a coincidence... What do you think is the significance of (0.03343)? What is this value?

Kjetil, are you saying that mean(x) is the same as the MLE for the Poisson lambda?

Thanks again!
Ankush

From: Peter Ehlers <ehl...@ucalgary.ca>
To: Kjetil Halvorsen <kjetilbrinchmannhalvor...@gmail.com>
Sent: Tue, October 27, 2009 10:15:31 AM
Subject: Re: [R] MLE for lambda of Poisson distribution using fitdistr
> Kjetil Halvorsen wrote:
> > What is wrong with using mean(x) to get the MLE of the poisson lambda?
> ... and mean(x)/length(x) to get its estimated variance.
> -Peter Ehlers
Re: [R] MLE for lambda of Poisson distribution using fitdistr
Ankush Bhargava wrote:
> Kjetil, are you saying that mean(x) is the same as the MLE for the
> Poisson lambda?

Yes, the MLE for lambda is mean(x) [easy to derive from the likelihood function], and the estimated sd for the MLE is sqrt(mean(x)/length(x)), which is 0.03343 in your case.

Cheers,
-Peter Ehlers
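Peter's formulas can be verified directly against fitdistr. This is an illustrative sketch (simulated data, not Ankush's):

```r
## fitdistr's Poisson output is exactly mean(x), with standard error
## sqrt(mean(x)/length(x)).
library(MASS)
set.seed(1)
x <- rpois(500, lambda = 4)

fit <- fitdistr(x, "Poisson")
fit$estimate    # equals mean(x)
fit$sd          # equals sqrt(mean(x) / length(x))
```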
[R] MLE for noncentral t distribution
Hi,

Actually I am facing a similar problem. I would like to fit both an ordinary (symmetric) and a non-central t distribution to my (one-dimensional) data (quite a few values... 1 mio.). For the symmetric one, fitdistr or funInfoFun (using fitdistr) from the qAnalyst package should do the job, and for the non-central one... am I right to use gamlss(x ~ 1, family=GT())? Anyway, I am a little unsure how to handle the degrees of freedom. I have the feeling that it is not smart to not hold them fixed, but how can I actually determine them? If anyone could help me, I'd be really grateful... gamlss has great documentation, but it's a bit overwhelming.

Kind regards
Susanne

Susanne Balzer
PhD Student
Institute of Marine Research
N-5073 Bergen, Norway
Phone: +47 55 23 69 45
susanne.balzer at imr.no
www.imr.no

[R] MLE for noncentral t distribution
Spencer Graves spencer.graves at pdf.com
Fri May 9 16:32:47 CEST 2008

Hi, Martin and Kate:

KATE: Do you really want the noncentral t? It has mean zero but strange tails created by a denominator following a noncentral chi-square. The answer Martin gave is for a scaled but otherwise standard t, which sounds like what you want, since you said the sample mean = 0.23, s = 4.04, etc. A noncentral t has an additional noncentrality parameter.

Hope this helps.
Spencer

Martin Maechler wrote:
> >>>>> "k" == kate <yhsu6 at uiuc.edu>
> >>>>>     on Thu, 8 May 2008 10:45:04 -0500 writes:
>
> k> In my data, sample mean = -0.3 and the histogram looks like a t
> k> distribution; therefore, I thought the non-central t distribution may
> k> be a good fit. Anyway, I tried the t distribution to get the MLE. I
> k> found some warnings as follows; besides, I got three parameter
> k> estimates: m=0.23, s=4.04, df=1.66. I want to simulate data with
> k> sample size 236 and these parameter estimates. Is the command
> k> rt(236, df=1.66)?
> k> Where should I put m and s when I do the simulation?

m + s * rt(n, df = df)

[I still hope this isn't a student homework problem...]

Martin Maechler, ETH Zurich

> k>         m          s         df
> k> 0.2340746  4.0447124  1.6614823
> k> (0.3430796) (0.4158891) (0.2638703)
> k> Warning messages:
> k> 1: In dt(x, df, log) : NaNs produced
> k> 2: In dt(x, df, log) : NaNs produced
> k> 3: In dt(x, df, log) : NaNs produced
> k> 4: In log(s) : NaNs produced
> k> 5: In dt(x, df, log) : NaNs produced
> k> 6: In dt(x, df, log) : NaNs produced
> k> Thanks a lot,
> k> Kate

> k> ----- Original Message -----
> k> From: Prof Brian Ripley <ripley at stats.ox.ac.uk>
> k> To: kate <yhsu6 at uiuc.edu>
> k> Cc: r-help at r-project.org
> k> Sent: Thursday, May 08, 2008 10:02 AM
> k> Subject: Re: [R] MLE for noncentral t distribution
>
> On Thu, 8 May 2008, kate wrote:
> > I have a data set with 236 observations. After plotting the histogram,
> > I found that it looks like a non-central t distribution. I would like
> > to get MLEs for mu and df.
>
> So you mean 'non-central'? See ?dt.
>
> > I found an example to find the MLE for a gamma distribution from
> > "Fitting distributions with R":
> >
> > library(stats4)  ## loading package stats4
> > ll <- function(lambda, alfa) {
> >   n <- 200
> >   x <- x.gam
> >   -n*alfa*log(lambda) + n*log(gamma(alfa)) - (alfa-1)*sum(log(x)) + lambda*sum(x)
> > }  ## -log-likelihood function
> > est <- mle(minuslog = ll, start = list(lambda = 2, alfa = 1))
> >
> > Does anyone know how to write down the -log-likelihood function for the
> > noncentral t distribution?
>
> Just use dt. E.g.
>
> library(MASS)
> ?fitdistr
>
> shows you a worked example for location, scale and df, but note the comments. You could fit a non-central t, but it would be unusual to do so.
>
> > Thanks a lot!!
> > Kate
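Martin's one-liner, applied to the estimates reported in the thread, looks like this (the set.seed is my addition for reproducibility):

```r
## Simulate from the fitted location-scale t with Kate's estimates
## m = 0.234, s = 4.045, df = 1.661, and her sample size of 236.
set.seed(1)
m  <- 0.2340746
s  <- 4.0447124
df <- 1.6614823
y  <- m + s * rt(236, df = df)
length(y)   # 236 draws
```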
[R] MLE for lambda of Poisson distribution using fitdistr
Hi, I am using fitdistr from MASS to get the MLE for the lambda of a Poisson distribution. When I run the fitdistr command, I get output that looks like:

   lambda
     3.75
  (0.03343)

A couple of questions:
1. Is the MLE 0.03343 for the lambda of the given distribution, then?
2. How would I calculate the variance of the MLE for the lambda?

Thanks much!
Ankush
[R] mle from stats4
I am using mle as a wrapper for optim(). How would I extract the convergence code, to know that optim() converged properly?

Thanks,

Stephen Collins, MPP | Analyst
Global Strategy | Aon Benfield
Re: [R] mle from stats4
Stephen Collins wrote:
> I am using mle as a wrapper for optim(). How would I extract the
> convergence code, to know that optim() converged properly?

The return value from optim is contained in the details slot, so

fit@details$convergence
[1] 0

--
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark      Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalga...@biostat.ku.dk)              FAX: (+45) 35327907
Re: [R] mle from stats4
Stephen Collins-6 wrote:
> I am using mle as a wrapper for optim(). How would I extract the
> convergence code, to know that optim() converged properly?

library(stats4)
example(mle)
slotNames(fit1)
fit1@details
fit1@details$convergence
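A self-contained version of the same idea, with a made-up Poisson example rather than example(mle):

```r
## Where optim's convergence code lives in an mle() fit: the full optim
## return value sits in the details slot of the fitted object.
library(stats4)
set.seed(1)
x <- rpois(100, lambda = 3)

nll <- function(lambda) -sum(dpois(x, lambda, log = TRUE))
fit <- mle(nll, start = list(lambda = 1), method = "L-BFGS-B", lower = 1e-8)

fit@details$convergence   # 0 means optim reported successful convergence
```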
[R] MLE estimation of constrained mean Singh-Maddala distribution
INTRODUCTION TO THE PROBLEM

I am trying to fit a distribution to a dataset. The distribution that I am currently considering is the (3-parameter) Singh-Maddala (Burr) distribution. The final model will fix the mean of the distribution to a given value and estimate the remaining parameters accordingly; however, the code provided below ignores this. For this distribution the three parameters ('a': corresponding to 'a' in the function 'dsinmad' under the package 'VGAM'; 'b': corresponding to 'scale'; and 'q': corresponding to 'q.arg') are constrained to be strictly positive. For this I estimate parameters, first using 'optim' (and later 'nlm'), defined over the set of real numbers and then use the exponential function to convert them to positive values. For these constraints and the additional constraint 'q - 1/a > 0', the expected value is given as (as reported on the help page for the function 'sinmad'):

E(Y) = b gamma(1 + 1/a) gamma(q - 1/a) / gamma(q)

THE PROBLEM

The problem I am having is constraining the search to the region where 'q - 1/a > 0'. I was hoping not to just set the value of the negative log-likelihood to 'Inf' (unless suggested otherwise) when 'optim' sets the parameters at values that violate the constraint. I do not have much experience with simulated annealing so am not sure if the method 'SANN' would address this issue. I was also hoping to avoid using 'optim' recursively to solve this problem (unless otherwise suggested). Does 'R' have a function that can address the issue of the mentioned constraint in solving the maximum likelihood problem (that is applicable when the mean is fixed and other parameters estimated accordingly)? I have included code below to provide a better understanding of the problem (the dataset for 'y' is not included).

Thanks,

Josh Parcel

### R CODE ###

library(VGAM)

# Fit a Singh-Maddala distribution to 'y'.
# Singh-Maddala Distribution
# Parameters: a: shape; b: scale; q: shape.
# a > 0, b > 0, q > 0. To use the given mean require -a < 1 < a*q.
# Use exponential function to ensure above constraints (except '-a < 1 < a*q').
neg.LL_sinmad <- function(p, y, x) {
  a_iter <- exp(p[1])
  b_iter <- exp(p[2])
  q_iter <- exp(p[3])
  # NEED TO PUT IN CONDITION ENSURING THAT 'q_iter - 1/a_iter > 0'.
  z <- -sum(log(dsinmad(y, a = a_iter, scale = b_iter, q.arg = q_iter, log = FALSE)))
  z
}

# Set values for initial guess of the parameters.
a_guess <- 2
b_guess <- 3
q_guess <- 7
q_guess - 1/a_guess   # check the constraint
p0 <- c(log(a_guess), log(b_guess), log(q_guess))

# Optimize to find the minimum of the negative log-likelihood function.
est_p_A <- optim(par = p0, fn = neg.LL_sinmad, y = y)
est_p_A$par
a_hat1 <- exp(est_p_A$par[1])
b_hat1 <- exp(est_p_A$par[2])
q_hat1 <- exp(est_p_A$par[3])
q_hat1 - 1/a_hat1   # check the constraint

est_p_B <- nlm(f = neg.LL_sinmad, p = c(est_p_A$par[1], est_p_A$par[2], est_p_A$par[3]), print.level = 1, y = y)
a_hat <- exp(est_p_B$estimate[1])   # nlm returns $estimate (not $par)
b_hat <- exp(est_p_B$estimate[2])
q_hat <- exp(est_p_B$estimate[3])
q_hat - 1/a_hat   # check the constraint
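One way to avoid both the 'Inf' penalty and a recursive optim call is to build the constraint into the parametrization: write q = 1/a + exp(p[3]), so q - 1/a > 0 holds for any real p[3]. The sketch below is not from the thread; the density is coded directly (matching the Singh-Maddala form used by VGAM's dsinmad, whose argument names differ across versions), and the data are simulated via the inverse CDF, since Josh's 'y' is not included.

```r
## Log-density of Singh-Maddala: f(y) = a q y^(a-1) / (b^a (1+(y/b)^a)^(1+q))
ldsinmad <- function(y, a, b, q)
  log(a) + log(q) + (a - 1) * log(y) - a * log(b) - (1 + q) * log1p((y / b)^a)

## q - 1/a > 0 by construction
neg.LL_sinmad2 <- function(p, y) {
  a <- exp(p[1])
  b <- exp(p[2])
  q <- 1 / a + exp(p[3])
  -sum(ldsinmad(y, a, b, q))
}

## Simulate stand-in data via the inverse CDF: F(y) = 1 - (1 + (y/b)^a)^(-q)
set.seed(1)
a0 <- 2; b0 <- 3; q0 <- 7
u <- runif(2000)
y <- b0 * ((1 - u)^(-1 / q0) - 1)^(1 / a0)

fit   <- optim(c(log(2), log(3), log(7 - 1/2)), neg.LL_sinmad2, y = y)
a_hat <- exp(fit$par[1])
q_hat <- 1 / a_hat + exp(fit$par[3])
q_hat - 1 / a_hat   # strictly positive by construction
```

Fixing the mean could then be handled the same way: solve E(Y) = b gamma(1+1/a) gamma(q-1/a) / gamma(q) for b given (a, q) inside the likelihood, leaving only two free parameters.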
[R] mle() question
Is there a way to code the mle() function in library stats4 such that it switches optimization methods midstream (i.e. BFGS to Newton and back to BFGS, etc.)?

Thanks,

Stephen Collins, MPP | Analyst
Health Benefits | Aon Consulting
Re: [R] MLE for bimodal distribution
Just wanted to thank everyone for their help, I think I mostly managed to solve my problem.
Re: [R] MLE for bimodal distribution
On 08-Apr-09 23:39:36, Ted Harding wrote:
> On 08-Apr-09 22:10:26, Ravi Varadhan wrote:
> > The EM algorithm is a better approach for maximum likelihood
> > estimation of finite-mixture models than direct maximization of the
> > mixture log-likelihood. Due to its ascent properties, it is
> > guaranteed to converge to a local maximum. By theoretical
> > construction, it also automatically ensures that all the parameter
> > constraints are satisfied.
> > [snip]
> > Be warned that EM convergence can be excruciatingly slow.
> > Acceleration methods can be of help in large simulation studies or
> > for bootstrapping.
> > Best, Ravi.
> [snip]
> As to acceleration: agreed, EM can be slow. Aitken acceleration can be
> dramatically faster. Several outlines of the Aitken procedure can be
> found by googling on "aitken acceleration". I recently wrote a short
> note describing its general principle and outlining its application
> to the EM algorithm, using the probit model as illustration (with R
> code). For fitting the location parameter alone, raw EM took 59
> iterations; Aitken-accelerated EM took 3. For fitting the location and
> scale parameters, raw EM took 108, and Aitken took 4. If anyone would
> like a copy (PS or PDF) of this, drop me a line.

I have now placed a PDF copy of this (it was intended as a brief expository note), if anyone is interested, at:

http://www.zen89632.zen.co.uk/R/EM_Aitken/em_aitken.pdf

Best wishes to all,
Ted.

E-Mail: (Ted Harding) ted.hard...@manchester.ac.uk
Fax-to-email: +44 (0)870 094 0861
Date: 10-Apr-09  Time: 17:15:06
-- XFMail --
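The Aitken delta-squared idea Ted describes can be sketched on a scalar fixed-point iteration (this toy example is mine, not from Ted's note; applied to EM, x would be the parameter vector and f one EM update):

```r
## Aitken (delta-squared) acceleration of a fixed-point iteration x = f(x).
aitken <- function(f, x0, tol = 1e-10, maxit = 100) {
  x <- x0
  for (i in seq_len(maxit)) {
    x1 <- f(x)
    x2 <- f(x1)
    denom <- x2 - 2 * x1 + x
    if (abs(denom) < .Machine$double.eps) break   # iteration has stalled
    xa <- x - (x1 - x)^2 / denom                  # Aitken extrapolation
    if (abs(xa - x) < tol) return(xa)
    x <- xa
  }
  x
}

aitken(cos, 1)   # ~0.7390851, the fixed point of cos(x); plain iteration
                 # needs dozens of steps to reach the same accuracy
```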
[R] MLE for bimodal distribution
Hello everyone,

I'm trying to use mle from package stats4 to fit a bi/multi-modal distribution to some data, but I have some problems with it. Here's what I'm doing (for a bimodal distribution):

# Build some fake binormally distributed data; the procedure fails also
# with real data, so the problem isn't here
data = c(rnorm(1000, 3, 0.5), rnorm(500, 5, 0.3))
# Just to check it's bimodal
plot(density(data))
f = function(m, s, m2, s2, w) {
  -log( sum(w*dnorm(data, mean=m, sd=s)) + sum((1-w)*dnorm(data, mean=m2, sd=s2)) )
}
res = mle(f, start=list(m=3, s=0.5, m2=5, s2=0.35, w=0.6))

This gives an error:

Error in optim(start, f, method = method, hessian = TRUE, ...) :
  non-finite finite-difference value [2]
In addition: There were 50 or more warnings (use warnings() to see the first 50)

And the warnings are about dnorm producing NaNs.

So, my questions are:
1) What does "non-finite finite-difference value" mean? My guess would be that an Inf was passed somewhere where a finite number was expected...? I'm not sure how optim works, though...
2) Is the log-likelihood function I wrote correct? Can I use the same type of function for 3 modes?
3) What should I do to avoid function failure? I tried changing the parameters, but it did not work.
4) Can I put constraints on the parameters? In particular, I would like w to be between 0 and 1.
5) Is there a way to get the log-likelihood value, so that I can compare different estimations?
6) Do you have any (possibly a bit practically oriented) readings about MLE to suggest?

Thanks in advance
Nico
Re: [R] MLE for bimodal distribution
_nico_ wrote:
> I'm trying to use mle from package stats4 to fit a bi/multi-modal
> distribution to some data, but I have some problems with it.
> [original code and questions snipped]

Here's some tweaked code that works. Read about the L-BFGS-B method in ?optim to constrain parameters, although you will have some difficulty making this work for more than two components. I think there's also a mixture model problem in Venables & Ripley (MASS).

There are almost certainly more efficient approaches for mixture model fitting, although this brute-force approach should be fine if you don't need to do anything too complicated. (RSiteSearch("mixture model"))

set.seed(1001)
data = c(rnorm(1000, 3, 0.5), rnorm(500, 5, 0.3))
# Just to check it's bimodal
plot(density(data))
f = function(m, s, m2, s2, w) {
  -sum(log(w*dnorm(data, mean=m, sd=s) + (1-w)*dnorm(data, mean=m2, sd=s2)))
}
library(stats4)
start0 <- list(m=3, s=0.5, m2=5, s2=0.35, w=0.6)
do.call(f, start0)  ## OK
res = mle(f, start=start0)
with(as.list(coef(res)),
     curve(w*dnorm(x,m,s) + (1-w)*dnorm(x,m2,s2), add=TRUE, col=2))
Re: [R] MLE for bimodal distribution
The problem is that fitting mixtures models is hard -- (non)identifiability is a serious problem: very large sample sizes may be necessary to clearly distinguish the modes. As VR say in MASS, 4th edition, p. 320: ... fitting normal mixtures is a difficult problem, and the results obtained are often heavily dependent on the initial configuration ([i.e. starting values. BG] supplied The EM algorithm is quite commonly used: you might have a look at em() and related functions in the mclust package if Ben's straightforward approach fails to do the job for you. No guarantee em will work either, of course. -- Bert Bert Gunter Genentech Nonclinical Biostatistics 650-467-7374 -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Ben Bolker Sent: Wednesday, April 08, 2009 12:47 PM To: r-help@r-project.org Subject: Re: [R] MLE for bimodal distribution _nico_ wrote: Hello everyone, I'm trying to use mle from package stats4 to fit a bi/multi-modal distribution to some data, but I have some problems with it. Here's what I'm doing (for a bimodal distribution): # Build some fake binormally distributed data, the procedure fails also with real data, so the problem isn't here data = c(rnorm(1000, 3, 0.5), rnorm(500, 5, 0.3)) # Just to check it's bimodal plot(density(data)) f = function(m, s, m2, s2, w) { -log( sum(w*dnorm(data, mean=m, sd=s)) + sum((1-w)*dnorm(data, mean=m2, sd=s2)) ) } res = mle(f, start=list(m=3, s=0.5, m2=5, s2=0.35, w=0.6)) This gives an error: Error in optim(start, f, method = method, hessian = TRUE, ...) : non-finite finite-difference value [2] In addition: There were 50 or more warnings (use warnings() to see the first 50) And the warnings are about dnorm producing NaNs So, my questions are: 1) What does non-finite finite-difference value mean? My guess would be that an Inf was passed somewhere where a finite number was expected...? I'm not sure how optim works though... 
2) Is the log likelihood function I wrote correct? Can I use the same type of function for 3 modes? 3) What should I do to avoid function failure? I tried by changing the parameters, but it did not work. 4) Can I put constraints to the parameters? In particular, I would like w to be between 0 and 1. 5) Is there a way to get the log-likelihood value, so that I can compare different extimations? 6) Do you have any (possibly a bit pratically oriented) readings about MLE to suggest? Thanks in advance Nico Here's some tweaked code that works. Read about the L-BFGS-B method in ?optim to constrain parameters, although you will have some difficulty making this work for more than two components. I think there's also a mixture model problem in Venables Ripley (MASS). There are almost certainly more efficient approaches for mixture model fitting, although this brute force approach should be fine if you don't need to do anything too complicated. (RSiteSearch(mixture model)) set.seed(1001) data = c(rnorm(1000, 3, 0.5), rnorm(500, 5, 0.3)) # Just to check it's bimodal plot(density(data)) f = function(m, s, m2, s2, w) { -sum(log(w*dnorm(data, mean=m, sd=s)+ (1-w)*dnorm(data, mean=m2, sd=s2))) } library(stats4) start0 - list(m=3, s=0.5, m2=5, s2=0.35, w=0.6) do.call(f,start0) ## OK res = mle(f, start=start0) with(as.list(coef(res)), curve(w*dnorm(x,m,s)+(1-w)*dnorm(x,m2,s2),add=TRUE,col=2)) -- View this message in context: http://www.nabble.com/MLE-for-bimodal-distribution-tp22954970p22957613.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. 
Re: [R] MLE for bimodal distribution
Ben Bolker wrote:
Here's some tweaked code that works. [cut]

Thanks, that saved me a few headaches. I also found out the answer to my (dumb) question #5, which is obviously to call f with the returned parameters, or to use the logLik function. I will have a look at the mixture model problem, as you suggested. Looking at my data, I don't think I really need more than bimodal distributions anyway, yet I was going to have a go at it mostly out of curiosity. Well, thank you very much again for your help! It was really appreciated.

Nico
--
View this message in context: http://www.nabble.com/MLE-for-bimodal-distribution-tp22954970p22958318.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] MLE for bimodal distribution
_nico_ wrote:
Hello everyone, I'm trying to use mle from package stats4 to fit a bi/multi-modal distribution to some data, but I have some problems with it. [cut]
2) Is the log-likelihood function I wrote correct? Can I use the same type of function for 3 modes?

I think it is correct, but it is difficult to fit, as others have pointed out. As an alternative, you may discretise your variable so the mixture is fit to counts on a histogram using a multinomial likelihood.

HTH
Rubén
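Rubén's binned-multinomial idea can be sketched as follows (the binning choices and the exp/plogis parameter transforms are ours, not his):

```r
set.seed(1)
x <- c(rnorm(1000, 3, 0.5), rnorm(500, 5, 0.3))

# Discretise onto a histogram
breaks <- seq(min(x), max(x), length.out = 40)
counts <- hist(x, breaks = breaks, plot = FALSE)$counts

# Multinomial negative log-likelihood of the binned mixture;
# exp()/plogis() keep the sd's positive and the weight in (0, 1)
nll <- function(p) {
  cdf <- function(q) plogis(p[5]) * pnorm(q, p[1], exp(p[2])) +
                     (1 - plogis(p[5])) * pnorm(q, p[3], exp(p[4]))
  pr <- diff(cdf(breaks))
  pr <- pr / sum(pr)  # renormalise over the observed range
  -dmultinom(counts, prob = pr, log = TRUE)
}

fit <- optim(c(3, log(0.5), 5, log(0.3), qlogis(0.6)), nll)
c(m1 = fit$par[1], s1 = exp(fit$par[2]),
  m2 = fit$par[3], s2 = exp(fit$par[4]), w = plogis(fit$par[5]))
```

Binning throws away a little information, but it also removes the single-observation singularities, since every bin probability stays bounded.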
Re: [R] MLE for bimodal distribution
The EM algorithm is a better approach for maximum likelihood estimation of finite-mixture models than direct maximization of the mixture log-likelihood. Due to its ascent property, it is guaranteed to converge to a local maximum. By construction, it also automatically ensures that all the parameter constraints are satisfied.

The problem with mixture estimation, however, is that the local maximum may not be a good solution. Even the global maximum may not be a good solution. For example, it is well known that in a Gaussian mixture you can get a log-likelihood of +infinity by letting mu = one of the data points and sigma = 0 for one of the mixture components. You can detect trouble in MLE if you observe one of the following: (1) a degenerate solution, i.e. one of the mixing proportions is close to 0, or (2) one of the sigmas is too small.

EM convergence is sensitive to starting values. Although there are several starting strategies (see McLachlan and Basford's book on finite mixtures), there is no universally sound strategy for ensuring a good estimate (e.g. the global maximum, when it makes sense). It is always good practice to run EM with multiple starting values. Be warned that EM convergence can be excruciatingly slow. Acceleration methods can be of help in large simulation studies or for bootstrapping.

Best,
Ravi.

Ravi Varadhan, Ph.D.
Assistant Professor, Division of Geriatric Medicine and Gerontology
School of Medicine, Johns Hopkins University
Ph. (410) 502-2619
email: rvarad...@jhmi.edu

- Original Message -
From: Bert Gunter gunter.ber...@gene.com
Date: Wednesday, April 8, 2009 4:14 pm
Subject: Re: [R] MLE for bimodal distribution
To: 'Ben Bolker' bol...@ufl.edu, r-help@r-project.org

The problem is that fitting mixture models is hard ... [cut]
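A bare-bones version of the EM iteration Ravi describes, for a two-component Gaussian mixture. This is our illustrative sketch, not Ravi's code; in practice you would add a convergence check and multiple starts, as he advises:

```r
set.seed(1)
x <- c(rnorm(1000, 3, 0.5), rnorm(500, 5, 0.3))

m <- c(2, 6); s <- c(1, 1); w <- 0.5   # starting values
for (it in 1:300) {
  # E step: posterior probability that each point belongs to component 1
  d1 <- w * dnorm(x, m[1], s[1])
  d2 <- (1 - w) * dnorm(x, m[2], s[2])
  z <- d1 / (d1 + d2)
  # M step: closed-form weighted-moment updates
  w <- mean(z)
  m <- c(weighted.mean(x, z), weighted.mean(x, 1 - z))
  s <- c(sqrt(weighted.mean((x - m[1])^2, z)),
         sqrt(weighted.mean((x - m[2])^2, 1 - z)))
}
c(m = m, s = s, w = w)
```

Each iteration cannot decrease the mixture log-likelihood, which is the ascent property Ravi mentions; the price is the slow, linear convergence he warns about.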
Re: [R] MLE for bimodal distribution
On 08-Apr-09 22:10:26, Ravi Varadhan wrote:
EM algorithm is a better approach for maximum likelihood estimation of finite-mixture models than direct maximization of the mixture log-likelihood. [cut]

Well put! I've done a fair bit of mixture-fitting in my time, and one approach which can be worth trying (and for which there is often a reasonably objective justification) is to do a series of fits (assuming a given number of components, e.g. 2 for the present purpose) with the sigmas in a constant ratio in each one:

  sigma2 = lambda*sigma1

Then you won't get those singularities where it tries to climb up a single data point. If you do this for a range of values of lambda, and keep track of the log-likelihood, then you get in effect a profile likelihood for lambda. Once you see what that looks like, you can then set about penalising ranges you don't want to see. The reason I say there is often a reasonably objective justification is that often you can be pretty sure, if there is a mixture present, that lambda does not go outside say 1/10 - 10 (or whatever), since you expect that the latent groups are fairly similar, and unlikely to have markedly different sigmas.

As to acceleration: agreed, EM can be slow. Aitken acceleration can be dramatically faster. Several outlines of the Aitken procedure can be found by googling on "aitken acceleration". I recently wrote a short note describing its general principle and outlining its application to the EM algorithm, using the probit model as illustration (with R code). For fitting the location parameter alone, raw EM took 59 iterations; Aitken-accelerated EM took 3. For fitting the location and scale parameters, raw EM took 108, and Aitken took 4. If anyone would like a copy (PS or PDF) of this, drop me a line. Or is there some repository for R-help (like the Files area in Google Groups) where one can place files for others to download? [And, if not, do people think it might be a good idea?]

Best wishes to all,
Ted.

E-Mail: (Ted Harding) ted.hard...@manchester.ac.uk
Fax-to-email: +44 (0)870 094 0861
Date: 09-Apr-09 Time: 00:33:02
-- XFMail --
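Ted's constant-ratio device can be sketched like this (the parameterisation and the lambda grid are our choices): for each fixed lambda, maximise over the remaining parameters and record the log-likelihood.

```r
set.seed(1)
x <- c(rnorm(1000, 3, 0.5), rnorm(500, 5, 0.3))

# Negative log-likelihood with sigma2 = lambda * sigma1 held fixed
nll <- function(p, lambda) {
  s1 <- exp(p[3]); w <- plogis(p[4])
  -sum(log(w * dnorm(x, p[1], s1) +
           (1 - w) * dnorm(x, p[2], lambda * s1)))
}

lambdas <- exp(seq(log(1/10), log(10), length.out = 25))
prof <- sapply(lambdas, function(l)
  -optim(c(3, 5, log(0.5), 0), nll, lambda = l)$value)

plot(lambdas, prof, log = "x", type = "b",
     xlab = "lambda = sigma2/sigma1", ylab = "profile log-likelihood")
```

With the ratio fixed, one sigma cannot collapse onto a single data point without dragging the other down with it, so the +Inf singularities Ravi mentions are avoided inside the bounded lambda range.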
Re: [R] MLE for bimodal distribution
Dear Ted,

Thanks for your comments on the profile-likelihood approach for the ratio of sigmas. I would like to look at your R code and the note on Aitken acceleration; I would appreciate it if you could share them with me.

I am glad you nibbled at my bait on EM acceleration. This is one of my favourite topics. EM is indeed slow, but it pretty much guarantees convergence to a local maximum. Acceleration methods, on the other hand, do not have this guarantee, as they forsake the ascent property. This trade-off between rate of convergence and monotonicity is the crux of the acceleration problem. I recently wrote a paper on this (Scand J Stats, June 2008). I have developed a class of acceleration methods, called SQUAREM, which strikes a good balance between speed and stability. They are monotone, yet fast. They will not be as fast as unbridled Aitken acceleration, which is susceptible to numerical instabilities. SQUAREM, on the other hand, is guaranteed to converge like the original EM algorithm, and will provide significant improvements in speed. In other words, you can have your cake and eat it too!

I have written an R function to implement this. I am planning to release an R package soon (as soon as I can free up some time). This package can be used to accelerate any EM algorithm. The user is only required to supply the EM update function, i.e. the function that computes a single E and M step. It can be used as an off-the-shelf accelerator without needing any problem-specific input.

Best,
Ravi.

Ravi Varadhan, Ph.D.
Assistant Professor, Division of Geriatric Medicine and Gerontology
School of Medicine, Johns Hopkins University
Ph.
(410) 502-2619
email: rvarad...@jhmi.edu

- Original Message -
From: ted.hard...@manchester.ac.uk (Ted Harding)
Date: Wednesday, April 8, 2009 7:43 pm
Subject: Re: [R] MLE for bimodal distribution
To: r-h...@stat.math.ethz.ch

On 08-Apr-09 22:10:26, Ravi Varadhan wrote: EM algorithm is a better approach for maximum likelihood estimation of finite-mixture models... [cut]
[R] MLE for right-censored data with covariates
I am a student (and very new to R) working on a senior design project that is attempting to determine the demand distributions for single-copy newspaper draws at individual sales outlet locations. Our sales data is right-censored, because sell-outs constitute a majority of the data, and we are also testing the relevance of including covariates (weather, seasonality, economic condition, etc.). We are trying to do MLE calculations to determine each demand distribution and then use those distributions for demand in the newsvendor model. Is there any package (or combination of packages) to help? We've been looking at survival...
--
View this message in context: http://www.nabble.com/MLE-for-right-censored-data-with-covariates-tp21864312p21864312.html
Sent from the R help mailing list archive at Nabble.com.
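The survival package the poster mentions does fit parametric models to right-censored responses with covariates by maximum likelihood, via survreg(). A minimal sketch on simulated data -- all names, the censoring setup and the lognormal choice are ours, just to show the mechanics:

```r
library(survival)

set.seed(1)
n <- 200
draw <- 50                                  # copies delivered: the censoring point
demand <- rgamma(n, shape = 4, scale = 12)  # latent demand
sales <- pmin(demand, draw)                 # observed sales
sold_out <- demand >= draw                  # TRUE = sell-out = right-censored
temp <- rnorm(n)                            # a covariate, e.g. weather

# Surv() takes event = TRUE for fully observed, FALSE for censored
fit <- survreg(Surv(sales, !sold_out) ~ temp, dist = "lognormal")
summary(fit)
```

The fitted distribution (location depending on the covariates, plus a scale parameter) can then be plugged into the newsvendor quantile calculation.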
Re: [R] MLE Constraints
Dears,
Any help?
Thanks, LFRC

LFRC wrote: Dears, I'm trying to find the parameters (a, b, ..., l) that optimize the function (Model) described below. [cut]
--
View this message in context: http://www.nabble.com/MLE-Constraints-tp19994553p20016631.html
Sent from the R help mailing list archive at Nabble.com.
[R] MLE Constraints
Dears, I'm trying to find the parameters (a, b, ..., l) that optimize the function (Model) described below.

1) How can I set some constraints with the mle2 function? I want to set p1 > 0, p2 > 0, p3 > 0, p1 > p3.
2) The code is giving the following warning:

Warning: optimization did not converge (code 1)

How can I solve this problem? Can someone help me?

M <- 14
Y = c(0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1)
x1 = c(0.50, 0.50, 0.50, 0.50, 0.50, 0.50, 0.50, 0.50, 0.50, 0.50, 0.25, 0.25, 0.25, 0.25)
x2 = c(-1, -1, -1, -1, -1, 1, 1, 1, 1, 1, 1, 1, 1, 1)
x3 = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
states = c(1, 1, 2, 3, 1, 2, 3, 1, 1, 2, 2, 3, 1, 1)
prob_fn = rep(0, M)

Model = function(a, b, c, d, e, f, g, h, i, j, k, l) {
  p1 = exp(-(a + g*x1 + d*x2 + j*x3))
  p2 = exp(-(b + h*x1 + e*x2 + k*x3))
  p3 = exp(-(c + i*x1 + f*x2 + l*x3))
  ### Set P
  t5 = 0
  while (t5 < M) {
    t5 = t5 + 1
    if (states[t5] == 1) { prob_ok = p1[1] }
    if (states[t5] == 2) { prob_ok = p2[1] }
    if (states[t5] == 3) { prob_ok = p3[1] }
    prob_fn[t5] = c(prob_ok)
  }
  prob_fn[prob_fn == 0] = 0.1
  ### LL
  ll_calc = -(sum(Y*log(prob_fn)))
  return(ll_calc)
}

res = mle2(Model, start=list(a=1, b=1, c=1, d=0.15, e=0.15, f=0.15, g=0.9, h=0.9, i=0.9, j=0.1, k=0.1, l=0.1), method = "Nelder-Mead")
res

Best regards, LFRC
--
View this message in context: http://www.nabble.com/MLE-Constraints-tp19994553p19994553.html
Sent from the R help mailing list archive at Nabble.com.
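One generic way to get the kind of box constraints asked about here is to hand bounds to the optimizer; mle2 (from the bbmle package) forwards method, lower and upper to optim. A toy sketch with plain optim -- the objective below is ours, not LFRC's model:

```r
# Toy negative log-likelihood with unconstrained minimum at (2, -1)
nll <- function(p) (p[1] - 2)^2 + (p[2] + 1)^2

# L-BFGS-B honours box constraints; here both parameters must be >= 0
fit <- optim(c(1, 1), nll, method = "L-BFGS-B",
             lower = c(0, 0), upper = c(Inf, Inf))
fit$par   # the second parameter ends up pinned at its bound, 0
```

An ordering constraint such as p1 > p3 is not a box constraint, so it is usually handled by reparameterising instead, e.g. writing p1 = p3 + exp(delta) and optimising over delta unconstrained.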
[R] MLE
May I ask one statistics-related question, please? I have a query on MLE itself. One of its properties says that, for large sample size, it is normally distributed [i.e. asymptotically normal]. On the other hand, it is efficient as well. My doubt is: how can these two asymptotic properties exist simultaneously? If it is consistent, then asymptotically it should collapse to the truth, i.e. for large sample size the variance of the MLE should be zero. However, asymptotic normality says the MLE has some distribution, and hence a variance. Can anyone please clarify this for me? Your help will be highly appreciated.
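The tension in the question dissolves once the scaling is made explicit: the MLE's variance shrinks like 1/n (consistency), and it is sqrt(n)*(estimate - truth) that has the stable normal limit (asymptotic normality). A small simulation of our own, using the exponential-rate MLE, illustrates this:

```r
set.seed(1)
rate <- 2
for (n in c(100, 10000)) {
  # MLE of an exponential rate is 1/mean(x); 2000 replicate samples
  est <- replicate(2000, 1 / mean(rexp(n, rate)))
  cat(sprintf("n = %5d: var(MLE) = %.5f, var(sqrt(n)*(MLE - rate)) = %.2f\n",
              n, var(est), n * var(est)))
}
```

The raw variance drops by roughly a factor of 100 between the two sample sizes, while the rescaled variance stays near the asymptotic value, here rate^2 = 4.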
[R] MLE Estimation of Gamma Distribution Parameters for data with 'zeros'
Greetings, all.

I am having difficulty getting the fitdistr() function to return without an error on my data. Specifically, what I'm trying to do is get a parameter estimation for fracture intensity data in a well / borehole. The lower bound is 0 (no fractures in the selected data interval), and the upper bound is ~10 - 50, depending on what scale you are conducting the analysis on.

I read in the data from a text file, convert it to numerics, and then calculate initial estimates of the shape and scale parameters for the gamma distribution from moments. I then feed this back into the fitdistr() function. R code (to this point):

data.raw <- c(readLines("FSM_C_9m_ENE.inp"))
data.num <- as.numeric(data.raw)
data.num
library(MASS)
med.data <- mean(data.num)
sd.data <- sd(data.num)
med.data
sd.data
shape.mom <- (med.data/sd.data)^2
shape.mom
scale.mom <- (sd.data^2)/med.data
scale.mom
fitdistr(data.num, "gamma", list(shape=shape.mom, scale=scale.mom), lower=0)

fitdistr() returns the following error:

Error in optim(x = c(0.402707037, 0.40348, 0.404383704, 2.432626667, :
  L-BFGS-B needs finite values of 'fn'

The next thing I tried was to manually specify the negative log-likelihood function and pass it straight to mle() (the method specified in Ricci's tutorial on fitting distributions with R). Basically, I got the same result as using fitdistr().

Finally, I tried using some R code I found from someone with a similar problem back in 2003, from the archives of this mailing list:

gamma.param1 <- shape.mom
gamma.param2 <- scale.mom
log.gamma.param1 <- log(gamma.param1)
log.gamma.param2 <- log(gamma.param2)
gammaLoglik <- function(params, negative=TRUE) {
  lglk <- sum(dgamma(data, shape=exp(params[1]), scale=exp(params[2]), log=TRUE))
  if (negative) return(-lglk) else return(lglk)
}
optim.list <- optim(c(log.gamma.param1, log.gamma.param2), gammaLoglik)
gamma.param1 <- exp(optim.list$par[1])
gamma.param2 <- exp(optim.list$par[2])

If I test this function using my sample data and the estimates of shape and scale derived from the method of moments, gammaLoglik returns Inf.

I suspect the problem is that the zeros in the data are causing the optim solver problems when it attempts to minimize the negative log-likelihood function. Can anyone suggest a work-around? I have seen suggestions online that a 'censoring' algorithm can allow one to use MLE methods to estimate the gamma distribution for data with zero values (Wilks, 1990, Journal of Climate). I have not, however, found R code to implement this, and, frankly, am not smart enough to do it myself... :-)

Any suggestions? Has anyone else run up against this and written code to solve the problem? Thanks in advance!

Aaron Fox
Senior Project Geologist, Golder Associates
+1 425 882 5484 || +1 425 736 3958 (mobile)
[EMAIL PROTECTED] || www.fracturedreservoirs.com
Re: [R] MLE Estimation of Gamma Distribution Parameters for data with 'zeros'
Fox, Aaron wrote:
Greetings, all. I am having difficulty getting the fitdistr() function to return without an error on my data. [cut] Can anyone suggest a work-around? I have seen suggestions online that a 'censoring' algorithm can allow one to use MLE methods to estimate the gamma distribution for data with zero values (Wilks, 1990, Journal of Climate). [cut]

It's fairly easy. You decide that the zeros really represent values less than delta (e.g. 0.5 if your data are integers), then replace dgamma(0, ...) with pgamma(delta, ...) in the likelihood. (And, BTW, the problem is not that the optimizers get in trouble, but rather that the log-likelihood *is* +/- Inf if there are zeros in the data, unless the shape parameter is exactly 1 -- the x^(a-1) factor in the gamma density causes this.)

--
O__ Peter Dalgaard, Øster Farimagsgade 5, Entr.B
c/ /'_ --- Dept. of Biostatistics, PO Box 2099, 1014 Cph. K
(*) \(*) -- University of Copenhagen, Denmark
Ph: (+45) 35327918 ~~ ([EMAIL PROTECTED]) FAX: (+45) 35327907
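Peter's pgamma substitution can be sketched like this (simulated data; the value of delta and the log-parameterisation are our choices):

```r
set.seed(1)
x <- round(rgamma(500, shape = 0.8, scale = 2), 2)  # rounded data with zeros
delta <- 0.005                                       # half the rounding unit

nll <- function(p) {
  shape <- exp(p[1]); scale <- exp(p[2])
  zeros <- x == 0
  # zeros contribute P(X < delta); positive values the usual density
  -(sum(zeros) * pgamma(delta, shape, scale = scale, log.p = TRUE) +
    sum(dgamma(x[!zeros], shape, scale = scale, log = TRUE)))
}

fit <- optim(c(0, 0), nll)
exp(fit$par)   # estimated shape and scale
```

Unlike adding a constant to the zeros, this likelihood is finite for every shape > 0, so the optimizer has nothing to trip over.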
Re: [R] MLE Estimation of Gamma Distribution Parameters for data with 'zeros'
Fox, Aaron Afox at golder.com writes:
Greetings, all. I am having difficulty getting the fitdistr() function to return without an error on my data. Specifically, what I'm trying to do is get a parameter estimation for fracture intensity data in a well / borehole. The lower bound is 0 (no fractures in the selected data interval), and the upper bound is ~10 - 50, depending on what scale you are conducting the analysis on. [cut]

You're right that the basic problem is with the gamma distribution: the density at x = 0 is 0 for shape > 1, finite for shape = 1, and Inf for shape < 1. A quick cheat would be to add a small number (0.001?) to your data, try it again, and see how sensitive the estimate is to how small that number is. You could also try a negative binomial fit, which is the discrete analog of the gamma (and hence won't have any problem with zeros). People who do beta regressions with zero values in them often talk about adding a small Bayesian 'fudge factor' to deal with this problem... (see http://psychology.anu.edu.au/people/smithson/details/betareg/Readme.pdf)

Ben Bolker
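Ben's "quick cheat" really does need the sensitivity check he suggests, since the fitted shape can move substantially with the added constant. A small demo of our own:

```r
library(MASS)

set.seed(1)
x <- round(rgamma(500, shape = 0.8, scale = 2), 2)  # data containing zeros

for (eps in c(1e-1, 1e-3, 1e-5)) {
  fit <- suppressWarnings(fitdistr(x + eps, "gamma"))
  cat(sprintf("added constant %g -> shape = %.3f\n",
              eps, fit$estimate["shape"]))
}
```

As eps shrinks, the shifted zeros sit ever closer to 0 and drag the shape estimate down, which is why the estimate's dependence on the chosen constant should always be reported alongside the fit.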
Re: [R] MLE for noncentral t distribution
k == kate [EMAIL PROTECTED]
k on Thu, 8 May 2008 10:45:04 -0500 writes:

k In my data, the sample mean = -0.3 and the histogram looks like a t
k distribution; therefore, I thought a non-central t distribution may be
k a good fit. Anyway, I tried the t distribution to get the MLE. I got
k some warnings, shown below; besides, I got three parameter estimates:
k m=0.23, s=4.04, df=1.66. I want to simulate data with sample size 236
k and these parameter estimates. Is the command rt(236, df=1.66)? Where
k should I put m and s when I do the simulation?

m + s * rt(n, df = df)

[I still hope this isn't a student homework problem...]

Martin Maechler, ETH Zurich

k          m          s         df
k  0.2340746  4.0447124  1.6614823
k (0.3430796) (0.4158891) (0.2638703)
k Warning messages:
k 1: In dt(x, df, log) : generates NaNs
k 2: In dt(x, df, log) : generates NaNs
k 3: In dt(x, df, log) : generates NaNs
k 4: In log(s) : generates NaNs
k 5: In dt(x, df, log) : generates NaNs
k 6: In dt(x, df, log) : generates NaNs

k Thanks a lot,
k Kate

k - Original Message -
k From: Prof Brian Ripley [EMAIL PROTECTED]
k To: kate [EMAIL PROTECTED]
k Cc: r-help@r-project.org
k Sent: Thursday, May 08, 2008 10:02 AM
k Subject: Re: [R] MLE for noncentral t distribution

On Thu, 8 May 2008, kate wrote:

I have data with 236 observations. After plotting the histogram, I found that it looks like a non-central t distribution. I would like to get the MLE for mu and df.

So you mean 'non-central'? See ?dt.

I found an example to find the MLE for a gamma distribution from "Fitting Distributions with R":

library(stats4)  ## loading package stats4
ll <- function(lambda, alfa) {
  n <- 200
  x <- x.gam
  -n*alfa*log(lambda) + n*log(gamma(alfa)) -
    (alfa - 1)*sum(log(x)) + lambda*sum(x)
}  ## -log-likelihood function
est <- mle(minuslog = ll, start = list(lambda = 2, alfa = 1))

Does anyone know how to write down the -log-likelihood function for the noncentral t distribution?

Just use dt. E.g. library(MASS); ?fitdistr shows you a worked example for location, scale and df, but note the comments. You could fit a non-central t, but it would be unusual to do so.

Thanks a lot!!
Kate
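Putting Martin's m + s * rt(n, df) together with the fitdistr example Prof. Ripley points to, a minimal sketch might look like this (the data are simulated to resemble Kate's estimates; the starting values and bounds are ours):

```r
library(MASS)

set.seed(1)
x <- 0.23 + 4.04 * rt(236, df = 1.66)   # data resembling Kate's fit

# fitdistr's "t" density is the location-scale t with parameters m, s, df;
# the lower bounds keep s positive and df >= 1, avoiding the NaN warnings
fit <- fitdistr(x, "t", start = list(m = median(x), s = mad(x), df = 3),
                lower = c(-Inf, 1e-6, 1))
co <- fit$estimate

# Simulate a new sample of the same size from the fitted distribution
sim <- co["m"] + co["s"] * rt(236, df = co["df"])
```

Note this fits the scaled, shifted *central* t that Martin and Spencer describe, not the noncentral t of ?dt's ncp argument.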
Re: [R] MLE for noncentral t distribution
Hi, Martin and Kate: KATE: Do you really want the noncentral t? It has mean zero but strange tails created by a denominator following a noncentral chi-square. The answer Martin gave is for a scaled but otherwise standard t, which sounds like what you want, since you said the sample mean = 0.23, s = 4.04, etc. A noncentral t has an additional noncenrality parameter. Hope this helps. Spencer Martin Maechler wrote: k == kate [EMAIL PROTECTED] on Thu, 8 May 2008 10:45:04 -0500 writes: k In my data, sample mean =-0.3 and the histogram looks like t distribution; k therefore, I thought non-central t distribution may be a good fit. Anyway, I k try t distribution to get MLE. I found some warnings as follows; besides, I k got three parameter estimators: m=0.23, s=4.04, df=1.66. I want to simulate k the data with sample size 236 and this parameter estimates. Is the command k rt(236, df=1.66)? Where should I put m and s when I do simulation? m + s * rt(n, df= df) [I still hope this isn't a student homework problem...] Martin Maechler, ETH Zurich k m s df k 0.2340746 4.0447124 1.6614823 k (0.3430796) (0.4158891) (0.2638703) k Warning messages: k 1: In dt(x, df, log) : generates NaNs k 2: In dt(x, df, log) : generates NaNs k 3: In dt(x, df, log) :generates NaNs k 4: In log(s) : generates NaNs k 5: In dt(x, df, log) : generates NaNs k 6: In dt(x, df, log) : generates NaNs k Thanks a lot, k Kate k - Original Message - k From: Prof Brian Ripley [EMAIL PROTECTED] k To: kate [EMAIL PROTECTED] k Cc: r-help@r-project.org k Sent: Thursday, May 08, 2008 10:02 AM k Subject: Re: [R] MLE for noncentral t distribution On Thu, 8 May 2008, kate wrote: I have a data with 236 observations. After plotting the histogram, I found that it looks like non-central t distribution. I would like to get MLE for mu and df. So you mean 'non-central'? See ?dt. 
I found an example to find the MLE for a gamma distribution in "Fitting distributions with R":

    library(stats4)  ## loading package stats4
    ll <- function(lambda, alfa) {
      n <- 200
      x <- x.gam
      -n*alfa*log(lambda) + n*log(gamma(alfa)) - (alfa - 1)*sum(log(x)) + lambda*sum(x)
    }  ## -log-likelihood function
    est <- mle(minuslog = ll, start = list(lambda = 2, alfa = 1))

Does anyone know how to write down the -log-likelihood function for the noncentral t distribution?

Just use dt. E.g.

    library(MASS)
    ?fitdistr

shows you a worked example for location, scale and df, but note the comments. You could fit a non-central t, but it would be unusual to do so.

Thanks a lot!!
Kate
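Martin's recipe above, m + s * rt(n, df = df), can be sketched as a complete snippet. The numbers are the estimates quoted in the thread, not a refit, and the seed is an arbitrary choice for reproducibility:

```r
## Simulate 236 observations from the fitted location-scale t,
## plugging in the estimates quoted in the thread.
set.seed(1)                 # arbitrary seed, for reproducibility
m  <- 0.2340746             # location estimate
s  <- 4.0447124             # scale estimate
df <- 1.6614823             # degrees-of-freedom estimate
y  <- m + s * rt(236, df = df)
summary(y)
```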
[R] MLE for noncentral t distribution
I have data with 236 observations. After plotting the histogram, I found that it looks like a non-central t distribution. I would like to get the MLE for mu and df. I found an example to find the MLE for a gamma distribution in "Fitting distributions with R":

    library(stats4)  ## loading package stats4
    ll <- function(lambda, alfa) {
      n <- 200
      x <- x.gam
      -n*alfa*log(lambda) + n*log(gamma(alfa)) - (alfa - 1)*sum(log(x)) + lambda*sum(x)
    }  ## -log-likelihood function
    est <- mle(minuslog = ll, start = list(lambda = 2, alfa = 1))

Does anyone know how to write down the -log-likelihood function for the noncentral t distribution?

Thanks a lot!!
Kate
Re: [R] MLE for noncentral t distribution
On 5/8/2008 10:34 AM, kate wrote:
> I have data with 236 observations. After plotting the histogram, I found
> that it looks like a non-central t distribution. I would like to get the
> MLE for mu and df. I found an example to find the MLE for a gamma
> distribution in "Fitting distributions with R":
>
>     library(stats4)  ## loading package stats4
>     ll <- function(lambda, alfa) {
>       n <- 200
>       x <- x.gam
>       -n*alfa*log(lambda) + n*log(gamma(alfa)) - (alfa - 1)*sum(log(x)) + lambda*sum(x)
>     }  ## -log-likelihood function
>     est <- mle(minuslog = ll, start = list(lambda = 2, alfa = 1))
>
> Does anyone know how to write down the -log-likelihood function for the
> noncentral t distribution?

dt() has a non-centrality parameter and a log parameter, so it would simply be

    ll <- function(x, ncp, df) sum(dt(x, ncp = ncp, df = df, log = TRUE))

Make sure you convert mu into the ncp properly; the man page says how ncp is interpreted.

Duncan Murdoch
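Duncan's one-liner gives the log-likelihood; note that mle() minimises its NEGATIVE. A minimal self-contained sketch of the fit, using made-up stand-in data (Kate's data are not shown in the thread) and box constraints that I have added to keep df positive:

```r
library(stats4)

set.seed(42)
x <- rt(236, df = 3, ncp = 0.5)    # stand-in data, not Kate's

## mle() expects the negative log-likelihood, hence the leading minus
nll <- function(ncp, df) -sum(dt(x, ncp = ncp, df = df, log = TRUE))

## L-BFGS-B with a lower bound keeps df away from zero, where dt() is NaN
fit <- mle(minuslogl = nll, start = list(ncp = 0, df = 2),
           method = "L-BFGS-B", lower = c(-Inf, 0.1))
coef(fit)                          # estimates of ncp and df
```

The "full precision was not achieved in 'pnt'" messages that dt() with an ncp sometimes emits are accuracy warnings, not errors.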
Re: [R] MLE for noncentral t distribution
On Thu, 8 May 2008, kate wrote:

> I have data with 236 observations. After plotting the histogram, I found
> that it looks like a non-central t distribution. I would like to get the
> MLE for mu and df.

So you mean 'non-central'? See ?dt.

> I found an example to find the MLE for a gamma distribution in "Fitting
> distributions with R":
>
>     library(stats4)  ## loading package stats4
>     ll <- function(lambda, alfa) {
>       n <- 200
>       x <- x.gam
>       -n*alfa*log(lambda) + n*log(gamma(alfa)) - (alfa - 1)*sum(log(x)) + lambda*sum(x)
>     }  ## -log-likelihood function
>     est <- mle(minuslog = ll, start = list(lambda = 2, alfa = 1))
>
> Does anyone know how to write down the -log-likelihood function for the
> noncentral t distribution?

Just use dt. E.g.

    library(MASS)
    ?fitdistr

shows you a worked example for location, scale and df, but note the comments. You could fit a non-central t, but it would be unusual to do so.

> Thanks a lot!!
> Kate

--
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford,  Tel: +44 1865 272861 (self)
1 South Parks Road,         +44 1865 272866 (PA)
Oxford OX1 3TG, UK     Fax: +44 1865 272595
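The fitdistr() route Prof. Ripley points to fits a location-scale t (location m, scale s, df) by ML in one call. A sketch with stand-in data shaped roughly like Kate's summary statistics; as the help page's comments warn, letting df vary freely is numerically delicate, so expect NaN warnings along the way:

```r
library(MASS)

set.seed(1)
x <- 0.23 + 4 * rt(236, df = 3)   # stand-in data, not Kate's

fit <- fitdistr(x, "t")           # ML fit of location m, scale s and df
fit$estimate                      # named vector: m, s, df
```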
Re: [R] MLE for noncentral t distribution
Thanks for your quick reply. I tried the command as follows:

    library(stats4)  ## loading package stats4
    ll <- function(change, ncp, df) {
      -sum(dt(x, ncp = ncp, df = df, log = TRUE))
    }  # -log-likelihood function
    est <- mle(minuslog = ll, start = list(ncp = -0.3, df = 2))

But this error appears:

    invalid class "mle" object: invalid object for slot "fullcoef" in class
    "mle": got class "list", should be or extend class "numeric"

When I type warnings(), I get

    In dt(x, ncp = ncp, df = df, log = TRUE) :
      full precision was not achieved in 'pnt'

Does anyone know how to solve it?

Thanks,
Kate

- Original Message -
From: Duncan Murdoch [EMAIL PROTECTED]
To: kate [EMAIL PROTECTED]
Cc: r-help@r-project.org
Sent: Thursday, May 08, 2008 9:46 AM
Subject: Re: [R] MLE for noncentral t distribution

On 5/8/2008 10:34 AM, kate wrote:
> I have data with 236 observations. After plotting the histogram, I found
> that it looks like a non-central t distribution. I would like to get the
> MLE for mu and df. I found an example to find the MLE for a gamma
> distribution in "Fitting distributions with R":
>
>     library(stats4)  ## loading package stats4
>     ll <- function(lambda, alfa) {
>       n <- 200
>       x <- x.gam
>       -n*alfa*log(lambda) + n*log(gamma(alfa)) - (alfa - 1)*sum(log(x)) + lambda*sum(x)
>     }  ## -log-likelihood function
>     est <- mle(minuslog = ll, start = list(lambda = 2, alfa = 1))
>
> Does anyone know how to write down the -log-likelihood function for the
> noncentral t distribution?

dt() has a non-centrality parameter and a log parameter, so it would simply be

    ll <- function(x, ncp, df) sum(dt(x, ncp = ncp, df = df, log = TRUE))

Make sure you convert mu into the ncp properly; the man page says how ncp is interpreted.

Duncan Murdoch
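My guess at the fix (an assumption, since Kate's data are not shown): the stray `change` argument means the function's formals are not exactly the parameters being estimated, which confuses mle(); the 'pnt' messages are only accuracy warnings. An alternative that sidesteps mle() entirely is to hand the negative log-likelihood to optim() directly, with stand-in data:

```r
set.seed(1)
x <- rt(236, df = 2, ncp = -0.3)   # stand-in data, not Kate's

## Negative log-likelihood with both parameters packed into one vector
nll <- function(p) -sum(dt(x, ncp = p[1], df = p[2], log = TRUE))

## L-BFGS-B keeps df bounded away from zero, where dt() is undefined
fit <- optim(c(-0.3, 2), nll, method = "L-BFGS-B", lower = c(-Inf, 0.1))
fit$par                            # estimated ncp and df
```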
Re: [R] MLE for noncentral t distribution
QRMlib has routines for fitting t distributions. Have a look at that package. Also, sn has routines for skew-t distributions.

David Scott

On Thu, 8 May 2008, kate wrote:

> I have data with 236 observations. After plotting the histogram, I found
> that it looks like a non-central t distribution. I would like to get the
> MLE for mu and df. I found an example to find the MLE for a gamma
> distribution in "Fitting distributions with R":
>
>     library(stats4)  ## loading package stats4
>     ll <- function(lambda, alfa) {
>       n <- 200
>       x <- x.gam
>       -n*alfa*log(lambda) + n*log(gamma(alfa)) - (alfa - 1)*sum(log(x)) + lambda*sum(x)
>     }  ## -log-likelihood function
>     est <- mle(minuslog = ll, start = list(lambda = 2, alfa = 1))
>
> Does anyone know how to write down the -log-likelihood function for the
> noncentral t distribution?
>
> Thanks a lot!!
> Kate

_
David Scott
Department of Statistics, Tamaki Campus
The University of Auckland, PB 92019
Auckland 1142, NEW ZEALAND
Phone: +64 9 373 7599 ext 86830
Fax: +64 9 373 7000
Email: [EMAIL PROTECTED]

Graduate Officer, Department of Statistics
Director of Consulting, Department of Statistics
Re: [R] MLE for censored distributions in R
> Hi, just wondering if there is a package that can get the maximum
> likelihood or method-of-moments estimator for distributions with censored
> data? The distributions I'm interested in are: exponential, Pareto, beta,
> gamma and lognormal.

Look at the survreg function in the survival library. It can find the MLE for any distribution for which g(y) can be written in a location-scale form, for some transform g. See names(survreg.distributions) for a list of those that are built in; others can be added.

Terry Therneau
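Terry's suggestion can be sketched for the lognormal case, which is location-scale on the log scale: the intercept of survreg() is the meanlog MLE and its scale is the sdlog MLE. The data below are simulated for illustration:

```r
library(survival)

set.seed(1)
t.true <- rlnorm(200, meanlog = 1, sdlog = 0.5)  # true event times
cens   <- runif(200, 0, 8)                       # censoring times
time   <- pmin(t.true, cens)                     # observed time
status <- as.numeric(t.true <= cens)             # 1 = event, 0 = censored

## ML fit of a lognormal with right-censored observations
fit <- survreg(Surv(time, status) ~ 1, dist = "lognormal")
c(meanlog = unname(coef(fit)), sdlog = fit$scale)
```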
Re: [R] MLE for censored distributions in R
On Tue., Jan. 22, at 17:47, Thomas Downey wrote:

> Hi, just wondering if there is a package that can get the maximum
> likelihood or method-of-moments estimator for distributions with censored
> data? The distributions I'm interested in are: exponential, Pareto, beta,
> gamma and lognormal.

I'm highly biased here, but you may have a look at package actuar. The function coverage() will define the pdf or cdf of a random variable modified in many sorts of ways (truncation, censoring, inflation, etc.), based on any distribution you want. You can then use the resulting function in the usual fitting procedures, such as fitdistr(), for ML. See the "lossdist" vignette in the package for details.

HTH

---
Vincent Goulet, Associate Professor
École d'actuariat
Université Laval, Québec
[EMAIL PROTECTED]
http://vgoulet.act.ulaval.ca
[R] MLE for censored distributions in R
Hi, just wondering if there is a package that can get the maximum likelihood or method-of-moments estimator for distributions with censored data? The distributions I'm interested in are: exponential, Pareto, beta, gamma and lognormal.
[R] mle
Hello,

I'm trying to obtain a maximum likelihood estimate of a Gaussian model with the mle command, as I did with a Poisson model:

    x <- rep(1:2, each = 500)
    y <- rnorm(length(x), mean = 10 + 3*x, sd = 1)
    glm1 <- glm(y ~ x, family = gaussian())

    library(stats4)
    func1 <- function(alfa = 10, beta = 3, sigma = 1)
      -sum(dnorm(y, mean = alfa + beta*x, sd = sigma), log = FALSE)
    mle(func1, method = "BFGS")

    func2 <- function(alfa = 10, beta = 3, sigma = 1)
      -sum((1/sqrt(2*pi*sigma^2)) * exp(-0.5*(((y - alfa - beta*x)^2)/sigma^2)))
    mle(func2, method = "BFGS")

I don't understand why it doesn't work. Do you have some suggestions?

Thank you so much for your help

Antonio Gasparrini
Public and Environmental Health Research Unit (PEHRU)
London School of Hygiene & Tropical Medicine
Keppel Street, London WC1E 7HT, UK
Office: 0044 (0)20 79272406
Mobile: 0044 (0)79 64925523
www.lshtm.ac.uk/pehru/
[EMAIL PROTECTED]
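A guess at the fix (my reading of the intent, not a confirmed answer from the thread): in func1 the log = FALSE is passed to sum(), not to dnorm(), so both functions return a negative sum of densities rather than a negative LOG-likelihood, and mle() then minimises the wrong objective. Moving the log inside dnorm() gives a working fit; the start values and the lower bound keeping sigma positive are my additions:

```r
library(stats4)

set.seed(1)
x <- rep(1:2, each = 500)
y <- rnorm(length(x), mean = 10 + 3 * x, sd = 1)

## Negative log-likelihood: log = TRUE must be an argument of dnorm()
nll <- function(alfa = 10, beta = 3, sigma = 1)
  -sum(dnorm(y, mean = alfa + beta * x, sd = sigma, log = TRUE))

## Lower bound keeps sigma positive during optimisation
fit <- mle(nll, start = list(alfa = 8, beta = 2, sigma = 2),
           method = "L-BFGS-B", lower = c(-Inf, -Inf, 1e-4))
coef(fit)   # should land near alfa = 10, beta = 3, sigma = 1
```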