What is the output of versioninfo() and Pkg.installed("Optim")? Also, would
it be possible to make a gist with your code?

2014-05-19 12:44 GMT+02:00 Holger Stichnoth <[email protected]>:

> Hello,
>
> I installed Julia a couple of days ago and was impressed by how easy it was
> to make the switch from Matlab and to parallelize my code
> (something I had never done before in any language; I'm an economist with
> only limited programming experience, mainly in Stata and Matlab).
>
> However, I ran into a problem when using Optim.jl for Maximum Likelihood
> estimation of a conditional logit model. With the default Nelder-Mead
> algorithm, optimize from the Optim.jl package gave me the same result that
> I had obtained in Stata and Matlab.
>
> With gradient-based methods such as BFGS, however, the algorithm jumped
> from the starting values to parameter values that are completely different.
> This happened for all the starting values I tried, including the case in
> which I took a vector that is close to the optimum from the Nelder-Mead
> algorithm.
>
> The problem seems to be that the algorithm tried values so large (in
> absolute value) that this caused problems for the objective
> function, where I call exponential functions into which these parameter
> values enter. As a result, the optimization based on the BFGS algorithm did
> not produce the expected optimum.
>
> While I could try to provide the analytical gradient in this simple case,
> I was planning to use Julia for Maximum Likelihood or Simulated Maximum
> Likelihood estimation in cases where the gradient is more difficult to
> derive, so it would be good if I could make the optimizer run also with
> numerical gradients.
>
> I suspect that my problems with optimize from Optim.jl could have
> something to do with the gradient() function. In the example below, for
> instance, I do not understand why the output of the gradient function
> includes values such as 11470.7, given that the function values differ only
> minimally.
>
> Best wishes,
> Holger
>
> julia> Optim.gradient(clogit_ll,zeros(4))
> 6.0554544523933395e-6
> 0
> 0
> 0
>
> 14923.564009972584
> -6.0554544523933395e-6
> 0
> 0
> 0
>
> 14923.565228435104
> 0
> 6.0554544523933395e-6
> 0
> 0
>
> 14923.569064311248
> 0
> -6.0554544523933395e-6
> 0
> 0
>
> 14923.560174904109
> 0
> 0
> 6.0554544523933395e-6
> 0
>
> 14923.63413848258
> 0
> 0
> -6.0554544523933395e-6
> 0
>
> 14923.495218282553
> 0
> 0
> 0
> 6.0554544523933395e-6
>
> 14923.58699717058
> 0
> 0
> 0
> -6.0554544523933395e-6
>
> 14923.54224130672
> 4-element Array{Float64,1}:
>   -100.609
>    734.0
>  11470.7
>   3695.5
>
> function clogit_ll(beta::Vector)
>
>     # Print the parameters and the return value to
>     # check how gradient() and optimize() work.
>     println(beta)
>     println(-sum(compute_ll(beta,T,0)))
>
>     # compute_ll computes the individual likelihood contributions
>     # in the sample. T is the number of periods in the panel. The 0
>     # is not used in this simple example. In related functions, I
>     # pass different values here to estimate finite mixtures of
>     # the conditional logit model.
>     return -sum(compute_ll(beta,T,0))
> end

--
Kind regards
Andreas Noack Jensen
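
A side note on the gradient values that puzzled Holger: the perturbation
visible in the printed vectors, 6.0554544523933395e-6, equals eps()^(1/3),
a common default step for central differencing, and the function-value
difference is divided by twice that step, so differences in the second
decimal place already translate into gradient entries in the thousands.
A minimal sketch reproducing the third gradient entry from the numbers
printed above:

    # Central finite difference for the third parameter, using the
    # step size seen in the printed perturbation vectors above.
    epsilon = cbrt(eps())            # 6.0554544523933395e-6

    f_plus  = 14923.63413848258      # printed value at x + epsilon*e3
    f_minus = 14923.495218282553     # printed value at x - epsilon*e3

    # A difference of about 0.14 divided by 2*epsilon (about 1.2e-5)
    # gives roughly 11470.7, the third entry of the returned gradient.
    g3 = (f_plus - f_minus) / (2 * epsilon)
    println(g3)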
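
On the overflow problem itself: when the BFGS line search probes large
values of beta, the exponentials in the likelihood overflow to Inf and the
objective stops being informative, which can leave the optimizer at
nonsense parameter values. A standard remedy is the log-sum-exp trick. The
sketch below shows it for a single choice occasion; the data layout (X as
a J-by-K matrix of alternative attributes, y as the index of the chosen
alternative) is assumed for illustration and is not Holger's actual
compute_ll:

    # Numerically stable conditional logit log-likelihood contribution
    # for one choice occasion. X (J x K attribute matrix) and y (index
    # of the chosen alternative) are a hypothetical data layout.
    function stable_clogit_ll(beta::Vector{Float64}, X::Matrix{Float64}, y::Int)
        u = X * beta                  # utilities of the J alternatives
        m = maximum(u)
        # log(sum(exp(u))) == m + log(sum(exp(u - m))), and every
        # shifted exponential is <= 1, so nothing overflows no matter
        # how large a beta the line search tries.
        s = 0.0
        for ui in u
            s += exp(ui - m)
        end
        return u[y] - (m + log(s))
    end

Summing these contributions over observations and negating them gives an
objective that stays finite for any beta, which tends to keep
gradient-based methods such as BFGS from derailing.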
