Yes, to use autodiff you need to make sure that all of the functions you call
can be applied to Array{T} for any T <: Number. The type annotations in your
code are currently overly restrictive when you define
clogit_ll(beta::Vector{Float64}) and friends. If you loosen things to
clogit_ll(beta::Vector), you might get autodiff to work.
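A minimal sketch of the difference (the function names here are hypothetical, not Holger's actual model):

```julia
# Restricting the argument to Vector{Float64} means no method exists
# for the dual-number vectors that the autodiff machinery passes in.
strict_ll(beta::Vector{Float64}) = sum(abs2, beta)

# The loosely typed version accepts any element type, including duals.
loose_ll(beta::Vector) = sum(abs2, beta)

loose_ll([1, 2, 3])      # works for Int elements too: returns 14
# strict_ll([1, 2, 3])   # would throw a MethodError
```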
— John
On May 20, 2014, at 8:42 AM, Holger Stichnoth <[email protected]> wrote:
> When I set autodiff = true in the Gist I posted above, I get the message
> "ERROR: no method clogit_ll(Array{Dual{Float64},1},)".
>
> Holger
>
>
> On Monday, 19 May 2014 14:51:16 UTC+1, John Myles White wrote:
> If you can, please do share an example of your code. Logit-style models are
> in general numerically unstable, so it would be good to see how exactly
> you’ve coded things up.
>
> One thing you may be able to do is use automatic differentiation via the
> autodiff = true keyword to optimize, but that assumes that your objective
> function is written in completely pure Julia code (which means, for example,
> that your code must not call any functions that are not written in Julia,
> such as some of those provided by Distributions.jl).
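To see why purity matters, here is a toy illustration of forward-mode AD with a hypothetical minimal dual type (not the Dual type Optim.jl actually uses): pure-Julia arithmetic propagates the derivative alongside the value, whereas a call into non-Julia code would have no method for the dual type and break the chain.

```julia
# A made-up two-field dual number: value plus derivative.
struct MiniDual
    val::Float64
    der::Float64
end

# Overload arithmetic so the derivative is carried through
# by the sum and product rules.
Base.:+(a::MiniDual, b::MiniDual) = MiniDual(a.val + b.val, a.der + b.der)
Base.:*(a::MiniDual, b::MiniDual) = MiniDual(a.val * b.val,
                                             a.der * b.val + a.val * b.der)

f(x) = x * x + x            # f'(x) = 2x + 1
d = f(MiniDual(3.0, 1.0))   # seed the input's derivative with 1
# d.val == 12.0 (f(3)), d.der == 7.0 (f'(3)) -- exact, no step size involved
```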
>
> — John
>
> On May 19, 2014, at 4:09 AM, Andreas Noack Jensen <[email protected]>
> wrote:
>
>> What is the output of versioninfo() and Pkg.installed("Optim")? Also, would
>> it be possible to make a gist with your code?
>>
>>
>> 2014-05-19 12:44 GMT+02:00 Holger Stichnoth <[email protected]>:
>> Hello,
>>
>> I installed Julia a couple of days ago and was impressed by how easy it was to
>> make the switch from Matlab and to parallelize my code
>> (something I had never done before in any language; I'm an economist with
>> only limited programming experience, mainly in Stata and Matlab).
>>
>> However, I ran into a problem when using Optim.jl for Maximum Likelihood
>> estimation of a conditional logit model. With the default Nelder-Mead
>> algorithm, optimize from the Optim.jl package gave me the same result that I
>> had obtained in Stata and Matlab.
>>
>> With gradient-based methods such as BFGS, however, the algorithm jumped from
>> the starting values to parameter values that are completely different. This
>> happened for all the starting values I tried, including the case in which I
>> took a vector that is close to the optimum found by the Nelder-Mead algorithm.
>>
>> The problem seems to be that the algorithm tried parameter values so large (in
>> absolute value) that they caused overflow in the objective function, where
>> these parameter values enter exponential functions. As a result, the
>> optimization based on the BFGS algorithm did not produce the expected optimum.
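The standard remedy for this kind of overflow in logit-style likelihoods is the log-sum-exp trick; a sketch, independent of the actual model code: subtracting the maximum utility before exponentiating leaves the log of the sum unchanged but keeps exp() in range.

```julia
# Numerically stable log(sum(exp(v))): shift by the maximum first.
function logsumexp(v::Vector{Float64})
    m = maximum(v)
    m + log(sum(exp.(v .- m)))
end

logsumexp([1000.0, 1000.0])       # ~ 1000.693 (= 1000 + log(2))
# log(sum(exp.([1000.0, 1000.0])))  # naive version overflows to Inf
```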
>>
>> While I could try to provide the analytical gradient in this simple case, I
>> was planning to use Julia for Maximum Likelihood or Simulated Maximum
>> Likelihood estimation in cases where the gradient is more difficult to
>> derive, so it would be good if I could also make the optimizer work with
>> numerical gradients.
>>
>> I suspect that my problems with optimize from Optim.jl could have something
>> to do with the gradient() function. In the example below, for instance, I do
>> not understand why the output of the gradient function includes values such
>> as 11470.7, given that the function values differ only minimally.
>>
>> Best wishes,
>> Holger
>>
>>
>> julia> Optim.gradient(clogit_ll,zeros(4))
>> [6.0554544523933395e-6,0.0,0.0,0.0]
>> 14923.564009972584
>> [-6.0554544523933395e-6,0.0,0.0,0.0]
>> 14923.565228435104
>> [0.0,6.0554544523933395e-6,0.0,0.0]
>> 14923.569064311248
>> [0.0,-6.0554544523933395e-6,0.0,0.0]
>> 14923.560174904109
>> [0.0,0.0,6.0554544523933395e-6,0.0]
>> 14923.63413848258
>> [0.0,0.0,-6.0554544523933395e-6,0.0]
>> 14923.495218282553
>> [0.0,0.0,0.0,6.0554544523933395e-6]
>> 14923.58699717058
>> [0.0,0.0,0.0,-6.0554544523933395e-6]
>> 14923.54224130672
>> 4-element Array{Float64,1}:
>>   -100.609
>>    734.0
>>  11470.7
>>   3695.5
>>
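For what it's worth, the printed gradient is internally consistent with the central-difference formula; e.g. for the third coordinate, using the two function values printed above:

```julia
# Central finite difference in the third coordinate, using the two
# function evaluations at +/- h that the output above shows.
h = 6.0554544523933395e-6
g3 = (14923.63413848258 - 14923.495218282553) / (2 * h)
# g3 is about 11470.7, matching the third reported gradient entry:
# the large value comes from dividing a small function change by a
# tiny step, not from a bug in gradient() itself.
```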
>> function clogit_ll(beta::Vector)
>>     # Print the parameters and the return value to
>>     # check how gradient() and optimize() work.
>>     println(beta)
>>     println(-sum(compute_ll(beta, T, 0)))
>>
>>     # compute_ll computes the individual likelihood contributions
>>     # in the sample. T is the number of periods in the panel. The 0
>>     # is not used in this simple example. In related functions, I
>>     # pass on different values here to estimate finite mixtures of
>>     # the conditional logit model.
>>     return -sum(compute_ll(beta, T, 0))
>> end
>>
>> --
>> Med venlig hilsen
>>
>> Andreas Noack Jensen
>